CN119440621A - Model conversion method, related device and medium - Google Patents
- Publication number
- CN119440621A (application number CN202411439111.2A)
- Authority
- CN
- China
- Prior art keywords
- model
- image
- target
- verification
- conversion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/76—Adapting program code to run in a different environment; Porting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G06F8/61—Installation
- G06F8/63—Image based installation; Cloning; Build to order
Abstract
Embodiments of the present disclosure provide a model conversion method, a related device, and a medium. The method deploys a model conversion image to the node where a target chip is located and, through the model conversion image, converts an original model into a target model corresponding to the target chip. An original model verification image and a target model verification image are then used to verify the original model before conversion and the target model after conversion respectively, and a conversion deviation value that measures the effect of the conversion is obtained from the first and second verification results, so that the converted target model is verified automatically. Based on this verification result (the conversion deviation value), an image is built for a target model that meets the requirements and stored in an image repository for later use. Embodiments of the present disclosure aim to ensure the usability of the converted model, and can be applied to scenarios such as computer vision, natural language processing, speech recognition, autonomous driving, and edge computing.
Description
Technical Field
The disclosure relates to the technical field of deep learning, and in particular to a model conversion method, a related device, and a medium.
Background
The development of deep learning algorithms has not stopped at the framework level; various novel chip architectures are also continually emerging. These chips have strong advantages in deep learning applications, such as high-speed computing and low power consumption. Converting a deep learning model into a model suitable for a particular chip therefore facilitates the further application of deep learning.
In the related art, existing model conversion tools are rarely aimed at automatic conversion between different frameworks and chips and offer only a single conversion function, so the effects before and after conversion cannot be compared and the quality of the finally converted model cannot be guaranteed. How to guarantee the usability of the converted model is a problem that currently needs to be addressed.
Disclosure of Invention
Embodiments of the present disclosure provide a model conversion method, a related device, and a medium, which aim to ensure the usability of the converted model.
In a first aspect, an embodiment of the present disclosure provides a model conversion method, including:
obtaining a target image from an image repository, wherein the target image comprises a model conversion image, an original model verification image, and a target model verification image;
deploying the model conversion image to a node with a target chip;
acquiring an original model through the model conversion image and converting the original model into a target model, wherein the target model can be deployed on the target chip;
invoking the original model verification image to verify the original model and obtain a first verification result;
invoking the target model verification image to verify the target model and obtain a second verification result;
determining a conversion deviation value according to the first verification result and the second verification result;
building a target model image when the conversion deviation value is smaller than a preset threshold; and
sending the target model image to the image repository.
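A minimal sketch of the threshold decision in the last three steps is shown below; the function name, the scalar verification results, and the default threshold are hypothetical, since the claim does not fix how the deviation value is computed:

```python
def should_build_target_image(first_result: float, second_result: float,
                              threshold: float = 0.01) -> bool:
    """Compute the conversion deviation value as the absolute difference
    between the two verification results (an assumed metric) and compare
    it against the preset threshold."""
    deviation = abs(first_result - second_result)
    return deviation < threshold
```

In this sketch the target model image is built only when the converted model's verification result stays close to the original model's, which is the usability criterion the claim describes.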
In a second aspect, embodiments of the present disclosure provide a model conversion system, comprising:
a first device, configured to obtain a target image from an image repository, wherein the target image comprises a model conversion image, an original model verification image, and a target model verification image;
the first device is configured to deploy the model conversion image to a second device with a target chip;
the second device is configured to obtain an original model through the model conversion image and convert the original model into a target model, wherein the target model can be deployed on the target chip;
the second device is configured to invoke the original model verification image to verify the original model and obtain a first verification result;
the second device is configured to invoke the target model verification image to verify the target model and obtain a second verification result;
the second device is configured to determine a conversion deviation value according to the first verification result and the second verification result;
the second device is configured to build a target model image when the conversion deviation value is smaller than a preset threshold; and
the second device is configured to send the target model image to the image repository.
Optionally, the target image further comprises an original model evaluation image and a target model evaluation image;
the second device is configured to invoke the original model evaluation image to evaluate the original model when the target model is quantized, obtaining a first performance result of the original model;
the second device is configured to evaluate the quantized target model through the target model evaluation image, obtaining a second performance result of the quantized target model; and
the second device is specifically configured to determine whether to build the target model image according to the conversion deviation value, the first performance result, and the second performance result.
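As a minimal illustration, the combined decision described above might be sketched as follows; the function name, the scalar performance results, and both thresholds are hypothetical, since the disclosure does not fix concrete formulas:

```python
def approve_build(deviation: float, first_perf: float, second_perf: float,
                  deviation_threshold: float = 0.01,
                  perf_drop_limit: float = 0.02) -> bool:
    """Approve building the target model image only when the conversion
    deviation is below its threshold AND quantization cost at most a
    bounded amount of performance (hypothetical criteria)."""
    perf_drop = first_perf - second_perf  # accuracy lost by quantization
    return deviation < deviation_threshold and perf_drop <= perf_drop_limit
```

The point of the sketch is that a quantized model must pass both gates: closeness of the verification results and an acceptable performance drop.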
Optionally, the first device is configured to obtain evaluation task information and generate a first evaluation configuration file and a second evaluation configuration file according to the evaluation task information;
the first device is further configured to send an original model evaluation image address, a data set address, and an original model address to the second device according to the first evaluation configuration file;
the second device is configured to pull the original model evaluation image, a first evaluation data set, and the original model through the original model evaluation image address, the data set address, and the original model address;
the second device is configured to evaluate the original model on the first evaluation data set through the original model evaluation image, obtaining the first performance result;
the first device is further configured to send a target model evaluation image address, the data set address, and a target model address to the second device according to the second evaluation configuration file;
the second device is configured to pull the target model evaluation image, a second evaluation data set, and the target model through the target model evaluation image address, the data set address, and the target model address; and
the second device is configured to evaluate the target model on the second evaluation data set through the target model evaluation image, obtaining the second performance result.
Optionally, the target image further includes a data initialization image;
the first device is configured to deploy the data initialization image to the second device with the target chip;
the second device is configured to obtain the data corresponding to the original model through the data initialization image and share the data with the model conversion image; and
the second device is configured to determine the original model based on the data corresponding to the original model through the model conversion image and convert the original model into the target model.
Optionally, the first device is configured to obtain verification task information and generate a first verification configuration file according to the verification task information;
the first device is further configured to send an original model verification image address and an original model address to the second device according to the first verification configuration file; and
the second device pulls the original model verification image and the original model through the original model verification image address and the original model address, so that the second device verifies the original model through the original model verification image to obtain the first verification result.
Optionally, the first device is configured to obtain verification task information and generate a second verification configuration file according to the verification task information;
the first device is further configured to send a target model verification image address and a target model address to the second device according to the second verification configuration file; and
the second device pulls the target model verification image and the target model through the target model verification image address and the target model address, so that the second device verifies the target model through the target model verification image to obtain the second verification result.
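A minimal sketch of deriving such a verification configuration from task information is shown below; every key, the address formats, and the registry/storage layout are illustrative assumptions, not part of the disclosure:

```python
def make_verification_config(task_info: dict, stage: str) -> dict:
    """Build a verification configuration (image address + model address)
    from verification task information. 'stage' selects the original or
    target model; all naming conventions here are hypothetical."""
    name = task_info["model_name"]
    return {
        "verification_image_address": (
            f"{task_info['registry']}/{name}-{stage}-verify:latest"
        ),
        "model_address": f"{task_info['storage']}/{name}/{stage}/model.bin",
    }
```

The first device would send such a configuration to the second device, which then pulls the image and model through the two addresses.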
Optionally, the first device is configured to generate an image build file when the conversion deviation value is smaller than the preset threshold, wherein the image build file is used to generate the target model image;
the first device is further configured to package the image build file and the target model into a build data packet and send the build data packet to the second device;
the second device is configured to unpack the build data packet, so that the second device builds the target model image according to the image build file and the target model; and
the second device is configured to upload the target model to a storage service and record the association between the target model image and the storage path of the target model.
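The association record in the last step can be illustrated with a minimal sketch; the JSON file, its schema, and the function name are hypothetical stand-ins for whatever metadata store an implementation uses:

```python
import json
from pathlib import Path

def record_association(image_tag: str, storage_path: str,
                       registry_file: Path) -> dict:
    """Record which storage path holds the model packaged in a given
    target model image, so the model can be located for later deployment
    (hypothetical JSON-file metadata store)."""
    entries = (json.loads(registry_file.read_text())
               if registry_file.exists() else {})
    entries[image_tag] = {"model_storage_path": storage_path}
    registry_file.write_text(json.dumps(entries, indent=2))
    return entries
```

Looking up the image tag in this record later yields the storage path of the corresponding target model.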
In a third aspect, an embodiment of the present disclosure provides an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the model conversion method of the first aspect when executing the computer program.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium storing computer-executable instructions that, when executed by a computer, implement the model conversion method of the first aspect.
In a fifth aspect, embodiments of the present disclosure provide a computer program product comprising a computer program or computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer program or computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the model conversion method of the first aspect.
In the method of the embodiments of the disclosure, an image repository is provided, and the required target images, such as the model conversion image and the model verification images, are pulled from it. Different target images, or combinations of several target images, are invoked to realize automated model conversion, model verification, and related processes. Providing separate target images decouples the different functional flows, so the model conversion image can conveniently be deployed onto a heterogeneous chip framework (the node where the target chip is located) to convert the original model into the target model; this supports compatibility with, and extension to, different heterogeneous chips and enables conversion to various target models. The model conversion image is deployed to the node where the target chip is located, and the original model is converted into the target model corresponding to the target chip through the model conversion image. The original model verification image and the target model verification image then verify the original model before conversion and the target model after conversion respectively, and a conversion deviation value that measures the conversion effect is obtained from the first and second verification results, realizing automatic verification of the converted target model. Based on this verification result (the conversion deviation value), an image is built for a target model that meets the requirements and stored in the image repository for later use, thereby ensuring the usability of the converted model.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the disclosure. The objectives and other advantages of the disclosure will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
FIG. 1 is a system architecture diagram to which a model transformation method according to an embodiment of the present disclosure is applied;
FIG. 2 is a schematic flow interaction diagram of services of a model transformation system to which the model transformation method according to the embodiment of the present disclosure is applied;
FIG. 3 is a flow chart of a model conversion method according to one embodiment of the present disclosure;
FIG. 4 is an interactive schematic diagram of a model transformation flow of a model transformation system to which the model transformation method according to an embodiment of the present disclosure is applied;
FIG. 5 is an interactive schematic diagram of a model evaluation flow of a model conversion system to which the model conversion method according to an embodiment of the present disclosure is applied;
FIG. 6 is an interactive schematic diagram of a model verification process of a model transformation system to which the model transformation method according to an embodiment of the present disclosure is applied;
FIG. 7 is an interactive schematic diagram of a packing construction flow of a model transformation system to which the model transformation method according to an embodiment of the present disclosure is applied;
FIG. 8 is a block diagram of a model conversion device according to an embodiment of the present disclosure;
Fig. 9 is a block diagram of a terminal performing the model conversion method shown in Fig. 3 according to an embodiment of the present disclosure;
Fig. 10 is a block diagram of a server performing the model conversion method shown in Fig. 3 according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present disclosure more apparent, the present disclosure will be further described in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present disclosure.
It should be noted that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from that in the flowchart. The terms "first," "second," and the like in the description, in the claims, and in the above-described figures are used for distinguishing between similar elements, and not necessarily for describing a particular sequential or chronological order.
In the description of the embodiments of the present disclosure, unless explicitly defined otherwise, terms such as arrangement, installation, and connection should be construed broadly, and their specific meaning in the embodiments can be reasonably determined by a person skilled in the art in light of the technical solutions. In this disclosure, the words "further," "exemplary," or "optionally" indicate an example, illustration, or description, and the embodiment or design so described is not to be construed as preferred or advantageous over other embodiments or designs; these words are intended to present the relevant concepts in a concrete fashion.
Before proceeding to a further detailed description of the disclosed embodiments, the terms involved in the embodiments are explained as follows:
PyTorch is an open-source, Python-based deep learning framework for machine learning and deep learning, used in applications such as natural language processing.
TensorFlow is a symbolic mathematics system based on dataflow programming, widely used to implement various machine learning algorithms.
Caffe (Convolutional Architecture for Fast Feature Embedding) is a deep learning framework noted for expressiveness, speed, and modularity, developed by the Berkeley AI Research group and the Berkeley Vision and Learning Center.
MindSpore is a full-scenario artificial intelligence computing framework that aims at convenient development, efficient execution, and unified deployment across all scenarios.
TensorRT is an optimizer and runtime engine for high-performance deep learning inference.
An AI development platform provides full-workflow AI development services: massive data processing, large-scale distributed training, on-demand deployment of models across device, edge, and cloud, and operation and maintenance management. It helps users quickly build and deploy models, manages the full-lifecycle AI workflow, meets the needs of different developer levels, lowers the threshold for AI development and use, and keeps the system running smoothly, stably, and reliably. AI stands for artificial intelligence.
A container cluster management platform is an open-source container orchestrator that supports automatic deployment, large-scale scalability, and containerized application management.
ONNX (Open Neural Network Exchange) is an open standard intended to facilitate model interoperability between different deep learning frameworks; it allows trained models to be shared and executed across frameworks.
An image build file is a text file containing a series of instructions that guide a build tool in constructing an image, including selecting a base image, installing software, and setting up the environment.
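To make the notion of an image build file concrete, the sketch below generates such a file as plain text; the function, the base-image name, and the path layout are illustrative assumptions rather than part of the disclosure:

```python
def make_image_build_file(base_image: str, model_file: str) -> str:
    """Generate a minimal, Dockerfile-style image build file for packaging
    a converted model (hypothetical layout): select a base image, add the
    model, and set the environment."""
    return "\n".join([
        f"FROM {base_image}",                       # select a base image
        f"COPY {model_file} /models/{model_file}",  # add the converted model
        f"ENV MODEL_PATH=/models/{model_file}",     # set the environment
    ])
```

A build tool would then consume this text to produce the target model image that is pushed to the repository.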
In the related art, intelligent technology has developed continuously, and deep learning algorithms have advanced greatly in fields such as image recognition, natural language processing, and speech recognition. Large companies and research institutions have developed their own deep learning frameworks, such as TensorFlow, PyTorch, and MindSpore. These frameworks provide powerful deep learning algorithm libraries and runtime environments that help researchers and developers build efficient deep learning models quickly.
However, the development of deep learning algorithms is not limited to the framework level; various novel chip architectures are also emerging. These chips have strong advantages in deep learning applications, such as high-speed computing and low power consumption. Converting deep learning models into models suitable for such chips is therefore significant for improving the efficiency and performance of deep learning applications.
However, existing model conversion tools are seldom aimed at automatic conversion between different frameworks and chips; they offer a single conversion function, which makes them narrow in use and hard to generalize. Moreover, conventional model conversion cannot compare the effects before and after conversion, so the quality of the finally converted model cannot be guaranteed. How to guarantee the usability of the converted model is a problem that currently needs to be addressed.
Based on this, embodiments of the disclosure provide a model conversion method, a related device, and a medium, in which an image repository is provided and the required target images, such as the model conversion image and the model verification images, are pulled from it. Different target images, or combinations of several target images, are invoked to realize automated model conversion, model verification, and related processes. Providing separate target images decouples the different functional flows, so the model conversion image can conveniently be deployed onto a heterogeneous chip framework (the node where the target chip is located) to convert the original model into the target model; this supports compatibility with, and extension to, different heterogeneous chips and enables conversion to various target models. The model conversion image is deployed to the node where the target chip is located, and the original model is converted into the target model corresponding to the target chip through the model conversion image. The original model verification image and the target model verification image then verify the original model before conversion and the target model after conversion respectively, and a conversion deviation value that measures the conversion effect is obtained from the first and second verification results, realizing automatic verification of the converted target model. Based on this verification result (the conversion deviation value), an image is built for a target model that meets the requirements and stored in the image repository for later use, thereby ensuring the usability of the converted model.
Compared with traditional model conversion schemes, which are single-function and time-consuming, the model conversion method of the embodiments of the disclosure realizes an automated conversion process through built-in algorithms and images, can convert a model to suit different framework types, supports automatic accuracy evaluation of the converted model, and reduces manual intervention. Meanwhile, the user can confirm from the automatic accuracy evaluation result whether the model conversion succeeded, including whether the converted model can run on the target device and whether accuracy has been lost.
System architecture and scenarios to which embodiments of the present disclosure are applied
Fig. 1 is a system architecture diagram to which a model conversion method according to an embodiment of the present disclosure is applied. It includes an object terminal 130, the Internet 120, a server 110, and the like.
The server 110 is a computer system capable of providing a model conversion processing service to the object terminal 130, and is required to have higher stability, security, and performance than the object terminal 130. The server 110 may also be a database capable of providing a data acquisition service to the object terminal 130. The server 110 may be one high-performance computer in a network platform, a cluster of multiple high-performance computers, a portion of one high-performance computer (e.g., a virtual machine), a combination of portions of multiple high-performance computers (e.g., virtual machines), and so on. The server 110 may communicate with the object terminal 130 through a wired or wireless connection, or via the Internet 120, to exchange data.
The object terminal 130 is a device that displays the result of the model conversion process for the user to view, or that provides an operation interface for uploading and configuring task information related to the model conversion; the user may input the data set required for the model conversion process and/or various task information through the object terminal 130. The object terminal 130 includes various forms such as a desktop computer, a laptop computer, a PDA (personal digital assistant), a mobile phone, an in-vehicle terminal, a home theater terminal, a dedicated terminal, and a tablet. It can be a single device or a set of multiple devices; for example, multiple devices connected through a local area network, sharing a display device, may work cooperatively as one terminal. The terminal may also communicate with the server 110, through a wired or wireless connection or via the Internet 120, to exchange data.
Fig. 2 is a flow interaction schematic diagram of the services of a model conversion system to which the model conversion method of the embodiment of the present disclosure is applied. As shown in Fig. 2, the model conversion system includes, but is not limited to, an image build service, a model conversion service, a model evaluation service, a model verification service, a flow management service, a storage service, and an image repository.
In the preparation stage of the basic environment, the service operation images required by each stage (model evaluation, model verification, and the like) are built through the image build service and uploaded to the image repository.
The model conversion service completes the conversion from the original model to the target model by invoking the model conversion image on the source code, data, and base image of the algorithm model.
The system pulls the model evaluation service from the image repository to evaluate the original model and the target model, and returns the evaluation results to the flow management service. The evaluation result of the original model is compared with that of the target model to judge whether the converted target model meets the requirements.
The system pulls the model verification service from the image repository to verify the original model and the target model, and returns the verification results to the flow management service. Whether the converted target model meets the requirements is judged according to the verification results of the original model and the target model.
The flow management service builds a target model image from a converted target model whose accuracy loss is small, uploads the target model image to the image repository, uploads the corresponding target model to the storage service, and establishes the relation between the target model image and the storage path of the target model for subsequent deployment and application of the target model.
The model conversion system can fully automate the conversion of a single model input into diversified model outputs. Specifically, through the flow management service and three business function services (the model conversion service, the model evaluation service, and the model verification service), functions such as model input, conversion, quantization, verification, evaluation, and package generation are automated end to end. The system provides fully automatic, one-stop model conversion capability, supports model conversion for heterogeneous platforms, is functionally extensible, and can greatly reduce the time spent on manual operation and the possibility of errors.
Exemplarily, based on the above model conversion system, the model conversion method of the embodiment of the disclosure specifically includes the following steps:
the system obtains a target image from the image repository, wherein the target image comprises a model conversion image, an original model verification image, and a target model verification image;
the model conversion service deploys the model conversion image to a node with a target chip;
the model conversion service acquires the data and algorithm source code corresponding to the original model, and converts the original model into a target model whose model framework is adapted to the framework of the target chip;
the model evaluation service invokes the original model verification image to verify the original model, obtaining a first verification result;
the model evaluation service invokes the target model verification image to verify the target model, obtaining a second verification result;
the flow management service determines a conversion deviation value according to the first verification result and the second verification result; and
the flow management service builds a target model image when the conversion deviation value is smaller than a preset threshold, and sends the target model image to the image repository.
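The exemplary steps above can be sketched as a small orchestration routine; each service call is injected as a callable, and all names, the scalar verification results, and the default threshold are illustrative assumptions:

```python
from typing import Callable

def run_conversion_pipeline(pull_images: Callable[[], dict],
                            deploy: Callable[[str], None],
                            convert: Callable[[], str],
                            verify_original: Callable[[], float],
                            verify_target: Callable[[str], float],
                            build_and_push: Callable[[str], None],
                            threshold: float = 0.01) -> float:
    """Orchestrate the exemplary flow: pull images, deploy, convert,
    verify both models, compute the conversion deviation, and build and
    push the target model image only when the deviation is acceptable."""
    images = pull_images()                 # obtain target images
    deploy(images["model_conversion"])     # deploy to the target-chip node
    target_model = convert()               # original model -> target model
    first = verify_original()              # first verification result
    second = verify_target(target_model)   # second verification result
    deviation = abs(first - second)        # conversion deviation value
    if deviation < threshold:
        build_and_push(target_model)       # build image, send to repository
    return deviation
```

Injecting the services as callables mirrors the decoupling the disclosure attributes to separate images: each functional flow can be replaced independently of the orchestration logic.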
In the above example, by providing an image repository, the required target images, such as the model conversion image and the model verification images, are pulled from the repository. Different target images, or combinations of several target images, are invoked to realize automated model conversion, model verification, and related processes. Providing separate target images decouples the different functional flows, so the model conversion image can conveniently be deployed onto a heterogeneous chip framework (the node where the target chip is located) to convert the original model into the target model; this supports compatibility with, and extension to, different heterogeneous chips and enables conversion to various target models. The model conversion image is deployed to the node where the target chip is located, and the original model is converted into the target model corresponding to the target chip through the model conversion image. The original model verification image and the target model verification image then verify the original model before conversion and the target model after conversion respectively, and a conversion deviation value that measures the conversion effect is obtained from the first and second verification results, realizing automatic verification of the converted target model. Based on this verification result (the conversion deviation value), an image is built for a target model that meets the requirements and stored in the image repository for later use, thereby ensuring the usability of the converted model.
It should be understood that the foregoing merely describes some application scenarios of the present disclosure. Business scenarios to which the present disclosure can be applied include, but are not limited to, the specific embodiments set forth above.
General description of embodiments of the disclosure
It is emphasized that the disclosed embodiments can be adapted to a variety of application scenarios, such as computer vision, natural language processing, speech recognition, autonomous driving, and edge computing. In the related art, existing model conversion schemes rarely address automatic conversion between different frameworks and chips; they provide only a single conversion function, so the effects before and after model conversion cannot be compared, and the quality of the finally converted model cannot be guaranteed. Some embodiments of the present disclosure provide a model conversion method, related apparatus, and medium that aim to ensure the usability of the converted model.
The model conversion method is a method for automatically converting an original model into a target model adapted to a target chip by invoking different target images.
The model conversion method of the embodiments of the present disclosure can be executed entirely in a server, entirely in a terminal, or partly in a server and partly in a terminal.
As shown in fig. 3, according to one embodiment of the present disclosure, a model conversion method includes:
Step 310, obtaining a target image from an image repository, wherein the target image comprises a model conversion image, an original model verification image, and a target model verification image;
Step 320, deploying the model conversion image to a node with a target chip;
Step 330, obtaining an original model through the model conversion image and converting the original model into a target model, wherein the target model can be deployed on the target chip;
Step 340, invoking the original model verification image to verify the original model to obtain a first verification result;
Step 350, invoking the target model verification image to verify the target model to obtain a second verification result;
Step 360, determining a conversion deviation value according to the first verification result and the second verification result;
Step 370, constructing a target model image under the condition that the conversion deviation value is smaller than a preset threshold;
Step 380, sending the target model image to the image repository.
Steps 310 to 380 are briefly described below.
In step 310, the image repository refers to the data storage space used to store various types of images. These include, but are not limited to, basic images (such as the model conversion image and the data initialization image) and pre-built service running images (such as the original model verification image, the target model verification image, the original model evaluation image, and the target model evaluation image).
The target image refers to the relevant image that needs to be used in the model conversion process.
In step 320, the target chip refers to the chip on which the original model for deep learning is intended to be deployed, trained, or run. The target chip may be a specific chip, or may refer to a corresponding chip framework. The node having the target chip refers to a computing resource node equipped with the target chip, for example a server node, a computer, or a virtual machine node having the target chip framework.
In step 330, the model conversion image refers to an image file that converts the original model into the target model. The target model refers to a model that can be adapted to the target chip; illustratively, the framework to which the target model corresponds is the same as the framework of the target chip. Because the target model matches the target chip, it can be deployed on the corresponding target chip, and model training or computation can be performed utilizing the target chip's capabilities, such as high-speed computation and low power consumption.
In step 340, the original model verification image refers to an image file capable of performing a verification process on the original model and obtaining a verification result. The first verification result refers to a verification result corresponding to the original model.
In step 350, the target model verification image refers to an image file capable of executing a verification process on the target model and obtaining a verification result. The second verification result refers to the verification result corresponding to the target model, namely, the verification result of the converted model.
In step 360, the conversion deviation value is calculated from the first verification result and the second verification result and is used to represent the deviation of the target model (the converted model) relative to the original model. For example, the conversion deviation value may be obtained by subtracting the first verification result from the second verification result and taking the standard deviation of the differences. It should be noted that the first verification result and the second verification result each comprise a plurality of result values; the number of result values corresponds to the number of groups of input vectors used for verification.
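As an illustration, and assuming each verification result is simply a list of scalar outputs (one per group of input vectors), the conversion deviation value described above could be computed as follows; the numbers and threshold are made up for demonstration:

```python
import statistics

# Hypothetical verification results: one scalar output per group of input
# vectors (five groups here). These numbers are illustrative only.
first_verification = [0.912, 0.874, 0.659, 0.733, 0.481]   # original model
second_verification = [0.910, 0.876, 0.655, 0.735, 0.480]  # converted target model

# Subtract the first verification result from the second, elementwise...
differences = [b - a for a, b in zip(first_verification, second_verification)]

# ...and take the standard deviation of the differences as the
# conversion deviation value.
conversion_deviation = statistics.pstdev(differences)

# Step 370: compare against a preset threshold (value assumed here).
preset_threshold = 0.01
meets_requirement = conversion_deviation < preset_threshold
```

A deviation close to zero means the converted model reproduces the original model's outputs almost exactly, so the target model image may be built and pushed.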
In step 370, the preset threshold is a threshold for determining whether the conversion deviation value meets the requirement. When the conversion deviation value is greater than or equal to the preset threshold, the deviation between the converted target model and the original model is too large, and the converted target model does not meet the requirement. When the conversion deviation value is smaller than the preset threshold, the deviation between the converted target model and the original model is within an acceptable range, and the converted target model meets the requirement.
The object model image refers to an image file of the object model. By constructing the target model mirror image, the subsequent deployment and application of the target model can be more conveniently carried out, and particularly, the deployment on a heterogeneous system can be better compatible.
In step 380, the target model image is sent to an image repository and stored. The target image can be pulled directly from the image repository when it is subsequently needed to be used or deployed.
In the embodiments of steps 310 to 380 above, by providing an image repository, the required target images, such as those for model conversion and model verification, are pulled from the repository, and different target images, or combinations of several target images, are invoked to realize automated model conversion, model verification, and related flows. Building separate target images decouples the different functional flows, so the model conversion image can be conveniently deployed on the heterogeneous chip framework (the node where the target chip is located) to convert the original model into the target model; this supports compatibility with and extension to different heterogeneous chips, realizing conversion to a variety of target models. The model conversion image is deployed to the node where the target chip is located, and the original model is converted through the model conversion image into the target model corresponding to the target chip. The original model verification image and the target model verification image then verify the original model before conversion and the target model after conversion respectively, and a conversion deviation value by which the conversion quality can be judged is obtained from the first verification result and the second verification result, realizing automated verification of the converted target model. Based on this verification result (the conversion deviation value), an image is built for a target model that meets the requirements and stored in the image repository for later invocation, thereby ensuring the usability of the converted model.
The foregoing is a general description of steps 310 through 380, and the detailed description of the implementation of steps 310 through 380 follows.
In step 310, a target image is obtained from an image repository, wherein the target image includes a model conversion image, an original model verification image, and a target model verification image.
In one embodiment, based on the model conversion system provided in fig. 2, a basic environment preparation flow is further included before step 310. The basic environment preparation flow includes, but is not limited to:
preparing a quantized data set, and uploading the quantized data set to a system for subsequent model conversion and model quantization, wherein the quantized data set is used for model quantization and is determined based on a model to be quantized;
preparing an evaluation algorithm, a verification algorithm and an evaluation data set of an original model, and uploading the evaluation algorithm, the verification algorithm and the evaluation data set to a system;
Preparing a target model evaluation algorithm and a verification algorithm, and uploading the target model evaluation algorithm and the verification algorithm to a system;
The image construction service uses a basic image built into the system and the algorithm code resources uploaded by the user to create an image-building task; running the task produces the target images, which include the original model evaluation image, the original model verification image, the target model evaluation image, and the target model verification image. The algorithm code resources include the evaluation algorithm code and verification algorithm code of the original model, the evaluation algorithm code and verification algorithm code of the target model, and the like.
In step 320, a model conversion image is deployed onto a node with a target chip.
In one embodiment, based on the model conversion system provided in FIG. 2, the model conversion service pulls the model conversion image from the image repository. The model conversion service then deploys the model conversion image on the node with the target chip through the container cluster management platform, so that the model conversion image runs on the node of the target chip.
By deploying the model conversion image on the nodes where different target chips are located, the original model can be converted into different target models corresponding to those chips. A single original model can thus be converted into diversified target models, and different target chips can be accommodated by extending the nodes in the container cluster, thereby realizing pluggable, extensible conversion to different target models.
In step 330, the original model is obtained by a model conversion mirror and converted into a target model, wherein the target model can be deployed on a target chip.
In one embodiment, the target image further comprises a data initialization image. Step 330 includes:
Deploying the data initialization mirror image to a node with a target chip;
obtaining data corresponding to an original model through a data initialization mirror image, and sharing the data to a model conversion mirror image;
and determining an original model based on data corresponding to the original model through a model conversion mirror image, and converting the original model into a target model.
In this embodiment, the data initialization mirroring refers to a mirroring file for acquiring/calling data required for a corresponding flow.
The data corresponding to the original model comprises an algorithm program of the original model, model parameters corresponding to the trained original model and other related data of the original model. The data initialization mirror image shares the data corresponding to the original model to the model conversion mirror image, so that the model conversion mirror image can obtain the original model which needs to be converted and related data.
In this embodiment, the data required for model conversion is acquired through the data initialization image. During the model conversion process, the system deploys the data initialization image on the corresponding node, so the node can automatically acquire the related data; the user does not need to perform special configuration such as setting up a data acquisition process or a data acquisition path, which reduces manual operation and simplifies the flow.
As an example, the model conversion flow is described in detail based on the model conversion system provided in fig. 2. The model conversion system corresponding to fig. 4 is a part of the model conversion system corresponding to fig. 2. As shown in fig. 4, the specific process of model conversion performed by the model conversion system involves interactions among a Web terminal, a business database, a flow management service, a model conversion service, a storage service, and an image repository, and includes, but is not limited to, the following steps:
(1) The Web terminal provides an operation interface for the user; the user can input or modify task information in advance at the Web terminal, and the Web management service responsible for data interaction with the Web terminal submits the task information and uploads it to the storage service and the business database respectively.
(2) The flow management service retrieves the task information from the business database, converts it into YAML format (a markup language for data serialization), and submits it to the model conversion service for corresponding service scheduling.
(3) The model conversion service pulls the data initialization image and the model conversion image from the image repository and, through the scheduling mechanism of the container cluster management platform, places them to run on a node in the cluster that has the target chip.
(4) The data initialization mirror running on the node downloads the algorithmic code of the original model, the model parameters and data characterizing the original model from the storage service and shares these data to the model conversion mirror via the Pod (container) sharing mechanism.
(5) The model conversion mirror executes a mirror default conversion script that converts the original model into an intermediate representation format (ONNX).
(6) The model conversion image maps operators between the original framework and the target framework, copies the operator weights, and converts the intermediate representation format (ONNX) into a target model supporting the target chip.
As shown in FIG. 4, when the framework of the original model is MindSpore, the corresponding format is .mindir; the model is first converted from .mindir to .onnx, operator mapping is performed between the original framework (MindSpore) and the target framework (TensorRT), and the model is converted into the .engine format, finally obtaining a target model adapted to a target chip of the TensorRT framework.
As shown in FIG. 4, when the framework of the original model is TensorFlow, the corresponding format is .pb; the model is first converted from .pb to the .onnx format, operator mapping is performed between the original framework (TensorFlow) and the target framework (TensorRT), and the model is converted into the .engine format, finally obtaining a target model adapted to a target chip of the TensorRT framework.
As shown in FIG. 4, when the framework of the original model is PyTorch, the corresponding format is .pt; the model is first converted from .pt to .onnx, operator mapping is performed between the original framework (PyTorch) and the target framework (TensorRT), and the model is converted into the .engine format, finally obtaining a target model adapted to a target chip of the TensorRT framework.
As shown in FIG. 4, when the framework of the original model is PyTorch and the target framework is MindSpore, the model is first converted from .pt to the .onnx format, operator mapping is performed between the original framework (PyTorch) and the target framework (MindSpore), and the model is converted into the .mindir format, finally obtaining a target model adapted to a target chip of the MindSpore framework.
(7) When the user has a quantization requirement, the corresponding quantized data set uploaded in the preparation stage is used to quantize the converted model.
(8) The obtained target model and the compressed/quantized model (if any) are uploaded to the storage service.
Throughout this process, the flow management service is responsible for tracking the task state, updating it in real time to the business database, and displaying it to the user through the Web terminal; the task state includes the conversion stage of the original model, whether the conversion is completed, conversion success/failure, and so on.
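Step (2) above, in which the flow management service serializes task information to YAML before submitting it to the model conversion service, can be sketched as follows. All field names and addresses are hypothetical; a real implementation would typically use a YAML library such as PyYAML, but a minimal flat-mapping emitter keeps the sketch self-contained:

```python
# Hypothetical task information as retrieved from the business database;
# every field name and value here is illustrative, not the actual schema.
task_info = {
    "task_name": "resnet50-to-tensorrt",
    "source_framework": "pytorch",
    "target_framework": "tensorrt",
    "model_address": "s3://models/resnet50.pt",
}

def to_yaml(info: dict) -> str:
    # Emit one "key: value" line per field (flat mappings only).
    return "\n".join(f"{key}: {value}" for key, value in info.items())

# The resulting YAML text is what would be submitted to the model
# conversion service for scheduling.
yaml_text = to_yaml(task_info)
```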
It should be noted that corresponding target images are built and mounted onto nodes in a cluster constructed on the container cluster management platform, and the nodes run them to provide different functional services. Because the container cluster can, through its configuration management interface, attach any node meeting the interface requirements in real time, the types of convertible models can be extended in real time by adding nodes with different chips or different frameworks, realizing pluggable model conversion.
In the above embodiment, data storage is mainly performed using the business database and the storage service, while task scheduling and container management are performed using the container cluster management platform. The model conversion service and the flow management service together open up an automated model conversion flow. As shown in FIG. 4, the system can support mutual conversion among models of multiple frameworks, such as MindSpore, TensorFlow, PyTorch, and TensorRT, and is extensible.
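The conversion routes shown in Fig. 4 all share the same shape: export the source format to the ONNX intermediate representation, then lower it to the target framework's format. A sketch of that routing, with the file extensions taken from the text (the function and table names are illustrative), might look like this:

```python
# Source-framework and target-framework formats named in the text.
SOURCE_FORMAT = {"mindspore": ".mindir", "tensorflow": ".pb", "pytorch": ".pt"}
TARGET_FORMAT = {"tensorrt": ".engine", "mindspore": ".mindir"}

def conversion_path(source_framework: str, target_framework: str) -> list:
    # Every route passes through the ONNX intermediate representation:
    # source format -> .onnx -> target format.
    return [SOURCE_FORMAT[source_framework], ".onnx",
            TARGET_FORMAT[target_framework]]
```

For example, `conversion_path("pytorch", "tensorrt")` yields `['.pt', '.onnx', '.engine']`, matching the PyTorch-to-TensorRT route in Fig. 4.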
In an embodiment, the target image further includes an original model evaluation image and a target model evaluation image. After step 330, the model conversion method of the present disclosure further includes:
When the target model has been quantized, the original model evaluation image is called to evaluate the original model to obtain a first performance result of the original model, and the target model evaluation image is called to evaluate the quantized target model to obtain a second performance result of the quantized target model.
In this embodiment, the original model evaluation image refers to an image file generated based on an evaluation algorithm for an original model before performing model evaluation. The original model evaluation mirror image can be used for carrying out a series of automatic operations related to the pulling, evaluation, output result and the like of the original model.
The target model evaluation image refers to an image file generated based on an evaluation algorithm corresponding to the target model before model evaluation. The target model evaluation mirror image can be used for carrying out a series of automatic operations related to the pulling, evaluation, output result and the like of the target model.
Quantization is a process of converting high-precision floating point number parameters in a deep learning model into low-precision integers or fixed point numbers to reduce the storage space and computational resource requirements of the model while maintaining the performance of the model as much as possible.
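The idea can be illustrated with generic symmetric int8 quantization; this is a textbook scheme shown only for illustration, not the specific quantization algorithm used by the system:

```python
# Generic symmetric int8 quantization: map floating-point weights onto
# integers in [-127, 127] with a single scale factor. Illustrative only.

def quantize_int8(weights):
    # Choose the scale so the largest-magnitude weight maps to 127.
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize_int8(quantized, scale):
    # Recover approximate floating-point values from the integers.
    return [q * scale for q in quantized]

weights = [0.51, -1.27, 0.02, 0.84]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)
# Each restored value differs from the original by at most one
# quantization step, while per-parameter storage drops from
# 32-bit float to 8-bit integer.
```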
The first performance result is obtained by evaluating the original model and is used for representing the result data of the overall performance of the original model. The second performance result is obtained by evaluating the target model and is used for representing the result data of the overall performance of the target model.
In the embodiment, the original model evaluation mirror image and the target model evaluation mirror image are called, so that automatic evaluation of the original model and the target model is realized, user operation is reduced, and the efficiency of the whole process of model conversion is improved.
In an embodiment, before the original model is evaluated by calling the original model evaluation image to obtain the first performance result of the original model, the method further includes:
acquiring evaluation task information, and generating a first evaluation configuration file and a second evaluation configuration file according to the evaluation task information;
invoking an original model evaluation mirror image to evaluate the original model to obtain a first performance result of the original model, wherein the method comprises the following steps:
According to the first evaluation configuration file, sending an original model evaluation mirror image address, a data set address and an original model address to an artificial intelligence development platform;
Invoking an artificial intelligence development platform to pull an original model evaluation image, a first evaluation data set and an original model through the original model evaluation image address, the data set address and the original model address so that the artificial intelligence development platform evaluates the original model through the original model evaluation image according to the first evaluation data set to obtain a first performance result;
and calling a target model evaluation mirror to evaluate the quantized target model to obtain a second performance result of the quantized target model, wherein the method comprises the following steps of:
according to the second evaluation configuration file, sending the target model evaluation mirror image address, the data set address and the target model address to the container cluster management platform;
and calling the container cluster management platform to pull the target model evaluation image, the second evaluation data set and the target model through the target model evaluation image address, the data set address and the target model address so that the container cluster management platform evaluates the target model through the target model evaluation image according to the second evaluation data set to obtain a second performance result.
In this embodiment, the evaluation task information refers to data information required for evaluating a model, such as a sample data set, an original model, an algorithm, a framework, and the like, corresponding to the model. The first evaluation profile refers to a file for defining and managing various parameters and settings of the evaluation task execution of the original model. The second evaluation profile refers to a file for defining and managing various parameters and settings of the execution of the evaluation task of the target model.
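As a sketch of the above (all field names, addresses, and the builder function are hypothetical), the flow management service might derive the two evaluation configuration files from one set of evaluation task information like this:

```python
# Hypothetical evaluation task information; addresses are illustrative.
evaluation_task = {
    "dataset_address": "s3://datasets/eval-set",
    "original_model_address": "s3://models/model.pt",
    "target_model_address": "s3://models/model.engine",
    "original_eval_image": "registry/original-model-eval:latest",
    "target_eval_image": "registry/target-model-eval:latest",
}

def build_eval_configs(task: dict):
    # First evaluation configuration file: scheduled onto the artificial
    # intelligence development platform to evaluate the original model.
    first = {
        "platform": "ai-development-platform",
        "image_address": task["original_eval_image"],
        "dataset_address": task["dataset_address"],
        "model_address": task["original_model_address"],
    }
    # Second evaluation configuration file: scheduled onto the container
    # cluster management platform to evaluate the target model.
    second = {
        "platform": "container-cluster-management-platform",
        "image_address": task["target_eval_image"],
        "dataset_address": task["dataset_address"],
        "model_address": task["target_model_address"],
    }
    return first, second

first_config, second_config = build_eval_configs(evaluation_task)
```

Note that both configurations share the same data set address, so the two models are evaluated against the same data.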
As an example, the model evaluation flow is described in detail based on the model conversion system provided in fig. 2. The model conversion system corresponding to fig. 5 is a part of the model conversion system corresponding to fig. 2. As shown in fig. 5, the specific process of model evaluation performed by the model conversion system involves interactions among a Web terminal, a business database, a flow management service, a model evaluation service, a storage service, and an image repository, and includes, but is not limited to, the following steps:
(1) The user can select evaluation task information (such as a data set, an original model, an algorithm, and a framework) through the Web management service, and the selected information is stored in the business database and the storage service. In other examples, the Web management service may itself select the task information corresponding to the model currently to be evaluated, based on evaluation task information previously input by the user for each model.
(2) The flow management service retrieves the evaluation task information from the business database, generates from it task configuration files applicable to scheduling by the AI development platform and the container cluster management platform, and submits them to the model evaluation service, while also writing the task configuration files into the business database. In this example, the original model is evaluated by the AI development platform, using the task configuration file applicable to the AI development platform, i.e., the first evaluation configuration file; the target model is evaluated by the container cluster management platform, using the task configuration file applicable to the container cluster management platform, i.e., the second evaluation configuration file.
(3) According to the first evaluation configuration file, the model evaluation service submits the image address, data set address, and algorithm address to a cluster constructed based on the AI development platform service; the cluster performs task scheduling based on the configuration file, pulls the original model evaluation image from the image repository, and runs it, and the original model evaluation image pulls the algorithm, model, data set, and so on from the storage service for evaluation.
(4) After the original model evaluation mirror image completes the model evaluation, the evaluation result (first performance result) is submitted to the service database.
(5) According to the second evaluation configuration file, the model evaluation service submits the image address, data set address, and algorithm address to a cluster constructed based on the container cluster management platform; the cluster performs task scheduling based on the task configuration and pulls the target model evaluation image from the image repository. The target model evaluation image then pulls the algorithm, model, data set, and so on from the storage service for evaluation.
(6) After the target model evaluation image completes the model evaluation, the evaluation result (the second performance result) is submitted to the business database.
In the process, the flow management service is responsible for tracking the task state in the whole process, updating the task state to the service database in real time, and displaying the evaluation task information to the user through the Web terminal.
In the above examples, structured and unstructured data storage is mainly performed using the business database and the storage service. Based on the defined interface specification, calls to platforms of different architectures can be implemented: evaluation, result reporting, and storage for the original model are realized on the AI development platform, and for the target model on the container cluster management platform. According to the user's task configuration information, the system automatically generates the operation rules required to execute the task, and evaluation and result comparison of the original model and the target model can then be realized by invoking the corresponding services.
In step 340, the original model verification mirror image is invoked to verify the original model, and a first verification result is obtained.
In one embodiment, step 340 includes:
Acquiring verification task information, and generating a first verification configuration file according to the verification task information;
According to the first verification configuration file, sending the original model verification image address and the original model address to the artificial intelligence development platform;
scheduling the artificial intelligence development platform to pull the original model verification image and the original model through the original model verification image address and the original model address, so that the artificial intelligence development platform verifies the original model through the original model verification image to obtain a first verification result.
In this embodiment, the verification task information refers to data information required for verifying the model. The first verification profile refers to a file for defining and managing various parameters and settings for verification task execution of the original model.
In the embodiment, the original model verification mirror image is deployed into the AI development platform, and the AI development platform and the original model verification mirror image are utilized to realize automatic verification of the original model, so that manual intervention is reduced, and the efficiency of the whole flow of model conversion is improved.
In step 350, the target model verification mirror image is called to verify the target model, so as to obtain a second verification result.
In one embodiment, step 350 includes:
Acquiring verification task information, and generating a second verification configuration file according to the verification task information;
According to the second checking configuration file, the target model checking mirror image address and the target model address are sent to the container cluster management platform;
Scheduling the container cluster management platform to pull the target model verification image and the target model through the target model verification image address and the target model address, so that the container cluster management platform verifies the target model through the target model verification image to obtain a second verification result.
In the present embodiment, the second verification configuration file refers to a file of various parameters and settings for defining and managing verification task execution of the target model.
In the embodiment, the target model verification mirror image is deployed into the container cluster management platform, and the container cluster management platform and the target model verification mirror image are utilized to realize automatic verification of the target model, so that manual intervention is reduced, and the efficiency of the overall process of model conversion is improved.
It should be noted that steps 340, 350, and 360 all belong to model verification; the model verification process of steps 340 to 360 is described in detail below by way of an example.
As an example, the model verification flow is described in detail based on the model conversion system provided in fig. 2. The model conversion system corresponding to fig. 6 is a part of the model conversion system corresponding to fig. 2. As shown in fig. 6, the specific process of model verification performed by the model conversion system involves interactions among a Web terminal, a business database, a flow management service, a model verification service, a storage service, and an image repository, and includes, but is not limited to, the following steps:
(1) The user can select verification task information (such as the original model, algorithm, and framework) through the Web management service, and the selected information is stored in the business database and the storage service. In other examples, the Web management service may itself select the task information corresponding to the model currently to be verified, based on verification task information previously input by the user for each model.
(2) The flow management service retrieves the verification task information from the business database, generates task configurations suitable for scheduling by the AI development platform and the container cluster management platform according to the verification task information, submits the task configurations to the model verification service, and writes the verification task information to the business database. In this example, the original model is verified by the AI development platform, whose task configuration file serves as the first verification configuration file, and the target model is verified by the container cluster management platform, whose task configuration file serves as the second verification configuration file.
(3) The model verification service submits an image address, a model address and an algorithm address to a cluster built on the AI development platform service according to the first verification configuration file. The cluster performs task scheduling based on the configuration file, pulls the original model verification image from the image repository and runs it, and the original model verification image pulls the original model verification algorithm and the original model from the storage service.
(4) When the original model verification algorithm is executed, five groups of random vectors are generated and input into the original model, yielding five corresponding groups of verification inference results. It should be noted that the number of groups of random vectors may be set arbitrarily as needed.
(5) After the original model verification image completes verification inference, it uploads the original model, the five groups of input random vectors and the corresponding verification inference results to the storage service.
(6) The model verification service submits the image address, model address and algorithm address to a cluster built on the container cluster management platform according to the second verification configuration file. The cluster performs task scheduling based on the configuration file, pulls the target model verification image from the image repository and runs it, and pulls the target model verification algorithm, the target model, the five groups of input vectors and the original model's verification inference results from the storage service. When the target model verification algorithm is executed, the five groups of input vectors are input into the target model, yielding five corresponding groups of verification inference results for the target model.
(7) The target model's verification inference results are subtracted from the original model's verification inference results, the standard deviation of the difference is computed to obtain the conversion deviation value, and the conversion deviation value is submitted to the business database.
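Steps (4) through (7) above can be sketched as follows. This is a minimal illustration assuming NumPy arrays and a callable standing in for each model's inference entry point, not the disclosure's actual implementation:

```python
import numpy as np

def verification_inference(model_fn, num_groups=5, input_dim=8, seed=0):
    """Generate random input vectors and collect inference results.
    model_fn is a hypothetical stand-in for the model's inference call."""
    rng = np.random.default_rng(seed)
    inputs = [rng.standard_normal(input_dim) for _ in range(num_groups)]
    return inputs, [model_fn(x) for x in inputs]

def conversion_deviation(original_results, target_results):
    """Standard deviation of the element-wise difference between the
    original and target models' verification inference results."""
    diff = np.asarray(original_results) - np.asarray(target_results)
    return float(np.std(diff))

# Toy example: a linear model and its (here, identical) converted version.
weights = np.linspace(0.0, 1.0, 8)
inputs, original_out = verification_inference(lambda x: x @ weights)
_, target_out = verification_inference(lambda x: x @ weights)
deviation = conversion_deviation(original_out, target_out)  # 0.0 for identical models
```

An identical conversion yields a deviation of zero; a lossy (e.g., quantized) conversion yields a positive value that can then be compared against the preset threshold in step 370.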
Throughout this process, the flow management service is responsible for tracking the task state, updating the task state to the business database in real time, and presenting the verification task information in the business database to the user through the Web terminal.
In the above example, structured and unstructured data are stored primarily in the business database and the storage service. Verification results of the original model are reported and stored via the AI development platform, and verification results of the target model are reported and stored via the container cluster management platform. Based on the user's task configuration information, the system automatically generates the operating rules required to execute the verification task, so that comparison of the original model's and the target model's verification results can be realized by invoking the scheduling service and functional services.
In step 370, in the case where the conversion deviation value is less than a preset threshold, a target model image is constructed.
In the model evaluation embodiment following step 330, where the target model is quantized, the original model and the quantized target model need to be evaluated to obtain a first performance result and a second performance result. In that case, in one embodiment, step 370 includes determining whether to construct the target model image according to the conversion deviation value, the first performance result and the second performance result.
In this embodiment, the difference between the quantized target model's overall performance and the original model's overall performance can be determined by comparing the first performance result with the second performance result. When the difference between the first performance result and the second performance result does not exceed a preset upper limit, the quantized target model's performance meets the requirement. When the quantized target model's performance meets the requirement and the conversion deviation value also meets the requirement, the target model is determined to be usable, and the target model image is constructed for subsequent scheduling and deployment of the target model.
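The decision described above can be expressed as a small predicate. The threshold values below are illustrative placeholders, as the disclosure leaves the concrete values to configuration:

```python
def should_build_target_image(deviation, perf_original, perf_target,
                              deviation_threshold=1e-3, perf_gap_limit=0.02):
    """Return True when the conversion deviation is below the preset
    threshold and the performance gap between the original and quantized
    target model does not exceed the preset upper limit.
    All threshold values are hypothetical defaults for illustration."""
    performance_gap = abs(perf_original - perf_target)
    return deviation < deviation_threshold and performance_gap <= perf_gap_limit
```

Both conditions must hold: a small conversion deviation alone does not trigger the image build if quantization has degraded overall performance beyond the preset upper limit.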
In one embodiment, step 370 comprises:
generating an image build file in the case where the conversion deviation value is less than the preset threshold, the image build file being used to generate the target model image;
packaging the image build file and the target model into a build data package, and sending the build data package to the container cluster management platform;
scheduling the container cluster management platform to unpack the build data package, so that the container cluster management platform constructs the target model image according to the image build file and the target model;
invoking the container cluster management platform to upload the target model to a storage service; and
recording the association between the target model image and the storage path of the target model.
In this embodiment, the image build file is a configuration file used to build a container image, and the build data package is a data package containing the files and data required to construct the target model image.
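A minimal sketch of generating such an image build file is shown below. The base image name, file paths and entrypoint are hypothetical placeholders; a real build file would depend on the target chip's runtime:

```python
def render_image_build_file(base_image, model_file, entrypoint):
    """Render a Dockerfile-style image build file that copies the target
    model into the image. All names here are illustrative placeholders."""
    lines = [
        f"FROM {base_image}",
        f"COPY {model_file} /models/{model_file}",
        f'ENTRYPOINT ["{entrypoint}"]',
    ]
    return "\n".join(lines)

build_file = render_image_build_file("chip-runtime:latest",
                                     "target_model.bin",
                                     "/usr/local/bin/serve_model")
```

The rendered text is what gets packaged into the build data package alongside the model and algorithm files, and later handed to the image build tool.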
Illustratively, the packaging and build flow is described in detail based on the model conversion system provided in fig. 2; the model conversion system corresponding to fig. 7 is a part of the model conversion system corresponding to fig. 2. As shown in fig. 7, the specific packaging and build process of the model conversion system involves interactions among the business database, the flow management service, the image build service, the storage service and the image repository, and includes, but is not limited to, the following steps:
(1) The flow management service checks whether the business database contains a conversion deviation value and an evaluation result. If so, it judges whether the set standard is met. If they are absent, a base runtime image is built, and only the image build file on which the base runtime image depends is generated. In this disclosure, the set standard is a threshold chosen from empirical values, which can be modified through a configuration file; the threshold setting may differ from task to task.
(2) If the business database contains a conversion deviation value and an evaluation result that meet the standard (for example, the conversion deviation value is less than the threshold), the image build file on which the final target model image build depends is generated, and the image build file, algorithm files, model files and other objects are packaged into a complete build data package. The flow management service uploads the build data package to the storage service and then requests the image build service to execute the packaging and build flow. If the business database contains a conversion deviation value and an evaluation result that do not meet the standard, only the task state is updated.
(3) After receiving the request parameters, the image build service creates a build job in the cluster, downloads and mounts the build data package from the parameters into the job, unpacks the build data package in the job, moves files such as the image build file to the image build tool, and starts the build flow for the target model image using the default build path. As shown in fig. 7, where the base runtime image to be built is a model evaluation image (e.g., an original model evaluation image or a target model evaluation image), the build job in the cluster builds the model evaluation image and pushes the built image to the image repository.
(4) In the case of building the target model image, the job uploads the target model file to the storage service after unpacking the build data package, moves files such as the image build file to the image build tool, constructs the target model image using the default build path, and records the association between the image and the target model's storage path. The job pushes the built target model image to the image repository for storage, for subsequent invocation and deployment of the target model.
In the above example, whether the model conversion succeeded is judged according to the model's verification and evaluation results, and the model is packaged and stored once the conversion succeeds. Apart from manually selecting and submitting task information on the visual page of the Web management service, the whole system requires no manual intervention. Through the design of the scheduling service and the functional services, the system realizes efficient, diverse, one-stop model conversion and image packaging.
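The packaging of the image build file, algorithm files and model files into a single build data package, and its later unpacking by the build job, can be sketched with an in-memory tar archive. The file names are illustrative placeholders:

```python
import io
import tarfile

def pack_build_data_package(files):
    """Pack named byte blobs (image build file, algorithm files, model
    files) into one gzip-compressed tar archive, returned as bytes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()

def unpack_build_data_package(blob):
    """Unpack the build data package back into a name -> bytes mapping,
    as the build job would before moving files to the build tool."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        return {m.name: archive.extractfile(m).read()
                for m in archive.getmembers() if m.isfile()}

package = pack_build_data_package({
    "Dockerfile": b"FROM chip-runtime:latest",
    "target_model.bin": b"\x00\x01\x02",
})
```

A single archive keeps the build inputs together so the flow management service can upload one object to the storage service and the build job can mount and unpack it in one step.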
Apparatus and device descriptions of embodiments of the present disclosure
It will be appreciated that, although the steps in the flowcharts above are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, the order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts above may include multiple steps or stages that are not necessarily performed at the same moment but may be performed at different moments; nor is their order of execution necessarily sequential, as they may be performed in turn or alternately with at least some of the steps or stages in other steps.
Fig. 8 is a schematic structural diagram of a model conversion system 800 according to an embodiment of the disclosure. The model conversion system 800 includes:
a first device 810, configured to obtain a target image from an image repository, where the target image includes a model conversion image, an original model verification image and a target model verification image;
the first device 810, further configured to deploy the model conversion image onto a second device 820 having a target chip;
a second device 820, configured to obtain an original model through the model conversion image and convert the original model into a target model, where the target model can be deployed on the target chip;
the second device 820, configured to invoke the original model verification image to verify the original model, obtaining a first verification result;
the second device 820, configured to invoke the target model verification image to verify the target model, obtaining a second verification result;
the second device 820, configured to determine a conversion deviation value according to the first verification result and the second verification result;
the second device 820, configured to construct a target model image in the case where the conversion deviation value is less than a preset threshold;
the second device 820, configured to send the target model image to the image repository.
Optionally, the target image further comprises an original model evaluation image and a target model evaluation image;
The second device 820 is configured to invoke the original model evaluation image to evaluate the original model in the case where the target model is quantized, obtaining a first performance result of the original model;
the second device 820 is configured to evaluate the quantized target model through the target model evaluation image, obtaining a second performance result of the quantized target model;
the second device 820 is specifically configured to determine whether to construct the target model image according to the conversion deviation value, the first performance result and the second performance result.
Optionally, the first device 810 is configured to obtain evaluation task information, and generate a first evaluation configuration file and a second evaluation configuration file according to the evaluation task information;
The first device 810 is further configured to send the original model evaluation image address, the data set address, and the original model address to the second device 820 according to the first evaluation configuration file;
the second device 820 is configured to pull the original model evaluation image, the first evaluation data set, and the original model through the original model evaluation image address, the data set address, and the original model address;
The second device 820 is configured to evaluate the original model according to the first evaluation data set through the original model evaluation image, to obtain a first performance result;
The first device 810 is further configured to send the target model evaluation image address, the data set address, and the target model address to the second device 820 according to the second evaluation configuration file;
the second device 820 is configured to pull the target model evaluation image, the second evaluation data set, and the target model through the target model evaluation image address, the data set address, and the target model address;
the second device 820 is configured to evaluate the target model according to the second evaluation data set through the target model evaluation image, to obtain a second performance result.
Optionally, the target image further comprises a data initialization image;
The first device 810 is configured to deploy a data initialization image onto the second device 820 having the target chip;
The second device 820 is configured to obtain data corresponding to the original model through the data initialization image and share the data with the model conversion image;
the second device 820 is configured to determine the original model based on the data corresponding to the original model through the model conversion image, and to convert the original model into the target model.
Optionally, the first device 810 is configured to obtain verification task information, and generate a first verification configuration file according to the verification task information;
the first device 810 is further configured to send the original model verification image address and the original model address to the second device 820 according to the first verification configuration file;
The second device 820 is configured to pull the original model verification image and the original model through the original model verification image address and the original model address, and to verify the original model through the original model verification image to obtain a first verification result.
Optionally, the first device 810 is configured to obtain verification task information, and generate a second verification configuration file according to the verification task information;
the first device 810 is further configured to send the target model verification image address and the target model address to the second device 820 according to the second verification configuration file;
The second device 820 is configured to pull the target model verification image and the target model through the target model verification image address and the target model address, and to verify the target model through the target model verification image to obtain a second verification result.
Optionally, the first device 810 is configured to generate an image build file in the case where the conversion deviation value is less than the preset threshold, the image build file being used to generate the target model image;
the first device 810 is further configured to package the image build file and the target model into a build data package, and to send the build data package to the second device 820;
the second device 820 is configured to unpack the build data package and to construct the target model image according to the image build file and the target model;
the second device 820 is configured to upload the target model to the storage service, and to record the association between the target model image and the storage path of the target model.
It should be noted that, in the above embodiments, the first device 810 and the second device 820 are each an electronic device, or a set of multiple electronic devices, as provided in other embodiments of the present disclosure.
Referring to fig. 9, fig. 9 is a block diagram of part of a terminal implementing the model conversion method according to an embodiment of the present disclosure. The terminal includes a Radio Frequency (RF) circuit 910, a memory 915, an input unit 930, a display unit 940, a sensor 950, an audio circuit 960, a Wireless Fidelity (WiFi) module 970, a processor 980 and a power supply 990. Those skilled in the art will appreciate that the terminal structure shown in fig. 9 does not constitute a limitation on a mobile phone or computer, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The RF circuit 910 may be used to receive and transmit signals during messaging or a call; in particular, it receives downlink information from a base station, delivers it to the processor 980 for processing, and transmits uplink data to the base station.
The memory 915 may be used to store software programs and modules, and the processor 980 executes the software programs and modules stored in the memory 915 to perform the terminal's various functional applications and data processing.
The input unit 930 may be used to receive input numeric or character information and to generate key signal inputs related to the setting and function control of the terminal. In particular, the input unit 930 may include a touch panel 931 and other input devices 932.
The display unit 940 may be used to display input information or provided information and the various menus of the terminal. The display unit 940 may include a display panel 941.
The audio circuit 960, speaker 961 and microphone 962 may provide an audio interface.
In this embodiment, the processor 980 included in the terminal may perform the model conversion method of the previous embodiment.
Terminals of embodiments of the present disclosure include, but are not limited to, cell phones, computers, intelligent voice interaction devices, intelligent home appliances, vehicle terminals, aircraft, and the like. The disclosed embodiments may be applied to a variety of scenarios including, but not limited to, content recommendation, data screening, and the like.
Fig. 10 is a block diagram of part of a server implementing the model conversion method according to an embodiment of the present disclosure. Servers may vary widely in configuration or performance and may include one or more Central Processing Units (CPUs) 1022 (e.g., one or more processors), memory 1032, and one or more storage media 1030 (e.g., one or more mass storage devices) storing applications 1042 or data 1044. The memory 1032 and the storage medium 1030 may be transitory or persistent. The program stored on the storage medium 1030 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Further, the central processor 1022 may be configured to communicate with the storage medium 1030 and execute, on the server, the series of instruction operations in the storage medium 1030.
The server may also include one or more power supplies 1026, one or more wired or wireless network interfaces 1050, one or more input/output interfaces 1058, and/or one or more operating systems 1041, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, and the like.
The central processor 1022 in the server may be used to perform the model conversion method of the embodiments of the present disclosure.
The embodiments of the present disclosure also provide a computer-readable storage medium storing a program code for executing the model conversion method of the foregoing embodiments.
The disclosed embodiments also provide a computer program product comprising a computer program. The processor of the computer device reads the computer program and executes it, causing the computer device to execute the model conversion method described above.
The terms "first," "second," "third," "fourth," and the like in the description of the present disclosure and in the above-described figures, if any, are used to distinguish between similar elements and not necessarily to describe a particular sequential or chronological order. It is to be understood that data so used may be interchanged where appropriate, so that the embodiments of the disclosure described herein can, for example, be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and any variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, and may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in this disclosure, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes the relationship between associated items and indicates that three relationships may exist; for example, "A and/or B" may indicate that only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates that the associated items are in an "or" relationship. "At least one of" and similar expressions refer to any combination of the listed items, including any combination of a single item or plural items. For example, "at least one of a, b or c" may represent a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may each be singular or plural.
It should be understood that in the description of the embodiments of the present disclosure, "a plurality" (or "multiple") means two or more; "greater than", "less than", "exceeding" and the like are understood to exclude the stated number, while "above", "below", "within" and the like are understood to include it.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present disclosure may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as standalone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present disclosure, in essence or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present disclosure. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It should also be appreciated that the various implementations provided by the embodiments of the present disclosure may be arbitrarily combined to achieve different technical effects.
The above is a specific description of the embodiments of the present disclosure, but the present disclosure is not limited to the above embodiments, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present disclosure, and are included in the scope of the present disclosure as defined in the claims.
Claims (11)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411439111.2A CN119440621A (en) | 2024-10-14 | 2024-10-14 | Model conversion method, related device and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN119440621A true CN119440621A (en) | 2025-02-14 |
Family
ID=94510709
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202411439111.2A Pending CN119440621A (en) | 2024-10-14 | 2024-10-14 | Model conversion method, related device and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN119440621A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN120104289A * | 2025-05-09 | 2025-06-06 | 济南浪潮数据技术有限公司 (Jinan Inspur Data Technology Co., Ltd.) | Application migration method, device, equipment, readable storage medium and program product |
CN120104289B * | 2025-05-09 | 2025-07-08 | 济南浪潮数据技术有限公司 (Jinan Inspur Data Technology Co., Ltd.) | Application migration method, device, equipment, readable storage medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||