
CN120471134A - Neural network model migration accuracy alignment method, device, and computer equipment - Google Patents

Neural network model migration accuracy alignment method, device, and computer equipment

Info

Publication number
CN120471134A
Authority
CN
China
Prior art keywords
nodes
node
target node
graph
input data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510395350.0A
Other languages
Chinese (zh)
Inventor
韩钰
刘洋
蒋思源
刘胤辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guixin Technology Co ltd
Original Assignee
Beijing Guixin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Guixin Technology Co ltd filed Critical Beijing Guixin Technology Co ltd
Priority to CN202510395350.0A priority Critical patent/CN120471134A/en
Publication of CN120471134A publication Critical patent/CN120471134A/en
Pending legal-status Critical Current

Landscapes

  • Testing And Monitoring For Control Systems (AREA)

Abstract

The application provides a neural network model migration precision alignment method, a device and computer equipment. The method comprises: when migrating a neural network model from a first framework to a second framework, acquiring a first computational graph of the neural network model in the first framework and a second computational graph of the neural network model in the second framework; for either computational graph, screening out a target node to be checked from its nodes according to the dependency relationships among those nodes; comparing and analyzing the first input data and first output data of the first target node in the first computational graph with the second input data and second output data of the second target node in the second computational graph to obtain a comparative analysis result; and screening out abnormal nodes from the nodes of the second computational graph according to the comparative analysis result. The application can markedly improve the efficiency and accuracy of precision comparison during cross-framework migration, automatically detect and locate precision differences, and provide strong support for framework migration.

Description

Neural network model migration precision alignment method, device and computer equipment
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a neural network model migration precision alignment method, a device and computer equipment.
Background
With the development of large model technology, more and more people are using large models for intelligent generation tasks. Most current large models are developed on frameworks such as PyTorch, while mainstream domestic AI (Artificial Intelligence) frameworks, such as PaddlePaddle, differ from PyTorch, so in practice models need to be migrated from PyTorch-based frameworks to PaddlePaddle-based AI frameworks.
However, migrated models often exhibit calculation accuracy errors. For example, because the node granularity and node names of the frameworks before and after migration do not correspond one to one, locating the nodes with accuracy errors is difficult. In the prior art, manual comparison is generally used to locate the nodes with accuracy errors among all nodes, but this not only increases the workload of developers, it also reduces the efficiency of model migration.
Disclosure of Invention
In view of the above, the application provides a neural network model migration precision alignment method, a device and a computer device, which are used to solve the problems in the related art of increased developer workload and reduced model migration efficiency caused by manually locating nodes with precision errors.
An embodiment of a first aspect of the present application provides a neural network model migration accuracy alignment method, where the method includes:
When a neural network model is migrated from a first framework to a second framework, a first computational graph of the neural network model in the first framework is obtained, and a second computational graph of the neural network model in the second framework is obtained, wherein each computational graph comprises a plurality of nodes and the dependency relationships among those nodes;
For any one of the first computational graph and the second computational graph, a target node to be checked is screened out from the plurality of nodes in the computational graph according to the dependency relationships between the plurality of nodes;
The first input data and first output data of a first target node in the first computational graph are compared and analyzed against the second input data and second output data of a second target node in the second computational graph to obtain a comparative analysis result;
And abnormal nodes are screened out from the plurality of nodes of the second computational graph according to the comparative analysis result.
According to the method and device of the embodiments, the target nodes to be checked are screened out from the nodes according to the dependency relationships among the nodes in the computational graph, so nodes that may have accuracy anomalies can be identified and the efficiency and accuracy of precision comparison during cross-framework migration are improved; abnormal nodes are then screened out from the nodes of the second computational graph according to the comparative analysis result, so the nodes causing the precision errors can be determined accurately, further improving precision comparison efficiency.
In the embodiment of the present application, the screening of the target node to be checked from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph includes:
For any node among the plurality of nodes, calculating the number of operations related to the hidden layer dimension in the node according to the input data and output data of the node;
And if the number of operations is greater than a preset count threshold, determining the node as the target node.
In the embodiment of the present application, the screening of the target node to be checked from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph includes:
For any node among the plurality of nodes, determining the number of downstream nodes connected to the node, wherein a downstream node is a node that receives the output data of the node;
And if the number of downstream nodes is larger than a preset number threshold, determining the node as the target node.
In an embodiment of the present application, after the target node to be inspected is screened out from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph, the method further includes:
And setting monitoring points at the input position and the output position of the target node respectively, and running the neural network model, wherein the monitoring points are used for monitoring the input data and output data of the target node.
In the embodiment of the application, the setting of the monitoring point at the input position and the output position of the target node respectively comprises the following steps:
Determining the monitoring priority of the target node according to the node type of the target node;
and if the monitoring priority of the target node is greater than the preset priority, respectively setting monitoring points at the input position and the output position of the target node.
In an embodiment of the present application, after determining that the target node is an abnormal node, the method further includes:
judging whether the target node comprises a child node or not;
If the target node does not contain child nodes, the target node is taken as a final abnormal node;
and if the target node comprises a plurality of sub-nodes, taking the plurality of sub-nodes as a plurality of nodes in the computational graph, and repeatedly executing the step of screening the target node to be checked from the plurality of nodes according to the dependency relationship among the plurality of nodes in the computational graph until the target node does not comprise the sub-nodes, and taking the target node as a final abnormal node.
In the embodiment of the present application, comparing and analyzing first input data and first output data of a first target node in the first computation graph, and second input data and second output data of a second target node in the second computation graph, to obtain a comparison and analysis result, including:
Judging whether the data sequence attributes of the first input data and the second input data are consistent or not, wherein the data sequence attributes comprise data types and data arrangement sequences;
if the data sequence attributes of the first input data and the second input data are consistent, acquiring a plurality of first sub-input data of the first input data and a plurality of second sub-input data of the second input data;
Screening a plurality of first target data in a preset data range from the plurality of first sub-input data, and screening a plurality of second target data in the preset data range from the plurality of second sub-input data;
Calculating a first statistical index of the first input data through the plurality of first target data and a second statistical index of the second input data through the plurality of second target data, wherein the first statistical index and the second statistical index comprise a mean value and a variance;
and comparing and analyzing the first statistical index of the first input data with the second statistical index of the second input data to obtain a comparison and analysis result.
In an embodiment of the present application, obtaining a first computational graph of the neural network model in the first framework includes:
Generating a first initial computational graph of the neural network model in the first framework;
The method comprises the steps of converting a first initial computational graph into a first computational graph according to a general computational graph representation method, wherein the general computational graph representation method is used for presenting neural network models of different frameworks in a general computational graph representation mode;
Obtaining a second computational graph of the neural network model in the second framework, comprising:
Generating a second initial computational graph of the neural network model in the second framework;
and converting the second initial calculation graph into the second calculation graph according to the general calculation graph representation method.
An embodiment of a second aspect of the present application provides a neural network model migration accuracy alignment apparatus, including:
The computing graph acquisition module is used for acquiring a first computing graph of the neural network model in a first framework and acquiring a second computing graph of the neural network model in a second framework when the neural network model is migrated from the first framework to the second framework, wherein each computing graph comprises a plurality of nodes and a dependency relationship among the nodes;
the target node screening module is used for, for any one of the first computational graph and the second computational graph, screening out the target node to be checked from the plurality of nodes according to the dependency relationships among the nodes in the computational graph;
The comparison analysis result generation module is used for comparing and analyzing the first input data and the first output data of the first target node in the first calculation graph and the second input data and the second output data of the second target node in the second calculation graph to obtain a comparison analysis result;
and the abnormal node judging module is used for screening abnormal nodes from the plurality of nodes of the second calculation graph according to the comparison analysis result.
An embodiment of the third aspect of the present application provides a computer device, where the computer device includes a memory and a processor, where the memory and the processor are communicatively connected to each other, and the memory stores computer instructions, and the processor executes the computer instructions, thereby executing the neural network model migration accuracy alignment method described in the first aspect.
An embodiment of the fourth aspect of the present application provides a computer readable storage medium, where computer instructions are stored, where the computer instructions are configured to cause a computer to perform the neural network model migration accuracy alignment method described in the first aspect.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures.
In the drawings:
Fig. 1 is a schematic flow chart of a neural network model migration accuracy alignment method according to an embodiment of the present application;
FIG. 2 is a schematic diagram in which a first target node in a first computational graph and a second target node in a second computational graph occupy the same node position according to an embodiment of the present application;
FIG. 3 is a schematic diagram in which the node position of a first target node in a first computational graph corresponds to, but is not identical to, the node position of a second target node in a second computational graph according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a neural network model migration accuracy alignment device according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a computer device according to an embodiment of the present application;
fig. 6 is a schematic diagram of a storage medium according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the application to those skilled in the art.
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs.
Technical scenarios related to the embodiments of the present application are described below.
Framework precision alignment refers to converting a model originally implemented on PyTorch to another framework (e.g., PaddlePaddle) and comparing the calculation accuracy of the converted training or inference program against the original framework. Precision alignment is usually done by manual module-by-module comparison, and because the operators before and after conversion do not necessarily correspond one to one, this is time-consuming and labor-intensive. The embodiment of the application uses an automated method that analyzes the code once and automatically produces the set of operators (ops) with poor precision, which makes it possible to directly compare different framework implementations of the same model and find the precision problem at a specific node.
With the development of deep learning and large model technology, more and more people are using deep learning or large models for intelligent generation tasks. Most current large models are developed on foreign frameworks such as PyTorch, and from the perspective of autonomous controllability the demand for migration to domestic software and hardware is becoming increasingly urgent. Because domestic AI frameworks are not yet mature, calculation accuracy errors often arise after migrating from an original foreign framework such as PyTorch to a domestic AI framework (for example, PaddlePaddle). After each modification, the developer needs to test and locate the accuracy problem again.
The problems at present are:
(1) Locating accuracy problems currently requires manual step-by-step search by a developer.
(2) After repairing one problem, cumbersome testing is still required to locate the next problem.
(3) The operators of the framework before migration and the operators of the framework after migration often do not correspond one to one in granularity and name, so deciding where to insert test points requires manual analysis.
The embodiment of the application provides an automatic operator precision alignment method, which comprises analyzing the program before migration, analyzing the program after migration, automatically matching the two according to semantics, automatically inserting test points, and automatically reporting precision problems during testing.
According to an embodiment of the present application, there is provided an embodiment of a neural network model migration accuracy alignment method, it should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different from that herein.
In this embodiment, a neural network model migration accuracy alignment method is provided, fig. 1 is a flowchart of the neural network model migration accuracy alignment method according to an embodiment of the present application, as shown in fig. 1, where the flowchart includes the following steps:
Step S101, when migrating a neural network model from a first framework to a second framework, acquiring a first computational graph of the neural network model in the first framework, and acquiring a second computational graph of the neural network model in the second framework.
In the embodiment of the application, each computational graph comprises a plurality of nodes and the dependency relationships among the nodes. Any node may be a linear fully connected layer, a convolution layer, an attention layer, an activation function (e.g., ReLU) layer or a normalization layer. The dependency relationships among the nodes can be understood as the data flow between the nodes; for example, if the output data of node 1 is the input data of node 2, a dependency relationship between node 1 and node 2 is established.
In some embodiments, the step S101 further comprises steps S1011-S1014:
step S1011, generating a first initial computational graph of the neural network model in the first framework.
Wherein the first initial computational graph is the structure of the neural network model defined in the first framework (e.g., PyTorch), used for representing the computational logic of the neural network model and the dependencies between nodes.
Step S1012, converting the first initial computation graph into the first computation graph according to a general computation graph representation method.
The general computational graph representation method is used for presenting the neural network models of different frameworks through a general computational graph representation mode (for example, ONNX format).
The general computational graph representation method in the embodiment of the application takes into account framework independence, efficiency and flexibility, so that graphs can be exported and converted quickly from each framework while preserving the input and output information of each node required by downstream tasks. This not only smooths framework migration but also improves the generality and extensibility of the cross-framework computational graph.
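As an illustration of steps S1011-S1012, the following sketch exports a small PyTorch model to ONNX, one possible common computational-graph representation, and lists the resulting nodes with their input and output tensor names. The toy model, the input shape and the file name are assumptions made for the example, not details from the application.

import torch
import onnx

# a small illustrative model standing in for the neural network model
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 128),
)
dummy_input = torch.randn(1, 16, 128)          # (batch, sequence, hidden)
torch.onnx.export(model, dummy_input, "model_graph.onnx")

# the exported graph plays the role of the first computational graph
graph = onnx.load("model_graph.onnx").graph
for node in graph.node:
    # each node carries its operator type plus input/output tensor names,
    # which is the per-node information the alignment method relies on
    print(node.op_type, list(node.input), list(node.output))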
Step S1013, generating a second initial computational graph of the neural network model in the second framework.
Wherein the second initial computational graph is the structure of the neural network model defined in the second framework (e.g., PaddlePaddle), used for representing the computational logic of the neural network model and the dependencies between nodes.
Step S1014, converting the second initial computation graph into the second computation graph according to the general computation graph representation method.
Similarly, reference is made to step S1012.
Step S102, for any one of the first computation graph and the second computation graph, selecting a target node to be inspected from a plurality of nodes in the computation graph according to the dependency relationship between the plurality of nodes.
The target node to be checked refers to a node, among the plurality of nodes, that is likely to cause precision differences.
In some embodiments, the step S102 includes steps a 1-a 2:
Step a1, for any node among the plurality of nodes, calculating the number of operations related to the hidden layer dimension (Hidden Dimension) in the node according to the input data and output data of the node.
The number of operations related to the hidden layer dimension in a node is illustrated by the following example:
When the mathematical operation performed in a node is multiplying matrix A of shape (B1 × S1 × H1) by matrix B of shape (B2 × S2 × H2), i.e., (B1 × S1 × H1) × (B2 × S2 × H2), where B (Batch Size) denotes the number of samples, S (Sequence Length) denotes the sequence length, and H (Hidden Dimension) denotes the hidden layer dimension, the number of operations related to the hidden layer dimension can be determined as H1 × H2.
Step a2, determining the node as the target node if the number of operations is greater than a preset count threshold.
The size of the preset count threshold is not particularly limited, and a corresponding threshold can be set for different types of nodes. If the number of operations is smaller than or equal to the preset count threshold, the node is determined not to be a target node.
In the embodiment of the application, the target node refers to a computation-intensive node among the plurality of nodes. Such a node involves complex mathematical operations with high computational complexity, significantly affects the overall performance and efficiency of the model, and easily causes precision differences when the model is migrated from one framework to another.
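The following sketch illustrates the screening rule of steps a1-a2 under the assumption that each node exposes the shapes of the two operands it multiplies; the Node class, its field names and the threshold value are hypothetical, not taken from the application.

from dataclasses import dataclass

@dataclass
class Node:
    name: str
    input_shape: tuple   # (batch, sequence, hidden) of the first operand
    weight_shape: tuple  # (batch, sequence, hidden) of the second operand

def hidden_dim_ops(node: Node) -> int:
    # operations tied to the hidden-layer dimension: H1 * H2
    h1 = node.input_shape[-1]
    h2 = node.weight_shape[-1]
    return h1 * h2

def is_target_node(node: Node, threshold: int = 1_000_000) -> bool:
    # a node is compute-intensive (a target node) when the count exceeds
    # the preset count threshold
    return hidden_dim_ops(node) > threshold

attn = Node("compute_atten", (1, 2048, 4096), (1, 2048, 4096))
print(is_target_node(attn))   # True: 4096 * 4096 > 1_000_000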
In some embodiments, the step S102 includes steps b 1-b 2:
Step b1, for any node among the plurality of nodes, determining the number of downstream nodes connected to the node.
A downstream node is a node that receives the output data of the node. For example, given A→B, A→C and A→D, nodes B, C and D are downstream nodes of A, and the number of downstream nodes connected to node A is 3.
And b2, if the number of the nodes is larger than a preset number threshold, determining that the node is the target node.
The size of the preset number threshold is not specifically limited, and the preset number threshold is generally greater than 1, for example, 2, 3, and the like, and the corresponding preset number threshold may be set according to different types of nodes. And if the number of the nodes is smaller than or equal to a preset number threshold, determining that the nodes do not belong to the target node.
By determining the number of downstream nodes connected to a node and comparing it with the preset number threshold, the embodiment of the application can quickly screen out the nodes that have a larger influence on subsequent computation, since the output data of these nodes is relied upon by multiple downstream nodes. This screening method helps a developer or researcher quickly locate the key nodes in the model and determine whether these key nodes are the abnormal nodes causing the precision differences.
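A minimal sketch of the fan-out rule of steps b1-b2 might look as follows, assuming the dependency relationships are available as an edge list; the edge list and the threshold value are illustrative.

from collections import defaultdict

edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "E")]

downstream = defaultdict(set)
for src, dst in edges:
    downstream[src].add(dst)

def is_target_by_fanout(name: str, threshold: int = 2) -> bool:
    # a node whose output feeds more than `threshold` downstream nodes is
    # selected as a target node to be checked
    return len(downstream[name]) > threshold

print(is_target_by_fanout("A"))  # True: A feeds B, C and D
print(is_target_by_fanout("B"))  # False: B feeds only E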
In some specific embodiments, after the step S102, the method further includes:
Step c1, setting monitoring points at the input position and the output position of the target node respectively, and running the neural network model. The monitoring points are used to monitor the input data and output data of the target node.
In embodiments of the present application, placing monitoring points may be understood as inserting hook functions or black-box program instrumentation at the input and output of a particular node (e.g., a layer or an operation) of the neural network model; the hooks are used to capture and record the input data and output data of the node.
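One possible way to realise such monitoring points in a PyTorch-based first framework is a forward hook, as sketched below; the toy model, the layer chosen as the target node and the recording structure are assumptions made for the example.

import torch

records = {}

def make_hook(name):
    def hook(module, inputs, output):
        # capture the input and output tensors of the monitored target node
        records[name] = {
            "input": [t.detach().clone() for t in inputs],
            "output": output.detach().clone(),
        }
    return hook

model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU())
target = model[0]                       # suppose this layer was screened out
handle = target.register_forward_hook(make_hook("linear_0"))

model(torch.randn(4, 8))                # run the neural network model
handle.remove()
print(records["linear_0"]["output"].shape)   # torch.Size([4, 16])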
In some embodiments, the step c1 includes steps c11 to c12:
And c11, determining the monitoring priority of the target node according to the node type of the target node.
Different types of nodes (such as activation function layers, normalization layers and convolution layers) have different sensitivity to precision, so different types of target nodes correspond to different levels of monitoring priority. For example, an MLP layer is generally more sensitive to precision differences than an activation function layer, and is therefore assigned a higher monitoring priority.
And c12, if the monitoring priority of the target node is greater than the preset priority, setting monitoring points at the input position and the output position of the target node respectively.
In the embodiment of the application, when the monitoring priority of the target node is smaller than or equal to the preset priority, no monitoring points are set at the input position and the output position of the target node. The preset priority may be set according to the actual situation and is not specifically limited herein.
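A sketch of the priority rule of steps c11-c12 is shown below; the mapping from node type to priority level and the preset priority value are illustrative assumptions.

PRIORITY_BY_TYPE = {"mlp": 3, "attention": 3, "norm": 2, "activation": 1}

def should_monitor(node_type: str, preset_priority: int = 1) -> bool:
    # monitoring points are only inserted when the node's priority exceeds
    # the preset priority
    return PRIORITY_BY_TYPE.get(node_type, 0) > preset_priority

print(should_monitor("mlp"))         # True: monitoring points are set
print(should_monitor("activation"))  # False: no monitoring points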
Step S103, comparing and analyzing the first input data and the first output data of the first target node in the first computation graph, and the second input data and the second output data of the second target node in the second computation graph, to obtain a comparison analysis result.
In some embodiments, the node position of the first target node in the first computational graph is the same as the node position of the second target node in the second computational graph.
For example, given graphs A (a1→a2→a3) and B (b1→b2→b3), the node positions of a2 and b2 are the same. As another example, as shown in FIG. 2, when the first target node is compute_atten (the compute-attention layer) in the first computational graph, the second target node is compute_atten (the compute-attention layer) in the second computational graph.
In some embodiments, the node locations of the first target node in the first computational graph and the second target node in the second computational graph correspond, but are not identical.
For example, as shown in FIG. 3, when the first target nodes in PyTorch include "H×qd", "H×kvd" and "H×kvd", the corresponding second target node in PaddleNLP is "H×(qd+2kvd)".
In the embodiment of the application, the first input data and the second input data are subjected to comparative analysis to obtain a first comparative analysis result, and the first output data and the second output data are subjected to comparative analysis to obtain a second comparative analysis result.
In some embodiments, step S103 includes steps S1031-S1035:
step S1031, determining whether the data sequence attributes of the first input data and the second input data are consistent.
In the embodiment of the application, the data sequence attributes comprise the data type and the data arrangement order. The data type refers to the storage format of the data in memory, including 32-bit floating point (float32), 16-bit floating point (float16), 32-bit integer (int32) and 8-bit integer (int8). The data arrangement order refers to the order in which the data is stored in memory, including NCHW and NHWC:
NCHW: the data is stored in the order [Batch Size, Channels, Height, Width]. This is the default arrangement in PyTorch.
NHWC: the data is stored in the order [Batch Size, Height, Width, Channels]. This is the default arrangement in TensorFlow and is computationally efficient on certain hardware.
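The layout check of step S1031 could, for instance, be expressed as follows; the tensors, the assumed layouts and the transpose used to reconcile them are illustrative.

import numpy as np

a = np.random.rand(2, 3, 32, 32).astype(np.float32)   # NCHW from framework 1
b = np.transpose(a, (0, 2, 3, 1))                      # NHWC from framework 2

def sequence_attributes_match(x, x_layout, y, y_layout):
    # the data type and the data arrangement order must both agree
    return x.dtype == y.dtype and x_layout == y_layout

print(sequence_attributes_match(a, "NCHW", b, "NHWC"))   # False
# if the layouts differ, one side can be transposed into the other's order
b_as_nchw = np.transpose(b, (0, 3, 1, 2))
print(np.allclose(a, b_as_nchw))                          # True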
Step S1032, if the data sequence attributes of the first input data and the second input data are identical, acquiring a plurality of first sub-input data of the first input data, and acquiring a plurality of second sub-input data of the second input data.
Each input data comprises a plurality of sub-input data and each output data comprises a plurality of sub-output data; this applies equally to the first input data, the second input data, the first output data and the second output data.
Step S1033, screening a plurality of first target data within a preset data range from the plurality of first sub-input data, and screening a plurality of second target data within the preset data range from the plurality of second sub-input data.
Specifically, the preset data range may be set according to actual situations, which is not limited herein, and invalid data may be effectively filtered by setting the preset data range.
Step S1034, calculating a first statistical index of the first input data from the plurality of first target data, and calculating a second statistical index of the second input data from the plurality of second target data.
Specifically, the first statistical indicator and the second statistical indicator include a mean and a variance.
Step S1035, performing a comparative analysis on the first statistical index of the first input data and the second statistical index of the second input data, to obtain a comparative analysis result.
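The following sketch ties steps S1033-S1035 together: filter the sub-data to a preset data range, compute the mean and variance on both sides, and compare the statistical indicators. The data range, tolerances and simulated drift are illustrative assumptions.

import numpy as np

def stats_in_range(values, low=-1e4, high=1e4):
    # keep only sub-data inside the preset data range, then compute the
    # statistical indicators (mean and variance)
    kept = values[(values >= low) & (values <= high)]
    return kept.mean(), kept.var()

def compare_stats(x, y, mean_tol=1e-3, var_tol=1e-3):
    mx, vx = stats_in_range(x)
    my, vy = stats_in_range(y)
    return {"mean_diff": abs(mx - my), "var_diff": abs(vx - vy),
            "suspect": abs(mx - my) > mean_tol or abs(vx - vy) > var_tol}

first_input = np.random.randn(1024).astype(np.float32)
second_input = first_input + np.float32(1e-2)       # simulated migration drift
print(compare_stats(first_input, second_input))     # flags the node as suspect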
In some embodiments, the specific implementation manner of comparing the first output data with the second output data to obtain the second comparison analysis result may refer to the above steps S1031-S1035, which are not described herein.
According to the embodiment of the application, by calculating the statistical indicators of the input data and output data of the same node under different frameworks and comparing the statistical indicators of the two frameworks, the abnormal nodes causing the precision difference can be screened out from the plurality of nodes; for example, the node whose statistical indicators differ most between the two frameworks can be taken as the abnormal node.
And step S104, screening abnormal nodes from the plurality of nodes of the second calculation graph according to the comparison analysis result.
In the embodiment of the application, the abnormal node can be screened out in the following ways:
When the difference between the mean of the input data or output data of a node in the second computational graph and the mean of the node at the same node position in the first computational graph is greater than a preset mean-difference threshold, that node in the second computational graph can be determined to be an abnormal node.
Similarly, when the difference between the variance of the input data or output data of a node in the second computational graph and the variance of the node at the same node position in the first computational graph is greater than a preset variance-difference threshold, that node in the second computational graph can be determined to be an abnormal node.
In some embodiments, after the step S104, the method further includes steps S1051 to S1053:
in step S1051, it is determined whether the target node includes a child node.
Step S1052, if the target node does not include child nodes, the target node is the final abnormal node.
Step S1053, if the target node includes a plurality of sub-nodes, using the plurality of sub-nodes as a plurality of nodes in the computation graph, and repeating the step of screening the target node to be checked from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph until the target node does not include a sub-node, and using the target node as a final abnormal node.
Steps S1051 to S1053 are illustrated by the following example. Suppose the plurality of nodes is A→B→C→D, and node B is preliminarily determined to be an abnormal node through steps S101 to S104. If node B does not contain child nodes, node B can be directly determined to be the final abnormal node. If node B contains a plurality of child nodes, i.e., B1→B2→B3→B4, steps S102 to S104 need to be repeated to locate the abnormal child node among B1, B2, B3 and B4. If the abnormal child node contains no finer-grained nodes, it is determined to be the final abnormal node; otherwise, the above steps are repeated.
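A compact sketch of this refinement loop is given below, assuming each node exposes its child nodes and that the per-node comparison of steps S102-S104 is available as a predicate; both are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)

def locate_final_abnormal_node(node, is_abnormal):
    # `is_abnormal` stands in for the per-node comparison of steps S102-S104
    if not node.children:
        return node                          # leaf: the final abnormal node
    for child in node.children:
        if is_abnormal(child):
            return locate_final_abnormal_node(child, is_abnormal)
    return node                              # no finer-grained abnormal child

b = Node("B", [Node("B1"), Node("B2"), Node("B3")])
print(locate_final_abnormal_node(b, lambda n: n.name == "B2").name)  # B2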
In the embodiment of the application, by continuously iterating this process, the nodes with precision errors are gradually refined and further analyzed until the specific node with the precision problem is accurately located. This iterative mechanism ensures the efficiency and accuracy of precision comparison, minimizing error accumulation and inconsistency during migration.
The embodiment of the application can significantly improve the efficiency and accuracy of precision comparison in the cross-framework migration process, automatically detect and locate precision differences, and provide strong support for framework migration.
Example 2:
Corresponding to the implementation manner of the neural network model migration precision alignment method, the embodiment of the application also provides a neural network model migration precision alignment device, which is used for executing the neural network model migration precision alignment method described in the embodiment. As shown in fig. 4, the neural network model migration accuracy alignment apparatus includes:
The computing graph acquisition module is used for acquiring a first computing graph of the neural network model in a first framework and acquiring a second computing graph of the neural network model in a second framework when the neural network model is migrated from the first framework to the second framework, wherein each computing graph comprises a plurality of nodes and a dependency relationship among the nodes;
the target node screening module is used for, for any one of the first computational graph and the second computational graph, screening out the target node to be checked from the plurality of nodes according to the dependency relationships among the nodes in the computational graph;
The comparison analysis result generation module is used for comparing and analyzing the first input data and the first output data of the first target node in the first calculation graph and the second input data and the second output data of the second target node in the second calculation graph to obtain a comparison analysis result;
and the abnormal node judging module is used for screening abnormal nodes from the plurality of nodes of the second calculation graph according to the comparison analysis result.
Optionally, the target node screening module is further configured to calculate, for any node of the plurality of nodes, an operation number of times related to a hidden layer dimension in the node according to input data and output data of the node, and if the operation number of times is greater than a preset number of times threshold, determine that the node is the target node.
Optionally, the target node screening module is further configured to determine, for any node of the plurality of nodes, a node number of a downstream node connected to the node, where the downstream node is a node, among the plurality of nodes, for receiving output data of the node, and determine that the node is the target node if the node number is greater than a preset number threshold.
Optionally, the device further comprises a data monitoring module, wherein the data monitoring module is used for respectively setting monitoring points at the input position and the output position of the target node after the target node to be checked is screened out from the nodes according to the dependency relationship among the nodes in the calculation graph, and running the neural network model, and the monitoring points are used for monitoring the input data and the output data of the target node.
Optionally, the data monitoring module is further configured to determine a monitoring priority of the target node according to a node type of the target node, wherein different types of target nodes correspond to different levels of monitoring priorities, and if the monitoring priority of the target node is greater than a preset priority, monitoring points are set at an input position and an output position of the target node respectively.
Optionally, the device further comprises an iteration module, wherein the iteration module is used for judging whether the target node contains sub-nodes after judging that the target node is an abnormal node, if the target node does not contain sub-nodes, the target node is a final abnormal node, if the target node contains a plurality of sub-nodes, the plurality of sub-nodes are used as a plurality of nodes in the computation graph, and the step of screening the target node to be checked from the plurality of nodes according to the dependency relationship among the plurality of nodes in the computation graph is repeatedly executed until the target node does not contain sub-nodes, and the target node is the final abnormal node.
Optionally, the comparative analysis result generation module is further configured to: determine whether the data sequence attributes of the first input data and the second input data are consistent, the data sequence attributes including the data type and the data arrangement order; if the data sequence attributes of the first input data and the second input data are consistent, obtain a plurality of first sub-input data of the first input data and a plurality of second sub-input data of the second input data; screen out a plurality of first target data within a preset data range from the plurality of first sub-input data, and screen out a plurality of second target data within the preset data range from the plurality of second sub-input data; calculate a first statistical indicator of the first input data from the plurality of first target data and a second statistical indicator of the second input data from the plurality of second target data, the first statistical indicator and the second statistical indicator including a mean and a variance; and compare and analyze the first statistical indicator of the first input data with the second statistical indicator of the second input data to obtain the comparative analysis result.
Optionally, the computational graph acquisition module is further configured to: generate a first initial computational graph of the neural network model in the first framework and convert the first initial computational graph into the first computational graph according to a general computational graph representation method, where the general computational graph representation method is used to present neural network models of different frameworks in a common computational graph representation; and generate a second initial computational graph of the neural network model in the second framework and convert the second initial computational graph into the second computational graph according to the general computational graph representation method.
Because it is based on the same inventive concept, the neural network model migration precision alignment device provided by the embodiment of the application has the same beneficial effects as the neural network model migration precision alignment method provided by the embodiment of the application, which is the method adopted, run or implemented by the application program it stores.
The embodiment of the application also provides computer equipment for executing the neural network model migration precision alignment method. Referring to fig. 5, a schematic diagram of a computer device according to some embodiments of the present application is shown. As shown in fig. 5, the computer device 5 includes a processor 500, a memory 501, a bus 502 and a communication interface 505, where the processor 500, the communication interface 505 and the memory 501 are connected through the bus 502, and the memory 501 stores a computer program that can be run on the processor 500, and when the processor 500 runs the computer program, the neural network model migration accuracy alignment method provided by the foregoing embodiment of the present application is executed.
The memory 501 may include a high-speed random access memory (Random Access Memory, RAM), and may further include a non-volatile memory (non-volatile memory), such as at least one disk memory. The communication connection between the system network element and at least one other network element is implemented via at least one communication interface 505 (which may be wired or wireless); the Internet, a wide area network, a local area network, a metropolitan area network, etc. may be used.
Bus 502 may be an ISA bus, a PCI bus, an EISA bus, or the like. The buses may be classified as address buses, data buses, control buses, etc. The memory 501 is configured to store a program, and the processor 500 executes the program after receiving an execution instruction, and the neural network model migration accuracy alignment method disclosed in the foregoing embodiment may be applied to the processor 500 or implemented by the processor 500.
The processor 500 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits in hardware or by software instructions in the processor 500. The processor 500 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), etc., or may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The methods, steps, and logic blocks disclosed in the embodiments of the present application may thereby be implemented or performed. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc. The steps of the method disclosed in connection with the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or another storage medium well known in the art. The storage medium is located in the memory 501, and the processor 500 reads the information in the memory 501 and, in combination with its hardware, performs the steps of the method described above.
Because it is based on the same inventive concept, the computer device provided by the embodiment of the application has the same beneficial effects as the neural network model migration precision alignment method provided by the embodiment of the application, which is the method it adopts, runs or implements.
The embodiment of the present application further provides a computer readable storage medium corresponding to the neural network model migration accuracy alignment method provided in the foregoing embodiment, referring to fig. 6, the computer readable storage medium is shown as an optical disc 50, on which a computer program (i.e. a program product) is stored, where the computer program, when executed by a processor, performs the neural network model migration accuracy alignment method provided in any of the foregoing embodiments.
It should be noted that examples of the computer readable storage medium may also include, but are not limited to, a phase change memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, or other optical or magnetic storage medium, which will not be described in detail herein.
Because it is based on the same inventive concept as the neural network model migration accuracy alignment method provided by the embodiment of the present application, the computer readable storage medium provided by the above embodiment has the same advantages as the method adopted, run or implemented by the application program it stores.
It should be noted that:
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the application may be practiced without these specific details. In some instances, well-known structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the application and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The present application is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present application are intended to be included in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1.一种神经网络模型迁移精度对齐方法,其特征在于,所述方法包括:1. A neural network model migration accuracy alignment method, characterized in that the method comprises: 在将神经网络模型从第一框架迁移至第二框架时,获取所述第一框架中所述神经网络模型的第一计算图,以及获取所述第二框架中所述神经网络模型的第二计算图;每个计算图包括多个节点以及节点之间的依赖关系;任一节点为线性全连接层、卷积层、注意力层、激活函数层或者标准化层;When migrating a neural network model from a first framework to a second framework, obtaining a first computational graph of the neural network model in the first framework, and obtaining a second computational graph of the neural network model in the second framework; each computational graph includes a plurality of nodes and dependencies between the nodes; any node is a linear fully connected layer, a convolutional layer, an attention layer, an activation function layer, or a normalization layer; 针对所述第一计算图和所述第二计算图中的任一计算图,根据所述计算图中多个节点之间的依赖关系从所述多个节点中筛选出待检查的目标节点;For any one of the first computation graph and the second computation graph, filter out a target node to be checked from the multiple nodes according to dependency relationships between the multiple nodes in the computation graph; 将所述第一计算图中第一目标节点的第一输入数据和第一输出数据,以及所述第二计算图中第二目标节点的第二输入数据和第二输出数据进行对比分析,得到对比分析结果;Comparing and analyzing the first input data and the first output data of the first target node in the first computation graph, and the second input data and the second output data of the second target node in the second computation graph, to obtain a comparative analysis result; 根据所述对比分析结果从所述第二计算图的多个节点中筛选出异常节点。According to the comparative analysis result, abnormal nodes are screened out from multiple nodes of the second computation graph. 2.根据权利要求1所述的方法,其特征在于,根据所述计算图中多个节点之间的依赖关系从所述多个节点中筛选出待检查的目标节点,包括:2. The method according to claim 1, wherein the step of selecting a target node to be checked from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph comprises: 针对所述多个节点中的任一节点,根据所述节点的输入数据和输出数据计算所述节点中与隐藏层维度相关的运算次数;For any node among the plurality of nodes, calculating the number of operations related to the hidden layer dimension in the node according to the input data and output data of the node; 如果所述运算次数大于预设次数阈值,则确定所述节点为所述目标节点。If the number of operations is greater than a preset number threshold, the node is determined to be the target node. 3.根据权利要求1所述的方法,其特征在于,根据所述计算图中多个节点之间的依赖关系从所述多个节点中筛选出待检查的目标节点,包括:3. The method according to claim 1, wherein the step of selecting a target node to be checked from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph comprises: 针对所述多个节点中的任一节点,确定与所述节点连接的下游节点的节点数量;所述下游节点是指所述多个节点中用于接收所述节点的输出数据的节点;For any node among the plurality of nodes, determining the number of downstream nodes connected to the node; the downstream node refers to a node among the plurality of nodes for receiving output data of the node; 如果所述节点数量大于预设数量阈值,则确定所述节点为为所述目标节点。If the number of the nodes is greater than a preset number threshold, the node is determined to be the target node. 4.根据权利要求1所述的方法,其特征在于,在根据所述计算图中多个节点之间的依赖关系从所述多个节点中筛选出待检查的目标节点之后,所述方法还包括:4. 
4. The method according to claim 1, wherein after selecting the target node to be checked from the plurality of nodes according to the dependency relationships between the plurality of nodes in the computation graph, the method further comprises: setting monitoring points at the input position and the output position of the target node respectively, and running the neural network model, the monitoring points being used to monitor the input data and the output data of the target node.

5. The method according to claim 4, wherein setting monitoring points at the input position and the output position of the target node respectively comprises: determining a monitoring priority of the target node according to the node type of the target node, different types of target nodes corresponding to different levels of monitoring priority; and if the monitoring priority of the target node is greater than a preset priority, setting monitoring points at the input position and the output position of the target node respectively.

6. The method according to claim 1, wherein after the target node is determined to be an abnormal node, the method further comprises: determining whether the target node contains child nodes; if the target node contains no child nodes, taking the target node as the final abnormal node; and if the target node contains a plurality of child nodes, taking the plurality of child nodes as the plurality of nodes in the computation graph and repeating the step of selecting the target node to be checked from the plurality of nodes according to the dependency relationships between the plurality of nodes in the computation graph, until the target node contains no child nodes, at which point the target node is taken as the final abnormal node.

7. The method according to claim 1 or 2, wherein comparing and analyzing the first input data and the first output data of the first target node in the first computation graph with the second input data and the second output data of the second target node in the second computation graph to obtain a comparative analysis result comprises: determining whether data sequence attributes of the first input data and the second input data are consistent, the data sequence attributes comprising data type and data arrangement order; if the data sequence attributes of the first input data and the second input data are consistent, obtaining a plurality of first sub-input data of the first input data and a plurality of second sub-input data of the second input data; selecting, from the plurality of first sub-input data, a plurality of first target data within a preset data range, and selecting, from the plurality of second sub-input data, a plurality of second target data within the preset data range; calculating a first statistical indicator of the first input data from the plurality of first target data, and calculating a second statistical indicator of the second input data from the plurality of second target data, the first statistical indicator and the second statistical indicator comprising mean and variance; and comparing and analyzing the first statistical indicator of the first input data with the second statistical indicator of the second input data to obtain the comparative analysis result.

8. The method according to claim 1 or 2, wherein obtaining the first computation graph of the neural network model in the first framework comprises: generating a first initial computation graph of the neural network model in the first framework; and converting the first initial computation graph into the first computation graph according to a universal computation graph representation method, the universal computation graph representation method being used to present neural network models of different frameworks in one common computation graph representation; and wherein obtaining the second computation graph of the neural network model in the second framework comprises: generating a second initial computation graph of the neural network model in the second framework; and converting the second initial computation graph into the second computation graph according to the universal computation graph representation method.

9. A neural network model migration accuracy alignment device, wherein the device comprises: a computation graph acquisition module, configured to, when a neural network model is migrated from a first framework to a second framework, acquire a first computation graph of the neural network model in the first framework and a second computation graph of the neural network model in the second framework, each computation graph comprising a plurality of nodes and dependency relationships between the nodes, and any node being a linear fully connected layer, a convolutional layer, an attention layer, an activation function layer or a normalization layer; a target node screening module, configured to, for either of the first computation graph and the second computation graph, select a target node to be checked from the plurality of nodes according to the dependency relationships between the plurality of nodes in the computation graph; a comparative analysis result generating module, configured to compare and analyze the first input data and the first output data of the first target node in the first computation graph with the second input data and the second output data of the second target node in the second computation graph to obtain a comparative analysis result; and an abnormal node judgment module, configured to select an abnormal node from the plurality of nodes of the second computation graph according to the comparative analysis result.

10. A computer device, comprising a memory and a processor communicatively connected to each other, wherein the memory stores computer instructions, and the processor executes the computer instructions to perform the neural network model migration accuracy alignment method according to any one of claims 1 to 8.
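The monitoring-point mechanism of claims 4 and 5 maps naturally onto forward hooks in the first framework. The sketch below is a minimal, hypothetical Python illustration assuming a PyTorch-style `register_forward_hook` API; the `NODE_TYPE_PRIORITY` table and the `PRESET_PRIORITY` threshold are invented placeholders, not values taken from the disclosure.

```python
import torch
import torch.nn as nn

# Hypothetical priority table: node types mapped to monitoring priority levels.
NODE_TYPE_PRIORITY = {
    nn.Linear: 3,              # linear fully connected layer
    nn.Conv2d: 3,              # convolutional layer
    nn.MultiheadAttention: 3,  # attention layer
    nn.LayerNorm: 2,           # normalization layer
    nn.ReLU: 1,                # activation function layer
}
PRESET_PRIORITY = 1  # assumed threshold

captured = {}  # node name -> (input data, output data)

def set_monitoring_points(model: nn.Module):
    """Attach monitoring points (forward hooks) at the input and output of
    every target node whose monitoring priority exceeds the preset priority."""
    handles = []
    for name, module in model.named_modules():
        priority = NODE_TYPE_PRIORITY.get(type(module), 0)
        if priority > PRESET_PRIORITY:
            def hook(mod, inputs, output, name=name):
                # Record the node's input data and output data for later comparison.
                captured[name] = (
                    [t.detach().cpu() for t in inputs if torch.is_tensor(t)],
                    output.detach().cpu() if torch.is_tensor(output) else output,
                )
            handles.append(module.register_forward_hook(hook))
    return handles

# Running the neural network model once (e.g. model(sample_input)) then fills
# `captured` with per-node input/output data for the comparative analysis.
```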
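The drill-down of claim 6, which repeats the screening on a suspicious node's children until a childless node is reached, is essentially a recursive search. A hedged sketch over a generic node structure follows; the `GraphNode` fields and the `is_abnormal` predicate are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class GraphNode:
    # Assumed minimal node structure: a name plus its child nodes (sub-operators).
    name: str
    children: List["GraphNode"] = field(default_factory=list)

def locate_final_abnormal_node(node: GraphNode,
                               is_abnormal: Callable[[GraphNode], bool]
                               ) -> Optional[GraphNode]:
    """Repeat the screening on child nodes until a childless abnormal node
    is found; that node is reported as the final abnormal node."""
    if not is_abnormal(node):
        return None
    if not node.children:
        return node  # no child nodes: this is the final abnormal node
    for child in node.children:
        found = locate_final_abnormal_node(child, is_abnormal)
        if found is not None:
            return found
    # No child reproduces the anomaly, so the composite node itself is reported.
    return node
```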
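Claim 7 compares nodes by statistical indicators computed over the values that fall inside a preset data range. The following sketch uses NumPy as a stand-in for whichever tensor types the two frameworks produce; the default range, the tolerance, and the use of array shape to approximate the data arrangement order are illustrative assumptions.

```python
import numpy as np

def compare_node_statistics(first_data, second_data,
                            data_range=(-1e4, 1e4), tol=1e-5):
    """Compare two nodes' data via mean and variance computed over the
    elements inside a preset data range (illustrative defaults)."""
    a_full = np.asarray(first_data)
    b_full = np.asarray(second_data)

    # Data sequence attributes: data type and arrangement order (approximated
    # here by the array shape).
    if a_full.dtype != b_full.dtype or a_full.shape != b_full.shape:
        return {"consistent": False, "reason": "data sequence attributes differ"}

    a, b = a_full.ravel(), b_full.ravel()

    # Keep only the sub-input data that lie inside the preset data range.
    a_sel = a[(a >= data_range[0]) & (a <= data_range[1])]
    b_sel = b[(b >= data_range[0]) & (b <= data_range[1])]
    if a_sel.size == 0 or b_sel.size == 0:
        return {"consistent": False, "reason": "no data inside preset range"}

    # First and second statistical indicators: mean and variance.
    first_stats = (float(a_sel.mean()), float(a_sel.var()))
    second_stats = (float(b_sel.mean()), float(b_sel.var()))

    deviation = max(abs(first_stats[0] - second_stats[0]),
                    abs(first_stats[1] - second_stats[1]))
    return {"consistent": deviation <= tol,
            "first": first_stats, "second": second_stats,
            "deviation": deviation}
```

Two nodes whose means and variances agree within the tolerance are treated as aligned; a larger deviation flags the second-graph node as a candidate abnormal node.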
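Claim 8's universal computation graph representation can be pictured as a small framework-neutral intermediate form. The field names below are hypothetical, and the toy converter simply chains leaf modules in registration order, which only matches strictly sequential models; a real converter would recover the actual dependency relationships between nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class UniversalNode:
    name: str                 # node identifier
    op_type: str              # e.g. "Linear", "Conv2d", "LayerNorm", ...
    inputs: List[str] = field(default_factory=list)   # upstream node names
    attrs: Dict[str, object] = field(default_factory=dict)

@dataclass
class UniversalGraph:
    nodes: Dict[str, UniversalNode] = field(default_factory=dict)

    def add(self, node: UniversalNode) -> None:
        self.nodes[node.name] = node

def convert_sequential_torch_model(model) -> UniversalGraph:
    """Toy converter from a first-framework (PyTorch-style) model into the
    universal representation; a converter of the same shape for the second
    framework would let both graphs be walked node-by-node."""
    graph = UniversalGraph()
    prev = None
    for name, module in model.named_modules():
        if name and not list(module.children()):  # leaf modules only
            graph.add(UniversalNode(name=name,
                                    op_type=type(module).__name__,
                                    inputs=[prev] if prev else []))
            prev = name
    return graph
```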
CN202510395350.0A 2025-03-31 2025-03-31 Neural network model migration accuracy alignment method, device, and computer equipment Pending CN120471134A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510395350.0A CN120471134A (en) 2025-03-31 2025-03-31 Neural network model migration accuracy alignment method, device, and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510395350.0A CN120471134A (en) 2025-03-31 2025-03-31 Neural network model migration accuracy alignment method, device, and computer equipment

Publications (1)

Publication Number Publication Date
CN120471134A true CN120471134A (en) 2025-08-12

Family

ID=96639416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510395350.0A Pending CN120471134A (en) 2025-03-31 2025-03-31 Neural network model migration accuracy alignment method, device, and computer equipment

Country Status (1)

Country Link
CN (1) CN120471134A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120872784A (en) * 2025-09-26 2025-10-31 杭州市拱墅区全息智能技术研究院 Method and system for positioning and repairing inference accuracy exception of cross-frame model based on intelligent agent

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120872784A (en) * 2025-09-26 2025-10-31 杭州市拱墅区全息智能技术研究院 Method and system for positioning and repairing inference accuracy exception of cross-frame model based on intelligent agent

Similar Documents

Publication Publication Date Title
US11386154B2 (en) Method for generating a graph model for monitoring machinery health
EP3682324A1 (en) Method and apparatus for finding long methods in code
CN120471134A (en) Neural network model migration accuracy alignment method, device, and computer equipment
CN115438768A (en) Model reasoning method, device, computer equipment and storage medium
CN119690854A (en) Large model-assisted program function automatic perception fuzzy testing method and system
CN112631925A (en) Method for detecting single variable atom violation defect
JP2009099111A (en) Rule inspection program, rule inspection method, and rule inspection device
CN120012750B (en) Data calculation method and system suitable for biopharmaceutical production process
CN120407363A (en) Hardware operation deviation positioning method, device, equipment and storage medium
CN119149402A (en) Performance parameter tuning sequence determining method, device, equipment and medium
CN119597412A (en) Data processing device, method, system and readable storage medium
WO2025218046A1 (en) Fault analysis method and apparatus, computer device, and storage medium
CN112328239A (en) CIM model definition method and device
CN118071009A (en) Data prediction method and system
CN114676134A (en) Anomaly detection method, device, electronic device and storage medium for Hive table
CN114281691A (en) Test case sequencing method, device, computing device and storage medium
CN120872849B (en) Compatibility evaluation method and system of power grid business application to processor architecture
CN119182825B (en) A method and system for adapting CAN signal differences for automotive applications on a platform-based basis.
CN112927811B (en) Processing system and processing method of economic benefit model on medical data information
CN119227603B (en) A logic synthesis and verification method based on memristor-assisted logic
CN114500266A (en) Method, device and equipment for analyzing working state of node
CN116795681A (en) Data flow debugging method and device based on rule engine
CN120743999A (en) Visual processing method and device for property data
CN118672895A (en) Intelligent software defect detection method
CN119166524A (en) Method, device, electronic device and storage medium for selecting access control examples

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination