
CN120471134A - Neural network model migration accuracy alignment method, device, and computer equipment - Google Patents

Neural network model migration accuracy alignment method, device, and computer equipment

Info

Publication number
CN120471134A
Authority
CN
China
Prior art keywords
nodes
node
target node
graph
input data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510395350.0A
Other languages
Chinese (zh)
Inventor
韩钰
刘洋
蒋思源
刘胤辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guixin Technology Co ltd
Original Assignee
Beijing Guixin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Guixin Technology Co ltd filed Critical Beijing Guixin Technology Co ltd
Priority to CN202510395350.0A priority Critical patent/CN120471134A/en
Publication of CN120471134A publication Critical patent/CN120471134A/en
Pending legal-status Critical Current

Landscapes

  • Testing And Monitoring For Control Systems (AREA)

Abstract

The application provides a neural network model migration precision alignment method, a device and computer equipment. The method comprises: when migrating a neural network model from a first framework to a second framework, acquiring a first computational graph of the neural network model in the first framework and a second computational graph of the neural network model in the second framework; for either computational graph, screening out a target node to be checked from its nodes according to the dependency relationships among those nodes; comparing and analyzing the first input data and first output data of the first target node in the first computational graph with the second input data and second output data of the second target node in the second computational graph to obtain a comparative analysis result; and screening out abnormal nodes from the nodes of the second computational graph according to the comparative analysis result. The application can markedly improve the efficiency and accuracy of precision comparison during cross-framework migration, automatically detect and locate precision differences, and provide strong support for framework migration.

Description

Neural network model migration precision alignment method, device and computer equipment
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a neural network model migration precision alignment method, a device and computer equipment.
Background
With the development of large model technology, more and more people are using large models for intelligent generation tasks. Most current large models are developed on frameworks such as PyTorch, while mainstream domestic AI (Artificial Intelligence) frameworks, such as PaddlePaddle, differ from PyTorch, so in practice models need to be migrated from PyTorch-based frameworks to PaddlePaddle-based AI frameworks.
However, migrated models often exhibit calculation accuracy errors. For example, because the node granularity and node names of the frameworks before and after migration do not correspond one to one, locating the nodes with accuracy errors is difficult. In the prior art, manual comparison is generally used to locate the nodes with accuracy errors among all nodes, but this not only increases the workload of developers, it also reduces the efficiency of model migration.
Disclosure of Invention
In view of the above, the application provides a neural network model migration precision alignment method, a device and a computer device, which are used to solve the problems in the related art of increased developer workload and reduced model migration efficiency caused by manually locating nodes with precision errors.
An embodiment of a first aspect of the present application provides a neural network model migration accuracy alignment method, where the method includes:
When a neural network model is migrated from a first framework to a second framework, a first computational graph of the neural network model in the first framework is obtained, and a second computational graph of the neural network model in the second framework is obtained, wherein each computational graph comprises a plurality of nodes and the dependency relationships among those nodes;
For any one of the first computational graph and the second computational graph, a target node to be checked is screened out from the plurality of nodes in the computational graph according to the dependency relationships between the plurality of nodes;
The first input data and first output data of a first target node in the first computational graph are compared and analyzed against the second input data and second output data of a second target node in the second computational graph to obtain a comparative analysis result;
And abnormal nodes are screened out from the plurality of nodes of the second computational graph according to the comparative analysis result.
According to the method and device of the embodiments, the target nodes to be checked are screened out from the nodes according to the dependency relationships among the nodes in the computational graph, so nodes that may have accuracy anomalies can be identified and the efficiency and accuracy of precision comparison during cross-framework migration are improved; abnormal nodes are then screened out from the nodes of the second computational graph according to the comparative analysis result, so the nodes causing the precision errors can be determined accurately, further improving precision comparison efficiency.
In the embodiment of the present application, the screening of the target node to be checked from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph includes:
For any node among the plurality of nodes, calculating the number of operations related to the hidden layer dimension in the node according to the input data and output data of the node;
And if the number of operations is greater than a preset count threshold, determining the node as the target node.
In the embodiment of the present application, the screening of the target node to be checked from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph includes:
For any node among the plurality of nodes, determining the number of downstream nodes connected to the node, wherein a downstream node is a node that receives the output data of the node;
And if the number of downstream nodes is larger than a preset number threshold, determining the node as the target node.
In an embodiment of the present application, after the target node to be inspected is screened out from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph, the method further includes:
And setting monitoring points at the input position and the output position of the target node respectively, and running the neural network model, wherein the monitoring points are used for monitoring the input data and output data of the target node.
In the embodiment of the application, the setting of the monitoring point at the input position and the output position of the target node respectively comprises the following steps:
Determining the monitoring priority of the target node according to the node type of the target node;
and if the monitoring priority of the target node is greater than the preset priority, respectively setting monitoring points at the input position and the output position of the target node.
In an embodiment of the present application, after determining that the target node is an abnormal node, the method further includes:
judging whether the target node comprises a child node or not;
If the target node does not contain child nodes, the target node is taken as a final abnormal node;
and if the target node comprises a plurality of sub-nodes, taking the plurality of sub-nodes as a plurality of nodes in the computational graph, and repeatedly executing the step of screening the target node to be checked from the plurality of nodes according to the dependency relationship among the plurality of nodes in the computational graph until the target node does not comprise the sub-nodes, and taking the target node as a final abnormal node.
In the embodiment of the present application, comparing and analyzing first input data and first output data of a first target node in the first computation graph, and second input data and second output data of a second target node in the second computation graph, to obtain a comparison and analysis result, including:
Judging whether the data sequence attributes of the first input data and the second input data are consistent or not, wherein the data sequence attributes comprise data types and data arrangement sequences;
if the data sequence attributes of the first input data and the second input data are consistent, acquiring a plurality of first sub-input data of the first input data and a plurality of second sub-input data of the second input data;
Screening a plurality of first target data in a preset data range from the plurality of first sub-input data, and screening a plurality of second target data in the preset data range from the plurality of second sub-input data;
Calculating a first statistical index of the first input data through the plurality of first target data and a second statistical index of the second input data through the plurality of second target data, wherein the first statistical index and the second statistical index comprise a mean value and a variance;
and comparing and analyzing the first statistical index of the first input data with the second statistical index of the second input data to obtain a comparison and analysis result.
In an embodiment of the present application, obtaining a first computational graph of the neural network model in the first framework includes:
Generating a first initial computational graph of the neural network model in the first framework;
The method comprises the steps of converting a first initial computational graph into a first computational graph according to a general computational graph representation method, wherein the general computational graph representation method is used for presenting neural network models of different frameworks in a general computational graph representation mode;
Obtaining a second computational graph of the neural network model in the second framework, comprising:
Generating a second initial computational graph of the neural network model in the second framework;
and converting the second initial calculation graph into the second calculation graph according to the general calculation graph representation method.
An embodiment of a second aspect of the present application provides a neural network model migration accuracy alignment apparatus, including:
The computing graph acquisition module is used for acquiring a first computing graph of the neural network model in a first framework and acquiring a second computing graph of the neural network model in a second framework when the neural network model is migrated from the first framework to the second framework, wherein each computing graph comprises a plurality of nodes and a dependency relationship among the nodes;
the target node screening module is used for, for any one of the first computational graph and the second computational graph, screening out the target node to be checked from the plurality of nodes according to the dependency relationships among the nodes in the computational graph;
The comparison analysis result generation module is used for comparing and analyzing the first input data and the first output data of the first target node in the first calculation graph and the second input data and the second output data of the second target node in the second calculation graph to obtain a comparison analysis result;
and the abnormal node judging module is used for screening abnormal nodes from the plurality of nodes of the second calculation graph according to the comparison analysis result.
An embodiment of the third aspect of the present application provides a computer device, where the computer device includes a memory and a processor, where the memory and the processor are communicatively connected to each other, and the memory stores computer instructions, and the processor executes the computer instructions, thereby executing the neural network model migration accuracy alignment method described in the first aspect.
An embodiment of the fourth aspect of the present application provides a computer readable storage medium, where computer instructions are stored, where the computer instructions are configured to cause a computer to perform the neural network model migration accuracy alignment method described in the first aspect.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures.
In the drawings:
Fig. 1 is a schematic flow chart of a neural network model migration accuracy alignment method according to an embodiment of the present application;
FIG. 2 is a schematic diagram in which a first target node in a first computational graph and a second target node in a second computational graph occupy the same node position according to an embodiment of the present application;
FIG. 3 is a schematic diagram in which the node position of a first target node in a first computational graph corresponds to, but is not identical to, the node position of a second target node in a second computational graph according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a neural network model migration accuracy alignment device according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a computer device according to an embodiment of the present application;
fig. 6 is a schematic diagram of a storage medium according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the application to those skilled in the art.
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs.
Technical scenarios related to the embodiments of the present application are described below.
Framework precision alignment refers to converting a model originally implemented on PyTorch to another framework (e.g., PaddlePaddle) and comparing the calculation accuracy of the converted training or inference program against the original framework. Precision alignment is usually done by manual module-by-module comparison, and because the operators before and after conversion do not necessarily correspond one to one, this is time-consuming and labor-intensive. The embodiment of the application uses an automated method that analyzes the code once and automatically produces the set of operators (ops) with poor precision, which makes it possible to directly compare different framework implementations of the same model and find the precision problem at a specific node.
With the development of deep learning and large model technology, more and more people are using deep learning or large models for intelligent generation tasks. Most current large models are developed on foreign frameworks such as PyTorch, and from the perspective of autonomous controllability the demand for migration to domestic software and hardware is becoming increasingly urgent. Because domestic AI frameworks are not yet mature, calculation accuracy errors often arise after migrating from an original foreign framework such as PyTorch to a domestic AI framework (for example, PaddlePaddle). After each modification, the developer needs to test and locate the accuracy problem again.
The problems at present are:
(1) Locating accuracy problems currently requires manual step-by-step search by a developer.
(2) After repairing one problem, cumbersome testing is still required to locate the next problem.
(3) The operators of the framework before migration and the operators of the framework after migration often do not correspond one to one in granularity and name, so deciding where to insert test points requires manual analysis.
The embodiment of the application provides an automatic operator precision alignment method, which comprises analyzing the program before migration, analyzing the program after migration, automatically matching the two according to semantics, automatically inserting test points, and automatically reporting precision problems during testing.
According to an embodiment of the present application, there is provided an embodiment of a neural network model migration accuracy alignment method, it should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different from that herein.
In this embodiment, a neural network model migration accuracy alignment method is provided, fig. 1 is a flowchart of the neural network model migration accuracy alignment method according to an embodiment of the present application, as shown in fig. 1, where the flowchart includes the following steps:
Step S101, when migrating a neural network model from a first framework to a second framework, acquiring a first computational graph of the neural network model in the first framework, and acquiring a second computational graph of the neural network model in the second framework.
In the embodiment of the application, each computational graph comprises a plurality of nodes and the dependency relationships among the nodes. Any node may be a linear fully connected layer, a convolution layer, an attention layer, an activation function (e.g., ReLU) layer or a normalization layer. The dependency relationships among the nodes can be understood as the data flow between the nodes; for example, if the output data of node 1 is the input data of node 2, a dependency relationship between node 1 and node 2 is established.
In some embodiments, the step S101 further comprises steps S1011-S1014:
step S1011, generating a first initial computational graph of the neural network model in the first framework.
Wherein the first initial computational graph is the structure of the neural network model defined in the first framework (e.g., PyTorch), used for representing the computational logic of the neural network model and the dependencies between nodes.
Step S1012, converting the first initial computation graph into the first computation graph according to a general computation graph representation method.
The general computational graph representation method is used for presenting the neural network models of different frameworks through a general computational graph representation mode (for example, ONNX format).
The general computational graph representation method in the embodiment of the application takes into account framework independence, efficiency and flexibility, so that graphs can be exported and converted quickly from each framework while preserving the input and output information of each node required by downstream tasks. This not only smooths framework migration but also improves the generality and extensibility of the cross-framework computational graph.
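As an illustration of steps S1011-S1012, the following sketch exports a small PyTorch model to ONNX, one possible common computational-graph representation, and lists the resulting nodes with their input and output tensor names. The toy model, the input shape and the file name are assumptions made for the example, not details from the application.

import torch
import onnx

# a small illustrative model standing in for the neural network model
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 128),
)
dummy_input = torch.randn(1, 16, 128)          # (batch, sequence, hidden)
torch.onnx.export(model, dummy_input, "model_graph.onnx")

# the exported graph plays the role of the first computational graph
graph = onnx.load("model_graph.onnx").graph
for node in graph.node:
    # each node carries its operator type plus input/output tensor names,
    # which is the per-node information the alignment method relies on
    print(node.op_type, list(node.input), list(node.output))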
Step S1013, generating a second initial computational graph of the neural network model in the second framework.
Wherein the second initial computational graph is the structure of the neural network model defined in the second framework (e.g., PaddlePaddle), used for representing the computational logic of the neural network model and the dependencies between nodes.
Step S1014, converting the second initial computation graph into the second computation graph according to the general computation graph representation method.
Similarly, reference is made to step S1012.
Step S102, for any one of the first computation graph and the second computation graph, selecting a target node to be inspected from a plurality of nodes in the computation graph according to the dependency relationship between the plurality of nodes.
The target node to be checked refers to a node, among the plurality of nodes, that is likely to cause precision differences.
In some embodiments, the step S102 includes steps a 1-a 2:
Step a1, for any node among the plurality of nodes, calculating the number of operations related to the hidden layer dimension (Hidden Dimension) in the node according to the input data and output data of the node.
The number of operations related to the hidden layer dimension in a node is illustrated by the following example:
When the mathematical operation performed in a node is multiplying matrix A of shape (B1 × S1 × H1) by matrix B of shape (B2 × S2 × H2), i.e., (B1 × S1 × H1) × (B2 × S2 × H2), where B (Batch Size) denotes the number of samples, S (Sequence Length) denotes the sequence length, and H (Hidden Dimension) denotes the hidden layer dimension, the number of operations related to the hidden layer dimension can be determined as H1 × H2.
Step a2, determining the node as the target node if the number of operations is greater than a preset count threshold.
The size of the preset count threshold is not particularly limited, and a corresponding threshold can be set for different types of nodes. If the number of operations is smaller than or equal to the preset count threshold, the node is determined not to be a target node.
In the embodiment of the application, the target node refers to a computation-intensive node among the plurality of nodes. Such a node involves complex mathematical operations with high computational complexity, significantly affects the overall performance and efficiency of the model, and easily causes precision differences when the model is migrated from one framework to another.
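The following sketch illustrates the screening rule of steps a1-a2 under the assumption that each node exposes the shapes of the two operands it multiplies; the Node class, its field names and the threshold value are hypothetical, not taken from the application.

from dataclasses import dataclass

@dataclass
class Node:
    name: str
    input_shape: tuple   # (batch, sequence, hidden) of the first operand
    weight_shape: tuple  # (batch, sequence, hidden) of the second operand

def hidden_dim_ops(node: Node) -> int:
    # operations tied to the hidden-layer dimension: H1 * H2
    h1 = node.input_shape[-1]
    h2 = node.weight_shape[-1]
    return h1 * h2

def is_target_node(node: Node, threshold: int = 1_000_000) -> bool:
    # a node is compute-intensive (a target node) when the count exceeds
    # the preset count threshold
    return hidden_dim_ops(node) > threshold

attn = Node("compute_atten", (1, 2048, 4096), (1, 2048, 4096))
print(is_target_node(attn))   # True: 4096 * 4096 > 1_000_000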
In some embodiments, the step S102 includes steps b 1-b 2:
Step b1, for any node among the plurality of nodes, determining the number of downstream nodes connected to the node.
A downstream node is a node that receives the output data of the node. For example, given A→B, A→C and A→D, nodes B, C and D are downstream nodes of A, and the number of downstream nodes connected to node A is 3.
And b2, if the number of the nodes is larger than a preset number threshold, determining that the node is the target node.
The size of the preset number threshold is not specifically limited, and the preset number threshold is generally greater than 1, for example, 2, 3, and the like, and the corresponding preset number threshold may be set according to different types of nodes. And if the number of the nodes is smaller than or equal to a preset number threshold, determining that the nodes do not belong to the target node.
By determining the number of downstream nodes connected to a node and comparing it with the preset number threshold, the embodiment of the application can quickly screen out the nodes that have a larger influence on subsequent computation, since the output data of these nodes is relied upon by multiple downstream nodes. This screening method helps a developer or researcher quickly locate the key nodes in the model and determine whether these key nodes are the abnormal nodes causing the precision differences.
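A minimal sketch of the fan-out rule of steps b1-b2 might look as follows, assuming the dependency relationships are available as an edge list; the edge list and the threshold value are illustrative.

from collections import defaultdict

edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "E")]

downstream = defaultdict(set)
for src, dst in edges:
    downstream[src].add(dst)

def is_target_by_fanout(name: str, threshold: int = 2) -> bool:
    # a node whose output feeds more than `threshold` downstream nodes is
    # selected as a target node to be checked
    return len(downstream[name]) > threshold

print(is_target_by_fanout("A"))  # True: A feeds B, C and D
print(is_target_by_fanout("B"))  # False: B feeds only E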
In some specific embodiments, after the step S102, the method further includes:
Step c1, setting monitoring points at the input position and the output position of the target node respectively, and running the neural network model. The monitoring points are used to monitor the input data and output data of the target node.
In embodiments of the present application, placing monitoring points may be understood as inserting hook functions or black-box program instrumentation at the input and output of a particular node (e.g., a layer or an operation) of the neural network model; the hooks are used to capture and record the input data and output data of the node.
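One possible way to realise such monitoring points in a PyTorch-based first framework is a forward hook, as sketched below; the toy model, the layer chosen as the target node and the recording structure are assumptions made for the example.

import torch

records = {}

def make_hook(name):
    def hook(module, inputs, output):
        # capture the input and output tensors of the monitored target node
        records[name] = {
            "input": [t.detach().clone() for t in inputs],
            "output": output.detach().clone(),
        }
    return hook

model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU())
target = model[0]                       # suppose this layer was screened out
handle = target.register_forward_hook(make_hook("linear_0"))

model(torch.randn(4, 8))                # run the neural network model
handle.remove()
print(records["linear_0"]["output"].shape)   # torch.Size([4, 16])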
In some embodiments, the step c1 includes steps c11 to c12:
And c11, determining the monitoring priority of the target node according to the node type of the target node.
Different types of nodes (such as activation function layers, normalization layers and convolution layers) have different sensitivity to precision, so different types of target nodes correspond to different levels of monitoring priority. For example, an MLP layer is generally more sensitive to precision differences than an activation function layer, and is therefore assigned a higher monitoring priority.
And c12, if the monitoring priority of the target node is greater than the preset priority, setting monitoring points at the input position and the output position of the target node respectively.
In the embodiment of the application, when the monitoring priority of the target node is smaller than or equal to the preset priority, no monitoring points are set at the input position and the output position of the target node. The preset priority may be set according to the actual situation and is not specifically limited herein.
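A sketch of the priority rule of steps c11-c12 is shown below; the mapping from node type to priority level and the preset priority value are illustrative assumptions.

PRIORITY_BY_TYPE = {"mlp": 3, "attention": 3, "norm": 2, "activation": 1}

def should_monitor(node_type: str, preset_priority: int = 1) -> bool:
    # monitoring points are only inserted when the node's priority exceeds
    # the preset priority
    return PRIORITY_BY_TYPE.get(node_type, 0) > preset_priority

print(should_monitor("mlp"))         # True: monitoring points are set
print(should_monitor("activation"))  # False: no monitoring points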
Step S103, comparing and analyzing the first input data and the first output data of the first target node in the first computation graph, and the second input data and the second output data of the second target node in the second computation graph, to obtain a comparison analysis result.
In some embodiments, the node position of the first target node in the first computational graph is the same as the node position of the second target node in the second computational graph.
For example, given graphs A (a1→a2→a3) and B (b1→b2→b3), the node positions of a2 and b2 are the same. As another example, as shown in FIG. 2, when the first target node is compute_atten (the compute-attention layer) in the first computational graph, the second target node is compute_atten (the compute-attention layer) in the second computational graph.
In some embodiments, the node locations of the first target node in the first computational graph and the second target node in the second computational graph correspond, but are not identical.
For example, as shown in FIG. 3, when the first target nodes in PyTorch include "H×qd", "H×kvd" and "H×kvd", the corresponding second target node in PaddleNLP is "H×(qd+2kvd)".
In the embodiment of the application, the first input data and the second input data are subjected to comparative analysis to obtain a first comparative analysis result, and the first output data and the second output data are subjected to comparative analysis to obtain a second comparative analysis result.
In some embodiments, step S103 includes steps S1031-S1035:
step S1031, determining whether the data sequence attributes of the first input data and the second input data are consistent.
In the embodiment of the application, the data sequence attributes comprise the data type and the data arrangement order. The data type refers to the storage format of the data in memory, including 32-bit floating point (float32), 16-bit floating point (float16), 32-bit integer (int32) and 8-bit integer (int8). The data arrangement order refers to the order in which the data is stored in memory, including NCHW and NHWC:
NCHW: the data is stored in the order [Batch Size, Channels, Height, Width]. This is the default arrangement in PyTorch.
NHWC: the data is stored in the order [Batch Size, Height, Width, Channels]. This is the default arrangement in TensorFlow and is computationally efficient on certain hardware.
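The layout check of step S1031 could, for instance, be expressed as follows; the tensors, the assumed layouts and the transpose used to reconcile them are illustrative.

import numpy as np

a = np.random.rand(2, 3, 32, 32).astype(np.float32)   # NCHW from framework 1
b = np.transpose(a, (0, 2, 3, 1))                      # NHWC from framework 2

def sequence_attributes_match(x, x_layout, y, y_layout):
    # the data type and the data arrangement order must both agree
    return x.dtype == y.dtype and x_layout == y_layout

print(sequence_attributes_match(a, "NCHW", b, "NHWC"))   # False
# if the layouts differ, one side can be transposed into the other's order
b_as_nchw = np.transpose(b, (0, 3, 1, 2))
print(np.allclose(a, b_as_nchw))                          # True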
Step S1032, if the data sequence attributes of the first input data and the second input data are identical, acquiring a plurality of first sub-input data of the first input data, and acquiring a plurality of second sub-input data of the second input data.
Each input data comprises a plurality of sub-input data and each output data comprises a plurality of sub-output data; this applies equally to the first input data, the second input data, the first output data and the second output data.
Step S1033, screening a plurality of first target data within a preset data range from the plurality of first sub-input data, and screening a plurality of second target data within the preset data range from the plurality of second sub-input data.
Specifically, the preset data range may be set according to actual situations, which is not limited herein, and invalid data may be effectively filtered by setting the preset data range.
Step S1034, calculating a first statistical index of the first input data from the plurality of first target data, and calculating a second statistical index of the second input data from the plurality of second target data.
Specifically, the first statistical indicator and the second statistical indicator include a mean and a variance.
Step S1035, performing a comparative analysis on the first statistical index of the first input data and the second statistical index of the second input data, to obtain a comparative analysis result.
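The following sketch ties steps S1033-S1035 together: filter the sub-data to a preset data range, compute the mean and variance on both sides, and compare the statistical indicators. The data range, tolerances and simulated drift are illustrative assumptions.

import numpy as np

def stats_in_range(values, low=-1e4, high=1e4):
    # keep only sub-data inside the preset data range, then compute the
    # statistical indicators (mean and variance)
    kept = values[(values >= low) & (values <= high)]
    return kept.mean(), kept.var()

def compare_stats(x, y, mean_tol=1e-3, var_tol=1e-3):
    mx, vx = stats_in_range(x)
    my, vy = stats_in_range(y)
    return {"mean_diff": abs(mx - my), "var_diff": abs(vx - vy),
            "suspect": abs(mx - my) > mean_tol or abs(vx - vy) > var_tol}

first_input = np.random.randn(1024).astype(np.float32)
second_input = first_input + np.float32(1e-2)       # simulated migration drift
print(compare_stats(first_input, second_input))     # flags the node as suspect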
In some embodiments, the specific implementation manner of comparing the first output data with the second output data to obtain the second comparison analysis result may refer to the above steps S1031-S1035, which are not described herein.
According to the embodiment of the application, by calculating the statistical indicators of the input data and output data of the same node under different frameworks and comparing the statistical indicators of the two frameworks, the abnormal nodes causing the precision difference can be screened out from the plurality of nodes; for example, the node whose statistical indicators differ most between the two frameworks can be taken as the abnormal node.
And step S104, screening abnormal nodes from the plurality of nodes of the second calculation graph according to the comparison analysis result.
In the embodiment of the application, the abnormal node can be screened out in the following ways:
When the difference between the mean of the input data or output data of a node in the second computational graph and the mean of the node at the same node position in the first computational graph is greater than a preset mean-difference threshold, that node in the second computational graph can be determined to be an abnormal node.
Similarly, when the difference between the variance of the input data or output data of a node in the second computational graph and the variance of the node at the same node position in the first computational graph is greater than a preset variance-difference threshold, that node in the second computational graph can be determined to be an abnormal node.
In some embodiments, after the step S104, the method further includes steps S1051 to S1053:
in step S1051, it is determined whether the target node includes a child node.
Step S1052, if the target node does not include child nodes, the target node is the final abnormal node.
Step S1053, if the target node includes a plurality of sub-nodes, using the plurality of sub-nodes as a plurality of nodes in the computation graph, and repeating the step of screening the target node to be checked from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph until the target node does not include a sub-node, and using the target node as a final abnormal node.
Steps S1051 to S1053 are illustrated by the following example. Suppose the plurality of nodes is A→B→C→D, and node B is preliminarily determined to be an abnormal node through steps S101 to S104. If node B does not contain child nodes, node B can be directly determined to be the final abnormal node. If node B contains a plurality of child nodes, i.e., B1→B2→B3→B4, steps S102 to S104 need to be repeated to locate the abnormal child node among B1, B2, B3 and B4. If the abnormal child node contains no finer-grained nodes, it is determined to be the final abnormal node; otherwise, the above steps are repeated.
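A compact sketch of this refinement loop is given below, assuming each node exposes its child nodes and that the per-node comparison of steps S102-S104 is available as a predicate; both are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)

def locate_final_abnormal_node(node, is_abnormal):
    # `is_abnormal` stands in for the per-node comparison of steps S102-S104
    if not node.children:
        return node                          # leaf: the final abnormal node
    for child in node.children:
        if is_abnormal(child):
            return locate_final_abnormal_node(child, is_abnormal)
    return node                              # no finer-grained abnormal child

b = Node("B", [Node("B1"), Node("B2"), Node("B3")])
print(locate_final_abnormal_node(b, lambda n: n.name == "B2").name)  # B2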
In the embodiment of the application, by continuously iterating this process, the nodes with precision errors are gradually refined and further analyzed until the specific node with the precision problem is accurately located. This iterative mechanism ensures the efficiency and accuracy of precision comparison, minimizing error accumulation and inconsistency during migration.
The embodiment of the application can significantly improve the efficiency and accuracy of precision comparison in the cross-framework migration process, automatically detect and locate precision differences, and provide strong support for framework migration.
Example 2:
Corresponding to the implementation manner of the neural network model migration precision alignment method, the embodiment of the application also provides a neural network model migration precision alignment device, which is used for executing the neural network model migration precision alignment method described in the embodiment. As shown in fig. 4, the neural network model migration accuracy alignment apparatus includes:
The computing graph acquisition module is used for acquiring a first computing graph of the neural network model in a first framework and acquiring a second computing graph of the neural network model in a second framework when the neural network model is migrated from the first framework to the second framework, wherein each computing graph comprises a plurality of nodes and a dependency relationship among the nodes;
the target node screening module is used for, for any one of the first computational graph and the second computational graph, screening out the target node to be checked from the plurality of nodes according to the dependency relationships among the nodes in the computational graph;
The comparison analysis result generation module is used for comparing and analyzing the first input data and the first output data of the first target node in the first calculation graph and the second input data and the second output data of the second target node in the second calculation graph to obtain a comparison analysis result;
and the abnormal node judging module is used for screening abnormal nodes from the plurality of nodes of the second calculation graph according to the comparison analysis result.
Optionally, the target node screening module is further configured to calculate, for any node of the plurality of nodes, an operation number of times related to a hidden layer dimension in the node according to input data and output data of the node, and if the operation number of times is greater than a preset number of times threshold, determine that the node is the target node.
Optionally, the target node screening module is further configured to determine, for any node of the plurality of nodes, a node number of a downstream node connected to the node, where the downstream node is a node, among the plurality of nodes, for receiving output data of the node, and determine that the node is the target node if the node number is greater than a preset number threshold.
Optionally, the device further comprises a data monitoring module, wherein the data monitoring module is used for respectively setting monitoring points at the input position and the output position of the target node after the target node to be checked is screened out from the nodes according to the dependency relationship among the nodes in the calculation graph, and running the neural network model, and the monitoring points are used for monitoring the input data and the output data of the target node.
Optionally, the data monitoring module is further configured to determine a monitoring priority of the target node according to a node type of the target node, wherein different types of target nodes correspond to different levels of monitoring priorities, and if the monitoring priority of the target node is greater than a preset priority, monitoring points are set at an input position and an output position of the target node respectively.
Optionally, the device further comprises an iteration module, wherein the iteration module is used for judging whether the target node contains sub-nodes after judging that the target node is an abnormal node, if the target node does not contain sub-nodes, the target node is a final abnormal node, if the target node contains a plurality of sub-nodes, the plurality of sub-nodes are used as a plurality of nodes in the computation graph, and the step of screening the target node to be checked from the plurality of nodes according to the dependency relationship among the plurality of nodes in the computation graph is repeatedly executed until the target node does not contain sub-nodes, and the target node is the final abnormal node.
Optionally, the comparative analysis result generation module is further configured to: determine whether the data sequence attributes of the first input data and the second input data are consistent, the data sequence attributes including the data type and the data arrangement order; if the data sequence attributes of the first input data and the second input data are consistent, obtain a plurality of first sub-input data of the first input data and a plurality of second sub-input data of the second input data; screen out a plurality of first target data within a preset data range from the plurality of first sub-input data, and screen out a plurality of second target data within the preset data range from the plurality of second sub-input data; calculate a first statistical indicator of the first input data from the plurality of first target data and a second statistical indicator of the second input data from the plurality of second target data, the first statistical indicator and the second statistical indicator including a mean and a variance; and compare and analyze the first statistical indicator of the first input data with the second statistical indicator of the second input data to obtain the comparative analysis result.
Optionally, the computational graph acquisition module is further configured to: generate a first initial computational graph of the neural network model in the first framework and convert the first initial computational graph into the first computational graph according to a general computational graph representation method, where the general computational graph representation method is used to present neural network models of different frameworks in a common computational graph representation; and generate a second initial computational graph of the neural network model in the second framework and convert the second initial computational graph into the second computational graph according to the general computational graph representation method.
Because it is based on the same inventive concept, the neural network model migration precision alignment device provided by the embodiment of the application has the same beneficial effects as the neural network model migration precision alignment method provided by the embodiment of the application, which is the method adopted, run or implemented by the application program it stores.
The embodiment of the application also provides computer equipment for executing the neural network model migration precision alignment method. Referring to fig. 5, a schematic diagram of a computer device according to some embodiments of the present application is shown. As shown in fig. 5, the computer device 5 includes a processor 500, a memory 501, a bus 502 and a communication interface 505, where the processor 500, the communication interface 505 and the memory 501 are connected through the bus 502, and the memory 501 stores a computer program that can be run on the processor 500, and when the processor 500 runs the computer program, the neural network model migration accuracy alignment method provided by the foregoing embodiment of the present application is executed.
The memory 501 may include a high-speed random access memory (Random Access Memory, RAM), and may further include a non-volatile memory (non-volatile memory), such as at least one disk memory. The communication connection between the system network element and at least one other network element is implemented via at least one communication interface 505 (which may be wired or wireless); the Internet, a wide area network, a local area network, a metropolitan area network, etc. may be used.
Bus 502 may be an ISA bus, a PCI bus, an EISA bus, or the like. The buses may be classified as address buses, data buses, control buses, etc. The memory 501 is configured to store a program, and the processor 500 executes the program after receiving an execution instruction, and the neural network model migration accuracy alignment method disclosed in the foregoing embodiment may be applied to the processor 500 or implemented by the processor 500.
The processor 500 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits in hardware or by software instructions in the processor 500. The processor 500 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), etc., or may be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The methods, steps, and logic blocks disclosed in the embodiments of the present application may thereby be implemented or performed. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc. The steps of the method disclosed in connection with the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or another storage medium well known in the art. The storage medium is located in the memory 501, and the processor 500 reads the information in the memory 501 and, in combination with its hardware, performs the steps of the method described above.
Because it is based on the same inventive concept, the computer device provided by the embodiment of the application has the same beneficial effects as the neural network model migration precision alignment method provided by the embodiment of the application, which is the method it adopts, runs or implements.
The embodiment of the present application further provides a computer readable storage medium corresponding to the neural network model migration accuracy alignment method provided in the foregoing embodiment, referring to fig. 6, the computer readable storage medium is shown as an optical disc 50, on which a computer program (i.e. a program product) is stored, where the computer program, when executed by a processor, performs the neural network model migration accuracy alignment method provided in any of the foregoing embodiments.
It should be noted that examples of the computer readable storage medium may also include, but are not limited to, a phase change memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, or other optical or magnetic storage medium, which will not be described in detail herein.
Because it is based on the same inventive concept as the neural network model migration accuracy alignment method provided by the embodiment of the present application, the computer readable storage medium provided by the above embodiment has the same advantages as the method adopted, run or implemented by the application program it stores.
It should be noted that:
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the application may be practiced without these specific details. In some instances, well-known structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the application and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The present application is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present application are intended to be included in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1.一种神经网络模型迁移精度对齐方法,其特征在于,所述方法包括:1. A neural network model migration accuracy alignment method, characterized in that the method comprises: 在将神经网络模型从第一框架迁移至第二框架时,获取所述第一框架中所述神经网络模型的第一计算图,以及获取所述第二框架中所述神经网络模型的第二计算图;每个计算图包括多个节点以及节点之间的依赖关系;任一节点为线性全连接层、卷积层、注意力层、激活函数层或者标准化层;When migrating a neural network model from a first framework to a second framework, obtaining a first computational graph of the neural network model in the first framework, and obtaining a second computational graph of the neural network model in the second framework; each computational graph includes a plurality of nodes and dependencies between the nodes; any node is a linear fully connected layer, a convolutional layer, an attention layer, an activation function layer, or a normalization layer; 针对所述第一计算图和所述第二计算图中的任一计算图,根据所述计算图中多个节点之间的依赖关系从所述多个节点中筛选出待检查的目标节点;For any one of the first computation graph and the second computation graph, filter out a target node to be checked from the multiple nodes according to dependency relationships between the multiple nodes in the computation graph; 将所述第一计算图中第一目标节点的第一输入数据和第一输出数据,以及所述第二计算图中第二目标节点的第二输入数据和第二输出数据进行对比分析,得到对比分析结果;Comparing and analyzing the first input data and the first output data of the first target node in the first computation graph, and the second input data and the second output data of the second target node in the second computation graph, to obtain a comparative analysis result; 根据所述对比分析结果从所述第二计算图的多个节点中筛选出异常节点。According to the comparative analysis result, abnormal nodes are screened out from multiple nodes of the second computation graph. 2.根据权利要求1所述的方法,其特征在于,根据所述计算图中多个节点之间的依赖关系从所述多个节点中筛选出待检查的目标节点,包括:2. The method according to claim 1, wherein the step of selecting a target node to be checked from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph comprises: 针对所述多个节点中的任一节点,根据所述节点的输入数据和输出数据计算所述节点中与隐藏层维度相关的运算次数;For any node among the plurality of nodes, calculating the number of operations related to the hidden layer dimension in the node according to the input data and output data of the node; 如果所述运算次数大于预设次数阈值,则确定所述节点为所述目标节点。If the number of operations is greater than a preset number threshold, the node is determined to be the target node. 3.根据权利要求1所述的方法,其特征在于,根据所述计算图中多个节点之间的依赖关系从所述多个节点中筛选出待检查的目标节点,包括:3. The method according to claim 1, wherein the step of selecting a target node to be checked from the plurality of nodes according to the dependency relationship between the plurality of nodes in the computation graph comprises: 针对所述多个节点中的任一节点,确定与所述节点连接的下游节点的节点数量;所述下游节点是指所述多个节点中用于接收所述节点的输出数据的节点;For any node among the plurality of nodes, determining the number of downstream nodes connected to the node; the downstream node refers to a node among the plurality of nodes for receiving output data of the node; 如果所述节点数量大于预设数量阈值,则确定所述节点为为所述目标节点。If the number of the nodes is greater than a preset number threshold, the node is determined to be the target node. 4.根据权利要求1所述的方法,其特征在于,在根据所述计算图中多个节点之间的依赖关系从所述多个节点中筛选出待检查的目标节点之后,所述方法还包括:4. 
4. The method according to claim 1, wherein after selecting the target node to be checked from the plurality of nodes according to the dependency relationships between the plurality of nodes in the computation graph, the method further comprises: setting monitoring points at the input position and the output position of the target node respectively, and running the neural network model, the monitoring points being used to monitor the input data and the output data of the target node.

5. The method according to claim 4, wherein setting monitoring points at the input position and the output position of the target node respectively comprises: determining a monitoring priority of the target node according to the node type of the target node, different types of target nodes corresponding to different levels of monitoring priority; and if the monitoring priority of the target node is greater than a preset priority, setting monitoring points at the input position and the output position of the target node respectively.

6. The method according to claim 1, wherein after the target node is determined to be an abnormal node, the method further comprises: determining whether the target node contains child nodes; if the target node contains no child nodes, taking the target node as the final abnormal node; and if the target node contains a plurality of child nodes, taking the plurality of child nodes as the plurality of nodes in the computation graph and repeating the step of selecting the target node to be checked from the plurality of nodes according to the dependency relationships between the plurality of nodes in the computation graph, until the target node contains no child nodes, at which point the target node is taken as the final abnormal node.

7. The method according to claim 1 or 2, wherein comparing and analyzing the first input data and the first output data of the first target node in the first computation graph with the second input data and the second output data of the second target node in the second computation graph to obtain a comparative analysis result comprises: determining whether data sequence attributes of the first input data and the second input data are consistent, the data sequence attributes comprising data type and data arrangement order; if the data sequence attributes of the first input data and the second input data are consistent, obtaining a plurality of first sub-input data of the first input data and a plurality of second sub-input data of the second input data; selecting, from the plurality of first sub-input data, a plurality of first target data within a preset data range, and selecting, from the plurality of second sub-input data, a plurality of second target data within the preset data range; calculating a first statistical indicator of the first input data from the plurality of first target data, and calculating a second statistical indicator of the second input data from the plurality of second target data, the first statistical indicator and the second statistical indicator comprising mean and variance; and comparing and analyzing the first statistical indicator of the first input data with the second statistical indicator of the second input data to obtain the comparative analysis result.

8. The method according to claim 1 or 2, wherein obtaining the first computation graph of the neural network model in the first framework comprises: generating a first initial computation graph of the neural network model in the first framework; and converting the first initial computation graph into the first computation graph according to a universal computation graph representation method, the universal computation graph representation method being used to present neural network models of different frameworks in one common computation graph representation; and wherein obtaining the second computation graph of the neural network model in the second framework comprises: generating a second initial computation graph of the neural network model in the second framework; and converting the second initial computation graph into the second computation graph according to the universal computation graph representation method.

9. A neural network model migration accuracy alignment device, wherein the device comprises: a computation graph acquisition module, configured to, when a neural network model is migrated from a first framework to a second framework, acquire a first computation graph of the neural network model in the first framework and a second computation graph of the neural network model in the second framework, each computation graph comprising a plurality of nodes and dependency relationships between the nodes, and any node being a linear fully connected layer, a convolutional layer, an attention layer, an activation function layer or a normalization layer; a target node screening module, configured to, for either of the first computation graph and the second computation graph, select a target node to be checked from the plurality of nodes according to the dependency relationships between the plurality of nodes in the computation graph; a comparative analysis result generating module, configured to compare and analyze the first input data and the first output data of the first target node in the first computation graph with the second input data and the second output data of the second target node in the second computation graph to obtain a comparative analysis result; and an abnormal node judgment module, configured to select an abnormal node from the plurality of nodes of the second computation graph according to the comparative analysis result.

10. A computer device, comprising a memory and a processor communicatively connected to each other, wherein the memory stores computer instructions, and the processor executes the computer instructions to perform the neural network model migration accuracy alignment method according to any one of claims 1 to 8.
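The monitoring-point mechanism of claims 4 and 5 maps naturally onto forward hooks in the first framework. The sketch below is a minimal, hypothetical Python illustration assuming a PyTorch-style `register_forward_hook` API; the `NODE_TYPE_PRIORITY` table and the `PRESET_PRIORITY` threshold are invented placeholders, not values taken from the disclosure.

```python
import torch
import torch.nn as nn

# Hypothetical priority table: node types mapped to monitoring priority levels.
NODE_TYPE_PRIORITY = {
    nn.Linear: 3,              # linear fully connected layer
    nn.Conv2d: 3,              # convolutional layer
    nn.MultiheadAttention: 3,  # attention layer
    nn.LayerNorm: 2,           # normalization layer
    nn.ReLU: 1,                # activation function layer
}
PRESET_PRIORITY = 1  # assumed threshold

captured = {}  # node name -> (input data, output data)

def set_monitoring_points(model: nn.Module):
    """Attach monitoring points (forward hooks) at the input and output of
    every target node whose monitoring priority exceeds the preset priority."""
    handles = []
    for name, module in model.named_modules():
        priority = NODE_TYPE_PRIORITY.get(type(module), 0)
        if priority > PRESET_PRIORITY:
            def hook(mod, inputs, output, name=name):
                # Record the node's input data and output data for later comparison.
                captured[name] = (
                    [t.detach().cpu() for t in inputs if torch.is_tensor(t)],
                    output.detach().cpu() if torch.is_tensor(output) else output,
                )
            handles.append(module.register_forward_hook(hook))
    return handles

# Running the neural network model once (e.g. model(sample_input)) then fills
# `captured` with per-node input/output data for the comparative analysis.
```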
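The drill-down of claim 6, which repeats the screening on a suspicious node's children until a childless node is reached, is essentially a recursive search. A hedged sketch over a generic node structure follows; the `GraphNode` fields and the `is_abnormal` predicate are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class GraphNode:
    # Assumed minimal node structure: a name plus its child nodes (sub-operators).
    name: str
    children: List["GraphNode"] = field(default_factory=list)

def locate_final_abnormal_node(node: GraphNode,
                               is_abnormal: Callable[[GraphNode], bool]
                               ) -> Optional[GraphNode]:
    """Repeat the screening on child nodes until a childless abnormal node
    is found; that node is reported as the final abnormal node."""
    if not is_abnormal(node):
        return None
    if not node.children:
        return node  # no child nodes: this is the final abnormal node
    for child in node.children:
        found = locate_final_abnormal_node(child, is_abnormal)
        if found is not None:
            return found
    # No child reproduces the anomaly, so the composite node itself is reported.
    return node
```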
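Claim 7 compares nodes by statistical indicators computed over the values that fall inside a preset data range. The following sketch uses NumPy as a stand-in for whichever tensor types the two frameworks produce; the default range, the tolerance, and the use of array shape to approximate the data arrangement order are illustrative assumptions.

```python
import numpy as np

def compare_node_statistics(first_data, second_data,
                            data_range=(-1e4, 1e4), tol=1e-5):
    """Compare two nodes' data via mean and variance computed over the
    elements inside a preset data range (illustrative defaults)."""
    a_full = np.asarray(first_data)
    b_full = np.asarray(second_data)

    # Data sequence attributes: data type and arrangement order (approximated
    # here by the array shape).
    if a_full.dtype != b_full.dtype or a_full.shape != b_full.shape:
        return {"consistent": False, "reason": "data sequence attributes differ"}

    a, b = a_full.ravel(), b_full.ravel()

    # Keep only the sub-input data that lie inside the preset data range.
    a_sel = a[(a >= data_range[0]) & (a <= data_range[1])]
    b_sel = b[(b >= data_range[0]) & (b <= data_range[1])]
    if a_sel.size == 0 or b_sel.size == 0:
        return {"consistent": False, "reason": "no data inside preset range"}

    # First and second statistical indicators: mean and variance.
    first_stats = (float(a_sel.mean()), float(a_sel.var()))
    second_stats = (float(b_sel.mean()), float(b_sel.var()))

    deviation = max(abs(first_stats[0] - second_stats[0]),
                    abs(first_stats[1] - second_stats[1]))
    return {"consistent": deviation <= tol,
            "first": first_stats, "second": second_stats,
            "deviation": deviation}
```

Two nodes whose means and variances agree within the tolerance are treated as aligned; a larger deviation flags the second-graph node as a candidate abnormal node.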
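Claim 8's universal computation graph representation can be pictured as a small framework-neutral intermediate form. The field names below are hypothetical, and the toy converter simply chains leaf modules in registration order, which only matches strictly sequential models; a real converter would recover the actual dependency relationships between nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class UniversalNode:
    name: str                 # node identifier
    op_type: str              # e.g. "Linear", "Conv2d", "LayerNorm", ...
    inputs: List[str] = field(default_factory=list)   # upstream node names
    attrs: Dict[str, object] = field(default_factory=dict)

@dataclass
class UniversalGraph:
    nodes: Dict[str, UniversalNode] = field(default_factory=dict)

    def add(self, node: UniversalNode) -> None:
        self.nodes[node.name] = node

def convert_sequential_torch_model(model) -> UniversalGraph:
    """Toy converter from a first-framework (PyTorch-style) model into the
    universal representation; a converter of the same shape for the second
    framework would let both graphs be walked node-by-node."""
    graph = UniversalGraph()
    prev = None
    for name, module in model.named_modules():
        if name and not list(module.children()):  # leaf modules only
            graph.add(UniversalNode(name=name,
                                    op_type=type(module).__name__,
                                    inputs=[prev] if prev else []))
            prev = name
    return graph
```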
CN202510395350.0A 2025-03-31 2025-03-31 Neural network model migration accuracy alignment method, device, and computer equipment Pending CN120471134A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510395350.0A CN120471134A (en) 2025-03-31 2025-03-31 Neural network model migration accuracy alignment method, device, and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510395350.0A CN120471134A (en) 2025-03-31 2025-03-31 Neural network model migration accuracy alignment method, device, and computer equipment

Publications (1)

Publication Number Publication Date
CN120471134A true CN120471134A (en) 2025-08-12

Family

ID=96639416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510395350.0A Pending CN120471134A (en) 2025-03-31 2025-03-31 Neural network model migration accuracy alignment method, device, and computer equipment

Country Status (1)

Country Link
CN (1) CN120471134A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120872784A (en) * 2025-09-26 2025-10-31 杭州市拱墅区全息智能技术研究院 Method and system for positioning and repairing inference accuracy exception of cross-frame model based on intelligent agent

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120872784A (en) * 2025-09-26 2025-10-31 杭州市拱墅区全息智能技术研究院 Method and system for positioning and repairing inference accuracy exception of cross-frame model based on intelligent agent

Similar Documents

Publication Publication Date Title
US11386154B2 (en) Method for generating a graph model for monitoring machinery health
EP3682324A1 (en) Method and apparatus for finding long methods in code
CN120471134A (en) Neural network model migration accuracy alignment method, device, and computer equipment
CN115438768A (en) Model reasoning method, device, computer equipment and storage medium
CN119690854A (en) Large model-assisted program function automatic perception fuzzy testing method and system
CN112631925A (en) Method for detecting single variable atom violation defect
JP2009099111A (en) Rule inspection program, rule inspection method, and rule inspection device
CN120012750B (en) Data calculation method and system suitable for biopharmaceutical production process
CN120407363A (en) Hardware operation deviation positioning method, device, equipment and storage medium
CN119149402A (en) Performance parameter tuning sequence determining method, device, equipment and medium
CN119597412A (en) Data processing device, method, system and readable storage medium
WO2025218046A1 (en) Fault analysis method and apparatus, computer device, and storage medium
CN112328239A (en) CIM model definition method and device
CN118071009A (en) Data prediction method and system
CN114676134A (en) Anomaly detection method, device, electronic device and storage medium for Hive table
CN114281691A (en) Test case sequencing method, device, computing device and storage medium
CN120872849B (en) Compatibility evaluation method and system of power grid business application to processor architecture
CN119182825B (en) A method and system for adapting CAN signal differences for automotive applications on a platform-based basis.
CN112927811B (en) Processing system and processing method of economic benefit model on medical data information
CN119227603B (en) A logic synthesis and verification method based on memristor-assisted logic
CN114500266A (en) Method, device and equipment for analyzing working state of node
CN116795681A (en) Data flow debugging method and device based on rule engine
CN120743999A (en) Visual processing method and device for property data
CN118672895A (en) Intelligent software defect detection method
CN119166524A (en) Method, device, electronic device and storage medium for selecting access control examples

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination