
CN109800869B - Data compression method and related device - Google Patents


Info

Publication number
CN109800869B
CN109800869B (application CN201811641325.2A)
Authority
CN
China
Prior art keywords
data
data packet
packet
packets
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811641325.2A
Other languages
Chinese (zh)
Other versions
CN109800869A (en)
Inventor
王和国
李爱军
曹庆新
李炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd
Priority to CN201811641325.2A
Publication of CN109800869A
Priority to PCT/CN2019/114731 (published as WO2020134550A1)
Application granted
Publication of CN109800869B
Legal status: Active

Classifications

    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    (all within GPHYSICS > G06 Computing or calculating; counting > G06N Computing arrangements based on specific computational models > G06N3/00 Computing arrangements based on biological models > G06N3/02 Neural networks)

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The embodiment of the application discloses a data compression method and a related device. The method is applied to a neural network model comprising an N-layer structure, where N is an integer greater than 1, and comprises the following steps: acquiring an output data set of an i-th layer structure in the neural network model, wherein the output data set comprises at least one m×n matrix, m and n are integers greater than 1, and i is any one of 1 to N; performing data packet segmentation on the at least one m×n matrix to obtain M segmented first data packets, wherein M is an integer greater than or equal to 1; and performing data compression on the M first data packets to obtain two second data packets after data compression. By adopting the embodiment of the application, the compression efficiency of data in the neural network model can be improved.

Description

Data compression method and related device
Technical Field
The present application relates to the field of neural network technology, and in particular, to a data compression method and related apparatus.
Background
With the development of artificial intelligence, neural networks, particularly convolutional neural networks (CNNs), play an increasingly important role. During the operation of a neural network model, its data and weights place heavy demands on system bandwidth, so reducing the bandwidth requirement of the neural network model is an urgent problem.
At present, compressed column storage (CCS) and compressed row storage (CRS) are used to compress data in neural network models, and neither compression scheme achieves high compression efficiency.
Disclosure of Invention
The embodiment of the application provides a data compression method and a related device, which are used for improving the compression efficiency of data in a neural network model.
In a first aspect, an embodiment of the present application provides a data compression method, which is applied to a neural network model including an N-layer structure, where N is an integer greater than 1, and the method includes:
acquiring an output data set of an i-th layer structure in the neural network model, wherein the output data set comprises at least one m×n matrix, m and n are integers greater than 1, and i is any one of 1 to N;
performing data packet segmentation on the at least one m×n matrix to obtain M segmented first data packets, wherein M is an integer greater than or equal to 1;
and performing data compression on the M first data packets to obtain two second data packets after data compression.
In one possible example, the obtaining the output data set of the i-th layer structure in the neural network model includes:
when i is 1, acquiring an input data set of a layer 1 structure in the neural network model, wherein the input data set of the layer 1 structure comprises at least one first matrix;
acquiring a weight data set of the 1 st layer structure, decompressing the weight data set of the 1 st layer structure to obtain a second matrix, wherein the second matrix is the decompressed weight data set of the 1 st layer structure;
determining an output data set of the layer 1 structure based on the at least one first matrix and the second matrix;
when 2 ≤ i ≤ N, acquiring an output data set of the (i-1)-th layer structure in the neural network model, wherein the output data set of the (i-1)-th layer structure comprises at least one third matrix;
acquiring a weight data set of the ith layer structure, decompressing the weight data set of the ith layer structure to obtain a fourth matrix, wherein the fourth matrix is the decompressed weight data set of the ith layer structure;
determining an output data set of the i-th layer structure based on the at least one third matrix and the fourth matrix.
In one possible example, the performing data packet segmentation on the at least one m×n matrix to obtain M segmented first data packets includes:
performing array conversion on the at least one m×n matrix to obtain a one-dimensional array, wherein the one-dimensional array is the at least one m×n matrix after array conversion;
and performing data packet segmentation on the one-dimensional array to obtain M segmented first data packets, wherein each of the 1st to (M-1)-th first data packets comprises P data, the M-th first data packet comprises Q data, P is an integer greater than or equal to 1, and Q is an integer greater than or equal to 1 and less than or equal to P.
In one possible example, the performing data compression on the M first data packets to obtain two second data packets after data compression includes:
acquiring data packet information of a j-th first data packet, where the data packet information of the j-th first data packet includes P indication signals, a first data set, and the length of the j-th first data packet, the indication signals are used to indicate whether each of the P data included in the j-th first data packet is zero, the first data set includes at least one non-zero datum among the P data included in the j-th first data packet, and the j-th first data packet is any one of the 1st to (M-1)-th first data packets;
performing the same operation on the M-2 first data packets other than the j-th first data packet among the 1st to (M-1)-th first data packets, to obtain data packet information of each of the M-2 first data packets;
acquiring data packet information of the M-th first data packet, where the data packet information of the M-th first data packet includes Q indication signals, a second data set, and the length of the M-th first data packet, the indication signals are used to indicate whether each of the Q data included in the M-th first data packet is zero, and the second data set includes at least one non-zero datum among the Q data included in the M-th first data packet;
forming a first sub-data packet from the P indication signals and the first data set included in each of the 1st to (M-1)-th first data packets, to obtain M-1 first sub-data packets;
forming an M-th first sub-data packet from the Q indication signals and the second data set included in the M-th first data packet;
forming a 1st second data packet from the M-1 first sub-data packets and the M-th first sub-data packet, based on the ordering of the M first data packets;
and forming a 2nd second data packet from the lengths of the M first data packets, based on the ordering of the M first data packets.
In a possible example, after the performing data compression on the M first data packets to obtain two second data packets after data compression, the method further includes:
performing the same operation on the N-1 layer structures other than the i-th layer structure among the N layer structures, to obtain two second data packets corresponding to each of the N-1 layer structures;
and storing the 2N second data packets corresponding to the N-layer structure into a double-data rate synchronous dynamic random access memory (DDR).
In a second aspect, an embodiment of the present application provides a data compression apparatus, which is applied to a neural network model including an N-layer structure, where N is an integer greater than 1, and the apparatus includes:
an obtaining unit, configured to obtain an output data set of an i-th layer structure in the neural network model, where the output data set includes at least one m×n matrix, m and n are integers greater than 1, and i is any one of 1 to N;
a segmentation unit, configured to perform data packet segmentation on the at least one m×n matrix to obtain M segmented first data packets, where M is an integer greater than or equal to 1;
and the compression unit is used for carrying out data compression on the M first data packets to obtain two second data packets after data compression.
In one possible example, in obtaining the output data set of the i-th layer structure in the neural network model, the obtaining unit is specifically configured to:
when i is 1, acquiring an input data set of a layer 1 structure in the neural network model, wherein the input data set of the layer 1 structure comprises at least one first matrix;
acquiring a weight data set of the 1 st layer structure, decompressing the weight data set of the 1 st layer structure to obtain a second matrix, wherein the second matrix is the decompressed weight data set of the 1 st layer structure;
determining an output data set of the layer 1 structure based on the at least one first matrix and the second matrix;
when 2 ≤ i ≤ N, acquiring an output data set of the (i-1)-th layer structure in the neural network model, wherein the output data set of the (i-1)-th layer structure comprises at least one third matrix;
acquiring a weight data set of the ith layer structure, decompressing the weight data set of the ith layer structure to obtain a fourth matrix, wherein the fourth matrix is the decompressed weight data set of the ith layer structure;
determining an output data set of the i-th layer structure based on the at least one third matrix and the fourth matrix.
In a possible example, in terms of performing data packet segmentation on the at least one m×n matrix to obtain M segmented first data packets, the segmentation unit is specifically configured to:
perform array conversion on the at least one m×n matrix to obtain a one-dimensional array, where the one-dimensional array is the at least one m×n matrix after array conversion;
and perform data packet segmentation on the one-dimensional array to obtain M segmented first data packets, where each of the 1st to (M-1)-th first data packets includes P data, the M-th first data packet includes Q data, P is an integer greater than or equal to 1, and Q is an integer greater than or equal to 1 and less than or equal to P.
In a possible example, in terms of performing data compression on the M first data packets to obtain two second data packets after data compression, the compression unit is specifically configured to:
acquiring data packet information of a j-th first data packet, where the data packet information of the j-th first data packet includes P indication signals, a first data set, and the length of the j-th first data packet, the indication signals are used to indicate whether each of the P data included in the j-th first data packet is zero, the first data set includes at least one non-zero datum among the P data included in the j-th first data packet, and the j-th first data packet is any one of the 1st to (M-1)-th first data packets;
performing the same operation on the M-2 first data packets other than the j-th first data packet among the 1st to (M-1)-th first data packets, to obtain data packet information of each of the M-2 first data packets;
acquiring data packet information of the M-th first data packet, where the data packet information of the M-th first data packet includes Q indication signals, a second data set, and the length of the M-th first data packet, the indication signals are used to indicate whether each of the Q data included in the M-th first data packet is zero, and the second data set includes at least one non-zero datum among the Q data included in the M-th first data packet;
forming a first sub-data packet from the P indication signals and the first data set included in each of the 1st to (M-1)-th first data packets, to obtain M-1 first sub-data packets;
forming an M-th first sub-data packet from the Q indication signals and the second data set included in the M-th first data packet;
forming a 1st second data packet from the M-1 first sub-data packets and the M-th first sub-data packet, based on the ordering of the M first data packets;
and forming a 2nd second data packet from the lengths of the M first data packets, based on the ordering of the M first data packets.
In one possible example, the data compression apparatus further comprises:
the execution unit is used for performing the same operation on the N-1 layer structures other than the i-th layer structure among the N layer structures, to obtain two second data packets corresponding to each of the N-1 layer structures;
and the storage unit is used for storing the 2N second data packets corresponding to the N-layer structure into a double data rate synchronous dynamic random access memory (DDR).
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for executing steps in the method according to the first aspect of the embodiment of the present application.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium for storing a computer program, where the computer program is executed by a processor to implement some or all of the steps described in the method according to the first aspect of the embodiments of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps described in a method as described in the first aspect of embodiments of the present application. The computer program product may be a software installation package.
It can be seen that, in the embodiment of the present application, the data compression apparatus obtains an output data set of an i-th layer structure in the neural network model, where the output data set includes at least one m×n matrix, performs data packet segmentation on the at least one m×n matrix to obtain M segmented first data packets, and performs data compression on the M first data packets to obtain two second data packets. Therefore, by segmenting the at least one m×n matrix into M first data packets and compressing the M first data packets into two second data packets, the output data set is compressed into two second data packets, which improves the compression efficiency of data in the neural network model.
These and other aspects of the present application will be more readily apparent from the following description of the embodiments.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application or in the background art, the drawings required for the embodiments or the background art are briefly described below.
Fig. 1A is a schematic flowchart of a data compression method according to an embodiment of the present application;
FIG. 1B is a schematic diagram provided by an embodiment of the present application;
FIG. 1C is another schematic illustration provided by an embodiment of the present application;
FIG. 1D is another schematic illustration provided by an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram illustrating another data compression method according to an embodiment of the present disclosure;
FIG. 3 is a schematic flow chart diagram illustrating another data compression method according to an embodiment of the present application;
FIG. 4 is a block diagram illustrating functional units of a data compression apparatus according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed description of the invention
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without creative effort shall fall within the protection scope of the present application.
The details are described below.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The data compression apparatus according to the embodiment of the present application may be integrated in an electronic device, and the electronic device may include various handheld devices, vehicle-mounted devices, wearable devices, computing devices or other processing devices connected to a wireless modem, and various forms of User Equipment (UE), Mobile Stations (MS), terminal devices (terminal device), and the like. For convenience of description, the above-mentioned devices are collectively referred to as electronic devices.
The following describes embodiments of the present application in detail.
Referring to Fig. 1A, Fig. 1A is a schematic flowchart of a data compression method according to an embodiment of the present application. The method is applied to a neural network model including an N-layer structure, where N is an integer greater than 1, and includes the following steps:
step 101: the data compression device obtains an output data set of an ith layer structure in the neural network model, wherein the output data set comprises at least one m x N matrix, m and N are integers greater than 1, and i is any one of 1 to N.
Taking a convolutional neural network as an example, the N-layer structure includes an input layer, convolutional layers, pooling layers, and a fully connected layer.
In one possible example, the data compression apparatus obtains an output data set of an i-th layer structure in the neural network model, including:
when i is 1, the data compression device acquires an input data set of a layer 1 structure in the neural network model, wherein the input data set of the layer 1 structure comprises at least one first matrix;
the data compression device obtains the weight data set of the 1 st layer structure, and decompresses the weight data set of the 1 st layer structure to obtain a second matrix, wherein the second matrix is the decompressed weight data set of the 1 st layer structure;
the data compression means determines a set of output data of the layer 1 structure based on the at least one first matrix and the second matrix;
when 2 ≤ i ≤ N, the data compression device acquires an output data set of the (i-1)-th layer structure in the neural network model, wherein the output data set of the (i-1)-th layer structure comprises at least one third matrix;
the data compression device obtains the weight data set of the ith layer structure, and decompresses the weight data set of the ith layer structure to obtain a fourth matrix, wherein the fourth matrix is the decompressed weight data set of the ith layer structure;
the data compression means determines an output data set of the i-th layer structure based on the at least one third matrix and the fourth matrix.
The output data set of the 1 st layer structure is the product of at least one first matrix and a second matrix, and the number of columns of each first matrix is the same as the number of rows of the second matrix.
And the output data set of the ith layer structure is the product of at least one third matrix and a fourth matrix, and the column number of each third matrix is the same as the row number of the fourth matrix.
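As a hedged illustration of the layer computation just described (the function names are my own, not from the patent), the output data set of a layer can be sketched in plain Python as the product of each input (or previous-layer output) matrix with the decompressed weight matrix:

```python
def matmul(a, b):
    """Plain-Python matrix product; the column count of `a` must equal
    the row count of `b`, as the description above requires."""
    assert len(a[0]) == len(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def layer_output(input_matrices, weight_matrix):
    """Output data set of one layer: each input (or previous-layer
    output) matrix multiplied by the decompressed weight matrix."""
    return [matmul(m, weight_matrix) for m in input_matrices]
```

This is only a sketch of the data flow; a real implementation would also apply the layer's activation and any convolution-specific indexing, which the patent does not detail here.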
Step 102: the data compression device performs data packet segmentation on the at least one m×n matrix to obtain M segmented first data packets, wherein M is an integer greater than or equal to 1.
In one possible example, the data compression apparatus performs data packet segmentation on the at least one m×n matrix to obtain M segmented first data packets, including:
the data compression apparatus performs array conversion on the at least one m×n matrix to obtain a one-dimensional array, wherein the one-dimensional array is the at least one m×n matrix after array conversion;
the data compression apparatus performs data packet segmentation on the one-dimensional array to obtain M segmented first data packets, wherein each of the 1st to (M-1)-th first data packets comprises P data, the M-th first data packet comprises Q data, P is an integer greater than or equal to 1, and Q is an integer greater than or equal to 1 and less than or equal to P.
All of the P data and the Q data have the same size.
For example, as shown in Fig. 1B, assuming that P is 10, the data compression apparatus performs array conversion on an 8×8 matrix (a) to obtain a one-dimensional array (b) containing 64 data, and performs data packet segmentation on the one-dimensional array (b) to obtain 7 segmented first data packets (c), where each of the 1st to 6th first data packets includes 10 data and the 7th first data packet includes 4 data.
For example, as shown in Fig. 1C, assuming that P is 16, the data compression apparatus performs array conversion on two 8×8 matrices (d) to obtain a one-dimensional array (e) containing 128 data, and performs data packet segmentation on the one-dimensional array (e) to obtain 8 segmented first data packets (f), where each of the 1st to 8th first data packets includes 16 data.
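The segmentation in the two examples above can be sketched as follows; this is an illustrative reading of the patent, and `segment` is a hypothetical helper name:

```python
def segment(matrices, P):
    """Flatten one or more m×n matrices (row-major) into a
    one-dimensional array, then split it into first data packets of at
    most P data each: the first M-1 packets hold P data, and the last
    holds the remaining Q data (1 <= Q <= P).
    """
    flat = [x for mat in matrices for row in mat for x in row]
    return [flat[i:i + P] for i in range(0, len(flat), P)]
```

With one 8×8 matrix and P = 10 this yields 7 packets (six of 10 data and one of 4), matching Fig. 1B; with two 8×8 matrices and P = 16 it yields 8 packets of 16 data, matching Fig. 1C.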
Step 103: and the data compression device performs data compression on the M first data packets to obtain two second data packets after data compression.
In one possible example, the data compression apparatus performs data compression on the M first data packets to obtain two second data packets after data compression, including:
the data compression device acquires data packet information of a j-th first data packet, where the data packet information of the j-th first data packet includes P indication signals, a first data set, and the length of the j-th first data packet, the indication signals are used to indicate whether each of the P data included in the j-th first data packet is zero, the first data set includes at least one non-zero datum among the P data included in the j-th first data packet, and the j-th first data packet is any one of the 1st to (M-1)-th first data packets;
the data compression device performs the same operation on the M-2 first data packets other than the j-th first data packet among the 1st to (M-1)-th first data packets, to obtain data packet information of each of the M-2 first data packets;
the data compression device acquires data packet information of the M-th first data packet, where the data packet information of the M-th first data packet includes Q indication signals, a second data set, and the length of the M-th first data packet, the indication signals are used to indicate whether each of the Q data included in the M-th first data packet is zero, and the second data set includes at least one non-zero datum among the Q data included in the M-th first data packet;
the data compression device forms a first sub-data packet from the P indication signals and the first data set included in each of the 1st to (M-1)-th first data packets, obtaining M-1 first sub-data packets;
the data compression device forms an M-th first sub-data packet from the Q indication signals and the second data set included in the M-th first data packet;
the data compression device forms a 1st second data packet from the M-1 first sub-data packets and the M-th first sub-data packet, based on the ordering of the M first data packets;
the data compression device forms a 2nd second data packet from the lengths of the M first data packets, based on the ordering of the M first data packets.
The ordering of the M first sub-packets in the 1 st second data packet is the same as the ordering of the M first data packets, that is, the M first sub-packets correspond to the M first data packets one to one.
The length ordering of the M first data packets in the 2 nd second data packet is the same as the ordering of the M first data packets, that is, the lengths of the M first data packets correspond to the M first data packets one to one.
For example, as shown in Fig. 1D, assume that M is 3 and the 3 first data packets are ordered from the 1st to the 3rd. The data compression apparatus obtains the data packet information of the 3 first data packets: the data packet information of the 1st first data packet includes 16 indication signals (16 bits) and 12 non-zero data (data 1 to data 12), and the length of the 1st first data packet is 64 bits; the data packet information of the 2nd first data packet includes 16 indication signals (16 bits) and 10 non-zero data (data 13 to data 22), and the length of the 2nd first data packet is 56 bits; the data packet information of the 3rd first data packet includes 16 indication signals (16 bits) and 11 non-zero data (data 23 to data 33), and the length of the 3rd first data packet is 60 bits. The indication signals and non-zero data included in each of the 3 first data packets are formed into a first sub-data packet, yielding 3 first sub-data packets; based on the ordering of the 3 first data packets, the 3 first sub-data packets form the 1st second data packet (g), and the lengths of the 3 first data packets form the 2nd second data packet (h).
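A minimal sketch of the compression step, assuming (as the Fig. 1D numbers suggest) one indication bit per datum and 4-bit data values; the helper name and the length accounting are illustrative, not prescribed by the patent:

```python
def compress(packets, data_bits=4):
    """For each first data packet, build its packet information: one
    indication bit per datum (1 = non-zero) plus the non-zero values.
    The sub-packets, kept in packet order, make up the 1st second data
    packet; the per-packet bit lengths make up the 2nd second packet.
    """
    sub_packets, lengths = [], []
    for pkt in packets:
        bitmap = [1 if x != 0 else 0 for x in pkt]
        nonzero = [x for x in pkt if x != 0]
        sub_packets.append((bitmap, nonzero))
        # length in bits: one bit per indication signal plus data_bits
        # per non-zero datum; 4-bit data reproduce the 64/56/60-bit
        # lengths in the Fig. 1D example (16 + 4*12 = 64, etc.)
        lengths.append(len(bitmap) + data_bits * len(nonzero))
    return sub_packets, lengths
```

With a 16-element packet containing 12 non-zero values this gives a 64-bit length, matching the 1st first data packet in Fig. 1D.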
It can be seen that, in the embodiment of the present application, the data compression apparatus obtains an output data set of an i-th layer structure in the neural network model, where the output data set includes at least one m×n matrix, performs data packet segmentation on the at least one m×n matrix to obtain M segmented first data packets, and performs data compression on the M first data packets to obtain two second data packets. Therefore, by segmenting the at least one m×n matrix into M first data packets and compressing the M first data packets into two second data packets, the output data set is compressed into two second data packets, which improves the compression efficiency of data in the neural network model.
In one possible example, after the data compression device performs data compression on the M first data packets to obtain two second data packets after the data compression, the method further includes:
the data compression device performs the same operation on the N-1 layer structures other than the i-th layer structure among the N layer structures, to obtain two second data packets corresponding to each of the N-1 layer structures;
and the data compression device stores the 2N second data packets corresponding to the N-layer structure into a double-data-rate synchronous dynamic random access memory (DDR).
As can be seen, in this example, the data compression apparatus stores the 2N second packets corresponding to the N-layer structure into the DDR, and since the storage of the 2N second packets into the DDR is an off-chip storage manner, the occupation of the internal storage space of the neural network model is reduced.
In one possible example, after the data compression apparatus stores the 2N second data packets corresponding to the N-layer structure into the DDR, the method further includes:
the data compression device acquires a first position of data to be searched, where the first position is the p-th row and q-th column of the data to be searched in a target matrix included in the output data set of the i-th layer structure, with 1 ≤ p ≤ m and 1 ≤ q ≤ n;

the data compression device determines, based on the first position and P, to read the lengths of R first data packets, where R is ((p-1)×n+q)/P rounded down;

the data compression device determines, based on the 2nd second data packet corresponding to the i-th layer structure, that the sum of the lengths of the 1st first data packet to the R-th first data packet is S;

the data compression device reads the (S+1)-th indication signal to the [S+((p-1)×n+q)-R×P]-th indication signal in the 1st second data packet corresponding to the i-th layer structure;

the data compression device determines a second position of the data to be searched based on the (S+1)-th indication signal to the [S+((p-1)×n+q)-R×P]-th indication signal, where the second position is a sequence number of the data to be searched in the 1st second data packet corresponding to the i-th layer structure;

and the data compression device reads the target data at the second position from the 1st second data packet corresponding to the i-th layer structure, where the target data is the data to be searched.

Specifically, the data compression apparatus determines the second position of the data to be searched based on the (S+1)-th indication signal to the [S+((p-1)×n+q)-R×P]-th indication signal as follows: the data compression device determines that the number of non-zero indication signals from the (S+1)-th indication signal to the [S+((p-1)×n+q)-R×P]-th indication signal is T; and the data compression device determines that the second position of the data to be searched is S+P+T.
For example, assume the i-th layer structure is a convolutional layer, each non-zero datum is 4 bits, an output data set of the convolutional layer is an 8×8 matrix, and P is 16. The data compression apparatus acquires the first position of the data to be searched as the 7th row, 4th column of the 8×8 matrix, determines, based on the first position and P, to read the lengths of 3 first data packets, and obtains from the 2nd second data packet corresponding to the convolutional layer that the length of the 1st first data packet is 64 bits (the 1st first data packet includes 12 non-zero data), the length of the 2nd first data packet is 56 bits, and the length of the 3rd first data packet is 60 bits. It determines that the sum of the lengths of the 1st to 3rd first data packets is 180 bits, reads the 181st to 184th indication signals in the 1st second data packet corresponding to the convolutional layer, determines that the number of non-zero indication signals among the 181st to 184th indication signals is 4, determines that the second position of the data to be searched is 200, and reads the 200th data from the 1st second data packet corresponding to the convolutional layer, where the 200th data is the data to be searched.
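The lookup described above can be sketched in a simplified, element-level model: offsets count data items rather than bits, and the two second data packets are held as Python structures instead of a packed bit stream, so the arithmetic differs from the bit-level offsets in the text. All function names are hypothetical.

```python
# Sketch of random access into the compressed stream (element-level model;
# the patent's actual scheme works on bit offsets and packet lengths).
P = 16  # data per first data packet

def compress(flat):
    """Split flat data into packets of P; keep (mask, nonzeros) per packet."""
    packets = []
    for start in range(0, len(flat), P):
        chunk = flat[start:start + P]
        mask = [1 if d != 0 else 0 for d in chunk]   # indication signals
        nonzeros = [d for d in chunk if d != 0]
        packets.append((mask, nonzeros))
    return packets

def lookup(packets, p, q, n):
    """Read the element at row p, column q (1-based) of an m x n matrix."""
    idx = (p - 1) * n + (q - 1)        # 0-based linear position
    r, offset = divmod(idx, P)         # full packets skipped, position inside
    mask, nonzeros = packets[r]
    if not mask[offset]:
        return 0                       # a zero was compressed away
    t = sum(mask[:offset + 1])         # rank among the packet's non-zero data
    return nonzeros[t - 1]

# 8 x 8 example: row 7, column 4 is linear position 52 (1-based), i.e. the
# 4th datum of the 4th packet, as in the text.
flat = [0] * 64
flat[51] = 9
assert lookup(compress(flat), 7, 4, 8) == 9
```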
As can be seen, in this example, the data compression apparatus determines, based on the first position of the data to be searched and P, to read the lengths of R first data packets, determines that the sum of the lengths of the 1st to R-th first data packets is S, reads the (S+1)-th to [S+((p-1)×n+q)-R×P]-th indication signals in the 1st second data packet corresponding to the i-th layer structure, determines the second position of the data to be searched based on those indication signals, and reads the target data at the second position from the 1st second data packet corresponding to the i-th layer structure. Therefore, the lengths of the R first data packets are determined from the first position of the data to be searched and P, the second position of the data to be searched is determined based on those lengths, and the position of the compressed source data can be quickly located within the variable-length compressed data.
Referring to fig. 2, fig. 2 is a schematic flow chart of another data compression method according to an embodiment of the present application, consistent with the embodiment shown in fig. 1A, where the data compression method includes:
step 201: the data compression device obtains an output data set of an ith layer structure in the neural network model, wherein the output data set comprises at least one m x N matrix, m and N are integers greater than 1, and i is any one of 1 to N.
Step 202: and the data compression device performs array conversion on the at least one mxn matrix to obtain a one-dimensional array, wherein the one-dimensional array is the at least one mxn matrix subjected to array conversion.
Step 203: the data compression device divides the one-dimensional array into data packets to obtain M divided first data packets, wherein each of the 1 st first data packet to the M-1 st first data packet comprises P data, the M first data packet comprises Q data, M is an integer greater than or equal to 1, P is an integer greater than or equal to 1, and Q is an integer greater than or equal to 1 and less than or equal to P.
Step 204: the data compression device obtains data packet information of a jth first data packet, where the data packet information of the jth first data packet includes P indication signals, a first data set, and a length of the jth first data packet, where the indication signals are used to indicate whether each piece of P data included in the jth first data packet is zero, the first data set includes at least one non-zero piece of P data included in the jth first data packet, and the jth first data packet is any one of the 1 st first data packet to the M-1 st first data packet.
Step 205: and the data compression device executes the same operation on M-2 first data packets except the jth first data packet from the 1 st first data packet to the M-1 st first data packet to obtain data packet information of each first data packet in the M-2 first data packets.
Step 206: the data compression device obtains data packet information of the mth first data packet, the data packet information of the mth first data packet includes Q indication signals, a second data set and a length of the mth first data packet, the indication signals are used for indicating whether each piece of Q data included in the mth first data packet is zero, and the second data set includes at least one piece of non-zero data in the Q data included in the mth first data packet.
Step 207: and the data compression device combines the P indication signals and the first data set included in each of the 1 st to M-1 th first data packets into a first sub-packet to obtain M-1 first sub-packets.
Step 208: and the data compression device combines the Q indication signals and the second data set included by the Mth first data packet into an Mth first sub-packet.
Step 209: and the data compression device combines the M-1 first sub-packets and the Mth first sub-packet into a 1 st second data packet based on the sequencing of the M first data packets.
Step 210: the data compression device groups the lengths of the M first packets into a 2 nd second packet based on the ordering of the M first packets.
It should be noted that, the specific implementation of the steps of the method shown in fig. 2 can refer to the specific implementation described in the above method, and will not be described here.
In accordance with the embodiment shown in fig. 1A and fig. 2, please refer to fig. 3, and fig. 3 is a schematic flow chart of another data compression method provided in the present application, where the data compression method includes:
step 301: the data compression device obtains an output data set of an ith layer structure in the neural network model, wherein the output data set comprises at least one m x N matrix, m and N are integers greater than 1, and i is any one of 1 to N.
Step 302: and the data compression device performs array conversion on the at least one mxn matrix to obtain a one-dimensional array, wherein the one-dimensional array is the at least one mxn matrix subjected to array conversion.
Step 303: the data compression device divides the data packets of the one-dimensional array to obtain M divided first data packets, wherein each of the 1 st first data packet to the M-1 st first data packet comprises P data, the M first data packet comprises Q data, M is an integer greater than 1, P is an integer greater than or equal to 1, and Q is an integer greater than or equal to 1 and less than or equal to P.
Step 304: the data compression device obtains data packet information of a jth first data packet, where the data packet information of the jth first data packet includes P indication signals, a first data set, and a length of the jth first data packet, where the indication signals are used to indicate whether each piece of P data included in the jth first data packet is zero, the first data set includes at least one non-zero piece of P data included in the jth first data packet, and the jth first data packet is any one of the 1 st first data packet to the M-1 st first data packet.
Step 305: and the data compression device executes the same operation on M-2 first data packets except the jth first data packet from the 1 st first data packet to the M-1 st first data packet to obtain data packet information of each first data packet in the M-2 first data packets.
Step 306: the data compression device obtains data packet information of the mth first data packet, the data packet information of the mth first data packet includes Q indication signals, a second data set and a length of the mth first data packet, the indication signals are used for indicating whether each piece of Q data included in the mth first data packet is zero, and the second data set includes at least one piece of non-zero data in the Q data included in the mth first data packet.
Step 307: and the data compression device combines the P indication signals and the first data set included in each of the 1 st to M-1 th first data packets into a first sub-packet to obtain M-1 first sub-packets.
Step 308: and the data compression device combines the Q indication signals and the second data set included by the Mth first data packet into an Mth first sub-packet.
Step 309: and the data compression device combines the M-1 first sub-packets and the Mth first sub-packet into a 1 st second data packet based on the sequencing of the M first data packets.
Step 310: the data compression device groups the lengths of the M first packets into a 2 nd second packet based on the ordering of the M first packets.
Step 311: and the data compression device executes the same operation on the N-1 layers of structures except the ith layer of structure in the N layers of structures to obtain two second data packets corresponding to each layer of structure in the N-1 layers of structures.
Step 312: and the data compression device stores the 2N second data packets corresponding to the N-layer structure into a double-data-rate synchronous dynamic random access memory (DDR).
It should be noted that, the specific implementation of the steps of the method shown in fig. 3 can refer to the specific implementation described in the above method, and will not be described here.
The above embodiments mainly introduce the scheme of the embodiments of the present application from the perspective of the method-side implementation process. It is to be understood that the data compression apparatus includes hardware structures and/or software modules corresponding to the respective functions for implementing the above-described functions. Those of skill in the art would readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiment of the present application, the data compression apparatus may be divided into the functional units according to the method example, for example, each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. It should be noted that the division of the unit in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
The following is an embodiment of the apparatus of the present application, which is used to execute the method implemented by the embodiment of the method of the present application. Referring to fig. 4, fig. 4 is a block diagram illustrating functional units of a data compression apparatus 400 according to an embodiment of the present application, in which the data compression apparatus 400 is applied to a neural network model including an N-layer structure, where N is an integer greater than 1, the data compression apparatus 400 includes:
an obtaining unit 401, configured to obtain an output data set of an i-th layer structure in the neural network model, where the output data set includes at least one m × N matrix, m and N are both integers greater than 1, and i is any one of 1 to N;
a dividing unit 402, configured to perform packet division on the at least one mxn matrix to obtain M divided first data packets, where M is an integer greater than or equal to 1;
a compressing unit 403, configured to perform data compression on the M first data packets to obtain two second data packets after data compression.
It can be seen that, in the embodiment of the present application, the data compression apparatus obtains an output data set of an i-th layer structure in the neural network model, where the output data set includes at least one m×n matrix, performs data packet segmentation on the at least one m×n matrix to obtain M divided first data packets, and performs data compression on the M first data packets to obtain two second data packets. By dividing the at least one m×n matrix into M first data packets and compressing the M first data packets into two second data packets, the output data set is compressed into two second data packets, improving the compression efficiency of the data in the neural network model.
In one possible example, in obtaining the output data set of the i-th layer structure in the neural network model, the obtaining unit 401 is specifically configured to:
when i is 1, acquiring an input data set of a layer 1 structure in the neural network model, wherein the input data set of the layer 1 structure comprises at least one first matrix;
acquiring a weight data set of the 1 st layer structure, decompressing the weight data set of the 1 st layer structure to obtain a second matrix, wherein the second matrix is the decompressed weight data set of the 1 st layer structure;
determining an output data set of the layer 1 structure based on the at least one first matrix and the second matrix;
when i is more than or equal to 2 and less than or equal to N, acquiring an output data set of an i-1 layer structure in the neural network model, wherein the output data set of the i-1 layer structure comprises at least one third matrix;
acquiring a weight data set of the ith layer structure, decompressing the weight data set of the ith layer structure to obtain a fourth matrix, wherein the fourth matrix is the decompressed weight data set of the ith layer structure;
determining an output data set of the i-th layer structure based on the at least one third matrix and the fourth matrix.
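The per-layer flow above (decompress the layer's weight data set, then combine it with the layer input to produce the output data set) can be sketched as follows. The patent does not fix the layer operation, so treating it as a plain matrix product is an assumption for illustration; the zero-reconstruction step mirrors the indication-signal scheme used elsewhere in the text, and all names are hypothetical.

```python
# Sketch: determine a layer's output from its input matrices and its
# decompressed weight matrix. The matrix product stands in for the
# (unspecified) layer operation.
def decompress_weights(signals, nonzeros):
    """Rebuild a flat weight list from indication signals + non-zero data."""
    it = iter(nonzeros)
    return [next(it) if s else 0 for s in signals]

def matmul(a, b):
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

signals = [1, 0, 0, 1]                      # compressed 2 x 2 weight set
weights_flat = decompress_weights(signals, [2, 3])
w = [weights_flat[0:2], weights_flat[2:4]]  # [[2, 0], [0, 3]]
x = [[1, 1]]                                # layer input (one first matrix)
print(matmul(x, w))                         # [[2, 3]]
```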
In a possible example, in terms of performing packet segmentation on the at least one M × n matrix to obtain M segmented first packets, the segmentation unit 402 is specifically configured to:
performing array conversion on the at least one mxn matrix to obtain a one-dimensional array, wherein the one-dimensional array is the at least one mxn matrix subjected to array conversion;
and performing data packet segmentation on the one-dimensional array to obtain M segmented first data packets, wherein each of the 1st to (M-1)-th first data packets comprises P data, the M-th first data packet comprises Q data, P is an integer greater than or equal to 1, and Q is an integer greater than or equal to 1 and less than or equal to P.
In a possible example, in terms of performing data compression on the M first data packets to obtain two second data packets after data compression, the compression unit 403 is specifically configured to:
acquiring data packet information of a jth first data packet, where the data packet information of the jth first data packet includes P indication signals, a first data set, and a length of the jth first data packet, where the indication signals are used to indicate whether each piece of P data included in the jth first data packet is zero, the first data set includes at least one non-zero piece of P pieces of data included in the jth first data packet, and the jth first data packet is any one of the 1 st first data packet to the M-1 st first data packet;
performing the same operation on M-2 first data packets except the jth first data packet from the 1 st first data packet to the M-1 st first data packet to obtain data packet information of each first data packet in the M-2 first data packets;
acquiring data packet information of the M-th first data packet, where the data packet information of the M-th first data packet includes Q indication signals, a second data set, and the length of the M-th first data packet, where the indication signals are used to indicate whether each of the Q data included in the M-th first data packet is zero, and the second data set includes at least one non-zero datum among the Q data included in the M-th first data packet;
forming a first sub-data packet by P indicating signals and a first data set included in each of the 1 st to M-1 th first data packets to obtain M-1 first sub-data packets;
forming an Mth first sub-data packet by Q indicating signals and a second data set which are included in the Mth first data packet;
forming a 1 st second data packet by the M-1 first sub-packets and the Mth first sub-packet based on the ordering of the M first data packets;
and forming the length of the M first data packets into a 2 nd second data packet based on the sorting of the M first data packets.
In one possible example, the data compression apparatus 400 further includes:
an executing unit 404, configured to execute the same operation on N-1 layers of the N-layer structures except for the ith layer of structure, to obtain two second data packets corresponding to each layer of the N-1 layers of structures;
the storage unit 405 is configured to store the 2N second data packets corresponding to the N-layer structure into the double data rate synchronous dynamic random access memory DDR.
Consistent with the embodiments shown in fig. 1A, fig. 2 and fig. 3, please refer to fig. 5, fig. 5 is a schematic structural diagram of an electronic device provided in an embodiment of the present application, where the electronic device includes a processor, a memory, a communication interface, and one or more programs, the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for performing the following steps:
acquiring an output data set of an ith layer structure in the neural network model, wherein the output data set comprises at least one m x N matrix, m and N are integers greater than 1, and i is any one of 1 to N;
performing data packet segmentation on the at least one mxn matrix to obtain M segmented first data packets, wherein M is an integer greater than or equal to 1;
and performing data compression on the M first data packets to obtain two second data packets after data compression.
It can be seen that, in the embodiment of the present application, an output data set of an i-th layer structure in a neural network model is obtained, where the output data set includes at least one m×n matrix, the at least one m×n matrix is subjected to data packet segmentation to obtain M segmented first data packets, and the M first data packets are subjected to data compression to obtain two second data packets. By dividing the at least one m×n matrix into M first data packets and compressing the M first data packets into two second data packets, the output data set is compressed into two second data packets, improving the compression efficiency of the data in the neural network model.
In one possible example, in obtaining the output data set of the i-th layer structure in the neural network model, the program comprises instructions for performing the following steps:
when i is 1, acquiring an input data set of a layer 1 structure in the neural network model, wherein the input data set of the layer 1 structure comprises at least one first matrix;
acquiring a weight data set of the 1st layer structure, decompressing the weight data set of the 1st layer structure to obtain a second matrix, wherein the second matrix is the decompressed weight data set of the 1st layer structure;
determining an output data set of the layer 1 structure based on the at least one first matrix and the second matrix;
when i is more than or equal to 2 and less than or equal to N, acquiring an output data set of an i-1 layer structure in the neural network model, wherein the output data set of the i-1 layer structure comprises at least one third matrix;
acquiring a weight data set of the ith layer structure, decompressing the weight data set of the ith layer structure to obtain a fourth matrix, wherein the fourth matrix is the decompressed weight data set of the ith layer structure;
determining an output data set of the i-th layer structure based on the at least one third matrix and the fourth matrix.
In one possible example, in terms of performing packet segmentation on the at least one M × n matrix to obtain M segmented first packets, the program includes instructions specifically configured to:
performing array conversion on the at least one mxn matrix to obtain a one-dimensional array, wherein the one-dimensional array is the at least one mxn matrix subjected to array conversion;
and performing data packet segmentation on the one-dimensional array to obtain M segmented first data packets, wherein each of the 1st to (M-1)-th first data packets comprises P data, the M-th first data packet comprises Q data, P is an integer greater than or equal to 1, and Q is an integer greater than or equal to 1 and less than or equal to P.
In one possible example, in terms of performing data compression on the M first data packets to obtain two second data packets after data compression, the program includes instructions specifically configured to perform the following steps:
acquiring data packet information of a jth first data packet, where the data packet information of the jth first data packet includes P indication signals, a first data set, and a length of the jth first data packet, where the indication signals are used to indicate whether each piece of P data included in the jth first data packet is zero, the first data set includes at least one non-zero piece of P pieces of data included in the jth first data packet, and the jth first data packet is any one of the 1 st first data packet to the M-1 st first data packet;
performing the same operation on M-2 first data packets except the jth first data packet from the 1 st first data packet to the M-1 st first data packet to obtain data packet information of each first data packet in the M-2 first data packets;
acquiring data packet information of the M-th first data packet, where the data packet information of the M-th first data packet includes Q indication signals, a second data set, and the length of the M-th first data packet, where the indication signals are used to indicate whether each of the Q data included in the M-th first data packet is zero, and the second data set includes at least one non-zero datum among the Q data included in the M-th first data packet;
forming a first sub-data packet by P indicating signals and a first data set included in each of the 1 st to M-1 th first data packets to obtain M-1 first sub-data packets;
forming an Mth first sub-data packet by Q indicating signals and a second data set which are included in the Mth first data packet;
forming a 1 st second data packet by the M-1 first sub-packets and the Mth first sub-packet based on the ordering of the M first data packets;
and forming the length of the M first data packets into a 2 nd second data packet based on the sorting of the M first data packets.
In one possible example, the program further includes instructions for performing the steps of:
executing the same operation on N-1 layers of structures except the ith layer of structure in the N layers of structures to obtain two second data packets corresponding to each layer of structure in the N-1 layers of structures;
and storing the 2N second data packets corresponding to the N-layer structure in a double data rate synchronous dynamic random access memory (DDR).
Embodiments of the present application further provide a computer storage medium for storing a computer program, where the computer program is executed by a processor to implement part or all of the steps of any one of the methods described in the above method embodiments, and the computer includes an electronic device.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods as described in the above method embodiments. The computer program product may be a software installation package, the computer comprising an electronic device.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the above-described division of the units is only one type of division of logical functions, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some interfaces, devices or units, and may be an electric or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit may be stored in a computer readable memory if it is implemented in the form of a software functional unit and sold or used as a stand-alone product. Based on such understanding, the technical solution of the present application may be substantially implemented or a part of or all or part of the technical solution contributing to the prior art may be embodied in the form of a software product stored in a memory, and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the above-mentioned method of the embodiments of the present application. And the aforementioned memory comprises: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable memory, which may include: flash Memory disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The foregoing detailed description of the embodiments of the present application has been presented to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the method and the core concept of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific implementation and application scope, and in view of the above, the content of the present specification should not be construed as a limitation to the present application.

Claims (6)

1. A data compression method, applied to a neural network model comprising an N-layer structure, N being an integer greater than 1, the method comprising:

obtaining an output data set of an i-th layer of the neural network model, the output data set comprising at least one m×n matrix, m and n each being an integer greater than 1, and i being any one of 1 to N;

dividing the at least one m×n matrix into data packets to obtain M divided first data packets, M being an integer greater than or equal to 1; and

compressing the M first data packets to obtain two compressed second data packets;

wherein dividing the at least one m×n matrix into data packets to obtain the M divided first data packets comprises:

converting the at least one m×n matrix into a one-dimensional array, the one-dimensional array being the at least one m×n matrix after array conversion; and

dividing the one-dimensional array into the M first data packets, each of the 1st to (M-1)-th first data packets comprising P data items and the M-th first data packet comprising Q data items, P being an integer greater than or equal to 1, and Q being an integer greater than or equal to 1 and less than or equal to P;

and wherein compressing the M first data packets to obtain the two compressed second data packets comprises:

obtaining packet information of a j-th first data packet, the packet information of the j-th first data packet comprising P indicator signals, a first data set and the length of the j-th first data packet, the indicator signals indicating whether each of the P data items in the j-th first data packet is zero, the first data set comprising at least one non-zero data item among the P data items in the j-th first data packet, and the j-th first data packet being any one of the 1st to (M-1)-th first data packets;

performing the same operation on the M-2 first data packets, among the 1st to (M-1)-th first data packets, other than the j-th first data packet, to obtain the packet information of each of the M-2 first data packets;

obtaining packet information of the M-th first data packet, the packet information of the M-th first data packet comprising Q indicator signals, a second data set and the length of the M-th first data packet, the indicator signals indicating whether each of the Q data items in the M-th first data packet is zero, and the second data set comprising at least one non-zero data item among the Q data items;

combining the P indicator signals and the first data set of each of the 1st to (M-1)-th first data packets into a first sub-packet, to obtain M-1 first sub-packets;

combining the Q indicator signals and the second data set of the M-th first data packet into an M-th first sub-packet;

combining the M-1 first sub-packets and the M-th first sub-packet into a 1st second data packet according to the ordering of the M first data packets; and

combining the lengths of the M first data packets into a 2nd second data packet according to the ordering of the M first data packets.

2. The method according to claim 1, wherein obtaining the output data set of the i-th layer of the neural network model comprises:

when i = 1, obtaining an input data set of the 1st layer of the neural network model, the input data set of the 1st layer comprising at least one first matrix; obtaining a weight data set of the 1st layer and decompressing it to obtain a second matrix, the second matrix being the decompressed weight data set of the 1st layer; and determining the output data set of the 1st layer based on the at least one first matrix and the second matrix; and

when 2 ≤ i ≤ N, obtaining the output data set of the (i-1)-th layer of the neural network model, the output data set of the (i-1)-th layer comprising at least one third matrix; obtaining a weight data set of the i-th layer and decompressing it to obtain a fourth matrix, the fourth matrix being the decompressed weight data set of the i-th layer; and determining the output data set of the i-th layer based on the at least one third matrix and the fourth matrix.

3. The method according to claim 1, wherein after compressing the M first data packets to obtain the two compressed second data packets, the method further comprises:

performing the same operation on the N-1 layers of the N-layer structure other than the i-th layer, to obtain two second data packets for each of the N-1 layers; and

storing the 2N second data packets corresponding to the N layers in a double data rate synchronous dynamic random access memory (DDR).

4. A data compression apparatus, applied to a neural network model comprising an N-layer structure, N being an integer greater than 1, the apparatus comprising:

an obtaining unit configured to obtain an output data set of an i-th layer of the neural network model, the output data set comprising at least one m×n matrix, m and n each being an integer greater than 1, and i being any one of 1 to N;

a dividing unit configured to divide the at least one m×n matrix into data packets to obtain M divided first data packets, M being an integer greater than or equal to 1; and

a compression unit configured to compress the M first data packets to obtain two compressed second data packets;

wherein, in dividing the at least one m×n matrix into data packets to obtain the M divided first data packets, the dividing unit is specifically configured to:

convert the at least one m×n matrix into a one-dimensional array, the one-dimensional array being the at least one m×n matrix after array conversion; and

divide the one-dimensional array into the M first data packets, each of the 1st to (M-1)-th first data packets comprising P data items and the M-th first data packet comprising Q data items, P being an integer greater than or equal to 1, and Q being an integer greater than or equal to 1 and less than or equal to P;

and wherein, in compressing the M first data packets to obtain the two compressed second data packets, the compression unit is specifically configured to:

obtain packet information of a j-th first data packet, the packet information of the j-th first data packet comprising P indicator signals, a first data set and the length of the j-th first data packet, the indicator signals indicating whether each of the P data items in the j-th first data packet is zero, the first data set comprising at least one non-zero data item among the P data items in the j-th first data packet, and the j-th first data packet being any one of the 1st to (M-1)-th first data packets;

perform the same operation on the M-2 first data packets, among the 1st to (M-1)-th first data packets, other than the j-th first data packet, to obtain the packet information of each of the M-2 first data packets;

obtain packet information of the M-th first data packet, the packet information of the M-th first data packet comprising Q indicator signals, a second data set and the length of the M-th first data packet, the indicator signals indicating whether each of the Q data items in the M-th first data packet is zero, and the second data set comprising at least one non-zero data item among the Q data items;

combine the P indicator signals and the first data set of each of the 1st to (M-1)-th first data packets into a first sub-packet, to obtain M-1 first sub-packets;

combine the Q indicator signals and the second data set of the M-th first data packet into an M-th first sub-packet;

combine the M-1 first sub-packets and the M-th first sub-packet into a 1st second data packet according to the ordering of the M first data packets; and

combine the lengths of the M first data packets into a 2nd second data packet according to the ordering of the M first data packets.

5. An electronic device, comprising a processor, a memory, a communication interface, and one or more programs, the one or more programs being stored in the memory and configured to be executed by the processor, the one or more programs comprising instructions for performing the steps of the method according to any one of claims 1 to 3.

6. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 3.
CN201811641325.2A 2018-12-29 2018-12-29 Data compression method and related device Active CN109800869B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811641325.2A CN109800869B (en) 2018-12-29 2018-12-29 Data compression method and related device
PCT/CN2019/114731 WO2020134550A1 (en) 2018-12-29 2019-10-31 Data compression method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811641325.2A CN109800869B (en) 2018-12-29 2018-12-29 Data compression method and related device

Publications (2)

Publication Number Publication Date
CN109800869A CN109800869A (en) 2019-05-24
CN109800869B true CN109800869B (en) 2021-03-05

Family

ID=66558223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811641325.2A Active CN109800869B (en) 2018-12-29 2018-12-29 Data compression method and related device

Country Status (2)

Country Link
CN (1) CN109800869B (en)
WO (1) WO2020134550A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800869B (en) * 2018-12-29 2021-03-05 深圳云天励飞技术有限公司 Data compression method and related device

Citations (3)

Publication number Priority date Publication date Assignee Title
CN107239825A (en) * 2016-08-22 2017-10-10 北京深鉴智能科技有限公司 Deep neural network compression method considering load balancing
CN107608937A (en) * 2017-09-11 2018-01-19 浙江大学 Machine learning fan condition monitoring method and device based on a cloud computing platform
CN108615074A (en) * 2018-04-28 2018-10-02 中国科学院计算技术研究所 Neural network processing system and method based on compressed sensing

Family Cites Families (12)

Publication number Priority date Publication date Assignee Title
CN106447034B (en) * 2016-10-27 2019-07-30 中国科学院计算技术研究所 Neural network processor based on data compression, design method, and chip
CN108122030A (en) * 2016-11-30 2018-06-05 华为技术有限公司 Operation method, device and server for convolutional neural networks
CN107220702B (en) * 2017-06-21 2020-11-24 北京图森智途科技有限公司 A computer vision processing method and device for low computing power processing equipment
CN107145939B (en) * 2017-06-21 2020-11-24 北京图森智途科技有限公司 A computer vision processing method and device for low computing power processing equipment
CN107590533B (en) * 2017-08-29 2020-07-31 中国科学院计算技术研究所 Compression device for deep neural network
CN107634937A (en) * 2017-08-29 2018-01-26 中国地质大学(武汉) A wireless sensor network data compression method, device and storage device thereof
CN107565971B (en) * 2017-09-07 2020-04-14 华为技术有限公司 A data compression method and device
CN107634943A (en) * 2017-09-08 2018-01-26 中国地质大学(武汉) A weight reduction wireless sensor network data compression method, device and storage device
CN107729995A (en) * 2017-10-31 2018-02-23 中国科学院计算技术研究所 Method and system for accelerating a neural network processor, and neural network processor
CN108615076B (en) * 2018-04-08 2020-09-11 瑞芯微电子股份有限公司 Deep learning chip-based data storage optimization method and device
CN108763379B (en) * 2018-05-18 2022-06-03 北京奇艺世纪科技有限公司 Data compression method, data decompression method, device and electronic equipment
CN109800869B (en) * 2018-12-29 2021-03-05 深圳云天励飞技术有限公司 Data compression method and related device


Also Published As

Publication number Publication date
CN109800869A (en) 2019-05-24
WO2020134550A1 (en) 2020-07-02

Similar Documents

Publication Publication Date Title
CN113673701B (en) Operation method of neural network model, readable medium and electronic equipment
CN115115720B (en) Image decoding, encoding method, device and equipment
CN111491169B (en) Digital image compression method, device, equipment and medium
CN103139567B (en) The method and apparatus of a kind of image compression and decompression
CN114640354B (en) Data compression methods, apparatus, electronic devices and computer-readable storage media
CN110399511A (en) Image cache method, equipment, storage medium and device based on Redis
CN112668708A (en) Convolution operation device for improving data utilization rate
CN106780363A (en) Picture processing method and device and electronic equipment
CN111310115A (en) Data processing method, device and chip, electronic equipment and storage medium
CN112950640A (en) Video portrait segmentation method and device, electronic equipment and storage medium
CN116912556A (en) Image classification method, device, electronic device and storage medium
CN109800869B (en) Data compression method and related device
EP2783509B1 (en) Method and apparatus for generating a bitstream of repetitive structure discovery based 3d model compression
CN111045726B (en) Deep learning processing device and method supporting coding and decoding
CN108880559B (en) Data compression method, data decompression method, compression device and decompression device
CN113810058B (en) Data compression method, data decompression method, device and electronic equipment
US8515189B2 (en) Image compression method with fixed compression ratio, image decompression method, and electronic device thereof
CN117560013A (en) Data compression methods and electronic devices
CN110869975A (en) Image processing method and apparatus, and video processor
US20210224632A1 (en) Methods, devices, chips, electronic apparatuses, and storage media for processing data
CN114070901A (en) Data sending and receiving method, device and equipment based on multi-data alignment
CN110460854B (en) Image compression method
CN112508187A (en) Machine learning model compression method, device and equipment
CN111143641A (en) Deep learning model training method and device and electronic equipment
CN117939127A (en) Image processing method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant