
WO2018058427A1 - Neural network computation apparatus and method - Google Patents


Info

Publication number
WO2018058427A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
data
unit
sparse
network data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2016/100784
Other languages
French (fr)
Chinese (zh)
Inventor
陈天石
刘少礼
陈云霁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cambricon Technologies Corp Ltd
Original Assignee
Cambricon Technologies Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cambricon Technologies Corp Ltd filed Critical Cambricon Technologies Corp Ltd
Priority to PCT/CN2016/100784 priority Critical patent/WO2018058427A1/en
Publication of WO2018058427A1 publication Critical patent/WO2018058427A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks

Definitions

  • the present invention relates to the field of information technology, and in particular, to a neural network operation device and method compatible with general neural network data, sparse neural network data, and discrete neural network data.
  • Artificial neural networks (ANNs), referred to simply as neural networks (NNs), rely on the complexity of the system, adjusting the relationships among a large number of internal nodes, to achieve the purpose of processing information.
  • Neural networks have made great progress in many fields such as intelligent control and machine learning. With the continuous development of deep learning technology, neural network models are growing ever larger, and their demands on computing performance and memory bandwidth keep increasing. Existing neural network computing platforms (CPUs, GPUs, and traditional neural network accelerators) no longer meet users' needs.
  • Sparse neural network data and discrete neural network data are developed on the basis of general neural network data. Current neural network computing platforms must set up a separate processing module for each type of neural network data, which strains computing resources and leads to associated problems such as insufficient memory bandwidth and high power consumption.
  • The present invention provides a neural network computing device and method that improve the degree of multiplexing in neural network data processing and save computing resources.
  • A neural network computing device includes a control unit, a storage unit, a sparse selection unit, and a neural network operation unit. The storage unit is configured to store neural network data. The control unit is configured to generate microinstructions corresponding respectively to the sparse selection unit and the neural network operation unit, and to send each microinstruction to its corresponding unit. The sparse selection unit is configured to select, according to the microinstruction delivered by the control unit and the location information carried in the sparse data representation, the neural network data in the storage unit that corresponds to the effective weights, so that only this data participates in the operation. The neural network operation unit is configured to perform, according to its microinstruction delivered by the control unit, a neural network operation on the data selected by the sparse selection unit, to obtain an operation result.
  • One neural network data processing method comprises: step D, the discrete neural network data splitting unit splits the neural network model of the discrete neural network data into N sparsely represented sub-networks, each sub-network containing only one real-number value, with all remaining weights 0; step E, the sparse selection unit and the neural network operation unit process each sub-network as sparse neural network data, obtaining an operation result for each; and step F, the neural network operation unit sums the operation results of the N sub-networks to obtain the neural network operation result of the discrete neural network data, and the neural network data processing ends.
  • Another neural network data processing method includes: step A, the data type judging unit judges the type of the neural network data; if the neural network data is sparse neural network data, step B is performed; if it is discrete neural network data, step D is performed; if it is general neural network data, step G is performed. In step B, the sparse selection unit selects the neural network data in the storage unit corresponding to the effective weights, according to the location information in the sparse data representation. In step C, the neural network operation unit performs a neural network operation on the data acquired by the sparse selection unit, obtains the operation result of the sparse neural network data, and the neural network data processing ends. In step D, the discrete neural network data splitting unit splits the neural network model of the discrete neural network data into sub-networks.
  • In step E, the sparse selection unit and the neural network operation unit process each sub-network as sparse neural network data, obtaining an operation result for each;
  • in step F, the neural network operation unit sums the operation results of the N sub-networks to obtain the neural network operation result of the discrete neural network data, and the neural network data processing ends;
  • in step G, the neural network operation unit performs the neural network operation on the general neural network data, obtains the operation result, and the neural network data processing ends.
  • The device can determine whether data items have an interdependence relationship. For example, when the input data of the next calculation is the output of the previous calculation, starting the next calculation without waiting for the previous calculation to finish would cause the calculation result to be incorrect.
  • The dependency processing unit determines the data dependency relationship, so that the device waits for the data before performing the next calculation, thereby ensuring both the correctness and the efficiency of the device's operation.
  • FIG. 1 is a schematic structural diagram of a neural network computing device according to a first embodiment of the present invention
  • FIG. 2 is a schematic diagram of sparse neural network weight model data
  • FIG. 5 is a schematic structural diagram of a neural network computing device according to a second embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a neural network computing device according to a third embodiment of the present invention.
  • FIG. 7 is a flowchart of a neural network data processing method according to a fourth embodiment of the present invention.
  • FIG. 8 is a flowchart of a neural network data processing method according to a fifth embodiment of the present invention.
  • FIG. 9 is a flowchart of a neural network data processing method according to a sixth embodiment of the present invention.
  • FIG. 10 is a flowchart of a neural network data processing method according to a seventh embodiment of the present invention.
  • general-purpose neural network data refers to general-purpose computer data, that is, data types commonly used in computers, such as 32-bit floating-point data, 16-bit floating-point data, 32-bit fixed-point data, and the like.
  • Discrete neural network data means that part or all of the data is computer data represented by discrete values. Unlike the 32-bit and 16-bit floating-point representations used in general neural network data, discrete neural network data takes its values only from a small set of discrete real numbers.
  • The data in a neural network includes input data and neural network model data, and comes in the following types:
  • when the input data and the neural network model data are all composed of such discrete real numbers, this is called an all-discrete data representation;
  • the discrete data representation in the present invention refers to the three discrete data representations described above.
  • The input data is the original general neural network data, which may be, for example, RGB image data.
  • the neural network model data is represented by discrete data
  • For example, the weight data of a certain layer takes only the two values -1/+1; this is a neural network represented by discrete neural network data.
  • Sparse neural network data is data that is discontinuous in position; it specifically includes the data values and the data location information.
  • the model data of a neural network is sparse.
  • The 01 bit string reflects whether the model data at the corresponding position is valid: 0 indicates that the data at that position is invalid (sparse), and 1 indicates that the data at that position is valid.
  • The data stored in this way, together with the position information stored in the 01 bit string, constitutes the sparse neural network data, which is also called the sparse data representation in the present invention.
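The representation described above can be sketched in a few lines of Python. This is an illustrative model only (function names are hypothetical, not from the patent), showing how a dense weight list maps to a 01 bit string plus a compact array of valid values, and back:

```python
def sparse_encode(dense_weights):
    """Encode a dense weight list as (01 bit string, valid values only)."""
    bits = [0 if w == 0 else 1 for w in dense_weights]
    values = [w for w in dense_weights if w != 0]
    return bits, values

def sparse_decode(bits, values):
    """Reconstruct the dense weight list from the sparse representation."""
    dense, it = [], iter(values)
    for b in bits:
        dense.append(next(it) if b == 1 else 0)
    return dense

bits, values = sparse_encode([0, 0.5, 0, -1.2, 0, 0, 2.0])
# bits   -> [0, 1, 0, 1, 0, 0, 1]   (position information)
# values -> [0.5, -1.2, 2.0]        (only the valid data is stored)
```

Note that the storage cost is one bit per position plus one full-width word per valid weight, which is why the representation pays off when most weights are 0.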
  • The neural network computing device and method provided by the present invention support neural network operations on both sparse neural network data and discrete neural network data by multiplexing the sparse selection unit.
  • The invention can be applied to the following (including but not limited to) scenarios: data processing, robots, computers, printers, scanners, telephones, tablets, smart terminals, mobile phones, driving recorders, navigators, sensors, cameras, cloud servers, video cameras, camcorders, projectors, watches, earphones, mobile storage, wearable devices and other electronic products; aircraft, ships, vehicles and other types of transportation; televisions, air conditioners, microwave ovens, refrigerators, rice cookers, humidifiers, washing machines, electric lights, gas stoves, range hoods and other household appliances; and medical equipment including nuclear magnetic resonance instruments, B-ultrasound machines, and electrocardiographs.
  • a neural network computing device in a first exemplary embodiment of the present invention, includes: a control unit 100, a storage unit 200, a sparse selection unit 300, and a neural network operation unit 400.
  • the storage unit 200 is configured to store neural network data.
  • the control unit 100 is configured to generate microinstructions respectively corresponding to the sparse selection unit and the neural network operation unit, and send the microinstructions to the corresponding units.
  • The sparse selection unit 300 is configured to select, according to the microinstruction delivered by the control unit and the location information carried in the sparse data representation, the neural network data stored in the storage unit that corresponds to the effective weights, so that this data participates in the operation.
  • The neural network operation unit 400 is configured to perform, according to the microinstruction corresponding to it delivered by the control unit, a neural network operation on the neural network data selected by the sparse selection unit, to obtain an operation result.
  • the storage unit 200 is configured to store three types of neural network data - general neural network data, sparse neural network data, and discrete neural network data.
  • The storage unit may be a scratchpad memory and can support data of different sizes.
  • In the present invention, the necessary calculation data is temporarily stored in the scratchpad memory (Scratchpad Memory), so that the computing device can flexibly and efficiently support data of different sizes in the course of performing neural network operations.
  • The storage unit can be implemented by a variety of different memory devices (SRAM, eDRAM, DRAM, memristor, 3D-DRAM, non-volatile memory, etc.).
  • the control unit 100 is configured to generate microinstructions respectively corresponding to the sparse selection unit and the neural network operation unit, and send the microinstructions to the corresponding units.
  • the control unit can support a plurality of different types of neural network algorithms, including but not limited to CNN/DNN/DBN/MLP/RNN/LSTM/SOM/RCNN/FastRCNN/Faster-RCNN.
  • The sparse selection unit 300 is configured to select the input neurons participating in the operation that correspond to the effective weights, according to the location information of the sparse neural network data. When dealing with discrete neural network data, the corresponding discrete data is also processed through the sparse selection unit.
  • The neural network operation unit 400 is configured to acquire input data from the storage unit according to the microinstruction generated by the control unit, execute the neural network operation on general, sparse, or discretely represented neural network data, obtain the operation result, and store the operation result in the storage unit.
  • The sparse selection unit can process both the sparse data representation and the discrete data representation. Specifically, according to the location information of the sparse data (the 01 bit string), the sparse selection unit selects the input data of one layer of the neural network corresponding to the valid locations and sends it to the neural network operation unit. In the 01 bit string, each bit corresponds to one weight in the neural network model: 0 indicates that the corresponding weight is invalid and is not stored, while 1 indicates that it is valid and stored. The data portion of the sparse data representation stores only the valid data. For example, consider the sparse neural network weight model data shown in Figure 2.
  • The sparse selection module directly sends the effective weight portion of the sparse data representation to the neural network operation unit, and then selects, according to the 01 string, the input neuron data corresponding to each valid bit position and sends it to the neural network operation unit.
  • For the example of Figure 2, the sparse selection module sends the input neurons corresponding to weight positions 1/2/3/7/9/11/15 (these numbers are the left-to-right positions of the 1 bits in the position information of Figure 2, i.e. the array indices) to the neural network operation unit.
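As a rough software analogy (the device performs this selection in hardware; the function name is illustrative), selecting the input neurons by the 01 bit string amounts to:

```python
def sparse_select(bits, inputs):
    """Keep only the input neurons whose weight position is marked valid (1)."""
    return [x for b, x in zip(bits, inputs) if b == 1]

# A 16-element bit string with 1s at array indices 1, 2, 3, 7, 9, 11, 15,
# matching the positions discussed above (assumed layout, not Figure 2 itself):
bits = [0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1]
selected = sparse_select(bits, list(range(16)))
# selected -> [1, 2, 3, 7, 9, 11, 15]
```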
  • A major feature of the present invention is the reuse of the sparse selection module 300 for discrete neural network data.
  • For discrete neural network data, the few real values are treated as several pieces of sparse neural network data when performing the operation.
  • the neural network model is split into N sub-networks.
  • Each sub-network is the same size as the original discrete neural network data.
  • Each subnetwork contains only one real number, and the remaining weights are all 0, so each subnetwork is similar to the sparse representation described above.
  • The only difference from the sparse data case is that, after the neural network operation unit has computed each sub-network, an external instruction is needed to control the neural network operation unit to sum the sub-network results and obtain the final result.
  • the neural network computing device further includes: a discrete neural network data splitting unit 500.
  • the discrete neural network data splitting unit 500 is configured to:
  • split the neural network model of the discrete neural network data into N sub-networks, each sub-network containing only one real-number value, with all remaining weights 0;
  • the sparse selection unit 300 and the neural network operation unit 400 process each sub-network according to the sparse neural network data, and respectively obtain the operation result.
  • The neural network operation unit is further configured to sum the operation results of the N sub-networks to obtain the neural network operation result of the discrete neural network data.
  • the weight data of a certain layer in the neural network model data is represented by discrete data.
  • The four sub-networks are prepared externally, and the sparse selection module reads the four sub-networks in turn; the subsequent processing is the same as for sparse neural network data, selecting the input data corresponding to the weight position information and sending it to the operation unit.
  • The only difference is that after the operation unit finishes its calculation, an external instruction is needed to control the neural network operation unit to sum the operation results of the four sub-networks.
  • The sparse selection module selects the input data twice, once for the position information of weight 1 and once for weight -1, and sends it to the operation unit. Likewise, an external instruction is required for the operation unit to sum the outputs of the two sub-networks.
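The splitting and final summation described above can be sketched as follows. This is an illustrative model under the assumption of a single summed output per network (hypothetical names; in the device the summation is triggered by an external instruction rather than a loop):

```python
def split_discrete(weights, discrete_values):
    """Split a discretely represented weight vector into one sub-network per
    discrete value; each sub-network keeps only that value, all else is 0."""
    return [[w if w == v else 0 for w in weights] for v in discrete_values]

def run_discrete(weights, inputs, discrete_values):
    """Process each sub-network as sparse data, then sum the N results."""
    total = 0.0
    for sub in split_discrete(weights, discrete_values):
        # each sub-network behaves like sparse data: only non-zero
        # positions contribute to the operation
        total += sum(w * x for w, x in zip(sub, inputs) if w != 0)
    return total

# Weights taking only the values 1 and -1, as in the example above:
result = run_discrete([1, -1, 1, -1], [1.0, 2.0, 3.0, 4.0], [1, -1])
# result -> -2.0, identical to the dense computation 1-2+3-4
```

The sum over the sub-networks reproduces the dense result exactly, because each weight position appears in exactly one sub-network.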
  • In a second embodiment, a neural network computing device is provided. As shown in FIG. 5, it differs from the first embodiment in that a data type determining unit 600 is added to determine the type of the neural network data.
  • In the present invention, the neural network data type is specified in the instruction.
  • The control unit controls the operating mode of the sparse selection unit and the operation unit according to the output of the data type judgment unit:
  • the sparse selection module selects corresponding input data according to the location information and sends it to the neural network operation unit;
  • the sparse selection unit selects, according to the location information represented by the sparse data, a neuron corresponding to the effective weight to participate in the operation in the storage unit;
  • the neural network operation unit performs a neural network operation on the neural network data acquired by the sparse selection unit, and obtains an operation result.
  • the sparse selection module selects the corresponding input data according to the position information and sends it to the operation unit, and the operation unit sums the calculation result according to the external operation instruction;
  • the discrete neural network data splitting unit is operated, and the neural network model of the discrete neural network data is split into N sub-networks;
  • the sparse selection unit and the neural network operation unit then operate, processing each sub-network as sparse neural network data to obtain the respective operation results;
  • the neural network operation unit then operates, summing the operation results of the N sub-networks to obtain the neural network operation result of the discrete neural network data.
  • For general neural network data, the sparse selection module does not operate, and no selection based on location information is performed.
  • the sparse selection unit is disabled, and the neural network operation unit performs a neural network operation on the general neural network data to obtain an operation result.
  • In a third embodiment, a neural network computing device is provided. Compared with the second embodiment, it differs in that a dependency processing function is added to the control unit.
  • the control unit 100 includes: an instruction cache module 110, configured to store a neural network instruction to be executed, where the neural network instruction includes address information of the neural network data to be processed;
  • The instruction fetch module 120 is configured to acquire a neural network instruction from the instruction cache module; the decoding module 130 is configured to decode the neural network instruction to obtain the microinstructions corresponding respectively to the storage unit, the sparse selection unit, and the neural network operation unit.
  • the microinstruction includes address information of the corresponding neural network data; the instruction queue 140 is configured to store the decoded microinstruction; the scalar register file 150 is configured to store address information of the to-be-processed neural network data;
  • The dependency processing module 160 is configured to determine whether a microinstruction in the instruction queue accesses the same data as a previous microinstruction; if so, the microinstruction is stored in a storage queue and is transmitted to its corresponding unit only after the previous microinstruction has finished executing; otherwise, the microinstruction is transmitted to its corresponding unit directly.
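A minimal software analogy of this dependency check (hypothetical structure; the real module compares microinstruction operand addresses in hardware) might look like:

```python
from collections import deque

class DependencyProcessor:
    """Hold back a micro-op that touches the same data address as a
    still-executing predecessor; release it when that predecessor retires."""

    def __init__(self):
        self.in_flight = set()     # addresses used by unfinished micro-ops
        self.store_queue = deque() # micro-ops waiting in the storage queue

    def issue(self, addr):
        """Return True if the micro-op was issued, False if it must wait."""
        if addr in self.in_flight:
            self.store_queue.append(addr)  # conflict: wait in storage queue
            return False
        self.in_flight.add(addr)           # no conflict: issue directly
        return True

    def retire(self, addr):
        """Mark a micro-op finished; re-issue queued ops that no longer conflict."""
        self.in_flight.discard(addr)
        ready = [a for a in self.store_queue if a not in self.in_flight]
        for a in ready:
            self.store_queue.remove(a)
            self.in_flight.add(a)
        return ready
```

For example, two micro-ops on the same address issue strictly in order, while micro-ops on disjoint addresses issue immediately, which matches the correctness-plus-efficiency goal stated above.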
  • The instruction cache module stores the neural network instructions to be executed. While an instruction is being executed, it remains cached in the instruction cache module. When an instruction finishes executing, if it is also the earliest uncommitted instruction in the instruction cache module, it is committed; once committed, the changes this instruction makes to the device state can no longer be cancelled.
  • The instruction cache module can be a reorder buffer.
  • the neural network computing device of this embodiment further includes: an input and output unit for storing data in the storage unit, or acquiring a neural network operation result from the storage unit.
  • The direct memory access unit is responsible for reading data from, or writing data to, the memory.
  • In a fourth embodiment, the present invention provides a processing method for general neural network data (a general neural network is one whose data uses neither the discrete data representation nor the sparse representation), used to perform a general neural network operation according to an operation instruction. As shown in FIG. 7, the processing method for general neural network data in this embodiment includes:
  • Step S701 the instruction fetch module extracts the neural network instruction from the instruction cache module, and sends the neural network instruction to the decoding module.
  • Step S702 the decoding module decodes the neural network instruction, and obtains micro-instructions corresponding to the storage unit, the sparse selection unit, and the neural network operation unit respectively, and sends each micro-instruction to the instruction queue;
  • Step S703, obtaining the neural network operation opcode and the neural network operation operands of the micro-instruction from the scalar register file, after which the micro-instruction is sent to the dependency processing unit;
  • Step S704, the dependency processing unit analyzes whether the microinstruction has a data dependency on a microinstruction that has not yet finished executing; if so, the microinstruction must wait in the storage queue until the dependency no longer exists, after which the microinstruction is sent to the neural network operation unit and the storage unit;
  • Step S705 the neural network operation unit extracts the required data (including input data, neural network model data, etc.) from the scratchpad memory according to the address and size of the required data.
  • Step S706 then completing the neural network operation corresponding to the operation instruction in the neural network operation unit, and writing the result obtained by the neural network operation back to the storage unit.
  • the present invention also provides a sparse neural network data processing method for performing a sparse neural network operation according to an operation instruction.
  • the processing method of the sparse neural network data in this embodiment includes:
  • Step S801, the instruction fetch module extracts the neural network instruction from the instruction cache module, and the neural network instruction is sent to the decoding module;
  • Step S802 the decoding module decodes the neural network instruction, obtains micro-instructions respectively corresponding to the storage unit, the sparse selection unit, and the neural network operation unit, and sends each micro-instruction to the instruction queue;
  • Step S803, obtaining the neural network operation opcode and the neural network operation operands of the micro-instruction from the scalar register file, after which the micro-instruction is sent to the dependency processing unit;
  • Step S804, the dependency processing unit analyzes whether the microinstruction has a data dependency on a microinstruction that has not yet finished executing; if so, the microinstruction must wait in the storage queue until the dependency no longer exists, after which the microinstruction is sent to the neural network operation unit and the storage unit;
  • Step S805, the operation unit extracts the required data (including the input data, the neural network model data, and the neural network sparse representation data) from the scratchpad memory according to the address and size of the required data; the sparse selection module then selects, according to the sparse representation, the input data corresponding to the effective neural network weight data.
  • Here the input data uses the general data representation, while the neural network model data uses the sparse representation.
  • The sparse selection module selects the input data corresponding to the weights according to the 01 bit string of the neural network model data; the length of the 01 bit string equals the length of the neural network model data. As shown in FIG. 2, the input data at positions where the bit is 1 is fed into the device, while the input data at positions where there is no corresponding weight (bit 0) is not.
  • Step S806, the neural network operation corresponding to the operation instruction is completed in the operation unit (because the input data corresponding to the sparse weight data was already selected in step S805, the calculation process is the same as step S106 in FIG. 3: the inputs are multiplied by the weights, the offset is added, and the final activation is applied), and the result of the neural network operation is written back to the storage unit.
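The computation in this step can be sketched for a single neuron, under the assumption of a sigmoid activation (the patent does not fix a particular activation function, and the names here are illustrative):

```python
import math

def sparse_layer(bits, values, inputs, bias=0.0):
    """One sparsely represented neuron: select the inputs at valid positions,
    multiply by the stored weights, add the bias, apply the activation."""
    selected = [x for b, x in zip(bits, inputs) if b == 1]  # S805 selection
    acc = sum(w * x for w, x in zip(values, selected)) + bias
    return 1.0 / (1.0 + math.exp(-acc))  # sigmoid chosen as example activation

y = sparse_layer(bits=[0, 1, 0, 1], values=[2.0, 3.0],
                 inputs=[10.0, 1.0, 10.0, 1.0], bias=-5.0)
# acc = 2*1 + 3*1 - 5 = 0, so y = sigmoid(0) = 0.5
```

Only the selected inputs reach the multiply-accumulate stage, which is exactly what lets the device skip the invalid (zero) weight positions.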
  • the present invention also provides a discrete neural network data processing method for performing a neural network operation of discrete data representation according to an operation instruction.
  • the data processing method for the discrete neural network in this embodiment includes:
  • Step S901, the instruction fetch module extracts the neural network instruction from the instruction cache module, and the neural network instruction is sent to the decoding module;
  • Step S902 the decoding module decodes the neural network instruction, obtains micro-instructions corresponding to the storage unit, the sparse selection unit, and the neural network operation unit, respectively, and sends each micro-instruction to the instruction queue;
  • Step S903, acquiring the neural network operation opcode and the neural network operation operands of the micro-instruction from the scalar register file, after which the micro-instruction is sent to the dependency processing unit;
  • Step S904, the dependency processing unit analyzes whether the microinstruction has a data dependency on a microinstruction that has not yet finished executing; if so, the microinstruction must wait in the storage queue until the dependency no longer exists, after which the microinstruction is sent to the neural network operation unit and the storage unit;
  • Step S905, the operation unit extracts the required data (including the input data and the model data of the several sub-networks described above, each sub-network containing only one discretely represented weight, together with the sparse representation of each sub-network) from the scratchpad memory according to the address and size of the required data; the sparse selection module then selects, according to the sparse representation of each sub-network, the input data corresponding to that sub-network's effective weight data.
  • The storage format of the discrete data is shown, for example, in FIG. 3 and FIG. 4.
  • (The sparse selection module operates as described above: according to the position information of the sparsely represented 01 bit string, the corresponding input data is selected and read from the scratchpad memory into the device.)
  • Step S906, the operation of the sub-network corresponding to the operation instruction is then completed in the operation unit (this process is also similar to the calculation described above; the only difference, analogous to the difference between FIG. 2 and FIG. 3, is that in the sparse representation the model data has only one sparse representation, while the discrete data representation may generate multiple sub-models from one set of model data). During the operation, the results of all sub-models must be accumulated: the operation results of the sub-networks are added together, and the final operation result is written back to the storage unit.
  • the present invention also provides a neural network data processing method.
  • the neural network data processing method of this embodiment includes:
  • Step A, the data type judging unit judges the type of the neural network data: if the neural network data is sparse neural network data, step B is performed; if it is discrete neural network data, step D is performed; if it is general neural network data, step G is performed;
  • Step B The sparse selection unit selects neural network data corresponding to the effective weight in the storage unit according to the location information represented by the sparse data;
  • Step C The neural network operation unit performs a neural network operation on the neural network data acquired by the sparse selection unit, and obtains an operation result of the sparse neural network data, and the neural network data processing ends;
  • Step D the discrete neural network data splitting unit splits the neural network model of the discrete neural network data into N sparsely represented sub-networks, each sub-network contains only one real number, and the remaining weights are all 0;
  • Step E the sparse selection unit and the neural network operation unit process each sub-network according to the sparse neural network data, and respectively obtain the operation result;
  • Step F the neural network operation unit sums the operation results of the N sub-networks to obtain a neural network operation result of the discrete neural network data, and the neural network data processing ends;
  • step G the neural network operation unit performs a neural network operation on the general neural network data, and obtains the operation result, and the neural network data processing ends.
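The branching in steps A through G above can be sketched in software as follows. This is a minimal illustrative model of the data flow, not the hardware implementation; the function names and the list-based data layout are assumptions introduced here.

```python
# Minimal software sketch of steps A-G: dispatch on the data type, and reuse
# the sparse path for discrete data by splitting the model into sub-networks.
# Names and data layout are illustrative assumptions, not the actual hardware.

def sparse_op(mask, values, inputs):
    """Steps B-C: select inputs at valid (1-bit) positions, multiply-accumulate."""
    selected = [x for bit, x in zip(mask, inputs) if bit == 1]
    return sum(w * x for w, x in zip(values, selected))

def process(data_type, model, inputs):
    if data_type == "sparse":                 # Step A -> B, C
        mask, values = model
        return sparse_op(mask, values, inputs)
    if data_type == "discrete":               # Step A -> D, E, F
        weights, value_set = model
        total = 0.0
        for v in value_set:                   # Step D: one sub-network per value
            mask = [1 if w == v else 0 for w in weights]
            values = [v] * sum(mask)
            total += sparse_op(mask, values, inputs)  # Step E
        return total                          # Step F: sum of sub-network results
    # Step A -> G: general data, the sparse selection path is bypassed
    weights = model
    return sum(w * x for w, x in zip(weights, inputs))
```

For example, `process("discrete", ([1, -1, 1, -1], [1, -1]), [1.0, 2.0, 3.0, 4.0])` splits the layer into a +1 sub-network and a -1 sub-network and sums their results, matching the dense dot product.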
  • The seventh embodiment of the present invention describes a processing method for mixed neural network data, in which general, sparse, and discrete data are combined: for example, the input data is general neural network data, while some layers use discrete data and other layers use sparse data. Since the basic flow in the apparatus of the present invention is exemplified with a single neural network layer, whereas networks in actual use are typically multi-layered, it is common in practice for different layers to adopt different data types.
  • By multiplexing the sparse selection unit, the present invention efficiently supports both sparse neural networks and neural network operations on discretely represented data, reducing the amount of data required for operations and increasing data reuse during computation, thereby solving problems such as insufficient computing performance, insufficient memory access bandwidth, and excessive power consumption.
  • Through the dependency processing module, correct operation of the neural network is ensured while running efficiency is improved and running time is shortened. The invention has wide application in many fields, with strong application prospects and great economic value.


Abstract

A neural network computation apparatus and method. The neural network computation apparatus comprises: a control unit (100), a storage unit (200), a sparse selection unit (300), and a neural network computation unit (400). The control unit (100) is used for producing microinstructions respectively corresponding to each unit and sending them to the corresponding units; the sparse selection unit (300) is used for selecting, from the neural network data stored in the storage unit (200), the neural network data corresponding to the effective weight values for computation, on the basis of the microinstruction issued to it by the control unit (100) and according to the position information indicated by the sparse data; and the neural network computation unit (400) is used for executing neural network computation on the neural network data selected by the sparse selection unit (300), on the basis of the microinstruction issued to it by the control unit (100), in order to obtain the computation results. The apparatus and method improve the ability of the neural network computation apparatus to process different types of data, increasing the speed of neural network computation while reducing power consumption.

Description

Neural network computing device and method

Technical Field

The present invention relates to the field of information technology, and in particular to a neural network operation device and method compatible with general neural network data, sparse neural network data, and discrete neural network data.

Background Art

Artificial neural networks (ANNs), or simply neural networks (NNs), are algorithmic mathematical models that mimic the behavioral characteristics of animal neural networks and perform distributed parallel information processing. Such a network depends on the complexity of the system and processes information by adjusting the interconnections among a large number of internal nodes. At present, neural networks have made great progress in many fields such as intelligent control and machine learning. With the continuous development of deep learning technology, neural network models are growing ever larger, and the demands on computing performance and memory access bandwidth are growing ever higher; existing neural network computing platforms (CPUs, GPUs, traditional neural network accelerators) can no longer meet user demands.

In order to improve the computational efficiency of neural network computing platforms, sparse neural network data and discrete neural network data have been developed on the basis of general neural network data. However, current neural network computing platforms need a separate processing module for each type of neural network data, which strains computing resources and brings accompanying problems such as insufficient memory access bandwidth and excessive power consumption.

Summary of the Invention

(1) Technical problems to be solved

In view of the above technical problems, the present invention provides a neural network computing device and method to improve the degree of multiplexing in neural network data processing and save computing resources.

(2) Technical solution

According to one aspect of the invention, a neural network computing device is provided. The neural network computing device includes: a control unit, a storage unit, a sparse selection unit, and a neural network operation unit. The storage unit is configured to store neural network data; the control unit is configured to generate microinstructions respectively corresponding to the sparse selection unit, the neural network operation unit, and the storage unit, and to send the microinstructions to the corresponding units; the sparse selection unit is configured, according to the microinstruction issued to it by the control unit and the position information given by the sparse data representation, to select from the neural network data stored in the storage unit the data corresponding to the effective weights to participate in the operation; and the neural network operation unit is configured to perform a neural network operation on the data selected by the sparse selection unit according to the microinstruction issued to it by the control unit, obtaining the operation result.

According to another aspect of the present invention, there is also provided a neural network data processing method using the above neural network computing device. The method includes: step D, in which the discrete neural network data splitting unit splits the neural network model of the discrete neural network data into N sparsely represented sub-networks, each containing only one real value with all remaining weights 0; step E, in which the sparse selection unit and the neural network operation unit process each sub-network as sparse neural network data, obtaining an operation result for each; and step F, in which the neural network operation unit sums the operation results of the N sub-networks to obtain the neural network operation result of the discrete neural network data, after which the processing ends.

According to another aspect of the present invention, there is also provided a neural network data processing method using the above neural network computing device. The method includes: step A, in which the data type judging unit judges the type of the neural network data; if the data is sparse neural network data, step B is performed; if it is discrete neural network data, step D is performed; if it is general neural network data, step G is performed; step B, in which the sparse selection unit selects, according to the position information given by the sparse data representation, the neural network data in the storage unit corresponding to the effective weights; step C, in which the neural network operation unit performs a neural network operation on the data selected by the sparse selection unit, obtaining the operation result of the sparse neural network data, after which the processing ends; step D, in which the discrete neural network data splitting unit splits the neural network model of the discrete data into N sparsely represented sub-networks, each containing only one real value with all remaining weights 0; step E, in which the sparse selection unit and the neural network operation unit process each sub-network as sparse neural network data, obtaining an operation result for each; step F, in which the neural network operation unit sums the operation results of the N sub-networks to obtain the neural network operation result of the discrete data, after which the processing ends; and step G, in which the neural network operation unit performs a neural network operation on the general neural network data, obtaining the operation result, after which the processing ends.

(3) Beneficial effects

It can be seen from the above technical solutions that the neural network computing device and method of the present invention have at least one of the following beneficial effects:

(1) By multiplexing the sparse selection unit, the device efficiently supports both sparse neural networks and neural network operations on discretely represented data, reducing the amount of data required for operations and increasing data reuse during computation, thereby solving the problems of insufficient computing performance, insufficient memory access bandwidth, and excessive power consumption that exist in the prior art;

(2) Through the dependency processing unit, the device can determine whether data are interdependent, for example whether the input data of the next computation is the output of the previous computation. Without such a module, the next computation would start without waiting for the previous one to finish, producing incorrect results. The dependency processing unit detects data dependencies and makes the device wait for the data before proceeding with the next computation, ensuring both the correctness and the efficiency of the device's operation.

Brief Description of the Drawings

FIG. 1 is a schematic structural diagram of a neural network computing device according to a first embodiment of the present invention;

FIG. 2 is a schematic diagram of sparse neural network weight model data;

FIG. 3 is a schematic diagram of splitting discrete neural network data with N=4 into sub-networks;

FIG. 4 is a schematic diagram of splitting discrete neural network data with N=2 into two sub-networks;

FIG. 5 is a schematic structural diagram of a neural network computing device according to a second embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a neural network computing device according to a third embodiment of the present invention;

FIG. 7 is a flowchart of a neural network data processing method according to a fourth embodiment of the present invention;

FIG. 8 is a flowchart of a neural network data processing method according to a fifth embodiment of the present invention;

FIG. 9 is a flowchart of a neural network data processing method according to a sixth embodiment of the present invention;

FIG. 10 is a flowchart of a neural network data processing method according to a seventh embodiment of the present invention.

Detailed Description

Before introducing the present invention, the three types of neural network data (general neural network data, sparse neural network data, and discrete neural network data) are first described.

In the present invention, general neural network data refers to general-purpose computer data, that is, the data types commonly used in computers, such as 32-bit floating-point data, 16-bit floating-point data, 32-bit fixed-point data, and so on.

In the present invention, discrete neural network data means data in which part or all of the values are represented by discrete data. Unlike the 32-bit or 16-bit floating-point representations of general neural network data, discrete neural network data restricts the participating values to a set composed of only a few discrete real numbers. The data in a neural network includes input data and neural network model data. The following types are distinguished:

(1) fully discrete representation: both the input data and the neural network model data are composed of these discrete real values;

(2) model-discrete representation: only the neural network model data (all neural network layers, or certain layers) is composed of these discrete real values, while the input data uses general neural network data;

(3) input-discrete representation: only the input data is composed of these discrete real values, while the neural network model data uses the original general neural network data.

The discrete data representation in the present invention covers the above three cases. For example, the input data may be original general neural network data, such as an RGB image, while the neural network model data is discretely represented, e.g. the weight data of certain layers takes only the two values -1/+1; such a network is a neural network represented by discrete neural network data.

In the present invention, sparse neural network data is data that is discontinuous in position, and it consists of two parts: the data values and the data position information. For example, suppose the model data of a neural network is sparse. The position information is recorded as a 01-bit string whose length equals the size of the whole model data: each bit indicates whether the model data at the corresponding position is valid, with 0 meaning the value at that position is invalid (pruned away) and 1 meaning it is valid. Only the values at valid positions are actually stored as the data part. The stored data values together with the 01-bit-string position information constitute the sparse neural network data; in the present invention this representation is also called the sparse data representation.
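The sparse data representation described above (a 01-bit string for the position information plus storage of only the valid values) can be sketched as follows; the concrete weight values are invented for illustration.

```python
# Sparse data representation: a 01-bit string marks which positions hold valid
# weights; only the valid values themselves are stored. (Illustrative values.)

dense = [0.0, 0.5, 0.0, 0.0, -1.2, 0.0, 0.7, 0.0]

# Encode: bit i is 1 iff position i is valid (non-zero).
mask = [1 if w != 0.0 else 0 for w in dense]          # position information
values = [w for w in dense if w != 0.0]               # stored data: valid only

# Decode: walk the mask, consuming one stored value per 1-bit.
decoded, it = [], iter(values)
for bit in mask:
    decoded.append(next(it) if bit else 0.0)

assert mask == [0, 1, 0, 0, 1, 0, 1, 0]
assert values == [0.5, -1.2, 0.7]
assert decoded == dense
```

The pair `(mask, values)` is what the storage unit would hold for a sparse layer; the dense form never needs to be materialized during operation.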

The neural network computing device and method provided by the present invention support neural network operations on both sparse neural network data and discrete neural network data by multiplexing the sparse selection unit.

The present invention can be applied in scenarios including, but not limited to: data processing; robots, computers, printers, scanners, telephones, tablets, smart terminals, mobile phones, driving recorders, navigators, sensors, cameras, cloud servers, video cameras, projectors, watches, earphones, mobile storage, wearable devices, and other electronic products; aircraft, ships, vehicles, and other means of transportation; televisions, air conditioners, microwave ovens, refrigerators, rice cookers, humidifiers, washing machines, electric lights, gas stoves, range hoods, and other household appliances; and various medical equipment including nuclear magnetic resonance instruments, B-mode ultrasound devices, and electrocardiographs.

In order to make the objects, technical solutions, and advantages of the present invention clearer, the invention is further described in detail below with reference to specific embodiments and the accompanying drawings.

1. First Embodiment

In a first exemplary embodiment of the present invention, a neural network computing device is provided. Referring to FIG. 1, the neural network computing device of this embodiment includes: a control unit 100, a storage unit 200, a sparse selection unit 300, and a neural network operation unit 400. The storage unit 200 is configured to store neural network data. The control unit 100 is configured to generate microinstructions respectively corresponding to the sparse selection unit and the neural network operation unit, and to send the microinstructions to the corresponding units. The sparse selection unit 300 is configured, according to the microinstruction issued to it by the control unit and the position information given by the sparse data representation, to select from the neural network data stored in the storage unit the data corresponding to the effective weights to participate in the operation. The neural network operation unit 400 is configured to perform a neural network operation on the data selected by the sparse selection unit according to the microinstruction issued to it by the control unit, obtaining the operation result.

The components of the neural network computing device of this embodiment are described in detail below.

The storage unit 200 is configured to store the three types of neural network data: general neural network data, sparse neural network data, and discrete neural network data. In one embodiment, the storage unit may be a scratchpad memory capable of supporting data of different sizes; the present invention temporarily stores the necessary calculation data in the scratchpad memory, so that the computing device can flexibly and efficiently support data of different scales during neural network operations. The storage unit can be implemented with various memory devices (SRAM, eDRAM, DRAM, memristor, 3D-DRAM, non-volatile memory, etc.).

The control unit 100 is configured to generate microinstructions respectively corresponding to the sparse selection unit and the neural network operation unit, and to send them to the corresponding units. In the present invention, the control unit can support many different types of neural network algorithms, including but not limited to CNN/DNN/DBN/MLP/RNN/LSTM/SOM/RCNN/FastRCNN/Faster-RCNN.

The sparse selection unit 300 is configured to select the neurons corresponding to the effective weights to participate in the operation, according to the position information of the sparse neural network data. When processing discrete neural network data, the corresponding discrete data is likewise handled through the sparse selection unit.

The neural network operation unit 400 is configured to acquire input data from the storage unit according to the microinstructions generated by the control unit, perform a general, sparse, or discretely represented neural network operation to obtain the operation result, and store the result back into the storage unit.

Building on the above, the sparse selection unit of this embodiment is now described in detail. Referring to FIG. 1, the sparse selection unit can process both the sparse data representation and the discrete data representation. Specifically, the sparse selection unit uses the position information of the sparse data, i.e. the 01-bit string, to select the input data of each neural network layer corresponding to those positions and send it to the neural network operation unit. In the 01-bit string, each bit corresponds to one weight value in the neural network model: 0 means the corresponding weight is invalid (absent), and 1 means it is valid (present). The data part of the sparse representation stores only the valid values. For example, consider the sparse neural network weight model data shown in FIG. 2. The sparse selection module sends the valid weight values of the sparse representation directly to the neural network operation unit, and then, according to the 01 string, selects the input neuron data corresponding to the positions of the 1 (valid) bits and sends it to the operation unit. In FIG. 2, the sparse selection module sends to the neural network operation unit the input neurons corresponding to weight positions 1/2/3/7/9/11/15 (these numbers correspond to the left-to-right positions of the 1s in the position information of FIG. 2, i.e. array indices).
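The selection step can be sketched as follows. The 16-bit mask has its 1-bits at positions 1/2/3/7/9/11/15 (counting from 1, left to right) to mirror the FIG. 2 example; the weight and input values themselves are invented.

```python
# Sparse selection as in the FIG. 2 example: bits set at positions
# 1/2/3/7/9/11/15 (1-indexed, left to right). Weight and input values invented.

mask = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0]
weights = [0.3, -0.5, 1.1, 0.2, 0.9, -1.4, 0.6]    # one stored value per 1-bit

inputs = [float(i) for i in range(16)]             # input neurons 0..15

# Send only the inputs at valid positions to the operation unit, paired with
# the stored weights in order.
selected = [x for bit, x in zip(mask, inputs) if bit]
result = sum(w * x for w, x in zip(weights, selected))

# Equivalent dense computation, for verification only.
it = iter(weights)
dense = [next(it) if bit else 0.0 for bit in mask]
assert selected == [0.0, 1.0, 2.0, 6.0, 8.0, 10.0, 14.0]
assert abs(result - sum(w * x for w, x in zip(dense, inputs))) < 1e-12
```

Only 7 of the 16 input neurons are fetched, which is where the reduction in data movement comes from.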

A major feature of the present invention is that the sparse selection module 300 is reused for discrete neural network data. Specifically, discrete neural network data taking only a few real values is treated as several pieces of sparse neural network data. When preparing the neural network model data, the neural network model is split into N sub-networks, each with the same size as the original discrete model. Each sub-network contains only one of the real values, with all remaining weights set to 0, so each sub-network resembles the sparse representation described above. The only difference from sparse data is that, after the neural network operation unit finishes computing the sub-networks, an external instruction must direct the operation unit to sum the sub-network results to obtain the final result.

Referring again to FIG. 1, in this embodiment the neural network computing device further includes a discrete neural network data splitting unit 500, which is configured to:

(1) determine the number N of distinct real values in the discrete neural network data;

(2) split the neural network model of the discrete model data into N sub-networks, each containing only one of the real values, with all remaining weights set to 0.

The sparse selection unit 300 and the neural network operation unit 400 process each sub-network as sparse neural network data, obtaining an operation result for each. The neural network operation unit is further configured to sum the operation results of the N sub-networks, thereby obtaining the neural network operation result of the discrete neural network data.

As shown in FIG. 3, the weight data of a certain layer of the neural network model is represented by discrete data. Here the discrete data consists of four real values (N=4), i.e. only the four weight values -1/1/2/-2, and the layer is split into four sparsely represented sub-networks according to these four values. During execution, the four sub-networks are prepared externally, and the sparse selection module reads them in one by one; the subsequent processing is the same as for sparse neural network data, i.e. the input data corresponding to the weight position information is selected and sent to the operation unit. The only difference is that, after the operation unit finishes computing, an external instruction must direct the neural network operation unit to sum the operation results of the four sub-networks.
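The N=4 splitting and summation can be sketched as follows. Only the value set -1/1/2/-2 comes from the text; the weight layout and input values are invented for illustration.

```python
# Splitting a discretely represented weight layer (values drawn only from
# {-1, 1, 2, -2}, N=4) into four sparse sub-networks and summing their
# results. The weight layout and inputs are invented for illustration.

VALUE_SET = [-1, 1, 2, -2]                      # N = 4 discrete real values
weights = [1, -2, 2, 1, -1, 2, -2, 1]           # discrete layer (invented)
inputs = [0.5, 1.0, -1.0, 2.0, 0.0, 1.5, -0.5, 1.0]

total = 0.0
for v in VALUE_SET:
    # Sub-network for value v: a 01 mask marking where the layer equals v.
    mask = [1 if w == v else 0 for w in weights]
    # Sparse selection: take the inputs at valid positions, each weighted by v.
    total += v * sum(x for bit, x in zip(mask, inputs) if bit)

# The summed sub-network results equal the dense computation.
dense = sum(w * x for w, x in zip(weights, inputs))
assert abs(total - dense) < 1e-12
```

Each sub-network multiplies by a single constant, so the per-sub-network work reduces to a masked selection plus one scalar multiply, with the final external summation recombining the four partial results.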

When the number of quantized values is N=2, the two sparse representations can be merged into one (the 0s and 1s of the sparse representation then denote the two different quantized values, rather than invalid and valid). As shown in FIG. 4, in this special case the two position-information bit strings of the two sparsely represented sub-networks can be merged into a single bit string, i.e. a sequence of the digits 0 and 1. Here 0/1 no longer indicates sparse position information, but rather the positions of the -1 and +1 weight values, respectively. The sparse selection module selects the input data twice, feeding the operation unit the input data corresponding to the positions of weight 1 and of weight -1 in turn. As before, an external instruction finally directs the operation unit to sum the outputs of the two sub-networks.
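The merged N=2 representation can be sketched as follows, with the single bit string directly encoding which of the two values each weight takes; the input values are invented.

```python
# N=2 case: a single bit string encodes the weights themselves, bit 1 meaning
# weight +1 and bit 0 meaning weight -1 (no position is invalid). The input
# values are invented for illustration.

bits = [1, 0, 0, 1, 1, 0]                        # merged position/value string
inputs = [2.0, -1.0, 0.5, 3.0, 1.0, -2.0]

# The selection module reads the inputs twice: once for the +1 positions and
# once for the -1 positions; an external instruction then sums the two results.
plus = sum(x for b, x in zip(bits, inputs) if b == 1)
minus = -sum(x for b, x in zip(bits, inputs) if b == 0)
result = plus + minus

weights = [1 if b else -1 for b in bits]
assert result == sum(w * x for w, x in zip(weights, inputs))
```

This halves the position-information storage relative to keeping two separate sparse sub-networks, since one bit string serves both.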

2. Second Embodiment

In a second exemplary embodiment of the present invention, a neural network computing device is provided. As shown in FIG. 5, the neural network computing device of this embodiment differs from the first embodiment in that a data type judging unit 600 is added to determine the type of the neural network data.

The neural network data type of the present invention is specified in the instruction. Based on the output of the data type judging unit, the control unit controls the working mode of the sparse selection unit and the operation unit:

(a) For sparse neural network data, the sparse selection module selects the corresponding input data according to the position information and sends it to the neural network operation unit.

Specifically, when the neural network data is sparse neural network data, the sparse selection unit is made to select, according to the position information given by the sparse data representation, the neurons in the storage unit corresponding to the effective weights to participate in the operation, and the neural network operation unit is made to perform a neural network operation on the data acquired by the sparse selection unit, obtaining the operation result.

(b) For discrete neural network data, the sparse selection module selects the corresponding input data according to the position information and sends it to the operation unit, and the operation unit sums the computation results according to an external operation instruction.

具体而言,当所述神经网络数据为离散神经网络数据时,令所述离散神经网络数据拆分单元工作,将离散神经网络数据的神经网络模型拆分成N个子网络;令所述稀疏选择单元和神经网络运算单元工作,将每一个子网络按照稀疏神经网络数据进行处理,分别得到运算结果;令所述神经网络运算单元工作,将N个子网络的运算结果求和,得到所述离散神经网络数据的神经网络运算结果。Specifically, when the neural network data is discrete neural network data, the discrete neural network data splitting unit is operated, and the neural network model of the discrete neural network data is split into N sub-networks; The unit and the neural network operation unit work, and each sub-network is processed according to the sparse neural network data to obtain the operation result respectively; the neural network operation unit is operated, and the operation results of the N sub-networks are summed to obtain the discrete neural network. Neural network operation results of network data.

(c)针对一般神经网络数据,既稀疏选择模块不工作,不会根据位置信息选择。(c) For general neural network data, the sparse selection module does not work and is not selected based on location information.

具体而言,当所述神经网络数据为通用神经网络数据时,令所述稀疏选择单元不工作,令所述神经网络运算单元对通用神经网络数据执行神经网络运算,得到运算结果。Specifically, when the neural network data is general neural network data, the sparse selection unit is disabled, and the neural network operation unit performs a neural network operation on the general neural network data to obtain an operation result.
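The three cases can be summarized as a small dispatch routine (a simplified software model of the control flow; the data layouts used for `model` are assumptions made for illustration):

```python
def process(data_type, inputs, model):
    """Mirror cases (a)-(c): route the data by its declared type.

    For "sparse" data, `model` is (positions, weights) listing only the
    valid weights; for "discrete" data it is a list of such sparse
    sub-networks; for "general" data it is a dense weight vector.
    """
    if data_type == "sparse":
        positions, weights = model
        selected = [inputs[p] for p in positions]      # sparse selection
        return sum(x * w for x, w in zip(selected, weights))
    if data_type == "discrete":
        # Each sub-network is processed as sparse data, then summed.
        return sum(process("sparse", inputs, sub) for sub in model)
    # General data: the sparse selection unit is bypassed entirely.
    return sum(x * w for x, w in zip(inputs, model))
```

The same selection routine serves both the sparse and the discrete path, which is the reuse the device exploits.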

III. Third Embodiment

In a third exemplary embodiment of the present invention, a neural network operation device is provided. Compared with the second embodiment, this embodiment differs in that a dependency-handling function is added to the control unit.

Referring to FIG. 6, according to one implementation of the present invention, the control unit 100 includes: an instruction cache module 110 for storing the neural network instructions to be executed, each instruction containing the address information of the neural network data to be processed; an instruction fetch module 120 for fetching neural network instructions from the instruction cache module; a decoding module 130 for decoding the neural network instructions into micro-instructions respectively directed at the storage unit, the sparse selection unit, and the neural network operation unit, each micro-instruction containing the address information of the corresponding neural network data; an instruction queue 140 for storing the decoded micro-instructions; a scalar register file 150 for storing the address information of the neural network data to be processed; and a dependency processing module 160 for judging whether a micro-instruction in the instruction queue accesses the same data as the preceding micro-instruction: if so, the micro-instruction is held in a store queue and issued to its unit only after the preceding micro-instruction has finished; otherwise, it is issued to its unit directly.
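The role of the dependency processing module 160 can be sketched as follows (a deliberately simplified software model: it checks each micro-instruction only against its immediate predecessor and releases the store queue at the end, whereas the hardware tracks completion of earlier instructions explicitly):

```python
from collections import deque

def issue_order(micro_instructions):
    """Return the order in which micro-instructions reach their units.

    Each entry is (name, set_of_data_addresses). An instruction that
    touches an address used by the preceding instruction is parked in
    the store queue and only released after the others have issued.
    """
    issued = []
    store_queue = deque()
    previous_addresses = set()
    for name, addresses in micro_instructions:
        if addresses & previous_addresses:
            store_queue.append(name)   # data hazard: wait for predecessor
        else:
            issued.append(name)        # no conflict: issue directly
        previous_addresses = addresses
    issued.extend(store_queue)         # predecessors done: release queue
    return issued
```

Here `issue_order` is a hypothetical name; the point is only that a conflicting micro-instruction is delayed rather than dropped, which preserves correctness while non-conflicting instructions proceed.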

The instruction cache module stores the neural network instructions awaiting execution. While an instruction executes it remains cached in this module; once it finishes, if it is also the oldest uncommitted instruction in the module, it is committed, and from that point the changes its operations make to the device state can no longer be undone. In one implementation, the instruction cache module may be a reorder buffer.

In addition, the neural network operation device of this embodiment further includes an input/output unit for storing data into the storage unit or retrieving neural network operation results from it, and a direct memory unit responsible for reading data from and writing data to memory.

IV. Fourth Embodiment

Based on the neural network operation device of the third embodiment, the present invention further provides a processing method for general neural network data (a general neural network being one whose data uses neither a discrete representation nor a sparse representation), for performing ordinary neural network operations according to operation instructions. As shown in FIG. 7, the method of this embodiment includes:

Step S701: the instruction fetch module fetches a neural network instruction from the instruction cache module and sends it to the decoding module;

Step S702: the decoding module decodes the instruction into micro-instructions respectively directed at the storage unit, the sparse selection unit, and the neural network operation unit, and sends each micro-instruction to the instruction queue;

Step S703: the neural network operation opcode and operands of the micro-instruction are obtained from the scalar register file, after which the micro-instruction is sent to the dependency processing unit;

Step S704: the dependency processing unit analyzes whether the micro-instruction has a data dependency on micro-instructions that have not yet finished executing; if so, the micro-instruction waits in the store queue until the dependency no longer exists, after which it is sent to the neural network operation unit and the storage unit;

Step S705: the neural network operation unit fetches the required data (including the input data and the neural network model data) from the scratchpad memory according to the address and size of that data;

Step S706: the neural network operation corresponding to the operation instruction is then completed in the neural network operation unit, and the result is written back to the storage unit.

This concludes the description of the general neural network data processing method of the fourth embodiment of the present invention.

V. Fifth Embodiment

Based on the neural network operation device of the third embodiment, the present invention further provides a sparse neural network data processing method for performing sparse neural network operations according to operation instructions. As shown in FIG. 8, the method of this embodiment includes:

Step S801: the instruction fetch module fetches a neural network instruction from the instruction cache module and sends it to the decoding module;

Step S802: the decoding module decodes the instruction into micro-instructions respectively directed at the storage unit, the sparse selection unit, and the neural network operation unit, and sends each micro-instruction to the instruction queue;

Step S803: the neural network operation opcode and operands of the micro-instruction are obtained from the scalar register file, after which the micro-instruction is sent to the dependency processing unit;

Step S804: the dependency processing unit analyzes whether the micro-instruction has a data dependency on micro-instructions that have not yet finished executing; if so, the micro-instruction waits in the store queue until the dependency no longer exists, after which it is sent to the neural network operation unit and the storage unit;

Step S805: the operation unit fetches the required data (including the input data, the neural network model data, and the sparse representation of the neural network) from the scratchpad memory according to the address and size of that data, and the sparse selection module then selects, according to the sparse representation, the input data corresponding to the valid neural network weights.

For example, suppose the input data uses the general representation and the neural network model data uses the sparse representation. The sparse selection module selects the input data corresponding to the weights according to the 0/1 bit string of the model data, whose length equals that of the dense model. As shown in FIG. 2, at each position where the bit string holds a 1, the input datum corresponding to that position's weight is fed into the device; at each position holding a 0, no input datum is fed in. In this way, the input data corresponding to the positions of the sparse weights are selected according to the position-information bit string of the sparsified weights.
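The selection step in this example amounts to masking the input vector with the 0/1 string, as in this small illustrative software sketch:

```python
def select_by_bit_string(inputs, bit_string):
    """Keep only the inputs at positions where the 0/1 string holds a 1.

    The bit string has the same length as the dense weight vector; a 1
    marks a position whose weight is valid (non-zero).
    """
    assert len(inputs) == len(bit_string)
    return [x for x, bit in zip(inputs, bit_string) if bit == 1]
```

Only the selected inputs, paired with the stored non-zero weights, reach the operation unit.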

Step S806: the neural network operation corresponding to the operation instruction is completed in the operation unit (since the input data corresponding to the sparse weight data have already been selected in step S805, the computation itself is the same as step S106 in FIG. 3: multiply the inputs by the weights, sum the products, add the bias, and finally apply the activation), and the result of the neural network operation is written back to the storage unit.
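The computation of step S806 itself (multiply, sum, add the bias, apply the activation) can be written out as a sketch; ReLU is used here as a stand-in activation, since the text does not fix one:

```python
def neuron_output(selected_inputs, nonzero_weights, bias):
    """Weighted sum of the selected inputs, plus bias, then activation.

    ReLU is an assumed stand-in for the activation function.
    """
    weighted_sum = sum(x * w for x, w in zip(selected_inputs, nonzero_weights))
    return max(weighted_sum + bias, 0)   # ReLU activation
```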

This concludes the description of the sparse neural network data processing method of the fifth embodiment of the present invention.

VI. Sixth Embodiment

Based on the neural network operation device of the third embodiment, the present invention further provides a discrete neural network data processing method for performing neural network operations on discretely represented data according to operation instructions.

As shown in FIG. 9, the method of this embodiment includes:

Step S901: the instruction fetch module fetches a neural network instruction from the instruction cache module and sends it to the decoding module;

Step S902: the decoding module decodes the instruction into micro-instructions respectively directed at the storage unit, the sparse selection unit, and the neural network operation unit, and sends each micro-instruction to the instruction queue;

Step S903: the neural network operation opcode and operands of the micro-instruction are obtained from the scalar register file, after which the micro-instruction is sent to the dependency processing unit;

Step S904: the dependency processing unit analyzes whether the micro-instruction has a data dependency on micro-instructions that have not yet finished executing; if so, the micro-instruction waits in the store queue until the dependency no longer exists, after which it is sent to the neural network operation unit and the storage unit;

Step S905: the operation unit fetches the required data from the scratchpad memory according to the address and size of that data, including the input data, the model data of the several sub-networks described above (each sub-network containing only one discretely represented weight value), and the sparse representation of each sub-network; the sparse selection module then selects, for each sub-network, the input data corresponding to its valid weights according to that sub-network's sparse representation. The discrete data are stored, for example, as shown in FIG. 3 and FIG. 4; the sparse selection module operates as described above, selecting the corresponding input data from the scratchpad memory into the device according to the position information given by the 0/1 bit string of the sparse representation;

Step S906: the operation of each sub-network corresponding to the operation instruction is then completed in the operation unit (this, too, resembles the computation above; the only difference, as in the contrast between FIG. 2 and FIG. 3, is that under the sparse representation the model data has a single sparse representation, whereas the discrete representation may produce several sub-models from one set of model data, whose partial results must be accumulated), the results of the sub-networks are added together, and the final result of the operation is written back to the storage unit.
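Steps S905 and S906 together amount to splitting the discrete weight vector into one sparse sub-network per weight value and accumulating the partial results, as in this hypothetical sketch:

```python
def split_discrete(weights):
    """One sparse sub-network, as (value, 0/1 position string), per
    distinct non-zero weight value in the discrete weight vector."""
    values = sorted({w for w in weights if w != 0})
    return [(v, [1 if w == v else 0 for w in weights]) for v in values]

def run_discrete(inputs, weights):
    """Process each sub-network as sparse data, then sum the results."""
    total = 0
    for value, positions in split_discrete(weights):
        selected = sum(x for x, bit in zip(inputs, positions) if bit)
        total += value * selected   # every weight in a sub-network is equal
    return total
```

For inputs [1, 2, 3, 4] and discrete weights [1, -1, 0, 1] this yields 1 - 2 + 4 = 3, matching the dense dot product.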

This concludes the description of the discrete neural network data processing method of the sixth embodiment of the present invention.

VII. Seventh Embodiment

Based on the neural network operation device of the third embodiment, the present invention further provides a neural network data processing method. Referring to FIG. 10, the method of this embodiment includes:

Step A: the data type judgment unit determines the type of the neural network data; if it is sparse neural network data, step B is executed; if it is discrete neural network data, step D is executed; if it is general neural network data, step G is executed;

Step B: the sparse selection unit selects, in the storage unit, the neural network data corresponding to the valid weights according to the position information of the sparse representation;

Step C: the neural network operation unit performs the neural network operation on the data selected by the sparse selection unit to obtain the operation result for the sparse neural network data, and the neural network data processing ends;

Step D: the discrete neural network data splitting unit splits the neural network model of the discrete data into N sparsely represented sub-networks, each containing only one real value, with all remaining weights equal to 0;

Step E: the sparse selection unit and the neural network operation unit process each sub-network as sparse neural network data, obtaining one partial result each;

Step F: the neural network operation unit sums the partial results of the N sub-networks to obtain the neural network operation result for the discrete data, and the neural network data processing ends;

Step G: the neural network operation unit performs the neural network operation on the general neural network data to obtain the operation result, and the neural network data processing ends.

This concludes the description of the neural network data processing method of the seventh embodiment of the present invention.

It should be noted that implementations not shown or described in the drawings or in the body of this specification take forms known to those of ordinary skill in the art and are not described in detail here. Moreover, the definitions of the elements and methods above are not limited to the specific structures, shapes, or manners mentioned in the embodiments; a person of ordinary skill in the art may modify or replace them straightforwardly, for example by mixing the three kinds of data (discrete data, sparse data, and general neural network data): the input data may be general neural network data while some layers of the neural network model data use the discrete representation and other layers use the sparse representation. Since the basic flow of the device of the present invention is illustrated with a single layer of neural network operation, while networks in real use are usually multi-layered, it is common in practice for each layer to adopt a different data type in this way.

It should also be noted that, unless specifically described or necessarily sequential, the order of the above steps is not limited to the order listed and may be varied or rearranged according to the desired design. Moreover, based on design and reliability considerations, the above embodiments may be mixed and matched with one another or with other embodiments; that is, the technical features of different embodiments may be freely combined to form further embodiments.

In summary, by reusing the sparse selection unit, the present invention efficiently supports both sparse neural networks and neural network operations on discretely represented data, reducing the amount of data the operations require and increasing data reuse during computation, thereby solving the problems of insufficient operation performance, insufficient memory access bandwidth, and excessive power consumption present in the prior art. At the same time, the dependency processing module guarantees the correct operation of the neural network while improving running efficiency and shortening running time. The invention has wide application in many fields, with strong application prospects and considerable economic value.

The specific embodiments described above further illustrate the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A neural network operation device, characterized in that it comprises a control unit, a storage unit, a sparse selection unit, and a neural network operation unit, wherein: the storage unit is configured to store neural network data; the control unit is configured to generate micro-instructions respectively directed at the sparse selection unit, the neural network operation unit, and the storage unit, and to send each micro-instruction to the corresponding unit; the sparse selection unit is configured to select, according to the micro-instruction issued to it by the control unit and the position information of the sparse representation contained therein, the neural network data corresponding to the valid weights from the neural network data stored in the storage unit to participate in the operation; and the neural network operation unit is configured to perform, according to the micro-instruction issued to it by the control unit, the neural network operation on the data selected by the sparse selection unit to obtain the operation result.

2. The neural network operation device according to claim 1, characterized in that the neural network data is discrete neural network data; the neural network operation device further comprises a discrete neural network data splitting unit configured to determine the number N of real values in the discrete neural network data and to split the neural network model of the discrete data into N sparsely represented sub-networks, each containing only one real value, with all remaining weights equal to 0; the sparse selection unit and the neural network operation unit process each sub-network as sparse neural network data, obtaining one partial result each; and the neural network operation unit is further configured to sum the partial results of the N sub-networks to obtain the neural network operation result for the discrete data.

3. The neural network operation device according to claim 2, characterized in that N = 2 or 4.

4. The neural network operation device according to claim 2, characterized in that it further comprises a data type judgment unit configured to determine the type of the neural network data, and the control unit is configured to: (a) when the neural network data is sparse neural network data, cause the sparse selection unit to select, in the storage unit, the neural network data corresponding to the valid weights according to the position information of the sparse representation, and cause the neural network operation unit to perform the neural network operation on the data selected by the sparse selection unit to obtain the operation result; and (b) when the neural network data is discrete neural network data, cause the discrete neural network data splitting unit to split the neural network model of the discrete data into N sub-networks, cause the sparse selection unit and the neural network operation unit to process each sub-network as sparse neural network data, obtaining one partial result each, and cause the neural network operation unit to sum the partial results of the N sub-networks to obtain the neural network operation result for the discrete data.

5. The neural network operation device according to claim 4, characterized in that the control unit is further configured to: (c) when the neural network data is general neural network data, disable the sparse selection unit and cause the neural network operation unit to perform the neural network operation on the general data to obtain the operation result.

6. The neural network operation device according to claim 1, characterized in that the control unit comprises: an instruction cache module for storing the neural network instructions to be executed, each instruction containing the address information of the neural network data to be processed; an instruction fetch module for fetching neural network instructions from the instruction cache module; a decoding module for decoding the neural network instructions into micro-instructions respectively directed at the storage unit, the sparse selection unit, and the neural network operation unit, each micro-instruction containing the address information of the corresponding neural network data; an instruction queue for storing the decoded micro-instructions; a scalar register file for storing the address information of the neural network data to be processed; and a dependency processing module for judging whether a micro-instruction in the instruction queue accesses the same data as the preceding micro-instruction: if so, the micro-instruction is held in a store queue and issued to its unit only after the preceding micro-instruction has finished; otherwise, it is issued to its unit directly.

7. The neural network operation device according to any one of claims 1 to 6, characterized in that it is applied in the following scenarios: electronic products, vehicles, household appliances, or medical equipment, wherein: the electronic product is one of the following group: data processing devices, robots, computers, printers, scanners, telephones, tablet computers, intelligent terminals, mobile phones, driving recorders, navigators, sensors, webcams, cloud servers, cameras, video cameras, projectors, watches, earphones, mobile storage, and wearable devices; the vehicle is one of the following group: airplanes, ships, and road vehicles; the household appliance is one of the following group: televisions, air conditioners, microwave ovens, refrigerators, rice cookers, humidifiers, washing machines, electric lamps, gas stoves, and range hoods; and the medical equipment is one of the following group: nuclear magnetic resonance instruments, B-mode ultrasound scanners, and electrocardiographs.

8. A neural network data processing method using the neural network operation device according to claim 2, characterized in that it comprises: step D, in which the discrete neural network data splitting unit splits the neural network model of the discrete data into N sparsely represented sub-networks, each containing only one real value, with all remaining weights equal to 0; step E, in which the sparse selection unit and the neural network operation unit process each sub-network as sparse neural network data, obtaining one partial result each; and step F, in which the neural network operation unit sums the partial results of the N sub-networks to obtain the neural network operation result for the discrete data, and the neural network data processing ends.

9. A neural network data processing method using the neural network operation device according to claim 5, characterized in that it comprises: step A, in which the data type judgment unit determines the type of the neural network data, executing step B if it is sparse neural network data, step D if it is discrete neural network data, and step G if it is general neural network data; step B, in which the sparse selection unit selects, in the storage unit, the neural network data corresponding to the valid weights according to the position information of the sparse representation; step C, in which the neural network operation unit performs the neural network operation on the data selected by the sparse selection unit to obtain the operation result for the sparse neural network data, and the processing ends; step D, in which the discrete neural network data splitting unit splits the neural network model of the discrete data into N sparsely represented sub-networks, each containing only one real value, with all remaining weights equal to 0; step E, in which the sparse selection unit and the neural network operation unit process each sub-network as sparse neural network data, obtaining one partial result each; step F, in which the neural network operation unit sums the partial results of the N sub-networks to obtain the neural network operation result for the discrete data, and the processing ends; and step G, in which the neural network operation unit performs the neural network operation on the general neural network data to obtain the operation result, and the processing ends.

10. The neural network data processing method according to claim 8 or 9, characterized in that the control unit comprises an instruction cache module, an instruction fetch module, a decoding module, an instruction queue, a scalar register file, and a dependency processing module, and that before step A the method further comprises: the instruction fetch module fetches a neural network instruction from the instruction cache module and sends it to the decoding module; the decoding module decodes the instruction into micro-instructions respectively directed at the storage unit, the sparse selection unit, and the neural network operation unit, and sends each micro-instruction to the instruction queue; the neural network operation opcode and operands of the micro-instruction are then obtained from the scalar register file, after which the micro-instruction is sent to the dependency processing unit; and the dependency processing unit analyzes whether the micro-instruction has a data dependency on micro-instructions that have not yet finished executing; if so, the micro-instruction waits in the store queue until the dependency no longer exists, after which it is sent to the neural network operation unit, the sparse selection unit, and the storage unit.
PCT/CN2016/100784 2016-09-29 2016-09-29 Neural network computation apparatus and method Ceased WO2018058427A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/100784 WO2018058427A1 (en) 2016-09-29 2016-09-29 Neural network computation apparatus and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/100784 WO2018058427A1 (en) 2016-09-29 2016-09-29 Neural network computation apparatus and method

Publications (1)

Publication Number Publication Date
WO2018058427A1 true WO2018058427A1 (en) 2018-04-05

Family

ID=61763580

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/100784 Ceased WO2018058427A1 (en) 2016-09-29 2016-09-29 Neural network computation apparatus and method

Country Status (1)

Country Link
WO (1) WO2018058427A1 (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105184366A (en) * 2015-09-15 2015-12-23 中国科学院计算技术研究所 Time-division-multiplexing general neural network processor
CN105488563A (en) * 2015-12-16 2016-04-13 重庆大学 Sparse adaptive neural network, algorithm and implementation device for deep learning
CN105512723A (en) * 2016-01-20 2016-04-20 南京艾溪信息科技有限公司 An artificial neural network computing device and method for sparse connections
US20160196488A1 (en) * 2013-08-02 2016-07-07 Byungik Ahn Neural network computing device, system and method

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101273B (en) * 2018-02-05 2023-08-25 上海寒武纪信息科技有限公司 Neural network processing device and method for executing vector maximum value instruction
CN109101273A (en) * 2018-02-05 2018-12-28 上海寒武纪信息科技有限公司 Processing with Neural Network device and its method for executing vector maximization instruction
CN109032669A (en) * 2018-02-05 2018-12-18 上海寒武纪信息科技有限公司 Processing with Neural Network device and its method for executing the instruction of vector minimum value
CN109032669B (en) * 2018-02-05 2023-08-29 上海寒武纪信息科技有限公司 Neural network processing device and method for executing vector minimum instruction
CN111222632A (en) * 2018-11-27 2020-06-02 中科寒武纪科技股份有限公司 Computing device, computing method and related product
CN111523655A (en) * 2019-02-03 2020-08-11 上海寒武纪信息科技有限公司 Processing apparatus and method
CN111523655B (en) * 2019-02-03 2024-03-29 上海寒武纪信息科技有限公司 Processing device and method
CN111767995A (en) * 2019-04-02 2020-10-13 上海寒武纪信息科技有限公司 Computing method, device and related products
CN111767995B (en) * 2019-04-02 2023-12-05 上海寒武纪信息科技有限公司 Computing methods, devices and related products
CN111860796A (en) * 2019-04-30 2020-10-30 上海寒武纪信息科技有限公司 Computing method, device and related products
CN111860796B (en) * 2019-04-30 2023-10-03 上海寒武纪信息科技有限公司 Computing methods, devices and related products
CN111985634A (en) * 2020-08-21 2020-11-24 北京灵汐科技有限公司 Neural network computing method, device, computer equipment and storage medium
CN114861897A (en) * 2021-02-05 2022-08-05 三星电子株式会社 Neural network operation device and method of operating the same

Similar Documents

Publication Publication Date Title
CN110298443B (en) Neural network operation device and method
WO2018058427A1 (en) Neural network computation apparatus and method
CN109284823B (en) A computing device and related products
CN111260025B (en) Device and operation method for performing LSTM neural network operations
CN111857819B (en) Apparatus and method for performing matrix add/subtract operation
CN110689138A (en) Operation method, device and related product
TW201805858A (en) Apparatus and method for performing neural network operation
CN110163357A (en) A computing device and method
CN111126590B (en) A device and method for artificial neural network calculation
WO2018120016A1 (en) Apparatus for executing lstm neural network operation, and operational method
CN107315563B (en) An apparatus and method for performing vector comparison operations
CN109754062B (en) Execution method of convolution expansion instruction and related product
CN108171328B (en) A neural network processor and a convolution operation method performed by the same
CN111381871A (en) Computing method, device and related products
CN111047022A (en) A computing device and related products
CN109711540B (en) Computing device and board card
CN107305486B (en) Neural network maxout layer computing device
WO2017185248A1 (en) Apparatus and method for performing auto-learning operation of artificial neural network
CN107329733B (en) Apparatus and method for performing posing operations
KR102467544B1 (en) arithmetic device and its operating method
CN114692824B (en) A quantized training method, device and equipment for a neural network model
CN115437603B (en) Method for generating random numbers and related products
CN112766475A (en) Processing unit and artificial intelligence processor
CN114692845B (en) Data processing device, data processing method and related products
CN111401536A (en) Computing method, device and related products

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16917181

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16917181

Country of ref document: EP

Kind code of ref document: A1