CN110097186B - A Neural Network Heterogeneous Quantization Training Method - Google Patents
A Neural Network Heterogeneous Quantization Training Method
Info
- Publication number
- CN110097186B (application CN201910354693.7A)
- Authority
- CN
- China
- Prior art keywords
- training
- quantization
- data
- neural network
- acceleration module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Feedback Control In General (AREA)
Abstract
The invention provides a neural network heterogeneous quantization training method, belonging to the technical field of artificial neural networks. On the basis of a conventional training architecture based on a CPU, a GPU, or a combination of the two, the invention adds high-speed interface logic and connects a hardware computing acceleration module through the high-speed interface logic. One or more specific computation steps in the middle of the training process are offloaded to the hardware computing acceleration module; after the computation is completed, the result is returned to the original training host through the high-speed interface logic, completing a training process with specific customized functions. Cutting-edge new structures or new algorithms can thus be quickly implemented and deployed into training, improving system flexibility, reducing storage and bandwidth requirements, reducing resource requirements during forward prediction, lowering training complexity, improving training efficiency, and ensuring that the current training apparatus adapts well to the latest neural network structures.
Description
Technical Field
The invention relates to the technical field of artificial neural networks, and in particular to a neural network heterogeneous quantization training method.
Background
Neural network training feeds a set of training samples into the network and adjusts the weights according to the difference between the actual output of the network and the expected output. The training process includes defining the structure of the neural network and the output of forward propagation, computing the error between the result and the expected value, propagating the error back layer by layer, and then updating the weights. The network weights are thus adjusted using the training samples and the expected values.
The CPU excels at logic control, serial operation, and general-purpose data operations, while the GPU focuses on handling many tasks of large-scale parallel computation. CPUs and GPUs can each complete tasks efficiently in their respective fields and together form the mainstream approach to current neural network training.
As research deepens, more and more new structures and new algorithms are continually being proposed, placing higher requirements and challenges on general-purpose CPU and GPU training methods: specific detailed structures are difficult to implement quickly, and training time may become even longer.
Summary of the Invention
In order to solve the above technical problems, the present invention proposes a neural network heterogeneous quantization training method. A heterogeneous approach is used to accelerate the original training process, so that cutting-edge new structures, such as special convolution types, or new algorithms, such as model parameter quantization, can be quickly implemented and deployed into training, improving system flexibility, reducing storage and bandwidth requirements, reducing resource requirements during forward prediction, lowering training complexity, improving training efficiency, and ensuring that the current training apparatus adapts well to the latest neural network structures.
The technical solution of the present invention is as follows:
A neural network heterogeneous quantization training method: on the basis of a conventional training architecture based on a CPU, a GPU, or a combination of the two, high-speed interface logic is added, a hardware quantization acceleration module is connected through the high-speed interface logic, and a quantization step is added to the training process. The quantization computation of the model parameters and the feature map results is offloaded to the hardware quantization acceleration module; the results of the completed quantization computation are returned to the original training host through the high-speed interface logic, the quantized model parameters are updated, and the training process with model parameter and feature map quantization is completed iteratively.
Further, the hardware quantization acceleration module is responsible for the low-bit quantization of the neural network model parameters and the neural network feature map results; it is implemented by a dedicated circuit and forms a heterogeneous structure with the conventional training host CPU or GPU.
Further, the data quantization operations include: data buffering, statistical sorting of data, data compression and decompression, data hashing and table lookup, conversion of floating-point numbers to fixed-point numbers of a specified bit width, shifting, scaling and truncation of floating-point numbers, and inverse quantization of data.
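As an illustration of the shift, scale, clip, and inverse-quantization operations listed above, the following is a minimal Python/NumPy sketch; the bit widths, rounding choices, and function names are illustrative assumptions and are not specified by the patent.

```python
import numpy as np

def quantize_fixed_point(x, num_bits=8, frac_bits=4):
    """Shift/scale a float array to signed fixed point with `frac_bits`
    fractional bits, clipping to the representable maximum and minimum."""
    scale = 2 ** frac_bits
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    return np.clip(np.round(x * scale), qmin, qmax).astype(np.int32)

def dequantize_fixed_point(q, frac_bits=4):
    """Inverse quantization: map fixed-point values back to floats."""
    return q.astype(np.float32) / (2 ** frac_bits)

w = np.random.randn(4, 4).astype(np.float32)
w_hat = dequantize_fixed_point(quantize_fixed_point(w))
print(np.abs(w - w_hat).max())  # rounding error is at most 2**-(frac_bits+1) inside the clip range
```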
The specific customized functions include but are not limited to: model parameter quantization, conversion of floating-point numbers to fixed-point numbers, and special convolution operations such as dilated convolution, depth-wise convolution, 1x1 multiplier arrays, and fully connected multiply-accumulate arrays. The specific customized functions are implemented by the hardware computing acceleration module, which may be enabled multiple times or only once during the training process to complete a specific function.
Specifically, the method includes the following steps:
1) Under a conventional CPU- or GPU-based training framework, set the initial values of the neural network model parameters and hyperparameters, initialize the hardware quantization acceleration module at the same time, and start training;
2) After the first round of backpropagation has updated the parameters of the last layer of the neural network, pass the updated weight parameters into the hardware quantization acceleration module, where they are first compressed and stored using a general-purpose data compression method such as GZIP or entropy coding; the data are then statistically sorted, further shifted and truncated according to the desired fixed-point bit width, and clipped to maximum and minimum values, yielding the quantized weight parameters. These are returned to the conventional framework, which continues the backpropagation update of the preceding layer's parameters, until the first round of backpropagation is completed and all weight parameters have been obtained (a sketch of this quantization pipeline follows this list);
3) Repeat step 2 to update the weights and complete multiple rounds of backpropagation until the model loss requirement is met, at which point training is complete;
4) In addition to the weight parameters, the feature map results of each layer of the neural network can also be quantized, so as to further quantize the inference of the entire model;
5) As needed, the weight data can be hashed to obtain index values for further compression, or the data can be inverse-quantized immediately after quantization to reduce quantization loss.
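The weight-quantization pipeline of step 2 (compress and store, statistically sort, then shift and truncate to a fixed-point range) could be modelled on the host as in the sketch below. GZIP stands in for any general-purpose codec, and the percentile-based clipping bounds are an assumption, since the patent does not fix how the statistical sorting selects the range.

```python
import gzip
import numpy as np

def accelerator_quantize_layer(weights, num_bits=8):
    """Host-side model of what the hardware quantization module does to one
    layer's updated weights: initial compression and storage, statistical
    sorting to bound the value range, then shift/scale and truncation."""
    # Initial compression of the incoming weights (GZIP as a generic codec).
    stored = gzip.compress(weights.astype(np.float32).tobytes())

    # Statistical sorting: choose clipping bounds from the sorted values
    # (here the 1st/99th percentiles) to limit the maximum and minimum.
    lo, hi = np.percentile(weights, [1, 99])
    clipped = np.clip(weights, lo, hi)

    # Shift/scale to the desired fixed-point bit width and truncate.
    qmax = 2 ** (num_bits - 1) - 1
    scale = qmax / max(abs(lo), abs(hi), 1e-12)
    quantized = np.round(clipped * scale).astype(np.int8)
    return quantized, scale, stored

q, s, blob = accelerator_quantize_layer(np.random.randn(64, 64))
```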
The hardware computing acceleration module is implemented with an FPGA or ACAP through logic configuration and is connected to an external non-volatile memory device; different customized functions can be stored at the same time, and the FPGA or ACAP is configured in real time according to the training requirements to complete different functions within the same training process.
The high-speed interface logic is implemented by means including, but not limited to, a PCIE interface, a USB 3.0 interface, or a 10 Gigabit Ethernet interface, and communicates with the original training host.
The beneficial effects of the present invention are as follows:
A heterogeneous approach is used to accelerate the original training process, so that cutting-edge new structures, such as special convolution types, or new algorithms, such as model parameter quantization, can be quickly implemented and deployed into training, improving system flexibility, reducing storage and bandwidth requirements, reducing resource requirements during forward prediction, lowering training complexity, improving training efficiency, and ensuring that the current training apparatus adapts well to the latest neural network structures.
Detailed Description of Embodiments
In order to make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
In the neural network heterogeneous quantization training method of the present invention, on the basis of a conventional training architecture based on a CPU, a GPU, or a combination of the two, high-speed interface logic is added, a hardware quantization acceleration module is connected through the high-speed interface logic, and a quantization step is added to the training process. The quantization computation of the model parameters and the feature map results is offloaded to the hardware quantization acceleration module; the results of the completed quantization computation are returned to the original training host through the high-speed interface logic, the quantized model parameters are updated, and the training process with model parameter and feature map quantization is completed iteratively.
The hardware quantization acceleration module is responsible for the low-bit quantization of the neural network model parameters and the neural network feature map results; it is implemented by a dedicated circuit and forms a heterogeneous structure with the conventional training host CPU or GPU. The data quantization operations include: data buffering, statistical sorting of data, data compression and decompression, data hashing and table lookup, conversion of floating-point numbers to fixed-point numbers of a specified bit width, shifting, scaling and truncation of floating-point numbers, and inverse quantization of data.
The method includes the following steps:
1) Under a conventional CPU- or GPU-based training framework, set the initial values of the neural network model parameters and hyperparameters, initialize the hardware quantization acceleration module at the same time, and start training;
2) After the first round of backpropagation has updated the parameters of the last layer of the neural network, pass the updated weight parameters into the hardware quantization acceleration module, where they are first compressed and stored using a general-purpose data compression method such as GZIP or entropy coding; the data are then statistically sorted, further shifted and truncated according to the desired fixed-point bit width, and clipped to maximum and minimum values, yielding the quantized weight parameters. These are returned to the conventional framework, which continues the backpropagation update of the preceding layer's parameters, until the first round of backpropagation is completed and all weight parameters have been obtained;
3) Repeat step 2 to update the weights and complete multiple rounds of backpropagation until the model loss requirement is met, at which point training is complete;
4) In addition to the weight parameters, the feature map results of each layer of the neural network can also be quantized, so as to further quantize the inference of the entire model;
5) As needed, the weight data can be hashed to obtain index values for further compression (see the sketch below), or the data can be inverse-quantized immediately after quantization to reduce quantization loss.
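Step 5's hashing of weights into index values plus a lookup table can be pictured with the following sketch; the value-range bucketing hash and the use of bucket means as table entries are illustrative assumptions rather than the patent's prescribed hash function.

```python
import numpy as np

def hash_index_compress(weights, num_buckets=16):
    """Map each weight to a small index via a value-range hash and keep one
    representative value per bucket, so only the indices and a short lookup
    table need to be stored."""
    w = weights.ravel()
    edges = np.linspace(w.min(), w.max(), num_buckets + 1)
    idx = np.clip(np.digitize(w, edges) - 1, 0, num_buckets - 1)
    table = np.array([w[idx == b].mean() if np.any(idx == b) else 0.0
                      for b in range(num_buckets)])
    return idx.astype(np.uint8).reshape(weights.shape), table

def hash_index_decompress(idx, table):
    """Table lookup, i.e. the inverse of the hashing/indexing step."""
    return table[idx]

idx, table = hash_index_compress(np.random.randn(8, 8))
restored = hash_index_decompress(idx, table)
```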
The hardware computing acceleration module is implemented with an FPGA or ACAP through logic configuration and is connected to an external non-volatile memory device; different customized functions can be stored at the same time, and the FPGA or ACAP is configured in real time according to the training requirements to complete different functions within the same training process. The high-speed interface logic is implemented by means including, but not limited to, a PCIE interface, a USB 3.0 interface, or a 10 Gigabit Ethernet interface, and communicates with the original training host.
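Putting the pieces together, the end-to-end flow of the embodiment — ordinary host-side forward/backward passes with the quantization step offloaded after each weight update — could look like the following toy sketch. The single-layer model, learning rate, and the `quantize_on_accelerator` stand-in for the module behind the high-speed interface are all illustrative assumptions, not the patent's API.

```python
import numpy as np

def quantize_on_accelerator(w, num_bits=8):
    """Stand-in for the hardware quantization module reached over the
    high-speed interface (hypothetical helper, not the patent's API)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = qmax / (np.abs(w).max() + 1e-12)
    return np.round(w * scale) / scale  # quantize, then inverse-quantize for continued training

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8))
y = x @ rng.normal(size=(8, 1))
w = rng.normal(size=(8, 1))
for epoch in range(200):
    err = x @ w - y                      # forward pass and error against the expected output
    grad = x.T @ err / len(x)            # backpropagated gradient
    w -= 0.1 * grad                      # ordinary weight update on the host
    w = quantize_on_accelerator(w)       # quantization step offloaded to the module
print(float(np.mean((x @ w - y) ** 2)))  # training loss after quantization-aware updates
```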
The above descriptions are only preferred embodiments of the present invention; they are intended only to illustrate the technical solution of the present invention and not to limit its protection scope. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention is included within the protection scope of the present invention.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910354693.7A CN110097186B (en) | 2019-04-29 | 2019-04-29 | A Neural Network Heterogeneous Quantization Training Method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910354693.7A CN110097186B (en) | 2019-04-29 | 2019-04-29 | A Neural Network Heterogeneous Quantization Training Method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110097186A CN110097186A (en) | 2019-08-06 |
CN110097186B true CN110097186B (en) | 2023-04-18 |
Family
ID=67446342
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910354693.7A Active CN110097186B (en) | 2019-04-29 | 2019-04-29 | A Neural Network Heterogeneous Quantization Training Method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110097186B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111582476B (en) * | 2020-05-09 | 2024-08-02 | 北京百度网讯科技有限公司 | Automatic quantization strategy searching method, device, equipment and storage medium |
CN111598237B (en) * | 2020-05-21 | 2024-06-11 | 上海商汤智能科技有限公司 | Quantization training, image processing method and device, and storage medium |
CN112258377A (en) * | 2020-10-13 | 2021-01-22 | 国家计算机网络与信息安全管理中心 | Method and equipment for constructing robust binary neural network |
CN114692861A (en) * | 2020-12-28 | 2022-07-01 | 华为技术有限公司 | Computation graph updating method, computation graph processing method and related equipment |
CN112308215B (en) * | 2020-12-31 | 2021-03-30 | 之江实验室 | Intelligent training acceleration method and system based on data sparse characteristic in neural network |
CN113033784A (en) * | 2021-04-18 | 2021-06-25 | 沈阳雅译网络技术有限公司 | Method for searching neural network structure for CPU and GPU equipment |
CN114611697B (en) * | 2022-05-11 | 2022-09-09 | 上海登临科技有限公司 | Neural network quantification and deployment method, system, electronic device and storage medium |
CN116451757B (en) * | 2023-06-19 | 2023-09-08 | 山东浪潮科学研究院有限公司 | Heterogeneous acceleration method, heterogeneous acceleration device, heterogeneous acceleration equipment and heterogeneous acceleration medium for neural network model |
CN116911350B (en) * | 2023-09-12 | 2024-01-09 | 苏州浪潮智能科技有限公司 | Quantification method based on graph neural network model, task processing method and task processing device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108280514A (en) * | 2018-01-05 | 2018-07-13 | 中国科学技术大学 | Sparse neural network acceleration system based on FPGA and design method |
CN109635936A (en) * | 2018-12-29 | 2019-04-16 | 杭州国芯科技股份有限公司 | A kind of neural networks pruning quantization method based on retraining |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8131659B2 (en) * | 2008-09-25 | 2012-03-06 | Microsoft Corporation | Field-programmable gate array based accelerator system |
- 2019-04-29: CN application CN201910354693.7A filed; patent CN110097186B granted (status: Active)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108280514A (en) * | 2018-01-05 | 2018-07-13 | 中国科学技术大学 | Sparse neural network acceleration system based on FPGA and design method |
CN109635936A (en) * | 2018-12-29 | 2019-04-16 | 杭州国芯科技股份有限公司 | A kind of neural networks pruning quantization method based on retraining |
Non-Patent Citations (2)
Title |
---|
A Concise and Efficient Method for Accelerating Convolutional Neural Networks; Liu Jinfeng; Science Technology and Engineering; 2014-11-28 (No. 33); full text *
Design of a GPU-Based Parallel Quasi-Newton Neural Network Training Algorithm; Liu Qiang et al.; Journal of Hohai University (Natural Sciences); 2018-09-25 (No. 05); full text *
Also Published As
Publication number | Publication date |
---|---|
CN110097186A (en) | 2019-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110097186B (en) | A Neural Network Heterogeneous Quantization Training Method | |
Mills et al. | Communication-efficient federated learning for wireless edge intelligence in IoT | |
CN110378468B (en) | A neural network accelerator based on structured pruning and low-bit quantization | |
CN113315604B (en) | Adaptive gradient quantization method for federated learning | |
CN111339027B (en) | Reconfigurable artificial intelligence core and automatic design method for heterogeneous multi-core chips | |
CN113595993B (en) | A joint learning method for vehicle sensing equipment based on model structure optimization under edge computing | |
CN111182582B (en) | Multitask distributed unloading method facing mobile edge calculation | |
CN107330515A (en) | A device and method for performing forward operation of artificial neural network | |
CN108304928A (en) | Compression method based on the deep neural network for improving cluster | |
WO2020238237A1 (en) | Power exponent quantization-based neural network compression method | |
CN111158912B (en) | Task unloading decision method based on deep learning in cloud and fog collaborative computing environment | |
CN109635935A (en) | Depth convolutional neural networks model adaptation quantization method based on the long cluster of mould | |
Li et al. | Anycostfl: Efficient on-demand federated learning over heterogeneous edge devices | |
CN105553937A (en) | System and method for data compression | |
CN111199740B (en) | Unloading method for accelerating automatic voice recognition task based on edge calculation | |
CN112733863B (en) | Image feature extraction method, device, equipment and storage medium | |
CN116962176B (en) | A distributed cluster data processing method, device, system and storage medium | |
CN111488981A (en) | A method for selecting the parameter sparsity threshold of deep network based on Gaussian distribution estimation | |
Struharik et al. | Conna–compressed cnn hardware accelerator | |
CN108985444A (en) | A kind of convolutional neural networks pruning method inhibited based on node | |
CN116306912A (en) | A Model Parameter Update Method for Distributed Federated Learning | |
CN109978144A (en) | A kind of model compression method and system | |
CN110782396B (en) | Light-weight image super-resolution reconstruction network and reconstruction method | |
CN111260049A (en) | A neural network implementation method based on domestic embedded system | |
CN114997382B (en) | Low hardware overhead convolution computing structure and calculation method for lightweight neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right |
Effective date of registration: 20230320
Address after: 250000 Building S02, No. 1036, Langchao Road, High-tech Zone, Jinan City, Shandong Province
Applicant after: Shandong Inspur Scientific Research Institute Co., Ltd.
Address before: 250100 First Floor of R&D Building, 2877 Kehang Road, Sun Village Town, Jinan High-tech Zone, Shandong Province
Applicant before: JINAN INSPUR HIGH-TECH TECHNOLOGY DEVELOPMENT Co., Ltd.
TA01 | Transfer of patent application right | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20250717
Address after: 250000 Shandong Province, Jinan City, China (Shandong) Free Trade Pilot Zone, Shunhua Road Street, Inspur Road 1036, Building S01, 5th Floor
Patentee after: Yuanqixin (Shandong) Semiconductor Technology Co., Ltd.
Country or region after: China
Address before: 250000 Building S02, No. 1036, Langchao Road, High-tech Zone, Jinan City, Shandong Province
Patentee before: Shandong Inspur Scientific Research Institute Co., Ltd.
Country or region before: China
TR01 | Transfer of patent right |