TWI774411B - Model compression method and model compression system - Google Patents
Model compression method and model compression system
- Publication number
- TWI774411B
- Authority
- TW
- Taiwan
- Prior art keywords
- model
- output
- output data
- compressed
- similarity
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
Description
The present invention relates to model compression, and more particularly to a model compression method and model compression system that adjust a model pruning operation through a reinforcement learning mechanism using similarity as the reward.
One technique in model compression is to use an existing model (the teacher model) to train a smaller model (the student model). The teacher model usually has a large number of parameters and is difficult to deploy on existing equipment, so this approach trains a small model with similar capability and deploys it to mobile devices. However, in most such approaches the parameters of the student model must still be designed manually, so there is an urgent need for a method that can automatically find a suitable student model (that is, the compressed model).
Therefore, one of the objectives of the present invention is to propose a model compression method and model compression system that perform a model pruning operation through a reinforcement learning mechanism using similarity as the reward.
In an embodiment of the present invention, a model compression method is disclosed. The model compression method includes: performing a model pruning operation on an original model having a deep neural network architecture to generate a compressed model; inputting the same test data into the original model and the compressed model respectively; calculating the similarity between first output data obtained by the original model processing the test data and second output data obtained by the compressed model processing the test data; and using the similarity as a reward, determining through reinforcement learning how to further adjust the model pruning operation.
In another embodiment of the present invention, a model compression system is disclosed. The model compression system includes a storage device and a processor. The storage device is used to store a program code. The processor is used to load and execute the program code to perform the following operations: performing a model pruning operation on an original model having a deep neural network architecture to generate a compressed model; inputting the same test data into the original model and the compressed model respectively; calculating the similarity between first output data obtained by the original model processing the test data and second output data obtained by the compressed model processing the test data; and using the similarity as a reward, determining through reinforcement learning how to further adjust the model pruning operation.
The model compression method of the present invention uses similarity as the basis for model pruning (model compression), so users do not need to provide labeled data as test data, which reduces the cost and time of data labeling. In addition, users do not need to provide test source code and can directly input a model for compression, so the application of model compression can be promoted effectively. Moreover, the generalization features retained after compression are less prone to overfitting.
FIG. 1 is a schematic diagram of a model compression system according to an embodiment of the present invention. As shown in FIG. 1, the model compression system 100 includes a processor 102 and a storage device 104. The storage device 104 is used to store a program code Code_MC; for example, the storage device 104 may be a conventional hard disk, a solid-state drive, a memory, and so on, but the present invention is not limited thereto. The processor 102 can load and execute the program code Code_MC to perform the steps of the model compression method shown in FIG. 2.
Please refer to FIG. 2 and FIG. 3 together. FIG. 2 is a flowchart of a model compression method according to an embodiment of the present invention. FIG. 3 is a schematic diagram of the operation of the model compression method shown in FIG. 2. Please note that, provided the same result can be obtained, the model compression method does not have to be executed strictly in the order of the steps shown in FIG. 2; in addition, according to design requirements and/or application requirements, the model compression method may also be modified to add other steps. In step 202, a model pruning operation is performed on an original model 302 having a deep neural network architecture to generate a compressed model 304. For example, the original model 302 may be a model trained on a convolutional neural network (CNN) architecture, and the goal of model pruning is to retain only the important weights and delete the weights with less influence. In other words, compared with the original model 302, the compressed model 304 has fewer parameters, which reduces computation cost and storage space; as a result, the compressed model 304 can be deployed to products with limited computing power, such as mobile phones, edge devices, and so on. In addition, the model pruning operation of the present invention also aims to make the output of the compressed model 304 approach the output of the original model 302 as closely as possible, as detailed later.
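To make the parameter-count reduction concrete, the following back-of-the-envelope sketch (an illustrative assumption, not part of the disclosed embodiment) counts weights and biases of a 3-layer convolution stack, assuming 3×3 kernels and a 3-channel input, before and after pruning the channel sizes [32, 64, 128] down to [12, 38, 38] as in the worked example given later in the description:

```python
def conv_params(channels, in_ch=3, k=3):
    """Total weights + biases of a stack of k x k convolution layers."""
    total, prev = 0, in_ch
    for c in channels:
        total += prev * c * k * k + c  # kernel weights plus one bias per channel
        prev = c
    return total

before = conv_params([32, 64, 128])  # original channel sizes
after = conv_params([12, 38, 38])    # pruned channel sizes
print(before, after, round(after / before, 3))  # → 93248 17512 0.188
```

Under these assumptions the pruned stack keeps under a fifth of the original parameters, which is the kind of saving that makes deployment to phones and edge devices feasible.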
In step 204, the same test data 308 is input into the original model 302 and the compressed model 304 respectively for processing. In other words, based on the same test data 308, the output of the original model 302 and the output of the compressed model 304 can be used to evaluate whether the compressed model 304 is similar to the original model 302.
In step 206, the similarity between the output data D1 obtained by the original model 302 processing the test data 308 and the output data D2 obtained by the compressed model 304 processing the test data 308 is calculated. The value of this similarity therefore indicates whether the output features of the pruned model are similar to the output features of the model before pruning.
In step 208, the similarity calculated in step 206 is used as a reward, and reinforcement learning is used to determine how to further adjust the model pruning operation. For example, the agent of the reinforcement learning may adopt the deep deterministic policy gradient (DDPG) algorithm to decide the action to be taken, where the action selects the parts to be compressed, thereby adjusting the model pruning operation. In other embodiments, other algorithms (such as the Truncated Natural Policy Gradient (TNPG) algorithm or the Cross Entropy Method (CEM)) may also be used to decide the action to be taken.
For example, suppose the original model 302 input by the user is a structure composed of 3 convolution layers with channel sizes of [32, 64, 128] respectively. The initially initialized agent 306 gives initial per-layer compression ratios of [60%, 40%, 70%] according to the model information of the original model 302 (including, for example, the input size, the kernel size of each layer, the number of floating-point operations of each layer, and so on). The model compression method of the present invention thus uses reinforcement learning to compress the original model 302 such that the compressed model 304 has channel sizes of [12, 38, 38], obtaining a similarity of 0.3. Next, this similarity of 0.3 is fed back to the agent 306, which determines the next compression direction (for example, adjusting the parts to be compressed) and adjusts the desired per-layer compression ratios based on the similarity and the model information. The subsequent model compression operation compresses the original model 302 through the adjusted model pruning operation, such that the compressed model 304 has channel sizes of [14, 32, 64], obtaining a similarity of 0.4. The above model compression operation is executed iteratively, so as to obtain a compressed model 304 with higher similarity through reinforcement learning.
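The iterative loop described above can be sketched end-to-end on a toy problem. In the sketch below the "model" is just a linear map, pruning zeroes the smallest weights, and a trivial grid search over compression ratios stands in for the DDPG agent; every name and the whole setup are illustrative assumptions, not the patent's implementation:

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient between two output vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def run_model(weights, data):
    """Toy 'model': a linear map; one output value per input row."""
    return [sum(w * x for w, x in zip(weights, row)) for row in data]

def prune(weights, ratio):
    """Simulate pruning by zeroing the smallest-magnitude weights."""
    k = int(len(weights) * ratio)
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

random.seed(0)
original = [random.uniform(-1, 1) for _ in range(16)]
test_data = [[random.uniform(-1, 1) for _ in range(16)] for _ in range(8)]

# The method's loop: prune, run both models on the same unlabeled test
# data, score the similarity of the two outputs, keep the best reward.
best_ratio, best_reward = None, -2.0
for ratio in (0.25, 0.5, 0.75):
    compressed = prune(original, ratio)
    reward = pearson(run_model(original, test_data),
                     run_model(compressed, test_data))
    if reward > best_reward:
        best_ratio, best_reward = ratio, reward
print(best_ratio, round(best_reward, 3))
```

Note that no label ever appears in the loop: the reward compares the two models' outputs directly, which is the point of the similarity-based reward.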
In an embodiment of the present invention, the model compression method may implement model pruning with reference to the known AutoML for Model Compression (AMC) architecture, but is not limited thereto; those skilled in the art will know various other model compression methods, which are not repeated here. The conventional model compression architecture uses the output of the compressed model as the reward, so that reinforcement learning can determine how to further compress the original model. More specifically, the conventional architecture adopts accuracy as the reward of the reinforcement learning agent. To calculate accuracy, the user must provide labeled data as the test data fed into the compressed model, so that the accuracy of the compressed model's output can be determined from the information the labels provide. For the user, however, labeling data is quite time-consuming and labor-intensive. Furthermore, when calculating accuracy, the maximum value of the compressed model's output is generally compared with the label; the accuracy calculation therefore ignores all values in the compressed model's output other than the maximum. When the input data is difficult to classify, the values in the compressed model's output will be very close to one another, and using accuracy alone as the reward of the reinforcement learning agent may make the model overconfident and lose the ability to judge some features, resulting in overfitting-like behavior or outputs that differ from the original model. Moreover, in order to know which accuracy algorithm should be used, the conventional architecture also requires the user to provide test source code.
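The argmax blind spot described above can be seen in a small numeric sketch (both probability vectors are hypothetical): two softmax outputs that predict the same class count as equally "accurate", yet a similarity measure still registers how much of the confidence structure was lost:

```python
import math

def cosine(x, y):
    """Cosine similarity between two output vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) *
                  math.sqrt(sum(b * b for b in y)))

# Hypothetical softmax outputs of the original and compressed models.
# Both predict class 0, so accuracy sees no difference between them.
original_out = [0.90, 0.06, 0.04]
compressed_out = [0.40, 0.35, 0.25]
same_prediction = (original_out.index(max(original_out)) ==
                   compressed_out.index(max(compressed_out)))
print(same_prediction)                          # → True
print(round(cosine(original_out, compressed_out), 3))  # → 0.737, well below 1.0
```

An accuracy reward would score this pair as a perfect match, while the similarity reward penalizes the flattened distribution, which is the behavior the invention relies on.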
Compared with the conventional architecture, which adopts accuracy as the reward of the reinforcement learning agent, the model compression method of the present invention instead uses similarity as that reward, and adjusts model pruning through reinforcement learning (for example, the DDPG algorithm). The primary purpose of model pruning (model compression) is to make the compressed model 304 similar to the original model 302 provided by the user; therefore, the model compression method of the present invention can compare the similarity of the output data D1 of the original model 302 and the output data D2 of the compressed model 304, and use it as the basis for model pruning (model compression).
In an embodiment of the present invention, the similarity can be obtained by calculating the Pearson's correlation coefficient between the output of the original model X and the output of the compressed model Y, for example:
Output matrix of X = [1.0, 2.0, 3.0]
Output matrix of Y = [2.0, 20.0, 38.0]
Pearson's correlation coefficient ρ(X,Y) = cov(X,Y) / (σ_X · σ_Y) = 1.0
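As a hedged sketch, the coefficient for the example vectors above can be reproduced in a few lines of plain Python (the function name is ours, not the patent's):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (math.sqrt(sum((a - mx) ** 2 for a in x)) *
           math.sqrt(sum((b - my) ** 2 for b in y)))
    return num / den

X = [1.0, 2.0, 3.0]    # output of the original model
Y = [2.0, 20.0, 38.0]  # output of the compressed model
print(pearson(X, Y))   # ≈ 1.0, matching the patent's example
```

Y is an affine function of X here, which is why the coefficient is exactly 1 even though the raw values differ greatly; Pearson correlation rewards preserved structure rather than identical magnitudes.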
In another embodiment of the present invention, the similarity can be obtained by calculating the cosine similarity between the output of the original model X and the output of the compressed model Y, for example:
Output matrix of X = [1.0, 2.0, 3.0]
Output matrix of Y = [2.0, 20.0, 38.0]
Cosine similarity(X,Y) = (X · Y) / (‖X‖ · ‖Y‖) = 0.9698612260388879
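The cosine value for the same example vectors can likewise be checked with a short Python sketch (again, the function name is illustrative):

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) *
                  math.sqrt(sum(b * b for b in y)))

X = [1.0, 2.0, 3.0]    # output of the original model
Y = [2.0, 20.0, 38.0]  # output of the compressed model
print(cosine_similarity(X, Y))  # → 0.9698612260388879
```

Unlike Pearson correlation, cosine similarity does not subtract the means, so the same pair of vectors scores slightly below 1 here; which measure suits a given model is a design choice the description leaves open.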
However, the above is for illustration only and is not a limitation of the present invention. In practice, the model compression method of the present invention may also adopt other suitable similarity algorithms according to design requirements and/or application requirements; such design variations also fall within the scope of the present invention.
As mentioned above, the similarity calculation is based on the respective output data D1 and D2 of the original model 302 and the compressed model 304. In this embodiment, if the original model 302 is a model trained on a convolutional neural network architecture, the model pruning operation (where the parts to be compressed are selected by the action taken by the reinforcement learning agent 306) is applied only to the convolution layers; the output data D1 and D2 can therefore be the output of any layer located after the convolution layers in the convolutional neural network architecture. FIG. 4 is a schematic diagram of the convolutional neural network architecture of the original model 302 and the compressed model 304 shown in FIG. 3. As shown in the figure, the convolutional neural network architecture 400 includes an input layer 402, convolution layers 404_1–404_N (N ≥ 1), pooling layers 406_1–406_N (N ≥ 1), fully-connected layers 408_1–408_M (M ≥ 1), and an output layer 410. In an embodiment of the present invention, the output data D1 may be the output of a fully-connected layer (for example, 408_i, 1 ≤ i ≤ M) of the original model 302, and the output data D2 may be the output of the same fully-connected layer (for example, 408_i, 1 ≤ i ≤ M) of the compressed model 304. In another embodiment of the present invention, the output layer 410 is the last layer and executes the Softmax function so that the probability distribution output across all nodes of the fully-connected layer 408_M sums to 1; in this case, the output data D1 may be the Softmax output of the last layer of the original model 302, and the output data D2 may be the Softmax output of the last layer of the compressed model 304. Please note that the convolutional neural network architecture 400 shown in FIG. 4 is for illustration only and is not a limitation of the present invention; in practice, the model compression method of the present invention is also applicable to other neural network architectures, and such design variations also fall within the scope of the present invention.
To sum up, the similarity calculation is based on the respective output data of the original model and the compressed model, so there is no need to compare the output data of the compressed model against labels in the test data. In other words, whereas the conventional architecture adopts accuracy as the reward of the reinforcement learning agent and therefore requires the user to provide labeled data as test data, the model compression method of the present invention adopts similarity as that reward and can operate without labeled test data (that is, the test data 308 is non-labeled data). Since the test data 308 need not contain labels, the cost and time of data labeling can be reduced. In addition, because the model compression method of the present invention relies on the similarity calculation, the user does not need to provide test source code and can directly input a model for compression, which effectively promotes the application of model compression. Moreover, because similarity is used as the reward of the reinforcement learning agent, the generalization features retained after compression are less prone to overfitting.

The above are merely preferred embodiments of the present invention, and all equivalent variations and modifications made in accordance with the claims of the present invention shall fall within the scope of the present invention.
100: model compression system
102: processor
104: storage device
202, 204, 206, 208: steps
302: original model
304: compressed model
306: agent
308: test data
400: convolutional neural network architecture
402: input layer
404_1, 404_N: convolution layers
406_1, 406_N: pooling layers
408_1, 408_M: fully-connected layers
410: output layer
Code_MC: program code
D1, D2: output data
FIG. 1 is a schematic diagram of a model compression system according to an embodiment of the present invention. FIG. 2 is a flowchart of a model compression method according to an embodiment of the present invention. FIG. 3 is a schematic diagram of the operation of the model compression method shown in FIG. 2. FIG. 4 is a schematic diagram of the convolutional neural network architecture of the original model and the compressed model shown in FIG. 3.
202, 204, 206, 208: steps
Claims (12)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW110120608A TWI774411B (en) | 2021-06-07 | 2021-06-07 | Model compression method and model compression system |
| CN202110882210.8A CN113570045A (en) | 2021-06-07 | 2021-08-02 | Model compression method and model compression system |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW110120608A TWI774411B (en) | 2021-06-07 | 2021-06-07 | Model compression method and model compression system |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TWI774411B true TWI774411B (en) | 2022-08-11 |
| TW202248904A TW202248904A (en) | 2022-12-16 |
Family
ID=78169982
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW110120608A TWI774411B (en) | 2021-06-07 | 2021-06-07 | Model compression method and model compression system |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN113570045A (en) |
| TW (1) | TWI774411B (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115880527A (en) * | 2022-11-30 | 2023-03-31 | 北京三快在线科技有限公司 | Model compression method and device, storage medium and electronic equipment |
| CN116775800A (en) * | 2023-06-26 | 2023-09-19 | 中国银行股份有限公司 | Method, device, equipment and medium for constructing information extraction model |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109348707A (en) * | 2016-04-27 | 2019-02-15 | 纽拉拉股份有限公司 | Method and apparatus for pruning empirical memory for deep neural network-based Q-learning |
| TW202004569A (en) * | 2018-06-03 | 2020-01-16 | 耐能智慧股份有限公司 | Method for batch normalization layer pruning in deep neural networks |
| CN111340227A (en) * | 2020-05-15 | 2020-06-26 | 支付宝(杭州)信息技术有限公司 | Method and device for compressing business prediction model through reinforcement learning model |
| US20200272905A1 (en) * | 2019-02-26 | 2020-08-27 | GE Precision Healthcare LLC | Artificial neural network compression via iterative hybrid reinforcement learning approach |
- 2021
  - 2021-06-07: TW application TW110120608A filed; patent TWI774411B active
  - 2021-08-02: CN application CN202110882210.8A filed; publication CN113570045A pending
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109348707A (en) * | 2016-04-27 | 2019-02-15 | 纽拉拉股份有限公司 | Method and apparatus for pruning empirical memory for deep neural network-based Q-learning |
| TW202004569A (en) * | 2018-06-03 | 2020-01-16 | 耐能智慧股份有限公司 | Method for batch normalization layer pruning in deep neural networks |
| US20200272905A1 (en) * | 2019-02-26 | 2020-08-27 | GE Precision Healthcare LLC | Artificial neural network compression via iterative hybrid reinforcement learning approach |
| CN111340227A (en) * | 2020-05-15 | 2020-06-26 | 支付宝(杭州)信息技术有限公司 | Method and device for compressing business prediction model through reinforcement learning model |
Also Published As
| Publication number | Publication date |
|---|---|
| CN113570045A (en) | 2021-10-29 |
| TW202248904A (en) | 2022-12-16 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110348562B (en) | Quantitative strategy determination method of neural network, image recognition method and device | |
| CN111488986B (en) | A model compression method, image processing method and device | |
| CN109002889B (en) | Adaptive iterative convolution neural network model compression method | |
| CN109859106B (en) | A Self-Attention-Based High-Order Fusion Network for Image Super-Resolution Reconstruction | |
| WO2021135715A1 (en) | Image compression method and apparatus | |
| CN113011588B (en) | Pruning method, device, equipment and medium of convolutional neural network | |
| US20170061279A1 (en) | Updating an artificial neural network using flexible fixed point representation | |
| CN110598839A (en) | Convolutional neural network system and method for quantizing convolutional neural network | |
| CN108805259A (en) | neural network model training method, device, storage medium and terminal device | |
| CN109934300B (en) | Model compression method, device, computer equipment and storage medium | |
| TWI774411B (en) | Model compression method and model compression system | |
| CN110363297A (en) | Neural metwork training and image processing method, device, equipment and medium | |
| CN110276451A (en) | One kind being based on the normalized deep neural network compression method of weight | |
| CN107395211B (en) | A data processing method and device based on convolutional neural network model | |
| CN112687266B (en) | Speech recognition method, device, computer equipment and storage medium | |
| CN112446461A (en) | Neural network model training method and device | |
| CN112906889A (en) | Method and system for compressing deep neural network model | |
| KR20210143093A (en) | Electronic apparatus and control method thereof | |
| CN112766496A (en) | Deep learning model security guarantee compression method and device based on reinforcement learning | |
| CN116188878A (en) | Image classification method, device and storage medium based on fine-tuning of neural network structure | |
| CN114861671B (en) | Model training method, device, computer equipment and storage medium | |
| CN114677548A (en) | A neural network image classification system and method based on resistive memory | |
| CN117151178B (en) | A CNN custom network quantization acceleration method for FPGA | |
| CN114330690A (en) | Convolutional neural network compression method, device and electronic device | |
| CN112613604A (en) | Neural network quantification method and device |