TWI754970B

TWI754970B - Device, method and storage medium for accelerating operation of an activation function

Info

Publication number: TWI754970B
Application number: TW109121207A
Authority: TW
Inventors: 詹大緯; 林宏文
Original assignee: 鴻海精密工業股份有限公司
Priority date: 2020-06-22
Filing date: 2020-06-22
Publication date: 2022-02-11
Also published as: TW202201246A

Abstract

The invention provides a device for accelerating the operation of an activation function. The device comprises: a register for storing a storage table; a matching unit comprising a plurality of comparators, a logic unit and a selection unit, the comparators being connected to the selection unit through the logic unit, wherein the comparators match the input variable of the activation function against the variable intervals of the activation function to obtain a comparison output result, the logic unit performs a logic operation on the comparison output result to obtain a logic output result and determine the variable interval to be calculated, and the selection unit queries the storage table according to that interval to obtain the parameters of a fitted quadratic function; and a calculation unit, connected to the matching unit, which completes the operation on the input variable according to the parameters. The invention also provides a method and a storage medium for accelerating the operation of an activation function.

Description

Device, method and storage medium for accelerating the operation of an activation function

The present invention relates to the technical field of deep learning, and in particular to a device, method and storage medium for accelerating the operation of an activation function.

In state-of-the-art artificial neural networks, data processing includes convolution, pooling, activation and the like. The role of activation is to give the neural network its nonlinear modeling capability. Existing activation functions are varied and complex. They may include exponential and division operations, which introduce complicated calculation and time consumption.

In view of the above problems, the present invention provides a device, method and storage medium for accelerating the operation of an activation function, so as to increase the speed of activation function processing.

A first aspect of the present application provides a device for accelerating the operation of an activation function, the device comprising: a register for storing a storage table, wherein the storage table describes the mapping between the variable intervals of the activation function and the parameters of the fitted quadratic function corresponding to each interval; a matching unit comprising a plurality of comparators, a logic unit and a selection unit, the plurality of comparators being connected to the selection unit through the logic unit, wherein the plurality of comparators match the input variable of the activation function against the variable intervals of the activation function to obtain a comparison output result, the logic unit performs a logic operation on the comparison output result to obtain a logic output result and determines the variable interval to be calculated according to the logic output result, and the selection unit queries the storage table according to the variable interval to be calculated to obtain the parameters of the fitted quadratic function; and a calculation unit, connected to the matching unit, for completing the operation on the input variable according to the parameters.

Preferably, the logic unit includes a NOT gate, a plurality of XOR gates and a temporary storage unit.

Preferably, the parameters of the fitted quadratic function include a quadratic term coefficient, a linear term coefficient and a constant.

Preferably, the calculation unit includes a multiplier and an adder, wherein the multiplier receives the quadratic term coefficient and the linear term coefficient from the selection unit together with the input variable, and performs multiplication; the adder receives the output of the multiplier and the constant from the selection unit, and performs addition.

Preferably, when the input variable is less than the maximum value of the variable interval, the comparator outputs a low level; when the input variable is greater than or equal to the maximum value of the variable interval, the comparator outputs a high level.

Preferably, the activation function includes the function [formula not reproduced in this text] or f(x) = max(0, x).

A second aspect of the present application provides a method for accelerating the operation of an activation function, the method comprising: receiving an input variable; comparing the input variable with the variable intervals of the activation function to obtain a comparison output result; performing a logic operation on the comparison output result to obtain a logic output result; determining the variable interval to be calculated according to the logic output result; querying a storage table according to the variable interval to be calculated to obtain the parameters of a fitted quadratic function; and completing the operation on the input variable according to the parameters.

Preferably, the storage table describes the mapping between the variable intervals of the activation function and the parameters of the fitted quadratic function corresponding to each interval.

A third aspect of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the data processing method described above.

With the device, method and storage medium for accelerating the operation of an activation function provided by the present invention, the matching unit matches the parameters of the fitted quadratic function corresponding to a variable interval of the activation function, and the calculation unit then completes the operation on the input variable of the activation function according to those parameters. This simplifies the calculation of the activation function, and it improves the computational efficiency of activation function processing in the neural network while fitting the activation function with higher precision.

10: Computing device

110: Register

120: Matching unit

130: Calculation unit

121: Comparator

122: Logic unit

123: Selection unit

1220: NOT gate

1221: XOR gate

1222: Temporary storage unit

200: Computing system

201: Receiving module

202: Comparison module

203: Processing module

204: Determination module

205: Query module

1: Electronic device

11: Memory

12: Processor

13: Computer-readable storage medium

14: Communication bus

FIG. 1 is a schematic diagram of a device for accelerating the operation of an activation function provided by an embodiment of the present invention.

FIG. 2 is a schematic diagram of the matching unit in the device for accelerating the operation of an activation function provided by an embodiment of the present invention.

FIG. 3 is a schematic diagram of the logic unit in the device for accelerating the operation of an activation function provided by an embodiment of the present invention.

FIG. 4 is a schematic diagram of the calculation unit in the device for accelerating the operation of an activation function provided by an embodiment of the present invention.

FIG. 5 is an example diagram, provided by an embodiment of the present invention, of approximating an activation function with quadratic functions and with linear functions.

FIG. 6 is a flowchart of a method for accelerating the operation of an activation function provided by an embodiment of the present invention.

FIG. 7 is a functional module diagram of a system for accelerating the operation of an activation function provided by an embodiment of the present invention.

FIG. 8 is a schematic diagram of an electronic device provided by an embodiment of the present invention.

In order that the above objects, features and advantages of the present invention may be understood more clearly, the present invention is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the present application and the features in the embodiments may be combined with one another where they do not conflict.

Many specific details are set forth in the following description to facilitate a full understanding of the present invention. The described embodiments are only some, not all, of the embodiments of the present invention.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the present invention. The terms used in the description of the present invention are for the purpose of describing specific embodiments only and are not intended to limit the present invention.

Please refer to FIG. 1, which is a schematic diagram of a device for accelerating the operation of an activation function (hereinafter "computing device 10" for ease of description) provided by an embodiment of the present invention. In this embodiment, the activation function of the neural network is usually a nonlinear function, for example the sigmoid function f(x) = 1/(1 + e^(-x)), f(x) = tanh(x), or the ReLU function f(x) = max(0, x). Nonlinear activation functions execute slowly during the activation processing of a neural network. To solve this problem, the present application approximates computationally complex nonlinear functions by piecewise fitted quadratic functions and the finite element method. Specifically, the activation function is divided into N segments, and each segment is approximated by a fitted quadratic function. The parameters of the fitted quadratic function for each segment are calculated and stored in the coefficient memory.
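
The segmentation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the fitting procedure of the invention: the text does not say how the coefficients are computed, so this sketch simply passes a parabola through the two endpoints and the midpoint of each segment. The names `quadratic_through` and `build_table` and the range [0, 4) are hypothetical.

```python
import math

def sigmoid(x):
    """Sigmoid activation, f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def quadratic_through(f, lo, hi):
    """Coefficients (a, b, c) of the parabola matching f at lo, the midpoint and hi.

    Assumption: three-point interpolation stands in for the unspecified fit.
    """
    mid = 0.5 * (lo + hi)
    y0, y1, y2 = f(lo), f(mid), f(hi)
    d1 = (y1 - y0) / (mid - lo)                    # first divided difference
    a = ((y2 - y1) / (hi - mid) - d1) / (hi - lo)  # second divided difference
    # Expand y0 + d1*(x - lo) + a*(x - lo)*(x - mid) into a*x^2 + b*x + c.
    b = d1 - a * (lo + mid)
    c = y0 - d1 * lo + a * lo * mid
    return a, b, c

def build_table(f, lo, hi, n):
    """Split [lo, hi) into n equal segments and fit one quadratic per segment."""
    width = (hi - lo) / n
    rows = []
    for i in range(n):
        seg_lo, seg_hi = lo + i * width, lo + (i + 1) * width
        rows.append((seg_lo, seg_hi) + quadratic_through(f, seg_lo, seg_hi))
    return rows

# Four segments: [0,1), [1,2), [2,3), [3,4); each row is (lo, hi, a, b, c).
table = build_table(sigmoid, 0.0, 4.0, 4)
```

Each row of `table` holds one segment and its three parameters, which is the information the coefficient memory is described as storing.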

In this embodiment, the computing device 10 includes a register 110, a matching unit 120, and a calculation unit 130 connected to the matching unit 120. It should be noted that, although not shown, the units in the computing device 10 may be driven by a common clock to guarantee the calculation timing during processing. The register 110 includes a coefficient register, a segment register and a configuration register. The coefficient register stores the parameters of the fitted quadratic functions used to fit the activation function. In this embodiment, the parameters include a quadratic term coefficient a, a linear term coefficient b and a constant c. These parameters uniquely determine the expression of the fitted quadratic function corresponding to a given variable interval of the activation function. For example, if the activation function is divided into N segments, the coefficient register stores N sets of parameters (a_i, b_i, c_i), where 1 ≤ i ≤ N. The data type of the parameters is 32-bit floating point.
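
To make the 32-bit floating-point storage concrete, the sketch below rounds values to float32 and packs N parameter sets into a flat byte buffer. The text states only the data type; the little-endian byte order, the (a, b, c) ordering within a set, and the function names are assumptions made for illustration.

```python
import struct

def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

def pack_coefficients(coeffs):
    """Pack N (a, b, c) triples as consecutive float32 words.

    Assumption: little-endian layout, three words per segment.
    """
    flat = [v for triple in coeffs for v in triple]
    return struct.pack("<%df" % len(flat), *flat)

def unpack_coefficients(buffer):
    """Inverse of pack_coefficients: recover the list of (a, b, c) triples."""
    count = len(buffer) // 4
    flat = struct.unpack("<%df" % count, buffer)
    return [tuple(flat[i:i + 3]) for i in range(0, count, 3)]
```

With this layout each segment occupies 12 bytes, so N segments need 12·N bytes of coefficient storage. Values that are not exactly representable in float32 change slightly when stored, which adds a small error on top of the approximation error.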

The segment register stores the variable of the activation function after segmentation. For example, dividing the variable range of the activation function into N parts yields N variable intervals. In this embodiment, the coefficient register corresponds to the segment register: the fitted quadratic function built from the parameters stored in the coefficient register represents the activation function over the corresponding variable interval stored in the segment register.

The configuration register configures the fitted quadratic function to suit various types of activation functions. In this embodiment, the configuration register includes an enable unit. When the enable signal of the enable unit is the high-level signal "1", the corresponding fitted quadratic function is selected according to the segmented activation function stored in the segment register; when the enable signal is "0", the output of the activation function is taken to be equal to its input.
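
The enable behavior can be sketched in a few lines. The function name `apply_activation` and the callback `select_coefficients` are hypothetical; the callback stands in for whatever mechanism returns the (a, b, c) parameters for the segment containing x.

```python
def apply_activation(x, enable, select_coefficients):
    """Return the fitted quadratic's value when enabled, else pass x through.

    enable: 1 selects the fitted quadratic for x's segment; 0 bypasses it.
    select_coefficients: hypothetical callable mapping x to (a, b, c).
    """
    if enable == 1:
        a, b, c = select_coefficients(x)
        return a * x * x + b * x + c
    # Enable signal is 0: the output equals the input.
    return x
```

For instance, with a callback that always returns (0.0, 2.0, 1.0), `apply_activation(3.0, 1, ...)` evaluates 2·3 + 1, while `apply_activation(3.0, 0, ...)` returns 3.0 unchanged.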

In one embodiment, the register 110 may also store the mapping between the variable intervals of the activation function and the parameters of the fitted quadratic function corresponding to each interval. Preferably, the mapping takes the form of a storage table. The storage table may be built as a register file, that is, each variable interval index and its corresponding fitting function parameters are held in registers. Storing the table in a register file allows the multiple outputs of the lookup table to be fed into the plurality of comparators simultaneously, which increases the parallelism of the computing device. The multiple outputs include the variable intervals and the corresponding parameters of the fitted quadratic functions.

For example, in one embodiment, the storage table constructed for the sigmoid function is shown in Table 1 below.

Table 1 contains N variable intervals. Each variable interval is a range of the activation function's variable, and the quadratic term coefficient, linear term coefficient and constant are those of the quadratic function fitted to the activation function within that range. For example, the variable range of the activation function in Table 1 is divided into equally spaced intervals [0,1), [1,2) and so on. When the variable interval is [0,1), the parameters of the corresponding fitted quadratic function are the quadratic term coefficient a₁, linear term coefficient b₁ and constant c₁; when the variable interval is [1,2), they are the quadratic term coefficient a₂, linear term coefficient b₂ and constant c₂.

It should be noted that the storage table may be built online or changed dynamically during activation processing. Alternatively, storage tables for the plurality of activation functions used in the neural network may be built offline and stored in the register 110 in advance, to be read dynamically during activation processing. Building the storage tables offline is preferred, as it improves the efficiency of activation processing.

The matching unit 120 determines, from the input variable of the activation function and the variable intervals, which variable interval the input variable falls in, and then matches the parameters of the corresponding fitted quadratic function according to the determined interval.

Referring to FIG. 2, the matching unit 120 includes a plurality of comparators 121, a logic unit 122 and a selection unit 123. The plurality of comparators 121 are connected to the selection unit 123 through the logic unit 122. FIG. 2 shows N comparators, labeled comparator 1, comparator 2, ..., comparator N-1.

In this embodiment, the comparator 121 matches the input variable of the activation function against the variable intervals of the activation function and determines its output value from their relative magnitude. When the input variable x is less than the maximum value x_N of the variable interval, the comparator outputs the low level "0"; when the input variable x is greater than or equal to the maximum value x_N of the variable interval, the comparator outputs the high level "1".
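
The comparator rule can be sketched as a list comprehension, one entry per comparator. The name `comparator_outputs` and the example upper bounds 1.0, 2.0, 3.0 are illustrative assumptions; in hardware all comparisons would happen in parallel rather than in a loop.

```python
def comparator_outputs(x, upper_bounds):
    """One bit per comparator: 1 when x >= the interval's upper bound, else 0.

    upper_bounds must be sorted in ascending order, so the result is a run of
    1s followed by a run of 0s (a thermometer code).
    """
    return [1 if x >= bound else 0 for bound in upper_bounds]

# Input 1.5 against the upper bounds of [0,1), [1,2), [2,3):
bits = comparator_outputs(1.5, [1.0, 2.0, 3.0])
```

Here `bits` is `[1, 0, 0]`: only the first upper bound has been reached, which places 1.5 in the second interval.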

The logic unit 122 includes a NOT gate 1220, a plurality of XOR gates 1221 and a temporary storage unit 1222. For example, as shown in FIG. 3, the output of comparator 1 is connected to the NOT gate and to XOR gate 1 of the logic unit 122; the output of comparator 2 is connected to XOR gate 1 and XOR gate 2; the output of comparator 3 is connected to XOR gate 2 and XOR gate 3; and so on, so that the output of comparator N-2 is connected to XOR gate N-3 and XOR gate N-2, and the output of comparator N-1 is connected to XOR gate N-2 and the temporary storage unit. When the input variable is greater than the maximum value of all variable intervals, the output of comparator N-1 is fed directly to the temporary storage unit 1222.

For example, one input of comparator 2 receives the input variable of the activation function, and the other input receives the maximum value x₂ of the variable interval [1,2), where x₂ approaches 2. When the input variable is 1.5, comparator 1 outputs the high level "1" because 1.5 is greater than 1, comparator 2 outputs the low level "0" because 1.5 is less than 2, and the other comparators likewise output the low level "0" because 1.5 is less than the maximum values of the other intervals. The output of comparator 1 becomes the low level "0" after passing through the NOT gate of the logic unit; the outputs of comparator 1 and comparator 2 yield the high level "1" after the XOR gate; the outputs of comparator 2 and comparator 3 yield the low level "0" after the XOR gate; and so on, up to the outputs of comparator N-2 and comparator N-1, which yield the low level "0" after the XOR gate. The selection unit therefore selects the parameters (a₂, b₂, c₂) of the fitted quadratic function corresponding to the variable interval [1,2). The parameters selected by the selection unit 123 are then output to the connected calculation unit 130.
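
The gate arrangement described above converts the comparator bits into a one-hot selection. The following is a minimal sketch assuming N-1 comparators for N intervals, with the last comparator bit passed through unchanged; the function names and the symbolic table entries such as "a2" are placeholders, not values from the source.

```python
def logic_unit(bits):
    """Turn comparator bits into a one-hot interval selection.

    First output: NOT of comparator 1.
    Middle outputs: XOR of each pair of neighboring comparators.
    Last output: comparator N-1 passed through (input beyond every bound).
    """
    first = [1 - bits[0]]
    middle = [p ^ q for p, q in zip(bits, bits[1:])]
    last = [bits[-1]]
    return first + middle + last

def selection_unit(one_hot, table):
    """Return the table row whose select line is 1."""
    return table[one_hot.index(1)]

# Placeholder rows standing in for the stored (a_i, b_i, c_i) parameters.
TABLE = [("a1", "b1", "c1"), ("a2", "b2", "c2"),
         ("a3", "b3", "c3"), ("a4", "b4", "c4")]

select = logic_unit([1, 0, 0])          # comparator bits for input 1.5
params = selection_unit(select, TABLE)  # row for interval [1,2)
```

For the worked example, `select` is `[0, 1, 0, 0]` and `params` is the second row. Because the comparator bits always form a run of 1s followed by 0s, exactly one select line is high for any input.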

As shown in FIG. 4, the calculation unit 130 includes a multiplier 130 and an adder 131. The multiplier 130 receives the quadratic term coefficient a and the linear term coefficient b from the selection unit 123 together with the input variable x, and performs multiplication. The adder 131 receives the output of the multiplier 130 and the constant c from the selection unit 123, and performs addition. The calculation unit 130 thereby obtains the function value corresponding to the input variable x, expressed in general form as f(x) = ax² + bx + c.
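
The multiply-then-add evaluation can be sketched as below. The first function mirrors the description (products formed first, then summed with c). The second uses Horner's form, which is an alternative introduced here for comparison and is not stated in the source; both function names are hypothetical.

```python
def calculation_unit(x, a, b, c):
    """Evaluate a*x^2 + b*x + c with separate multiply and add stages."""
    quadratic_term = a * x * x   # multiplier stage
    linear_term = b * x          # multiplier stage
    return quadratic_term + linear_term + c  # adder stage

def calculation_unit_horner(x, a, b, c):
    """Same polynomial as (a*x + b)*x + c: two multiplies instead of three.

    Assumption: offered only as a design alternative, not the source's circuit.
    """
    return (a * x + b) * x + c
```

The two forms agree mathematically; in floating point they can differ in the last bits because the operations are rounded in a different order.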

It should be noted that, in a preferred embodiment, the number of comparators in each matching unit equals the number of variable intervals; as shown in FIG. 2, when the activation function has N variable intervals, the number of comparators in each matching unit is likewise set to N. In the above embodiments, the data selector, comparators, multiplier, adder and the like may be implemented with general-purpose or dedicated devices.

In this embodiment, approximating the activation function with fitted quadratic functions produces a smaller error than approximating it with linear functions. Taking f(x) = tanh(x) as an example, as shown in FIG. 5, the error of the quadratic approximation of f(x) = tanh(x) is 50% lower, which is clearly better than the linear approximation and better suited to neural network computation.
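
The comparison between linear and quadratic pieces can be explored numerically. The sketch below is not a reproduction of FIG. 5: the segment edges, the sampling density and the use of simple interpolation (rather than whatever fit the figure used) are all assumptions, so the error ratio it produces need not match the 50% figure quoted above.

```python
import math

def linear_piece(f, lo, hi):
    """Straight line through f at lo and hi."""
    slope = (f(hi) - f(lo)) / (hi - lo)
    return lambda x: f(lo) + slope * (x - lo)

def quadratic_piece(f, lo, hi):
    """Parabola through f at lo, the midpoint and hi (Newton form)."""
    mid = 0.5 * (lo + hi)
    y0, y1, y2 = f(lo), f(mid), f(hi)
    d1 = (y1 - y0) / (mid - lo)
    d2 = ((y2 - y1) / (hi - mid) - d1) / (hi - lo)
    return lambda x: y0 + d1 * (x - lo) + d2 * (x - lo) * (x - mid)

def max_error(f, make_piece, edges, samples=200):
    """Largest sampled |f - approximation| over all segments."""
    worst = 0.0
    for lo, hi in zip(edges, edges[1:]):
        piece = make_piece(f, lo, hi)
        for k in range(samples + 1):
            x = lo + (hi - lo) * k / samples
            worst = max(worst, abs(f(x) - piece(x)))
    return worst

EDGES = [0.0, 1.0, 2.0, 3.0, 4.0]  # hypothetical unit-width segments
linear_error = max_error(math.tanh, linear_piece, EDGES)
quadratic_error = max_error(math.tanh, quadratic_piece, EDGES)
```

On these segments the quadratic pieces give the smaller maximum error, consistent with the qualitative claim in the text.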

Please refer to FIG. 6, which is a flowchart of a method for accelerating the operation of an activation function according to an embodiment of the present application. Depending on requirements, the order of the steps in the flowchart may be changed and some steps may be omitted. In this embodiment, the method is applied in the computing device 10. The method may include the following steps.

Step S1: receive an input variable.

In this embodiment, the input variable is the input variable of the activation function.

Step S2: compare the input variable with the variable intervals of the activation function to obtain a comparison output result.

In this embodiment, the comparators in the matching unit compare the input variable with the variable intervals of the activation function to obtain the comparison output result. Each comparator matches the input variable of the activation function against a variable interval of the activation function and determines its output value from their relative magnitude. When the input variable x is less than the maximum value x_N of the interval, the comparator outputs the low level "0"; when the input variable x is greater than or equal to the maximum value x_N of the interval, the comparator outputs the high level "1".

For example, one input of comparator 2 receives the input variable of the activation function, and the other input receives the maximum value x₂ of the variable interval [1,2), where x₂ approaches 2. When the input variable is 1.5, comparator 1 outputs the high level "1" because 1.5 is greater than 1; comparator 2 outputs the low level "0" because 1.5 is less than 2; and the other comparators likewise output the low level "0" because 1.5 is less than the maximum values of the other intervals.

Step S3: perform a logic operation on the comparison output result to obtain a logic output result.

In this embodiment, the logic unit in the matching unit performs the logic operation on the comparison output result to obtain the logic output result. The logic unit includes a NOT gate, a plurality of XOR gates and a temporary storage unit. For example, as shown in FIG. 3, the output of comparator 1 is connected to the NOT gate and to XOR gate 1 of the logic unit; the output of comparator 2 is connected to XOR gate 1 and XOR gate 2; the output of comparator 3 is connected to XOR gate 2 and XOR gate 3; and so on, so that the output of comparator N-2 is connected to XOR gate N-3 and XOR gate N-2, and the output of comparator N-1 is connected to XOR gate N-2 and the temporary storage unit.

For example, when the input variable is 1.5, the output of comparator 1 becomes the low level "0" after passing through the NOT gate of the logic unit; the outputs of comparator 1 and comparator 2 yield the high level "1" after the XOR gate; the outputs of comparator 2 and comparator 3 yield the low level "0" after the XOR gate; and so on, up to the outputs of comparator N-2 and comparator N-1, which yield the low level "0" after the XOR gate.

Step S4: determine the variable interval to be calculated according to the logic output result.

In this embodiment, when a logic output is "1", the variable interval containing the interval value fed to one input of the corresponding comparator is the variable interval to be calculated.

For example, when the input variable is 1.5, the output of comparator 1 becomes the low level "0" after passing through the NOT gate of the logic unit, the outputs of comparator 1 and comparator 2 yield the high level "1" after the XOR gate, and the outputs of comparator 2 and comparator 3 yield the low level "0" after the XOR gate. The variable interval [1,2), which contains the interval value fed to one input of comparator 2, is therefore the variable interval to be calculated.

Step S5: query the storage table according to the variable interval to be calculated to obtain the parameters of the fitted quadratic function.

In this embodiment, the selection unit queries the storage table according to the variable interval to be calculated and obtains the parameters of the fitted quadratic function. For example, the selection unit selects the parameters (a₂, b₂, c₂) of the fitted quadratic function corresponding to the variable interval [1,2). The parameters selected by the selection unit are then output to the connected calculation unit.

Step S6: complete the operation on the input variable according to the parameters.

The calculation unit substitutes the parameters (a₂, b₂, c₂) into the fitted quadratic function f(x) = a₂x² + b₂x + c₂ and computes the result for the input variable.
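
Steps S1 to S6 can be strung together in one function. This is a software sketch of the flow, not the hardware itself. To keep the expected values exact, the example table encodes ReLU, f(x) = max(0, x), which piecewise quadratics with a = 0 represent without error; the bounds 0, 1, 2 and all names are hypothetical.

```python
def accelerate_activation(x, upper_bounds, table):
    """Evaluate the piecewise quadratic approximation at x (steps S1 to S6).

    upper_bounds: N-1 ascending thresholds separating N intervals.
    table: N rows of (a, b, c), one per interval.
    """
    # S1: receive the input variable x.
    # S2: each comparator outputs 1 when x >= its threshold, else 0.
    bits = [1 if x >= bound else 0 for bound in upper_bounds]
    # S3: NOT on the first bit, XOR on neighboring bits, last bit passed through.
    logic = [1 - bits[0]] + [p ^ q for p, q in zip(bits, bits[1:])] + [bits[-1]]
    # S4: the position of the single 1 identifies the interval to calculate.
    index = logic.index(1)
    # S5: query the storage table for that interval's parameters.
    a, b, c = table[index]
    # S6: complete the operation on the input variable.
    return a * x * x + b * x + c

# Assumed example table for ReLU over (-inf,0), [0,1), [1,2), [2,+inf).
RELU_BOUNDS = [0.0, 1.0, 2.0]
RELU_TABLE = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0),
              (0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]

result = accelerate_activation(1.5, RELU_BOUNDS, RELU_TABLE)
```

With this table `result` equals 1.5, and any negative input maps to zero. Swapping in a table fitted to sigmoid or tanh leaves the control flow unchanged; only the stored bounds and coefficients differ.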

FIG. 7 is a functional module diagram of a preferred embodiment of the system for accelerating activation function operation of the present invention.

In some embodiments, the system 200 for accelerating activation function operation (hereinafter "computing system 200" for ease of description) runs in the electronic device 1. The computing system 200 may include a plurality of functional modules composed of program code segments. The program code of each segment of the computing system 200 can be stored in a memory and executed by at least one processor to carry out the activation processing.

In this embodiment, the computing system 200 can be divided into a plurality of functional modules according to the functions it performs. The functional modules may include a receiving module 201, a comparison module 202, a processing module 203, a determining module 204 and a query module 205. A module, as the term is used in the present invention, is a series of computer program segments that is stored in a memory, can be executed by at least one processor and performs a fixed function. The functions of the modules are described in detail below.

The receiving module 201 is used to receive an input variable.

The comparison module 202 is used to compare the input variable with the variable intervals of the activation function to obtain a comparison output result.

The processing module 203 is used to perform a logic operation on the comparison output result to obtain a logic output result.

In this embodiment, the logic unit in the matching unit performs the logic operation on the comparison output result to obtain the logic output result. The logic unit includes a NOT gate, a plurality of XOR gates and a temporary storage unit. For example, as shown in FIG. 3, the output of comparator 1 is connected to the NOT gate and to XOR gate 1 of the logic unit; the output of comparator 2 is connected to XOR gate 1 and XOR gate 2; the output of comparator 3 is connected to XOR gate 2 and XOR gate 3; and so on, so that the output of comparator N-2 is connected to XOR gate N-3 and XOR gate N-2, and the output of comparator N-1 is connected to XOR gate N-2 and the temporary storage unit.

The determining module 204 is used to determine the variable interval to be calculated according to the logic output result.

The query module 205 queries the storage table according to the variable interval to be calculated to obtain the parameters of the fitted quadratic function.

The processing module 203 is further used to complete the operation on the input variable according to the parameters.

The calculation unit substitutes the parameters (a₂, b₂, c₂) into the fitted quadratic function f(x) = a₂x² + b₂x + c₂ and completes the operation on the input variable.
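The division into modules can be pictured as a chain of small functions, one per module. The sketch below is a hedged illustration rather than the program code of the invention. It uses f(x) = max(0, x), for which quadratic parameters are exact on each side of zero; the reference values -1, 0 and 1, the table layout and all function names are assumptions introduced here, and the comparison is assumed to yield 1 when the input is at or above a reference value.

```python
from typing import List, Sequence, Tuple

Params = Tuple[float, float, float]  # (a, b, c) of f(x) = a*x^2 + b*x + c

# Hypothetical reference values and storage table for f(x) = max(0, x).
THRESHOLDS: Sequence[float] = (-1.0, 0.0, 1.0)
STORAGE_TABLE: Sequence[Params] = (
    (0.0, 0.0, 0.0),  # x < -1       -> f(x) = 0
    (0.0, 0.0, 0.0),  # -1 <= x < 0  -> f(x) = 0
    (0.0, 1.0, 0.0),  # 0 <= x < 1   -> f(x) = x
    (0.0, 1.0, 0.0),  # x >= 1       -> f(x) = x
)


def receive(value: float) -> float:
    """Receiving module 201: accept the input variable."""
    return float(value)


def compare(x: float) -> List[int]:
    """Comparison module 202: compare x with each reference value."""
    return [int(x >= t) for t in THRESHOLDS]


def logical_operation(c: Sequence[int]) -> List[int]:
    """Processing module 203: NOT on the first output, XOR on neighbours,
    last output passed through unchanged."""
    return [1 - c[0]] + [c[k - 1] ^ c[k] for k in range(1, len(c))] + [c[-1]]


def determine_interval(logic_output: Sequence[int]) -> int:
    """Determining module 204: index of the single high bit."""
    return list(logic_output).index(1)


def query(interval: int) -> Params:
    """Query module 205: look up the parameters for the interval."""
    return STORAGE_TABLE[interval]


def compute(params: Params, x: float) -> float:
    """Processing module 203: evaluate the fitted quadratic function."""
    a, b, c = params
    return a * x * x + b * x + c


def activation(value: float) -> float:
    """Run the modules in the order described above."""
    x = receive(value)
    interval = determine_interval(logical_operation(compare(x)))
    return compute(query(interval), x)
```

Calling `activation(-0.5)` selects the second table entry and returns zero, while `activation(3.0)` selects the last entry and returns the input unchanged.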

The integrated units described above, when implemented in the form of software functional modules, can be stored in a computer-readable storage medium. The software functional modules are stored in a storage medium and include a number of instructions that cause an electronic device (which may be a personal computer, a dual-screen device, a network device or the like) or a processor to execute part of the methods described in the embodiments of the present invention.

FIG. 8 is a schematic diagram of an electronic device according to Embodiment 3 of the present invention.

The electronic device 1 includes a memory 11, at least one processor 12, a computer-readable storage medium 13 stored in the memory 11 and executable on the at least one processor 12, and at least one communication bus 14.

When the at least one processor 12 executes the computer-readable storage medium 13, the steps of the above embodiments of the operation method are implemented.

By way of example, the computer-readable storage medium 13 may be divided into one or more modules/units, which are stored in the memory 11 and executed by the at least one processor 12 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe how the computer-readable storage medium 13 is executed in the electronic device 1.

The electronic device 1 may be a device on which application programs are installed, such as a mobile phone, a tablet computer or a personal digital assistant (PDA). Those skilled in the art will understand that the schematic diagram of FIG. 8 is only an example of the electronic device 1 and does not limit it. The electronic device 1 may include more or fewer components than shown, may combine certain components, or may use different components; for example, it may also include input/output devices, network access devices, buses and the like.

The at least one processor 12 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 12 may be a microprocessor or any conventional processor. The processor 12 is the control center of the electronic device 1 and connects the parts of the entire electronic device 1 through various interfaces and lines.

The memory 11 can be used to store the computer-readable storage medium 13 and/or the modules/units. The processor 12 implements the various functions of the electronic device 1 by running or executing the computer programs and/or modules/units stored in the memory 11 and by calling the data stored in the memory 11. The memory 11 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required for at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the electronic device 1 (such as audio data). In addition, the memory 11 may include non-volatile memory, such as a hard disk, memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.

Program code is stored in the memory 11, and the at least one processor 12 can call the program code stored in the memory 11 to perform the related functions. For example, the modules shown in FIG. 7 (the receiving module 201, the comparison module 202, the processing module 203, the determining module 204 and the query module 205) are program code stored in the memory 11 and executed by the at least one processor 12, so that the functions of the modules are realized and the activation processing is carried out.

If the modules/units integrated in the electronic device 1 are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, all or part of the processes in the methods of the above embodiments may also be carried out by a computer program that instructs the relevant hardware. The computer program can be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM) and the like.

In the several embodiments provided by the present invention, it should be understood that the disclosed electronic device and method may be implemented in other ways. For example, the electronic device embodiments described above are only illustrative; the division into units is only a division by logical function, and other divisions are possible in an actual implementation.

In addition, the functional units in the embodiments of the present invention may be integrated in the same processing unit, each unit may exist physically on its own, or two or more units may be integrated in the same unit. The integrated units may be implemented in the form of hardware, or in the form of hardware plus software functional modules.

It will be apparent to those skilled in the art that the present invention is not limited to the details of the above exemplary embodiments and that the present invention can be embodied in other specific forms without departing from its spirit or essential characteristics. The embodiments are therefore to be regarded in all respects as exemplary and non-restrictive. The scope of the present invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced by the present invention. No reference sign in a claim should be construed as limiting that claim. Furthermore, the word "comprising" does not exclude other units, and the singular does not exclude the plural. A plurality of units or devices recited in a system claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present invention may be modified or replaced by equivalents without departing from the spirit and scope of those technical solutions.

120: Matching unit

121: Comparator

122: Logic unit

123: Selection unit

Claims

A device for accelerating the operation of an activation function, the device comprising: a register for storing a storage table, wherein the storage table describes the mapping relationship between the variable intervals of the activation function and the parameters of the fitted quadratic function corresponding to each interval; a matching unit including a plurality of comparators, a logic unit and a selection unit, wherein the plurality of comparators are connected to the selection unit through the logic unit, the plurality of comparators are used to match an input variable of the activation function against the variable intervals of the activation function to obtain a comparison output result, the logic unit performs a logic operation according to the comparison output result to obtain a logic output result and determines the variable interval to be calculated according to the logic output result, and the selection unit is used to query the storage table according to the variable interval to be calculated to obtain the parameters of the fitted quadratic function; and a calculation unit connected to the matching unit and used to complete the operation on the input variable according to the parameters.

The device for accelerating the operation of an activation function according to claim 1, wherein the logic unit comprises a NOT gate, a plurality of XOR gates and a temporary storage unit.

The device for accelerating the operation of an activation function according to claim 1, wherein the parameters of the fitted quadratic function include a quadratic term coefficient, a linear term coefficient and a constant.

The device for accelerating the operation of an activation function according to claim 3, wherein the calculation unit includes a multiplier and an adder, wherein the multiplier receives the quadratic term coefficient, the linear term coefficient and the input variable from the selection unit and performs multiplication, and the adder receives the output result from the multiplier and the constant from the selection unit and performs addition.

The device for accelerating the operation of an activation function according to claim 1, wherein the activation function includes a function

or f(x) = max(0, x).

A method for accelerating the operation of an activation function, the method comprising: receiving an input variable; comparing the input variable with the variable intervals of the activation function to obtain a comparison output result; performing a logic operation on the comparison output result to obtain a logic output result; determining the variable interval to be calculated according to the logic output result; querying a storage table according to the variable interval to be calculated to obtain the parameters of a fitted quadratic function; and completing the operation on the input variable according to the parameters.

The method for accelerating the operation of an activation function according to claim 6, wherein the storage table describes the mapping relationship between the variable intervals of the activation function and the parameters of the fitted quadratic function corresponding to each interval.

The method for accelerating the operation of an activation function according to claim 7, wherein the parameters of the fitted quadratic function include a quadratic term coefficient, a linear term coefficient and a constant.

A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the method for accelerating the operation of an activation function according to any one of claims 6 to 8 is implemented.