
TWI763503B - Artificial neural network system using non-affine transformation technology for neuron cell body and method for applying the same - Google Patents

Artificial neural network system using non-affine transformation technology for neuron cell body and method for applying the same

Info

Publication number
TWI763503B
TWI763503B
Authority
TW
Taiwan
Prior art keywords
neurons
cell body
layer
input data
neural network
Prior art date
Application number
TW110118865A
Other languages
Chinese (zh)
Other versions
TW202247047A (en)
Inventor
楊青天
Original Assignee
東旭能興業有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 東旭能興業有限公司 filed Critical 東旭能興業有限公司
Priority to TW110118865A priority Critical patent/TWI763503B/en
Application granted granted Critical
Publication of TWI763503B publication Critical patent/TWI763503B/en
Publication of TW202247047A publication Critical patent/TW202247047A/en

Links

Images

Landscapes

  • Image Analysis (AREA)

Abstract

An artificial neural network system whose neuron cell bodies apply a non-affine transformation. The system comprises an input layer of first neurons that receives a set of input data, at least one hidden layer of second neurons, and an output layer of third neurons. Each second neuron includes a cell body, a plurality of synapses extending forward from the cell body to receive at least part of the input data from the neurons of the previous layer, and an axon extending rearward from the cell body. Each synapse carries a connection weight, and the cell body applies a nonlinear function to the received input data together with the connection weights of the corresponding synapses. The transformed data is passed on to the next layer through the axon.

Description

Artificial neural network system using non-affine transformation technology for neuron cell body and method for applying the same

The present invention relates to an artificial neural network system and its applications, and more particularly to a system in which the cell body of each neuron applies a nonlinear function in place of the conventional linear affine function.

An artificial neural network is an algorithm modeled on the operation of the brain. Its architecture comprises an input layer that receives data, one or more hidden layers that propagate and process the data, and an output layer that emits the computed result. Each layer contains many neurons.

In the hidden layers of an artificial neural network, the cell body of each neuron conventionally applies an affine function to transform the data coming from the input layer or the previous layer. The transformed data is then passed to the neuron's axon, where it is processed by an activation function before being output.

The affine function applied by each neuron cell body of the hidden layer is usually expressed as [Equation 1]:

$y = \sum_{i=1}^{n} w_i x_i + b$ ……………………[Equation 1]

where x_i is the input value from the i-th of the n neurons of the input layer or previous layer, w_i is the connection weight associated with that i-th neuron, and b is the bias of the neuron.

Training an artificial neural network mostly consists of adjusting the connection weights w_i of the affine functions of the hidden-layer neurons, so that some inputs from the input layer or previous layer gain influence while others lose it. Once trained, the network produces an output matrix for the data under analysis, and that matrix represents a classification result. However, when the network is applied to nonlinear problems, or when the features of the data under analysis overlap heavily, the linearity of the affine function prevents the system from classifying correctly; more hidden layers or more neurons are then usually needed to have any chance of raising the classification accuracy.

In addition, many artificial neural networks compensate for the shortcomings of the affine function through the choice and tuning of the activation function. Common activation functions, such as the nonlinear Log-Sigmoid function or the nonlinear Tan-Sigmoid (hyperbolic tangent) function, apply a nonlinear transformation to the data, with the aim of making training converge faster and improving accuracy.
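The two activation functions named above are standard and can be sketched directly (a minimal illustration):

```python
import math

def log_sigmoid(x):
    """Log-Sigmoid: 1 / (1 + e^-x), squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tan_sigmoid(x):
    """Tan-Sigmoid: the hyperbolic tangent, squashes any real input into (-1, 1)."""
    return math.tanh(x)
```

Both are smooth and saturating, which is what makes them useful for a nonlinear "squash" after the cell-body transform.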

The object of the present invention is therefore to provide an artificial neural network system whose neuron cell bodies apply a non-affine transformation, capable of delivering stable classification accuracy across a range of nonlinear problems.

The artificial neural network system of the present invention comprises an input layer, at least one hidden layer, and an output layer. The input layer includes a plurality of first neurons for receiving a set of input data, the hidden layer includes a plurality of second neurons, and the output layer includes a plurality of third neurons. The input layer, the hidden layer(s), and the output layer are connected in sequence. Each second neuron includes a cell body, a plurality of synapses extending forward from the cell body to receive at least part of the input data from the neurons of the previous layer, and an axon extending rearward from the cell body. Each synapse carries a connection weight, and the cell body applies a nonlinear function to the received input data together with the connection weights of the corresponding synapses; the transformed data is passed to the next layer through the axon.

The nonlinear function applied by the cell body of each second neuron is a norm function; that is, a norm is computed over the input data and their corresponding connection weights.

The nonlinear function applied by the cell body of each second neuron may also be a closed-curve function.

The set of input data received by the first neurons is normalized in advance so that all input values lie between 0 and 1. The connection weights of the synapses of the second neurons may likewise be preset to values between 0 and 1.

Another object of the present invention is to provide a method of applying an artificial neural network system that delivers stable classification accuracy across a range of nonlinear problems.

The method of applying the artificial neural network system of the present invention comprises: receiving a set of input data through a plurality of first neurons of an input layer; applying a nonlinear transformation to the received input data through a hidden layer, where the hidden layer includes a plurality of second neurons, each second neuron including a cell body, a plurality of synapses extending forward from the cell body to receive the input data from the previous layer, and an axon extending rearward from the cell body, each synapse carrying a connection weight, and the cell body applying a nonlinear function to the received input data together with the connection weights of the corresponding synapses; and receiving the transformed data from the axons of the second neurons through a plurality of third neurons of an output layer, which, after processing, output an analysis result.
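The method just described can be sketched end to end. This is a minimal illustration under assumptions of ours: the hidden cell bodies use the sum-of-squared-differences norm transform described elsewhere in this document with zero bias, and the output layer is modeled as a plain weighted sum producing scores, a detail the text leaves open:

```python
def norm_cell_body(x, w):
    """Hidden-layer cell body: sum of squared differences between inputs and weights."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def forward(x, hidden_weights, output_weights):
    """One forward pass: non-affine hidden layer, then a weighted-sum output layer."""
    h = [norm_cell_body(x, w) for w in hidden_weights]   # second neurons
    return [sum(wi * hi for wi, hi in zip(w, h))          # third neurons (scores)
            for w in output_weights]

# Toy network: 2 inputs, 2 hidden neurons, 2 output scores.
scores = forward([0.2, 0.8],
                 hidden_weights=[[0.1, 0.9], [0.7, 0.3]],
                 output_weights=[[1.0, 0.0], [0.0, 1.0]])
```

With these numbers the first hidden neuron's weight vector lies close to the input, so its output is small, while the second is far away and produces a large output; the scores inherit that contrast.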

The cell body of each second neuron outputs the result of applying a norm function to the input data and their corresponding connection weights.

Alternatively, the cell body of each second neuron outputs the result of applying a closed-curve function to the input data and their corresponding connection weights.

The method further normalizes the set of input data before the first neurons of the input layer receive it, so that all input values lie between 0 and 1.

The effect of the present invention is that replacing the usual affine function with a nonlinear function in the cell bodies of the hidden-layer neurons yields more stable classification across a range of nonlinear problems and reduces overfitting.

Before the present invention is described in detail, it should be noted that in the following description similar elements are denoted by the same reference numerals.

Referring to FIG. 1, the artificial neural network system 100 of the present invention, in one embodiment, can be implemented by a processor 91 together with a computer-readable medium 92 storing program instructions that, when executed by the processor 91, carry out the functions described here. In other embodiments, the system may instead be implemented in hardware such as a field-programmable gate array (FPGA) or a system on chip, and the functions may be performed by a single device or by distributed devices.

The architecture and training flow of the neural network system 100 are shown in FIGS. 2 and 3, respectively. The architecture comprises, connected in sequence, an input layer 1 for receiving input data, at least one hidden layer 2, and an output layer 3. FIG. 2 shows a single hidden layer 2 by way of example, without limitation. The input layer 1 includes m first neurons 11, the hidden layer 2 includes n second neurons 21, and the output layer 3 includes L third neurons 31.

Referring also to FIG. 4, since the neurons of an artificial neural network model biological neurons, biological terminology is used here as well. Each second neuron 21 includes a cell body 211, a plurality of synapses 212 extending forward from the cell body 211 to receive the input data x_i from the neurons 11 of the previous layer (the input layer 1 in the example of FIG. 2), and an axon 213 extending rearward from the cell body 211. Here x_i, for i = 1 to m, is the data of the i-th of the m neurons 11 of the input layer 1. FIG. 2 illustrates a fully connected network by way of example, without limitation; in that case each neuron 21 likewise has m synapses 212. Each synapse 212 carries a connection weight w_ij, where j = 1 to n indexes the synapses 212 of the j-th neuron 21 of the hidden layer 2.

The cell body 211 of each hidden-layer neuron 21 applies a nonlinear function to the received input data x_i together with the connection weights w_ij of the corresponding synapses 212, and the transformed data is passed to the next layer through the axon 213. The nonlinear function may be a norm function; in this embodiment, the computation of the cell body 211 of the j-th neuron 21 of the hidden layer 2 is illustrated by [Equation 2]:

$y_j = \sum_{i=1}^{m} (x_i - w_{ij})^2 + b$ ………………[Equation 2]

In this embodiment, the cell body 211 of the j-th neuron 21 of the hidden layer 2 thus squares the difference between each input value x_i and its corresponding connection weight w_ij and sums the results. The term b in [Equation 2] is the bias of the neuron, which in this embodiment may be set to zero, without limitation. The output value y_j of the neuron 21 is passed to the L neurons 31 of the output layer 3, each of which computes a score. The artificial neural network system 100 obtains its classification result from the scores of the neurons 31 of the output layer 3.

It is worth noting that the above expression is only an example, without limitation: the squared term may instead be raised to the power p, as |x_i - w_ij|^p, and the p-th root may be taken after summation. In still other embodiments, the nonlinear function applied by the cell body of each second neuron may be a closed-curve function such as a circle, an ellipse, or a polygon.

When the neural network system 100 is used to solve a problem such as pattern recognition, a number of raw training samples are required; these may be labeled with classes in advance, without limitation. In step S1 of FIG. 3, the system may first be initialized so that the aforementioned connection weights w_ij start between 0 and 1, which helps convergence.

In step S2, the raw training data may first be normalized so that the input values x_i fed to the first neurons 11 of the input layer 1 lie between 0 and 1, which helps training.

In step S3, the input layer 1 of the neural network system 100 receives the input data x_i.

In step S4, the cell body 211 of every neuron 21 of the hidden layer 2 of the neural network system 100 applies the nonlinear transformation to the received input data x_i together with the connection weights w_ij of the corresponding synapses 212, and the transformed data is passed through the axon 213 toward the output layer 3. It is worth noting that in other embodiments the transformed data may additionally be passed through an activation function at the axon 213 for a further nonlinear transformation, for example a Log-Sigmoid or Tan-Sigmoid function, without limitation.

In step S5, each neuron 31 of the output layer 3 of the neural network system 100 computes a score from the transformed data, and together the scores of the neurons 31 represent a recognition result. For example, if the output layer 3 has a first neuron (representing A) and a second neuron (representing B) whose computed scores are 10% and 90% respectively, the recognition result is B. This result is compared against the class label of the input sample to produce a correct-or-incorrect count, and once all training samples have been processed an accuracy rate is obtained. Training is considered complete after the above flow has run a preset number of times (for example 500, without limitation).
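The scoring and accuracy bookkeeping of step S5 reduces to an argmax over the output scores plus a match count; a minimal sketch with hypothetical names:

```python
def classify(scores, labels):
    """Return the label whose output neuron produced the highest score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]

def accuracy(predictions, ground_truth):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(1 for p, t in zip(predictions, ground_truth) if p == t)
    return correct / len(ground_truth)

# The example from the text: scores of 10% for A and 90% for B give result B.
result = classify([0.10, 0.90], ["A", "B"])
```

The same two helpers cover both phases described here: `classify` for each sample during inference, `accuracy` over the whole epoch to decide whether training has converged.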

In the following, the artificial neural network system 100 of this embodiment is trained on several existing datasets to verify its accuracy, and is compared against an ordinary artificial neural network that uses affine functions. In the verification experiments below, the network using affine functions (hereinafter the affine model) and the network using the nonlinear function exemplified above (hereinafter the model of the invention) have the same number of hidden layers, and each layer has the same number of neurons; the two differ only in the function applied by the hidden-layer neuron cell bodies.

First, consider the example of FIG. 5: the task is to separate two color distributions in a two-dimensional image, with training data as shown in the leftmost panel. In both models, the hidden layer 2 has two neurons 21 and the output layer 3 has two neurons 31 (to distinguish the two colors). When the training data is fed to the affine model, the result, shown in the middle panel of FIG. 5, is only a rough division of the color distribution. When it is fed to the model of the invention, the result, shown in the rightmost panel of FIG. 5, is a considerably more accurate boundary separating the colors.

Next, consider the well-known MNIST dataset shown in FIG. 6, which contains 70,000 28×28-pixel grayscale images of handwritten digits in ten classes, 0 through 9. In this experiment, 60,000 samples are used for training and 10,000 for testing. In both models, the input layer 1 has 784 neurons 11 to receive the grayscale value of each pixel; there are two hidden layers, the first with twenty neurons 21 and the second with five; and the output layer 3 has ten neurons 31 (one per digit). Referring to FIG. 7, the affine model reaches high accuracy on the 60,000 training samples faster than the model of the invention, yet its prediction accuracy on the 10,000 test samples is lower than that of the trained model of the invention. The reason is that the affine model is linear and adapts poorly to the nonlinear situations encountered in practice (in other words, it overfits). The present invention uses a nonlinear computation; although its learning curve is not as steep as the affine model's, it adapts well to nonlinear problems.

Finally, consider the well-known CIFAR-10 dataset shown in FIG. 8, which consists of 60,000 32×32 RGB color images divided into ten object classes. In this experiment, 50,000 samples are used for training and 10,000 for testing. In both models, the input layer 1 has 3072 neurons 11 to receive the RGB values of each pixel; there are three hidden layers, with twenty, forty, and one hundred neurons 21 respectively; and the output layer 3 has ten neurons 31 (one per object class). Referring to FIG. 9, the affine model quickly reaches 80% accuracy on the 50,000 training samples, but its prediction accuracy on the 10,000 test samples drops to nearly 40%. The model of the invention, by contrast, reaches only 50% accuracy with the same number of training passes, yet it also maintains 50% accuracy on the test data, higher than the affine model's 40%. This again confirms that the nonlinear computation of the present invention, though slower to learn, adapts to nonlinear problems better than the traditional affine model.

In summary, the present invention uses a nonlinear function in the cell bodies 211 of the neurons 21 of the hidden layer 2 of the neural network system 100, remedying the overfitting produced by the traditional affine function and bringing test results close to training results, so the object of the invention is indeed achieved.

The foregoing is merely an embodiment of the present invention and shall not limit its scope; any simple equivalent change or modification made according to the claims and the specification of the present invention remains within the scope of the patent.

100: neural network system
1: input layer
11: neuron
2: hidden layer
21: neuron
211: cell body
212: synapse
213: axon
3: output layer
31: neuron
S1~S5: steps

Other features and effects of the present invention will become clear in the embodiments described with reference to the drawings, in which:
FIG. 1 is a block diagram of the hardware used to implement an embodiment of the artificial neural network system of the present invention;
FIG. 2 is an architecture diagram of an embodiment of the artificial neural network system of the present invention;
FIG. 3 is a flow chart of the operation of an embodiment of the artificial neural network system of the present invention;
FIG. 4 is a schematic diagram of a neuron of a hidden layer of the artificial neural network system of the present invention;
FIG. 5 is a comparison table of the learning results of the model of the present invention and a conventional affine model on a set of training data;
FIG. 6 is a schematic diagram of the existing MNIST dataset;
FIG. 7 is a schematic diagram of the training and test accuracy of the model of the present invention and a conventional affine model;
FIG. 8 is a schematic diagram of the existing CIFAR-10 dataset; and
FIG. 9 is a schematic diagram of the training and test accuracy of the model of the present invention and a conventional affine model.

S1~S5: operation steps of the artificial neural network system

Claims (8)

1. An artificial neural network system, comprising: an input layer including a plurality of first neurons for receiving a set of input data; at least one hidden layer including a plurality of second neurons; and an output layer including a plurality of third neurons; the input layer, the at least one hidden layer, and the output layer being connected in sequence, each second neuron including a cell body, a plurality of synapses extending forward from the cell body and receiving at least part of the input data from the neurons of the previous layer, and an axon extending rearward from the cell body, each synapse having a connection weight, the cell body applying a nonlinear function to the received input data together with the connection weights of the corresponding synapses, the transformed data being passed to the next layer through the axon, wherein the nonlinear function applied by the cell body of each second neuron is a norm function, that is, a norm computed over the input data and their corresponding connection weights.

2. The artificial neural network system of claim 1, wherein the nonlinear function applied by the cell body of each second neuron is a closed-curve function.

3. The artificial neural network system of claim 1, wherein the set of input data received by the first neurons is normalized in advance so that all input values lie between 0 and 1.
4. The artificial neural network system of claim 1, wherein the connection weights of the synapses of each second neuron are preset to values between 0 and 1.

5. The artificial neural network system of claim 1, wherein, after the cell body has applied the nonlinear transformation to the received input data, the axon applies a further nonlinear transformation to the transformed data through an activation function.

6. A method of applying an artificial neural network system, comprising: receiving a set of input data through a plurality of first neurons of an input layer; applying a nonlinear transformation to the received input data through a hidden layer, the hidden layer including a plurality of second neurons, each second neuron including a cell body, a plurality of synapses extending forward from the cell body and receiving the input data from the previous layer, and an axon extending rearward from the cell body, each synapse having a connection weight, the cell body applying a nonlinear function to the received input data together with the connection weights of the corresponding synapses, wherein the cell body of each second neuron outputs the result of applying a norm function to the input data and their corresponding connection weights; and receiving the transformed data from the axons of the second neurons of the hidden layer through a plurality of third neurons of an output layer, which, after processing, output a result.
7. The method of claim 6, wherein the cell body of each second neuron outputs the result of applying a closed-curve function to the input data and their corresponding connection weights.

8. The method of claim 6, further comprising normalizing the set of input data before the plurality of first neurons of the input layer receive it, so that all input values lie between 0 and 1.
TW110118865A 2021-05-25 2021-05-25 Artificial neural network system using non-affine transformation technology for neuron cell body and method for applying the same TWI763503B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW110118865A TWI763503B (en) 2021-05-25 2021-05-25 Artificial neural network system using non-affine transformation technology for neuron cell body and method for applying the same

Publications (2)

Publication Number Publication Date
TWI763503B true TWI763503B (en) 2022-05-01
TW202247047A TW202247047A (en) 2022-12-01

Family

ID=82593933

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110118865A TWI763503B (en) 2021-05-25 2021-05-25 Artificial neural network system using non-affine transformation technology for neuron cell body and method for applying the same

Country Status (1)

Country Link
TW (1) TWI763503B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190188569A1 (en) * 2017-12-15 2019-06-20 Nvidia Corporation Parallel Forward and Backward Propagation
TW202105259A (en) * 2019-06-26 2021-02-01 美商美光科技公司 Stacked artificial neural networks

Also Published As

Publication number Publication date
TW202247047A (en) 2022-12-01

Similar Documents

Publication Publication Date Title
CN111476219B (en) Image target detection method in intelligent home environment
CN108182427B (en) A face recognition method based on deep learning model and transfer learning
CN106599797B (en) A kind of infrared face recognition method based on local parallel neural network
CN114898151A (en) An Image Classification Method Based on Fusion of Deep Learning and Support Vector Machines
CN109508634B (en) Ship type identification method and system based on transfer learning
Fu et al. An ensemble unsupervised spiking neural network for objective recognition
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN108615010A (en) Facial expression recognizing method based on the fusion of parallel convolutional neural networks characteristic pattern
CN110866530A (en) Character image recognition method and device and electronic equipment
CN106529570B (en) Image Classification Method Based on Deep Ridgelet Neural Network
CN106408562A (en) Fundus image retinal vessel segmentation method and system based on deep learning
CN104834941A (en) Offline handwriting recognition method of sparse autoencoder based on computer input
CN108985442B (en) Handwriting model training method, handwritten character recognition method, device, equipment and medium
CN114170659A (en) A Facial Emotion Recognition Method Based on Attention Mechanism
CN107491729B (en) Handwritten digit recognition method based on cosine similarity activated convolutional neural network
Kotwal et al. Yolov5-based convolutional feature attention neural network for plant disease classification
CN111767860A (en) A method and terminal for realizing image recognition through convolutional neural network
CN106980831A (en) Based on self-encoding encoder from affiliation recognition methods
CN107818299A (en) Face recognition algorithms based on fusion HOG features and depth belief network
CN113011243A (en) Facial expression analysis method based on capsule network
Li et al. Adaptive dropout method based on biological principles
Gunawan et al. Classification of rice leaf diseases using artificial neural network
Tomar et al. A comparative analysis of activation function, evaluating their accuracy and efficiency when applied to miscellaneous datasets
Rethik et al. Attention based mapping for plants leaf to classify diseases using vision transformer
CN119830954B (en) Logit cross correction-based multidimensional knowledge distillation method