
TW202121257A - Parameter iteration method of artificial intelligence training - Google Patents


Info

Publication number
TW202121257A
TW202121257A (application TW108143027A)
Authority
TW
Taiwan
Prior art keywords
parameter
value
training
learning
range
Prior art date
Application number
TW108143027A
Other languages
Chinese (zh)
Other versions
TWI752380B (en
Inventor
張漢威
Original Assignee
張漢威
Priority date
Filing date
Publication date
Application filed by 張漢威 filed Critical 張漢威
Priority to TW108143027A priority Critical patent/TWI752380B/en
Publication of TW202121257A publication Critical patent/TW202121257A/en
Application granted granted Critical
Publication of TWI752380B publication Critical patent/TWI752380B/en

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Image Analysis (AREA)

Abstract

A parameter iteration method of artificial intelligence training includes: a setting step, providing a training set and setting a numerical range for the training parameters; an initializing step, selecting three initial setting values from the numerical range of the training parameters, calculating an accuracy value for each initial setting value according to the training set, setting the initial setting value with the highest accuracy value as a first core value, and setting a first parameter range based on the first core value; a parameter optimization step, selecting three first iteration values within the first parameter range, calculating the accuracy value of each first iteration value, comparing those accuracy values, setting the value with the highest accuracy value as a second core value, and setting a second parameter range based on the second core value; and a determining step: if the accuracy value of the second core value is higher than 0.9, the second core value is set as the training parameter standard value; if it is not higher than 0.9, the parameter optimization step is repeated until the accuracy of a test core value is higher than 0.9, and that test core value is then set as the training parameter standard value.

Description

Artificial intelligence training parameter iteration method

This application relates to the field of artificial intelligence, and in particular to an artificial intelligence training parameter iteration method.

With advances in technology, artificial intelligence is being applied ever more widely; its use can be seen, for example, in defect detection, face recognition, and medical diagnosis. Before an artificial intelligence model actually enters practical use, it usually must be trained on data, for example with neural networks or with algorithms such as the convolutional neural network (CNN).

At present, CNN-based deep learning is the most common training approach for image recognition. Because the training parameters are usually set by random numbers and the amount of input data is huge, the computation may generate a large volume of intermediate data, placing a heavy burden on memory and computing resources and resulting in long training times and poor efficiency.

Here, an artificial intelligence training parameter iteration method is provided. The method comprises a setting step, an initialization step, a parameter optimization step, and a judgment step. The setting step provides a training set and sets the numerical range of at least two training parameters. The initialization step randomly selects at least three initial setting values from the numerical range of the training parameters, calculates the accuracy of each initial setting value according to the training set, takes the one with the highest accuracy as the first core value, and sets a first parameter range centered on the parameter coordinates of the first core value. The parameter optimization step selects at least three first iteration values within the first parameter range, calculates the accuracy of each according to the training set, compares them, takes the one with the highest accuracy as the second core value, and sets a second parameter range centered on the parameter coordinates of the second core value. The judgment step determines whether the accuracy of the second core value is higher than 0.9; if so, the process stops and the second core value is taken as the training parameter standard value. If not, the second core value and second parameter range replace the first core value and first parameter range, and the parameter optimization step is repeated until the accuracy of a test core value is higher than 0.9, whereupon the parameter coordinates of that test core value are taken as the training parameter standard value.
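The four steps just described can be sketched in code. This is a minimal illustration under stated assumptions: the local search region is a box rather than the circle/sphere of the disclosure, and the incumbent core value is kept among the candidates (the text does not say whether it is retained) so that accuracy never decreases between iterations.

```python
import math
import random

def iterate_parameters(bounds, radius, acc_fn, threshold=0.9, k=3,
                       max_iters=2000):
    """Setting/initialization/optimization/judgment loop (sketch).

    bounds  -- per-parameter (low, high) ranges from the setting step
    radius  -- half-width of the local search region around the core
    acc_fn  -- returns the training accuracy for a parameter tuple
    """
    rng = random.Random(0)
    # Initialization step: k random candidates inside the global bounds.
    candidates = [tuple(rng.uniform(lo, hi) for lo, hi in bounds)
                  for _ in range(k)]
    core = max(candidates, key=acc_fn)  # first core value
    # Parameter optimization + judgment steps, repeated until the
    # accuracy exceeds the threshold (0.9 in the disclosure).
    for _ in range(max_iters):
        if acc_fn(core) > threshold:
            break
        # New iteration values near the current core; the incumbent is
        # kept in the pool (an assumption, for monotone progress).
        candidates = [tuple(rng.uniform(c - radius, c + radius) for c in core)
                      for _ in range(k)] + [core]
        core = max(candidates, key=acc_fn)
    return core  # training parameter standard value

# Toy accuracy surface peaking at batch-size scale 1.0, learning rate 1.0.
toy_acc = lambda p: max(0.0, 1.0 - 0.5 * math.dist(p, (1.0, 1.0)))
best = iterate_parameters([(0.5, 1.5), (0.5, 1.5)], 0.2, toy_acc)
```

The toy accuracy function stands in for an actual training run; in practice each call to `acc_fn` is a full training pass, which is why the method tries to keep the number of candidates per round small.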

In some embodiments, the at least two parameters include a batch size and a learning rate; the batch size ranges from 0.5 to 1.5 and the learning rate ranges from 0.5 to 1.5, and the first parameter range is a circle centered on the first core value in a coordinate system whose horizontal and vertical axes are the batch size and the learning rate, respectively. In more detail, in some embodiments, the batch size ranges from 0.7 to 1.3 and the learning rate ranges from 0.7 to 1.3.

In more detail, in some embodiments, in the initialization step two graphics processors respectively perform the selection of the batch size and learning rate of the initial setting values and the calculation of the accuracy.

In some embodiments, the at least two parameters further include a momentum, ranging from 0 to 1. Here the first parameter range is a sphere centered on the first core value in a coordinate system whose x-, y-, and z-axes are the batch size, the learning rate, and the momentum, respectively. In more detail, the momentum ranges from 0.3 to 0.8.

In some embodiments, the at least two parameters further include a regularization term, ranging from 0.00001 to 0.001, and the first parameter range is a region centered on the first core value in a coordinate system whose x-, y-, z-, and w-axes are the batch size, learning rate, momentum, and regularization, respectively. In more detail, in some embodiments, the regularization ranges from 0.0001 to 0.0005.

In some embodiments, in the initialization step, any two of the batch size, learning rate, momentum, and regularization are selected by a first graphics processor, the other two are selected by a second graphics processor, and a third graphics processor calculates the accuracy.

In some embodiments, the artificial intelligence training parameter iteration method further comprises a verification step: a test set is provided, and the accuracy of the second core value or test core value that exceeded 0.9 in the judgment step is recalculated on the test set; if the accuracy calculated on the test set is below 0.9, the initialization step is performed again.

In summary, the artificial intelligence training parameter iteration method accelerates the selection of parameter values and speeds up the overall training process, achieving rapid gradient descent with fewer files and in less time, thereby shortening training and greatly improving its efficiency.

Figure 1 is a flowchart of the artificial intelligence training parameter iteration method. As shown in Figure 1, the artificial intelligence training parameter iteration method S1 comprises a setting step S10, an initialization step S20, a parameter optimization step S30, and a judgment step S40.

The setting step S10 provides a training set and sets the numerical range of at least two training parameters. Here, artificial intelligence training is performed with a convolutional neural network (CNN), and a stochastic gradient descent model serves as the training model, which can be written as equation (1) below. This is only an example, not a limitation; the training here further includes computations such as the loss calculation.

Equation (1), reconstructed in standard mini-batch form from the symbols defined below (the original shows the formula only as an image):

    w → w − (η/n) Σⱼ ∇Cⱼ(w)

where w is the weight, n is the batch size, η is the learning rate, and ∇Cⱼ(w) is the gradient of the loss for the j-th sample of the batch (the gradient term is inferred; it is not visible in the extracted text).

As equation (1) shows, the batch size and the learning rate directly affect the convergence of the weights. A large batch size reduces the number of batches and hence the training time; but a larger batch also requires more iterations, degrades model performance, and produces a larger volume of data. A learning rate that is too small means the weight updates are tiny and training becomes very slow, while a learning rate that is too large may prevent convergence. Therefore, besides providing an appropriate training set, the setting step S10 also sets the ranges of the parameters. For example, the set parameters include the batch size and the learning rate, with the batch size ranging from 0.5 to 1.5 and the learning rate from 0.5 to 1.5; preferably, the batch size ranges from 0.7 to 1.3 and the learning rate from 0.7 to 1.3. These values are only examples, not limitations, and the relevant parameters are not limited to the batch size and the learning rate.
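The roles of n and η in equation (1) can be made concrete with a minimal mini-batch SGD update. This is a sketch, not the patent's implementation; the per-sample gradient values below are arbitrary examples.

```python
import numpy as np

def sgd_step(w, grads, eta):
    """One update matching equation (1): the weight moves against the
    average of the per-sample gradients in the batch, scaled by the
    learning rate eta; n is the batch size (number of gradients)."""
    n = len(grads)
    return w - (eta / n) * np.sum(grads, axis=0)

w = np.array([1.0, -2.0])
grads = [np.array([0.2, -0.4]), np.array([0.4, 0.0])]  # batch of n = 2
w_new = sgd_step(w, grads, eta=0.5)  # step of (0.5/2) * [0.6, -0.4]
```

Doubling `eta` doubles the step length (risking non-convergence), while increasing `n` averages over more samples per step, which is exactly the trade-off the paragraph above describes.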

Figure 2 is a schematic diagram of the initialization step. As shown in Figure 2, the initialization step S20 randomly selects at least three initial setting values A1, A2, and A3 from the numerical range of the training parameters. In the following example, the batch size is the horizontal axis and the learning rate the vertical axis, and the coordinates of the three initial setting values A1, A2, and A3 on this coordinate system are (x1, y1), (x2, y2), and (x3, y3), respectively. This is merely for convenience of presentation on a plane; the actual parameters are not limited to two.

Next, the accuracy of each of the three initial setting values is calculated from the training set, the accuracy rate (ACC) being computed as 1 − loss. After calculation, the accuracies of the three are compared, and the most accurate of the at least three initial setting values A1, A2, A3 (assume A1) is taken as the first core value, and a first parameter range R1 is set centered on the parameter coordinates (x1, y1) of the first core value. Here, the first parameter range R1 is a circle centered on the first core value (x1, y1) in the coordinate system whose horizontal and vertical axes are the batch size and the learning rate. The radius of the circle can be given by a formula [shown in the original only as an image]. However, this is only an example; a specific radius value can also be pre-selected for the first parameter range R1.
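Selecting iteration values inside the circular range R1 can be sketched as uniform sampling over the disk. This is an assumption for illustration; the disclosure does not specify the sampling distribution.

```python
import math
import random

def sample_in_circle(center, radius, k=3, rng=random):
    """Draw k candidate (batch-size, learning-rate) pairs uniformly
    inside the circular parameter range.  The sqrt on the radial draw
    makes the samples uniform over area rather than clustered at the
    center."""
    points = []
    for _ in range(k):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        r = radius * math.sqrt(rng.uniform(0.0, 1.0))
        points.append((center[0] + r * math.cos(theta),
                       center[1] + r * math.sin(theta)))
    return points

# Three first iteration values inside R1 around a core at (1.0, 1.0).
pts = sample_in_circle((1.0, 1.0), 0.3)
```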

Figure 3 is a schematic diagram of the parameter optimization step. As shown in Figure 3, the parameter optimization step S30 selects at least three first iteration values B1, B2, B3 within the first parameter range R1, calculates the accuracy of each first iteration value B1, B2, B3 according to the training set, and compares their accuracies; the one with the highest accuracy (assume B3) becomes the second core value, and a second parameter range R2 is set centered on the parameter coordinates (x6, y6) of the second core value. The radius of this circle can likewise be given by a formula [shown in the original only as an image]. However, this is only an example; a specific radius value can also be pre-selected for the second parameter range R2.

The judgment step S40 determines whether the accuracy of the second core value is higher than 0.9. If so, the process stops and proceeds to step S45, in which the second core value (x6, y6) is selected as the training parameter standard value for training.

Figure 4 is a schematic diagram of the parameter optimization step. As shown in Figure 4, if the judgment step S40 finds that the accuracy of the second core value is not higher than 0.9, the second core value (x6, y6) and second parameter range R2 replace the first core value (x1, y1) and first parameter range R1, and the parameter optimization step S30 is repeated: at least three second iteration values C1, C2, C3 are selected, the accuracy of each is calculated from the training set and compared, the one with the highest accuracy (assume C1) becomes the third core value, and a third parameter range R3 is set centered on the parameter coordinates (x7, y7) of the third core value C1. The radius of this circle can again be given by a formula [shown in the original only as an image]. The judgment step S40 and parameter optimization step S30 can be repeated in this way until the accuracy of some test core value exceeds 0.9, and the parameter coordinates of that test core value are taken as the training parameter standard value.

Referring again to Figure 1, the artificial intelligence training parameter iteration method S1 further comprises a verification step S50. The verification step S50 provides a test set whose values all differ from those of the training set. In step S50, the second core value whose accuracy exceeded 0.9 in the judgment step S40, or the test core value obtained from subsequent repetitions of the parameter optimization step S30, is evaluated for accuracy on the test set. If the accuracy calculated on the test set exceeds 0.9, the second core value or test core value is taken as the training parameter standard value; if it is below 0.9, the parameter is discarded and the initialization step S20 is performed again.

In the foregoing embodiment, in the initialization step S20, two graphics processors respectively perform the selection of the batch size and learning rate of the initial setting values and the calculation of the accuracy. That is, the selection of the initial setting values A1, A2, A3 in Figure 2 and the calculation of their accuracies are carried out by two separate graphics processors. In this way the computational load is distributed across the graphics processors, achieving faster computation.

Figures 2 to 4 are only examples; parameters that also affect training efficiency include momentum and regularization. If three parameters are set, for example the batch size, learning rate, and momentum, the first parameter range is a sphere centered on the first core value in a coordinate system whose x-, y-, and z-axes are the batch size, learning rate, and momentum, respectively. With four parameters (batch size, learning rate, momentum, and regularization), the first parameter range is a region centered on the first core value in a coordinate system whose x-, y-, z-, and w-axes are those four parameters. Since three- and four-axis spaces are difficult to depict on a plane, the iterations are not illustrated here; a person of ordinary skill in the art can, from Figures 2 to 4, envision the transformation to three- and four-axis spaces.
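The generalization to three and four parameters amounts to sampling candidates inside a d-dimensional ball around the core value. A sketch under the assumption of uniform sampling over the ball's volume:

```python
import math
import random

def sample_in_ball(center, radius, k=3, rng=random):
    """Uniform samples inside a d-dimensional ball around the core value
    (d = number of training parameters: 2 gives the circle, 3 the sphere,
    4 the w-axis case).  Direction comes from a normalized Gaussian
    vector; scaling the radius by u**(1/d) keeps the draw uniform over
    volume rather than concentrated at the center."""
    d = len(center)
    points = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        r = radius * rng.uniform(0.0, 1.0) ** (1.0 / d)
        points.append(tuple(c + r * x / norm for c, x in zip(center, v)))
    return points

# Four-parameter case: (batch size, learning rate, momentum, regularization).
samples = sample_in_ball((1.0, 1.0, 0.5, 0.0003), 0.1)
```

In practice the per-axis scales differ greatly (regularization spans 0.00001 to 0.001 while the others span roughly 0.5 to 1.5), so one would normalize each axis before sampling; that normalization is not specified in the disclosure.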

In more detail, if momentum and regularization are considered, the momentum ranges from 0 to 1, preferably 0.3 to 0.8, and the regularization ranges from 0.00001 to 0.001, preferably 0.0001 to 0.0005.

Further, in the initialization step S20, when momentum and regularization are considered, any two of the batch size, learning rate, momentum, and regularization can be selected by a first graphics processor, the other two by a second graphics processor, and the accuracy calculated by a third graphics processor. Distributing the computation across three graphics processors in this way yields faster computation and a faster gradient descent.

Here a comparative example and an embodiment are compared, both using a standard training set provided by Google. The comparative example uses a standard AI framework with a Tesla 40m GPU (2880 CUDA cores, 12 GB) and the Google TensorFlow workflow; the embodiment uses four GTX 1050i GPUs (768 CUDA cores, 4 GB each) connected in series and applies the foregoing embodiment, with the batch size, learning rate, momentum, and regularization taken in the ranges 0.7 to 1.3, 0.7 to 1.3, 0.3 to 0.8, and 0.001 to 0.005, respectively. The comparative example reached 86% accuracy in 14,400 seconds; the embodiment reached 98% accuracy in 900 seconds.

In summary, the artificial intelligence training parameter iteration method of this application quickly optimizes the selection of parameters and rapidly achieves gradient descent convergence, helping to improve training efficiency, reduce computing resources, and complete training at a lower hardware cost.

Although the technical content of the present invention has been disclosed above in preferred embodiments, they are not intended to limit the present invention. Any person skilled in the art may make minor changes and refinements without departing from the spirit of the present invention, and such changes shall all fall within the scope of the present invention; the scope of protection of the present invention shall therefore be as defined by the appended claims.

S1: artificial intelligence training parameter iteration method
S10: setting step
S20: initialization step
S30: parameter optimization step
S40: judgment step
S45: selecting training parameters
S50: verification step
A1, A2, A3: initial setting values
B1, B2, B3: first iteration values
C1, C2, C3: second iteration values
R1: first parameter range
R2: second parameter range
R3: third parameter range

Figure 1 is a flowchart of the artificial intelligence training parameter iteration method. Figure 2 is a schematic diagram of the initialization step. Figures 3 and 4 are schematic diagrams of the parameter optimization step.

S1: artificial intelligence training parameter iteration method
S10: setting step
S20: initialization step
S30: parameter optimization step
S40: judgment step
S45: selecting training parameters
S50: verification step

Claims (10)

1. An artificial intelligence training parameter iteration method, comprising:
a setting step, providing a training set and setting a numerical range of at least two training parameters;
an initialization step, randomly selecting at least three initial setting values from the numerical range of the training parameters and calculating an accuracy of each initial setting value according to the training set, the one of the at least three initial setting values with the highest accuracy being a first core value, and setting a first parameter range centered on the parameter coordinates of the first core value;
a parameter optimization step, selecting at least three first iteration values within the first parameter range, calculating the accuracy of each first iteration value according to the training set, and comparing the accuracies of the at least three first iteration values, the one with the highest accuracy being a second core value, and setting a second parameter range centered on the parameter coordinates of the second core value; and
a judgment step, judging whether the accuracy of the second core value is higher than 0.9; if so, stopping and taking the second core value as a training parameter standard value; if not, replacing the first core value and the first parameter range with the second core value and the second parameter range and repeating the parameter optimization step until the accuracy of a test core value is higher than 0.9, the parameter coordinates of that test core value being the training parameter standard value.

2. The artificial intelligence training parameter iteration method of claim 1, wherein the at least two parameters comprise a batch size and a learning rate, the batch size ranging from 0.5 to 1.5 and the learning rate ranging from 0.5 to 1.5, and the first parameter range is a circle centered on the first core value in a coordinate system whose horizontal and vertical axes are the batch size and the learning rate, respectively.

3. The artificial intelligence training parameter iteration method of claim 2, wherein the batch size ranges from 0.7 to 1.3 and the learning rate ranges from 0.7 to 1.3.

4. The artificial intelligence training parameter iteration method of claim 3, wherein in the initialization step two graphics processors respectively perform the selection of the batch size and the learning rate of the initial setting values and the calculation of the accuracy.

5. The artificial intelligence training parameter iteration method of claim 2, wherein the at least two parameters further comprise a momentum ranging from 0 to 1, and the first parameter range is a sphere centered on the first core value in a coordinate system whose x-, y-, and z-axes are the batch size, the learning rate, and the momentum, respectively.

6. The artificial intelligence training parameter iteration method of claim 5, wherein the momentum ranges from 0.3 to 0.8.

7. The artificial intelligence training parameter iteration method of claim 5, wherein the at least two parameters further comprise a regularization ranging from 0.00001 to 0.001, and the first parameter range is a region centered on the first core value in a coordinate system whose x-, y-, z-, and w-axes are the batch size, the learning rate, the momentum, and the regularization, respectively.

8. The artificial intelligence training parameter iteration method of claim 7, wherein the regularization ranges from 0.0001 to 0.0005.

9. The artificial intelligence training parameter iteration method of claim 7, wherein in the initialization step any two of the batch size, the learning rate, the momentum, and the regularization are selected by a first graphics processor, the other two are selected by a second graphics processor, and the accuracy is calculated by a third graphics processor.

10. The artificial intelligence training parameter iteration method of claim 1, further comprising a verification step of providing a test set and calculating, on the test set, the accuracy of the second core value or the test core value whose accuracy exceeded 0.9 in the judgment step; if the accuracy calculated on the test set is below 0.9, the initialization step is performed again.
TW108143027A 2019-11-26 2019-11-26 Parameter iteration method of artificial intelligence training TWI752380B (en)


Publications (2)

TW202121257A, published 2021-06-01
TWI752380B, published 2022-01-11


Cited By (1)

CN112949813A, Artificial intelligence training parameter iteration method



