
TWI752380B - Parameter iteration method of artificial intelligence training - Google Patents

Parameter iteration method of artificial intelligence training

Info

Publication number
TWI752380B
Authority
TW
Taiwan
Prior art keywords
parameter
value
range
training
learning
Prior art date
Application number
TW108143027A
Other languages
Chinese (zh)
Other versions
TW202121257A (en)
Inventor
張漢威
Original Assignee
張漢威
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 張漢威 filed Critical 張漢威
Priority to TW108143027A priority Critical patent/TWI752380B/en
Publication of TW202121257A publication Critical patent/TW202121257A/en
Application granted granted Critical
Publication of TWI752380B publication Critical patent/TWI752380B/en

Landscapes

  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A parameter iteration method for artificial intelligence training includes: a setting step, which provides a training set and sets a numerical range for the training parameters; an initialization step, which selects three initial setting values from the numerical range of the training parameters, calculates an accuracy value for each initial setting value according to the training set, sets the initial setting value with the highest accuracy as a first core value, and sets a first parameter range based on the first core value; a parameter optimization step, which selects three first iteration values within the first parameter range, calculates the accuracy value of each, compares these accuracy values, sets the value with the highest accuracy as a second core value, and sets a second parameter range based on the second core value; and a determining step: if the accuracy value of the second core value is higher than 0.9, the second core value is set as the training parameter standard value; if not, the parameter optimization step is repeated until the accuracy of a test core value is higher than 0.9, and that test core value is then set as the training parameter standard value.

Description

Parameter iteration method of artificial intelligence training

The present application relates to the field of artificial intelligence, and in particular to a parameter iteration method for artificial intelligence training.

With the advance of technology, artificial intelligence is being applied ever more widely; its use can increasingly be seen in, for example, defect detection, face recognition, and medical diagnosis. Before an artificial intelligence system enters practical use, it typically must be trained on data, using algorithms such as artificial neural networks or convolutional neural networks (CNN).

At present, CNN-based deep learning is the most common training approach for image discrimination. Because the training parameters are usually set at random and the input data volume is huge, the computation can generate a large amount of intermediate data, placing a heavy burden on memory and computing resources and resulting in long training times and poor efficiency.

To address this, a parameter iteration method for artificial intelligence training is provided. The method comprises a setting step, an initialization step, a parameter optimization step, and a judgment step. The setting step provides a training set and sets numerical ranges for at least two training parameters. The initialization step randomly selects at least three initial setting values from the numerical ranges of the training parameters, calculates the accuracy of each initial setting value according to the training set, takes the initial setting value with the highest accuracy as a first core value, and sets a first parameter range centered on the parameter coordinates of the first core value. The parameter optimization step selects at least three first iteration values within the first parameter range, calculates the accuracy of each according to the training set, compares their accuracies, takes the one with the highest accuracy as a second core value, and sets a second parameter range centered on the parameter coordinates of the second core value. The judgment step checks whether the accuracy of the second core value is higher than 0.9; if so, the iteration stops and the second core value is taken as the training parameter standard value. If not, the second core value and the second parameter range replace the first core value and the first parameter range, and the parameter optimization step is repeated until the accuracy of a test core value exceeds 0.9, whereupon the parameter coordinates of that test core value are taken as the training parameter standard value.

In some embodiments, the at least two parameters include a batch learning quantity (batch size) and a learning rate; the batch learning quantity ranges from 0.5 to 1.5 and the learning rate from 0.5 to 1.5. The first parameter range is a circle centered on the first core value in a coordinate system whose horizontal and vertical axes are the batch learning quantity and the learning rate, respectively. More specifically, in some embodiments the batch learning quantity ranges from 0.7 to 1.3 and the learning rate from 0.7 to 1.3.

More specifically, in some embodiments, in the initialization step two graphics processors respectively perform the selection of the batch learning quantity and learning rate of the initial setting values and the calculation of the accuracy.

In some embodiments, the at least two parameters further include momentum, ranging from 0 to 1. Here the first parameter range is a sphere centered on the first core value in a coordinate system whose x, y, and z axes are the batch learning quantity, the learning rate, and the momentum, respectively. More specifically, the momentum ranges from 0.3 to 0.8.

In some embodiments, the at least two parameters further include normalization, ranging from 0.00001 to 0.001. Here the first parameter range is a physical quantity range centered on the first core value in a coordinate system whose x, y, z, and w axes are the batch learning quantity, the learning rate, the momentum, and the normalization, respectively. More specifically, in some embodiments the normalization ranges from 0.0001 to 0.0005.

In some embodiments, in the initialization step, any two of the batch learning quantity, learning rate, momentum, and normalization are selected by a first graphics processor, the other two are selected by a second graphics processor, and the accuracy is calculated by a third graphics processor.

In some embodiments, the method further includes a verification step: a test set is provided, and the accuracy of the second core value or test core value whose accuracy exceeded 0.9 in the judgment step is recalculated on the test set; if the accuracy computed on the test set is below 0.9, the initialization step is performed again.

In summary, the parameter iteration method accelerates the selection of parameter values and thus the overall artificial intelligence training process, achieving rapid gradient descent with fewer files and in less time, thereby shortening the training time and greatly improving training efficiency.

Fig. 1 is a flowchart of the parameter iteration method. As shown in Fig. 1, the parameter iteration method S1 includes a setting step S10, an initialization step S20, a parameter optimization step S30, and a judgment step S40.

The setting step S10 provides a training set and sets numerical ranges for at least two training parameters. Here, artificial intelligence training is performed with a convolutional neural network (CNN), and a stochastic gradient descent model is used for training, which can be expressed as equation (1) below. This is only an example, not a limitation; the training here also includes calculations such as the loss. Equation (1):

w ← w − (η / n) Σᵢ₌₁ⁿ ∇Lᵢ(w)

where w denotes the weights, n the batch size, and η the learning rate.

As equation (1) shows, the batch learning quantity and the learning rate directly affect the convergence of the weights. A large batch learning quantity reduces the number of batches and hence the training time, but it also requires more iterations, degrades model performance, and produces a larger amount of data. If the learning rate is too small, each weight update is tiny and training becomes very slow; if it is too large, training may fail to converge. Therefore, besides providing a suitable training set, the setting step S10 must also set the parameter ranges. For example, the set parameters include the batch learning quantity and the learning rate, each ranging from 0.5 to 1.5, and preferably from 0.7 to 1.3. The above is only an example, not a limitation, and the relevant parameters are not limited to the batch learning quantity and the learning rate.

Fig. 2 is a schematic diagram of the initialization step. As shown in Fig. 2, the initialization step S20 randomly selects at least three initial setting values A1, A2, A3 from the numerical ranges of the training parameters. In the following example, the batch learning quantity is the horizontal axis and the learning rate the vertical axis, and the coordinates of the three randomly selected initial setting values A1, A2, A3 on this coordinate system are (x1, y1), (x2, y2), and (x3, y3), respectively. This is only for convenience of presentation on a plane; the actual parameters are not limited to two.

Next, the accuracy of each of the three initial setting values is calculated from the training set; the accuracy rate (ACC) is computed as 1 − loss. After calculation, the three accuracies are compared, and the most accurate of the initial setting values A1, A2, A3 (assume A1) is taken as the first core value; the first parameter range R1 is then set, centered on the parameter coordinates (x1, y1) of the first core value. Here, R1 is a circle centered on the first core value (x1, y1) in the coordinate system whose horizontal and vertical axes are the batch learning quantity and the learning rate. The radius of the circle can be given by the expression of Figure 02_image003. However, this is only an example; a preselected fixed radius may also be used for the first parameter range R1.
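The initialization step S20 can be sketched as below. This is a hedged sketch: `toy_accuracy` is a hypothetical stand-in for training the CNN at each candidate point and reporting 1 − loss, and the function names are illustrative, not from the patent.

```python
import random

def initialization_step(param_ranges, accuracy_fn, k=3, rng=random):
    """Draw k random candidate points from the parameter ranges and return
    the one with the highest accuracy (the first core value)."""
    candidates = [tuple(rng.uniform(lo, hi) for lo, hi in param_ranges)
                  for _ in range(k)]
    return max(candidates, key=accuracy_fn)

# Hypothetical accuracy surface peaking at (1.0, 1.0); a real implementation
# would train the CNN at each point and compute 1 - loss.
def toy_accuracy(p):
    return 1.0 - 0.5 * ((p[0] - 1.0) ** 2 + (p[1] - 1.0) ** 2)

# Batch learning quantity and learning rate both range over 0.5 to 1.5.
core = initialization_step([(0.5, 1.5), (0.5, 1.5)], toy_accuracy)
```

The returned `core` plays the role of the first core value (x1, y1) around which the range R1 is drawn.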

Fig. 3 is a schematic diagram of the parameter optimization step. As shown in Fig. 3, the parameter optimization step S30 selects at least three first iteration values B1, B2, B3 within the first parameter range R1, calculates the accuracy of each according to the training set, and compares the accuracies of the at least three first iteration values; the one with the highest accuracy (assume B3) becomes the second core value, and the second parameter range R2 is set, centered on the parameter coordinates (x6, y6) of the second core value. Here, the radius of the circle can be given by the expression of Figure 02_image005. However, this is only an example; a preselected fixed radius may also be used for the second parameter range R2.
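Selecting the iteration values inside the circular parameter range can be sketched like this. One assumption of the sketch is uniform sampling over the disc; the patent only requires that at least three values lie within the range.

```python
import math
import random

def sample_in_circle(center, radius, k=3, rng=random):
    """Draw k iteration values uniformly from the circular parameter range
    around the current core value (sqrt of the uniform draw gives uniform
    density over the disc area)."""
    pts = []
    for _ in range(k):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        r = radius * math.sqrt(rng.uniform(0.0, 1.0))
        pts.append((center[0] + r * math.cos(theta),
                    center[1] + r * math.sin(theta)))
    return pts

# Three candidate iteration values around a core value at (1.0, 1.0).
pts = sample_in_circle((1.0, 1.0), 0.2)
```

Each sampled point would then be evaluated against the training set exactly as the initial setting values were.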

The judgment step S40 determines whether the accuracy of the second core value is higher than 0.9. If so, the iteration stops and the method proceeds to step S45, where the second core value (x6, y6) is selected as the training parameter standard value for training.

Fig. 4 is a schematic diagram of the parameter optimization step. As shown in Fig. 4, if the judgment step S40 finds that the accuracy of the second core value is not higher than 0.9, the second core value (x6, y6) and the second parameter range R2 replace the first core value (x1, y1) and the first parameter range R1, and the parameter optimization step S30 is repeated: at least three second iteration values C1, C2, C3 are selected, the accuracy of each is calculated according to the training set, and the accuracies of the at least three second iteration values C1, C2, C3 are compared; the one with the highest accuracy (assume C1) becomes the third core value, and the third parameter range R3 is set, centered on the parameter coordinates (x7, y7) of the third core value C1. Here, the radius of the circle can be given by the expression of Figure 02_image007. The judgment step S40 and the parameter optimization step S30 can be repeated in this way until the accuracy of some test core value exceeds 0.9, whereupon the parameter coordinates of that test core value are taken as the training parameter standard value.
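The repeated optimization and judgment steps can be sketched as one loop. Two assumptions here: `toy_acc` is a hypothetical accuracy surface standing in for CNN training, and this sketch recenters only on a candidate that improves accuracy, a small safeguard not spelled out in the text.

```python
import math
import random

def iterate_parameters(core, accuracy_fn, radius, threshold=0.9,
                       max_rounds=100, rng=random):
    """Repeat the optimization step: sample three points in the circle around
    the current core value, keep the best improvement, and stop once the
    accuracy exceeds the threshold (the 0.9 criterion of step S40)."""
    best_acc = accuracy_fn(core)
    for _ in range(max_rounds):
        if best_acc > threshold:
            break
        candidates = []
        for _ in range(3):
            theta = rng.uniform(0.0, 2.0 * math.pi)
            r = radius * math.sqrt(rng.uniform(0.0, 1.0))
            candidates.append((core[0] + r * math.cos(theta),
                               core[1] + r * math.sin(theta)))
        best = max(candidates, key=accuracy_fn)
        if accuracy_fn(best) > best_acc:  # recenter only on improvement
            core, best_acc = best, accuracy_fn(best)
    return core, best_acc

def toy_acc(p):  # hypothetical accuracy surface peaking at (1.0, 1.0)
    return 1.0 - 0.5 * ((p[0] - 1.0) ** 2 + (p[1] - 1.0) ** 2)

core, acc = iterate_parameters((0.6, 0.6), toy_acc, radius=0.2)
```

Because the center moves only when accuracy improves, the returned accuracy is never worse than that of the starting core value.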

Referring again to Fig. 1, the parameter iteration method S1 further includes a verification step S50. The verification step S50 provides a test set whose values differ from those of the training set. In step S50, the second core value whose accuracy exceeded 0.9 in the judgment step S40, or the test core value obtained by subsequently repeating the parameter optimization step S30, is re-evaluated on the test set. If the accuracy computed on the test set is above 0.9, the second core value or test core value is taken as the training parameter standard value; if it is below 0.9, the parameter is discarded and the initialization step S20 is performed again.
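The verification step amounts to a simple accept-or-reinitialize decision, which can be sketched as below. The test-accuracy function passed in is a hypothetical placeholder for evaluating the trained model on the held-out test set.

```python
def verification_step(core_value, test_accuracy_fn, threshold=0.9):
    """Verification step S50 sketch: re-evaluate the accepted core value on a
    held-out test set; below the threshold the parameters are discarded and
    the initialization step S20 must be rerun."""
    acc = test_accuracy_fn(core_value)
    return ("accept", acc) if acc > threshold else ("reinitialize", acc)

# Hypothetical test-set accuracy function, for illustration only.
status, acc = verification_step((1.0, 1.0), lambda p: 0.95)
```

A result of "reinitialize" corresponds to discarding the parameter and returning to step S20.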

In the foregoing embodiment, in the initialization step S20 two graphics processors separately perform the selection of the batch learning quantity and learning rate of the initial setting values and the calculation of the accuracy. That is, the selection of the initial setting values A1, A2, A3 in Fig. 2 and the calculation of their accuracies are carried out by two graphics processors, respectively. In this way the computational load is distributed across the graphics processors, achieving faster computation.

Figs. 2 to 4 are only examples; momentum and normalization also affect training efficiency in practice. If three parameters are set, for example the batch learning quantity, learning rate, and momentum, the first parameter range is a sphere centered on the first core value in a coordinate system whose x, y, and z axes are those three parameters, respectively. With four parameters (batch learning quantity, learning rate, momentum, and normalization), the first parameter range is a physical quantity range centered on the first core value in a coordinate system whose x, y, z, and w axes are those four parameters, respectively. Because the iterative effect in three- and four-axis spaces cannot easily be shown on a plane, these cases are not illustrated here, but one of ordinary skill in the art can extend Figs. 2 to 4 to the three- and four-axis cases.

In more detail, when momentum and normalization are considered, the momentum ranges from 0 to 1, preferably 0.3 to 0.8, and the normalization ranges from 0.00001 to 0.001, preferably 0.0001 to 0.0005.

Further, in the initialization step S20, when momentum and normalization are considered, any two of the batch learning quantity, learning rate, momentum, and normalization can be selected by a first graphics processor, the other two by a second graphics processor, and the accuracy calculated by a third graphics processor. Distributing the work across three graphics processors achieves faster computation and makes the gradient descend faster.

Here an embodiment is compared with a comparative example; both use the standard training set provided by Google. The comparative example uses a standard AI framework with a Tesla K40m GPU (2880 CUDA cores, 12 GB) in Google TensorFlow mode. The embodiment chains four GTX 1050 Ti GPUs (768 CUDA cores, 4 GB each) and applies the foregoing method, with the batch learning quantity, learning rate, momentum, and normalization ranging over 0.7 to 1.3, 0.7 to 1.3, 0.3 to 0.8, and 0.001 to 0.005, respectively. The comparative example reaches 86% accuracy in 14,400 seconds; the embodiment reaches 98% accuracy in 900 seconds.

In summary, the parameter iteration method of the present application quickly optimizes the selection of parameters and rapidly achieves gradient descent convergence, which helps improve training efficiency, reduce computing resources, and complete training at a lower hardware cost.

Although the technical content of the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Minor changes and refinements made by anyone skilled in the art without departing from the spirit of the invention fall within the scope of the invention; the scope of protection is therefore defined by the appended claims.

S1 artificial intelligence training parameter iteration method
S10 setting step
S20 initialization step
S30 parameter optimization step
S40 judgment step
S45 selecting training parameters
S50 verification step
A1, A2, A3 initial setting values
B1, B2, B3 first iteration values
C1, C2, C3 second iteration values
R1 first parameter range
R2 second parameter range
R3 third parameter range

Fig. 1 is a flowchart of the artificial intelligence training parameter iteration method. Fig. 2 is a schematic diagram of the initialization step. Figs. 3 and 4 are schematic diagrams of the parameter optimization step.

S1 artificial intelligence training parameter iteration method
S10 setting step
S20 initialization step
S30 parameter optimization step
S40 judgment step
S45 selecting training parameters
S50 verification step

Claims (8)

1. An artificial intelligence training parameter iteration method, comprising: a setting step, providing a training set and setting a numerical range of at least two training parameters; an initialization step, randomly selecting at least three initial setting values from the numerical ranges of the training parameters, calculating an accuracy of each initial setting value according to the training set, taking the initial setting value with the highest accuracy as a first core value, and setting a first parameter range centered on the parameter coordinates of the first core value; a parameter optimization step, selecting at least three first iteration values within the first parameter range, calculating the accuracy of each first iteration value according to the training set, comparing the accuracies of the at least three first iteration values, taking the one with the highest accuracy as a second core value, and setting a second parameter range centered on the parameter coordinates of the second core value; and a judgment step, judging whether the accuracy of the second core value is higher than 0.9; if so, stopping and taking the second core value as a training parameter standard value; if not, replacing the first core value and the first parameter range with the second core value and the second parameter range and repeating the parameter optimization step until the accuracy of a test core value is higher than 0.9, then taking the parameter coordinates of the test core value as the training parameter standard value; wherein the at least two training parameters include a batch learning quantity ranging from 0.5 to 1.5 and a learning rate ranging from 0.5 to 1.5, the first parameter range is a circle centered on the first core value in a coordinate system whose horizontal and vertical axes are the batch learning quantity and the learning rate, respectively, and wherein the batch learning quantity ranges from 0.7 to 1.3 and the learning rate ranges from 0.7 to 1.3.

2. The artificial intelligence training parameter iteration method of claim 1, wherein in the initialization step two graphics processors respectively perform the selection of the batch learning quantity and learning rate of the initial setting values and the calculation of the accuracy.

3. The artificial intelligence training parameter iteration method of claim 1, wherein the at least two parameters further include a momentum ranging from 0 to 1, and the first parameter range is a sphere centered on the first core value in a coordinate system whose x, y, and z axes are the batch learning quantity, the learning rate, and the momentum, respectively.

4. The artificial intelligence training parameter iteration method of claim 3, wherein the momentum ranges from 0.3 to 0.8.

5. The artificial intelligence training parameter iteration method of claim 3, wherein the at least two parameters further include a normalization ranging from 0.00001 to 0.001, and the first parameter range is a physical quantity range centered on the first core value in a coordinate system whose x, y, z, and w axes are the batch learning quantity, the learning rate, the momentum, and the normalization, respectively.

6. The artificial intelligence training parameter iteration method of claim 5, wherein the normalization ranges from 0.0001 to 0.0005.

7. The artificial intelligence training parameter iteration method of claim 5, wherein in the initialization step any two of the batch learning quantity, the learning rate, the momentum, and the normalization are selected by a first graphics processor, the other two are selected by a second graphics processor, and the accuracy is calculated by a third graphics processor.

8. The artificial intelligence training parameter iteration method of claim 1, further comprising a verification step: providing a test set, and calculating the accuracy of the second core value or test core value whose accuracy exceeded 0.9 in the judgment step against the test set; if the accuracy calculated from the test set is below 0.9, performing the initialization step again.
TW108143027A 2019-11-26 2019-11-26 Parameter iteration method of artificial intelligence training TWI752380B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW108143027A TWI752380B (en) 2019-11-26 2019-11-26 Parameter iteration method of artificial intelligence training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW108143027A TWI752380B (en) 2019-11-26 2019-11-26 Parameter iteration method of artificial intelligence training

Publications (2)

Publication Number Publication Date
TW202121257A TW202121257A (en) 2021-06-01
TWI752380B true TWI752380B (en) 2022-01-11

Family

ID=77516755

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108143027A TWI752380B (en) 2019-11-26 2019-11-26 Parameter iteration method of artificial intelligence training

Country Status (1)

Country Link
TW (1) TWI752380B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949813B (en) * 2019-11-26 2024-12-20 张汉威 Artificial Intelligence Training Parameter Iteration Method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW200540674A (en) * 2004-06-03 2005-12-16 Univ Nat Cheng Kung Quality prognostics system and method for manufacturing processes
CN106446942A (en) * 2016-09-18 2017-02-22 兰州交通大学 Crop disease identification method based on incremental learning
CN108564235A (en) * 2018-07-13 2018-09-21 中南民族大学 A kind of improved FOA-BPNN exit times prediction technique
CN108830292A (en) * 2018-05-08 2018-11-16 西北大学 Data classification model optimization method and classification method
CN108921201A (en) * 2018-06-12 2018-11-30 河海大学 Recognition and Classification of Dam Defects Based on Feature Combination and CNN
CN109508488A (en) * 2018-11-07 2019-03-22 西北工业大学 Contour peening technological parameter prediction technique based on genetic algorithm optimization BP neural network

Also Published As

Publication number Publication date
TW202121257A (en) 2021-06-01

Similar Documents

Publication Publication Date Title
CN112819169B (en) Quantum control pulse generation method, device, equipment and storage medium
US11911702B2 (en) AI parameter configuration method and apparatus for racing AI model, AI parameter configuration device, and storage medium
CN108875511A (en) Method, apparatus, system and the computer storage medium that image generates
CN111026664B (en) ANN-based program detection method, detection system and application
CN107665282A (en) Crowd evacuation emulation method and system based on isomery emotional appeal model
TWI746095B (en) Classification model training using diverse training source and inference engine using same
CN113704098A (en) Deep learning fuzzy test method based on Monte Carlo search tree seed scheduling
EP3852014A1 (en) Method and apparatus for training learning model, and computing device
EP3933703B1 (en) Dynamic loading neural network inference at dram/on-bus sram/serial flash for power optimization
TWI752380B (en) Parameter iteration method of artificial intelligence training
CN111488527A (en) Position recommendation method and device, electronic equipment and computer-readable storage medium
KR20200080947A (en) Method for determining parameter sets of an artificial neural network
CN117933350A (en) Multi-agent reinforcement learning system, method, electronic device and storage medium
KR20210060146A (en) Method and apparatus for processing data using deep neural network model, method and apparatus for trining deep neural network model
WO2023123926A1 (en) Artificial intelligence task processing method and apparatus, electronic device, and readable storage medium
CN114880740A (en) Data-mechanics-rule driven structure support intelligent arrangement method and device
US20220383088A1 (en) Parameter iteration method for artificial intelligence training
CN107688527B (en) Defect display method and device
CN112949813B (en) Artificial Intelligence Training Parameter Iteration Method
CN107941210B (en) A star map recognition method combining neural network technology and triangle algorithm
CN113627538B (en) Method for training asymmetric generation of image generated by countermeasure network and electronic device
CN110765852A (en) Method and device for acquiring face direction in image
CN112085179B (en) A method to increase the amount of deep learning training data
JP7268754B2 (en) Evaluation method, evaluation program and information processing device
CN113642667A (en) Enhancement strategy determination method and device, electronic equipment and storage medium