TWI783239B - Method of optimizing image data - Google Patents
- Publication number
- TWI783239B (application number TW109121875A)
- Authority
- TW
- Taiwan
- Prior art keywords
- image
- ratio
- size
- computer
- method described
- Prior art date
Landscapes
- Image Analysis (AREA)
- Apparatus For Radiation Diagnosis (AREA)
- Image Processing (AREA)
Abstract
Description
The present invention relates to a method of optimizing image data and, more specifically, to a method of optimizing a training data set by setting a size threshold for moving objects.
Supervised learning in artificial intelligence requires collecting and labeling data. Within a large body of labeled data, not every sample benefits the training of a neural network, and the diversity of the data can even degrade detection performance. Because neural networks behave as black boxes, their training has traditionally ignored the characteristics of the data, and all available data were simply fed into training. How to select better data from a large data set, so that the trained model comes closer to the ideal, is therefore a goal for those skilled in the art.
In addition, objects occupy different proportions of the frame, so a data set collects samples of the same class at many different proportions. As a result, the size of detected objects fluctuates, and moving objects that are too small may not be detected at all.
One object of the present invention is to provide a method that sets a size threshold for moving objects according to the proportion of the image an object occupies (referred to herein as the ratio), thereby optimizing the deep learning training set, improving accuracy, and solving the past problem of failing to detect very small moving objects. Another object of the present invention is to provide an optimized training data set of moving objects that supplies the proportion each object should have in the image, which the neural network can use for reference and correction, addressing the difficulty of further improving the accuracy of current models.
Accordingly, the present invention provides a method of optimizing image data, comprising: labeling an object in an image; calculating the ratio of the object to the image; comparing the ratio with a preset threshold to produce a comparison result; and processing the image according to the comparison result.
In addition, after a large amount of training data has been acquired, the present invention uses the ratio of object to image to require that an object's size in the image exceed the threshold before it is treated as a usable sample. The filtered training data are then fed into the neural network for training, which improves model accuracy and meets the desired requirements.
Furthermore, images are divided into general images and fisheye images. After a large number of labeled images have been collected, general images are screened using the ratio of object to frame. For fisheye images, the center point and the largest circle are obtained, the radius of the largest circle is calculated and divided into four equal parts to define four concentric circles, and the ideal ratio of an object in the image is derived from the ideal size of each object class in each concentric circle and the object's distance from the center point.
S101, S102, S103a, S103b, S104, S105a, S105b, S106: steps
The technical content, objects, and effects of the present invention can be further understood from the detailed description and the accompanying drawings, which are as follows:
FIG. 1 is a schematic flowchart of the method of optimizing image data of the present invention.
FIG. 2 is a schematic diagram of an embodiment of the method of optimizing image data of the present invention, in which objects are labeled in a fisheye image.
FIG. 3 is a schematic diagram of an embodiment of the method of optimizing image data of the present invention, showing the effect of a fixed threshold.
FIG. 4 is a schematic diagram of an embodiment of the method of optimizing image data of the present invention, showing the effect of an adaptive threshold.
Referring to FIG. 1, in step S101 a labeling tool is used to label the moving objects in all images. The label information for a moving object includes its class and its (x1, y1) and (x2, y2) coordinates in the image; from this information the area covered by the object can be calculated. As shown in FIG. 2, the image contains multiple bounding boxes. The object in the blue bounding box left of center in the fisheye image belongs to the motorcycle class, while the objects in the other bounding boxes belong to the small-car class. Each bounding box has its own parameters, such as center coordinates and the width and height of the rectangle.
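The area calculation in step S101 can be sketched as follows. This is a minimal illustration, assuming that (x1, y1) and (x2, y2) are opposite corners of an axis-aligned bounding box in pixel units; the `Annotation` class and its field names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One labeled bounding box (hypothetical structure for illustration)."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # abs() makes the result independent of which corner is listed first.
        return abs(self.x2 - self.x1) * abs(self.y2 - self.y1)


box = Annotation("small_car", x1=100, y1=200, x2=160, y2=240)
print(box.area())  # 60 * 40 = 2400 square pixels
```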
In step S102, the image is classified as a general image or a fisheye image according to the lens used. Because cameras output images of different sizes (for example, 1080p and 720p), filtering cannot rely on the area covered by the object alone. As shown in FIG. 3, filtering with a fixed threshold either removes objects that are still identifiable at 720p or retains objects at 1080p that should have been removed. With the ratio method and a suitable threshold, as shown in FIG. 4, small objects are filtered out more effectively at any input image size, which optimizes the data set and yields better training results. The threshold is set as the preset object size in the image divided by the image size (that is, the preset object length × width divided by the image resolution).
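The difference between a fixed threshold and a ratio threshold can be illustrated with a short sketch. The object sizes, the 2000-pixel fixed threshold, and the 0.001 ratio threshold below are assumed values chosen for illustration, not figures from the patent; the function names are hypothetical.

```python
RESOLUTIONS = {"720p": (1280, 720), "1080p": (1920, 1080)}


def keep_by_fixed_area(obj_w, obj_h, min_area_px):
    """Fixed threshold: compare the raw pixel area with a constant."""
    return obj_w * obj_h >= min_area_px


def keep_by_ratio(obj_w, obj_h, img_w, img_h, threshold):
    """Ratio threshold: compare the fraction of the frame the object covers."""
    return (obj_w * obj_h) / (img_w * img_h) >= threshold


# The same scene object covers the same fraction of the frame in both videos:
# 30x40 px at 720p corresponds to 45x60 px at 1080p (scale factor 1.5).
objects = {"720p": (30, 40), "1080p": (45, 60)}

for name, (img_w, img_h) in RESOLUTIONS.items():
    w, h = objects[name]
    fixed = keep_by_fixed_area(w, h, min_area_px=2000)
    ratio = keep_by_ratio(w, h, img_w, img_h, threshold=0.001)
    print(name, fixed, ratio)
# 720p False True
# 1080p True True
```

With these assumed numbers the fixed threshold treats the same object differently at the two resolutions, while the ratio threshold gives the same decision for both.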
The ratio method is computed separately for general images and for distorted images (for example, fisheye images). Here a general image is one captured by an ordinary camera, and a distorted image is one captured through a fisheye lens; the image type can be determined manually, preset by the system, or detected automatically from the image's characteristics. In step S103b, for a general image, the ratio is the proportion of the whole image that the object occupies, calculated according to formula (1). In step S104, the calculated ratio is compared with the threshold. In one embodiment, thresholds from 0.0003 to 0.001 are applied to filter all objects in the data set: an object whose ratio is greater than the threshold is not filtered (step S105a), and an object whose ratio is smaller than the threshold is filtered (step S105b). The filtered data sets are then used for training, and the results are compared to determine which threshold performs best. Different thresholds can be set for different object classes, since pedestrians, motorcycles, small cars, and large vehicles all differ in length and width in the image. When input images of two resolutions are compared (such as 720p and 1080p), filtering with a fixed threshold removes identifiable objects at 720p while retaining objects at 1080p that should be filtered (as shown in FIG. 3).
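The threshold sweep described above can be sketched as follows. The patent gives only the range 0.0003 to 0.001; the step of 0.0001, the sample annotations, and the function name `filter_annotations` are assumptions for illustration. Training and evaluation on each filtered set are outside the scope of this sketch.

```python
def filter_annotations(annotations, img_w, img_h, class_thresholds, default):
    """Keep (label, width, height) boxes whose area ratio meets the class threshold."""
    kept = []
    for label, w, h in annotations:
        ratio = (w * h) / (img_w * img_h)
        if ratio >= class_thresholds.get(label, default):
            kept.append((label, w, h))
    return kept


# Hypothetical labeled boxes from a 1280x720 frame
annotations = [("pedestrian", 12, 30), ("small_car", 40, 25), ("motorcycle", 20, 30)]

# Candidate thresholds 0.0003, 0.0004, ..., 0.001 (assumed step of 0.0001)
candidates = [round(0.0003 + 0.0001 * i, 4) for i in range(8)]

# Sweep one shared threshold and count how many boxes survive each setting
survivors = [len(filter_annotations(annotations, 1280, 720, {}, default=t))
             for t in candidates]
print(survivors)  # [3, 2, 2, 2, 1, 1, 1, 1]

# Per-class thresholds: each class can use its own cut-off
per_class = {"pedestrian": 0.0003, "small_car": 0.001}
kept = filter_annotations(annotations, 1280, 720, per_class, default=0.0005)
print(len(kept))  # 3
```

Each candidate threshold yields one filtered data set; the embodiment then trains on each and compares the results to pick the best value.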
Each object's ratio is therefore checked against the threshold; if the ratio is below the threshold, the object is treated as background and the image is relabeled. For example, in FIG. 4, assuming a threshold of 0.0006, the object ratios in the 1080p image all exceed the threshold and the objects are retained. When the threshold is set according to the image proportion, assume the 720p pedestrian threshold in FIG. 4 is 0.001; the pedestrians in the upper 720p image all have length × width / (1280 × 720) greater than 0.0005, so the pedestrians in the yellow boxes keep their labels. Likewise, assume the 720p small-car threshold in FIG. 4 is 0.0005; the small cars in the lower 720p image also have length × width / (1280 × 720) greater than 0.0005, so the small cars in the yellow boxes keep their labels.
Formula (1) is as follows:
$$\text{threshold} = \frac{\text{ObjectSize}_{\text{preset}}}{\text{VideoSize}}, \qquad \varepsilon = \frac{\text{ObjectSize}_{\text{actual}}}{\text{VideoSize}}$$
$$\varepsilon \ge \text{threshold} \;\Rightarrow\; \text{not filtered (reserve object)}, \qquad \varepsilon < \text{threshold} \;\Rightarrow\; \text{filtered (reject object)}$$
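Formula (1) can be sketched in code as follows, assuming that object size means length × width in pixels and image size means the pixel resolution, as stated in the description of step S102. The function names and the 24 × 24 preset size are hypothetical.

```python
def size_ratio(obj_w, obj_h, img_w, img_h):
    """Object size divided by image size, both measured in pixels."""
    return (obj_w * obj_h) / (img_w * img_h)


def decide(ratio, threshold):
    """Reserve the object when its ratio is at least the threshold, else reject it."""
    return "reserve" if ratio >= threshold else "reject"


IMG_W, IMG_H = 1280, 720

# The threshold is derived from a preset (minimum acceptable) object size.
threshold = size_ratio(24, 24, IMG_W, IMG_H)  # assumed preset of 24x24 px

print(decide(size_ratio(24, 24, IMG_W, IMG_H), threshold))  # reserve (boundary case)
print(decide(size_ratio(20, 20, IMG_W, IMG_H), threshold))  # reject
print(decide(size_ratio(40, 25, IMG_W, IMG_H), threshold))  # reserve
```

Rejected objects are the ones the description treats as background when the image is relabeled.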
In step S103a, for a fisheye image, each class appears at a different proportion in the image, so the thresholds used also differ. The center point of the fisheye image and its largest circle are obtained from the coordinates, and the radius of the largest circle is calculated. The radius is divided into several equal parts to define multiple concentric circles, and the ideal ratio of an object in the image is derived from the ideal size of each object class in each concentric circle and the object's distance from the center point. Note that in a fisheye image an object's size changes with its distance from the center point: the closer the object is to the center, the larger it appears. The radius of a concentric circle is the distance from the center point, so the radius indicates the expected object size, and the expected size of each object class at each distance can be determined; the size calculated in this way is the ideal value. The data set is filtered using the ratios obtained at the different distances, and the results are compared to find the distance beyond which filtering objects gives the best result. As shown in step S105a, if the ratio is greater than or equal to the threshold, the object is not filtered; otherwise, as shown in step S105b, the object is filtered and treated as background.
Objects of different classes have different sizes in different concentric circles. Suppose there are two concentric circles, one at a distance of 10 (arb. unit) from the center point and the other at a distance of 15 (arb. unit). For the same object, the size measured at distance 10 is 150 and the size measured at distance 15 is 60. The ratio is distance / (object length × width); this ratio is used as the threshold setting to test which ideal ratio works best, and the measured ratio differs by class. For example, assume the small car in the red box has a size of 150, the intersection of the crosshairs is the center point, and the distance to the small car in the red box is 10, while the motorcycle in the blue box has a size of 60; the calculation then uses the ratio distance / (object length × width). As explained above, for a general image the threshold is set directly between 0.0003 and 0.001, whereas a fisheye image is handled ring by ring: for example, if the fisheye image is given five inner rings, thresholds from 0.0003 to 0.001 are tried in each ring to find which threshold gives the best result for each class.
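The ring-by-ring handling of a fisheye image can be sketched as below. This is one possible reading under stated assumptions: the ring is chosen by the distance of the box center from the image center, the compared quantity is the area ratio from formula (1), and the per-ring thresholds are invented values that loosen toward the rim. None of the names or numbers come from the patent.

```python
import math


def ring_index(center, point, max_radius, n_rings):
    """Return the 0-based ring containing `point` (0 is the innermost ring)."""
    distance = math.hypot(point[0] - center[0], point[1] - center[1])
    ring_width = max_radius / n_rings
    # Points on or beyond the largest circle are clamped into the outermost ring.
    return min(int(distance // ring_width), n_rings - 1)


def keep_in_fisheye(box_center, box_w, box_h, img_w, img_h,
                    center, max_radius, ring_thresholds):
    """Compare the box's area ratio with the threshold of the ring it sits in."""
    ring = ring_index(center, box_center, max_radius, len(ring_thresholds))
    ratio = (box_w * box_h) / (img_w * img_h)
    return ratio >= ring_thresholds[ring]


# Hypothetical thresholds for one class, from the innermost to the outermost ring
RING_THRESHOLDS = [0.001, 0.0008, 0.0005, 0.0003]
CENTER, MAX_RADIUS = (500, 500), 500  # assumed 1000x1000 fisheye frame

near = keep_in_fisheye((500, 560), 20, 20, 1000, 1000, CENTER, MAX_RADIUS, RING_THRESHOLDS)
far = keep_in_fisheye((900, 500), 20, 20, 1000, 1000, CENTER, MAX_RADIUS, RING_THRESHOLDS)
print(near, far)  # False True
```

The same 20 × 20 box is rejected near the center, where objects are expected to appear large, and kept near the rim, where they are expected to appear small.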
Table 1 shows the experimental results of one embodiment for general images. In Table 1, the first column is the threshold; the second through sixth columns are the average precision (AP) for large vehicles, small cars, motorcycles, bicycles, and pedestrians, respectively; and the last column is the mean of the AP values over all classes.
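The last column of such a table, and the choice of the best threshold, can be computed as sketched below. The AP numbers are placeholders invented for illustration and are not the values of Table 1; the function names are hypothetical.

```python
def mean_ap(ap_by_class):
    """Mean of the per-class average precision values."""
    return sum(ap_by_class.values()) / len(ap_by_class)


def best_threshold(results):
    """Return the threshold whose row has the highest mean AP."""
    return max(results, key=lambda t: mean_ap(results[t]))


# PLACEHOLDER values only; they do not come from the patent's Table 1.
results = {
    0.0003: {"large_vehicle": 0.50, "small_car": 0.60, "motorcycle": 0.40,
             "bicycle": 0.30, "pedestrian": 0.20},
    0.0006: {"large_vehicle": 0.50, "small_car": 0.70, "motorcycle": 0.40,
             "bicycle": 0.30, "pedestrian": 0.30},
}

for threshold, row in results.items():
    print(threshold, round(mean_ap(row), 3))
print("best:", best_threshold(results))
```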
Finally, in step S106, the filtered training data set is fed into the neural network for training.
The present invention can therefore optimize the training data set by filtering objects, with the object ratio calculated differently for general images and fisheye images. When boxes are drawn manually, each annotator defines object sizes differently; this method sets the size of each class to the most appropriate proportion and can be used with input images of any size, so that the data set is optimized, training is more effective, and accuracy improves.
The method of optimizing a training data set according to the ratio, as proposed by the present invention, has the following advantages:
1. Because this method filters moving objects in multimedia images by proportion to optimize the data set, it can be applied regardless of differences in the output size of the images or in the shape in which the images are presented.
2. When the optimized data set is fed into a neural network for training, accuracy that was previously difficult to improve shows a clear upward trend.
The above detailed description concerns one feasible embodiment of the present invention. The embodiment is not intended to limit the patent scope of the present invention, and any equivalent implementation or modification that does not depart from the technical spirit of the present invention shall fall within the patent scope of this case. In summary, this case is innovative in its technical concept and achieves the effects described above beyond those of conventional approaches, and it fully satisfies the statutory requirements of novelty and inventive step for an invention patent. This application is therefore filed in accordance with the law, and approval of this invention patent application is respectfully requested.
Claims (9)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW109121875A TWI783239B (en) | 2020-06-29 | 2020-06-29 | Method of optimizing image data |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW109121875A TWI783239B (en) | 2020-06-29 | 2020-06-29 | Method of optimizing image data |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TW202201341A | 2022-01-01 |
| TWI783239B | 2022-11-11 |
Family
ID=80787736
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW109121875A TWI783239B (en) | 2020-06-29 | 2020-06-29 | Method of optimizing image data |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TWI783239B (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TW201742444A (en) * | 2016-05-20 | 2017-12-01 | 高通公司 | Circular fisheye video in virtual reality |
| US20190304102A1 (en) * | 2018-03-30 | 2019-10-03 | Qualcomm Incorporated | Memory efficient blob based object classification in video analytics |
| TWI679612B (en) * | 2018-08-14 | 2019-12-11 | 國立交通大學 | Image tracking method |
| CN110738686A (en) * | 2019-10-12 | 2020-01-31 | 四川航天神坤科技有限公司 | Static and dynamic combined video man-vehicle detection method and system |
2020
- 2020-06-29 TW TW109121875A patent/TWI783239B/en active
Also Published As
| Publication number | Publication date |
|---|---|
| TW202201341A (en) | 2022-01-01 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN105608456B (en) | A kind of multi-direction Method for text detection based on full convolutional network | |
| JP5351673B2 (en) | Appearance inspection device, appearance inspection method | |
| CN106650740B (en) | A kind of licence plate recognition method and terminal | |
| CN113705460B (en) | Method, device, equipment and storage medium for detecting open and closed eyes of human face in images | |
| WO2021233266A1 (en) | Edge detection method and apparatus, and electronic device and storage medium | |
| CN109492609B (en) | Method for detecting lane line, vehicle and computing equipment | |
| CN105809121A (en) | Multi-characteristic synergic traffic sign detection and identification method | |
| CN111160373B (en) | Method for extracting, detecting and classifying defect image features of variable speed drum part | |
| CN101620060A (en) | Automatic detection method of particle size distribution | |
| CN107945200A (en) | Image binaryzation dividing method | |
| JP7393313B2 (en) | Defect classification device, defect classification method and program | |
| JP2015041164A (en) | Image processor, image processing method and program | |
| CN113537037A (en) | Pavement disease identification method, system, electronic device and storage medium | |
| CN117351001B (en) | A method for identifying surface defects of recycled aluminum alloy templates | |
| CN108549901A (en) | A kind of iteratively faster object detection method based on deep learning | |
| CN105740751A (en) | Object detection and identification method and system | |
| CN106815567A (en) | A kind of flame detecting method and device based on video | |
| TWI783239B (en) | Method of optimizing image data | |
| CN108932465A (en) | Reduce the method, apparatus and electronic equipment of Face datection false detection rate | |
| CN107066929A (en) | The manifold freeway tunnel Parking hierarchical identification method of one kind fusion | |
| TW202338732A (en) | Image restoration method and image restoration device | |
| CN110390224A (en) | Method and device for identifying traffic signs | |
| CN113378635A (en) | Target attribute boundary condition searching method and device of target detection model | |
| CN113485615A (en) | Method and system for making typical application intelligent image-text tutorial based on computer vision | |
| CN104166843B (en) | Document image source judgment method based on linear continuity |