TWI765339B - Stereoscopic Image Recognition and Matching System - Google Patents
- Publication number: TWI765339B
- Application number: TW109130809A
- Authority: TW (Taiwan)
Landscapes
- Image Analysis (AREA)
Abstract
A stereoscopic image recognition and matching system comprises: a first SIFT module, whose input is a left-eye visual image, for performing left-eye feature detection and description and outputting left-eye image feature points; a second SIFT module, whose input is a right-eye visual image, for performing right-eye feature detection and description and outputting right-eye image feature points; a coordinate calculation module, coupled to the left-eye and right-eye image feature points, for calculating and outputting the image coordinates of the left-eye and right-eye image feature points; and a stereo feature matching module, coupled to the first SIFT module, the coordinate calculation module, and the second SIFT module respectively, which performs matching according to the left-eye image feature points, the right-eye image feature points, and their image coordinates, and outputs the result.
Description
The present invention relates to a stereoscopic image recognition and matching system, and in particular to an image recognition system that implements the SIFT image recognition algorithm on an FPGA.
In recent years, owing to advances in visual sensors and the growing maturity of imaging technology, image recognition has become an indispensable part of computer vision. It is widely applied in military, industrial, and medical fields, for example in image stitching, object recognition, robotic mapping and navigation, 3D modeling, gesture recognition, and video tracking and match moving.
Image recognition mainly performs feature detection on captured images. Over the past decade many image feature recognition algorithms have been proposed; the best known is the Scale-Invariant Feature Transform (SIFT), presented by David G. Lowe at the 1999 International Conference on Computer Vision. The SIFT algorithm first detects feature points in an image and then assigns each feature point a distinct high-dimensional descriptor vector, so that images can be matched by comparing descriptors and pairing the most similar feature points. Notably, SIFT takes the orientation of each feature point into account, thereby solving the problem that Harris corner detection is not rotation-invariant. Although SIFT yields very good matching results under changes in scale and viewing-angle rotation, its drawback is an extremely heavy computational load, which makes the overall computation time-consuming and prevents real-time operation.
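As a software-level illustration of the descriptor matching step described above (a minimal sketch of our own, not the patent's hardware circuit), each feature point's high-dimensional descriptor is compared against all descriptors of the other image and the nearest one is taken; Lowe's ratio test, with an assumed threshold of 0.8, rejects ambiguous matches:

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour descriptor matching with Lowe's ratio test.

    desc_a, desc_b: (n, d) arrays of feature descriptors.
    Returns a list of (index_in_a, index_in_b) matches. The 0.8
    ratio threshold follows Lowe's paper and is an assumption here.
    """
    matches = []
    for i, d in enumerate(desc_a):
        dist = np.linalg.norm(desc_b - d, axis=1)   # Euclidean distances
        order = np.argsort(dist)
        best, second = order[0], order[1]
        if dist[best] < ratio * dist[second]:        # unambiguous match
            matches.append((i, int(best)))
    return matches
```

The same nearest-neighbour comparison, restricted by epipolar constraints, is what the stereo matching module later performs between left- and right-image feature points.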
A known prior patent, for example ROC patent TW201142718, "Scale-space normalization techniques for improved feature detection in uniform and non-uniform illumination changes," concerns methods and techniques for improving the performance and efficiency of image recognition systems. Its characterizing method comprises: generating a scale-space image difference by taking the difference between two differently smoothed versions of an image; generating a normalized scale-space image difference by dividing the scale-space image difference by a third smoothed version of the image, wherein the third smoothed version is as smooth as or smoother than the smoother of the two differently smoothed versions; and using the normalized scale-space image difference to detect one or more features of the image. However, this prior patent does not take the orientation of each feature point into account, so good matching results cannot be obtained under viewing-angle rotation.
In recent years, several studies have implemented the SIFT algorithm on FPGA processing platforms, mainly exploiting parallel processing to reduce computation time. In 2008, Vanderlei Bonato proposed a hardware/software co-design in which part of the SIFT algorithm is accelerated by hardware circuits on an FPGA. In 2014, Jianhui Wang proposed an embedded-system architecture for feature point detection and matching whose results showed a processing rate of 60 frames per second. Jie Jiang also proposed a fully hardware FPGA architecture implementing SIFT detection and matching.
Reviewing hardware implementations of the SIFT algorithm over the past five years shows that processing speed is often insufficient and hardware consumption excessive. Vourvoulakis [see J. Vourvoulakis, J. Kalomiros and J. Lygouras, "A complete processor for SIFT feature matching in video sequences," 2017 9th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), Bucharest, RO, pp. 95-100, 2017] proposed an FPGA-based pipelined architecture that processes 640×480 images at up to 70 fps; however, its hardware usage is too large, which limits future expansion on FPGAs. Yum [see J. Yum, C. Lee, J. Kim and H. Lee, "A Novel Hardware Architecture with Reduced Internal Memory for Real-Time Extraction of SIFT in an HD Video," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 26, no. 10, pp. 1943-1954, Oct. 2016] proposed an architecture that uses external memory to reduce internal register usage; because external memory suffers from bandwidth limitations, a bandwidth-reduction method was also proposed, and the design processes 1280×720 images at 36.85 fps. Yum [see J. Yum, C. Lee, J. Park, J. Kim and H. Lee, "A Hardware Architecture for the Affine-Invariant Extension of SIFT," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 11, pp. 3251-3261, Nov. 2018] proposed a raster-scan approach with external memory and a modified affine transform to reduce register-access time; the affine-transform module ultimately improved throughput by 325%, processing 640×480 images at 20 fps. Nevertheless, the processing speeds of both of Yum's architectures remain too low. Acharya [see K. A. Acharya, R. V. Babu, and S. S. Vadhiyar, "A real-time implementation of SIFT using GPU," Journal of Real-Time Image Processing, vol. 14, no. 2, pp. 267-277, 2018] implemented SIFT on a GPU, achieving 55 fps on 640×480 images, and successfully applied it to target detection and target tracking; because SIFT is computationally heavy, optimization combined with the GPU raised the speed by 12.2% compared with the non-GPU version, and a hardware implementation of such a system could be expected to yield a gain larger than 12.2%. Li [see S. Li, W. Wang, W. Pan, C. J. Hsu and C. Lu, "FPGA-Based Hardware Design for Scale-Invariant Feature Transform," in IEEE Access, vol. 6, pp. 43850-43864, 2018] proposed an FPGA-based SIFT architecture in which Gaussian kernels simulated in software are independently convolved with the original image to build the different Gaussian images; it uses the adjoint matrix in feature detection to replace dividers and thereby reduce hardware usage, and applies the CORDIC algorithm to compute the direction and gradient of each pixel when generating feature descriptors, processing 640×480 images at 150 fps. However, its hardware cost is too high and it has no hardware feature-matching architecture, so matching must still be performed in software; moreover, the architecture has no data-valid signal, so communication with external memory produces discontinuous data transfers, which causes computation errors and reduces the stability of the system running on the FPGA.
However, when the SIFT algorithm is implemented in a fully hardware FPGA architecture, it still requires exponential-function evaluation, floating-point arithmetic, and extensive use of divider logic gates, so image recognition consumes a great deal of computation time and real-time recognition cannot be achieved.
An object of the present invention is to provide a stereoscopic image recognition and matching system in which the image pyramid construction module, coupled to the image input module, uses software to find in advance a plurality of Gaussian template mask parameters of different scales, and then performs a plurality of convolution operations in parallel through a plurality of Gaussian filter modules, each convolution operating on the image data with one of the mask parameters to obtain a plurality of Gaussian images. This overcomes the hardware floating-point operations and heavy computational cost that arise in the prior art from evaluating exponential functions in Gaussian template computation, thereby effectively improving system performance.
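The idea of finding the mask parameters in software first can be sketched as follows: the exponential is evaluated once offline and the mask is quantized to integer coefficients, so the FPGA itself never computes exp() or floating point. The fixed-point width (`frac_bits = 8`) is an illustrative assumption, not a value taken from the patent:

```python
import numpy as np

def gaussian_kernel(size, sigma, frac_bits=8):
    """Build a size x size Gaussian mask and quantize it to fixed point.

    Only the resulting integer coefficients would be stored in
    hardware; the exponential below runs in software, offline.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    g = np.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
    g /= g.sum()                      # normalize so the mask sums to 1
    return np.round(g * (1 << frac_bits)).astype(np.int32)

# Precompute masks for several scales, as the pyramid module would.
masks = [gaussian_kernel(7, s) for s in (1.0, 1.4, 2.0)]
```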
To achieve the above object, the present invention provides a stereoscopic image recognition and matching system comprising: a first SIFT module, whose input is a left-eye visual image, for performing left-eye feature detection and description and outputting left-eye image feature points; a second SIFT module, whose input is a right-eye visual image, for performing right-eye feature detection and description and outputting right-eye image feature points; a coordinate calculation module, coupled to the left-eye and right-eye image feature points, for calculating and outputting the image coordinates of the left-eye and right-eye image feature points; and a stereo feature matching module, coupled to the first SIFT module, the coordinate calculation module, and the second SIFT module respectively, which performs matching according to the left-eye image feature points, the right-eye image feature points, and their image coordinates, and outputs the result.
Another object of the present invention is to provide a stereoscopic image recognition and matching system in which the first SIFT module further comprises: a first image pyramid construction module, coupled to the left-eye visual image, which uses software to find in advance a plurality of Gaussian template mask parameters of different scales and then performs a plurality of convolution operations in parallel through a plurality of Gaussian filter modules, each convolution operating on the image data with one of the mask parameters, to obtain a plurality of Gaussian images; pairs of Gaussian template mask parameters of different scales are then subtracted from each other, and a plurality of convolution operations are performed in parallel through a plurality of subtracted Gaussian filter modules, each convolution operating on the image data with one of the mask parameters, to obtain a plurality of difference-of-Gaussian images; a first feature detection module, coupled to the first image pyramid construction module, which performs extremum detection on the image data output by the first image pyramid construction module; a first feature descriptor module, coupled to the first image pyramid construction module, which computes a descriptor for each pixel, finds the orientation and gradient of each point from its surrounding points, and builds a 64-dimensional descriptor by accumulating the orientation gradients within a statistical range into a histogram; and a first selector, whose inputs are respectively coupled to the first feature detection module and the first feature descriptor module, for selecting one of them as output.
The second SIFT module further comprises: a second image pyramid construction module, coupled to the right-eye visual image, which uses software to find in advance a plurality of Gaussian template mask parameters of different scales and then performs a plurality of convolution operations in parallel through a plurality of Gaussian filter modules, each convolution operating on the image data with one of the mask parameters, to obtain a plurality of Gaussian images; pairs of Gaussian template mask parameters of different scales are then subtracted from each other, and a plurality of convolution operations are performed in parallel through a plurality of subtracted Gaussian filter modules, each convolution operating on the image data with one of the mask parameters, to obtain a plurality of difference-of-Gaussian images; a second feature detection module, coupled to the second image pyramid construction module, which performs extremum detection on the image data output by the second image pyramid construction module; a second feature descriptor module, coupled to the second image pyramid construction module, which computes a descriptor for each pixel, finds the orientation and gradient of each point from its surrounding points, and builds a 64-dimensional descriptor by accumulating the orientation gradients within a statistical range; and a second selector, whose inputs are respectively coupled to the second feature detection module and the second feature descriptor module, for selecting one of them as output.
The first image pyramid construction module further comprises: a first Gaussian image pyramid, the building of which first requires establishing spatial images of continuous scales, Gaussian-blurred images being obtained by convolving the initial image with the Gaussian templates; and a first difference-of-Gaussian pyramid, coupled to the first Gaussian image pyramid, in which pairs of Gaussian template mask parameters of different scales are subtracted from each other and then convolved with the original image to obtain a plurality of difference-of-Gaussian images.
The first feature detection module further comprises: a first extremum detection module, coupled to the first difference-of-Gaussian pyramid, which obtains high-pass images from the difference-of-Gaussian images produced by subtracting pairs of Gaussian templates and convolving the result with the original image, and uses the high-pass images to detect extremum features; a first high-contrast detection module, coupled to the first difference-of-Gaussian pyramid, which further comprises a first first-order partial differential matrix module, a first Hessian matrix module, a first adjoint matrix module, a first determinant calculation module, a first high-contrast feature detection module, and a first corner detection module, the output signals of the first first-order partial differential matrix module, the first adjoint matrix module, the first determinant calculation module, and the first high-contrast feature detection module being combined by an AND operation to compute high-contrast features; a first corner detection module, coupled to the first difference-of-Gaussian pyramid, into which the output of the first Hessian matrix module is fed to compute corner features; and a first AND gate, whose inputs are respectively coupled to the first extremum detection module, the first high-contrast detection module, and the first corner detection module, the feature points being obtained after the AND operation.
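As a rough illustration of how a Hessian-based corner test can avoid a divider (a sketch under our own assumptions, not the patent's exact circuit), the standard SIFT edge-rejection criterion tr(H)²/det(H) < (r+1)²/r can be cross-multiplied into a multiply-and-compare form, which is the kind of division-free check an adjoint-matrix/determinant datapath enables:

```python
def is_corner(dxx, dyy, dxy, r=10):
    """Divider-free edge rejection for a DoG extremum.

    The usual test tr(H)^2 / det(H) < (r+1)^2 / r is cross-multiplied
    so hardware only needs multipliers and a comparator. r = 10 is the
    threshold commonly used with SIFT and is an assumption here.
    """
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:                 # principal curvatures of opposite sign
        return False
    return r * tr * tr < (r + 1) ** 2 * det
```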
The first feature descriptor module further comprises: a first gradient calculation module, coupled to the first Gaussian image pyramid, for computing the gradient magnitude of each pixel; a first orientation calculation module, coupled to the first Gaussian image pyramid, for computing the orientation of each pixel; a first orientation-gradient range statistics module, coupled to the first gradient calculation module and the first orientation calculation module, for accumulating pixel orientations and gradients; and a first normalization module, coupled to the first orientation-gradient range statistics module, for normalizing the descriptor.
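A minimal software sketch of such a 64-dimensional descriptor is shown below. The 4×4 cell grid with 4 orientation bins per cell (4·4·4 = 64) is an assumed layout chosen to reach 64 dimensions; the patent does not specify this exact arrangement:

```python
import numpy as np

def descriptor_64(patch):
    """Build a 64-dimensional descriptor from a 16x16 patch.

    The patch is split into a 4x4 grid of 4x4-pixel cells; each cell
    accumulates a 4-bin histogram of gradient orientations weighted
    by gradient magnitude, and the concatenated histograms are
    normalized to unit length.
    """
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)          # 0 .. 2*pi
    bins = np.minimum((ang / (2 * np.pi) * 4).astype(int), 3)
    desc = np.zeros(64)
    for cy in range(4):
        for cx in range(4):
            cell = (slice(cy * 4, cy * 4 + 4), slice(cx * 4, cx * 4 + 4))
            for b in range(4):
                desc[(cy * 4 + cx) * 4 + b] = mag[cell][bins[cell] == b].sum()
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc
```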
The second image pyramid construction module further comprises: a second Gaussian blur pyramid, the building of which first requires establishing spatial images of continuous scales, Gaussian-blurred images being obtained by convolving the initial image with the Gaussian templates; and a second difference-of-Gaussian pyramid, coupled to the second Gaussian blur pyramid, in which pairs of Gaussian template mask parameters of different scales are subtracted from each other and then convolved with the original image to obtain a plurality of difference-of-Gaussian images.
The second feature detection module further comprises: a second extremum detection module, coupled to the second difference-of-Gaussian pyramid, which obtains high-pass images from the difference-of-Gaussian images produced by subtracting pairs of Gaussian templates and convolving the result with the original image, and uses the high-pass images to detect extremum features; a second high-contrast detection module, coupled to the second difference-of-Gaussian pyramid, which further comprises a second first-order partial differential matrix module, a second Hessian matrix module, a second adjoint matrix module, a second determinant calculation module, a second high-contrast detection module, and a second corner detection module, the output signals of the second first-order partial differential matrix module, the second adjoint matrix module, the second determinant calculation module, and the second high-contrast detection module being combined by an AND operation to compute high-contrast features; a second corner detection module, coupled to the second difference-of-Gaussian pyramid, into which the output of the second Hessian matrix module is fed to compute corner features; and a second AND gate, whose inputs are respectively coupled to the second extremum detection module, the second high-contrast detection module, and the second corner detection module, the feature points being obtained after the AND operation.
The second feature descriptor module further comprises: a second gradient calculation module, coupled to the second Gaussian blur pyramid, for computing the gradient magnitude of each pixel; a second orientation calculation module, coupled to the second Gaussian blur pyramid, for computing the orientation of each pixel; a second orientation-gradient range statistics module, coupled to the second gradient calculation module and the second orientation calculation module, for accumulating pixel orientations and gradients; and a second normalization module, coupled to the second orientation-gradient range statistics module, for normalizing the descriptor.
The stereo feature matching module further comprises: a serial-to-parallel memory, coupled to the first SIFT module and the second SIFT module respectively, which stores a plurality of left- and right-image feature points and feature-point coordinates sequentially into a plurality of registers and outputs the feature-point information of those registers simultaneously, achieving serial-in, parallel-out operation; a minimum-distance calculation module, coupled to the serial-to-parallel memory, which, when a right-image feature-point signal arrives, compares the y coordinates of the plurality of left-image feature points in the serial-to-parallel memory with the coordinate of the right-image feature point R to determine whether they are equal; a matching module, coupled to the minimum-distance calculation module, which searches for the minimum of the distances output by the minimum-distance calculation module and finally determines whether that minimum is too large, rejecting the match if it exceeds a threshold; and a depth calculation module, coupled to the matching module, which calculates the real-world distance between the feature point and the stereo vision camera, that is, performs depth calculation, mainly by relating the distance between the matched points on the images to the actual distance through similar triangles.
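The similar-triangles depth calculation mentioned above can be sketched as follows. It assumes a rectified stereo pair (matched points share the same y coordinate, as the matching module requires); the focal length and baseline values in the example are made-up illustrations, not parameters from the patent:

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth of a matched feature pair by similar triangles.

    focal_px: focal length in pixels; baseline_m: camera separation
    in metres. Depth Z = f * B / d, where d is the disparity.
    """
    disparity = x_left - x_right          # pixels; > 0 for points in front
    if disparity <= 0:
        raise ValueError("non-positive disparity")
    return focal_px * baseline_m / disparity   # metres

# Example: 700 px focal length, 12 cm baseline, 35 px disparity
z = depth_from_disparity(352, 317, focal_px=700, baseline_m=0.12)
# z = 700 * 0.12 / 35 = 2.4 metres
```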
To enable the examiners to further understand the structure, features, and objects of the present invention, drawings and a detailed description of preferred embodiments are provided below.
Please refer to FIG. 1, which is a schematic diagram of a stereoscopic image recognition and matching system according to a preferred embodiment of the present invention.
As shown in FIG. 1, the stereoscopic image recognition and matching system of the present invention comprises: a first SIFT module 100; a second SIFT module 200; a coordinate calculation module 300; and a stereo feature matching module 400.

The input of the first SIFT module 100 is a left-eye visual image; the module performs left-eye feature detection and description and outputs left-eye image feature points.

The input of the second SIFT module 200 is a right-eye visual image; the module performs right-eye feature detection and description and outputs right-eye image feature points.

The coordinate calculation module 300 is coupled to the left-eye and right-eye image feature points respectively, and calculates and outputs the image coordinates of the left-eye and right-eye image feature points.

The stereo feature matching module 400 is coupled to the first SIFT module 100, the coordinate calculation module 300, and the second SIFT module 200 respectively, and performs matching according to the left-eye image feature points, the right-eye image feature points, and their image coordinates, then outputs the result.
Please refer to FIG. 2 and FIG. 3 together. FIG. 2 is a detailed block diagram of the first SIFT module according to a preferred embodiment of the present invention; FIG. 3 is a detailed block diagram of the second SIFT module according to a preferred embodiment of the present invention.
如圖2所示,該第一SIFT模組100進一步包括:一第一影像金字塔建構模組110;一第一功能偵測模組120;一第一功能描述子模組130;以及一第一選擇器140。As shown in FIG. 2, the first SIFT module 100 further includes: a first image pyramid construction module 110; a first function detection module 120; a first function descriptor module 130; and a first selector 140.
其中,該第一影像金字塔建構模組110與該左眼視覺影像耦接,其進一步包含一第一高斯影像金字塔111及一第一差分影像金字塔112,該第一高斯影像金字塔111係預先以軟體找出複數個不同尺度之高斯模板遮罩參數,再透過複數個高斯濾波器模組(圖未示)平行進行複數個卷積運算,其中各所述卷積運算係依該影像資料與一所述遮罩參數進行,以獲得複數個高斯影像,之後,再將所述複數個高斯影像兩兩輸入至該第一差分影像金字塔112,進行高斯影像相減。The first image pyramid construction module 110 is coupled to the left-eye visual image and further includes a first Gaussian image pyramid 111 and a first difference-of-Gaussian image pyramid 112. For the first Gaussian image pyramid 111, a plurality of Gaussian template mask parameters of different scales are found in advance by software, and a plurality of convolution operations are then performed in parallel by a plurality of Gaussian filter modules (not shown), each convolution being performed on the image data with one of the mask parameters to obtain a plurality of Gaussian images; the Gaussian images are then fed pairwise into the first difference-of-Gaussian image pyramid 112, where adjacent Gaussian images are subtracted.
在影像金字塔的卷積運算中,本發明採用8-bit的7×7遮罩。為了將暫存器的數量減到最少,本發明使用49個8-bit的暫存器與6列RAM基礎的移位暫存器(RAM-Based shift register),它為Altera內建優化的元件,具有較少的硬體及較有效率的操作速度。由於本發明採用寬度例如但不限於為640像素(pixel)的影像輸入,因此每個RAM-Based shift register陣列中會有633個暫存器。In the convolution operations of the image pyramid, the present invention uses an 8-bit 7×7 mask. To minimize the number of registers, the present invention uses 49 8-bit registers and six RAM-based shift registers, which are optimized built-in Altera components offering less hardware and a more efficient operating speed. Since the present invention uses an image input whose width is, for example but not limited to, 640 pixels, each RAM-based shift register array contains 633 registers.
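As an illustrative software model of the register arrangement described above (not the actual RTL; the `WindowBuffer` class and its names are assumptions for this sketch), the 7×7 window plus six line buffers can be simulated as follows, with each line buffer holding width − 7 entries (633 for a 640-pixel-wide input):

```python
class WindowBuffer:
    """Software model of a 7x7 sliding-window buffer: 49 window registers
    plus 6 RAM-based line buffers of (width - 7) entries each (633 for a
    640-pixel-wide input). One pixel is pushed per clock cycle."""

    def __init__(self, width=640, win=7):
        self.win = win
        self.window = [[0] * win for _ in range(win)]               # 7x7 registers
        self.lines = [[0] * (width - win) for _ in range(win - 1)]  # 6 x 633

    def push(self, pixel):
        """Shift in one pixel; returns the current 7x7 window, where
        window[r][c] holds the input from (r*width + c) cycles ago."""
        carry = pixel
        for r in range(self.win):
            self.window[r].insert(0, carry)   # new value enters row r
            out = self.window[r].pop()        # oldest value leaves row r
            if r < self.win - 1:
                self.lines[r].insert(0, out)  # recirculate via line buffer r
                carry = self.lines[r].pop()
        return [row[:] for row in self.window]
```

With `width=5, win=3`, feeding the raster-ordered pixels of a small image leaves the window holding a 3×3 patch of the three most recent rows, matching the one-output-per-clock behaviour once the pipeline is full.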
該第一功能偵測模組120耦接至該第一影像金字塔建構模組110,係對該第一影像金字塔建構模組110輸出之影像資料進行極值偵測。The first function detection module 120 is coupled to the first image pyramid construction module 110 and performs extremum detection on the image data output by the first image pyramid construction module 110.

該第一功能描述子模組130耦接至該第一影像金字塔建構模組110,用以將每個像素的描述子(Descriptor)給計算出來,透過與周圍點來找出該點的方向與梯度,並且利用範圍統計模組統計範圍內的方向梯度,建立出64維的描述子。The first function descriptor module 130 is coupled to the first image pyramid construction module 110 and calculates a descriptor for every pixel: the direction and gradient of each point are found from its surrounding points, and a range statistics module accumulates the direction gradients within a range to build a 64-dimensional descriptor.

該第一選擇器140其輸入端分別耦接至該第一功能偵測模組120及該第一功能描述子模組130,用以擇一後輸出。The input ends of the first selector 140 are coupled to the first function detection module 120 and the first function descriptor module 130 respectively, and one of the two is selected for output.

其中,該第一影像金字塔建構模組110主要是為了找出特徵偵測時所需要的連續尺度空間差異,它包含該第一高斯影像金字塔111及該第一差分影像金字塔112。該第一高斯影像金字塔111是透過高斯濾波器模組,將初始影像模糊化,並消除影像雜訊,確保所獲得的高斯影像在後續的運算不受雜訊的干擾。該第一差分影像金字塔112是將該高斯模板兩兩相減,再與原始影像進行卷積運算,產生出高斯差分影像,勾勒出影像輪廓,以保留影像特徵,作為後續特徵點偵測的基礎。在過去,已有加快產生高斯金字塔及差分影像金字塔的方式被提出來,例如中華民國發明第I592897號專利中所示。該做法雖然可以計算出差分影像,但卻需要花較多的硬體及時間來進行減法運算。The first image pyramid construction module 110 mainly serves to find the continuous scale-space differences required for feature detection; it includes the first Gaussian image pyramid 111 and the first difference-of-Gaussian image pyramid 112. The first Gaussian image pyramid 111 blurs the initial image through Gaussian filter modules and removes image noise, ensuring that the obtained Gaussian images are free from noise interference in subsequent operations. The first difference-of-Gaussian image pyramid 112 subtracts the Gaussian templates pairwise and convolves the result with the original image to produce difference-of-Gaussian images that outline the image contours, preserving image features as the basis for subsequent feature point detection. Faster ways of generating the Gaussian and difference pyramids have been proposed, for example in R.O.C. Invention Patent No. I592897; although that approach can compute the difference images, it requires more hardware and time for the subtraction.
請參照圖4,其繪示本發明一較佳實施例之第一SIFT模組以平行處理的架構來同時完成高斯及差分影像之示意圖。Please refer to FIG. 4 , which is a schematic diagram illustrating that the first SIFT module in a preferred embodiment of the present invention simultaneously completes Gaussian and differential images in a parallel processing structure.
如圖所示,在本發明中,為了再加快影像金字塔的建構速度,使用平行(Parallel)處理的架構,在該第一高斯影像金字塔111及複數個該第一差分影像金字塔112,對經一遮罩115(例如但不限於為7×7遮罩)遮罩後的初始影像同時完成高斯影像及差分影像。其中差分影像 D_n(x, y) 是先將相鄰的兩個高斯核相減後,再進行卷積(Convolution)來產生。其卷積運算所採用的方程式如下:

D_n(x, y) = (G_{n+1}(x, y) − G_n(x, y)) * I(x, y) = L_{n+1}(x, y) − L_n(x, y)    (1)

其中 L_n(x, y) 和 G_n(x, y) 分別為第 n 層的高斯影像和高斯核,I(x, y) 為初始影像,* 為卷積運算。這不僅可減少大量的硬體,並可以加快操作的速度。在這裡,本發明僅會保留一張高斯影像作為後續的處理。在此架構下,當第一筆資料來臨時,僅需要一個時脈的時間即可將該7×7遮罩115輸出,並將輸出結果與高斯核進行卷積運算。接下來其他尺寸不同的遮罩,一樣也可以用此架構來實現。As shown in the figure, to further speed up the construction of the image pyramid, the present invention uses a parallel processing architecture so that the first Gaussian image pyramid 111 and the plurality of first difference-of-Gaussian image pyramids 112 complete the Gaussian and difference images simultaneously from the initial image masked by a mask 115, for example but not limited to a 7×7 mask. A difference image D_n(x, y) is produced by first subtracting two adjacent Gaussian kernels and then performing the convolution, as in equation (1), where L_n(x, y) and G_n(x, y) are the Gaussian image and Gaussian kernel of the n-th layer respectively, and I(x, y) is the initial image. This not only saves a large amount of hardware but also speeds up the operation. Here, the present invention retains only one Gaussian image for subsequent processing. Under this architecture, when the first datum arrives, only one clock cycle is needed to output the 7×7 mask 115 and convolve the output with the Gaussian kernel. Subsequent masks of other sizes can also be implemented with this architecture.
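By linearity of convolution, the pre-subtracted kernel yields the same difference image as subtracting two Gaussian-filtered images, which is the saving the kernel-subtraction trick exploits. A minimal numerical sketch (plain Python; kernel sizes and sigmas are illustrative, not values from the patent):

```python
import math

def gaussian_kernel(sigma, size=7):
    """Normalized size x size Gaussian template (found offline in software,
    as the patent describes)."""
    c = size // 2
    k = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    s = sum(map(sum, k))
    return [[v / s for v in row] for row in k]

def convolve_valid(img, ker):
    """'Valid' 2-D convolution (correlation form; fine for symmetric kernels)."""
    n = len(ker)
    h, w = len(img), len(img[0])
    return [[sum(img[y + j][x + i] * ker[j][i]
                 for j in range(n) for i in range(n))
             for x in range(w - n + 1)] for y in range(h - n + 1)]

def dog_image(img, sigma_lo, sigma_hi, size=7):
    """D_n = (G_{n+1} - G_n) * I: subtract adjacent Gaussian kernels first,
    then convolve once, instead of convolving twice and subtracting images."""
    g_lo = gaussian_kernel(sigma_lo, size)
    g_hi = gaussian_kernel(sigma_hi, size)
    dog = [[hi - lo for hi, lo in zip(rh, rl)] for rh, rl in zip(g_hi, g_lo)]
    return convolve_valid(img, dog)
```

The single convolution with the pre-subtracted kernel replaces two convolutions plus an image-sized subtraction, which is where the hardware saving comes from.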
其中,在該第一功能偵測模組120中,在該第一影像金字塔建構模組110建立影像金字塔後,特徵偵測會將其輸出的三張連續的差分影像進行矩陣的運算,再經由三種偵測(含極值、高對比及角點)來完成。In the first function detection module 120, after the first image pyramid construction module 110 has built the image pyramid, feature detection performs matrix operations on three consecutive difference images output by the pyramid and is completed through three detections (extremum, high contrast, and corner).
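The extremum test over three consecutive difference images — the centre of the 3×3×3 neighbourhood (27 samples) against its 26 neighbours — can be sketched as:

```python
def is_extremum(dog3, y, x):
    """True if dog3[1][y][x], the centre of the middle difference image,
    is strictly greater (or strictly smaller) than all 26 neighbours in
    the 3x3x3 neighbourhood spanning the three DoG layers."""
    centre = dog3[1][y][x]
    neighbours = [dog3[z][y + dy][x + dx]
                  for z in range(3)
                  for dy in (-1, 0, 1)
                  for dx in (-1, 0, 1)
                  if not (z == 1 and dy == 0 and dx == 0)]
    return (all(centre > v for v in neighbours) or
            all(centre < v for v in neighbours))
```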
請參照圖5,其繪示本發明一較佳實施例之第一SIFT模組之第一功能偵測模組之特徵偵測的硬體架構之示意圖。Please refer to FIG. 5 , which is a schematic diagram of the hardware structure of the feature detection of the first function detection module of the first SIFT module according to a preferred embodiment of the present invention.
如圖所示,該第一功能偵測模組120之特徵偵測的硬體架構至少包含:一第一極值偵測模組121;一第一高對比度偵測模組122;一第一角點偵測模組123;一第一一階偏微分矩陣模組124;一第一海森矩陣模組125;一第一伴隨矩陣模組126;以及一第一行列式計算模組127。當該第一極值偵測模組121、第一高對比度偵測模組122以及第一角點偵測模組123滿足三種偵測的結果,即表示為特徵點。其中,該第一極值偵測模組121可偵測極大值及極小值。As shown in the figure, the hardware architecture for feature detection in the first function detection module 120 includes at least: a first extremum detection module 121; a first high-contrast detection module 122; a first corner detection module 123; a first first-order partial-derivative matrix module 124; a first Hessian matrix module 125; a first adjugate matrix module 126; and a first determinant calculation module 127. A point is regarded as a feature point only when the detection results of the first extremum detection module 121, the first high-contrast detection module 122 and the first corner detection module 123 are all satisfied. The first extremum detection module 121 detects both maxima and minima.
在該第一極值偵測模組121中,本發明對複數個遮罩128(例如但不限於為3×3遮罩)產生出的27筆資料(3D DoG影像)進行極大值或極小值的偵測。偵測的方法是將第二層差分影像中的中間值與周圍26筆資料比較;若比較的結果是極大值或極小值,則本發明將此點定義為特徵點。然而,經過極值偵測後的特徵點仍會有不易辨識或受雜訊干擾的問題,因此本發明藉由高對比度偵測模組122及角點偵測模組123,以高對比度特徵與角點特徵來限制極值特徵的結果。如圖5所示,將該第一極值偵測模組121、第一高對比度偵測模組122以及第一角點偵測模組123偵測的輸出經過一第一及閘129,保留本發明需要的特徵點,以改善特徵點不易辨識且易受雜訊干擾的情形。其中,該第一角點偵測模組123在進行平面上的角偵測與高對比度偵測時,需要使用二階偏微分進行運算,且在硬體實現上,本發明使用第一海森矩陣模組125來表示二階偏微分;其計算角點偵測的方式請參照中華民國發明第I592897號專利說明書,在此不擬重複贅述。本發明計算高對比度的方式與中華民國發明第I592897號專利所提出的計算方式不同之處在於:該第一高對比度偵測模組122在進行高對比度偵測時,是利用該第一一階偏微分矩陣模組124、第一海森矩陣模組125、第一伴隨矩陣模組126以及第一行列式計算模組127完成,其計算方法如下。該第一伴隨矩陣模組126的輸入是二階偏微分矩陣(海森矩陣)的每個元素(計算方法可以參考維基百科:https://zh.wikipedia.org/wiki/%E4%BC%B4%E9%9A%8F%E7%9F%A9%E9%98%B5),實際計算出的伴隨矩陣如方程式(2)所示:

Adj(H) = | H22H33−H23H23  H13H23−H12H33  H12H23−H13H22 |
         | H13H23−H12H33  H11H33−H13H13  H12H13−H11H23 |
         | H12H23−H13H22  H12H13−H11H23  H11H22−H12H12 |    (2)

其中可以發現 Adj12=Adj21、Adj13=Adj31、Adj23=Adj32,這是因為輸入端(海森矩陣)的元素滿足 H12=H21、H13=H31、H23=H32。

該第一行列式計算模組127計算二階偏微分矩陣(海森矩陣)的行列式值,目的是為了完成逆矩陣的計算,計算方法如方程式(3)所示:

det(H) = H11(H22H33−H23H23) + H12(H13H23−H12H33) + H13(H12H23−H13H22)    (3)

In the first extremum detection module 121, the present invention performs maximum/minimum detection on the 27 data (the 3D DoG image) produced by a plurality of masks 128, for example but not limited to 3×3 masks. The middle value of the second-layer difference image is compared with the surrounding 26 data; if it is a maximum or a minimum, the point is defined as a feature point. Since feature points that pass only the extremum test may still be hard to identify or corrupted by noise, the high-contrast detection module 122 and the corner detection module 123 constrain the extremum results with high-contrast and corner features. As shown in FIG. 5, the outputs of the first extremum detection module 121, the first high-contrast detection module 122 and the first corner detection module 123 pass through a first AND gate 129 that retains only the feature points required by the present invention, improving the situation where feature points are difficult to identify and susceptible to noise. For corner detection and high-contrast detection on the plane, the first corner detection module 123 needs second-order partial derivatives, and in the hardware implementation the present invention uses the first Hessian matrix module 125 to represent them; the corner detection calculation is described in R.O.C. Invention Patent No. I592897 and is not repeated here. The high-contrast calculation of the present invention differs from that of R.O.C. Invention Patent No. I592897 in that the first high-contrast detection module 122 uses the first first-order partial-derivative matrix module 124, the first Hessian matrix module 125, the first adjugate matrix module 126 and the first determinant calculation module 127. The input of the first adjugate matrix module 126 is every element of the second-order partial-derivative (Hessian) matrix, and the resulting adjugate matrix is shown in equation (2); note that Adj12=Adj21, Adj13=Adj31 and Adj23=Adj32 because the Hessian satisfies H12=H21, H13=H31 and H23=H32. The first determinant calculation module 127 computes the determinant of the Hessian as in equation (3), in order to complete the inverse-matrix calculation.
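Because the Hessian is symmetric, its adjugate and determinant reduce to six distinct products each, which is what makes the inverse cheap in hardware. A small arithmetic sketch (illustrative Python, not the RTL):

```python
def hessian_adj_det(H):
    """Adjugate and determinant of a symmetric 3x3 Hessian
    H = [[h11,h12,h13],[h12,h22,h23],[h13,h23,h33]], per equations (2)-(3);
    symmetry makes the adjugate symmetric too (Adj12=Adj21, etc.)."""
    (h11, h12, h13), (_, h22, h23), (_, _, h33) = H
    a11 = h22 * h33 - h23 * h23
    a12 = h13 * h23 - h12 * h33
    a13 = h12 * h23 - h13 * h22
    a22 = h11 * h33 - h13 * h13
    a23 = h12 * h13 - h11 * h23
    a33 = h11 * h22 - h12 * h12
    adj = [[a11, a12, a13], [a12, a22, a23], [a13, a23, a33]]
    det = h11 * a11 + h12 * a12 + h13 * a13  # expansion along the first row
    return adj, det
```

Since H · Adj(H) = det(H) · I, the inverse follows from one division by the determinant, which the shift-based arithmetic elsewhere in the design avoids repeating per element.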
在高對比度偵測時,需要計算出偵測點與周圍像素的變化量來判斷。在本發明中,我們設定當差分影像中的像素變化量大於1/32時,即表示該偵測點具有高對比度特徵,並予以保留;採用1/32而非習知的0.03,可使該判斷以位移運算實現,以減少硬體的成本。In high-contrast detection, the amount of change between the detection point and the surrounding pixels must be calculated. In the present invention, when the pixel variation in the difference image is greater than 1/32, the detection point is regarded as having a high-contrast feature and is retained; using 1/32 instead of the conventional 0.03 allows the comparison to be realized with shift operations, reducing hardware cost.
請再參照圖2,該第一功能描述子模組130是利用第一影像金字塔建構模組110輸出的高斯影像,將每個點與周遭點利用範圍統計的方式進行方向及梯度統計。其中,該第一功能描述子模組130進一步包括:一第一梯度計算模組131,耦接至該第一高斯影像金字塔111,用以計算像素點梯度;一第一方向計算模組132,耦接至該第一高斯影像金字塔111,用以計算像素點方向;一第一範圍統計模組133,耦接至該第一梯度計算模組131及第一方向計算模組132,用以統計像素點方向及梯度;以及一第一正規化模組134,耦接至該第一範圍統計模組133,用以將該描述子進行標準化。Referring again to FIG. 2, the first function descriptor module 130 uses the Gaussian images output by the first image pyramid construction module 110 and performs direction and gradient statistics on every point and its surrounding points by range statistics. The first function descriptor module 130 further includes: a first gradient calculation module 131, coupled to the first Gaussian image pyramid 111, for calculating the pixel gradient; a first direction calculation module 132, coupled to the first Gaussian image pyramid 111, for calculating the pixel direction; a first range statistics module 133, coupled to the first gradient calculation module 131 and the first direction calculation module 132, for accumulating the pixel directions and gradients; and a first normalization module 134, coupled to the first range statistics module 133, for normalizing the descriptor.
該第一功能描述子模組130的動作原理如下:首先,高斯影像會經過3×3遮罩(圖未示)的範圍遮罩,將左右、上下的像素值相減,找出該像素點在x軸方向的差值(方程式(14))與y軸方向的差值(方程式(15)):

Δp = L(x+1, y) − L(x−1, y)    (14)
Δq = L(x, y+1) − L(x, y−1)    (15)

The operating principle of the first function descriptor module 130 is as follows: first, the Gaussian image passes through a 3×3 range mask (not shown) that subtracts the left/right and upper/lower pixel values, finding the difference of the pixel in the x-axis direction (equation (14)) and in the y-axis direction (equation (15)), where L denotes the Gaussian image.
Δp 為像素點在x 軸的差值,Δq 為像素點在y 軸的差值。 Δp is the difference value of the pixel point on the x -axis, and Δq is the difference value of the pixel point on the y -axis.
接著利用方程式(14)和方程式(15)計算出像素點的方向與梯度。在[D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, 2004]中,使用了方程式(16)、方程式(17)計算像素點的角度及梯度:

θ(x, y) = tan⁻¹(Δq / Δp)    (16)
m(x, y) = √(Δp² + Δq²)    (17)

The direction and gradient of the pixel are then calculated from equations (14) and (15). In [D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, 2004], equations (16) and (17) are used to calculate the angle and gradient of a pixel.
然而硬體實現這樣的數學會消耗極大的硬體。因此,本發明提出了一套三角法來求出方向與梯度。Implementing such math in hardware, however, is extremely hardware-intensive. Therefore, the present invention proposes a set of trigonometry to find the direction and gradient.
請一併參照圖6(a)及圖6(b),其中,圖6(a)繪示本發明一較佳實施例之具梯度計算功能之三角方法之示意圖;圖6(b)繪示本發明一較佳實施例之另一具梯度計算功能之三角方法之示意圖。Please refer to FIG. 6( a ) and FIG. 6( b ) together, wherein FIG. 6( a ) shows a schematic diagram of a trigonometric method with gradient calculation function according to a preferred embodiment of the present invention; FIG. 6( b ) shows A schematic diagram of another trigonometric method with gradient calculation function according to a preferred embodiment of the present invention.
如圖所示,我們觀察方程式(17),可以發現該方程式猶如畢氏定理:把方程式(17)的m視為直角三角形的斜邊,Δp與Δq視為兩短邊,如圖6(a)所示。接著比較Δp及Δq何者較大,將較大者近似成斜邊(Hypotenuse)m,如圖6(b)所示。此做法會有誤差,且誤差最大會在Δp與Δq相等時產生,而本發明利用查表法將誤差縮小,如表1所示。而本發明計算方向的方式是利用Δp與Δq在平面座標上的位置來定義,如表2所示。簡單來說,若Δp與Δq均大於0,且Δp大於Δq,本發明把此定義為第0方向;若Δp與Δq均大於0,且Δp小於Δq,則定義為第1方向,以此類推。As shown in the figures, observing equation (17), we find that it resembles the Pythagorean theorem: m in equation (17) is treated as the hypotenuse of a right triangle and Δp and Δq as the two legs, as shown in FIG. 6(a). Then Δp and Δq are compared, and the larger one is approximated as the hypotenuse m, as shown in FIG. 6(b). This approximation has an error, which is largest when Δp and Δq are equal, and the present invention reduces the error with a look-up table, as shown in Table 1. The direction is defined from the positions of Δp and Δq in the plane coordinates, as shown in Table 2. Simply put, if Δp and Δq are both greater than 0 and Δp is greater than Δq, this is defined as direction 0; if Δp and Δq are both greater than 0 and Δp is less than Δq, this is defined as direction 1, and so on.
表1 具有梯度校正的查表 Table 1: Look-up table with gradient correction
表2 角度範圍條件及方向分布 Table 2: Angle range conditions and direction distribution
計算完每個偵測點的方向與梯度後,接著會進入到第一範圍統計模組133,該模組會統計16×16像素點的方向及梯度。統計前會先將16×16的範圍區分成16個區塊,每個區塊有4×4的像素,並在每個區塊中統計8個方向;統計後會有16個區塊、每個區塊8個方向的梯度。此時一共會有128個方向梯度,表示為一個偵測點的特徵描述子。然而128個維度在硬體實現上會消耗大量的資源,因此我們提出了降低維度的方法。觀察圖6(a),可以發現第0個方向與第4個方向為相反方向;第1個方向與第5個方向同樣也是相反方向,在向量上可以將它們視為反向向量。因此我們將第0個方向與第4個方向相減、第1個方向與第5個方向相減,以此類推,並用有號數表示;如此一來便可以將128維的描述子化簡成64維,並保留每個方向梯度的特性。After the direction and gradient of each detection point are calculated, they enter the first range statistics module 133, which accumulates the directions and gradients of 16×16 pixels. The 16×16 range is first divided into 16 blocks of 4×4 pixels each, and 8 directions are accumulated in each block, yielding gradients for 8 directions in each of 16 blocks, i.e. 128 direction gradients in total, which represent the feature descriptor of one detection point. However, 128 dimensions consume a large amount of resources in a hardware implementation, so a dimension-reduction method is proposed. Observing FIG. 6(a), direction 0 and direction 4 are opposite directions, as are direction 1 and direction 5; as vectors they can be regarded as reversed vectors. Therefore, the direction-4 bin is subtracted from the direction-0 bin, the direction-5 bin from the direction-1 bin, and so on, with the result expressed as a signed number; in this way the 128-dimensional descriptor is reduced to 64 dimensions while retaining the characteristics of each direction gradient.
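The trigonometric shortcut and the 128→64 fold can be sketched as follows. The correction values of Table 1 are not reproduced in the text, so the plain max(|Δp|, |Δq|) hypotenuse approximation is used here, and the exact boundary handling of Table 2 is an assumption:

```python
def grad_mag_dir(dp, dq):
    """Hardware-friendly magnitude and 8-way direction: the magnitude is
    approximated by the larger of |dp|, |dq| (the hypotenuse approximation
    of FIG. 6(b)), and the direction index comes from the signs of dp, dq
    and which magnitude is larger (one bin per 45-degree octant,
    counted counter-clockwise from the +x axis)."""
    ap, aq = abs(dp), abs(dq)
    mag = max(ap, aq)                 # approximated hypotenuse m
    if dp >= 0 and dq >= 0:
        direction = 0 if ap >= aq else 1
    elif dp < 0 <= dq:
        direction = 2 if aq >= ap else 3
    elif dp < 0 and dq < 0:
        direction = 4 if ap >= aq else 5
    else:                             # dp >= 0 > dq
        direction = 6 if aq >= ap else 7
    return mag, direction

def fold_histogram(hist8):
    """Collapse opposite direction bins (0&4, 1&5, 2&6, 3&7) into signed
    values, reducing a 16-block x 8-direction descriptor from 128 to 64 dims."""
    return [hist8[i] - hist8[i + 4] for i in range(4)]
```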
為了降低該立體功能匹配模組400進行立體匹配時的硬體使用量,必須降低該描述子的位元數,並且保留資料的分佈性。本發明利用正規化(Normalization)將描述子的64維進行正規化,透過正規化的方式將描述子64×13-bit壓縮成64×9-bit。將描述子向量 F = (f1, f2, …, f64) 透過方程式(18)進行正規化,可得到新的正規化描述子 N = (n1, n2, …, n64):

ni = fi / S    (18)

其中 S 為原始描述子向量的總和,如方程式(19)所示:

S = f1 + f2 + … + f64    (19)

由於正規化後的數值非常小,因此正規化後的值必須乘上權重 w 來放大,可得放大後的描述子 L = (l1, l2, …, l64),如方程式(20)所示,本發明的權重 w 為127:

li = w × ni = w × fi / S    (20)

在經過該第一正規化模組134之正規化後,原本64×13=832位元的有號數特徵描述子下降至64×9=576位元的有號數,大幅降低了硬體需暫存描述子的資源用量。In order to reduce the hardware usage of the stereo function matching module 400 during stereo matching, the bit count of the descriptor must be reduced while the distribution of the data is preserved. The present invention normalizes the 64 dimensions of the descriptor, compressing it from 64×13 bits to 64×9 bits. Normalizing the descriptor vector F = (f1, f2, …, f64) by equation (18) gives a new normalized descriptor N = (n1, n2, …, n64), where S is the sum of the original descriptor vector as in equation (19). Since the normalized values are very small, they must be multiplied by a weight w for amplification, giving the amplified descriptor L = (l1, l2, …, l64) as in equation (20); the weight w of the present invention is 127. After normalization by the first normalization module 134, the signed feature descriptor of originally 64×13 = 832 bits is reduced to a signed 64×9 = 576 bits, greatly reducing the resources needed to buffer descriptors in hardware.
在硬體實現上,為了要減少硬體使用量,本發明透過位移的方式取代部分乘法運算。由於權重除以描述子向量總和的中間結果會因整數運算而被截為0,因此需先將權重位移放大後再除以描述子向量的總和,確保相除的結果大於0,接著乘上描述子後再位移還原,以硬體實現方程式(20),完成描述子正規化,如方程式(21)所示:

li = ( ((w << k) / S) × fi ) >> k    (21)

其中 k 為位移量。In the hardware implementation, to reduce hardware usage, the present invention replaces part of the multiplications by shifting. Because the intermediate result of dividing the weight by the descriptor sum would be truncated to zero by integer arithmetic, the weight is first scaled up by shifting and then divided by the sum of the descriptor vector, ensuring the quotient is greater than 0; the quotient is then multiplied by the descriptor element and shifted back, realizing equation (20) in hardware to complete the descriptor normalization, as shown in equation (21), where k is the shift amount.
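A numerical sketch of the shift-based normalization (the shift amount k is not specified in the text; 10 is an assumed value for illustration):

```python
def normalize_descriptor(f, w=127, k=10):
    """Shift-based normalization in integers: pre-shift the weight, perform
    the single division by the descriptor sum S once, multiply each element,
    then shift back -- avoiding one divider per descriptor element."""
    s = sum(f)
    scale = (w << k) // s             # (w * 2^k) / S, computed once
    return [(scale * fi) >> k for fi in f]
```

For moderate shift amounts the result stays close to the exact integer value of w·fi/S while requiring only a single divider.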
請再參照圖3,本發明之該第二SIFT模組200,其輸入端為一右眼視覺影像,用以進行一右眼的特徵偵測與描述後輸出一右眼影像特徵點,其進一步包括:一第二影像金字塔建構模組210;一第二功能偵測模組220;一第二功能描述子模組230;以及一第二選擇器240。Referring again to FIG. 3, the second SIFT module 200 of the present invention takes a right-eye visual image as input, performs feature detection and description for the right eye, and outputs right-eye image feature points. It further includes: a second image pyramid construction module 210; a second function detection module 220; a second function descriptor module 230; and a second selector 240.
該第二影像金字塔建構模組210主要是為了找出特徵偵測時所需要的連續尺度空間差異,它包含一第二高斯影像金字塔211及一第二差分影像金字塔212。其中,該第二高斯影像金字塔211是透過高斯濾波器模組(圖未示),將初始影像模糊化,並消除影像雜訊,確保所獲得的高斯影像在後續的運算不受雜訊的干擾。該第二差分影像金字塔212是將該高斯影像兩兩相減,勾勒出影像輪廓,以保留影像特徵,作為後續特徵點偵測的基礎。其詳情請參照上述該第一影像金字塔建構模組110之相關說明。The second image pyramid construction module 210 mainly serves to find the continuous scale-space differences required for feature detection; it includes a second Gaussian image pyramid 211 and a second difference-of-Gaussian image pyramid 212. The second Gaussian image pyramid 211 blurs the initial image through Gaussian filter modules (not shown) and removes image noise, ensuring that the obtained Gaussian images are free from noise interference in subsequent operations. The second difference-of-Gaussian image pyramid 212 subtracts the Gaussian images pairwise to outline the image contours, preserving image features as the basis for subsequent feature point detection. For details, refer to the description of the first image pyramid construction module 110 above.
該第二功能偵測模組220進一步包括:一第二極值偵測模組221;一第二高對比度偵測模組222;一第二角點偵測模組223;一第二一階偏微分矩陣模組;一第二海森矩陣模組;一第二伴隨矩陣模組;以及一第二行列式計算模組(其中,該第二一階偏微分矩陣模組、第二海森矩陣模組、第二伴隨矩陣模組以及第二行列式計算模組皆圖未示,其詳情請參照與其類似之圖5)。當該第二極值偵測模組221、第二高對比度偵測模組222以及第二角點偵測模組223滿足三種偵測的結果,即表示為特徵點。其中,該第二極值偵測模組221可偵測極大值及極小值。該第二功能偵測模組220之詳情請參照上述該第一功能偵測模組120之說明。The second function detection module 220 further includes: a second extremum detection module 221; a second high-contrast detection module 222; a second corner detection module 223; a second first-order partial-derivative matrix module; a second Hessian matrix module; a second adjugate matrix module; and a second determinant calculation module (the latter four are not shown; for details refer to the similar FIG. 5). A point is regarded as a feature point only when the detection results of the second extremum detection module 221, the second high-contrast detection module 222 and the second corner detection module 223 are all satisfied. The second extremum detection module 221 detects both maxima and minima. For details of the second function detection module 220, refer to the description of the first function detection module 120 above.
將該第二極值偵測模組221、第二高對比度偵測模組222以及第二角點偵測模組223偵測的輸出經過一第二及閘229,保留本發明需要的特徵點,以改善特徵點不易辨識且易受雜訊干擾的情形。其中,該第二角點偵測模組223在進行平面上的角偵測與高對比度偵測時,需要使用二階偏微分進行運算,且在硬體實現上,本發明使用第二海森矩陣模組來表示二階偏微分,其詳情請參照中華民國發明第I592897號專利說明書,在此不擬重複贅述。其計算高對比度的方式請參照[0043]段中所述。The outputs of the second extremum detection module 221, the second high-contrast detection module 222 and the second corner detection module 223 pass through a second AND gate 229, which retains the feature points required by the present invention, improving the situation where feature points are difficult to identify and susceptible to noise. When performing corner detection and high-contrast detection on the plane, the second corner detection module 223 needs second-order partial derivatives, and in the hardware implementation the present invention uses the second Hessian matrix module to represent them; for details, refer to R.O.C. Invention Patent No. I592897, which will not be repeated here. For the high-contrast calculation, refer to paragraph [0043] above.
該第二功能描述子模組230是利用影像金字塔輸出的高斯影像,將每個點與周遭點利用範圍統計的方式進行方向及梯度統計。其中,該第二功能描述子模組230進一步包括:一第二梯度計算模組231,耦接至該第二高斯影像金字塔211,用以計算像素點梯度;一第二方向計算模組232,耦接至該第二高斯影像金字塔211,用以計算像素點方向;一第二範圍統計模組233,耦接至該第二梯度計算模組231及第二方向計算模組232,用以統計像素點方向及梯度;以及一第二正規化模組234,耦接至該第二範圍統計模組233,用以將該描述子進行標準化。其詳情請參照上述該第一功能描述子模組130之說明。The second function descriptor module 230 uses the Gaussian images output from the image pyramid and performs direction and gradient statistics on every point and its surrounding points by range statistics. The second function descriptor module 230 further includes: a second gradient calculation module 231, coupled to the second Gaussian image pyramid 211, for calculating the pixel gradient; a second direction calculation module 232, coupled to the second Gaussian image pyramid 211, for calculating the pixel direction; a second range statistics module 233, coupled to the second gradient calculation module 231 and the second direction calculation module 232, for accumulating the pixel directions and gradients; and a second normalization module 234, coupled to the second range statistics module 233, for normalizing the descriptor. For details, refer to the description of the first function descriptor module 130 above.
請參照圖7,其繪示本發明一較佳實施例之立體功能匹配模組之細部方塊示意圖。Please refer to FIG. 7 , which shows a detailed block diagram of a three-dimensional function matching module according to a preferred embodiment of the present invention.
如圖所示,該立體功能匹配模組400進一步包括:一串列轉並列記憶體410,分別耦接至該第一SIFT模組100及第二SIFT模組200,可將複數個左、右影像特徵點與特徵點座標依序存入複數個暫存器(圖未示,例如但不限於為8個暫存器)中,並同時將暫存器的複數個特徵點資訊輸出,達到串列輸入轉並列輸出的效果;一最小維度計算模組420,耦接至該串列轉並列記憶體410,當右影像特徵點訊號來臨時,會判斷該串列轉並列記憶體410中複數個左影像特徵點的y座標與右影像特徵點的y座標是否相等;一匹配模組430,耦接至該最小維度計算模組420,用以找尋該最小維度計算模組420輸出的最小值,最終再判斷輸出後的最小值是否過大,若大於某個閥值,予以剔除;以及一深度計算模組440,耦接至該匹配模組430,用以計算該特徵點在實際中與立體視覺攝影機的距離,也就是深度計算,主要是透過匹配點在影像上的距離與實際的距離以相似三角形的方式求出。As shown in the figure, the stereo function matching module 400 further includes: a serial-to-parallel memory 410, coupled to the first SIFT module 100 and the second SIFT module 200 respectively, which sequentially stores a plurality of left- and right-image feature points and their coordinates into a plurality of registers (not shown; for example but not limited to 8 registers) and simultaneously outputs the buffered feature-point information, converting serial input into parallel output; a minimum-dimension calculation module 420, coupled to the serial-to-parallel memory 410, which, when a right-image feature-point signal arrives, checks whether the y-coordinates of the buffered left-image feature points equal the y-coordinate of the right-image feature point; a matching module 430, coupled to the minimum-dimension calculation module 420, which finds the minimum value output by the minimum-dimension calculation module 420 and rejects it if it exceeds a threshold; and a depth calculation module 440, coupled to the matching module 430, which computes the real-world distance between the feature point and the stereo vision camera, i.e. the depth, mainly by relating the matched points' separation on the image to the actual distance through similar triangles.
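The matching flow of modules 410–440 can be sketched end-to-end as follows. Sum-of-absolute-differences is assumed as the descriptor distance, and the focal-length and baseline values used below are illustrative camera-calibration parameters, not values from the patent:

```python
def match_and_depth(right_feat, left_feats, thresh, focal_px, baseline_mm):
    """Sketch of the matching flow: among buffered left features on the same
    image row (equal y, the minimum-dimension check), pick the one with the
    minimum descriptor distance; reject it if that minimum exceeds the
    threshold, otherwise convert the horizontal disparity to depth by
    similar triangles (Z = f * B / d)."""
    (xr, yr), dr = right_feat
    best = None
    for (xl, yl), dl in left_feats:
        if yl != yr:                  # rows must match before comparing
            continue
        dist = sum(abs(a - b) for a, b in zip(dl, dr))
        if best is None or dist < best[0]:
            best = (dist, xl)
    if best is None or best[0] > thresh:
        return None                   # no candidate, or minimum too large
    disparity = best[1] - xr
    if disparity <= 0:
        return None                   # invalid match / point at infinity
    return focal_px * baseline_mm / disparity
```

A 20-pixel disparity with a 700-pixel focal length and 60 mm baseline, for example, places the point 2100 mm from the camera.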
經由本發明立體影像辨識及匹配系統之實施,其具有第一及第二影像金字塔建構模組,係預先以軟體找出複數個不同尺度之高斯模板遮罩參數,再透過複數個高斯濾波器模組平行進行複數個卷積運算,其中各所述卷積運算係依該影像資料與一所述遮罩參數進行,以獲得複數個高斯影像,以克服習知技術在高斯模板運算時使用指數函數所產生的硬體浮點數及耗費大量運算成本之問題;該海森反矩陣模組運算係利用伴隨矩陣的方式,將計算出之伴隨矩陣及行列式之值輸出至低對比度特徵偵測模組,並利用數值推導方式計算以取代複數個除法器之使用;該正規化運算模組係在計算特徵點向量之正規化數值時,乘上一增益值後,使用右移運算,用以大幅減少除法器之使用。藉由減少計算量與增進特徵點匹配正確率之方式,提升系統運算效能,以達到即時立體影像辨識及匹配之目的。因此,確實較習知之影像辨識系統具有進步性。Through the implementation of the stereoscopic image recognition and matching system of the present invention, which has the first and second image pyramid construction modules, a plurality of Gaussian template mask parameters of different scales are found in advance by software, and a plurality of convolution operations are then performed in parallel by a plurality of Gaussian filter modules, each convolution being performed on the image data with one of the mask parameters to obtain a plurality of Gaussian images; this overcomes the hardware floating-point operations and heavy computational cost caused by the exponential function in the Gaussian template calculation of the prior art. The inverse-Hessian operation uses the adjugate matrix, outputting the computed adjugate and determinant values to the low-contrast feature detection module, and replaces a plurality of dividers with numerical derivation. When calculating the normalized value of a feature point vector, the normalization module multiplies by a gain value and then uses a right-shift operation, greatly reducing the use of dividers. By reducing the amount of calculation and improving the feature point matching accuracy, the computing performance of the system is improved, achieving real-time stereoscopic image recognition and matching. The system is therefore indeed an improvement over conventional image recognition systems.
本案所揭示者,乃較佳實施例,舉凡局部之變更或修飾而源於本案之技術思想而為熟習該項技藝之人所易於推知者,俱不脫本案之專利權範疇。What is disclosed in this case is a preferred embodiment, and any partial changes or modifications that originate from the technical ideas of this case and are easily inferred by those who are familiar with the art are within the scope of the patent right of this case.
綜上所陳,本案無論就目的、手段與功效,在在顯示其迥異於習知之技術特徵,且其首先發明合於實用,亦在在符合新型之專利要件,懇請 貴審查委員明察,並祈早日賜予專利,俾嘉惠社會,實感德便。To sum up, regardless of the purpose, means and effect of this case, it is showing its technical characteristics that are completely different from the conventional ones, and its first invention is suitable for practical use, and it also meets the requirements of a new type of patent. Granting a patent as soon as possible will benefit the society, and it will be a real sense of virtue.
100:第一SIFT模組 110:第一影像金字塔建構模組 111:第一高斯影像金字塔 112:第一差分影像金字塔 115:遮罩 120:第一功能偵測模組 121:第一極值偵測模組 122:第一高對比度偵測模組 123:第一角點偵測模組 124:第一一階偏微分矩陣模組 125:第一海森矩陣模組 126:第一伴隨矩陣模組 127:第一行列式計算模組 128:遮罩 129:第一及閘 130:第一功能描述子模組 131:第一梯度計算模組 132:第一方向計算模組 133:第一範圍統計模組 134:第一正規化模組 140:第一選擇器 200:第二SIFT模組 210:第二影像金字塔建構模組 211:第二高斯影像金字塔 212:第二差分影像金字塔 220:第二功能偵測模組 221:第二極值偵測模組 222:第二高對比度偵測模組 223:第二角點偵測模組 229:第二及閘 230:第二功能描述子模組 231:第二梯度計算模組 232:第二方向計算模組 233:第二範圍統計模組 234:第二正規化模組 240:第二選擇器 300:座標計算模組 400:立體功能匹配模組 410:串列轉並列記憶體 420:最小維度計算模組 430:匹配模組 440:深度計算模組 100: First SIFT module 110: First image pyramid construction module 111: First Gaussian image pyramid 112: First difference image pyramid 115: Mask 120: First function detection module 121: First extremum detection module 122: First high-contrast detection module 123: First corner detection module 124: First first-order partial-derivative matrix module 125: First Hessian matrix module 126: First adjugate matrix module 127: First determinant calculation module 128: Mask 129: First AND gate 130: First function descriptor module 131: First gradient calculation module 132: First direction calculation module 133: First range statistics module 134: First normalization module 140: First selector 200: Second SIFT module 210: Second image pyramid construction module 211: Second Gaussian image pyramid 212: Second difference image pyramid 220: Second function detection module 221: Second extremum detection module 222: Second high-contrast detection module 223: Second corner detection module 229: Second AND gate 230: Second function descriptor module 231: Second gradient calculation module 232: Second direction calculation module 233: Second range statistics module 234: Second normalization module 240: Second selector 300: Coordinate calculation module 400: Stereo function matching module 410: Serial-to-parallel memory 420: Minimum-dimension calculation module 430: Matching module 440: Depth calculation module
圖1為一示意圖,其繪示本發明一較佳實施例之立體影像辨識及匹配系統之組合示意圖。 圖2為一示意圖,其繪示本發明一較佳實施例之第一SIFT模組之細部方塊示意圖示意圖。 圖3為一示意圖,其繪示本發明一較佳實施例之第二SIFT模組之細部方塊示意圖示意圖。 圖4為一示意圖,其繪示本發明一較佳實施例之第一SIFT模組以平行處理的架構來同時完成高斯及差分影像之示意圖。 圖5為一示意圖,其繪示本發明一較佳實施例之第一SIFT模組之第一功能偵測模組之特徵偵測的硬體架構之示意圖。 圖6(a)為一示意圖,其繪示本發明一較佳實施例之具梯度計算功能之三角方法之示意圖。 圖6(b)為一示意圖,其繪示本發明一較佳實施例之另一具梯度計算功能之三角方法之示意圖。 圖7為一示意圖,其繪示本發明一較佳實施例之立體功能匹配模組之細部方塊示意圖。FIG. 1 is a schematic diagram showing a combined schematic diagram of a stereoscopic image recognition and matching system according to a preferred embodiment of the present invention. 2 is a schematic diagram illustrating a detailed block schematic diagram of a first SIFT module according to a preferred embodiment of the present invention. 3 is a schematic diagram showing a detailed block schematic diagram of a second SIFT module according to a preferred embodiment of the present invention. FIG. 4 is a schematic diagram illustrating a first SIFT module in a preferred embodiment of the present invention to simultaneously complete Gaussian and differential images in a parallel processing structure. 5 is a schematic diagram showing a schematic diagram of the hardware structure of the feature detection of the first function detection module of the first SIFT module according to a preferred embodiment of the present invention. FIG. 6( a ) is a schematic diagram showing a schematic diagram of a triangulation method with a gradient calculation function according to a preferred embodiment of the present invention. FIG. 6( b ) is a schematic diagram illustrating another triangulation method with a gradient calculation function according to a preferred embodiment of the present invention. 7 is a schematic diagram showing a detailed block diagram of a three-dimensional function matching module according to a preferred embodiment of the present invention.
100:第一SIFT模組100: The first SIFT module
200:第二SIFT模組200: Second SIFT module
300:座標計算模組300: Coordinate calculation module
400:立體功能匹配模組400: Stereo function matching module
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW109130809A TWI765339B (en) | 2020-09-08 | 2020-09-08 | Stereoscopic Image Recognition and Matching System |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TW202211681A TW202211681A (en) | 2022-03-16 |
| TWI765339B true TWI765339B (en) | 2022-05-21 |
Family
ID=81747108
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW109130809A TWI765339B (en) | 2020-09-08 | 2020-09-08 | Stereoscopic Image Recognition and Matching System |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TWI765339B (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TW201337835A (en) * | 2012-03-15 | 2013-09-16 | Ind Tech Res Inst | Method and apparatus for constructing image blur pyramid, and image feature extracting circuit |
| TW201715882A (en) * | 2015-10-16 | 2017-05-01 | 財團法人工業技術研究院 | Device and method for depth estimation |
| CN108537235A (en) * | 2018-03-27 | 2018-09-14 | 北京大学 | A kind of method of low complex degree scale pyramid extraction characteristics of image |
- 2020-09-08: TW TW109130809A patent/TWI765339B/en active
Also Published As
| Publication number | Publication date |
|---|---|
| TW202211681A (en) | 2022-03-16 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112364865B (en) | A detection method for moving small objects in complex scenes | |
| Li et al. | Epi-based oriented relation networks for light field depth estimation | |
| CN104504723B (en) | Image registration method based on remarkable visual features | |
| CN111709980A (en) | Multi-scale image registration method and device based on deep learning | |
| Ttofis et al. | High-quality real-time hardware stereo matching based on guided image filtering | |
| Duan et al. | Weighted multi-projection: 3D point cloud denoising with tangent planes | |
| Cambuim et al. | Hardware module for low-resource and real-time stereo vision engine using semi-global matching approach | |
| Tao et al. | Robust point sets matching by fusing feature and spatial information using nonuniform Gaussian mixture models | |
| Zhang et al. | Ednet: Efficient disparity estimation with cost volume combination and attention-based spatial residual | |
| CN106295710A (en) | Image local feature matching process, device and terminal of based on non-geometric constraint | |
| CN111161348A (en) | A method, device and device for object pose estimation based on monocular camera | |
| Ding et al. | Real-time stereo vision system using adaptive weight cost aggregation approach | |
| TWI765339B (en) | Stereoscopic Image Recognition and Matching System | |
| TWI592897B (en) | Image Recognition Accelerator System | |
| CN108447084A (en) | Stereo matching compensation method based on ORB features | |
| CN114549429A (en) | Depth data quality evaluation method and device based on hypergraph structure | |
| Huang et al. | The common self-polar triangle of separate circles: properties and applications to camera calibration | |
| Tang et al. | A GMS-guided approach for 2D feature correspondence selection | |
| Hamzah et al. | Development of depth map from stereo images using sum of absolute differences and edge filters | |
| Hamzah et al. | Development of stereo matching algorithm based on sum of absolute RGB color differences and gradient matching | |
| Tola | Multiview 3D Reconstruction of a scene containing independently moving objects | |
| Perez-Patricio et al. | A fuzzy logic approach for stereo matching suited for real-time processing | |
| Nazmi et al. | Disparity Map from Stereo Images for Three-dimensional Surface Reconstruction | |
| Sebe et al. | Evaluation of intensity and color corner detectors for affine invariant salient regions | |
| Fathi et al. | Low-cost and real-time hardware implementation of stereo vision system on FPGA |