TWI731624B - Method for estimating position of electronic device, electronic device and computer device - Google Patents

Info

Publication number
TWI731624B
TWI731624B
Authority
TW
Taiwan
Prior art keywords
scene
feature point
head-mounted display
time point
Prior art date
Application number
TW109108944A
Other languages
Chinese (zh)
Other versions
TW202136854A (en)
Inventor
李彥賢
黃士挺
黃昭世
Original Assignee
宏碁股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 宏碁股份有限公司 filed Critical 宏碁股份有限公司
Priority to TW109108944A priority Critical patent/TWI731624B/en
Application granted granted Critical
Publication of TWI731624B publication Critical patent/TWI731624B/en
Publication of TW202136854A publication Critical patent/TW202136854A/en

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a method for estimating the position of a head-mounted display (HMD), the HMD itself, and a computer device. The method includes: obtaining scene images at an i-th time point and extracting the feature points in each scene image; identifying the specific object regions in each scene image and retrieving the specific feature points that do not correspond to any specific object region; obtaining the feature point position of each scene feature point; obtaining a first device position of the HMD at the i-th time point and a moving distance of the HMD; obtaining a measured position of the HMD at the (i+1)-th time point; and estimating a second device position of the HMD at the (i+1)-th time point based on the first device position, the moving distance, and the measured position.

Description

Method for estimating the position of a head-mounted display, computer device, and head-mounted display

The present invention relates to spatial positioning technology, and in particular to a method for estimating the position of a head-mounted display (HMD), a computer device, and an HMD.

Please refer to FIG. 1, which shows different images of the same scene captured by different lenses on a conventional head-mounted display. In the prior art, because objects such as lamps or animals exist in the captured scene 199, large variation tends to arise between the scene images 111 and 112 captured by the different lenses of the head-mounted display (e.g., a robot or a head-mounted display), and some image regions are prone to overexposure. Under these circumstances, the prior art easily makes mistakes when establishing links between feature points, which may in turn produce large errors in the subsequent estimation of the position/coordinates of the head-mounted display.

In view of this, the present invention provides a method for estimating the position of a head-mounted display, a computer device, and a head-mounted display, which can be used to solve the above technical problems.

The present invention provides a method for estimating the position of a head-mounted display on which multiple lenses are disposed. The method includes: at an i-th time point, obtaining multiple scene images of a reference scene captured by the lenses, and extracting at least one feature point in each scene image; identifying at least one specific object region corresponding to at least one specific object in each scene image, and extracting from each scene image at least one specific feature point that does not correspond to any specific object region, where the at least one specific feature point corresponds to at least one scene feature point in the reference scene; obtaining a feature point position of each scene feature point in the reference scene based on the at least one specific feature point of each scene image; obtaining a first device position of the head-mounted display in the reference scene at the i-th time point, and obtaining a moving distance of the head-mounted display between the i-th time point and the (i+1)-th time point; obtaining a measured position of the head-mounted display in the reference scene at the (i+1)-th time point based on the feature point position of each scene feature point; and estimating a second device position of the head-mounted display in the reference scene at the (i+1)-th time point based on the first device position, the moving distance, and the measured position.

The present invention provides a head-mounted display including multiple lenses, a storage circuit, and a processor. The storage circuit stores multiple modules. The processor is coupled to the lenses and the storage circuit and accesses the modules to perform the following steps: at an i-th time point, obtaining multiple scene images of a reference scene captured by the lenses, and extracting at least one feature point in each scene image; identifying at least one specific object region corresponding to at least one specific object in each scene image, and extracting from each scene image at least one specific feature point that does not correspond to any specific object region, where the at least one specific feature point corresponds to at least one scene feature point in the reference scene; obtaining a feature point position of each scene feature point in the reference scene based on the at least one specific feature point of each scene image; obtaining a first device position of the head-mounted display in the reference scene at the i-th time point, and obtaining a moving distance of the head-mounted display between the i-th time point and the (i+1)-th time point; obtaining a measured position of the head-mounted display in the reference scene at the (i+1)-th time point based on the feature point position of each scene feature point; and estimating a second device position of the head-mounted display in the reference scene at the (i+1)-th time point based on the first device position, the moving distance, and the measured position.

The present invention provides a computer device including a storage circuit and a processor. The storage circuit stores multiple modules. The processor is coupled to the storage circuit and accesses the modules to perform the following steps: at an i-th time point, obtaining multiple scene images of a reference scene captured by multiple lenses on a head-mounted display, and extracting at least one feature point in each scene image; identifying at least one specific object region corresponding to at least one specific object in each scene image, and extracting from each scene image at least one specific feature point that does not correspond to any specific object region, where the at least one specific feature point corresponds to at least one scene feature point in the reference scene; obtaining a feature point position of each scene feature point in the reference scene based on the at least one specific feature point of each scene image; obtaining a first device position of the head-mounted display in the reference scene at the i-th time point, and obtaining a moving distance of the head-mounted display between the i-th time point and the (i+1)-th time point; obtaining a measured position of the head-mounted display in the reference scene at the (i+1)-th time point based on the feature point position of each scene feature point; and estimating a second device position of the head-mounted display in the reference scene at the (i+1)-th time point based on the first device position, the moving distance, and the measured position.

Please refer to FIG. 2, which is a schematic diagram of a head-mounted display according to an embodiment of the present invention. In different embodiments, the head-mounted display 200 is, for example, a robot, or a head-mounted display usable for providing virtual reality (VR)/augmented reality (AR) services, but is not limited thereto. As shown in FIG. 2, the head-mounted display 200 includes a storage circuit 202, a processor 204, and lenses 2061~206N (N being the total number of lenses).

The storage circuit 202 is, for example, any type of fixed or removable random access memory (RAM), read-only memory (ROM), flash memory, hard disk, another similar device, or a combination of these devices, and can be used to record multiple program codes or modules.

The processor 204 is coupled to the storage circuit 202 and the lenses 2061~206N, and may be a general-purpose processor, a special-purpose processor, a conventional processor, a digital signal processor, multiple microprocessors, one or more microprocessors combined with a digital-signal-processor core, a controller, a microcontroller, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), any other kind of integrated circuit, a state machine, a processor based on an Advanced RISC Machine (ARM), or the like.

The lenses 2061~206N may each be a charge-coupled device (CCD) lens, a complementary metal-oxide-semiconductor (CMOS) lens, or another lens usable for capturing images. In the embodiments of the present invention, the mounting positions of the lenses 2061~206N on the head-mounted display 200 can be assumed to be fixed. In other words, the relative positions of the lenses 2061~206N with respect to one another can be taken as known.

In the embodiments of the present invention, the processor 204 can access the modules and program codes recorded in the storage circuit 202 to implement the method for estimating the position of a head-mounted display proposed by the present invention, the details of which are described below.

Please refer to FIG. 3, which is a flowchart of a method for estimating the position of a head-mounted display according to an embodiment of the present invention. The method of this embodiment can be executed by the head-mounted display 200 of FIG. 2; the details of each step in FIG. 3 are described below in conjunction with the components shown in FIG. 2.

First, in step S310, the processor 204 obtains, at an i-th time point, multiple scene images of a reference scene captured by the lenses 2061~206N, and extracts the feature points in each scene image, where i is a time-point index. In one embodiment, assuming that the head-mounted display 200 includes only two lenses 2061 and 2062 and that they capture the scene 199 (i.e., the reference scene) shown in FIG. 1, the scene images obtained by the processor 204 are, for example, the images 111 and 112 shown in FIG. 1, but are not limited thereto.

After obtaining the above scene images, the processor 204 may extract the feature points in each scene image based on, for example, the scale-invariant feature transform (SIFT) algorithm or the speeded-up robust features (SURF) algorithm, but is not limited thereto.
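
For illustration only (this sketch is not part of the patent text, and the use of OpenCV's SIFT implementation is an assumption), feature extraction of the kind described in step S310 could look like:

```python
import cv2

def extract_feature_points(scene_image):
    # Detect feature points and descriptors with SIFT;
    # SURF would be used analogously where available.
    sift = cv2.SIFT_create()
    gray = cv2.cvtColor(scene_image, cv2.COLOR_BGR2GRAY)
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    return keypoints, descriptors
```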

Thereafter, in step S320, the processor 204 identifies, in each scene image, the specific object regions corresponding to specific objects, and extracts from each scene image the specific feature points that do not correspond to any specific object region. In different embodiments, the specific objects considered by the present invention are, for example, at least one of a lamp and an animal (e.g., a human, a cat, a dog, etc.), but are not limited thereto.

In one embodiment, the processor 204 may, for example, use a machine learning algorithm trained to recognize lamps and animals to identify the specific object regions corresponding to specific objects in each scene image, and exclude the feature points located within each specific object region, so that only the specific feature points not corresponding to any specific object region remain in each scene image, but the disclosure is not limited thereto.
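
A minimal sketch of this exclusion step, under the assumption that the detector (not shown) returns axis-aligned bounding boxes for lamps/animals and that keypoints are OpenCV KeyPoint objects; the function and box format are illustrative, not the patent's specification:

```python
def filter_specific_feature_points(keypoints, object_boxes):
    # Keep only the feature points lying outside every detected object region.
    # object_boxes: list of (x_min, y_min, x_max, y_max) tuples.
    def inside(point, box):
        x, y = point
        x0, y0, x1, y1 = box
        return x0 <= x <= x1 and y0 <= y <= y1

    return [kp for kp in keypoints
            if not any(inside(kp.pt, box) for box in object_boxes)]
```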

Further, as mentioned earlier, when lamps or animals are present in the obtained scene images, errors tend to occur in the subsequent feature point linkage. Therefore, through step S320, the present invention can directly exclude the feature points corresponding to specific objects such as lamps and animals, thereby avoiding the above situation and improving the subsequent positioning accuracy of the head-mounted display 200, but the disclosure is not limited thereto.

Thereafter, in step S330, the processor 204 can obtain the feature point position of each scene feature point in the reference scene based on the specific feature points of each scene image. In one embodiment, the processor 204 can determine, based on the related prior art (e.g., "Lowe, D.G. Distinctive Image Features From Scale-Invariant Keypoints. International Journal of Computer Vision 60, 91-110 (2004)"), which specific feature points in different scene images correspond to the same scene feature point in the reference scene (the reference scene may include multiple scene feature points), and link these feature points; the result of the linkage roughly corresponds to what is illustrated in FIG. 1 (except that the feature points corresponding to specific objects such as lamps and animals have been excluded).
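
One conventional way to realize such a linkage, sketched here as an assumption (descriptor matching with Lowe's ratio test, rather than a procedure mandated by the patent):

```python
import cv2

def link_feature_points(descriptors1, descriptors2, ratio=0.75):
    # Match descriptors between two scene images; each surviving match
    # links two specific feature points that observe the same scene
    # feature point in the reference scene.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    candidates = matcher.knnMatch(descriptors1, descriptors2, k=2)
    links = []
    for pair in candidates:
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            links.append(pair[0])  # passes Lowe's ratio test
    return links
```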

After knowing which specific feature points in the different scene images correspond to the same scene feature point, the processor 204 can further estimate the feature point position of that scene feature point in the reference scene. For example, in one embodiment, assume that the scene images include a first scene image captured by the lens 2061 and a second scene image captured by the lens 2062, that the first scene image includes a first specific feature point, that the second scene image includes a second specific feature point, and that the first and second specific feature points both correspond to a first scene feature point among the scene feature points.

In this case, the processor 204 can use triangular parallax (stereo triangulation) to estimate the feature point position of the first scene feature point in the reference scene based on the first specific feature point, the second specific feature point, and the relative position of the lenses 2061 and 2062. Since the relative position of the lenses 2061 and 2062 is assumed known, the processor 204 can estimate the feature point position of the first scene feature point in the reference scene simply based on the triangular parallax method; for its details, refer to the related technical literature (e.g., "R. Mur-Artal and J. D. Tardós, "ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras," in IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1255-1262, Oct. 2017."), which is not repeated here.
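
A sketch of the triangulation, under the assumption that 3x4 projection matrices for the two lenses are available from the known relative lens positions (OpenCV usage shown for illustration only):

```python
import cv2
import numpy as np

def triangulate_scene_point(P1, P2, point1, point2):
    # P1, P2: 3x4 projection matrices of lenses 2061 and 2062.
    # point1, point2: matched pixel coordinates of the same scene feature point.
    pts1 = np.asarray(point1, dtype=np.float64).reshape(2, 1)
    pts2 = np.asarray(point2, dtype=np.float64).reshape(2, 1)
    homogeneous = cv2.triangulatePoints(P1, P2, pts1, pts2)  # 4x1 result
    return (homogeneous[:3] / homogeneous[3]).ravel()        # (x, y, z)
```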

Next, in step S340, the processor 204 can obtain the first device position of the head-mounted display 200 in the reference scene at the i-th time point, and obtain the moving distance of the head-mounted display 200 between the i-th and (i+1)-th time points. It should be understood that the technical concept of the present invention can roughly be understood as continuously estimating the device position of the head-mounted display 200 at the current time point based on its device position at the previous time point and the feature point positions of the scene feature points. Therefore, if the time point under consideration is the (i+1)-th time point, then for the processor 204 the device position of the head-mounted display 200 in the reference scene at the i-th time point (i.e., the above first device position) can be regarded as known, but the disclosure is not limited thereto.

In addition, the head-mounted display 200 of the present invention may be provided with an inertial measurement unit (IMU) (not shown), and the processor 204 can obtain the acceleration measured by this IMU between the i-th and (i+1)-th time points and estimate accordingly the moving distance of the head-mounted display 200 between the i-th and (i+1)-th time points.
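
For illustration, a naive sketch of this estimate by double integration of the IMU accelerations, assuming gravity has already been compensated and the samples arrive at a fixed interval dt (real implementations typically also handle bias and drift):

```python
import numpy as np

def moving_distance_from_imu(acceleration_samples, dt):
    # acceleration_samples: 3-axis accelerations measured between
    # time point i and time point i+1, sampled every dt seconds.
    velocity = np.zeros(3)
    displacement = np.zeros(3)
    for a in acceleration_samples:
        velocity += np.asarray(a, dtype=float) * dt   # integrate to velocity
        displacement += velocity * dt                 # integrate to displacement
    return np.linalg.norm(displacement)               # scalar moving distance
```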

Next, in step S350, the processor 204 can obtain, based on the feature point position of each scene feature point, the measured position of the head-mounted display 200 in the reference scene at the (i+1)-th time point. In one embodiment, the measured position of the head-mounted display 200 at the (i+1)-th time point can be taken as the origin, so given the feature point position of each scene feature point (which can be expressed in vector form), the processor 204 can simply work backwards to find the position of this origin, i.e., the measured position of the head-mounted display 200 at the (i+1)-th time point, but the disclosure is not limited thereto.
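
A simplified sketch of this back-calculation, under the strong assumption that the orientation difference between the HMD frame and the reference-scene frame has already been compensated, so that subtracting each feature point's HMD-relative vector from its known scene position yields an estimate of the HMD origin:

```python
import numpy as np

def measured_device_position(points_in_scene, points_relative_to_hmd):
    # points_in_scene: known feature point positions in the reference scene.
    # points_relative_to_hmd: the same points expressed as vectors from the
    # HMD, which serves as the origin of its own frame.
    estimates = np.asarray(points_in_scene) - np.asarray(points_relative_to_hmd)
    return estimates.mean(axis=0)  # average over points to reduce noise
```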

Thereafter, in step S360, the processor 204 can estimate, based on the first device position, the moving distance, and the measured position, the second device position of the head-mounted display in the reference scene at the (i+1)-th time point.

In one embodiment, the processor 204 may use a Kalman filter to estimate, based on the first device position, the moving distance, and the above measured position, the second device position of the head-mounted display 200 in the reference scene at the (i+1)-th time point; the related technical details of using the Kalman filter are described below.

In one embodiment, the first device position of the head-mounted display 200 in three-dimensional space can be expressed as a state vector $\mathbf{x} = \begin{bmatrix} s & \dot{s} & \ddot{s} \end{bmatrix}^{T}$, where $\dot{s}$ and $\ddot{s}$ are the first and second derivatives of $s$. In addition, $s = (x, y, z)$, whose components respectively correspond to the x, y, and z coordinates in three-dimensional space.

In addition, define $\mathbf{F}$ as the transition matrix, which can be expressed as

$$\mathbf{F} = \begin{bmatrix} 1 & \Delta t & \tfrac{1}{2}\Delta t^{2} \\ 0 & 1 & \Delta t \\ 0 & 0 & 1 \end{bmatrix},$$

where $\Delta t = t_{i+1} - t_{i}$ is the time interval between the i-th time point and the (i+1)-th time point.

In addition, define $\mathbf{P}$ as the covariance matrix of the state vector. In one embodiment, $\mathbf{P}$ can be initialized as a diagonal matrix, e.g. $\mathbf{P} = \operatorname{diag}(\sigma_{1}^{2}, \sigma_{2}^{2}, \sigma_{3}^{2})$, where the $\sigma^{2}$ entries are the variances of the data, and the value of $\mathbf{P}$ may change over time.

In addition, define $\mathbf{Q}$ as the covariance matrix of the noise, which can be expressed as $\mathbf{Q} = \mathbf{G}\mathbf{G}^{T}\sigma_{w}^{2}$, where $\mathbf{G} = \begin{bmatrix} \tfrac{1}{2}\Delta t^{2} & \Delta t & 1 \end{bmatrix}^{T}$ and $\sigma_{w}$ is the noise power of the white noise.

In addition, define $\mathbf{H}$ as the measurement matrix used to convert $\mathbf{x}$ into the desired result (e.g., the measured position), which can be expressed as $\mathbf{H} = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}$, so that $\mathbf{H}\mathbf{x} = s$.

In addition, define $\mathbf{R}$ as the covariance matrix of the measurement, which can be expressed as $\mathbf{R} = \begin{bmatrix} \sigma_{\mathrm{SLAM}}^{2} \end{bmatrix}$, where $\sigma_{\mathrm{SLAM}}^{2}$ is the measurement variance of simultaneous localization and mapping (SLAM).

In one embodiment, in the predict stage of the Kalman filter, the processor 204 can generate a predicted position $\hat{\mathbf{x}}$ according to $\hat{\mathbf{x}} = \mathbf{F}\mathbf{x} + \mathbf{B}u$, where the control factor $u$ (e.g., the motion measured between the two time points) can be substituted into the state through the control term $\mathbf{B}u$.

Next, the processor 204 can adjust $\mathbf{P}$ to take the uncertainty of the prediction into account, where the adjusted $\mathbf{P}$ (denoted $\mathbf{P}'$) can be computed as $\mathbf{P}' = \mathbf{F}\mathbf{P}\mathbf{F}^{T} + \mathbf{Q}$.

Thereafter, in the update stage of the Kalman filter, the processor 204 can compute the innovation covariance $\mathbf{S} = \mathbf{H}\mathbf{P}'\mathbf{H}^{T} + \mathbf{R}$ to obtain the measurement together with the covariance matrix describing its accuracy.

Next, the processor 204 can compute the difference $\mathbf{y}$ between the measured position $\mathbf{z}$ and the predicted position according to $\mathbf{y} = \mathbf{z} - \mathbf{H}\hat{\mathbf{x}}$.

Thereafter, the processor 204 can compute a scaling factor $\mathbf{K}$ that weights whichever of the measured position and the predicted position is more accurate, where $\mathbf{K} = \mathbf{P}'\mathbf{H}^{T}(\mathbf{H}\mathbf{P}'\mathbf{H}^{T} + \mathbf{R})^{-1}$.

Thereafter, the processor 204 can estimate the second device position $\mathbf{x}'$ of the head-mounted display 200 based on the scaling factor, the predicted position, and the above difference, where $\mathbf{x}' = \hat{\mathbf{x}} + \mathbf{K}\mathbf{y}$. Furthermore, the processor 204 can update the covariance matrix of the state vector according to the certainty of the measurement, i.e., $\mathbf{P} = (\mathbf{I} - \mathbf{K}\mathbf{H})\mathbf{P}'$, and this step reflects the property that $\mathbf{P}$ changes over time. For further technical details related to the Kalman filter, refer to the related prior art literature (e.g., "Kalman, Rudolph Emil. "A new approach to linear filtering and prediction problems." (1960): 35-45."), which is not repeated here.

Please refer to FIG. 4, which is a schematic diagram of a computer device according to an embodiment of the present invention. In different embodiments, the computer device 400 is, for example, a personal computer, a notebook computer, a server, or another similar device capable of obtaining the scene images captured by the head-mounted display 200, but is not limited thereto. As shown in FIG. 4, the computer device 400 may include a storage circuit 402 and a processor 404; for the possible implementations of the storage circuit 402 and the processor 404, refer to the related descriptions of the storage circuit 202 and the processor 204, which are not repeated here.

In one embodiment, the processor 404 can access the modules and program codes recorded in the storage circuit 402 to implement the method of FIG. 3. In other words, each of the steps/operations described above as executed by the processor 204 of the head-mounted display 200 can instead be executed by the processor 404 of the computer device 400, based on the obtained scene images, to estimate the second device position of the head-mounted display 200. For the related details, refer to the descriptions of the previous embodiments, which are not repeated here.

In summary, by excluding the feature points corresponding to specific objects (e.g., animals, lamps, etc.), the method of the present invention makes the feature point linkage between different scene images more accurate, thereby achieving more accurate positioning of the head-mounted display.

Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the present invention. Anyone with ordinary knowledge in the relevant technical field may make some changes and modifications without departing from the spirit and scope of the present invention; therefore, the protection scope of the present invention shall be defined by the appended claims.

111, 112: image
199: scene
200: head-mounted display
202, 402: storage circuit
204, 404: processor
2061~206N: lens
400: computer device
S310~S360: steps

FIG. 1 shows different images of the same scene captured by different lenses on a conventional head-mounted display.
FIG. 2 is a schematic diagram of a head-mounted display according to an embodiment of the present invention.
FIG. 3 is a flowchart of a method for estimating the position of a head-mounted display according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of a computer device according to an embodiment of the present invention.

S310~S360: steps

Claims (8)

1. A method for estimating the position of a head-mounted display, wherein multiple lenses are disposed on the head-mounted display, the method comprising:
obtaining, at an i-th time point, multiple scene images of a reference scene captured by the lenses, and extracting at least one feature point in each of the scene images;
identifying at least one specific object region corresponding to at least one specific object in each of the scene images, and extracting from each of the scene images at least one specific feature point that does not correspond to any of the specific object regions, wherein the at least one specific feature point corresponds to at least one scene feature point in the reference scene;
obtaining a feature point position of each of the scene feature points in the reference scene based on the at least one specific feature point of each of the scene images;
obtaining a first device position of the head-mounted display in the reference scene at the i-th time point, and obtaining a moving distance of the head-mounted display between the i-th time point and an (i+1)-th time point;
obtaining a measured position of the head-mounted display in the reference scene at the (i+1)-th time point based on the feature point position of each of the scene feature points; and
estimating a second device position of the head-mounted display in the reference scene at the (i+1)-th time point based on the first device position, the moving distance, and the measured position.

2. The method according to claim 1, wherein the step of extracting the at least one feature point in each of the scene images comprises:
extracting the at least one feature point in each of the scene images based on a scale-invariant feature transform algorithm or a speeded-up robust features algorithm.

3. The method according to claim 1, wherein the at least one specific object comprises at least one of a lamp and an animal.
4. The method according to claim 1, wherein the lenses comprise a first lens and a second lens, the scene images comprise a first scene image captured by the first lens and a second scene image captured by the second lens, the first scene image comprises a first specific feature point, the second scene image comprises a second specific feature point, the first specific feature point and the second specific feature point both correspond to a first scene feature point among the scene feature points, and the step of obtaining the feature point position of each of the scene feature points in the reference scene based on the at least one specific feature point of each of the scene images comprises:
using triangular parallax to estimate the feature point position of the first scene feature point in the reference scene based on the first specific feature point, the second specific feature point, and a relative position of the first lens and the second lens.

5. The method according to claim 1, wherein the head-mounted display is provided with an inertial measurement unit, and the step of obtaining the moving distance of the head-mounted display between the i-th time point and the (i+1)-th time point comprises:
obtaining an acceleration measured by the inertial measurement unit between the i-th time point and the (i+1)-th time point, and estimating accordingly the moving distance of the head-mounted display between the i-th time point and the (i+1)-th time point.

6. The method according to claim 1, wherein the step of estimating the second device position of the head-mounted display in the reference scene at the (i+1)-th time point based on the feature point position of each of the scene feature points, the first device position, and the moving distance comprises:
using a Kalman filter to estimate a second device position of the head-mounted display in the reference scene at the (i+1)-th time point based on the feature point position of each of the scene feature points, the first device position, the moving distance, and a measured position of the head-mounted display at the (i+1)-th time point.
7. A head-mounted display, comprising:
multiple lenses;
a storage circuit storing multiple modules; and
a processor, coupled to the lenses and the storage circuit, accessing the modules to perform the following steps:
obtaining, at an i-th time point, multiple scene images of a reference scene captured by the lenses, and extracting at least one feature point in each of the scene images;
identifying at least one specific object region corresponding to at least one specific object in each of the scene images, and extracting from each of the scene images at least one specific feature point that does not correspond to any of the specific object regions, wherein the at least one specific feature point corresponds to at least one scene feature point in the reference scene;
obtaining a feature point position of each of the scene feature points in the reference scene based on the at least one specific feature point of each of the scene images;
obtaining a first device position of the head-mounted display in the reference scene at the i-th time point, and obtaining a moving distance of the head-mounted display between the i-th time point and an (i+1)-th time point;
obtaining a measured position of the head-mounted display in the reference scene at the (i+1)-th time point based on the feature point position of each of the scene feature points; and
estimating a second device position of the head-mounted display in the reference scene at the (i+1)-th time point based on the first device position, the moving distance, and the measured position.
8. A computer device, comprising:
a storage circuit storing multiple modules; and
a processor, coupled to the storage circuit, accessing the modules to perform the following steps:
obtaining, at an i-th time point, multiple scene images of a reference scene captured by multiple lenses on a head-mounted display, and extracting at least one feature point in each of the scene images;
identifying at least one specific object region corresponding to at least one specific object in each of the scene images, and extracting from each of the scene images at least one specific feature point that does not correspond to any of the specific object regions, wherein the at least one specific feature point corresponds to at least one scene feature point in the reference scene;
obtaining a feature point position of each of the scene feature points in the reference scene based on the at least one specific feature point of each of the scene images;
obtaining a first device position of the head-mounted display in the reference scene at the i-th time point, and obtaining a moving distance of the head-mounted display between the i-th time point and an (i+1)-th time point;
obtaining a measured position of the head-mounted display in the reference scene at the (i+1)-th time point based on the feature point position of each of the scene feature points; and
estimating a second device position of the head-mounted display in the reference scene at the (i+1)-th time point based on the first device position, the moving distance, and the measured position.
TW109108944A 2020-03-18 2020-03-18 Method for estimating position of electronic device, electronic device and computer device TWI731624B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109108944A TWI731624B (en) 2020-03-18 2020-03-18 Method for estimating position of electronic device, electronic device and computer device


Publications (2)

Publication Number Publication Date
TWI731624B true TWI731624B (en) 2021-06-21
TW202136854A TW202136854A (en) 2021-10-01

Family

ID=77517354

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109108944A TWI731624B (en) 2020-03-18 2020-03-18 Method for estimating position of electronic device, electronic device and computer device

Country Status (1)

Country Link
TW (1) TWI731624B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI279142B (en) * 2005-04-20 2007-04-11 Univ Nat Chiao Tung Picture capturing and tracking method of dual cameras
WO2018134897A1 (en) * 2017-01-17 2018-07-26 マクセル株式会社 Position and posture detection device, ar display device, position and posture detection method, and ar display method
CN110197510A (en) * 2019-06-05 2019-09-03 广州极飞科技有限公司 Scaling method, device, unmanned plane and the storage medium of binocular camera

Also Published As

Publication number Publication date
TW202136854A (en) 2021-10-01
