TWI907161B - Method for display virtual object, head mounted display system and head mounted display apparatus - Google Patents
- Publication number: TWI907161B (Application number: TW113144324A)
- Authority: TW (Taiwan)
Description
The present invention relates to augmented reality display, and more particularly to a method for displaying a virtual object, a head-mounted display system, and a head-mounted display device.
With the advancement of technology, applications of augmented reality (AR) are becoming increasingly widespread. AR technology overlays virtual information onto the real world, creating more interactive and immersive experiences. Moreover, head-mounted display devices with AR capabilities are increasingly being combined with other display devices in everyday life, enabling more diverse and functional display scenarios. However, for a head-mounted display device to cooperate smoothly with another display device, the display position, size, and proportion of virtual objects must be dynamically adjusted according to parameters of the other display device, such as its screen size, position, and resolution. Only then can the virtual content and the real-world display achieve optimal coordination. At present, how to make virtual objects fit ideally onto the display content area of other display devices in a real scene remains a topic of concern to those skilled in the art.
An embodiment of the present invention provides a method for displaying a virtual object, comprising the following steps. An environmental image is captured toward a display screen by an image capturing device on a head-mounted display device, where a plurality of preset markers are presented on the display screen. Real-time pose information of the display screen is determined according to the plurality of preset markers in the environmental image. Target pose information of the display screen in an augmented reality coordinate system is determined according to the real-time pose information of the display screen and device pose information of the head-mounted display device. A virtual object anchored to the display screen is displayed through the head-mounted display device according to the target pose information of the display screen.
An embodiment of the present invention provides a head-mounted display system, which includes a head-mounted display device and a computer device. The head-mounted display device includes an image capturing device. The image capturing device is used to capture an environmental image toward a display screen, on which a plurality of preset markers are presented. The computer device is connected to the head-mounted display device and includes a storage device and a processor. The processor is coupled to the storage device and configured to perform the following operations: determining real-time pose information of the display screen according to the plurality of preset markers in the environmental image; determining target pose information of the display screen in an augmented reality coordinate system according to the real-time pose information of the display screen and device pose information of the head-mounted display device; and displaying, through the head-mounted display device, a virtual object anchored to the display screen according to the target pose information of the display screen.
An embodiment of the present invention provides a head-mounted display device, which includes an image capturing device, a storage device, and a processor. The image capturing device captures an environmental image toward a display screen, on which a plurality of preset markers are presented. The processor is coupled to the storage device and configured to perform the following operations: determining real-time pose information of the display screen according to the plurality of preset markers in the environmental image; determining target pose information of the display screen in an augmented reality coordinate system according to the real-time pose information of the display screen and device pose information of the head-mounted display device; and displaying, through the head-mounted display device, a virtual object anchored to the display screen according to the target pose information of the display screen.
Based on the above, in embodiments of the present invention, a plurality of preset markers are presented on the display screen. The real-time pose information of the display screen can be estimated from the preset markers in the environmental image captured by the head-mounted display device. The target pose information of the display screen in the augmented reality coordinate system can then be determined from the real-time pose information of the display screen and the device pose information of the head-mounted display device. The display position of the virtual object can thus be determined from the target pose information of the display screen, so that the user sees, through the head-mounted display device, a virtual object stably anchored to the display screen. This improves the accuracy and flexibility of coordination between virtual and real displays, achieving a smoother augmented reality experience.
Some embodiments of the present invention will be described in detail below with reference to the accompanying drawings. Where the same reference numerals appear in different drawings, they denote the same or similar elements. These embodiments are only a part of the present invention and do not disclose all of its possible implementations. More precisely, they are merely examples of the methods and systems within the scope of the claims of the present invention.
Figures 1A and 1B are schematic diagrams of a head-mounted display system according to an embodiment of the present invention. Referring to Figure 1A, the head-mounted display system 10A includes a head-mounted display device 110 and a display screen 120. Referring to Figure 1B, the head-mounted display system 10B includes a head-mounted display device 110, a computer device 130, and a display screen 120.
In some embodiments, the display screen 120 may be, for example, a laptop screen, a desktop monitor, a tablet screen, or a television. From another perspective, the display screen 120 may be any of various types of displays, such as a liquid crystal display (LCD), a light-emitting diode (LED) display, or an organic light-emitting diode (OLED) display, and the invention is not limited in this regard. Alternatively, in some embodiments, the display screen 120 may be a projection screen, a wall, or another surface suitable for displaying a projected image.
In some embodiments, the head-mounted display device 110 may be, for example, augmented reality glasses or a mixed reality device. In some embodiments, the head-mounted display device 110 may be connected via a wired or wireless transmission interface to the electronic apparatus that includes the display screen 120. For example, the head-mounted display device 110 may be connected via a wired or wireless transmission interface to a laptop computer that includes the display screen 120. The head-mounted display systems 10A and 10B can provide augmented reality content to the user, and the head-mounted display device 110 can display virtual objects. In some embodiments, a virtual object displayed by the head-mounted display device 110 appears anchored to the display screen 120 in the actual scene.
In some embodiments, when the user wears the head-mounted display device 110 and faces the display screen 120 in the actual scene, the virtual objects displayed by the head-mounted display device 110 may be virtual frames, virtual windows, and the like.
It should be noted that in the embodiment of Figure 1A, the head-mounted display device 110 may include an image capturing device 111, a storage device 112, and a processor 113, and the processor 113 of the head-mounted display device 110 can generate the display content itself. On the other hand, in the embodiment of Figure 1B, the head-mounted display device 110 may include the image capturing device 111 and be connected to a computer device 130 that includes the storage device 112 and the processor 113. That is, in the example of Figure 1B, the head-mounted display device 110 displays content determined by the processor 113 of the computer device 130.
The image capturing device 111 is used to capture environmental images and includes a camera with a lens and a photosensitive element. The photosensitive element senses the intensity of light entering the lens to generate an image; it may be, for example, a charge-coupled device (CCD), a complementary metal-oxide-semiconductor (CMOS) element, or another element, and the invention is not limited in this regard. In one embodiment, the image capturing device 111 is fixedly mounted on the head-mounted display device 110 and captures the actual scene in front of the head-mounted display device 110. For example, when the user wears the head-mounted display device 110, the image capturing device 111 may be located between the user's eyes or in front of one eye, shooting the actual scene in front of the user. In embodiments of the invention, the image capturing device 111 can shoot toward the display screen 120.
It should also be noted that what the user sees through the display of the head-mounted display device 110 is an augmented reality scene with virtual objects superimposed. The head-mounted display device 110 may include a display for presenting virtual objects (not shown in Figures 1A and 1B). In some embodiments, this display may be an optical see-through display, so that the actual scene on the far side of the display is visible to the viewer. In other embodiments, the display of the head-mounted display device 110 may present the actual scene and the virtual objects at the same time; that is, the head-mounted display device 110 can display both simultaneously.
The storage device 112 is used to store data and program code accessed by the processor 113 (such as an operating system, applications, and drivers), and may be, for example, any type of fixed or removable random access memory (RAM), read-only memory (ROM), flash memory, or a combination thereof.
The processor 113 is coupled to the storage device 112 and is, for example, a central processing unit (CPU), an application processor (AP), or another programmable general-purpose or special-purpose microprocessor, digital signal processor (DSP), image signal processor (ISP), graphics processing unit (GPU), or a similar device, integrated circuit, or combination thereof. The processor 113 can access and execute the program code and software modules recorded in the storage device 112 to implement the method for displaying a virtual object in the embodiments of the invention.
Figure 2 is a diagram of an application scenario of a head-mounted display system according to an embodiment of the present invention. Referring to Figure 2, when the user views the display screen 120 through the display of the head-mounted display device 110, the user sees an augmented reality scene in which the virtual objects V_obj1 and V_obj2 are superimposed on the actual scene. In detail, while the user wears the head-mounted display device 110 and views the virtual objects, the image capturing device 111 shoots toward the display screen 120 and captures an image sequence comprising multiple environmental images. The processor 113 can determine display parameters of the virtual objects V_obj1 and V_obj2 in real time from these environmental images, such as display boundaries, display size, or display position, so that V_obj1 and V_obj2 appear anchored to the display screen 120. The virtual objects V_obj1 and V_obj2 shown on the display of the head-mounted display device 110 may move correspondingly as the user moves or turns their head.
As shown in the example of Figure 2, when the user views the display screen 120 through the display of the head-mounted display device 110, the user sees the virtual object V_obj1 extending outward from the right display border of the display screen 120, and the virtual object V_obj2 covering the display area of the display screen 120. The virtual objects V_obj1 and V_obj2 can provide various information to the user, such as windows, documents, images, desktops, or visual output generated by running applications. However, Figure 2 is only an illustrative example, and the invention does not limit the number of virtual objects or their display positions.
It should be noted that the image capturing device 111 can capture multiple environmental images periodically and continuously (for example, generating environmental images at a capture frame rate of 30 Hz), and the processor 113 can repeatedly compute the target pose information of the display screen 120 in the augmented reality coordinate system, continuously updating the display position of the virtual objects according to the target pose information of the display screen 120. Thus, as long as the conditions for displaying the virtual objects V_obj1 and V_obj2 are met, even if the user's position changes or the user's head turns, V_obj1 and V_obj2 remain displayed at fixed positions relative to the display screen 120, anchored in the actual scene. For example, even if the user's position changes or the user's head turns, the object boundary of the virtual object V_obj2 can remain aligned with the screen boundary of the display screen 120.
It should be noted that in embodiments of the invention, a plurality of preset markers are presented on the display screen 120, and the processor 113 can locate the real-time position of the display screen 120 and obtain its real-time orientation from the preset markers in the environmental image. The processor 113 then determines the display parameters of a virtual object, such as display boundary, display size, or display position, according to the real-time pose information of the display screen 120, so that the virtual object appears anchored to the display screen 120. Embodiments are described below with reference to the elements of the head-mounted display systems 10A and 10B to explain the detailed steps of the method for displaying a virtual object.
Figure 3 is a flowchart of an anchored display method according to an embodiment of the present invention. Referring to Figures 1A, 1B, and 3, the method of this embodiment is applicable to the head-mounted display systems 10A and 10B of the above embodiments. The detailed steps of the method for displaying a virtual object in this embodiment are described below with reference to the elements of the head-mounted display systems 10A and 10B.
In step S310, the processor 113 captures an environmental image toward the display screen 120 by using the image capturing device 111 on the head-mounted display device 110. Here, a plurality of preset markers are presented on the display screen 120. That is, when the user wears the head-mounted display device 110 and faces the display screen 120, the image capturing device 111 can shoot toward the display screen 120 and obtain an environmental image that includes some or all of the preset markers.
In some embodiments, the preset markers may be multiple two-dimensional codes of the same size. In some embodiments, the preset markers may include multiple binary square markers (ArUco markers). Each binary square marker consists of a wide black border and an inner binary matrix; the inner matrix determines the marker identifier of the binary square marker. From another perspective, each binary square marker may be a square matrix whose inner matrix is composed of 4x4, 5x5, 6x6, 7x7, or 8x8 unit cells. In other embodiments, the preset markers may be QR codes.
In some embodiments, the display screen 120 may display the preset markers. Alternatively, in some embodiments, the preset markers may be projected onto the display screen 120. Alternatively, in some embodiments, the preset markers may be implemented as stickers, cards, or the like attached to the display screen 120.
In some embodiments, the preset markers are presented at the screen corners and the screen center of the display screen 120. The preset markers include a plurality of corner markers and a center marker. The corner markers are located at the screen corners of the display screen 120, and the center marker is located at the screen center of the display screen 120. In some embodiments, the display screen 120 may display four corner markers at its four screen corners and the center marker at the screen center. These preset markers may be binary square markers corresponding to different marker identifiers.
For example, Figure 4 is a schematic diagram of a display screen and a plurality of preset markers according to an embodiment of the present invention. Referring to Figure 4, the corner markers M1-M4 are displayed at the four screen corners of the display screen 120, and the center marker M5 is located at the screen center of the display screen 120. The corner markers M1-M4 and the center marker M5 are binary square markers of the same size, corresponding to different marker identifiers.
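As an illustration of this layout, the following sketch computes where the five markers might be drawn for a given screen resolution. The assignment of M1-M4 to particular corners and the pixel sizes are assumptions for illustration, not values taken from the patent.

```python
def marker_positions(screen_w, screen_h, marker_px):
    """Top-left pixel coordinates for four corner markers (M1-M4) and one
    center marker (M5). The corner assignment is an assumed convention."""
    corners = [
        (0, 0),                                        # M1: top-left
        (screen_w - marker_px, 0),                     # M2: top-right
        (0, screen_h - marker_px),                     # M3: bottom-left
        (screen_w - marker_px, screen_h - marker_px),  # M4: bottom-right
    ]
    center = ((screen_w - marker_px) // 2, (screen_h - marker_px) // 2)
    return corners, center

# Example: a 1920x1080 screen with 120-pixel markers.
corners, center = marker_positions(1920, 1080, 120)
```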
In step S320, the processor 113 determines the real-time pose information of the display screen 120 according to the plurality of preset markers in the environmental image. In detail, after acquiring the environmental image, the processor 113 can identify some or all of the preset markers from the environmental image. The processor 113 can identify one or more preset markers through image processing such as edge detection and quadrilateral detection, and obtain the displacement information and rotation information of these preset markers in the camera coordinate system of the image capturing device 111. Then, according to the displacement information and rotation information of the one or more preset markers in the camera coordinate system, the processor 113 can determine the real-time pose information of the display screen 120 in the camera coordinate system. The real-time pose information of the display screen 120 may include displacement information and rotation information in the camera coordinate system.
In some embodiments, when the environmental image includes the corner markers M1-M4 and the center marker M5, the processor 113 can solve a Perspective-n-Point (PnP) problem to compute the displacement information and rotation information of each of the corner markers M1-M4 and the center marker M5 relative to the image capturing device 111.
In some embodiments, when the preset markers are binary square markers (ArUco markers), the processor 113 can estimate the rotation vector and displacement vector of each binary square marker from the environmental image. In addition, in some embodiments, the processor 113 can convert the rotation vector of each binary square marker into a unit quaternion.
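A minimal sketch of that conversion, assuming the rotation vector is in axis-angle form (direction = rotation axis, magnitude = rotation angle in radians), as commonly returned by marker pose estimators:

```python
import math

def rotvec_to_quat(rx, ry, rz):
    """Convert an axis-angle rotation vector to a unit quaternion
    (w, x, y, z). A near-zero vector maps to the identity rotation."""
    angle = math.sqrt(rx * rx + ry * ry + rz * rz)
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(angle / 2.0) / angle
    return (math.cos(angle / 2.0), rx * s, ry * s, rz * s)
```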
In some embodiments, the displacement information may be implemented as a displacement (translation) vector. The displacement vector may include three vector elements, which are the displacements along the X, Y, and Z axes, respectively. The rotation information may be implemented as a rotation vector or a unit quaternion. For example, the rotation vector may encode pitch, roll, and yaw angles.
In some embodiments, the processor 113 may also check whether the identified markers in the environmental image match the preset markers on the display screen 120. For example, the processor 113 may determine whether the marker identifiers of the identified markers in the environmental image are the marker identifiers of the preset markers. When the identified markers in the environmental image do not match the preset markers on the display screen 120, the processor 113 can disable the display function of the virtual object.
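One way to read this check, sketched under the assumption that the hypothetical identifiers 0-4 are assigned to M1-M5 (the actual identifier values are not specified in the text):

```python
EXPECTED_IDS = {0, 1, 2, 3, 4}  # hypothetical identifiers for M1-M5

def markers_match(detected_ids):
    """Enable anchoring only when at least one marker is detected and
    every detected identifier belongs to the preset set; otherwise the
    virtual-object display function is disabled."""
    return bool(detected_ids) and set(detected_ids) <= EXPECTED_IDS
```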
In step S330, the processor 113 determines the target pose information of the display screen 120 in an augmented reality coordinate system according to the real-time pose information of the display screen 120 and the device pose information of the head-mounted display device 110. In detail, the processor 113 can perform a coordinate transformation between the camera coordinate system and the augmented reality coordinate system (also called the head-mounted display coordinate system) to obtain the real-time pose information of the display screen 120 in the augmented reality coordinate system. Here, the real-time pose information of the display screen 120 in the augmented reality coordinate system is the pose of the display screen 120 relative to the head-mounted display device 110. The processor 113 can further determine the absolute pose information of the display screen 120 (that is, the target pose information in the augmented reality coordinate system) from this relative pose information together with the device pose information of the head-mounted display device 110.
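The composition of relative and device poses can be illustrated with 4x4 homogeneous transforms: if T_ar_hmd is the device pose in the augmented reality frame and T_hmd_screen is the screen pose relative to the device, their product gives the screen's target (absolute) pose. The concrete numbers below are purely illustrative.

```python
def mat_mul(a, b):
    """Multiply two 4x4 row-major homogeneous transforms."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Illustrative poses: the HMD sits 2 m along the AR frame's Z axis,
# and the screen is detected 1 m in front of the camera.
T_ar_hmd = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 2], [0, 0, 0, 1]]
T_hmd_screen = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
T_ar_screen = mat_mul(T_ar_hmd, T_hmd_screen)  # target pose of the screen
```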
In step S340, the processor 113 displays a virtual object anchored to the display screen 120 through the head-mounted display device 110 according to the target pose information of the display screen 120. The processor 113 can determine the three-dimensional imaging position of the virtual object in the augmented reality coordinate system according to the target pose information of the display screen 120, and determine the display position of the virtual object in the display frame according to that three-dimensional imaging position. The virtual object is anchored to the display screen 120 and does not change position as the head-mounted display device 110 moves, so that the virtual object blends with the display screen 120 in the actual scene, enhancing the visual experience and improving convenience.
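Mapping the anchored three-dimensional position into the display frame can be sketched with a pinhole projection. The intrinsic parameters below (focal lengths fx/fy, principal point cx/cy) are made-up values for illustration, not parameters from the patent.

```python
def project(point, fx, fy, cx, cy):
    """Project a 3-D point, given in the rendering camera's frame,
    to 2-D display-frame coordinates with a pinhole model."""
    x, y, z = point
    if z <= 0:
        return None  # behind the viewer: nothing to draw
    return (fx * x / z + cx, fy * y / z + cy)

# A point on the optical axis lands on the principal point.
uv = project((0.0, 0.0, 2.0), fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
```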
Figure 5 is a flowchart of a method for displaying a virtual object according to an embodiment of the present invention. Referring to Figures 1A, 1B, and 5, the method of this embodiment is applicable to the head-mounted display systems 10A and 10B of the above embodiments. The detailed steps of the method for displaying a virtual object in this embodiment are described below with reference to the elements of the head-mounted display systems 10A and 10B.
In step S510, the processor 113 controls the display screen 120 to display a plurality of preset markers. As in the example of Figure 4, the display screen 120 can display the four corner markers M1-M4 and the one center marker M5. In step S520, the processor 113 captures an environmental image toward the display screen 120 by using the image capturing device 111 on the head-mounted display device 110.
In step S530, the processor 113 determines the real-time pose information of the display screen 120 according to the plurality of preset markers in the environmental image. In this embodiment, step S530 can be implemented as steps S531 and S532.
In step S531, the processor 113 can compute the rotation information and displacement information of at least one of the preset markers in the environmental image. The rotation information may be a rotation vector or a quaternion; the displacement information may be a displacement vector. In step S532, the processor 113 determines first real-time pose information of the display screen 120 in the camera coordinate system of the image capturing device 111 according to the rotation information and displacement information of the at least one preset marker.
Taking Figure 4 as an example, when the processor 113 identifies the corner markers M1-M4 and the center marker M5 from the environmental image, the processor 113 can obtain the rotation information and displacement information of each of the corner markers M1-M4 and of the center marker M5. Then, in some embodiments, the processor 113 can determine the first real-time pose information of the display screen 120 in the camera coordinate system from the rotation information and displacement information of the corner markers M1-M4. Alternatively, in some embodiments, the processor 113 can determine the first real-time pose information from the rotation information and displacement information of the center marker M5. Or, in some embodiments, the processor 113 can determine the first real-time pose information from the rotation and displacement information of both the corner markers M1-M4 and the center marker M5.
於一些實施例中,處理器113可根據多個預設標記中的一中心標記的旋轉資訊與位移資訊,決定顯示屏幕120的第一即時位姿資訊中的即時位移資訊與即時旋轉資訊。以圖4為例來說,處理器113可直接將中心標記M5的旋轉資訊與位移資訊設置為顯示屏幕120於相機座標系統下的第一即時位姿資訊。第一即時位姿資訊中的即時位移資訊為中心標記M5的位移資訊。第一即時位姿資訊中的即時旋轉資訊為中心標記M5的旋轉資訊。In some embodiments, processor 113 can determine the real-time displacement and rotation information in the first real-time pose information of display screen 120 based on the rotation and displacement information of a center marker among multiple preset markers. Taking Figure 4 as an example, processor 113 can directly set the rotation and displacement information of center marker M5 as the first real-time pose information of display screen 120 in the camera coordinate system. The real-time displacement information in the first real-time pose information is the displacement information of center marker M5. The real-time rotation information in the first real-time pose information is the rotation information of center marker M5.
於一些實施例中,處理器113可對多個預設標記中的多個角落標記的位移資訊進行一位移平均運算,以獲取第一即時位姿資訊中的即時位移資訊。以圖4為例來說,處理器113可計算出各個角落標記M1~M4的位移向量,並分別計算這4個位移向量於三軸上的三個平均值,從而獲取第一即時位姿資訊中的平均位移向量。亦即,第一即時位姿資訊中的即時位移資訊可為平均位移向量。In some embodiments, processor 113 can perform a displacement averaging operation on the displacement information of multiple corner markers among multiple preset markers to obtain the real-time displacement information in the first real-time pose information. Taking Figure 4 as an example, processor 113 can calculate the displacement vectors of each corner marker M1 to M4, and calculate the three average values of these four displacement vectors on the three axes respectively, thereby obtaining the average displacement vector in the first real-time pose information. That is, the real-time displacement information in the first real-time pose information can be the average displacement vector.
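上述位移平均運算可用下列片段概略示意(假設性範例,非專利之原始實作;各角落標記的位移數值僅供說明)。The displacement averaging operation described above can be sketched with the following snippet (a hypothetical example, not the original implementation of the patent; the displacement values of the corner markers are for illustration only).

```python
import numpy as np

# 四個角落標記 M1~M4 於相機座標系中的位移向量(示意數值,單位:公尺)
corner_translations = np.array([
    [-0.30,  0.18, 1.20],   # M1
    [ 0.30,  0.18, 1.21],   # M2
    [ 0.30, -0.17, 1.19],   # M3
    [-0.30, -0.17, 1.20],   # M4
])

# 於三軸上分別取平均,獲取第一即時位姿資訊中的平均位移向量
avg_translation = corner_translations.mean(axis=0)
```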
於一些實施例中,處理器113可對多個預設標記中的多個角落標記的旋轉資訊進行一球面線性內插(Spherical linear interpolation)運算,以獲取第一即時位姿資訊中的即時旋轉資訊。以圖4為例來說,處理器113可計算出各個角落標記M1~M4的四元數。接著,處理器113可對角落標記M1的四元數與角落標記M4的四元數進行一球面線性內插運算,以獲取第一球面線性內插結果。處理器113可對角落標記M2的四元數與角落標記M3的四元數進行一球面線性內插運算,以獲取第二球面線性內插結果。接著,處理器113可對第一球面線性內插結果與第二球面線性內插結果進行一球面線性內插運算,以獲取第一即時位姿資訊中的即時四元數(亦即,即時旋轉資訊)。In some embodiments, processor 113 can perform a spherical linear interpolation operation on the rotation information of multiple corner markers among multiple preset markers to obtain the real-time rotation information in the first real-time pose information. Taking Figure 4 as an example, processor 113 can calculate the quaternions of each corner marker M1 to M4. Then, processor 113 can perform a spherical linear interpolation operation on the quaternions of corner marker M1 and corner marker M4 to obtain a first spherical linear interpolation result. Processor 113 can perform a spherical linear interpolation operation on the quaternions of corner marker M2 and corner marker M3 to obtain a second spherical linear interpolation result. Next, the processor 113 can perform a spherical linear interpolation operation on the first spherical linear interpolation result and the second spherical linear interpolation result to obtain the real-time quaternion (i.e., real-time rotation information) in the first real-time pose information.
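上述對四個角落標記四元數的兩階段球面線性內插,可概略示意如下(假設性範例;四元數順序採 (x, y, z, w),數值僅供說明,非專利之原始實作)。The two-stage spherical linear interpolation over the quaternions of the four corner markers can be sketched as follows (a hypothetical example; quaternions are ordered (x, y, z, w), and the values are illustrative only).

```python
import numpy as np

def slerp(q0, q1, t=0.5):
    """球面線性內插:於單位四元數 q0 與 q1 之間以比例 t 內插。"""
    q0 = q0 / np.linalg.norm(q0)
    q1 = q1 / np.linalg.norm(q1)
    dot = np.dot(q0, q1)
    if dot < 0.0:             # 取最短旋轉路徑
        q1, dot = -q1, -dot
    if dot > 0.9995:          # 夾角過小時退化為線性內插以避免除以零
        q = q0 + t * (q1 - q0)
        return q / np.linalg.norm(q)
    theta = np.arccos(dot)
    return (np.sin((1 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)

# 四個角落標記的四元數(示意數值)
q_m1 = np.array([0.0, 0.0, 0.0, 1.0])
q_m2 = np.array([0.0, 0.1, 0.0, 1.0]); q_m2 = q_m2 / np.linalg.norm(q_m2)
q_m3 = np.array([0.0, -0.1, 0.0, 1.0]); q_m3 = q_m3 / np.linalg.norm(q_m3)
q_m4 = np.array([0.1, 0.0, 0.0, 1.0]); q_m4 = q_m4 / np.linalg.norm(q_m4)

r1 = slerp(q_m1, q_m4)   # 第一球面線性內插結果(M1 與 M4)
r2 = slerp(q_m2, q_m3)   # 第二球面線性內插結果(M2 與 M3)
q_rt = slerp(r1, r2)     # 即時四元數(即時旋轉資訊)
```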
此外,於一些實施例中,處理器113可同時根據中心標記的旋轉資訊與位移資訊與各個角落標記的旋轉資訊與位移資訊,決定顯示屏幕120的第一即時位姿資訊。圖6是依照本發明一實施例的決定顯示屏幕的即時位姿資訊的流程圖。請參照圖6。Furthermore, in some embodiments, the processor 113 can simultaneously determine the first real-time pose information of the display screen 120 based on the rotation and displacement information of the center marker and the rotation and displacement information of each corner marker. Figure 6 is a flowchart of determining the real-time pose information of the display screen according to an embodiment of the present invention. Please refer to Figure 6.
於步驟S611,處理器113可根據多個角落標記的旋轉資訊與位移資訊,獲取第一旋轉資訊與第一位移資訊。基於前文可知,處理器113可透過球面線性內插運算而根據多個角落標記的旋轉資訊獲取第一旋轉資訊。處理器113可透過位移平均運算而根據多個角落標記的位移資訊獲取第一位移資訊。In step S611, processor 113 can obtain first rotation information and first displacement information based on rotation and displacement information from multiple corner markers. As mentioned above, processor 113 can obtain first rotation information based on rotation information from multiple corner markers through spherical linear interpolation. Processor 113 can obtain first displacement information based on displacement information from multiple corner markers through displacement averaging.
於步驟S612,處理器113可根據中心標記的旋轉資訊與位移資訊,獲取第二旋轉資訊與第二位移資訊。第二旋轉資訊可為中心標記的旋轉資訊。第二位移資訊可為中心標記的位移資訊。In step S612, processor 113 can obtain second rotation information and second displacement information based on the rotation information and displacement information of the center mark. The second rotation information can be the rotation information of the center mark. The second displacement information can be the displacement information of the center mark.
於步驟S613,處理器113可根據頭戴顯示裝置110與顯示屏幕120之間的距離決定權重因子。具體而言,處理器113可透過查表或函式計算,根據此距離決定權重因子。於一些實施例中,處理器113可將頭戴顯示裝置110與顯示屏幕120之間的距離代入一預設函式而決定權重因子。In step S613, the processor 113 may determine a weighting factor based on the distance between the head-mounted display device 110 and the display screen 120. Specifically, the processor 113 may determine the weighting factor from this distance through a lookup table or a function calculation. In some embodiments, the processor 113 may substitute the distance between the head-mounted display device 110 and the display screen 120 into a preset function to determine the weighting factor.
於步驟S614,處理器113可根據權重因子對第一旋轉資訊與第二旋轉資訊進行加權和運算,以獲取第一即時位姿資訊中的即時旋轉資訊。於步驟S615,處理器113可根據權重因子對第一位移資訊與第二位移資訊進行加權和運算,以獲取第一即時位姿資訊中的即時位移資訊。In step S614, processor 113 can perform a weighted sum operation on the first rotation information and the second rotation information according to the weight factor to obtain the real-time rotation information in the first real-time pose information. In step S615, processor 113 can perform a weighted sum operation on the first displacement information and the second displacement information according to the weight factor to obtain the real-time displacement information in the first real-time pose information.
舉例來說,第一旋轉資訊所對應的第一加權權重可為權重因子,第二旋轉資訊所對應的第二加權權重可為1與權重因子之間的差。像是,當第一加權權重為w1,則第二加權權重可為1-w1。當頭戴顯示裝置110較靠近顯示屏幕120時,用於中心標記的第二加權權重可較高,且用於角落標記的第一加權權重可較低。反之,當頭戴顯示裝置110較遠離顯示屏幕120時,用於角落標記的第一加權權重可較高,且用於中心標記的第二加權權重可較低。換言之,中心標記的第二加權權重反相關於頭戴顯示裝置110與顯示屏幕120之間的距離。角落標記的第一加權權重正相關於頭戴顯示裝置110與顯示屏幕120之間的距離。原因在於,當頭戴顯示裝置110較靠近顯示屏幕120時,角落標記對應的旋轉資訊與位移資訊有不穩定的現象。For example, the first weight corresponding to the first rotation information may be the weighting factor, and the second weight corresponding to the second rotation information may be the difference between 1 and the weighting factor. That is, if the first weight is w1, the second weight may be 1-w1. When the head-mounted display device 110 is closer to the display screen 120, the second weight for the center marker may be higher and the first weight for the corner markers may be lower. Conversely, when the head-mounted display device 110 is farther from the display screen 120, the first weight for the corner markers may be higher and the second weight for the center marker may be lower. In other words, the second weight of the center marker is negatively correlated with the distance between the head-mounted display device 110 and the display screen 120, while the first weight of the corner markers is positively correlated with that distance. This is because the rotation information and displacement information of the corner markers tend to be unstable when the head-mounted display device 110 is close to the display screen 120.
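步驟S613至步驟S615的距離權重與加權和流程,可用下列片段概略示意(假設性範例;權重函式的線性形式與臨界距離 d_near、d_far 皆為假設,專利並未限定其形式)。The distance-based weighting and weighted-sum flow of steps S613 to S615 can be sketched as follows (a hypothetical example; the linear form of the weight function and the thresholds d_near and d_far are assumptions, not limitations of the patent).

```python
import numpy as np

def distance_weight(d, d_near=0.5, d_far=2.0):
    """角落標記的第一加權權重 w1 隨距離 d(公尺)線性增加,並限制於 [0, 1]。"""
    return float(np.clip((d - d_near) / (d_far - d_near), 0.0, 1.0))

t_corner = np.array([0.0, 0.0, 1.20])    # 第一位移資訊(角落標記的平均位移)
t_center = np.array([0.0, 0.0, 1.18])    # 第二位移資訊(中心標記的位移)
q_corner = np.array([0.0, 0.0, 0.0, 1.0])                        # 第一旋轉資訊
q_center = np.array([0.0, 0.1, 0.0, 1.0])
q_center = q_center / np.linalg.norm(q_center)                   # 第二旋轉資訊

d = 1.8                                   # 頭戴顯示裝置與顯示屏幕之間的距離(示意)
w1 = distance_weight(d)                   # 第一加權權重;第二加權權重為 1-w1

t_rt = w1 * t_corner + (1 - w1) * t_center        # 加權和 → 即時位移資訊
q_rt = w1 * q_corner + (1 - w1) * q_center        # 四元數加權和後再正規化(近似作法)
q_rt = q_rt / np.linalg.norm(q_rt)                # → 即時旋轉資訊
```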
於一些實施例中,處理器113可根據多個預設標記其中任意三者來決定顯示屏幕120的第一即時位姿資訊。進一步來說,處理器113可根據多個預設標記其中任意三者的位移資訊(亦即於相機座標系統中的相機座標位置)計算出顯示屏幕120的一法向量,並將此法向量轉換為顯示屏幕120的旋轉資訊。其中,處理器113可根據影像擷取裝置111的相機內參數與相機外參數,將環境影像中的多個預設標記的影像座標轉換為相機座標位置。In some embodiments, processor 113 can determine the first real-time pose information of display screen 120 based on any three of a plurality of preset markers. Further, processor 113 can calculate a normal vector of display screen 120 based on the displacement information (i.e., camera coordinate position in the camera coordinate system) of any three of the plurality of preset markers, and convert this normal vector into rotation information of display screen 120. Wherein, processor 113 can convert the image coordinates of a plurality of preset markers in the environmental image into camera coordinate positions based on the camera intrinsic and extrinsic parameters of image capturing device 111.
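上述以任意三個標記求取法向量並轉換為旋轉資訊的作法,可概略示意如下(假設性範例,非專利之原始實作;上方向向量的選取為常見慣例之一)。The approach of deriving a normal vector from any three markers and converting it into rotation information can be sketched as follows (a hypothetical example, not the original implementation of the patent; the choice of the up vector is one common convention).

```python
import numpy as np

# 任意三個預設標記於相機座標系中的相機座標位置(示意數值)
p1 = np.array([-0.3,  0.2, 1.2])
p2 = np.array([ 0.3,  0.2, 1.2])
p3 = np.array([ 0.3, -0.2, 1.2])

# 以兩個邊向量的外積求取屏幕平面的法向量
n = np.cross(p2 - p1, p3 - p1)
n = n / np.linalg.norm(n)

# 由法向量建立旋轉矩陣:以法向量為 Z 軸,輔以一個假設的上方向向量
up = np.array([0.0, 1.0, 0.0])
x_axis = np.cross(up, n)
x_axis = x_axis / np.linalg.norm(x_axis)
y_axis = np.cross(n, x_axis)
R = np.column_stack([x_axis, y_axis, n])  # 屏幕的旋轉資訊(旋轉矩陣形式)
```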
於一些實施例中,處理器113可對環境影像進行一標記識別,以獲取環境影像中的多個預設標記。處理器113可根據標記識別的標記識別結果,從多個候選位姿估測演算法挑選一目標位姿估測演算法。此目標位姿估測演算法用以決定顯示屏幕120的即時位姿資訊。詳細來說,基於前述實施例說明可知,處理器113可根據環境影像中的一或多個預設標記來決定顯示屏幕120的即時位姿資訊。因此,處理器113可根據標記識別結果來決定適合的目標位姿估測演算法。In some embodiments, the processor 113 may perform marker identification on the environmental image to obtain multiple preset markers in the environmental image. Based on the marker identification result, the processor 113 may select a target pose estimation algorithm from multiple candidate pose estimation algorithms. This target pose estimation algorithm is used to determine the real-time pose information of the display screen 120. Specifically, as described in the foregoing embodiments, the processor 113 may determine the real-time pose information of the display screen 120 based on one or more preset markers in the environmental image. Therefore, the processor 113 may determine a suitable target pose estimation algorithm according to the marker identification result.
如此一來,即便因標記毀損等因素而無法自環境影像識別出呈現於顯示屏幕120上的所有預設標記,處理器113依然可切換為不同的目標位姿估測演算法來決定顯示屏幕120的即時位姿資訊。舉例來說,當處理器113只從環境影像中識別出中心標記,處理器113可根據中心標記的旋轉資訊與位移資訊來決定顯示屏幕120的即時位姿資訊。又舉例來說,當處理器113從環境影像中識別出中心標記與所有角落標記,處理器113可根據頭戴顯示裝置110與顯示屏幕120之間的距離,決定使用各個角落標記來計算顯示屏幕120的即時位姿資訊,並忽略中心標記。In this way, even if not all preset markers presented on the display screen 120 can be identified from the environmental image, for example due to marker damage, the processor 113 can still switch to a different target pose estimation algorithm to determine the real-time pose information of the display screen 120. For example, when the processor 113 identifies only the center marker from the environmental image, the processor 113 may determine the real-time pose information of the display screen 120 based on the rotation information and displacement information of the center marker. As another example, when the processor 113 identifies the center marker and all corner markers from the environmental image, the processor 113 may decide, based on the distance between the head-mounted display device 110 and the display screen 120, to use the corner markers to calculate the real-time pose information of the display screen 120 and ignore the center marker.
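依標記識別結果切換目標位姿估測演算法的邏輯,可用下列假設性片段示意(演算法名稱與挑選條件皆為示意,非專利所限定)。The logic of switching the target pose estimation algorithm according to the marker identification result can be illustrated by the following hypothetical snippet (the algorithm names and selection conditions are illustrative, not limitations of the patent).

```python
def select_pose_estimator(detected):
    """依標記識別結果(識別出的標記名稱集合)挑選目標位姿估測演算法。"""
    corners = {"M1", "M2", "M3", "M4"}
    if corners <= detected and "M5" in detected:
        return "weighted_fusion"      # 角落+中心:依距離加權融合
    if corners <= detected:
        return "corner_average"       # 僅角落:位移平均 + 球面線性內插
    if "M5" in detected:
        return "center_only"          # 僅中心:直接採用中心標記位姿
    if len(detected) >= 3:
        return "three_marker_normal"  # 任意三個標記:法向量法
    return "unavailable"              # 標記不足,無法估測

algo = select_pose_estimator({"M5"})
```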
於步驟S540,處理器113根據顯示屏幕120的即時位姿資訊與頭戴顯示裝置110的裝置位姿資訊,決定顯示屏幕120於一擴增實境座標系統中的目標位姿資訊。於本實施例中,步驟S540可實施為步驟S541至步驟S542。In step S540, the processor 113 determines the target pose information of the display screen 120 in an augmented reality coordinate system based on the real-time pose information of the display screen 120 and the device pose information of the head-mounted display device 110. In this embodiment, step S540 can be implemented as steps S541 to S542.
於步驟S541,處理器113可根據擴增實境座標系統與相機座標系統之間的座標轉換關係,將相機座標系統下的第一即時位姿資訊轉換為擴增實境座標系統下的第二即時位姿資訊。此座標轉換關係可事前建立並記錄於儲存裝置112之中。此座標轉換關係取決於影像擷取裝置111於頭戴顯示裝置110上的擺放位置與拍攝方向,可經由事前測定而產生。In step S541, the processor 113 can convert the first real-time pose information under the camera coordinate system into the second real-time pose information under the augmented reality coordinate system based on the coordinate transformation relationship between the augmented reality coordinate system and the camera coordinate system. This coordinate transformation relationship can be established in advance and recorded in the storage device 112. This coordinate transformation relationship depends on the placement position of the image capturing device 111 on the head-mounted display device 110 and the shooting direction, and can be generated by prior measurement.
舉例而言,圖7是依照本發明一實施例的座標轉換的示意圖。相機座標系統Cam_c1的原點為影像擷取裝置111的所在位置,且相機座標系統的Z軸指向影像擷取裝置111的前方。另外,擴增實境座標系統AR_c1的座標原點Ori1位於視覺中心或場景中心,可透過使用者自行標定(例如透過按壓頭戴顯示裝置110上的初始化設定鍵)而決定。擴增實境座標系統AR_c1的Z軸可相反於相機座標系統Cam_c1的Z軸。於圖7的範例中,頭戴顯示裝置110的裝置中心點HPC1與擴增實境座標系統AR_c1的座標原點Ori1重合。影像擷取裝置111與頭戴顯示裝置110的裝置中心點HPC1之間存在一個固定位移。處理器113可根據此固定位移以及擴增實境座標系統AR_c1與相機座標系統Cam_c1之間的旋轉關係來進行座標轉換,以將相機座標系統Cam_c1下的第一即時位姿資訊轉換為擴增實境座標系統AR_c1下的第二即時位姿資訊。For example, Figure 7 is a schematic diagram of coordinate transformation according to an embodiment of the present invention. The origin of the camera coordinate system Cam_c1 is the location of the image capturing device 111, and the Z-axis of the camera coordinate system points toward the front of the image capturing device 111. In addition, the coordinate origin Ori1 of the augmented reality coordinate system AR_c1 is located at the visual center or the scene center, and can be calibrated by the user (for example, by pressing an initialization setting button on the head-mounted display device 110). The Z-axis of the augmented reality coordinate system AR_c1 may be opposite to the Z-axis of the camera coordinate system Cam_c1. In the example of Figure 7, the device center point HPC1 of the head-mounted display device 110 coincides with the coordinate origin Ori1 of the augmented reality coordinate system AR_c1. A fixed displacement exists between the image capturing device 111 and the device center point HPC1 of the head-mounted display device 110. The processor 113 may perform coordinate transformation according to this fixed displacement and the rotation relationship between the augmented reality coordinate system AR_c1 and the camera coordinate system Cam_c1, so as to convert the first real-time pose information under the camera coordinate system Cam_c1 into the second real-time pose information under the augmented reality coordinate system AR_c1.
於步驟S542,處理器113可根據頭戴顯示裝置110的裝置位姿資訊與顯示屏幕120的第二即時位姿資訊,獲取顯示屏幕120相對於擴增實境座標系統的一座標原點的目標位姿資訊。具體來說,顯示屏幕120的第二即時位姿資訊是顯示屏幕120於擴增實境座標系統下相對於頭戴顯示裝置110的相對位姿資訊。於是,處理器113可根據頭戴顯示裝置110於擴增實境座標系統下的裝置位姿資訊與顯示屏幕120的第二即時位姿資訊,來獲取顯示屏幕120相對於擴增實境座標系統的一座標原點的目標位姿資訊。In step S542, the processor 113 can obtain target pose information of the display screen 120 relative to a coordinate origin of the augmented reality coordinate system based on the device pose information of the head-mounted display device 110 and the second real-time pose information of the display screen 120. Specifically, the second real-time pose information of the display screen 120 is the relative pose information of the display screen 120 to the head-mounted display device 110 in the augmented reality coordinate system. Therefore, the processor 113 can obtain target pose information of the display screen 120 relative to a coordinate origin of the augmented reality coordinate system based on the device pose information of the head-mounted display device 110 in the augmented reality coordinate system and the second real-time pose information of the display screen 120.
舉例而言,圖8是依照本發明一實施例的座標轉換的示意圖。請參照圖8,在完成擴增實境座標系統AR_c1的座標原點Ori1的標定之後,頭戴顯示裝置110可於空間中任意移動。參照圖7與圖8可知,頭戴顯示裝置110的裝置中心點HPC1可從座標原點Ori1移動至其他位置。於是,在進行座標轉換而獲取擴增實境座標系統AR_c1下顯示屏幕120的第二即時位姿資訊之後,處理器113可根據頭戴顯示裝置110相對於擴增實境座標系統AR_c1的座標原點Ori1的位移資訊與旋轉資訊(亦即,裝置位姿資訊),來獲取顯示屏幕120相對於擴增實境座標系統AR_c1的座標原點Ori1的目標位姿資訊。For example, Figure 8 is a schematic diagram of coordinate transformation according to an embodiment of the present invention. Referring to Figure 8, after the calibration of the coordinate origin Ori1 of the augmented reality coordinate system AR_c1 is completed, the head-mounted display device 110 can move arbitrarily in space. As shown in Figures 7 and 8, the device center point HPC1 of the head-mounted display device 110 can move from the coordinate origin Ori1 to other positions. Therefore, after performing coordinate transformation to obtain the second real-time pose information of the display screen 120 under the augmented reality coordinate system AR_c1, the processor 113 can obtain the target pose information of the display screen 120 relative to the coordinate origin Ori1 of the augmented reality coordinate system AR_c1 based on the displacement information and rotation information of the head-mounted display device 110 relative to the coordinate origin Ori1 (i.e., the device pose information).
於一些實施例中,處理器113可根據下列公式(1)至公式(3),將顯示屏幕120於相機座標系統中的第一即時位移資訊轉換成相對於擴增實境座標系統的一座標原點的目標位移資訊。
t₁ = t_cam + t_off 公式(1)
t₂ = q_hmd ⊗ t₁ ⊗ q_hmd⁻¹ 公式(2)
t_screen = t₂ + t_hmd 公式(3)
其中,t_cam代表顯示屏幕120於相機座標系統中相對於影像擷取裝置111的第一即時位移資訊;t_off代表影像擷取裝置111與頭戴顯示裝置110的裝置中心點之間的固定位移;q_hmd代表頭戴顯示裝置110的旋轉四元數;t_hmd代表頭戴顯示裝置110於擴增實境座標系統中的裝置位移資訊;t_screen代表顯示屏幕120於擴增實境座標系統中的目標位移資訊。In some embodiments, the processor 113 may convert the first real-time displacement information of the display screen 120 in the camera coordinate system into target displacement information relative to a coordinate origin of the augmented reality coordinate system according to the following formulas (1) to (3):
t₁ = t_cam + t_off (1)
t₂ = q_hmd ⊗ t₁ ⊗ q_hmd⁻¹ (2)
t_screen = t₂ + t_hmd (3)
where t_cam represents the first real-time displacement information of the display screen 120 relative to the image capturing device 111 in the camera coordinate system; t_off represents the fixed displacement between the image capturing device 111 and the device center point of the head-mounted display device 110; q_hmd represents the rotation quaternion of the head-mounted display device 110; t_hmd represents the device displacement information of the head-mounted display device 110 in the augmented reality coordinate system; and t_screen represents the target displacement information of the display screen 120 in the augmented reality coordinate system.
於一些實施例中,處理器113可根據下列公式(4)至公式(5),將顯示屏幕120於相機座標系統中的第一即時旋轉資訊轉換成相對於擴增實境座標系統的一座標原點的目標旋轉資訊。
R₁ = R_x(180°)·R_cam 公式(4)
R_screen = R_hmd·R₁ 公式(5)
其中,R_cam代表顯示屏幕120於相機座標系統中的第一即時旋轉資訊;R_x(180°)代表將X軸旋轉180度的旋轉矩陣;R_hmd代表頭戴顯示裝置110於擴增實境座標系統中的裝置旋轉資訊;R_screen代表顯示屏幕120於擴增實境座標系統中的目標旋轉資訊。In some embodiments, the processor 113 may convert the first real-time rotation information of the display screen 120 in the camera coordinate system into target rotation information relative to a coordinate origin of the augmented reality coordinate system according to the following formulas (4) to (5):
R₁ = R_x(180°)·R_cam (4)
R_screen = R_hmd·R₁ (5)
where R_cam represents the first real-time rotation information of the display screen 120 in the camera coordinate system; R_x(180°) represents the rotation matrix that rotates 180 degrees about the X-axis; R_hmd represents the device rotation information of the head-mounted display device 110 in the augmented reality coordinate system; and R_screen represents the target rotation information of the display screen 120 in the augmented reality coordinate system.
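公式(1)至公式(3)所描述的位移轉換可用下列片段概略示意;其中轉換的組合順序(先補償相機安裝位移、再依裝置旋轉四元數旋轉、最後加上裝置位移)、四元數旋轉的矩陣展開方式與各輸入數值皆為假設性示意,非專利之原始實作。The displacement conversion described by formulas (1) to (3) can be sketched with the following snippet; the composition order (compensating the camera mounting offset, rotating by the device rotation quaternion, then adding the device displacement), the matrix expansion of the quaternion rotation, and all input values are hypothetical illustrations, not the original implementation of the patent.

```python
import numpy as np

def quat_rotate(q, v):
    """以單位四元數 q = (x, y, z, w) 旋轉三維向量 v,相當於 q ⊗ v ⊗ q⁻¹。"""
    x, y, z, w = q
    R = np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
    return R @ v

# 各輸入皆為示意數值
t_cam = np.array([0.0, 0.0, 1.5])        # 屏幕相對於影像擷取裝置的第一即時位移
t_off = np.array([0.0, 0.05, 0.0])       # 影像擷取裝置與裝置中心點之間的固定位移
q_hmd = np.array([0.0, 0.0, 0.0, 1.0])   # 頭戴顯示裝置的旋轉四元數(此處為單位旋轉)
t_hmd = np.array([0.2, 0.0, -0.5])       # 頭戴顯示裝置於擴增實境座標系中的裝置位移

t1 = t_cam + t_off                        # 補償相機安裝位移
t2 = quat_rotate(q_hmd, t1)               # 依裝置姿態旋轉
t_screen = t2 + t_hmd                     # 平移至擴增實境座標原點參考 → 目標位移資訊
```

公式(4)至公式(5)的旋轉部分則可類比地以旋轉矩陣乘法組合。The rotation part of formulas (4) to (5) can analogously be composed by rotation matrix multiplication.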
最後,於步驟S550,處理器113根據顯示屏幕120的目標位姿資訊,透過頭戴顯示裝置110顯示錨定於顯示屏幕120的一虛擬物件。Finally, in step S550, the processor 113 displays a virtual object anchored to the display screen 120 via the head-mounted display device 110 based on the target pose information of the display screen 120.
綜上所述,於本發明的實施例中,根據環境影像中呈現於顯示屏幕上的預設標記,可估測顯示屏幕的即時位姿資訊。顯示屏幕於擴增實境座標系統中的目標位姿資訊可根據顯示屏幕的即時位姿資訊與頭戴顯示裝置的裝置位姿資訊來決定。於是,虛擬物件的顯示位置可基於顯示屏幕的目標位姿資訊來決定,好讓使用者可經由頭戴顯示裝置觀看到穩定地錨定於顯示屏幕上的虛擬物件。基此,可提高虛擬與真實顯示的協同精度與靈活性,實現更為流暢的擴增實境體驗。藉此,可提昇使用者使用頭戴顯示裝置觀看虛擬物件的觀看體驗。In summary, in the embodiments of the present invention, the real-time pose information of the display screen can be estimated based on preset markers presented on the display screen in the environmental image. The target pose information of the display screen in the augmented reality coordinate system can be determined based on the real-time pose information of the display screen and the device pose information of the head-mounted display device. Therefore, the display position of the virtual object can be determined based on the target pose information of the display screen, allowing the user to view a virtual object stably anchored on the display screen via the head-mounted display device. Based on this, the coordination accuracy and flexibility between virtual and real displays can be improved, achieving a smoother augmented reality experience. This can enhance the user experience of viewing virtual objects using head-mounted displays.
雖然本發明已以實施例揭露如上,然其並非用以限定本發明,任何所屬技術領域中具有通常知識者,在不脫離本發明的精神和範圍內,當可作些許的更動與潤飾,故本發明的保護範圍當視後附的申請專利範圍所界定者為準。Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the present invention. Anyone with ordinary skill in the art may make some modifications and refinements without departing from the spirit and scope of the present invention. Therefore, the scope of protection of the present invention shall be determined by the appended claims.
10A,10B:頭戴顯示系統 110:頭戴顯示裝置 111:影像擷取裝置 112:儲存裝置 113:處理器 120:顯示屏幕 S310~S340,S510~550,S611~S615:步驟 V_obj1,V_obj2:虛擬物件 130:計算機裝置 AR_c1:擴增實境座標系統 Cam_c1:相機座標系統 HPC1:裝置中心點 M1~M4:角落標記 M5:中心標記 Ori1:座標原點10A, 10B: Head-mounted display system 110: Head-mounted display device 111: Image capturing device 112: Storage device 113: Processor 120: Display screen S310~S340, S510~550, S611~S615: Steps V_obj1, V_obj2: Virtual objects 130: Computer device AR_c1: Augmented reality coordinate system Cam_c1: Camera coordinate system HPC1: Device center point M1~M4: Corner markers M5: Center marker Ori1: Coordinate origin
圖1A是依照本發明一實施例的頭戴顯示系統的示意圖。 圖1B是依照本發明一實施例的頭戴顯示系統的示意圖。 圖2是依照本發明一實施例的顯示虛擬物件的應用情境圖。 圖3是依照本發明一實施例的顯示虛擬物件的方法的流程圖。 圖4是依照本發明一實施例的顯示屏幕與多個預設標記的示意圖。 圖5是依照本發明一實施例的顯示虛擬物件的方法的流程圖。 圖6是依照本發明一實施例的決定顯示屏幕的即時位姿資訊的流程圖。 圖7是依照本發明一實施例的座標轉換的示意圖。 圖8是依照本發明一實施例的座標轉換的示意圖。 Figure 1A is a schematic diagram of a head-mounted display system according to an embodiment of the present invention. Figure 1B is a schematic diagram of a head-mounted display system according to an embodiment of the present invention. Figure 2 is an application scenario diagram of displaying virtual objects according to an embodiment of the present invention. Figure 3 is a flowchart of a method for displaying virtual objects according to an embodiment of the present invention. Figure 4 is a schematic diagram of a display screen and multiple preset markers according to an embodiment of the present invention. Figure 5 is a flowchart of a method for displaying virtual objects according to an embodiment of the present invention. Figure 6 is a flowchart of determining the real-time pose information of the display screen according to an embodiment of the present invention. Figure 7 is a schematic diagram of coordinate transformation according to an embodiment of the present invention. Figure 8 is a schematic diagram of coordinate transformation according to an embodiment of the present invention.
S310~S340:步驟 S310~S340: Steps
Claims (12)
Publications (1)
| Publication Number | Publication Date |
|---|---|
| TWI907161B true TWI907161B (en) | 2025-12-01 |