
CN107194968B - Image recognition and tracking method, device, smart terminal and readable storage medium - Google Patents


Info

Publication number
CN107194968B
CN107194968B (application CN201710351693.2A)
Authority
CN
China
Prior art keywords
image
pattern
rotation angle
smart terminal
tracking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710351693.2A
Other languages
Chinese (zh)
Other versions
CN107194968A (en)
Inventor
孙星
郭晓威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shanghai Co Ltd
Original Assignee
Tencent Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shanghai Co Ltd filed Critical Tencent Technology Shanghai Co Ltd
Priority to CN201710351693.2A priority Critical patent/CN107194968B/en
Publication of CN107194968A publication Critical patent/CN107194968A/en
Priority to PCT/CN2018/087282 priority patent/WO2018210305A1/en
Application granted granted Critical
Publication of CN107194968B publication Critical patent/CN107194968B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/70 - Determining position or orientation of objects or cameras

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The invention discloses an image recognition and tracking method, an image recognition and tracking device, a smart terminal and a readable storage medium. The method comprises the following steps: obtaining a recognition result of a marker pattern in an image; locating the marker pattern in the image according to the recognition result, and performing target tracking on the located marker pattern to obtain translation information of the marker pattern in the image in space; and combining the translation information with a rotation angle output by multi-sensor fusion in the smart terminal to form a pose matrix of the marker pattern in the image. With multi-sensor fusion in the smart terminal, the rotation angle is obtained without using the target tracking process, which improves time performance overall and avoids the limitation that a rotation angle obtained through target tracking is inaccurate or even impossible to compute. The combination of the target tracking process and multi-sensor fusion guarantees a fast tracking speed while providing strong stability and accuracy, taking both tracking effect and time performance into account.

Description

Image recognition and tracking method, device, smart terminal and readable storage medium

Technical field

The present invention relates to the field of Internet application technology, and in particular to an image recognition and tracking method, device, smart terminal, and readable storage medium.

Background art

With the rapid development of Internet application technology, a smart terminal can estimate the pose corresponding to a captured image by performing target tracking on that image, and implement various interactive applications based on the captured image according to the estimated pose. The pose corresponding to the image describes the translation and rotation, in physical space, of the physical target captured in the image.

Existing target tracking is implemented on the basis of feature point tracking. Generally speaking, the more feature points are tracked and the more complex their descriptors, the better the tracking effect and the more accurate the estimated pose; however, the running speed also becomes slower. In other words, existing target tracking suffers from an inherent trade-off between time performance and tracking effect.

Tracking based solely on feature points inevitably forces a trade-off between time performance and tracking effect. In existing target-tracking applications on smart terminals, simple feature points are mostly used in order to obtain better time performance under limited processing resources, yielding faster feature extraction and tracking speeds. The tracking accuracy, however, is very low, so target tracking on smart terminals cannot achieve both a good tracking effect and good time performance.

Summary of the invention

In order to solve the technical problem in the related art that target tracking in images cannot take both tracking effect and time performance into account, one object of the present invention is to provide an image recognition and tracking method and device that overcome the defect of the prior art, namely the inability to guarantee tracking effect and time performance at the same time.

An image recognition and tracking method, the method comprising:

obtaining a recognition result of a marker pattern in an image captured by a smart terminal;

locating the marker pattern in the image according to the recognition result of the marker pattern, and performing target tracking on the located marker pattern to obtain translation information of the marker pattern in the image in space;

combining the translation information with a rotation angle output by multi-sensor fusion in the smart terminal to form a pose matrix of the marker pattern in the image.

An image recognition and tracking device, wherein the device comprises:

a recognition result obtaining module, configured to obtain a recognition result of a marker pattern in an image captured by a smart terminal;

a target tracking module, configured to locate the marker pattern in the image according to the recognition result of the marker pattern, and to perform target tracking on the located marker pattern to obtain translation information of the marker pattern in the image in space;

a pose obtaining module, configured to combine the translation information with a rotation angle output by multi-sensor fusion in the smart terminal to form a pose matrix of the marker pattern in the image.

A smart terminal, comprising:

a processor; and

a memory storing computer-readable instructions which, when executed by the processor, implement the image recognition and tracking method described above.

A computer-readable storage medium having a computer program stored thereon which, when executed by a processor, implements the image recognition and tracking method described above.

The technical solutions provided by the embodiments of the present invention may have the following beneficial effects:

For a captured image, the recognition result of the marker pattern in the image captured by the smart terminal is first obtained; the marker pattern in the image is then located according to that recognition result, and target tracking is performed on the located marker pattern to obtain translation information of the marker pattern in space; finally, the translation information and the rotation angle output by multi-sensor fusion in the smart terminal together form the pose matrix of the marker pattern in the image. With multi-sensor fusion in the smart terminal, the rotation angle no longer has to be obtained through the target tracking process, which improves time performance overall and avoids the limitation that a rotation angle obtained through target tracking is inaccurate or even impossible to compute. The combination of the target tracking process and multi-sensor fusion guarantees a fast tracking speed while providing strong stability and accuracy, taking both tracking effect and time performance into account.

It should be understood that the above general description and the following detailed description are exemplary only and do not limit the present invention.

Brief description of the drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the description, serve to explain the principles of the invention.

Figure 1 is a flowchart of an image recognition and tracking method according to an exemplary embodiment;

Figure 2 is a flowchart describing the details of computing, from the sensor data, the rotation angle of the smart terminal in space, according to an exemplary embodiment;

Figure 3 is a flowchart describing the details of step 230 according to an exemplary embodiment;

Figure 4 is a flowchart describing the details of step 130 according to the embodiment corresponding to Figure 1;

Figure 5 is a flowchart describing, according to an exemplary embodiment, the details of the step of performing perspective-transformation preprocessing on the currently captured image, relative to the image in which the marker pattern was first recognized, to obtain a perspective image used for target tracking of the currently captured image;

Figure 6 is a flowchart describing the details of step 110 according to the embodiment corresponding to Figure 1;

Figure 7 is a framework diagram of an augmented reality system in a smart terminal according to an exemplary embodiment;

Figure 8 is an implementation framework diagram of the image tracking step in the embodiment corresponding to Figure 7;

Figure 9 is a block diagram of an image recognition and tracking device according to an exemplary embodiment;

Figure 10 is a block diagram describing the details of the target tracking module according to the embodiment corresponding to Figure 9;

Figure 11 is a block diagram of an image recognition and tracking device according to another exemplary embodiment;

Figure 12 is a block diagram describing the details of the perspective transformation module according to an exemplary embodiment;

Figure 13 is a block diagram of a device according to an exemplary embodiment.

Detailed description of the embodiments

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of devices and methods consistent with some aspects of the invention as detailed in the appended claims.

Figure 1 is a flowchart of an image recognition and tracking method according to an exemplary embodiment. In an exemplary embodiment, the image recognition and tracking method, as shown in Figure 1, may include the following steps.

In step 110, the recognition result of the marker pattern in the image captured by the smart terminal is obtained.

The smart terminal is used to perform the image recognition and tracking process of the present invention; the smart terminal first obtains the image it has captured. It should be noted that the image capture may be performed at the current moment or in advance: the recognition and tracking process of the present invention is performed on images that the smart terminal captures in real time or has captured beforehand.

Of course, it can be understood that the smart terminal that performs the image capture and the smart terminal that currently performs the recognition and tracking process on the captured image may be the same smart terminal or two different smart terminals. When they are different, the smart terminal performing the image capture transfers the captured image and the corresponding other data, such as the sensor data described later, to the smart terminal performing the recognition and tracking process.

In an exemplary embodiment, the image captured by the smart terminal is obtained through a shooting component configured on the smart terminal, for example, various built-in or external cameras. The captured image is thus an image of a real scene.

Specifically, as the smart terminal photographs the real environment, the captured image reflects the real scene in image form. It can be understood that a real scene is a scene in the real environment. That is, the captured image is obtained by the smart terminal carried by the user photographing the real environment; it should further be noted that the image may be captured at the current moment or obtained in advance, which is not limited here.

The marker pattern is a pre-specified pattern; it appears in the image because a marker laid out in the real environment was photographed.

The recognition result of the marker pattern in the image indicates whether the marker pattern exists in the image and, if so, its position in the image. Specifically, when the marker pattern exists in the image, the obtained recognition result indicates that the marker pattern has been recognized, and target tracking can then be performed on the image.

It should be noted that the recognition result of the marker pattern in the image may be obtained as the output of matching the image against a preset marker image, or as the output of recognizing the marker pattern in the image in some other way; this is not limited here, as long as the recognition result of the marker pattern in the image can be obtained from the output information.

In a specific implementation of an exemplary embodiment, the recognition result of the marker pattern in the image captured by the smart terminal is obtained either by matching the image against a preset marker pattern or through a user-specified trigger on the marker pattern in the captured image.

Configuring multiple ways of obtaining the recognition result of the marker pattern in the captured image provides the smart terminal's image recognition and tracking process with several options, so that it can be flexibly adapted to various scenarios and to the performance of the smart terminal, improving the reliability of image recognition and tracking.
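
As an illustration of the matching route described above, the sketch below locates a preset marker pattern in a grayscale image by exhaustive normalized cross-correlation. It is a hypothetical, brute-force example for clarity only, not the patent's actual recognition algorithm; the function name, threshold, and search strategy are all assumptions.

```python
import numpy as np

def match_marker(image: np.ndarray, marker: np.ndarray, threshold: float = 0.9):
    """Slide `marker` over `image` and return (found, (row, col)) for the
    best normalized-cross-correlation match. Purely illustrative."""
    mh, mw = marker.shape
    ih, iw = image.shape
    m = marker - marker.mean()
    m_norm = np.sqrt((m * m).sum())
    best_score, best_pos = -1.0, None
    for r in range(ih - mh + 1):
        for c in range(iw - mw + 1):
            window = image[r:r + mh, c:c + mw]
            wz = window - window.mean()
            denom = np.sqrt((wz * wz).sum()) * m_norm
            if denom == 0:
                continue  # flat window, correlation undefined
            score = (wz * m).sum() / denom
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score >= threshold, best_pos
```

The returned position corresponds to the marker pattern's location in the image, which is what the recognition result must indicate before target tracking is triggered.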

In step 130, the marker pattern in the image is located according to the recognition result of the marker pattern, and target tracking is performed on the located marker pattern to obtain translation information of the marker pattern in the image in space.

Target tracking of the image refers to performing a target tracking process on the image, the target being the marker pattern in the image. Target tracking of the image is therefore necessarily triggered by the recognition result of the marker pattern, which avoids invalid target tracking, saves computing resources, and improves processing efficiency.

The triggered target tracking of the image is a process of computing the pose of the marker pattern in the image. In the exemplary embodiment of the present invention, the marker pattern pose obtained by target tracking includes the translation information of the marker pattern in the image in space.

The space referred to corresponds to the space in which the real scene is located, that is, physical space; it is the three-dimensional space constructed by the smart terminal. The translation information of the marker pattern in the image is therefore expressed relative to three orientations in that space.

The three orientations in space are the directions pointed to by the three coordinate axes of the constructed spatial coordinate system. A spatial coordinate system is constructed relative to physical space, and a conversion exists between the physical coordinate system and the constructed spatial coordinate system; only in this way can the captured real environment be accurately mapped into the space and the translation information in the space be obtained accurately.

The spatial coordinate system is a three-dimensional coordinate system comprising mutually perpendicular x, y, and z coordinate axes. The translation information of the marker pattern in the image in space consists of the translation distances computed with respect to the x, y, and z coordinate axes.

Specifically, the translation information of the marker pattern in the image indicates the movement of the marker pattern in space; it includes the translation distances on the horizontal plane of the space and the vertical distance. The translation distances on the horizontal plane correspond respectively to the orientations of two coordinate axes in the space.

Target tracking of the image can quickly and stably predict the translation of the marker pattern along the three orientations in space, so target tracking in the image is still performed, but the rotation information of the marker pattern in the image is no longer computed; this greatly improves the time performance and speed of target tracking.

In a specific implementation of an exemplary embodiment, the algorithm used to implement target tracking in the image may be a single-target tracking algorithm, may use continuous matching when the matching speed is fast enough, may be implemented by means of deep learning, or may even be a multi-target tracking algorithm; these are not listed one by one here.
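
To illustrate how per-frame target tracking can recover only the in-image translation of the located marker, the sketch below searches a small neighborhood of the previous marker position in the next frame using a sum-of-absolute-differences score. This is a minimal hypothetical tracker, not any of the specific algorithms mentioned above; the patch size, search radius, and function name are assumptions.

```python
import numpy as np

def track_translation(prev_frame, next_frame, prev_pos, patch=8, radius=4):
    """Find where the `patch`-sized region at `prev_pos` in `prev_frame`
    moved to in `next_frame` by exhaustive sum-of-absolute-differences
    search within `radius` pixels. Returns the new (row, col)."""
    r0, c0 = prev_pos
    template = prev_frame[r0:r0 + patch, c0:c0 + patch]
    best_err, best_pos = np.inf, prev_pos
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r, c = r0 + dr, c0 + dc
            if r < 0 or c < 0 or r + patch > next_frame.shape[0] \
                    or c + patch > next_frame.shape[1]:
                continue  # candidate window falls outside the frame
            err = np.abs(next_frame[r:r + patch, c:c + patch] - template).sum()
            if err < best_err:
                best_err, best_pos = err, (r, c)
    return best_pos  # the shift from prev_pos is the in-image translation
```

The per-frame shift, combined with the camera model, would then be converted into the translation distances along the spatial coordinate axes; note that no rotation is estimated here, consistent with the text above.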

In step 150, the translation information and the rotation angle output by multi-sensor fusion in the smart terminal form the pose matrix of the marker pattern in the image.

The smart terminal refers to the terminal device that performs the image recognition and tracking process of the present invention; for example, it may be a portable mobile terminal such as a smartphone or tablet computer. Various sensors are installed in the smart terminal, so the sensor data output by multiple sensors in the smart terminal that performs the image capture can be fused to obtain a rotation angle reflecting the rotation of the smart terminal. Since the smart terminal both captures the image and outputs sensor data through its installed sensors, the rotation angle output by multi-sensor fusion can be used to describe the rotation in the pose corresponding to the marker pattern in the image.

Multi-sensor fusion in the smart terminal can compute the rotation angle quickly and accurately, and it cooperates with target tracking of the image, avoiding the problem that the rotation angle computed during target tracking is inaccurate or even impossible to compute; both efficiency and accuracy can thus be guaranteed.

After the translation information of the marker pattern in space is obtained through target tracking of the image, this translation information and the rotation angle output by multi-sensor fusion together form the pose matrix of the marker pattern in the image.

Forming the pose matrix of the marker pattern in the image from the translation information and the rotation angle means taking the movement of the marker pattern in space indicated by the translation information, and the rotation indicated by the rotation angle, as elements that together constitute the pose matrix of the marker pattern.

For example, the movement of the marker pattern in space indicated by the translation information can be expressed as distances obtained relative to the orientations of the three coordinate axes in the space; the rotation angle can likewise be expressed as per-axis rotation angles corresponding to the three coordinate axes. A marker pattern pose matrix with six degrees of freedom can thus be obtained.

At this point, the distances corresponding to the three coordinate axes in the space and the rotation angles corresponding to those three coordinate axes are respectively used as matrix elements to form the pose matrix of the marker pattern.
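
The construction above can be sketched as follows, assuming the three per-axis rotation angles are combined as Euler angles in z-y-x order (the patent does not specify a convention) and the three translation distances fill the last column of a 4x4 homogeneous matrix:

```python
import numpy as np

def pose_matrix(rx, ry, rz, tx, ty, tz):
    """Assemble a 4x4 homogeneous pose matrix from per-axis rotation
    angles (radians) and per-axis translation distances: six degrees of
    freedom in total. The z*y*x composition order is an assumption."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx   # rotation part: three per-axis angles
    T[:3, 3] = (tx, ty, tz)    # translation part: three distances
    return T
```

In this scheme the three angles would come from multi-sensor fusion and the three distances from target tracking, matching the division of labor described in the text.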

The pose matrix of the marker pattern in the image describes the pose change of the marker pattern relative to its initial pose. The pose matrix obtained through computation makes the subsequently implemented business scenario match the pose matrix, and therefore fit the image and the marker pattern in the image.

For example, in a subsequently implemented augmented reality business scenario, the virtual scene image is projected according to the computed pose matrix, so that the virtual scene image projected onto the image fits the pose matrix, guaranteeing the precision and adaptability of the subsequent business scenario.
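
As a hedged illustration of such a projection, the sketch below maps a 3D point of the virtual scene into the image through the 4x4 pose matrix and a pinhole camera model; the intrinsic parameters (fx, fy, cx, cy) are hypothetical values, not taken from the patent.

```python
import numpy as np

def project_point(pose: np.ndarray, point: np.ndarray,
                  fx=800.0, fy=800.0, cx=320.0, cy=240.0):
    """Project a 3D virtual-scene point into pixel coordinates using the
    marker's 4x4 pose matrix and assumed pinhole intrinsics."""
    p = pose @ np.append(point, 1.0)  # transform into camera coordinates
    u = fx * p[0] / p[2] + cx         # perspective division onto the image
    v = fy * p[1] / p[2] + cy
    return u, v
```

Rendering every vertex of the virtual content through this mapping is what keeps the overlay aligned with the marker pattern as its pose matrix changes.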

In the exemplary embodiments described above, under the combined action of target tracking of the image and multi-sensor fusion, image recognition and tracking on the smart terminal can complete its computation very quickly while maintaining good tracking accuracy. This guarantees both the real-time performance of the smart terminal, that is, the mobile terminal, and an unaffected tracking effect, making local real-time tracking of marker patterns on the mobile terminal a reality, with better stability.

In an exemplary embodiment, before step 350, the image recognition and tracking method may further include the following steps.

obtaining sensor data output by multiple sensors when the smart terminal captures the image;

executing a multi-sensor fusion algorithm on the sensor data to compute the rotation angle of the smart terminal in space, the rotation angle being output by multi-sensor fusion and used to form the pose matrix of the marker pattern in the image.

As mentioned above, multi-sensor fusion operates on the data output by multiple sensors in the smart terminal, that is, the sensor data. A multi-sensor fusion algorithm is executed on the sensor data to compute the rotation angles of the smart terminal corresponding to the three orientations in space; the computed rotation angles correspond to the rotation of the marker pattern in the image in space.

Further, Figure 2 is a flowchart describing the details of computing, from the sensor data, the rotation angle of the smart terminal in space, according to an exemplary embodiment. As shown in Figure 2, this step may include the following steps.

In step 210, the sensor data output by multiple sensors when the smart terminal captures the image is obtained.

To realize pose prediction of the marker pattern in the image under multi-sensor fusion, the capture of the image and the collection of sensor data in the smart terminal are performed simultaneously. This guarantees that the collected sensor data corresponds to the attitude of the smart terminal when it captured the image, so that the computation based on the sensor data is accurate relative to the image.

In addition, the data collection by the multiple sensors in the smart terminal may also be performed while the smart terminal holds its attitude after capturing the image, or may be performed before the smart terminal captures the image, with the attitude then held for the capture; this is not limited here.

The multiple sensors in the smart terminal used to output sensor data when capturing the real image refer to all the sensors in the smart terminal that can be used to compute the rotation angle of the smart terminal. In an exemplary embodiment, the number of sensors is three.

The sensor data relates to the multiple sensors associated with the rotation of the smart terminal. For example, the sensor data may be the data output by the angular velocity meter (gyroscope), accelerometer, and gravity sensor in the smart terminal.

When the smart terminal captures an image, the sensors collect data so as to output the sensor data associated with that image.

In step 230, the rotation angle of the smart terminal itself in space is computed from the sensor data.

The rotation angle in space is computed from the sensor data, yielding a rotation angle relative to each orientation. It should be noted that the orientations referred to are the directions of the coordinate axes of the three-dimensional coordinate system constructed in the space.

Specifically, Figure 3 is a flowchart describing the details of step 230 according to an exemplary embodiment. As shown in Figure 3, step 230 may include the following steps.

In step 231, the angular velocity in the sensor data is integrated to obtain rough rotation values of the smart terminal relative to each orientation in space.

The sensor data includes angular velocity, acceleration, and gravity direction information. The angular velocity is collected by the angular velocity meter in the smart terminal, the acceleration by the accelerometer, and the gravity direction information by the gravity sensor.

随着图像捕获的进行,也由智能终端中指定的多个传感器获得了传感器数据。提取传感器数据中的角速度,首先进行传感器数据中角速度的积分,获得智能终端分别相对于空间中每一方位的旋转粗略值。As the image capture proceeds, sensor data is also obtained by multiple sensors specified in the smart terminal. To extract the angular velocity in the sensor data, first integrate the angular velocity in the sensor data to obtain the rough rotation value of the smart terminal relative to each orientation in space.

It should be noted that the device error of the gyroscope keeps accumulating throughout the integration of the angular velocity; an accurate value therefore cannot be obtained this way, only a rough rotation value relative to each orientation.

For example, it can be understood that there are three coordinate-axis directions in space, namely the directions of the x, y and z axes of the coordinate system established in that space; these three directions are the orientations referred to. The angular velocity in the sensor data is integrated for each orientation, giving a rough rotation value per orientation.
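The per-orientation integration of step 231 can be sketched as follows (a minimal illustration assuming discrete samples at a fixed interval; the function and variable names are hypothetical):

```python
import numpy as np

def integrate_gyro(gyro_samples, dt):
    """Integrate angular-velocity samples (rad/s) about the x, y and z axes
    to obtain a rough rotation value for each orientation.

    gyro_samples: array of shape (N, 3), one reading per sampling step.
    dt: sampling interval in seconds.
    Returns the accumulated rotation (rad) about each axis. Device bias
    accumulates with every step, so the result is only a rough value.
    """
    return np.sum(np.asarray(gyro_samples), axis=0) * dt

# A constant 0.1 rad/s about the z axis, sampled 100 times at 10 ms:
samples = np.tile([0.0, 0.0, 0.1], (100, 1))
rough = integrate_gyro(samples, dt=0.01)  # 0.1 rad about z
```

Any constant gyroscope bias is integrated exactly like a true rotation, which is why the result is only rough and needs the auxiliary calculation of step 233.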

In step 233, an auxiliary rotation-angle calculation is performed on the rough rotation values using the acceleration and gravity-direction information in the sensor data, obtaining the rotation angle of the smart terminal relative to each orientation in space.

After the rough rotation values of the smart terminal relative to each orientation in space are obtained, they are refined using the acceleration and gravity-direction information in the sensor data, so that accurate rotation angles are finally obtained.

Both the acceleration and the gravity-direction information in the sensor data record the position and movement of the smart terminal. They are therefore used as auxiliary inputs to Kalman filtering, which reduces the error in the rough rotation values and yields rotation angles whose error is greatly reduced.

Specifically, after the rough rotation values of the smart terminal relative to each orientation in space have been computed, the acceleration and gravity-direction information in the sensor data are fed, together with the angular velocity, into a Kalman filter, which outputs the rotation angle of the smart terminal itself relative to each orientation in space.
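The drift-correction principle of steps 231 and 233 can be illustrated with a sketch. The patent specifies a Kalman filter; a full Kalman implementation is lengthy, so this sketch substitutes a simpler complementary filter that shows the same idea of correcting the accumulating gyroscope error with a gravity-based absolute estimate (all names and constants here are hypothetical):

```python
import math

def fuse_tilt(gyro_rate, accel_y, accel_z, prev_angle, dt, alpha=0.98):
    """Blend the drifting gyro integral with a gravity-derived tilt.

    gyro_rate: angular velocity (rad/s) about one axis.
    accel_y, accel_z: accelerometer components giving the gravity direction.
    alpha: weight of the gyro path; (1 - alpha) pulls toward gravity.
    """
    gyro_angle = prev_angle + gyro_rate * dt      # drifts with sensor bias
    accel_angle = math.atan2(accel_y, accel_z)    # absolute but noisy
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle

# Device held still and level: gravity along +z, zero true rotation,
# but the gyroscope reports a small constant bias of 0.01 rad/s.
angle = 0.0
for _ in range(1000):
    angle = fuse_tilt(0.01, 0.0, 9.81, angle, dt=0.01)
```

Pure integration of the biased rate over these 1000 steps would drift to 0.1 rad; the fused estimate settles far closer to the true zero angle, which is the effect the Kalman filter achieves with better noise modeling.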

FIG. 4 is a flowchart illustrating the details of step 130 according to the embodiment corresponding to FIG. 1. As shown in FIG. 4, step 130 may include the following steps.

In step 131, the marker pattern in the image is located under the indication, given by the recognition result, that the marker pattern has been recognized.

In step 133, target tracking is performed on the located marker pattern; the tracking yields the translation distance of the marker pattern in the image on the spatial horizontal plane and the scale of the marker pattern relative to the pre-stored marker image.

After the recognition result of the marker pattern in the image has been obtained in step 110, the indication that the marker pattern has been recognized triggers the target-tracking process for the image, and the marker pattern in the image is located.

During the target tracking of the image, the movement of the image on the spatial horizontal plane in the left-right and up-down directions can be obtained; this is the translation distance of the marker pattern on the spatial horizontal plane.

As stated above, the marker image is a pre-stored image that contains the marker pattern. The marker pattern in the image therefore stands in a certain proportional relationship to the marker image, i.e. corresponds to a scale, and this scale is obtained by running the target-tracking process.

In step 135, the vertical distance in space of the marker pattern in the image is calculated from the scale and the size of the marker image; the vertical distance and the translation distance together form the translation information.

The size of the pre-stored marker image is obtained and, combined with the scale, used to calculate the vertical distance of the marker pattern in space, i.e. its translation distance in the vertical direction. The vertical distance and the translation distance on the spatial horizontal plane together form the translation information.
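Step 135 can be sketched as follows, under the pinhole-camera assumption that apparent size is inversely proportional to depth. The reference distance at which the marker appears at its stored size is a hypothetical calibration constant not named in the patent:

```python
def translation_info(tx, ty, scale, ref_distance):
    """Assemble the translation information of step 135.

    tx, ty: horizontal-plane translation (pixels) from target tracking.
    scale: detected marker size divided by the stored marker-image size.
    ref_distance: assumed distance at which the marker appears at scale 1.0
    (a hypothetical calibration constant).
    Under a pinhole model, apparent size is inversely proportional to
    depth, so halving the scale doubles the vertical distance.
    """
    vertical = ref_distance / scale
    return (tx, ty, vertical)

# Marker appears at half its stored size, so it is twice as far away:
info = translation_info(tx=12.0, ty=-3.0, scale=0.5, ref_distance=1.0)
```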

Through the exemplary embodiments described above, a multi-sensor fusion algorithm is implemented in the smart terminal, quickly and accurately computing the smart terminal's rotation angles about the three orientations in space, which also give the pose of the captured image and of the marker pattern in the image.

In another exemplary embodiment, step 110 further includes: the smart terminal continuously captures images and obtains the recognition result of the marker pattern in the currently captured image.

Correspondingly, before step 130 the image recognition and tracking method further includes the following step.

Relative to the image in which the marker pattern was first recognized, perspective-transformation preprocessing is performed on the currently captured image to obtain a perspective-corrected image, which is used for target tracking of the currently captured image.

It should first be noted here that the recognition and tracking implemented for an image take place during the continuous image capture performed by the smart terminal. In other words, every image continuously captured by the smart terminal undergoes recognition and tracking.

For example, when a real scene is shot frame by frame, the obtained image is one frame of the shot; as shooting continues, the next frame also undergoes the recognition and tracking process.

As stated above, the translation information in space of the marker pattern in the image is obtained through the image's target-tracking process. To guarantee the accuracy of the translation information, and further to simplify the target-tracking process and improve processing speed and efficiency, the image is optimized before the target-tracking process is run on it, i.e. perspective-transformation preprocessing is performed.

The target-tracking process is then run on the perspective-corrected image obtained by this preprocessing. This makes the image used for target tracking consistent in spatial angular pose with the image in which the marker pattern was first recognized, so that the tracking no longer needs to consider the rotation angle and can obtain the translation information directly.

It should be added that, in this image recognition and tracking method, the obtained pose matrix is relative to the initial pose, i.e. relative to the pose corresponding to the image in which the marker pattern was first recognized.

Further, FIG. 5 is a flowchart illustrating, according to an exemplary embodiment, the details of the step of performing perspective-transformation preprocessing on the currently captured image, relative to the image in which the marker pattern was first recognized, to obtain the perspective-corrected image used for target tracking of the currently captured image.

As shown in FIG. 5, this step may include the following steps.

In step 301, the rotation angle corresponding to the marker pattern in the image in which the marker pattern was first recognized is obtained and taken as the initial rotation angle.

In step 303, from the rotation angle output by the multi-sensor fusion in the smart terminal and the initial rotation angle, the included angle between the currently captured image and the image in which the marker pattern was first recognized is computed.

For an image, once the multi-sensor fusion in the smart terminal has been completed for it, its rotation angle is obtained. The initial rotation angle corresponding to the marker pattern in the image in which the marker pattern was first recognized is fetched, and the included angle is obtained by computing the difference between the image's rotation angle and the initial rotation angle.

In step 305, a perspective transformation of the currently captured image is performed through the included angle to obtain the perspective-corrected image.

Performing the perspective transformation on the captured image essentially corrects it, eliminating the rotation and distortion errors between it and the image in which the marker pattern was first recognized, which facilitates the subsequent target tracking.
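A minimal sketch of the perspective transformation of step 305: for a pure camera rotation, the correction is the homography H = K R K^-1 built from the included angle, assuming the camera intrinsic matrix K is known from calibration (the names and the sample intrinsics are illustrative):

```python
import numpy as np

def rotation_homography(angles, K):
    """Homography that removes a pure camera rotation, i.e. the included
    angle between the current frame and the first frame in which the
    marker was recognized: H = K @ R @ inv(K), applied to pixel coords.

    angles: (rx, ry, rz) included angle in radians about the three axes.
    K: 3x3 camera intrinsic matrix (assumed known from calibration).
    """
    rx, ry, rz = angles
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(rx), -np.sin(rx)],
                   [0, np.sin(rx), np.cos(rx)]])
    Ry = np.array([[np.cos(ry), 0, np.sin(ry)],
                   [0, 1, 0],
                   [-np.sin(ry), 0, np.cos(ry)]])
    Rz = np.array([[np.cos(rz), -np.sin(rz), 0],
                   [np.sin(rz), np.cos(rz), 0],
                   [0, 0, 1]])
    return K @ (Rz @ Ry @ Rx) @ np.linalg.inv(K)

def warp_point(H, u, v):
    """Apply the homography to one pixel coordinate."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
H = rotation_homography((0.0, 0.0, 0.0), K)  # zero angle: identity warp
```

In practice the whole image would be warped with this H (e.g. an image-warping routine), so that the warped frame shares the angular pose of the first frame and tracking only has to recover translation.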

Once the perspective-corrected image is obtained, it, rather than the original image, is used to run the image's target-tracking process.

To obtain the pose matrix of the marker pattern in the image, the computed included angle, which likewise corresponds to the three orientations in the space, can be combined with the translation information to form the pose matrix. A camera pose matrix with six degrees of freedom can thus be obtained quickly.
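Forming the six-degree-of-freedom pose matrix from the three per-orientation angles and the three translation components can be sketched as follows (a homogeneous-coordinate convention is assumed; the patent does not fix one, and the names are illustrative):

```python
import numpy as np

def axis_rotations(rx, ry, rz):
    """Compose per-axis rotations into a single 3x3 rotation matrix."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def pose_matrix(angles, translation):
    """4x4 homogeneous pose with six degrees of freedom: three rotation
    angles (from the included angle) plus three translation components."""
    T = np.eye(4)
    T[:3, :3] = axis_rotations(*angles)
    T[:3, 3] = translation
    return T

P = pose_matrix((0.0, 0.0, np.pi / 2), (1.0, 2.0, 3.0))
```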

In an exemplary embodiment, step 110 in the embodiment shown in FIG. 1 may include:

matching the image captured by the smart terminal against the pre-stored marker image, identifying whether a marker pattern exists in the image, and obtaining the recognition result of the marker pattern in the image.

Before the target tracking of an image, the image is matched against the preset marker image. If the image matches the marker image, a marker pattern exists in the image; the marker pattern in the image is recognized, and the corresponding recognition result is obtained.

If the image does not match the marker image, no marker pattern exists in the image, so no subsequent target tracking is performed and the next image is awaited.
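The match test against the pre-stored marker image can be sketched with a minimal stand-in. A production system would typically use feature matching; this normalized-correlation check merely illustrates the recognized/not-recognized decision (the threshold value is an assumption):

```python
import numpy as np

def marker_recognised(image_patch, marker_image, threshold=0.8):
    """Decide whether a candidate patch matches the stored marker image.

    Normalized cross-correlation between the zero-mean patch and marker:
    above the threshold the marker pattern is considered recognized;
    otherwise tracking is skipped and the next image is awaited.
    """
    a = image_patch - image_patch.mean()
    b = marker_image - marker_image.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return False  # a flat patch carries no pattern to match
    score = float(np.dot(a.ravel(), b.ravel()) / denom)
    return score >= threshold

marker = np.random.default_rng(0).random((32, 32))
noisy = marker + 0.01 * np.random.default_rng(1).random((32, 32))
```

`marker_recognised(noisy, marker)` accepts a lightly corrupted view of the marker, while an unrelated random patch falls far below the threshold and is rejected.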

FIG. 6 is a flowchart illustrating the details of step 110 according to the embodiment corresponding to FIG. 1. As shown in FIG. 6, step 110 may include the following steps.

In step 111, a user instruction specifying the marker pattern in the captured image is received.

Whether a marker pattern exists in the image can be determined through interaction with the user. Specifically, the captured image is displayed; the user can then view it and confirm whether a marker pattern is present. If so, the user triggers a specifying operation on the marker pattern in the image, and correspondingly the smart terminal responds to this operation by generating a user instruction that specifies the marker pattern in the image.

For example, the user operation that triggers the specifying of the marker pattern may be drawing a selection box around the marker pattern in the displayed image; other operations are of course possible and are not limited here.

In step 113, the recognition result of the marker pattern in the image is obtained according to the user instruction.

Through the exemplary embodiments described above, the marker pattern in the image is identified simply and quickly through user interaction, further improving the accuracy and efficiency of the image recognition and tracking.

In an exemplary embodiment, after step 150 the image recognition and tracking method further includes:

projecting a preset virtual-scene image into the image according to the pose matrix of the marker pattern in the image.

After the pose matrix of the marker pattern in the image is obtained, the virtual-scene image can be projected into the image on that basis, realizing various business scenarios; an augmented-reality system for individuals, or an augmented-reality assisted office system for enterprises, can then be built on the smart terminal.

For example, to implement an enterprise augmented-reality assisted office system, a marker pattern corresponding to a desk in the real environment, or to a specific location in an office, can be recognized and tracked, and business scenarios such as mail, meeting notifications and video conferencing can be produced according to the pose matrix of the marker pattern. This can also be applied in remote meetings, to create the real sense that the meeting partners are right beside you.

Through the exemplary embodiments described above, fast marker recognition and tracking can be realized on the mobile side, i.e. in the smart terminal, making local real-time marker-pattern recognition and tracking a reality, with better stability and higher tracking accuracy than traditional feature-point tracking.

Taking the implementation of an augmented-reality system in a smart terminal as an example, the exemplary embodiments described above are now explained in the context of a product.

FIG. 7 is a framework diagram of the augmented-reality system in a smart terminal according to an exemplary embodiment. The smart terminal continuously shoots the real environment, continuously capturing images and thus forming a video image.

In other words, each continuously captured image is one frame of the video image given an augmented-reality display in the smart terminal.

As shown in FIG. 7, after a frame of image is captured, it is matched against the marker image, as shown in step 410 of the system framework. If it matches, step 430 is executed to perform image tracking, finally obtaining the pose matrix of this frame, so that the augmented-reality system in the smart terminal can project the virtual-scene image for this frame and then perform the augmented-reality display.

When step 410 is executed but the marker image is not matched, the next frame is awaited; if the image-tracking step executes successfully, the next frame is likewise awaited.
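The per-frame control flow of FIG. 7 (match first, track only on a match, then wait for the next frame either way) can be sketched as follows; the `match` and `track` callbacks are hypothetical placeholders for steps 410 and 430:

```python
def process_frames(frames, match, track):
    """Per-frame control flow of the FIG. 7 framework.

    match: callback for step 410; returns True if the marker image matches.
    track: callback for step 430; returns the pose for a matched frame.
    Frames without a match are skipped; either way the loop moves on to
    the next frame.
    """
    poses = []
    for frame in frames:
        if not match(frame):        # step 410: no marker pattern found
            continue                # wait for the next frame
        poses.append(track(frame))  # step 430: tracking yields a pose
    return poses

# Frames 1 and 3 contain the marker; tracking here just echoes the frame id.
result = process_frames([1, 2, 3], match=lambda f: f != 2, track=lambda f: f)
```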

By analogy, recognition and tracking are continuously performed for the captured images.

FIG. 8 is an implementation framework diagram of the image-tracking step in the embodiment corresponding to FIG. 7. For each frame of the video image, the corresponding perspective-corrected image is obtained with the cooperation of the multiple sensors, as shown in step 510 of the implementation framework.

Single-target tracking is then performed with the perspective-corrected image as input, obtaining the pose matrix of the perspective transformation, as in step 530. Since the pose matrix of the perspective transformation is obtained with the rotation eliminated, it describes only the translation information of the marker pattern in the image and does not include the rotation angle.

At this point, with the cooperation of the multiple sensors, the rotation angle obtained through multi-sensor fusion, together with this pose matrix of the perspective transformation, forms the pose matrix in space of the marker pattern in the image. This provides the fastest and most stable implementation of the augmented-reality system in the smart terminal; performance is further accelerated, and the trade-off between time performance and recognition quality is no longer necessary.

Through the implementation described above, the multiple sensors in the smart terminal are fully fused, and both the stability and the accuracy of the tracking are kept at a very high level.

The following are device embodiments of the present invention, which can be used to execute the above image recognition and tracking method embodiments of the present invention. For details not disclosed in the device embodiments, please refer to the image recognition and tracking method embodiments of the present invention.

FIG. 9 is a block diagram of an image recognition and tracking device according to an exemplary embodiment. As shown in FIG. 9, the device may include, but is not limited to: a recognition-result obtaining module 710, a target-tracking module 730 and a pose obtaining module 750.

The recognition-result obtaining module 710 is used to obtain the recognition result of the marker pattern in the image captured by the smart terminal.

The target-tracking module 730 is used to locate the marker pattern in the image according to the recognition result of the marker pattern, and to perform target tracking on the located marker pattern to obtain the translation information in space of the marker pattern in the image.

The pose obtaining module 750 is used to form the pose matrix of the marker pattern in the image from the translation information and the rotation angle output by the multi-sensor fusion in the smart terminal.

FIG. 10 is a block diagram describing the details of the target-tracking module according to the embodiment corresponding to FIG. 9. As shown in FIG. 10, the target-tracking module 730 may include, but is not limited to: a marker-locating unit 731, a tracking-execution unit 733 and a translation-information forming unit 735.

The marker-locating unit 731 is used to locate the marker pattern in the image under the indication, given by the recognition result, that the marker pattern has been recognized.

The tracking-execution unit 733 is used to perform target tracking on the located marker pattern, obtaining from the tracking the translation distance of the marker pattern in the image on the spatial horizontal plane and the scale of the marker pattern relative to the pre-stored marker image.

The translation-information forming unit 735 is used to calculate the vertical distance in space of the marker pattern in the image from the scale and the size of the marker image; the vertical distance and the translation distance form the translation information.

FIG. 11 is a block diagram of an image recognition and tracking device according to another exemplary embodiment. The device further includes, but is not limited to: a data obtaining module 810 and a multi-sensor fusion module 830.

The data obtaining module 810 is used to obtain the sensor data output by the multiple sensors when the smart terminal captures an image.

The multi-sensor fusion module 830 is used to execute the multi-sensor fusion algorithm on the sensor data to compute the rotation angle of the smart terminal in space; the rotation angle is output by the multi-sensor fusion and is used to form the pose matrix of the marker pattern in the image.

In another exemplary embodiment, the recognition-result obtaining module 710 is further used, while the smart terminal continuously captures images, to obtain the recognition result of the marker pattern in the currently captured image.

The image recognition and tracking device further includes a perspective-transformation module, used to perform, relative to the image in which the marker pattern was first recognized, perspective-transformation preprocessing on the currently captured image to obtain the perspective-corrected image used for target tracking of the currently captured image.

Further, FIG. 12 is a block diagram describing the details of the perspective-transformation module according to an exemplary embodiment. As shown in FIG. 12, the perspective-transformation module 910 may include, but is not limited to: an initial-rotation obtaining unit 911, a rotation-transformation unit 913 and an image perspective-transformation unit 915.

The initial-rotation obtaining unit 911 is used to obtain the rotation angle corresponding to the marker pattern in the image in which the marker pattern was first recognized, taking it as the initial rotation angle.

The rotation-transformation unit 913 is used to compute, from the rotation angle output by the multi-sensor fusion in the smart terminal and the initial rotation angle, the included angle between the currently captured image and the image in which the marker pattern was first recognized.

The image perspective-transformation unit 915 is used to perform the perspective transformation of the currently captured image through the included angle to obtain the perspective-corrected image.

In an exemplary embodiment, the recognition-result obtaining module 710 shown in FIG. 9 is further used to match the captured image against the marker image, identify whether a marker pattern exists in the image, and obtain the recognition result of the marker pattern in the image.

In another exemplary embodiment, the image recognition and tracking device further includes, but is not limited to, a projection module, used to project the preset virtual-scene image into the image according to the pose matrix of the marker pattern in the image.

FIG. 13 is a block diagram of a device according to an exemplary embodiment. For example, the device 900 may be the smart terminal 110 in the implementation environment shown in FIG. 1; the smart terminal 110 may be a terminal device such as a smartphone or a tablet computer.

Referring to FIG. 13, the device 900 may include one or more of the following components: a processing component 902, a memory 904, a power-supply component 906, a multimedia component 908, an audio component 910, a sensor component 914 and a communication component 916.

The processing component 902 generally controls the overall operation of the device 900, such as the operations associated with display, phone calls, data communication, camera operation and recording. The processing component 902 may include one or more processors 918 to execute instructions so as to complete all or part of the steps of the methods described below. In addition, the processing component 902 may include one or more modules that facilitate the interaction between the processing component 902 and the other components; for example, it may include a multimedia module to facilitate the interaction between the multimedia component 908 and the processing component 902.

The memory 904 is configured to store various types of data to support operation on the device 900; examples of such data include instructions for any application or method operating on the device 900. The memory 904 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disc. The memory 904 also stores one or more modules, configured to be executed by the one or more processors 918 to complete all or part of the steps of any of the methods shown in FIG. 3, FIG. 4, FIG. 5 and FIG. 6.

The power-supply component 906 provides power for the various components of the device 900. It may include a power-management system, one or more power supplies, and other components associated with generating, managing and distributing power for the device 900.

The multimedia component 908 includes a screen that provides an output interface between the device 900 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a touch panel. If the screen includes a touch panel, it may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with it. The screen may also include an Organic Light-Emitting Display (OLED).

The audio component 910 is configured to output and/or input audio signals. For example, it includes a microphone (MIC), configured to receive external audio signals when the device 900 is in an operating mode such as a call mode, a recording mode or a voice-recognition mode. The received audio signals may be further stored in the memory 904 or sent via the communication component 916. In some embodiments, the audio component 910 also includes a speaker for outputting audio signals.

The sensor component 914 includes one or more sensors that provide status assessments of various aspects for the device 900. For example, it may detect the open/closed state of the device 900 and the relative positioning of components; it may also detect a change in position of the device 900 or of one of its components, and a change in the temperature of the device 900. In some embodiments, the sensor component 914 may also include a magnetic sensor, a pressure sensor or a temperature sensor.

The communication component 916 is configured to facilitate wired or wireless communication between the device 900 and other devices. The device 900 can access a wireless network based on a communication standard, such as WiFi (Wireless Fidelity). In one exemplary embodiment, the communication component 916 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 916 also includes a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth technology, and other technologies.

In an exemplary embodiment, the device 900 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors, digital signal processing devices, programmable logic devices, field-programmable gate arrays, controllers, microcontrollers, microprocessors, or other electronic components, for executing the following method.

Optionally, the present invention also provides a smart terminal, which can be used in the implementation environment shown in Figure 1 to perform all or part of the steps of the image recognition and tracking method shown in any of Figures 1 to 6. The smart terminal includes:

a processor;

a memory for storing instructions executable by the processor;

wherein the processor is configured to:

obtain a recognition result of the marker pattern in an image captured by the smart terminal;

locate the marker pattern in the image according to the recognition result of the marker pattern, and perform target tracking on the located marker pattern to obtain translation information of the marker pattern in space;

form a pose matrix of the marker pattern in the image from the translation information and the rotation angle output by multi-sensor fusion in the smart terminal.
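The pose-matrix step above can be illustrated with a short sketch (a minimal illustration only, not the patented implementation; the ZYX axis convention and the function name are assumptions): the three fused rotation angles become a 3×3 rotation block, and the translation information becomes the last column of a 4×4 homogeneous matrix.

```python
import numpy as np

def pose_matrix(translation, angles_deg):
    """Build a 4x4 homogeneous pose matrix from a translation vector
    (tx, ty, tz) and rotation angles (roll, pitch, yaw) about the
    three spatial coordinate axes."""
    rx, ry, rz = np.radians(angles_deg)
    # Elementary rotations about the x, y and z axes
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(rx), -np.sin(rx)],
                   [0, np.sin(rx),  np.cos(rx)]])
    Ry = np.array([[ np.cos(ry), 0, np.sin(ry)],
                   [0, 1, 0],
                   [-np.sin(ry), 0, np.cos(ry)]])
    Rz = np.array([[np.cos(rz), -np.sin(rz), 0],
                   [np.sin(rz),  np.cos(rz), 0],
                   [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx   # combined rotation block
    T[:3, 3] = translation     # translation column
    return T
```

Applying the resulting matrix to a homogeneous point rotates it by the fused angles and then shifts it by the tracked translation, which is what projecting a preset image onto the marker requires.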

The specific manner in which the processor of the device in this embodiment performs these operations has been described in detail in the embodiments of the image recognition and tracking method for the smart terminal, and will not be elaborated here.

In an exemplary embodiment, a storage medium is also provided. The storage medium is a computer-readable storage medium, for example a transitory or non-transitory computer-readable storage medium including instructions, such as the memory 904 including instructions, which can be executed by the processor 918 of the device 900 to perform the above method.

It should be understood that the present invention is not limited to the precise construction described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.
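The multi-sensor fusion described in the embodiments — integrating the gyroscope's angular velocity for a coarse rotation value and correcting it with the accelerometer's gravity direction — is commonly realized as a complementary filter. Below is a minimal single-axis sketch; the blend factor, axis convention, and function name are assumptions and not taken from the patent:

```python
import math

def fuse_pitch(prev_angle, gyro_rate, accel, dt, alpha=0.98):
    """Single-axis complementary filter: integrate the gyroscope rate
    for a coarse (drifting) angle, then pull it toward the tilt angle
    implied by the accelerometer's gravity direction.

    prev_angle: previous fused pitch (rad)
    gyro_rate:  pitch angular velocity (rad/s)
    accel:      (ax, ay, az) accelerometer reading (m/s^2)
    """
    ax, ay, az = accel
    # Coarse rotation value from integrating the angular velocity
    gyro_angle = prev_angle + gyro_rate * dt
    # Long-term reference: pitch implied by the gravity vector
    accel_angle = math.atan2(-ax, math.hypot(ay, az))
    # Blend: trust the gyro short-term, the accelerometer long-term
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle
```

Run per sensor sample: the gyro term tracks fast rotation without lag, while the small accelerometer weight removes the drift that pure integration would accumulate.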

Claims (6)

1. An image recognition and tracking method, characterized in that the method comprises:

obtaining a recognition result of a marker pattern in an image captured by a smart terminal, the recognition result indicating whether the marker pattern exists in the image and the position of the marker pattern in the image;

locating the marker pattern in the image through the marker pattern indicated as recognized by the recognition result;

obtaining the rotation angle corresponding to the marker pattern in the image in which the marker pattern was first recognized, and taking that rotation angle as the initial rotation angle;

obtaining sensor data output by a plurality of sensors when the smart terminal captured the image;

integrating the angular velocity in the sensor data to obtain coarse rotation values of the smart terminal relative to the orientations in space, and performing an auxiliary rotation-angle calculation on the coarse rotation values according to the acceleration and gravity-direction information in the sensor data to obtain the rotation angles of the smart terminal relative to the respective orientations in space, the rotation angles being output by multi-sensor fusion and used to form the pose matrix of the marker pattern in the image;

calculating, according to the rotation angle output by multi-sensor fusion in the smart terminal and the initial rotation angle, the included angle between the currently captured image and the image in which the marker pattern was first recognized;

performing a perspective transformation of the currently captured image through the included angle to obtain a perspective image, the perspective image being used for target tracking of the currently captured image;

performing target tracking of the located marker pattern through the obtained perspective image, the target tracking making the perspective image used for target tracking consistent with the spatial angular pose of the image in which the marker pattern was first recognized, so as to directly obtain the translation distance in the spatial horizontal plane and the scaling of the marker pattern relative to a pre-stored marker image;

calculating the vertical distance in space of the marker pattern in the image according to the scaling and the size of the marker image, the vertical distance and the translation distance forming translation information, the translation information indicating the translation distance and the vertical distance of the marker pattern in space;

taking the distances obtained from the translation information for the marker pattern relative to the three coordinate-axis orientations in space, and the coordinate-axis rotation angles of the rotation angle output by multi-sensor fusion in the smart terminal relative to the three coordinate axes in space, respectively as elements of a matrix to constitute the pose matrix of the marker pattern in the image, the rotation angle output by multi-sensor fusion being used to describe the occurrence of rotation in the pose corresponding to the marker pattern in the image.

2. The method according to claim 1, characterized in that after the translation information and the rotation angle output by multi-sensor fusion in the smart terminal form the pose matrix of the marker pattern in the image, the method further comprises:

projecting a preset image into the image according to the pose matrix of the marker pattern in the image.

3. An image recognition and tracking device, characterized in that the device comprises:

a recognition-result obtaining module, configured to obtain a recognition result of a marker pattern in an image captured by a smart terminal;

a target tracking module, configured to locate the marker pattern in the image according to the recognition result of the marker pattern, and to perform target tracking on the located marker pattern to obtain translation information of the marker pattern in space;

the target tracking module comprising a marker positioning unit, a tracking execution unit and a translation-information forming unit;

the marker positioning unit being configured to locate the marker pattern in the image through the marker pattern indicated as recognized by the recognition result;

the device further comprising a perspective transformation module, configured to perform, relative to the image in which the marker pattern was first recognized, perspective-transformation preprocessing of the currently captured image to obtain a perspective image, the perspective image being used for target tracking of the currently captured image;

the perspective transformation module comprising:

an initial-rotation obtaining unit, configured to obtain the rotation angle corresponding to the marker pattern in the image in which the marker pattern was first recognized, and to take that rotation angle as the initial rotation angle;

a data obtaining module, configured to obtain sensor data output by a plurality of sensors when the smart terminal captured the image;

a multi-sensor fusion module, configured to integrate the angular velocity in the sensor data to obtain coarse rotation values of the smart terminal relative to the orientations in space, and to perform an auxiliary rotation-angle calculation on the coarse rotation values according to the acceleration and gravity-direction information in the sensor data to obtain the rotation angles of the smart terminal relative to the respective orientations in space, the rotation angles being output by multi-sensor fusion and used to form the pose matrix of the marker pattern in the image;

a rotation transformation unit, configured to calculate, according to the rotation angle output by multi-sensor fusion in the smart terminal and the initial rotation angle, the included angle between the currently captured image and the image in which the marker pattern was first recognized;

an image perspective transformation unit, configured to perform a perspective transformation of the currently captured image through the included angle to obtain the perspective image;

the tracking execution unit being configured to perform target tracking of the located marker pattern through the obtained perspective image, the target tracking making the perspective image used for target tracking consistent with the spatial angular pose of the image in which the marker pattern was first recognized, so as to directly obtain the translation distance in the spatial horizontal plane and the scaling of the marker pattern relative to a pre-stored marker image;

the translation-information forming unit being configured to calculate the vertical distance in space of the marker pattern in the image according to the scaling and the size of the marker image, the vertical distance and the translation distance forming the translation information;

a pose obtaining module, configured to take the distances obtained from the translation information for the marker pattern relative to the three coordinate-axis orientations in space, and the coordinate-axis rotation angles of the rotation angle output by multi-sensor fusion in the smart terminal relative to the three coordinate axes in space, respectively as elements of a matrix to constitute the pose matrix of the marker pattern in the image, the rotation angle output by multi-sensor fusion being used to describe the occurrence of rotation in the pose corresponding to the marker pattern in the image.

4. The device according to claim 3, characterized in that the device further comprises:

a projection module, configured to project a preset virtual-scene image into the image according to the pose matrix of the marker pattern in the image.

5. A smart terminal, characterized by comprising:

a processor; and

a memory on which computer-readable instructions are stored, the computer-readable instructions, when executed by the processor, implementing the image recognition and tracking method according to either of claims 1 and 2.

6. A computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the image recognition and tracking method according to either of claims 1 and 2.
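The scale-to-distance step in claim 1 (deriving the vertical distance from the marker's scaling relative to the pre-stored marker image) follows directly from the pinhole camera model. The sketch below is illustrative only; the focal length and sizes are assumed values, not taken from the patent:

```python
def marker_distance(focal_length_px, marker_width_m, marker_width_px):
    """Pinhole-model sketch: the apparent pixel width of the tracked
    marker shrinks in inverse proportion to its distance from the
    camera, so distance = focal_length * real_width / pixel_width."""
    return focal_length_px * marker_width_m / marker_width_px
```

Equivalently, once tracking reports the marker's scaling relative to the stored image, that scaling factor and the known physical marker size determine the distance along the camera axis.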
CN201710351693.2A 2017-05-18 2017-05-18 Image recognition and tracking method, device, smart terminal and readable storage medium Active CN107194968B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710351693.2A CN107194968B (en) 2017-05-18 2017-05-18 Image recognition and tracking method, device, smart terminal and readable storage medium
PCT/CN2018/087282 WO2018210305A1 (en) 2017-05-18 2018-05-17 Image identification and tracking method and device, intelligent terminal and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710351693.2A CN107194968B (en) 2017-05-18 2017-05-18 Image recognition and tracking method, device, smart terminal and readable storage medium

Publications (2)

Publication Number Publication Date
CN107194968A CN107194968A (en) 2017-09-22
CN107194968B true CN107194968B (en) 2024-01-16

Family

ID=59875263

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710351693.2A Active CN107194968B (en) 2017-05-18 2017-05-18 Image recognition and tracking method, device, smart terminal and readable storage medium

Country Status (2)

Country Link
CN (1) CN107194968B (en)
WO (1) WO2018210305A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107194968B (en) * 2017-05-18 2024-01-16 腾讯科技(上海)有限公司 Image recognition and tracking method, device, smart terminal and readable storage medium
WO2019154169A1 (en) * 2018-02-06 2019-08-15 广东虚拟现实科技有限公司 Method for tracking interactive apparatus, and storage medium and electronic device
CN110120060B (en) * 2018-02-06 2023-07-14 广东虚拟现实科技有限公司 Marker identification method, device and identification tracking system
CN108366238A (en) * 2018-02-08 2018-08-03 广州视源电子科技股份有限公司 Image processing method, system, readable storage medium and electronic device
CN108734736B (en) 2018-05-22 2021-10-26 腾讯科技(深圳)有限公司 Camera posture tracking method, device, equipment and storage medium
CN112214100A (en) * 2019-07-12 2021-01-12 广东虚拟现实科技有限公司 Marker, interaction device and identification tracking method
CN111476876B (en) * 2020-04-02 2024-01-16 北京七维视觉传媒科技有限公司 Three-dimensional image rendering method, device, equipment and readable storage medium
CN115018877B (en) * 2021-03-03 2024-09-17 腾讯科技(深圳)有限公司 Method, device, equipment and storage medium for displaying special effects in ground area
CN113781520B (en) * 2021-04-28 2025-07-01 山东致群信息技术股份有限公司 A local image recognition method and system based on AR smart glasses
CN118674170B (en) * 2024-07-25 2025-06-24 香港科技大学(广州) Management method and device of micro-nano processing platform, computer equipment and storage medium

Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06174487A (en) * 1992-12-10 1994-06-24 Haruo Nonin Attitude detecting device
CN1818545A (en) * 2006-03-02 2006-08-16 浣石 Small-displacement measuring system in long-distance plane
CN1864176A (en) * 2003-10-30 2006-11-15 日本电气株式会社 Estimation system, estimation method and estimation program for estimating state of object
JP2009236532A (en) * 2008-03-26 2009-10-15 Seiko Epson Corp Method for geolocation, program, and apparatus for geolocation
KR100962557B1 (en) * 2009-02-05 2010-06-11 한국과학기술원 Augmented reality implementation apparatus and method of the same
CN104243833A (en) * 2014-09-30 2014-12-24 安科智慧城市技术(中国)有限公司 Camera posture adjusting method and device
KR20150138500A (en) * 2014-05-29 2015-12-10 주식회사 고영테크놀러지 Optical tracking system and method of calculating orientation of marker part of optical tracking system
KR20150138501A (en) * 2014-05-29 2015-12-10 주식회사 고영테크놀러지 Optical tracking system and method of calculating orientation and location of marker part of optical tracking system
CN105222772A (en) * 2015-09-17 2016-01-06 泉州装备制造研究所 A kind of high-precision motion track detection system based on Multi-source Information Fusion
KR101584080B1 (en) * 2015-04-10 2016-01-11 (주)코어센스 Acceleration signal processing method of 3-dimention rotation motion sensor
CN105931275A (en) * 2016-05-23 2016-09-07 北京暴风魔镜科技有限公司 Monocular and IMU fused stable motion tracking method and device based on mobile terminal
CN105953796A (en) * 2016-05-23 2016-09-21 北京暴风魔镜科技有限公司 Stable motion tracking method and stable motion tracking device based on integration of simple camera and IMU (inertial measurement unit) of smart cellphone
CN106250839A (en) * 2016-07-27 2016-12-21 徐鹤菲 A kind of iris image perspective correction method, device and mobile terminal
CN106257911A (en) * 2016-05-20 2016-12-28 上海九鹰电子科技有限公司 Image stability method and device for video image
CN106292721A (en) * 2016-09-29 2017-01-04 腾讯科技(深圳)有限公司 A kind of aircraft that controls follows the tracks of the method for destination object, equipment and system
CN106441138A (en) * 2016-10-12 2017-02-22 中南大学 Deformation monitoring method based on vision measurement
WO2017050761A1 (en) * 2015-09-21 2017-03-30 Navigate Surgical Technologies, Inc. System and method for determining the three-dimensional location and orientation of identification markers
CN106569591A (en) * 2015-10-26 2017-04-19 苏州梦想人软件科技有限公司 Tracking method and system based on computer vision tracking and sensor tracking
KR20170053007A (en) * 2015-11-05 2017-05-15 삼성전자주식회사 Method and apparatus for estimating pose
CN106681510A (en) * 2016-12-30 2017-05-17 光速视觉(北京)科技有限公司 Posture identification device, virtual reality display device and virtual reality system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6978167B2 (en) * 2002-07-01 2005-12-20 Claron Technology Inc. Video pose tracking system and method
US8400398B2 (en) * 2009-08-27 2013-03-19 Schlumberger Technology Corporation Visualization controls
JP6565465B2 (en) * 2015-08-12 2019-08-28 セイコーエプソン株式会社 Image display device, computer program, and image display system
CN107194968B (en) * 2017-05-18 2024-01-16 腾讯科技(上海)有限公司 Image recognition and tracking method, device, smart terminal and readable storage medium


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Design and Simulation of an Attitude Solution Algorithm for a Quadrotor; Xu Yunchuan; Science & Technology Vision (No. 23); pp. 17-18 *
Ground Target Feature Recognition and UAV Pose Estimation; Zhang Liang et al.; Journal of National University of Defense Technology; 2015-02-28 (No. 01); pp. 162-167 *
Quadrotor Attitude Estimation Based on Quaternions and Kalman Filtering; Wang Honghao et al.; Microcomputer & Its Applications; Vol. 35 (No. 14); pp. 71-76 *
Zhu Fangwen, Li Deqiang, Yuan Zhengpeng, Wu Jiaqi, Cheng Xia. An AR Tracking System Based on Coplanar Marker Points. Journal of Shanghai University (Natural Science Edition). 2004, Vol. 10 (No. 05), pp. 458-462. *

Also Published As

Publication number Publication date
CN107194968A (en) 2017-09-22
WO2018210305A1 (en) 2018-11-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant