
CN113280817B - Visual navigation based on landmarks - Google Patents

Visual navigation based on landmarks

Info

Publication number
CN113280817B
Authority
CN
China
Prior art keywords: landmark, information, degree, freedom, image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010652637.4A
Other languages
Chinese (zh)
Other versions
CN113280817A (en)
Inventor
诸小熊
李军舰
姚迪狄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010652637.4A
Publication of CN113280817A
Application granted
Publication of CN113280817B
Legal status: Active

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20: Instruments for performing navigational calculations

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a landmark-based visual navigation method comprising the following steps: determining a landmark in the visual scene; acquiring multi-degree-of-freedom information of the agent relative to the landmark; acquiring multi-degree-of-freedom change information of the agent relative to the landmark; and navigating the movement of the agent according to the multi-degree-of-freedom change information. In this scheme, six-degree-of-freedom information of a landmark in the visual scene is constructed from camera image information and pose information from the agent's gyroscope. The six degrees of freedom comprise the landmark's three coordinates in the visual scene (horizontal, vertical, and depth) and three angles at that coordinate point (pitch, rotation, and yaw). From this six-degree-of-freedom information, high-frame-rate visual navigation with the landmark as the reference point can then be realized. The invention can be used to display virtual objects/characters in VR/AR, and also in scenarios such as autonomous driving and robot navigation; combined with a gyroscope, it enables highly real-time navigation of an agent on mobile devices of ordinary computing capability.

Description

Visual navigation based on landmarks
Technical Field
The invention relates to the technical field of map navigation, and in particular to a landmark-based visual navigation method and device.
Background
With the rapid development of computer vision, visual scene map construction and navigation based on computer vision are widely applied in VR/AR (virtual reality/augmented reality), automatic navigation, and other scenarios owing to their low cost and broad applicability.
The most commonly used visual map construction scheme is visual SLAM (Simultaneous Localization And Mapping), which builds map information through sensors, visual odometry, and the like, and uses it to determine the current position of the agent. This solution has several problems. First, the map construction flow of SLAM is complex: visual SLAM requires capturing scene information of the environment from multiple angles and then constructs the map through feature extraction, matching, and similar techniques. Second, the computational complexity is high and the navigation speed is low: because the map built by visual SLAM is relatively large and feature-rich, navigation against it is computationally heavy, and real-time navigation is difficult to achieve on ordinary computing devices, especially mobile devices.
Therefore, a visual navigation scheme is needed that reduces the complexity of map construction, improves navigation speed, and can run on ordinary computing devices.
Disclosure of Invention
An object of the present invention is to provide a landmark-based visual navigation method that enables immediate, simple visual landmark construction and highly real-time visual navigation.
To achieve the above object, an embodiment of the present invention provides a landmark-based visual navigation method, comprising:
determining a landmark in the visual scene;
acquiring multi-degree-of-freedom information of the agent relative to the landmark;
acquiring multi-degree-of-freedom change information of the agent relative to the landmark;
and navigating the movement of the agent according to the multi-degree-of-freedom change information.
Further, the multi-degree-of-freedom information includes coordinate information and angle information.
Further, the multi-degree-of-freedom information is six-degree-of-freedom information comprising an abscissa, an ordinate, and a depth coordinate of the agent relative to the landmark in the visual scene, and a pitch angle, a yaw angle, and a rotation angle of the agent in a spatial coordinate system; acquiring the multi-degree-of-freedom information of the agent relative to the landmark specifically comprises the following steps:
acquiring the visual scene captured by the agent's camera, analyzing it, and determining the abscissa, ordinate, and depth coordinate of the agent relative to the landmark;
and acquiring sensor data of the agent and determining the pitch angle, yaw angle, and rotation angle of the agent in the spatial coordinate system.
Further, determining the landmark in the visual scene is specifically: a region of the visual scene preselected by the user serves as the landmark.
Further, determining the landmark in the visual scene is specifically: a salient object in the visual scene is identified as the landmark using a subject-identification algorithm, or a specific region is detected as the landmark using a target-detection algorithm.
Further, the method further comprises: after the multi-degree-of-freedom information is acquired, initializing an image tracking algorithm with it, the image tracking algorithm being used to acquire the position and/or area of the landmark in the current visual scene.
Further, the method further comprises: judging whether the current landmark is lost and, if it is, stopping the motion navigation and starting the re-detection step.
Further, the re-detection step is specifically: detecting the landmark using the last frame before the loss as a template and, if the landmark is detected, re-acquiring the multi-degree-of-freedom information of the agent relative to the landmark.
Further, the center coordinates of the landmark's image region are taken as the landmark's abscissa and ordinate, from which the abscissa and ordinate of the agent relative to the landmark are obtained, and the distance of the agent's camera from the landmark is taken as the depth coordinate. The depth value is obtained as follows: the minimum circumscribed circle of the landmark's image region is acquired, and the product of the circumscribed circle's radius R and a prior coefficient k is taken as the landmark's depth coordinate, from which the depth coordinate of the agent relative to the landmark is obtained.
Further, the multi-degree-of-freedom change information of the agent relative to the landmark comprises: change information of the pitch angle, yaw angle, and rotation angle; the displacement of the agent in the landmark plane; and the depth displacement of the agent relative to the landmark. The displacement in the landmark plane is the variation between the landmark's coordinates in the current image frame and its initial coordinates.
Further, the depth displacement is determined from the minimum circumscribed circle radius of the current landmark image region and the minimum circumscribed circle radius of the landmark image region when the landmark was constructed.
The embodiment of the invention also provides a landmark-based visual navigation device, comprising:
a landmark determination module for determining the landmark in the visual scene;
a multi-degree-of-freedom information construction module for acquiring multi-degree-of-freedom information of the agent relative to the landmark;
a change information acquisition module for acquiring position change information of the agent relative to the landmark;
and a visual navigation module for navigating the movement of the agent according to the position change information.
Further, the multi-degree-of-freedom information includes coordinate information and angle information.
Further, the multi-degree-of-freedom information is six-degree-of-freedom information comprising an abscissa, an ordinate, and a depth coordinate of the agent relative to the landmark in the visual scene, and a pitch angle, a yaw angle, and a rotation angle of the agent in a spatial coordinate system; the multi-degree-of-freedom information construction module is specifically used for:
acquiring the visual scene captured by the agent's camera, analyzing it, and determining the abscissa, ordinate, and depth coordinate of the agent relative to the landmark;
and acquiring sensor data of the agent and determining the pitch angle, yaw angle, and rotation angle of the agent in the spatial coordinate system.
Further, the landmark determination module is specifically configured to take a region of the visual scene preselected by the user as the landmark.
Further, the landmark determination module is specifically configured to identify a salient object in the visual scene as the landmark using a subject-identification algorithm, or to detect a specific region as the landmark using a target-detection algorithm.
Further, the multi-degree-of-freedom information construction module is further configured to: after the multi-degree-of-freedom information is acquired, initialize an image tracking algorithm with it, the image tracking algorithm being used to acquire the position and/or area of the landmark in the current visual scene.
Further, the visual navigation module is further configured to judge whether the current landmark is lost and, if it is, to stop the motion navigation and start the re-detection module.
Further, the re-detection module is configured to detect the landmark using the last frame before the loss as a template and, if the landmark is detected, to re-acquire the multi-degree-of-freedom information of the agent relative to the landmark.
Further, the center coordinates of the landmark's image region are taken as the landmark's abscissa and ordinate, from which the abscissa and ordinate of the agent relative to the landmark are obtained, and the distance of the agent's camera from the landmark is taken as the depth coordinate. The depth value is obtained as follows: the minimum circumscribed circle of the landmark's image region is acquired, and the product of the circumscribed circle's radius R and a prior coefficient k is taken as the landmark's depth coordinate, from which the depth coordinate of the agent relative to the landmark is obtained.
Further, the multi-degree-of-freedom change information of the agent relative to the landmark comprises: change information of the pitch angle, yaw angle, and rotation angle; the displacement of the agent in the landmark plane; and the depth displacement of the agent relative to the landmark. The displacement in the landmark plane is the variation between the landmark's coordinates in the current image frame and its initial coordinates.
Further, the depth displacement is determined from the minimum circumscribed circle radius of the current landmark image region and the minimum circumscribed circle radius of the landmark image region when the landmark was constructed.
The embodiment of the invention also provides an image acquisition method, comprising the following steps:
determining an acquisition object in a visual scene, wherein the acquisition object is at least one salient object or specific region in the visual scene;
acquiring an image of the acquisition object;
acquiring multi-degree-of-freedom information of the agent relative to the acquisition object;
correlating the image of the acquisition object with the multi-degree-of-freedom information;
storing the image of the acquisition object and the associated multi-degree-of-freedom information.
Further, determining the acquisition object in the visual scene is specifically: identifying a salient object in the visual scene as the acquisition object using an image subject-identification algorithm, or detecting a specific region as the acquisition object using a target-detection algorithm.
Further, the multi-degree-of-freedom information includes coordinate information and angle information.
Further, the multi-degree-of-freedom information is six-degree-of-freedom information comprising an abscissa, an ordinate, and a depth coordinate of the agent relative to the acquisition object in the visual scene, and a pitch angle, a yaw angle, and a rotation angle of the agent in a spatial coordinate system; acquiring the multi-degree-of-freedom information of the agent relative to the acquisition object specifically comprises the following steps:
acquiring the visual scene captured by the agent's camera, analyzing it, and determining the abscissa, ordinate, and depth coordinate of the agent relative to the acquisition object;
and acquiring sensor data of the agent and determining the pitch angle, yaw angle, and rotation angle of the agent in the spatial coordinate system.
Further, the method further comprises:
acquiring environment attribute information at the time the agent captures the image of the acquisition object;
associating the image of the acquisition object with the environment attribute information;
and storing the associated environment attribute information.
Further, the method further comprises:
acquiring multi-degree-of-freedom information and/or environment attribute information of the current agent relative to a specified object;
acquiring an image of the specified object according to the multi-degree-of-freedom information and/or the environment attribute information;
and presenting the image of the specified object.
The embodiments of the present invention also provide a computer program product comprising computer program instructions for implementing the aforementioned landmark-based visual navigation method or the aforementioned image acquisition method when the instructions are executed by a processor.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when executed, implements the aforementioned landmark-based visual navigation method or the aforementioned image acquisition method.
The beneficial effects of the invention are as follows. The invention provides a landmark-based visual navigation method comprising: acquiring multi-degree-of-freedom information of the agent relative to the landmark; acquiring multi-degree-of-freedom change information of the agent relative to the landmark; and navigating the movement of the agent according to the multi-degree-of-freedom change information. In this scheme, six-degree-of-freedom information of a landmark in the visual scene is constructed from camera image information and pose information from the agent's gyroscope. The six degrees of freedom comprise the landmark's three coordinates in the visual scene (horizontal, vertical, and depth) and three angles at that coordinate point (pitch, rotation, and yaw). From this six-degree-of-freedom information, high-frame-rate visual navigation with the landmark as the reference point can then be realized. The invention can be used to display six-degree-of-freedom virtual objects/characters in VR/AR, and also in scenarios such as autonomous driving and robot navigation. Combined with gyroscope information, highly real-time navigation of an agent can be achieved on mobile devices of ordinary computing capability.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required by the embodiments are briefly described below. It is evident that the drawings described below show only some embodiments of the present invention, and that a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flow chart of a method according to a first embodiment of the invention.
Fig. 2 is a schematic diagram of landmark regions in a visual scene.
Fig. 3 is a block diagram of a device according to a second embodiment of the present invention.
Fig. 4 is a flow chart of a method according to a third embodiment of the invention.
Detailed Description
To enable those skilled in the art to understand the technical solutions of the present invention, they are described below fully and clearly with reference to the accompanying drawings. It is evident that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of the invention.
Because the map image information constructed by visual SLAM contains complex image features and the algorithms involved have high complexity, real-time navigation is difficult to achieve on mobile devices, especially mobile devices of ordinary computing power (such as mobile phones).
In this scheme, an image tracking algorithm is used so that only the landmark region needs to be tracked, and the displacement and attitude change of the agent relative to the landmark are obtained from the image coordinate system information and the gyroscope information. Because most existing image tracking algorithms are highly real-time, real-time image tracking can be achieved on mobile terminal devices. Therefore, combined with gyroscope information, highly real-time navigation of the agent can be achieved on mobile devices of ordinary computing capability.
The agent here mainly refers to a movable device equipped with a camera, a gyroscope, and a computing unit, such as a smartphone or a camera-equipped unmanned aerial vehicle.
Example 1
Referring to Fig. 1, an embodiment of the present invention provides a landmark-based visual navigation method comprising a landmark determination step, a multi-degree-of-freedom information construction step, a change information acquisition step, and a visual navigation step.
Landmark determination step: determine the landmark in the visual scene. A landmark in the present invention is a marker region in the visual scene used as a position and attitude reference during motion navigation; the region is preselected by the user, as shown in Fig. 2, where the user takes a wardrobe in the visual scene as the landmark. Landmarks may also be determined by intelligent algorithms, for example: the most salient object in the visual scene is identified as the landmark by a subject-identification algorithm, or, in a specific scenario, a specific region (e.g., a logo) is detected as the landmark by a target-detection algorithm; a sketch follows.
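As an illustrative sketch (not taken from the patent text), the user's preselection can be captured with OpenCV's built-in ROI selector; a subject-identification or target-detection model could be substituted at the same point:

```python
import cv2

def select_landmark(frame):
    """Return the landmark region (x, y, w, h) preselected by the user.

    Minimal sketch: a saliency or detection model could replace the
    manual selection for the algorithmic variants described above.
    """
    # selectROI opens a window and returns the user-drawn rectangle
    bbox = cv2.selectROI("select landmark", frame, showCrosshair=True)
    cv2.destroyWindow("select landmark")
    return bbox  # (x, y, w, h)
```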
Multi-degree-of-freedom information construction step: acquire the multi-degree-of-freedom information of the agent relative to the landmark. After this information is obtained, it is used to initialize an image tracking algorithm so as to achieve region tracking at the visual image level.
The multi-degree-of-freedom information here is six-degree-of-freedom information. The six degrees of freedom comprise the abscissa, ordinate, and depth coordinate of the agent relative to the landmark in the visual scene, and the pitch angle, yaw angle, and rotation angle of the agent in a spatial coordinate system. The visual scene is the scene in an image frame captured by the agent's camera. Attitude angle information such as the pitch, yaw, and rotation angles can be obtained from the gyroscope in the agent.
As shown in Fig. 2, the center coordinates (x, y) of the landmark's image region are taken as the landmark's abscissa and ordinate, from which the abscissa and ordinate of the agent relative to the landmark are obtained. The distance of the agent's camera from the landmark is taken as the depth coordinate, obtained as follows: acquire the minimum circumscribed circle of the landmark's image region and take the product of its radius R and a prior coefficient k as the landmark's depth coordinate, that is, d = R × k, where k is an empirical value set per application and scene. A sketch of this construction follows.
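A minimal sketch of the six-degree-of-freedom construction, assuming the landmark region is given as pixel points and the attitude angles are read from the agent's gyroscope (the value of the prior coefficient k below is a placeholder assumption):

```python
import cv2
import numpy as np

K_PRIOR = 0.05  # prior coefficient k; an empirical value set per application and scene

def landmark_six_dof(region_points, pitch, yaw, roll, k=K_PRIOR):
    """Build the six-degree-of-freedom record of the agent relative to a landmark.

    region_points: Nx2 pixel coordinates covering the landmark's image region.
    pitch/yaw/roll: attitude angles read from the agent's gyroscope.
    """
    pts = np.asarray(region_points, dtype=np.float32)
    # minimum circumscribed circle; its center approximates the region center (x, y)
    (cx, cy), radius = cv2.minEnclosingCircle(pts)
    depth = radius * k  # d = R * k, as in the description
    return {"x": cx, "y": cy, "d": depth,
            "pitch": pitch, "yaw": yaw, "roll": roll,
            "R": radius}  # radius kept for the depth-change computation below
```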
Change information acquisition step: acquire the multi-degree-of-freedom change information of the agent relative to the landmark. The multi-degree-of-freedom change information comprises: the three attitude-angle changes (delta_p, delta_r, delta_y), the displacement of the agent in the landmark plane (delta_x, delta_y), and the depth displacement delta_d of the agent relative to the landmark.
The three attitude-angle changes are the differences between the agent's current attitude information and its attitude information when the landmark was constructed. Taking the pitch angle as an example, let the current gyroscope pitch be P1 and the pitch at landmark construction be P0; the pitch change is then delta_P = P1 - P0. The other angle changes are obtained in the same way, giving (delta_p, delta_r, delta_y).
For the change in position, the position and region of the current landmark in the image are obtained through the image tracker. The displacement in the landmark plane is the difference between the landmark's coordinates in the current image frame and its initial coordinates at landmark construction. Taking the horizontal axis x as an example, let the landmark's horizontal coordinate in the current image region be x1 and its initial position be x0; then delta_x = x1 - x0. The displacement (delta_x, delta_y) in the image plane is obtained in the same way.
For the depth displacement, let the minimum circumscribed circle radius of the current landmark image region be R1 and that at landmark construction be R0; then delta_d = k × (R1 / R0). The sketch below gathers the three kinds of change information.
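Gathering the change information, a hedged sketch operating on records shaped like the output of landmark_six_dof above (note that the description reuses the name delta_y for both the yaw change and the vertical image displacement; the code disambiguates them):

```python
def six_dof_change(init, cur, k=0.05):
    """Multi-degree-of-freedom change of the agent relative to the landmark.

    init: six-DOF record at landmark construction; cur: record for the
    current frame. k is the same prior coefficient as K_PRIOR above.
    """
    return {
        "delta_p": cur["pitch"] - init["pitch"],  # delta_P = P1 - P0
        "delta_r": cur["roll"] - init["roll"],
        "delta_y": cur["yaw"] - init["yaw"],
        "delta_x": cur["x"] - init["x"],          # displacement in the landmark plane
        "delta_y_img": cur["y"] - init["y"],
        "delta_d": k * (cur["R"] / init["R"]),    # depth displacement, k * (R1 / R0)
    }
```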
Visual navigation step: navigate the movement of the agent according to the multi-degree-of-freedom change information.
Preferably, the visual navigation step further comprises: judging with the image tracking algorithm whether the current landmark is lost and, if it is, stopping the motion navigation and starting the re-detection step. Taking the KCF (Kernelized Correlation Filter) tracking algorithm as an example, the current tracking state can be determined from the filter response value of each frame.
Preferably, the re-detection step is specifically: detect the landmark using the image of the last frame before the loss as a template and, if the landmark is detected, re-acquire the six-degree-of-freedom information of the landmark. A sketch combining tracking, loss detection, and re-detection follows.
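A sketch of this tracking loop, assuming an OpenCV build that ships the KCF tracker (opencv-contrib); OpenCV exposes a per-frame success flag rather than the raw filter response, so the flag stands in for the response-value test here:

```python
import cv2

REDETECT_THRESHOLD = 0.7  # assumed template-matching score threshold

def track_with_redetection(video, init_bbox):
    """Yield the landmark bbox frame by frame; on loss, re-detect by template."""
    tracker = cv2.TrackerKCF_create()
    ok, frame = video.read()
    if not ok:
        return
    x, y, w, h = [int(v) for v in init_bbox]
    tracker.init(frame, (x, y, w, h))
    template = frame[y:y + h, x:x + w].copy()  # last known landmark appearance

    while True:
        ok, frame = video.read()
        if not ok:
            return
        found, bbox = tracker.update(frame)
        if found:
            x, y, w, h = [int(v) for v in bbox]
            template = frame[y:y + h, x:x + w].copy()
            yield bbox
        else:
            # landmark lost: motion navigation stops (no bbox is yielded);
            # re-detect using the last frame before the loss as the template
            res = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
            _, score, _, top_left = cv2.minMaxLoc(res)
            if score > REDETECT_THRESHOLD:
                bbox = (top_left[0], top_left[1], w, h)
                tracker = cv2.TrackerKCF_create()  # re-initialize on the new region
                tracker.init(frame, bbox)
```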
The image tracking algorithm can be any algorithm that tracks objects through images; it is not limited to KCF-style visual target tracking algorithms. Putting the steps of this embodiment together, an end-to-end sketch follows.
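The loop below is a hedged sketch under stated assumptions, not the patent's implementation: read_gyro and steer are assumed platform hooks, and the helper functions are the ones sketched above.

```python
import numpy as np

def navigate(video, read_gyro, steer, k=0.05):
    """Landmark-based visual navigation loop (sketch)."""
    ok, frame = video.read()
    if not ok:
        return
    bbox = select_landmark(frame)  # landmark determination step

    def corners(b):
        x, y, w, h = b
        return np.array([[x, y], [x + w, y], [x + w, y + h], [x, y + h]],
                        dtype=np.float32)

    init = None
    for bbox in track_with_redetection(video, bbox):
        # read_gyro() is assumed to return (pitch, yaw, roll) from the gyroscope
        rec = landmark_six_dof(corners(bbox), *read_gyro(), k=k)
        if init is None:
            init = rec  # six-DOF information at landmark construction
            continue
        steer(six_dof_change(init, rec, k=k))  # navigate by the change information
```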
Example two
Referring to Fig. 3, a second embodiment of the present invention provides a landmark-based visual navigation device 300 comprising a landmark determination module 301, a multi-degree-of-freedom information construction module 302, a change information acquisition module 303, and a visual navigation module 304.
The landmark determination module 301 is configured to determine the landmark in the visual scene. A landmark in the present invention is a marker region in the visual scene used as a position and attitude reference during motion navigation, being a region of the visual scene preselected by the user. Landmarks may also be determined by intelligent algorithms, for example: the most salient object in the visual scene is identified as the landmark by a subject-identification algorithm, or, in a specific scenario, a specific region (e.g., a logo) is detected as the landmark by a target-detection algorithm.
The multi-degree-of-freedom information construction module 302 is configured to acquire the multi-degree-of-freedom information of the agent relative to the landmark. After this information is obtained, it is used to initialize an image tracking algorithm so as to achieve region tracking at the visual image level. The multi-degree-of-freedom information is six-degree-of-freedom information comprising the abscissa, ordinate, and depth coordinate of the agent relative to the landmark in the visual scene, and the pitch, yaw, and rotation angles of the agent in a spatial coordinate system; the visual scene is the scene in an image frame captured by the agent's camera.
The change information acquisition module 303 acquires the multi-degree-of-freedom change information of the agent relative to the landmark.
The visual navigation module 304 navigates the movement of the agent according to the change information of the agent's six degrees of freedom relative to the landmark.
Preferably, the device further comprises a re-detection module 305. The visual navigation module 304 is further configured to judge through the image tracking algorithm whether the current landmark is lost and, if it is, to stop the motion navigation and start the re-detection module 305.
The re-detection module 305 is configured to detect the landmark using the image of the last frame before the loss as a template and, if the landmark is detected, to re-acquire the multi-degree-of-freedom information of the agent relative to the landmark.
Example III
Referring to Fig. 4, a third embodiment of the present invention provides an image acquisition method comprising:
S401: determine the acquisition object in the visual scene, the acquisition object being at least one salient object or specific region in the visual scene.
Besides specified objects, all objects in the visual scene may be collected with the present invention. The kinds of objects differ across scenes: a model-room (show home) scene contains objects such as furniture and decorations, while a museum scene contains exhibits.
The acquisition object is determined by intelligent algorithms, for example: a salient object in the visual scene is identified as the acquisition object by an image subject-identification algorithm, or, in a specific scenario, a specific region (e.g., a logo, a piece of furniture, a decoration) is detected as the acquisition object by a target-detection algorithm. As shown in Fig. 2, a wardrobe in the visual scene is taken as the acquisition object.
S402: acquire an image of the acquisition object. The collected images help the user browse the scene space, such as a home-decoration scene or a museum scene; besides browsing certain specific objects from multiple angles, the user can browse overall images of the visual scene and/or images of other objects.
S403: acquire the multi-degree-of-freedom information of the agent relative to the acquisition object. After this information is obtained, it is used to initialize an image tracking algorithm so as to achieve region tracking at the visual image level.
The multi-degree-of-freedom information here is six-degree-of-freedom information. The six degrees of freedom comprise the abscissa, ordinate, and depth coordinate of the agent relative to the acquisition object, and the pitch angle, yaw angle, and rotation angle of the agent in a spatial coordinate system. The visual scene is the scene in an image frame captured by the agent's camera. Attitude angle information such as the pitch, yaw, and rotation angles can be obtained from the gyroscope in the agent.
As shown in Fig. 2, the center coordinates (x, y) of the landmark's image region are taken as the landmark's abscissa and ordinate, from which the abscissa and ordinate of the agent relative to the acquisition object are obtained. The distance of the agent's camera from the acquisition object is taken as the depth coordinate, obtained as follows: acquire the minimum circumscribed circle of the landmark's image region and take the product of its radius R and a prior coefficient k as the depth coordinate, that is, d = R × k, where k is an empirical value set per application and scene.
S404: associate the image of the acquisition object with the multi-degree-of-freedom information, thereby establishing a mapping between the image and that information.
Preferably, the environment attribute information at the time the agent captures the image is also acquired, and the object in the visual scene is associated with it. Environment information includes the shooting time, the scene type, the season at shooting, and the like.
S405: store the image of the acquisition object and the associated multi-degree-of-freedom information. Preferably, the associated environment attribute information is stored as well.
Through these steps, images of the objects in the visual scene at different angles and different distances from the agent are built up. The method can also be used to present six-degree-of-freedom virtual objects/characters in VR/AR.
Preferably, the method further comprises: acquiring the multi-degree-of-freedom information and/or environment attribute information of the current agent relative to a specified object; acquiring an image of the specified object according to that information; and presenting the image of the specified object. A sketch of the association, storage, and retrieval follows.
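A minimal sketch of S404, S405, and the retrieval just described; the record fields and nearest-pose matching rule are illustrative assumptions, not the patent's data format:

```python
import math

class AcquisitionStore:
    """Associates captured images with six-DOF records and environment attributes."""

    def __init__(self):
        self.records = []

    def add(self, image, six_dof, env=None):
        # S404/S405: associate and store the image, its pose, and its attributes
        self.records.append({"image": image, "six_dof": six_dof, "env": env or {}})

    def query(self, six_dof, env=None):
        """Return the stored image whose pose best matches the agent's current
        six-DOF state, optionally filtered by environment attributes (e.g. season)."""
        def pose_distance(rec):
            a, b = rec["six_dof"], six_dof
            keys = ("x", "y", "d", "pitch", "yaw", "roll")
            return math.dist([a[k] for k in keys], [b[k] for k in keys])

        pool = [r for r in self.records
                if env is None or all(r["env"].get(k) == v for k, v in env.items())]
        return min(pool, key=pose_distance)["image"] if pool else None
```

A caller would add one record per captured view during acquisition and later query with the agent's current pose to present the best-matching image.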
With the method of the third embodiment, images of certain objects in the visual scene can be collected during visual navigation, and an association is established between the agent's trajectory information and the collected images. From the trajectory information, the viewing position, the viewing angle for a specified object, and the like can be determined accurately.
Taking model-room image collection as an example, after adopting the method of the third embodiment, an overall 3D view of the model room and images of particular furniture/decorations from different viewing angles can be generated from the collected images. Other users (e.g., customers visiting the model room) can view a specified object from different perspectives, or view the overall 3D effect of the room as a reference for their own house purchase or decoration. It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working procedures of the above-described apparatus, modules, and units can be found in the corresponding procedures of the foregoing method embodiments and are not repeated here.
The embodiments of the present invention also disclose a computer program product comprising computer program instructions for implementing the method as in embodiment one or embodiment three when the instructions are executed by a processor.
The embodiment of the invention also discloses a computer readable storage medium, on which a computer program is stored, which when executed, implements the method as in the first or third embodiment.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods, apparatus and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart and block diagrams may represent a module, segment, or portion of code, which comprises one or more computer-executable instructions for implementing the logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. It will also be noted that each block or combination of blocks in the block diagrams and flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprises", "comprising", and any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element introduced by the phrase "comprising a" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The preceding description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention; it is illustrative only, not limiting. Other variations or modifications of the various aspects of the invention will be apparent to those skilled in the art and fall within the scope of the invention.

Claims (20)

1. A landmark-based visual navigation method, the method comprising:
determining landmarks in a visual scene, wherein the landmarks are marker areas in the visual scene for position and posture reference in a motion navigation process;
acquiring multi-degree-of-freedom information of the agent relative to the landmark;
acquiring multi-degree-of-freedom change information of the agent relative to the landmark;
and navigating the movement of the agent according to the multi-degree-of-freedom change information.
2. The method of claim 1, wherein the multi-degree-of-freedom information includes coordinate information and angle information.
3. The method of claim 2, wherein the multi-degree-of-freedom information is six-degree-of-freedom information comprising an abscissa, an ordinate, and a depth coordinate of the agent relative to the landmark in the visual scene, and a pitch angle, a yaw angle, and a rotation angle of the agent in a spatial coordinate system; acquiring the multi-degree-of-freedom information of the agent relative to the landmark specifically comprises the following steps:
acquiring the visual scene captured by the agent's camera, analyzing it, and determining the abscissa, ordinate, and depth coordinate of the agent relative to the landmark;
and acquiring sensor data of the agent and determining the pitch angle, yaw angle, and rotation angle of the agent in the spatial coordinate system.
4. The method of claim 1, wherein determining the landmark in the visual scene is specifically: a region of the visual scene preselected by the user serves as the landmark.
5. The method of claim 1, wherein determining the landmark in the visual scene is specifically: a salient object in the visual scene is identified as the landmark using a subject-identification algorithm, or a specific region is detected as the landmark using a target-detection algorithm.
6. The method of claim 1, wherein the method further comprises: after the multi-degree-of-freedom information is acquired, initializing an image tracking algorithm by utilizing the multi-degree-of-freedom information, wherein the image tracking algorithm is used for acquiring the position and/or the area of the landmark in the current visual scene.
7. The method of claim 1, wherein the method further comprises: judging whether the current landmark is lost, if so, stopping the motion navigation and starting the re-detection step.
8. The method of claim 7, wherein the re-detection step is specifically: detecting the landmark using the last frame before the loss as a template and, if the landmark is detected, re-acquiring the multi-degree-of-freedom information of the agent relative to the landmark.
9. The method of claim 3, wherein the center coordinates of the landmark's image region are taken as the landmark's abscissa and ordinate, from which the abscissa and ordinate of the agent relative to the landmark are obtained, and the distance of the agent's camera from the landmark is taken as the depth coordinate; the depth coordinate is obtained as follows: the minimum circumscribed circle of the landmark's image region is acquired, and the product of the circumscribed circle's radius R and a prior coefficient k is taken as the landmark's depth coordinate, from which the depth coordinate of the agent relative to the landmark is obtained.
10. The method of claim 3, wherein the multi-degree-of-freedom change information of the agent relative to the landmark comprises: change information of the pitch angle, yaw angle, and rotation angle; the displacement of the agent in the landmark plane; and the depth displacement of the agent relative to the landmark; the displacement in the landmark plane is the variation between the landmark's coordinates in the current image frame and its initial coordinates.
11. The method of claim 10, wherein the depth displacement is determined from the minimum circumscribed circle radius of the current landmark image region and the minimum circumscribed circle radius of the landmark image region when the landmark was constructed.
12. A landmark-based visual navigation device, comprising:
The landmark determining module is used for determining landmarks in a visual scene, wherein the landmarks are mark areas for making position and gesture references in the motion navigation process in the visual scene;
the multi-degree-of-freedom information construction module is used for acquiring multi-degree-of-freedom information of the agent relative to the landmark;
the change information acquisition module is used for acquiring position change information of the agent relative to the landmark;
and the visual navigation module is used for navigating the movement of the agent according to the position change information.
13. An image acquisition method, the method comprising:
determining an acquisition object in a visual scene, wherein the acquisition object is at least one salient object or specific region in the visual scene;
acquiring an image of the acquisition object;
acquiring multi-degree-of-freedom information of the agent relative to the acquisition object;
correlating the image of the acquisition object with the multi-degree-of-freedom information;
storing the image of the acquisition object and the associated multi-degree-of-freedom information.
14. The method of claim 13, wherein determining the acquisition object in the visual scene is specifically: identifying a salient object in the visual scene as the acquisition object using an image subject-identification algorithm, or detecting a specific region as the acquisition object using a target-detection algorithm.
15. The method of claim 13, wherein the multi-degree-of-freedom information includes coordinate information and angle information.
16. The method of claim 15, wherein the multi-degree-of-freedom information is six-degree-of-freedom information comprising an abscissa, an ordinate, and a depth coordinate of the agent relative to the acquisition object in the visual scene, and a pitch angle, a yaw angle, and a rotation angle of the agent in a spatial coordinate system; acquiring the multi-degree-of-freedom information of the agent relative to the acquisition object specifically comprises the following steps:
acquiring the visual scene captured by the agent's camera, analyzing it, and determining the abscissa, ordinate, and depth coordinate of the agent relative to the acquisition object;
and acquiring sensor data of the agent and determining the pitch angle, yaw angle, and rotation angle of the agent in the spatial coordinate system.
17. The method of claim 13, wherein the method further comprises:
acquiring environment attribute information at the time the agent captures the image of the acquisition object;
associating the image of the acquisition object with the environment attribute information;
and storing the associated environment attribute information.
18. The method of claim 13, wherein the method further comprises:
acquiring multi-degree-of-freedom information and/or environment attribute information of the current agent relative to a specified object;
acquiring an image of the specified object according to the multi-degree-of-freedom information and/or the environment attribute information;
And presenting the image of the specified object.
19. A computer program product comprising computer program instructions for implementing the visual navigation method of any one of claims 1-11 or the image acquisition method of any one of claims 13-18 when the instructions are executed by a processor.
20. A computer readable storage medium having stored thereon a computer program which, when executed, implements the visual navigation method of any of claims 1-11 or the image acquisition method of any of claims 13-18.
CN202010652637.4A 2020-07-08 2020-07-08 Visual navigation based on landmarks Active CN113280817B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010652637.4A CN113280817B (en) 2020-07-08 2020-07-08 Visual navigation based on landmarks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010652637.4A CN113280817B (en) 2020-07-08 2020-07-08 Visual navigation based on landmarks

Publications (2)

Publication Number Publication Date
CN113280817A CN113280817A (en) 2021-08-20
CN113280817B (en) 2024-07-23

Family

ID=77275622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010652637.4A Active CN113280817B (en) 2020-07-08 2020-07-08 Visual navigation based on landmarks

Country Status (1)

Country Link
CN (1) CN113280817B (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6182010B1 (en) * 1999-01-28 2001-01-30 International Business Machines Corporation Method and apparatus for displaying real-time visual information on an automobile pervasive computing client
JP2004030445A (en) * 2002-06-27 2004-01-29 National Institute Of Advanced Industrial & Technology Mobile robot self-position estimation method and system, and program
WO2006109527A1 (en) * 2005-03-30 2006-10-19 National University Corporation Kumamoto University Navigation device and navigation method
CN105241445B (en) * 2015-10-20 2018-07-31 深圳大学 A kind of indoor navigation data capture method and system based on intelligent mobile terminal
TWI574223B (en) * 2015-10-26 2017-03-11 行政院原子能委員會核能研究所 Navigation system using augmented reality technology
CN105910615B (en) * 2016-03-30 2019-08-30 上海工业控制安全创新科技有限公司 A walking navigation method and system based on virtual reality
CN111197984A (en) * 2020-01-15 2020-05-26 重庆邮电大学 Vision-inertial motion estimation method based on environmental constraint

Also Published As

Publication number Publication date
CN113280817A (en) 2021-08-20

Similar Documents

Publication Publication Date Title
CN106092104B (en) A kind of method for relocating and device of Indoor Robot
CN111275763B (en) Closed loop detection system, multi-sensor fusion SLAM system and robot
CN108489482B (en) The realization method and system of vision inertia odometer
CN107967457B (en) A method and system for location recognition and relative positioning that adapts to changes in visual features
CN108406731B (en) A positioning device, method and robot based on depth vision
CN109084732A (en) Positioning and navigation method, device and processing equipment
CN109887053A (en) A kind of SLAM map joining method and system
US20060188131A1 (en) System and method for camera tracking and pose estimation
CN110749308B (en) SLAM-oriented outdoor localization method using consumer-grade GPS and 2.5D building models
CN107665505B (en) Method and device for realizing augmented reality based on plane detection
CN114063099B (en) Positioning method and device based on RGBD
CN110599545B (en) Feature-based dense map construction system
CN113447014A (en) Indoor mobile robot, mapping method, positioning method, and mapping positioning device
Liu et al. Towards SLAM-based outdoor localization using poor GPS and 2.5 D building models
US10977810B2 (en) Camera motion estimation
KR102342945B1 (en) Estimating location method and apparatus for autonomous driving with surround image
CN110827353A (en) A robot positioning method based on monocular camera assistance
CN113689499B (en) A visual rapid positioning method, device and system based on point-surface feature fusion
Xian et al. Fusing stereo camera and low-cost inertial measurement unit for autonomous navigation in a tightly-coupled approach
Huttunen et al. A monocular camera gyroscope
CN112200917A (en) High-precision augmented reality method and system
Bergeon et al. Low cost 3D mapping for indoor navigation
CN117213515A (en) Visual SLAM path planning method and device, electronic equipment and storage medium
CN113280817B (en) Visual navigation based on landmarks
CN114627253A (en) Map construction method, device and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant