CN117974810A - Target positioning method and device based on binocular vision and electronic equipment
Target positioning method and device based on binocular vision and electronic equipment
- Publication number
- CN117974810A (application number CN202410382382.2A)
- Authority
- CN
- China
- Prior art keywords
- target
- image
- camera
- determining
- target object
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
- G06T7/85—Stereo camera calibration
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30244—Camera pose
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Length Measuring Devices By Optical Means (AREA)
Abstract
The application relates to a target positioning method based on binocular vision, realized by means of a 3D camera, a first 2D camera and a second 2D camera, wherein the first 2D camera and the second 2D camera are arranged on two sides of the 3D camera and respectively acquire a first image and a second image of a target object at different angles. The method comprises the following steps: determining a target image in the three-dimensional image based on a preset target frame, and determining a first mask and a second mask according to the projections of the target image in the first image and the second image; filtering the first image through the first mask to obtain a first filtered image, and filtering the second image through the second mask to obtain a second filtered image; identifying target line segments in the first filtered image and the second filtered image respectively, and locating a target area based on the target line segments, wherein the target area contains an image of the target object; and determining the center point of the target object within the target area in each of the first filtered image and the second filtered image, and determining the three-dimensional position of the target object according to the center points.
Description
Technical Field
The present application relates to the field of image processing, and in particular, to a binocular vision-based target positioning method, apparatus and electronic device.
Background
In computer vision task scenarios, various target objects need to be identified and grasped, which requires their positions to be determined accurately.
The prior art uses three-dimensional imaging cameras, such as 3D structured-light cameras, binocular speckle cameras, and TOF cameras, to determine the position of an object in three-dimensional space. However, for objects that reflect light easily, such as metal workpieces, the object surface is prone to overexposure and decoding fails, so a complete point cloud cannot be obtained and target positioning accuracy is low.
Existing production lines place very high demands on grasping and placement, with positioning tolerances generally of a few millimeters and a tight production takt. Three-dimensional imaging performs poorly on reflective objects and cannot meet these production requirements, resulting in low target positioning accuracy and mis-positioning.
Disclosure of Invention
The embodiments of the application provide a target positioning method and device based on binocular vision and an electronic device, which at least solve the problems of low target positioning accuracy and mis-positioning in the related art.
In a first aspect, an embodiment of the present application provides a binocular vision-based target positioning method, where the method is implemented by means of a 3D camera, a first 2D camera and a second 2D camera, where the 3D camera is used to acquire a three-dimensional image of a target object, and the first 2D camera and the second 2D camera are disposed on two sides of the 3D camera and are respectively used to acquire a first image and a second image of the target object at different angles; the method comprises the following steps:
determining a target image in the three-dimensional image based on a preset target frame, determining a first mask according to the projection of the target image in the first image, and determining a second mask according to the projection of the target image in the second image;
filtering the first image through the first mask to obtain a first filtered image, and filtering the second image through the second mask to obtain a second filtered image;
identifying a target line segment in each of the first filtered image and the second filtered image, and locating a target area based on the target line segments, wherein the target area contains an image of the target object;
and determining, in each of the first filtered image and the second filtered image, the center point of the target object within the target area, and determining the three-dimensional position of the target object according to the center points.
In an embodiment, the determining, in the first filtered image and the second filtered image, a center point of a target object in a target area, and determining a three-dimensional position of the target object according to the center point, includes:
performing feature matching on the target area by adopting preset features of the target object;
determining a first center point of the target object in the first filtered image and a second center point in the second filtered image based on the feature matching result;
and determining, based on the principle of similar triangles, the three-dimensional position of the target object according to the positional relationship between the first center point and the second center point and the camera calibration parameters.
In an embodiment, the target image comprises images of at least two target objects; acquiring the target area comprises the following steps:
filtering the first image through the first mask to obtain a first filtered image comprising at least two candidate regions, and filtering the second image through the second mask to obtain a second filtered image comprising at least two candidate regions;
extracting line segments from the first filtered image and the second filtered image respectively, clustering the line segments to obtain candidate line segments, and screening the candidate line segments to obtain target line segments;
and determining whether a target line segment is distributed in a candidate region, and if so, determining the candidate region as the target region.
In an embodiment, the determining, in the first filtered image and the second filtered image, a center point of the target object in the target area includes:
determining corresponding target areas in the first filtered image and the second filtered image;
and determining the center point of the target object in the first filtered image and the second filtered image according to the corresponding target areas.
In an embodiment, the determining the corresponding target area in the first filtered image and the second filtered image includes:
determining the correspondence of the target line segments according to the corresponding positions of the target line segments in the first filtered image and the second filtered image;
and determining the corresponding target areas according to the correspondence.
In an embodiment, determining the three-dimensional position of the target object comprises:
performing feature matching on the corresponding target areas simultaneously to obtain a first center point of each target object in the first filtered image and a second center point of each target object in the second filtered image;
and determining the three-dimensional position of each target object simultaneously according to its first center point and second center point, obtaining a set of three-dimensional positions of all the target objects.
In a second aspect, an embodiment of the present application provides a binocular vision-based target positioning device, where the device includes a 3D camera, a first 2D camera, and a second 2D camera, and is used in cooperation with the binocular vision-based target positioning method of the first aspect;
the 3D camera is used for acquiring a three-dimensional image of the target object;
the first 2D camera and the second 2D camera are arranged on two sides of the 3D camera and are used for acquiring a first image and a second image of the target object at different angles.
In an embodiment, the device further comprises light sources arranged above the left side and above the right side of the target object, respectively, the light sources being used for supplementing the target object with illumination.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the binocular vision-based target positioning method according to the first aspect when the processor executes the computer program.
In a fourth aspect, an embodiment of the present application provides a computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the binocular vision-based target positioning method of the first aspect.
The binocular vision-based target positioning method, the binocular vision-based target positioning device and the electronic equipment provided by the embodiment of the application have at least the following technical effects.
The application filters the first image and the second image through the first mask and the second mask, removing background information from the images and thereby reducing background and other invalid interference. Target line segments are used to filter out areas irrelevant to the target object, i.e., interference areas that do not contain the target object, so that the position of the target object can subsequently be judged accurately; this reduces the computation required when the center point is later determined and improves positioning speed. Meanwhile, by means of the 2D cameras, the 3D image-matching problem is converted into a 2D one, which on the one hand reduces the computation needed for target positioning and improves its speed, and on the other hand performs positioning on 2D images, avoiding the inaccuracy caused when 3D image information is lost under environmental problems such as reflection.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below, so as to make the other features, objects, and advantages of the application more apparent.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flow chart illustrating a method of target location according to an exemplary embodiment;
FIG. 2 is a schematic diagram of a target object shown according to an example embodiment;
FIG. 3 is a schematic diagram of a target positioning device according to an exemplary embodiment;
FIG. 4 is a schematic diagram of a structure of an object positioning device according to another exemplary embodiment;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be described and illustrated with reference to the accompanying drawings and examples in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application. All other embodiments, which can be made by a person of ordinary skill in the art based on the embodiments provided by the present application without making any inventive effort, are intended to fall within the scope of the present application.
It is apparent that the drawings in the following description are only some examples or embodiments of the present application, and those of ordinary skill in the art can apply the present application to other similar situations according to these drawings without inventive effort. Moreover, it should be appreciated that while such a development effort might be complex and lengthy, it would nevertheless be a routine undertaking of design, fabrication, or manufacture for those of ordinary skill having the benefit of this disclosure.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is to be expressly and implicitly understood by those of ordinary skill in the art that the described embodiments of the application can be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs. The terms "a," "an," "the," and similar referents in the context of the application are not to be construed as limiting the quantity, but rather as singular or plural. The terms "comprising," "including," "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to only those steps or elements but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. The terms "connected," "coupled," and the like in connection with the present application are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as used herein means two or more. "and/or" describes an association relationship of an association object, meaning that there may be three relationships, e.g., "a and/or B" may mean: a exists alone, A and B exist together, and B exists alone. The character "/" generally indicates that the context-dependent object is an "or" relationship. The terms "first," "second," "third," and the like, as used herein, are merely distinguishing between similar objects and not representing a particular ordering of objects.
In a first aspect, an embodiment of the present application provides a binocular vision-based target positioning method, where the method is implemented by means of a 3D camera, a first 2D camera and a second 2D camera, where the 3D camera is used to acquire a three-dimensional image of a target object, and the first 2D camera and the second 2D camera are disposed on two sides of the 3D camera and are respectively used to acquire a first image and a second image of the target object at different angles.
Optionally, the first 2D camera and the second 2D camera are 2D grayscale cameras of the same resolution, and the 3D camera is a low-precision 3D camera. The first 2D camera and the second 2D camera are symmetrically arranged on the two sides of the 3D camera. The target object is placed on a pre-prepared material rack.
FIG. 1 is a flow chart illustrating a method of target location, as shown in FIG. 1, according to an exemplary embodiment, the method comprising:
step S101, determining a target image in the three-dimensional image based on a preset target frame, determining a first mask according to the projection of the target image in the first image, and determining a second mask according to the projection of the target image in the second image.
Optionally, a preset target frame is determined by a region of interest (Region Of Interest, ROI) selection method, and the three-dimensional image is filtered according to the preset target frame to obtain a target image containing the target object placed on the material rack. The target image is then projected into the first image and the second image acquired by the two 2D cameras to obtain a first mask and a second mask respectively.
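As an illustrative sketch of this ROI step (the patent gives no implementation; the function and variable names below are assumptions), the preset target frame can be treated as an axis-aligned box crop of the 3D camera's point cloud:

```python
import numpy as np

def crop_to_target_frame(points, frame_min, frame_max):
    """Keep only the 3D points that fall inside the preset target frame.

    points    : (N, 3) point cloud from the 3D camera
    frame_min : (3,) lower corner of the preset target frame (ROI box)
    frame_max : (3,) upper corner of the preset target frame (ROI box)
    """
    inside = np.all((points >= frame_min) & (points <= frame_max), axis=1)
    return points[inside]  # the "target image" points used for projection
```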
Step S102, the first image is filtered through a first mask to obtain a first filtered image, and the second image is filtered through a second mask to obtain a second filtered image.
Optionally, the first image and the second image are filtered through the first mask and the second mask, filtering out the background information in the images; this reduces background and other invalid interference information and facilitates subsequent accurate positioning of the target object.
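A minimal sketch of the mask construction and filtering, assuming each 2D camera's calibration (rotation rvec, translation tvec, intrinsics K, distortion dist) is known from a prior calibration step; the helper names are illustrative, not from the patent:

```python
import cv2
import numpy as np

def project_mask(target_points, rvec, tvec, K, dist, image_shape):
    """Project the 3D target points into one 2D view and rasterize the
    convex hull of the projections as a binary mask."""
    pts, _ = cv2.projectPoints(target_points.astype(np.float32),
                               rvec, tvec, K, dist)
    hull = cv2.convexHull(pts.reshape(-1, 2).astype(np.int32))
    mask = np.zeros(image_shape[:2], dtype=np.uint8)
    cv2.fillConvexPoly(mask, hull, 255)
    return mask

def apply_mask(image, mask):
    """Zero out everything outside the projected target region."""
    return cv2.bitwise_and(image, image, mask=mask)

# usage (first_image and the calibration values come from the camera
# and the calibration step; names are placeholders):
#   first_mask = project_mask(target_points, rvec1, tvec1, K1, dist1,
#                             first_image.shape)
#   first_filtered = apply_mask(first_image, first_mask)
```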
Step S103, identifying target line segments in the first filtered image and the second filtered image respectively, and locating a target area based on the target line segments, wherein the target area contains an image of a target object.
Optionally, the target object may be an object of any shape and is placed on a pre-prepared material rack; in practice, the material rack and the target object are designed according to the actual application. The target object in this example is a cylindrical workpiece, such as an automotive connecting-rod bearing. Fig. 2 is a schematic diagram of a target object according to an exemplary embodiment. As shown in fig. 2, when the target object is a cylindrical workpiece, the target line segments are line segments in the vertical direction, and if a target line segment lies within the image range of the target object, the area corresponding to that target object is taken as a target area.
In this way, interference areas are filtered out and the areas containing target objects are preliminarily determined, so the positions of the target objects can subsequently be judged accurately; since feature matching is later carried out only within the filtered areas, the computation is reduced compared with directly matching features over the whole image.
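For the cylindrical-workpiece case, step S103 could be sketched as below; the Canny/Hough parameters and the 10-degree verticality tolerance are assumptions chosen for illustration:

```python
import cv2
import numpy as np

def find_vertical_segments(filtered_image, max_tilt_deg=10.0):
    """Detect line segments in a filtered image and keep the near-vertical
    ones, which serve as target line segments for a cylindrical workpiece."""
    edges = cv2.Canny(filtered_image, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=40,
                            minLineLength=20, maxLineGap=5)
    vertical = []
    if lines is not None:
        for x1, y1, x2, y2 in lines.reshape(-1, 4):
            # angle of the segment measured from the vertical image axis
            tilt = abs(np.degrees(np.arctan2(x2 - x1, y2 - y1)))
            tilt = min(tilt, 180.0 - tilt)
            if tilt <= max_tilt_deg:
                vertical.append((x1, y1, x2, y2))
    return vertical
```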
In one example, the target image contains images of at least two target objects, step S103 includes:
step S301, filtering the first image through a first mask to obtain a first filtered image including at least two candidate regions, and filtering the second image through a second mask to obtain a second filtered image including at least two candidate regions.
Step S302, extracting line segments from the first filtered image and the second filtered image respectively, clustering the line segments to obtain candidate line segments, and screening the candidate line segments to obtain target line segments.
Step S303, determining whether a target line segment is distributed in a candidate region; if so, the candidate region is determined to be a target region.
Optionally, when the target image contains images of at least two target objects, the first filtered image and the second filtered image each contain at least two candidate regions. Short line segments are identified in the first filtered image and the second filtered image and merged by a clustering algorithm into long line segments, namely candidate line segments; the candidate line segments are then screened to obtain the target line segments. In this embodiment, assuming the target object is a cylindrical workpiece and referring to fig. 2, candidate line segments in the vertical direction are screened out as target line segments. The target areas are determined by judging whether a target line segment exists within each candidate region. In this way, areas irrelevant to the target objects, i.e., interference areas that contain no target object, are filtered out, so the positions of the target objects can subsequently be judged accurately, the computation required by subsequent feature matching is reduced, and positioning speed is improved.
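The clustering and screening could look like the following sketch; grouping near-vertical segments by their x position and keeping only sufficiently long merged segments are assumptions that match the cylindrical example above:

```python
def cluster_segments(segments, x_tol=8, min_length=60):
    """Merge short near-vertical segments sharing roughly the same x position
    into candidate segments, then screen out candidates that are too short."""
    clusters = []
    for x1, y1, x2, y2 in sorted(segments, key=lambda s: (s[0] + s[2]) / 2):
        cx = (x1 + x2) / 2.0
        for c in clusters:
            if abs(c["x"] - cx) <= x_tol:
                c["ys"] += [y1, y2]
                break
        else:
            clusters.append({"x": cx, "ys": [y1, y2]})
    merged = [(int(c["x"]), min(c["ys"]), int(c["x"]), max(c["ys"]))
              for c in clusters]
    # screening: a merged segment becomes a target segment only if long enough
    return [s for s in merged if s[3] - s[1] >= min_length]

def region_has_segment(region, segment):
    """Check whether a target segment lies inside a candidate region box."""
    x, y, w, h = region
    x1, y1, x2, y2 = segment
    return x <= x1 <= x + w and y <= y1 and y2 <= y + h
```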
With continued reference to fig. 1, step S104 is performed after step S103, as follows.
Step S104, determining, in each of the first filtered image and the second filtered image, the center point of the target object within the target area, and determining the three-dimensional position of the target object according to the center points.
Optionally, feature matching is performed on the target area according to the features of the target object, so that the central points of the target object in the first image and the second image are determined, and the three-dimensional position of the target object is calculated through the central points of the target object in the first image and the second image according to the binocular imaging principle.
In this way, the three-dimensional position of the target object is determined from the 2D cameras according to the binocular imaging principle, converting the 3D image-matching problem into a 2D one: on the one hand, the computation needed for target positioning is reduced and its speed improved; on the other hand, positioning is performed on 2D images, avoiding the inaccuracy caused when 3D image information is lost under environmental problems such as reflection.
In one example, step S104 includes:
In step S401, feature matching is performed on the target area by using the preset features of the target object.
Step S402, determining a first center point of the target object in the first filtered image and a second center point in the second filtered image based on the feature matching result.
Step S403, based on the principle of similar triangle, determining the three-dimensional position of the target object according to the position relationship between the first center point and the second center point and the camera calibration parameters.
Optionally, during matching, the template slides from left to right and from top to bottom; at each position, correlation values between the pixel points of the target area and the characteristic pixel points of the target object are calculated, and the position where the correlation value is greatest and exceeds a preset threshold is taken as the position of the target object. In this way, a first center point of the target object in the first filtered image and a second center point in the second filtered image are determined. The first center point and the second center point are then put into correspondence, the binocular disparity is calculated from them, and a similar-triangle calculation using the camera calibration parameters yields the three-dimensional position of the target object.
In this way, feature matching is performed on the target areas, and the three-dimensional position of the target object is then obtained from the matching result. When a multi-layer material rack is present, objects on the lower layers can be filtered out, avoiding positioning errors. Meanwhile, the 3D image-matching problem is converted into a 2D one, which reduces the computation needed for target positioning and improves its speed, while positioning on 2D images avoids the inaccuracy caused when 3D image information is lost under environmental problems such as reflection.
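Steps S401 to S403 could be sketched as below for a rectified camera pair; cv2.matchTemplate with normalized correlation stands in for the correlation search described above, and f (focal length in pixels), baseline, cx, cy are assumed to come from the camera calibration:

```python
import cv2

def match_center(region_image, template, min_score=0.7):
    """Slide the template over the region from left to right and top to
    bottom; return the best-matching center, or None below the threshold."""
    scores = cv2.matchTemplate(region_image, template, cv2.TM_CCOEFF_NORMED)
    _, best, _, top_left = cv2.minMaxLoc(scores)
    if best < min_score:
        return None
    h, w = template.shape[:2]
    return (top_left[0] + w / 2.0, top_left[1] + h / 2.0)

def triangulate(c1, c2, f, baseline, cx, cy):
    """Similar-triangle depth for a rectified pair: Z = f * baseline / d,
    where d is the horizontal disparity between the two center points."""
    d = c1[0] - c2[0]
    Z = f * baseline / d
    X = (c1[0] - cx) * Z / f
    Y = (c1[1] - cy) * Z / f
    return X, Y, Z
```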
In one example, the target image contains images of at least two target objects, step S104 includes:
In step S1041, a corresponding target area is determined in the first filtered image and the second filtered image.
In step S1042, the center points of the target objects in the first filtered image and the second filtered image are determined according to the corresponding target areas.
Optionally, each target object maps to a target area in the first filtered image and a target area in the second filtered image. The target objects captured by the first camera and the second camera are generally the same, so, in left-to-right order, the first target area in the first filtered image corresponds to the first target area in the second filtered image, the second to the second, and so on; in this way the target areas of the two filtered images are placed in correspondence. The center point of each target object is then determined in the first filtered image and the second filtered image according to the corresponding target areas. This one-to-one correspondence of target areas between the two filtered images facilitates the subsequent feature matching and improves positioning speed.
In one example, step S1041 includes: determining the correspondence of the target line segments according to their corresponding positions in the first filtered image and the second filtered image, and determining the corresponding target areas according to this correspondence.
Optionally, since the target objects captured by the first camera and the second camera are generally the same, in left-to-right order the first target line segment in the first filtered image corresponds to the first target line segment in the second filtered image, the second to the second, and so on, so that all target line segments of the two filtered images correspond in order. The correspondence of the target areas is then determined from the target areas to which the corresponding target line segments belong. In this way, the target areas of the target objects in the first filtered image and the second filtered image correspond one to one, facilitating the subsequent feature matching and improving positioning speed.
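The ordering-based correspondence reduces to a sort by horizontal position; a sketch assuming each target region is an (x, y, w, h) bounding box:

```python
def pair_regions(regions_left, regions_right):
    """Pair target regions across the two views by left-to-right order,
    relying on both cameras seeing the same set of target objects."""
    ordered_left = sorted(regions_left, key=lambda r: r[0])
    ordered_right = sorted(regions_right, key=lambda r: r[0])
    return list(zip(ordered_left, ordered_right))
```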
In one example, the target image comprises images of at least two target objects, determining a three-dimensional position of the target objects, comprising:
Step S501, feature matching is performed on the corresponding target areas at the same time, so as to obtain a first center point of each target object in the first filtered image and a second center point of each target object in the second filtered image.
Step S502, determining the three-dimensional position of each target object according to the first center point and the second center point of each target object, and obtaining a group of three-dimensional positions of all target objects.
According to the correspondence of the target areas in the foregoing embodiment, feature matching is performed on the corresponding target areas simultaneously to obtain the first center point and the second center point of each target object, and the three-dimensional positions of the target objects are calculated from these center points using multithreading, based on the binocular imaging principle; thus, when the target image contains images of at least two target objects, a set of three-dimensional positions of all the target objects is obtained.
In this way, feature matching is performed on the corresponding target areas in parallel, which greatly improves positioning speed; in the embodiment of the application, only about 1.5 seconds are needed from capturing the images to positioning the target objects, which meets production requirements well.
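One way to realize this simultaneous matching is a thread pool over the paired regions; this sketch reuses the illustrative match_center, triangulate and pair_regions helpers from the earlier snippets. OpenCV releases the Python GIL inside matchTemplate, so the threads can genuinely run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def locate_all(pairs, img_left, img_right, template, f, baseline, cx, cy):
    """Match every paired region in parallel and triangulate each target."""
    def locate_one(pair):
        (xl, yl, wl, hl), (xr, yr, wr, hr) = pair
        c1 = match_center(img_left[yl:yl + hl, xl:xl + wl], template)
        c2 = match_center(img_right[yr:yr + hr, xr:xr + wr], template)
        if c1 is None or c2 is None:
            return None
        # shift region-local centers back into full-image coordinates
        c1 = (c1[0] + xl, c1[1] + yl)
        c2 = (c2[0] + xr, c2[1] + yr)
        return triangulate(c1, c2, f, baseline, cx, cy)

    with ThreadPoolExecutor() as pool:
        return [p for p in pool.map(locate_one, pairs) if p is not None]
```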
In summary, the application converts the 3D image-matching problem into a 2D one by means of the 2D cameras, which reduces the computation needed for target positioning, improves its speed, performs positioning on 2D images, and avoids the inaccuracy caused when 3D image information is lost under environmental problems such as reflection. The application filters the first image and the second image through the first mask and the second mask, removing background information and thereby reducing background and other invalid interference. Target line segments are used to filter out areas irrelevant to the target objects, i.e., interference areas containing no target object, so the positions of the target objects can subsequently be judged accurately, the computation required by subsequent feature matching is reduced, and positioning speed is improved. In addition, the application can perform feature matching on the corresponding target areas simultaneously, greatly improving positioning speed and meeting production requirements well.
In a second aspect, an embodiment of the present application provides a binocular vision-based target positioning apparatus, where the apparatus includes a 3D camera, a first 2D camera, and a second 2D camera, and the apparatus is configured to cooperate with the binocular vision-based target positioning method of the first aspect.
The 3D camera is used to acquire a three-dimensional image of the target object. The first 2D camera and the second 2D camera are arranged on two sides of the 3D camera and are used to acquire a first image and a second image of the target object at different angles.
Optionally, the first 2D camera and the second 2D camera are 2D grayscale cameras of the same resolution, and the 3D camera is a low-precision 3D camera. The target positioning device is used together with a pre-prepared material rack, on which the target object is placed. Fig. 3 is a schematic structural view of a target positioning device according to an exemplary embodiment; as shown in fig. 3, the first 2D camera and the second 2D camera are symmetrically arranged on the two sides of the 3D camera.
The target positioning device provided by the application cooperates with the target positioning method of the first aspect to position the target object, improving both the accuracy and the speed of target positioning. It solves the difficulty of grasping metal target objects when the three-dimensional positioning success rate is low; in actual production, only about 1.5 seconds are needed from capturing the images of the target object to determining its three-dimensional position, which meets production requirements well.
In one example, the apparatus further comprises light sources disposed above the left side and above the right side of the target object, respectively, the light sources being for supplementing the target object with illumination.
Alternatively, taking strip light sources as an example, fig. 4 is a schematic structural diagram of a target positioning device according to another exemplary embodiment. As shown in fig. 4, the strip light sources supplement the illumination of the target object so that the cameras image it more clearly; at the same time, specular reflection forms a virtual image of each light source on the surface of the cylindrical target object, so that several distinct white stripes appear in the images of the target object captured by the 2D cameras. In this way the texture of the target object is enhanced, which aids target line segment extraction and feature matching and thus effectively improves positioning accuracy.
In a third aspect, an embodiment of the present application provides an electronic device. Fig. 5 is a schematic structural diagram of the electronic device provided in the embodiment of the present application. The electronic device includes a memory, a processor, and a computer program stored on the memory and executable on the processor; the processor implements the binocular vision-based target positioning method provided in the first aspect when executing the program. The electronic device 60 shown in fig. 5 is merely an example and should not impose any limitation on the functions or the scope of use of the embodiments of the present application.
The electronic device 60 may be in the form of a general purpose computing device, which may be a server device, for example. Components of electronic device 60 may include, but are not limited to: the at least one processor 61, the at least one memory 62, a bus 63 connecting the different system components, including the memory 62 and the processor 61.
The bus 63 includes a data bus, an address bus, and a control bus.
Memory 62 may include volatile memory such as Random Access Memory (RAM) 621 and/or cache memory 622, and may further include Read Only Memory (ROM) 623.
Memory 62 may also include a program/utility 625 having a set (at least one) of program modules 624, such program modules 624 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The processor 61 executes various functional applications and data processing, such as the binocular vision-based object localization method of the first aspect of the present application, by running a computer program stored in the memory 62.
The electronic device 60 may also communicate with one or more external devices 64 (e.g., keyboard, pointing device, etc.). Such communication may occur through an input/output (I/O) interface 65. The electronic device 60 may also communicate with one or more networks, such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet, through a network adapter 66. As shown, the network adapter 66 communicates with the other modules of the electronic device 60 via the bus 63. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 60, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, data backup storage systems, and the like.
It should be noted that although several units/modules or sub-units/modules of an electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more units/modules described above may be embodied in one unit/module in accordance with embodiments of the present application. Conversely, the features and functions of one unit/module described above may be further divided into and embodied by a plurality of units/modules.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the binocular vision-based object localization method provided in the first aspect.
More specifically, the readable storage medium may include, but is not limited to: a portable disk, a hard disk, a random access memory, a read-only memory, an erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible implementation manner, the present invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps of implementing the binocular vision based object positioning method provided in the first aspect, when the program product is run on the terminal device.
Wherein the program code for carrying out the invention may be written in any combination of one or more programming languages, which program code may execute entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on the remote device or entirely on the remote device.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this description.
The above examples illustrate only a few embodiments of the application, which are described in detail and are not to be construed as limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.
Claims (10)
1. A target positioning method based on binocular vision, characterized in that the method is realized by means of a 3D camera, a first 2D camera and a second 2D camera, wherein the 3D camera is used for acquiring a three-dimensional image of a target object, and the first 2D camera and the second 2D camera are arranged on two sides of the 3D camera and are respectively used for acquiring a first image and a second image of the target object at different angles; the method comprises the following steps:
determining a target image in the three-dimensional image based on a preset target frame, determining a first mask according to the projection of the target image in the first image, and determining a second mask according to the projection of the target image in the second image;
filtering the first image through the first mask to obtain a first filtered image, and filtering the second image through the second mask to obtain a second filtered image;
identifying a target line segment in each of the first filtered image and the second filtered image, and locating a target area based on the target line segments, wherein the target area contains an image of the target object;
and determining, in each of the first filtered image and the second filtered image, the center point of the target object within the target area, and determining the three-dimensional position of the target object according to the center points.
2. The binocular vision-based target positioning method of claim 1, wherein the determining, in the first filtered image and the second filtered image respectively, the center point of the target object within the target area, and determining the three-dimensional position of the target object according to the center point, comprises:
performing feature matching on the target area by adopting preset features of the target object;
determining a first center point of the target object in the first filtered image and a second center point in the second filtered image based on the feature matching result;
and determining, based on the principle of similar triangles, the three-dimensional position of the target object according to the positional relationship between the first center point and the second center point and the camera calibration parameters.
3. The binocular vision-based target positioning method of claim 1, wherein the target image comprises images of at least two target objects, and acquiring the target area comprises:
filtering the first image through the first mask to obtain a first filtered image comprising at least two candidate regions, and filtering the second image through the second mask to obtain a second filtered image comprising at least two candidate regions;
extracting line segments from the first filtered image and the second filtered image respectively, clustering the line segments to obtain candidate line segments, and screening the candidate line segments to obtain target line segments;
and determining whether a target line segment is distributed in a candidate region, and if so, determining the candidate region as the target region.
4. The binocular vision-based target positioning method according to claim 3, wherein the determining, in the first filtered image and the second filtered image respectively, the center point of the target object within the target area comprises:
determining corresponding target areas in the first filtered image and the second filtered image;
and determining the center point of the target object in the first filtered image and the second filtered image according to the corresponding target areas.
5. The binocular vision-based target positioning method of claim 4, wherein the determining the corresponding target areas in the first filtered image and the second filtered image comprises:
determining the correspondence of the target line segments according to the corresponding positions of the target line segments in the first filtered image and the second filtered image;
and determining the corresponding target areas according to the correspondence.
6. The binocular vision-based target positioning method of claim 5, wherein determining the three-dimensional position of the target object comprises:
performing feature matching on the corresponding target areas simultaneously to obtain a first center point of each target object in the first filtered image and a second center point of each target object in the second filtered image;
and determining the three-dimensional position of each target object simultaneously according to its first center point and second center point, obtaining a set of three-dimensional positions of all the target objects.
7. A binocular vision-based target positioning device, characterized in that the device comprises a 3D camera, a first 2D camera and a second 2D camera, and the device is used in cooperation with the binocular vision-based target positioning method according to any one of claims 1-6;
the 3D camera is used for acquiring a three-dimensional image of the target object;
the first 2D camera and the second 2D camera are arranged on two sides of the 3D camera and are used for acquiring a first image and a second image of the target object at different angles.
8. The binocular vision-based target positioning apparatus of claim 7, further comprising light sources disposed respectively above left and right sides of the target object, the light sources being used to supplement illumination for the target object.
9. An electronic device, comprising:
a memory,
a processor, and
a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the binocular vision-based target positioning method of any one of claims 1 to 6.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the binocular vision-based target positioning method of any one of claims 1 to 6.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410382382.2A (granted as CN117974810B) | 2024-04-01 | 2024-04-01 | Target positioning method and device based on binocular vision and electronic equipment |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410382382.2A (granted as CN117974810B) | 2024-04-01 | 2024-04-01 | Target positioning method and device based on binocular vision and electronic equipment |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN117974810A | 2024-05-03 |
| CN117974810B | 2024-06-21 |
Family
ID=90866073
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410382382.2A (granted as CN117974810B, active) | Target positioning method and device based on binocular vision and electronic equipment | 2024-04-01 | 2024-04-01 |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN117974810B (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6018561A (en) * | 1998-07-27 | 2000-01-25 | Siemens Corporate Research, Inc. | Mask boundary correction in a cone beam imaging system using simplified filtered backprojection image reconstruction |
| WO2022073427A1 (en) * | 2020-10-10 | 2022-04-14 | 达闼机器人有限公司 | Visual positioning method and apparatus for object grabbing point, and storage medium and electronic device |
| CN115619790A (en) * | 2022-12-20 | 2023-01-17 | 北京维卓致远医疗科技发展有限责任公司 | A hybrid perspective method, system and device based on binocular positioning |
| CN116205978A (en) * | 2023-02-22 | 2023-06-02 | 中冶赛迪信息技术(重庆)有限公司 | Method, device, equipment and storage medium for determining mapping image of three-dimensional target object |
| CN117576652A (en) * | 2024-01-19 | 2024-02-20 | 福思(杭州)智能科技有限公司 | Road object identification method and device, storage medium and electronic equipment |
- 2024-04-01: application CN202410382382.2A filed; granted as CN117974810B (status: Active)
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6018561A (en) * | 1998-07-27 | 2000-01-25 | Siemens Corporate Research, Inc. | Mask boundary correction in a cone beam imaging system using simplified filtered backprojection image reconstruction |
| WO2022073427A1 (en) * | 2020-10-10 | 2022-04-14 | 达闼机器人有限公司 | Visual positioning method and apparatus for object grabbing point, and storage medium and electronic device |
| CN115619790A (en) * | 2022-12-20 | 2023-01-17 | 北京维卓致远医疗科技发展有限责任公司 | A hybrid perspective method, system and device based on binocular positioning |
| CN116205978A (en) * | 2023-02-22 | 2023-06-02 | 中冶赛迪信息技术(重庆)有限公司 | Method, device, equipment and storage medium for determining mapping image of three-dimensional target object |
| CN117576652A (en) * | 2024-01-19 | 2024-02-20 | 福思(杭州)智能科技有限公司 | Road object identification method and device, storage medium and electronic equipment |
Non-Patent Citations (1)
| Title |
|---|
| QIANG Hua: "Design of a disordered grasping system for a robotic arm based on digital twin", Machine Design and Manufacturing Engineering, 15 February 2023 (2023-02-15) * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN117974810B (en) | 2024-06-21 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110572630B (en) | Three-dimensional image shooting system, method, device, equipment and storage medium | |
| CN112819896B (en) | Sensor calibration method and device, storage medium and calibration system | |
| US8433157B2 (en) | System and method for three-dimensional object reconstruction from two-dimensional images | |
| US8208029B2 (en) | Method and system for calibrating camera with rectification homography of imaged parallelogram | |
| CN113330486A (en) | Depth estimation | |
| US20120177283A1 (en) | Forming 3d models using two images | |
| US20170085864A1 (en) | Underwater 3d image reconstruction utilizing triple wavelength dispersion and camera system thereof | |
| CN110111388A (en) | Three-dimension object pose parameter estimation method and visual apparatus | |
| EP2897101A1 (en) | Visual perception matching cost on binocular stereo images | |
| US12469201B2 (en) | Illumination rendering method and apparatus, and electronic device and storage medium | |
| US11240477B2 (en) | Method and device for image rectification | |
| CN112907656A (en) | Robot position detection method, detection device, processor and electronic equipment | |
| CN115685160A (en) | Target-based lidar and camera calibration method, system, and electronic equipment | |
| US20190147616A1 (en) | Method and device for image rectification | |
| CN110260801A (en) | Method and apparatus for measuring volume of material | |
| CN117848234A (en) | Object scanning mechanism, method and related equipment | |
| CN117974810B (en) | Target positioning method and device based on binocular vision and electronic equipment | |
| CN111383262A (en) | Occlusion detection method, system, electronic terminal and storage medium | |
| CN113706543A (en) | Three-dimensional pose construction method and equipment and storage medium | |
| CN119832151A (en) | Open three-dimensional reconstruction method, automatic depth positioning method, equipment and robot | |
| KR102288237B1 (en) | Image association method, system and device | |
| EP3584767A1 (en) | Depth-based image processing method, processing device and electronic device | |
| WO2023109960A1 (en) | Three-dimensional scanning processing method and apparatus and three-dimensional scanning device | |
| JP4546155B2 (en) | Image processing method, image processing apparatus, and image processing program | |
| US7751613B2 (en) | Method for rapidly establishing image space relation using plane filtering constraint |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |