
WO2021188104A1 - Object pose estimation and defect detection - Google Patents

Object pose estimation and defect detection

Info

Publication number
WO2021188104A1
WO2021188104A1 PCT/US2020/023452 US2020023452W WO2021188104A1 WO 2021188104 A1 WO2021188104 A1 WO 2021188104A1 US 2020023452 W US2020023452 W US 2020023452W WO 2021188104 A1 WO2021188104 A1 WO 2021188104A1
Authority
WO
WIPO (PCT)
Prior art keywords
mask
model
image
detection
pose
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2020/023452
Other languages
English (en)
Inventor
Qian Lin
Augusto Cavalcante VALENTE
Deangeli Gomes NEVES
Guilherme Augusto Silva MEGETO
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hewlett Packard Development Co LP
Original Assignee
Hewlett Packard Development Co LP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hewlett Packard Development Co LP filed Critical Hewlett Packard Development Co LP
Priority to PCT/US2020/023452 priority Critical patent/WO2021188104A1/fr
Publication of WO2021188104A1 publication Critical patent/WO2021188104A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • G06T7/001Industrial image inspection using an image reference approach
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75Determining position or orientation of objects or cameras using feature-based methods involving models
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/54Extraction of image or video features relating to texture
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects
    • G06V20/653Three-dimensional objects by matching three-dimensional models, e.g. conformal mapping of Riemann surfaces
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • Three-dimensional (3D) objects may be manufactured using any of a wide variety of manufacturing processes, including 3D printing, machining, casting, molding, extrusion, and the like. Defects may arise in the manufacturing process.
  • 3D printed objects may have excess material (e.g., support material) that remained on the object.
  • a portion of a manufactured object may break off during manufacturing.
  • Figure 1 illustrates an example flow diagram of an overview of a system implementing object detection, pose estimation, and defect detection.
  • Figure 2 illustrates example image blocks of 3D models of the same object rendered in different virtual backgrounds in different poses.
  • Figure 3 illustrates an example table of the relative accuracy of machine learning models trained using five different approaches.
  • Figure 4 illustrates an example of an augmented autoencoder training pipeline for training each 3D model in multiple poses using multiple different virtually rendered backgrounds.
  • Figure 5 illustrates an example of locations on a sphere at which different poses of the 3D model are rendered.
  • Figure 6 illustrates an example flow diagram for augmented autoencoder generation of a codebook of pose information.
  • Figure 7 illustrates an example flow diagram for detecting a pose of an object in an image by matching it with a pose from the codebook.
  • Figure 8 illustrates an example flow diagram for detecting a defect based on a difference between a detection mask and model mask exceeding a mask deviation threshold.
  • Figure 9 illustrates examples of captured images, masks, and overlays, according to various examples and variations of the systems and methods described herein.
  • Figure 10 illustrates a block diagram of a system with examples of various subsystems and/or modules for implementing the systems and methods described herein.
  • Figure 11A illustrates a flow chart of an example method to detect a defect in a physical object.
  • Figure 11B illustrates another flow chart of another example method to render a graphical user interface with overlaid masks.
  • Manufactured objects may have defects introduced during the manufacturing process. For example, manufactured parts may have excess material that should not be present, missing material that was not correctly manufactured, missing material that has broken off post-manufacturing, swelling, material collapse, texture deformities, cracks, and/or other defects. Parts may be manually inspected by humans to ensure that they are correctly manufactured. Manual inspection can be a tedious and time-consuming process. Moreover, a human inspector may not accurately detect the defects. In instances in which a manufacturing process generates many different objects (e.g., a 3D printer that prints any one of a number of different parts), it may be very difficult for a human inspector to remember the shape and texture that each object should have.
  • a human inspector may leverage a computerized system, such as a coordinate measuring machine (CMM) or 3D scanning system, to generate a 3D image (e.g., a computer-aided drafting (CAD) file) to detect defects in the shape and/or texture of manufactured parts.
  • the human inspector may manually rotate, resize, and align the 3D image of the manufactured object to compare it with a model version. While this partially automated approach may be more accurate, it is time-consuming, requires expensive and complex CMM or 3D scanning systems, and still requires a human inspector.
  • the presently described systems and methods allow for defect detection using an image of an object captured with a relatively inexpensive camera.
  • a machine learning model is used to detect which object from a set of objects is captured in the image.
  • the system accesses a corresponding digital model of the object that is defect-free.
  • the system uses the machine learning model to identify a pose of the object in the captured image and renders the corresponding 3D model of the object in the identified pose.
  • the system identifies defects (e.g., shape and/or texture defects) and/or generates an image or other visualization for comparing the manufactured object and the 3D model.
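  • For illustration only, the following Python sketch outlines this overall flow; the detector, pose estimator, renderer, and model-library objects are hypothetical stand-ins for the subsystems described below, not an implementation prescribed by this description.

```python
import numpy as np

def inspect(image, model_library, detector, pose_estimator, renderer):
    """Hypothetical end-to-end sketch: detect, estimate pose, render, compare."""
    # 1. Identify which digital 3D model from the library matches the imaged object.
    model_id, bbox = detector.detect(image)
    model_3d = model_library[model_id]

    # 2. Estimate the pose of the identified object in the captured image.
    pose = pose_estimator.estimate(image, bbox, model_id)

    # 3. Render the defect-free 3D model at the estimated pose.
    rendering = renderer.render(model_3d, pose)

    # 4. Compare a mask of the imaged object with a mask of the rendering.
    detection_mask = detector.segment(image, bbox)   # boolean array
    model_mask = rendering.silhouette()              # boolean array
    deviation = np.logical_xor(detection_mask, model_mask).mean()
    return {"model_id": model_id, "pose": pose, "deviation": float(deviation)}
```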
  • Machine learning is a technique where a machine learning model is trained to perform a task based on a set of examples.
  • Some examples of machine learning such as deep learning, may utilize artificial neural networks.
  • some deep learning-based object detection models may be trained using tens, hundreds, thousands, or tens of thousands of examples per unique object. Manually annotating each of these examples with object location and pose may be time-consuming, especially for pixel-level detection (e.g., for segmentation).
  • a processor-based pose estimation system may utilize digital three-dimensional (3D) models of objects to train a neural network, such as a convolutional neural network (CNN), to detect a manufactured 3D object captured as part of an image that also includes a real-world background.
  • a system may detect defects in manufactured objects by identifying and distinguishing the object from a set of objects, detecting or estimating the pose of the identified object, and then comparing the imaged object with a model of the object in the same pose.
  • the system may identify differences in the 3D object relative to the digitally rendered 3D model as defects. In some instances, the system may identify a defect when the differences exceed a deviation threshold.
  • a system may include a training subsystem to train a machine learning model with each of a plurality of digital three-dimensional (3D) models.
  • a camera may capture an image of a 3D object (e.g., a 3D printed object or object manufactured using another manufacturing technique).
  • the object detection subsystem may receive the captured image that includes a 3D object and a background.
  • the object detection subsystem may utilize a machine learning model to determine which 3D model of the set of trained 3D models corresponds to the imaged 3D object.
  • the system may further include a pose estimation subsystem to estimate a pose of the identified 3D object in the image.
  • a 3D model of the identified object in the estimated pose may be retrieved from a database and/or rendered in real-time at the estimated pose.
  • a mask subsystem may generate a detection mask of the manufactured 3D object.
  • the mask subsystem may generate a model mask of a rendering of the identified 3D model at the estimated pose.
  • a defect visualization subsystem may render an overlay image of the detection mask and the model mask.
  • the overlay image may comprise a scaled overlay image in which the model mask is scaled for size to correspond to the detection mask of the imaged 3D object.
  • the scaled overlay image may visually emphasize (e.g., highlight, color, annotate, mark, label, etc.) the differences between the detection mask and the model mask.
  • the system may identify the differences at the pixel level and visually emphasize them when they collectively exceed a mask deviation threshold.
  • the system may use a 70% mask deviation threshold that is applicable to an entire mask, a segment of the mask, or a region of the mask. The percentage used for the mask deviation threshold may be adjusted to achieve a target or acceptable rate of false defect detection.
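  • As a minimal sketch (not prescribed by this description), a per-region deviation check over boolean masks could look like the following; the 4×4 grid and the 70% value are illustrative assumptions that may be tuned to the target false-detection rate.

```python
import numpy as np

def region_deviation_exceeded(detection_mask, model_mask, threshold=0.7, grid=(4, 4)):
    """Return True if the mask disagreement in any region exceeds the deviation threshold.

    Both masks are boolean arrays of the same shape; the 4x4 grid and the 70%
    threshold are illustrative values, adjustable to tune false-detection rates.
    """
    h, w = detection_mask.shape
    region_h, region_w = h // grid[0], w // grid[1]
    for i in range(grid[0]):
        for j in range(grid[1]):
            det = detection_mask[i * region_h:(i + 1) * region_h,
                                 j * region_w:(j + 1) * region_w]
            mod = model_mask[i * region_h:(i + 1) * region_h,
                             j * region_w:(j + 1) * region_w]
            union = np.logical_or(det, mod).sum()
            if union == 0:
                continue  # region contains neither mask; nothing to compare
            deviation = np.logical_xor(det, mod).sum() / union
            if deviation > threshold:
                return True
    return False
```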
  • the training subsystem may generate synthetic training data to train the machine learning model using virtual renderings of each of the 3D models in multiple different virtual environments in multiple different poses.
  • the training subsystem generates a codebook of feature vectors for each trained view of each of the 3D models (e.g., scale, size, and/or pose).
  • the shape defect detection subsystem may determine a shape defect in the 3D object based on a difference between the detection mask and the model mask exceeding a mask deviation threshold.
  • a texture defect detection subsystem may detect a texture defect on the 3D object via, for example, a semantic segmentation mask or other learning model-based texture comparison approach.
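  • One hedged illustration of such a learning model-based texture comparison follows; it assumes a (hypothetical) semantic segmentation model labels texture classes in the captured image and that the same class map can be rendered from the textured 3D model at the estimated pose.

```python
import numpy as np

def texture_defect_regions(predicted_classes, expected_classes, object_mask):
    """Flag pixels whose predicted texture class disagrees with the expected class.

    `predicted_classes` is assumed to come from a semantic segmentation model run on
    the captured image; `expected_classes` is the class map rendered from the textured
    3D model at the estimated pose; comparison is restricted to the object silhouette.
    """
    disagreement = (predicted_classes != expected_classes) & object_mask
    defect_ratio = disagreement.sum() / max(int(object_mask.sum()), 1)
    return disagreement, float(defect_ratio)
```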
  • Some portions of the systems and methods described herein may be implemented by a processor in a computer system executing instructions stored in a non-transitory computer-readable medium.
  • the instructions may be physically, logically, or conceptually divided into discrete modules or submodules to implement specific functions.
  • a training module may include instructions to train a machine learning model with each of a plurality of digital 3D models, as described herein.
  • An object detection module may include instructions to identify a digital 3D model from a set of digital 3D models that corresponds to an image of a 3D object using, for example, the trained machine learning model.
  • a pose estimation module may include instructions to estimate a pose of the 3D object in the image.
  • a mask generation module may include instructions to generate a detection mask and/or a model mask, as described herein.
  • the computer-readable medium may further include overlay modules, defect detection modules, comparison modules, and the like to detect defects and/or generate visualizations of overlaid masks to emphasize differences between 3D objects and the corresponding 3D models.
  • modules, systems, and subsystems are described herein as implementing functions and/or as performing actions. In many instances, modules, systems, and subsystems may be divided into sub-modules, sub-subsystems, or sub-portions of other subsystems. Modules, systems, and subsystems may be implemented in hardware, software, and/or combinations thereof.
  • Figure 1 illustrates an example flow diagram 100 for object detection and defect detection.
  • Object detection 110 may be implemented by, for example, an object detection subsystem analyzing an input image, localizing the object, and matching it to a 3D model in a library of possible 3D models.
  • Pose estimation 120 may be implemented by, for example, a pose estimation subsystem to estimate a six-degree pose of an object by comparing the image of the object with a 3D model of the corresponding object.
  • Pose estimation 120 may, for example, be performed through an analysis of an encoded vector of the image of the physical object and a pose codebook for the identified digital 3D model.
  • the system may implement texture defect detection 130 and/or shape defect detection 140.
  • Shape and/or texture defect detection 130 and 140 may be implemented according to any of the various examples explicitly described herein, in accordance with the systems and methods described in the applications incorporated herein by reference, and/or in accordance with any combination of other defect and difference detection approaches.
  • Figure 2 illustrates example image blocks of 3D models of the same object rendered in different virtual backgrounds in different poses.
  • the object is rendered in the first block 200 in a first pose 251 with a first virtual background.
  • the same object may be rendered in the second block 210 in a second pose 252 with a different virtual background.
  • Two additional poses 253 and 254 of the object are illustrated in blocks 220 and 230 with third and fourth unique backgrounds.
  • synthetic samples of the 3D model rendered in virtually rendered background environments may be realistically modeled, such as by including shadows, occlusions, self-occlusions, textured objects, photorealistic effects, and the physical implications of collision and gravity.
  • Realistic modeling may mitigate the challenges of object detection due to the domain shift between synthetic data and the real world.
  • Any of a wide variety of real-time 3D creation software may be utilized to achieve realistic renderings.
  • Real-time 3D creation software may be used to render each object in a set of objects in multiple poses in multiple rendered environments.
  • multiple objects may be rendered in the same environment. Alternatively, each rendering may include only one object from the set of objects.
  • a learning model for detecting the objects may be trained using the rendered 3D models in the different poses and in different environments. Clutter and occlusion make object detection and six-degree-of-freedom (6-DOF) object pose estimation challenging. Object symmetry, pose ambiguity, variations in lighting, and other environmental and object-specific peculiarities can increase the difficulty in obtaining accurate results.
  • Figure 3 illustrates an example table 300 of the relative accuracy of machine learning models trained using five different approaches.
  • in a first approach, a learning model is trained using two-dimensional (2D) images of a set of standardized objects, such as the T-LESS set of objects.
  • the bounding box and segmentation accuracy of a learning model trained using the first approach is less than 1%.
  • a training dataset is created by cutting the objects from a black background and pasting them over a random background. As illustrated, the detection accuracy of the learning model trained using the dataset from the second approach exhibits little or no improvement.
  • each of the set of objects is rendered as a 3D model (e.g., as a digital, computer-aided drafting (CAD) model) and pasted over random 2D backgrounds. Rendering the 3D models results in significant improvements. Specifically, the detection accuracy of the learning model trained using the dataset with rendered 3D models is slightly above 40% for both bounding box and segmentation analyses.
  • the dataset is created using full 3D scenes with each object digitally rendered as a 3D model and placed in a real 3D background (i.e., a captured image of a real scene).
  • the synthetic dataset includes a rendering of each object in one virtual 3D scene.
  • each object in the set of objects is rendered as a 3D model within two virtually rendered environments.
  • the fully synthetic dataset includes renderings of each object in randomly selected virtually rendered scenes.
  • the detection accuracy of a learning model trained with a fully synthetic dataset in which both the models and the backgrounds are virtually rendered increases markedly to above 80%.
  • the example percentages in FIG. 3 are intended to illustrate the relative performance of the various approaches but may not correspond to actual simulations or test results.
  • each 3D model in the set of 3D models may be rendered any number of times in any number of poses and in any number of randomly selected or specifically selected backgrounds, each of which is digitally rendered. Variations in lighting, object self-occlusion, the inclusion of multiple objects in a single image, and/or other variations in the 3D renderings may be included to improve the robustness of the trained learning model.
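  • A simplified sketch of such synthetic data generation is shown below; `render_view` is a hypothetical stand-in for whatever real-time 3D creation software is used, the pose sampling uses SciPy, and the sample counts and depth range are illustrative assumptions.

```python
import random
import numpy as np
from scipy.spatial.transform import Rotation

def random_pose(depth_range=(0.3, 1.0)):
    """Uniform random rotation plus a random camera-to-object distance (illustrative)."""
    rotation = Rotation.random().as_matrix()       # 3x3 rotation matrix
    depth = float(np.random.uniform(*depth_range))
    return rotation, depth

def build_synthetic_dataset(models, backgrounds, views_per_model=1000):
    """Render each 3D model in many random poses over random virtual backgrounds.

    `render_view` is a hypothetical stand-in for a real-time 3D creation tool that
    returns the composited image plus pixel-level mask and bounding-box annotations.
    """
    samples = []
    for model_id, mesh in models.items():
        for _ in range(views_per_model):
            pose = random_pose()
            scene = random.choice(backgrounds)           # virtually rendered environment
            rgb, mask, bbox = render_view(mesh, pose, scene)
            samples.append({
                "image": rgb,        # training input
                "mask": mask,        # segmentation annotation, free from rendering
                "bbox": bbox,        # bounding-box annotation
                "label": model_id,   # which object this is
                "pose": pose,        # ground-truth pose
            })
    return samples
```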
  • the system may discretize the rotational representation space (e.g., Euler angles, quaternions, etc.) and train deep neural networks (DNNs) to identify the closest pose.
  • the learning model may be part of or utilize a DNN model.
  • the training module may utilize a convolutional neural network (CNN) or another artificial neural network (ANN), such as feedforward neural networks, recurrent neural networks, and the like.
  • the system may utilize the Augmented Autoencoder (AAE) approach to estimate the pose.
  • the AAE approach may include training an autoencoder to reconstruct the object of interest from an input image (e.g., exclude the background).
  • Figure 4 illustrates an example of an augmented autoencoder training pipeline 400 for training each 3D model in multiple poses using multiple different virtually rendered backgrounds.
  • a model is rendered, at 410, in a random pose over a random background that is rendered at 420.
  • nonrealistic renderings of the models may be used within nonrealistic renderings of backgrounds.
  • realistic renderings of the models may be used within nonrealistic renderings of backgrounds.
  • nonrealistic renderings of the models may be used within realistic renderings of backgrounds.
  • realistic renderings of the models are used within realistic renderings of the backgrounds. While real backgrounds may be utilized, it can be difficult to realistically render 3D models within real backgrounds because, for example, the rendering system may have to deduce the location of lighting sources and reflections. In contrast, virtually rendered 3D models can be accurately reproduced with textures that match the virtually rendered lighting and reflections of the virtually rendered background.
  • the system may add, at 430, augmentation to the virtually rendered 3D model and virtually rendered background.
  • the rendered 3D model within the virtual background may include added noise, applied transformation operations, occlusion, color variations, geometric (affine) transformations, and/or other flaws expected to be present in captured images.
  • training datasets may be augmented to account for the possibility or expectation of black-and-white or grayscale images, vignetting, moire, chromatic aberrations, or other imaging issues.
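  • As one possible (torchvision-based) illustration of these input augmentations, the specific transforms and parameter values below are assumptions for the sketch rather than requirements of this description.

```python
from torchvision import transforms

# Illustrative augmentation pipeline applied to rendered training images.
augment = transforms.Compose([
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
    transforms.RandomGrayscale(p=0.1),           # account for grayscale captures
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1), scale=(0.9, 1.1)),
    transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.5, scale=(0.02, 0.2)),  # simulate occlusion
])
```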
  • the system may encode, at 440, the input image.
  • the system may generate “latent code” that is used to generate a codebook for each view in combination with a rotation matrix and the original object bounding box (i.e., the bounding box prior to crop and/or resize).
  • the system may decode, at 450, the output of the encoder to produce an image that, when compared with the originally rendered 3D model, yields a reconstruction loss 460.
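  • The following PyTorch sketch shows one way such an encoder/decoder and reconstruction loss might be wired together; the layer sizes, 128×128 input resolution, and mean-squared-error loss are illustrative assumptions. The key point is that the reconstruction target is the clean render rather than the augmented input.

```python
import torch
from torch import nn

class AugmentedAutoencoder(nn.Module):
    """Compact convolutional autoencoder; layer sizes are illustrative."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(128 * 16 * 16, latent_dim),        # assumes 128x128 inputs
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (128, 16, 16)),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        latent = self.encoder(x)
        return self.decoder(latent), latent

def training_step(model, optimizer, augmented_batch, clean_batch):
    """One reconstruction step: the target is the clean render, not the augmented input."""
    reconstruction, _ = model(augmented_batch)
    loss = nn.functional.mse_loss(reconstruction, clean_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```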
  • Figure 5 illustrates an example of locations as a point cloud 500 on a sphere (e.g., “camera locations”) at which different poses of the 3D model 501 are rendered.
  • renderings of the 3D model at poses corresponding to each of the points in the point cloud 500 may be encoded to generate a robust codebook of triplet values of latent code, the rotational matrix, and the original object bounding box.
  • the number of rendered poses that are encoded increases the robustness of the autoencoder with respect to translation, rotation, occlusion, noise, and other augmentation of imaged objects.
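  • One simple way to obtain roughly uniform camera locations on a sphere is a Fibonacci (golden-angle) spiral, sketched below; this particular sampling scheme and view count are assumptions, and in-plane rotations of each view may be added separately.

```python
import numpy as np

def fibonacci_sphere(n_views=2562):
    """Roughly uniform camera positions on a unit sphere (illustrative count)."""
    i = np.arange(n_views)
    golden = (1 + 5 ** 0.5) / 2
    z = 1 - 2 * (i + 0.5) / n_views             # evenly spaced heights
    radius = np.sqrt(1 - z ** 2)
    theta = 2 * np.pi * i / golden              # golden-angle spacing in azimuth
    return np.stack([radius * np.cos(theta), radius * np.sin(theta), z], axis=1)
```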
  • Figure 6 illustrates an example flow diagram 600 for augmented autoencoder generation of a codebook of pose information.
  • the system may render, at 610, the object with N predefined poses, where N is an integer value.
  • the encoder, at 620, generates latent code that is retrieved, at 630, and stored, at 640, to generate a codebook of information for each pose of each rendered 3D model.
  • the codebook may include triplet values of the latent code, the rotation matrix, and the original object bounding box for each pose of each rendered 3D model.
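  • A compact sketch of codebook construction is shown below, reusing the encoder from the autoencoder sketch above; the data layout is an assumption for illustration, and normalizing the latent codes makes the later cosine-similarity lookup a simple dot product.

```python
import numpy as np
import torch

@torch.no_grad()
def build_codebook(autoencoder, rendered_views):
    """Encode each rendered pose and store (latent code, rotation matrix, bounding box).

    `rendered_views` yields (image_tensor, rotation_matrix, bbox) triples rendered
    offline for one 3D model; latent codes are unit-normalized so that cosine
    similarity later reduces to a dot product.
    """
    latents, rotations, bboxes = [], [], []
    for image, rotation, bbox in rendered_views:
        code = autoencoder.encoder(image.unsqueeze(0)).squeeze(0)
        latents.append(torch.nn.functional.normalize(code, dim=0))
        rotations.append(np.asarray(rotation))
        bboxes.append(np.asarray(bbox))
    return {
        "latent": torch.stack(latents),   # N x latent_dim, unit norm
        "rotation": np.stack(rotations),  # N x 3 x 3
        "bbox": np.stack(bboxes),         # N x 4 (x, y, w, h)
    }
```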
  • Figure 7 illustrates an example flow diagram 700 for detecting a pose of an object in an image by matching it with a pose from the codebook.
  • a detector 710 may receive an image of a manufactured object, such as a 3D printed object, with a real background (e.g., the object placed on a table or other surface). The system detects which model in the set of models corresponds to the imaged object. The image of the object may be resized and/or cropped, at 720. The imaged object may be encoded by the encoder 730. The output of the encoder for the detected object may be compared, at 740, with entries of the codebook 750 for the detected model. For example, a pose detection subsystem may compute a difference or similarity between the output of the encoder for the object and the codebook entries for the detected model using a cosine distance.
  • the pose detection subsystem may, for example, estimate a depth value from the scale ratio between the detected bounding box and the codebook scale.
  • the bounding box center may be used to estimate the vertical and horizontal translation.
  • the system may generate similarity vectors, at 760, to determine, at 770, the pose that provides the closest match.
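  • The matching and translation recovery might be sketched as follows; the pinhole-style depth and translation formulas are a simplified assumption of the scale-ratio and bounding-box-center reasoning described above, and the encoder is the one from the autoencoder sketch.

```python
import numpy as np
import torch

@torch.no_grad()
def estimate_pose(autoencoder, codebook, crop, detected_bbox,
                  focal_length, image_center, render_distance):
    """Match the encoded detection against the codebook and recover rotation + translation.

    `crop` is the resized object crop, `detected_bbox` is (x, y, w, h) in the full image,
    `render_distance` is the camera distance used when the codebook views were rendered;
    the depth and translation formulas below are a simplified pinhole-camera sketch.
    """
    code = autoencoder.encoder(crop.unsqueeze(0)).squeeze(0)
    code = torch.nn.functional.normalize(code, dim=0)
    similarity = codebook["latent"] @ code           # cosine similarity (unit-norm codes)
    best = int(torch.argmax(similarity))

    rotation = codebook["rotation"][best]
    codebook_bbox = codebook["bbox"][best]

    # Depth from the scale ratio between the codebook view and the detected bounding box.
    depth = render_distance * codebook_bbox[2] / detected_bbox[2]

    # In-plane translation from the bounding-box center.
    cx = detected_bbox[0] + detected_bbox[2] / 2
    cy = detected_bbox[1] + detected_bbox[3] / 2
    tx = (cx - image_center[0]) * depth / focal_length
    ty = (cy - image_center[1]) * depth / focal_length
    return rotation, np.array([tx, ty, depth]), float(similarity[best])
```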
  • Figure 8 illustrates an example flow diagram 800 for detecting a defect based on a difference between a detection mask and model mask exceeding a mask deviation threshold.
  • a mask subsystem may generate, at 810, a detection mask of an imaged object.
  • the detection mask of the imaged object may resemble a silhouette of the imaged object with the background removed.
  • the mask subsystem may generate (e.g., render), at 820, a model mask of the rendered object in the estimated pose of the imaged object. Assuming accurate mask generation and pose estimation, the detection mask and the model mask are the same when the object does not have any manufacturing defects.
  • the system may determine, at 830, a difference between the detection mask and the model mask.
  • the system may implement a dense alignment process to scale the model mask and the detection mask to matching relative sizes.
  • the differences may be displayed, at 840, as part of a visual display or heatmap with overlaid images of the detection mask and the model mask.
  • the visualization may include overlaid color versions of the masks (e.g., as a heatmap) that visually emphasize the differences between the detection mask and the model mask.
  • the model mask may be overlaid as an outline or partially transparent silhouette mask overlaid on the image of the object.
  • the differences may be colored, annotated, or otherwise emphasized.
  • the system may perform a mathematical comparison with a threshold value, such as a mask deviation threshold 850, to make a binary decision of “defect” or “no defect.”
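  • A minimal OpenCV/NumPy sketch of this comparison is shown below; the bounding-box-based rescaling is a simple stand-in for the dense alignment step, the masks are assumed to be non-empty uint8 {0, 1} arrays, and the threshold value is illustrative.

```python
import cv2
import numpy as np

def compare_masks(detection_mask, model_mask, deviation_threshold=0.7):
    """Scale the model mask to the detection mask, overlay, and decide defect / no defect.

    Masks are non-empty uint8 {0, 1} arrays; bounding-box rescaling stands in for the
    dense alignment step, and the threshold is an illustrative value.
    """
    # Crop each mask to its bounding box and rescale the model mask to the detection size.
    ys, xs = np.where(detection_mask > 0)
    det_crop = detection_mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    ys, xs = np.where(model_mask > 0)
    mod_crop = model_mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    mod_scaled = cv2.resize(mod_crop, (det_crop.shape[1], det_crop.shape[0]),
                            interpolation=cv2.INTER_NEAREST)

    # Pixel-level disagreement and a normalized deviation score.
    difference = np.logical_xor(det_crop > 0, mod_scaled > 0)
    deviation = difference.sum() / max(np.logical_or(det_crop, mod_scaled).sum(), 1)

    # Heatmap-style overlay (BGR): detection in red, model in green, overlap appears mixed.
    overlay = np.zeros((*det_crop.shape, 3), dtype=np.uint8)
    overlay[det_crop > 0] = (0, 0, 255)
    overlay[mod_scaled > 0] += np.array((0, 255, 0), dtype=np.uint8)

    return overlay, float(deviation), deviation > deviation_threshold
```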
  • Figure 9 illustrates examples of captured images, masks, and overlays, according to various examples and variations of the systems and methods described herein.
  • Block 902 illustrates an example of an image of an object with a real background. The background is omitted from block 902 for clarity but could include, for example, a 3D-printer surface, a table, a conveyor belt, an assembly line, other parts, and/or other objects and surfaces.
  • the image in block 902 may be processed according to any of the various examples described herein to detect the object and estimate the pose of the object.
  • Block 904 includes an example of a detection mask of the object.
  • Block 906 includes an example of a model mask of the 3D model identified as corresponding to the detected object in the estimated pose of the detected object.
  • Block 908 illustrates a difference between the detection mask and the model mask, without dense alignment for size scaling.
  • Block 910 includes a heatmap of the differences between the detection mask and the model mask, again without dense alignment for size scaling.
  • Block 912 illustrates emphasized (e.g., colored or shaded) regions of the overlaid detection and model masks identified as defects. Based on a visual comparison of the imaged object in block 902 and the model mask in block 906, it is readily apparent that the defect 990 (block 902) is not accurately identified in block 912 due to the lack of dense alignment and size scaling.
  • Block 914 illustrates a difference between the detection mask and the model mask after dense alignment and size scaling.
  • Block 916 includes a heatmap of the differences between the detection mask and the model mask after dense alignment and size scaling.
  • Block 918 illustrates emphasized (e.g., colored or shaded) regions of the overlaid detection and model masks identified as defects.
  • a visual comparison of the imaged object in block 902 and the model mask in block 906 makes it readily apparent that the defect 990 (block 902) is accurately identified in block 918.
  • Block 920 illustrates an outline of the model mask in block 906 overlaid on the object in the original image of block 902.
  • in block 920, the model mask does not include dense alignment for size scaling.
  • in block 922, the model mask includes dense alignment and is size-scaled to improve the accuracy of the alignment of the outline of the model mask overlaid on the object in the original image of block 902.
  • without the dense alignment and size scaling, the defect is incorrectly emphasized.
  • the dense alignment of the model mask improves the accuracy of the defect detection, and so the defect is correctly emphasized in block 922.
  • FIG. 10 illustrates a block diagram of a system 1000 with examples of various subsystems and/or modules for implementing the systems, subsystems, and methods described herein.
  • the system 1000 includes a processor 1030, memory 1040, network interface 1050, and a computer-readable medium or hardware electronic subsystems 1070 connected via a bus 1020.
  • the system 1000 may include a training subsystem 1081 to train a machine learning model with each 3D model in a set of 3D models.
  • training may be performed externally to the system 1000, in which case a trained learning model may be included instead of a training subsystem 1081.
  • the system 1000 includes an object detection subsystem 1082 to identify, using the trained machine learning model, a 3D object (e.g., a manufactured object, such as a 3D-printed object) as corresponding to one of the plurality of 3D models used to train the machine learning model.
  • the system 1000 includes a pose estimation system 1083 to estimate a pose of the 3D object identified in the image by the object detection subsystem 1082.
  • the system 1000 includes a virtual rendering subsystem 1084 to render the 3D model in the estimated pose for mask generation.
  • the virtual rendering subsystem 1084 may also render 3D models in various poses with various rendered backgrounds for training the machine learning model.
  • the system 1000 may include a mask subsystem 1085 to generate a detection mask of the manufactured 3D object.
  • the mask subsystem may also generate a model mask of a rendering of the identified 3D model at the estimated pose.
  • the model mask may be densely aligned and size-scaled relative to the detection mask, as described above, to increase the accuracy of defect detection.
  • the system 1000 may include a defect visualization subsystem 1086 to render a scaled overlay image of the detection mask and the model mask.
  • the defect visualization subsystem 1086 may visually emphasize the differences between the detection mask and the model mask.
  • the system 1000 may include a shape defect detection subsystem 1087 to detect shape defects via, for example, a comparison of the masks and a determination that the difference exceeds a mask deviation threshold.
  • a texture defect detection subsystem 1088 may detect a texture defect via, for example, a semantic segmentation mask or other learning model-based texture comparison approach.
  • Figure 11A illustrates a flow chart of an example method to detect a defect in a physical object.
  • the system may identify, at 1104, a physical object in a captured image as corresponding to a digital three-dimensional (3D) model in a set of digital 3D models.
  • the system may utilize a trained machine learning model to identify the object in the image.
  • the system may estimate, at 1106, a pose of the physical object via an analysis of an encoded vector of the image of the physical object and a pose codebook for the identified digital 3D model.
  • the system may render, at 1108, a 3D mesh of the digital 3D model at the estimated pose.
  • the system may detect, at 1110, a defect in the physical object based on analysis of the image of the physical object compared to an analysis of the rendered 3D mesh.
  • Figure 11B illustrates another flow chart of another example method to render a graphical user interface with overlaid masks, similar to the flowchart of Figure 11A.
  • a pre-trained machine learning model may be used to detect an object (or objects) and/or estimate a pose (or poses) of the object(s).
  • a training subsystem may be used to train, at 1100, a machine learning model for object detection.
  • the machine learning model may be trained, at 1102, using a fully synthetic dataset of rendered 3D models in rendered 3D backgrounds.
  • the system may also train a second machine learning model for pose estimation.
  • the second machine learning model for pose estimation may be trained using fully synthetic data as well.
  • the system may identify, at 1104, a physical object in a captured image as corresponding to a digital three-dimensional (3D) model in a set of digital 3D models utilizing the trained machine learning model for object detection.
  • the system may estimate, at 1106, a pose of the physical object via an analysis of an encoded vector of the image of the physical object and a pose codebook for the identified digital 3D model using the machine learning model trained for pose estimation.
  • the system may render, at 1108, a 3D mesh of the digital 3D model at the estimated pose.
  • the system may detect, at 1110, a defect in the physical object based on analysis of the image of the physical object compared to an analysis of the rendered 3D mesh.
  • the system may render, at 1112, a graphical user interface or image with overlaid detection and model masks.
  • the system may render, at 1112, a graphical user interface or image of a model mask overlaid on the object in the original image of the object.
  • the detection mask may additionally or alternatively be overlaid on the object in the original image of the object.
  • the realistic renderings of the 3D models and backgrounds are used to generate a realistic and fully synthetic dataset that can be used to train a machine learning model that is robust to illumination and background conditions.
  • the synthetically trained system can be robust to detect 3D models that match imaged objects in a wide variety of domains.
  • the synthetic dataset can be developed for particular target environments, such as indoor environments, outdoor environments, fixed or varied object textures, and/or other scene parameters.
  • many of the examples described herein for object identification and pose estimation can be useful for robotic applications in which a robotic member manipulates the object. For example, a system with a robotic arm may identify and estimate a pose of an object to facilitate precise manipulation of the object via the robotic arm.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

A machine learning model may be trained with virtual renderings of each three-dimensional (3D) model of a set of 3D models in different poses and with different backgrounds. The system may identify a 3D object in an image as corresponding to one of the 3D models. The system may estimate a pose of the identified 3D object in the image and generate a detection mask. The system may generate a model mask based on a rendering of the identified 3D model in the estimated pose. A defect visualization subsystem may render a scaled overlay image of the detection mask and the model mask. In some examples, the system may detect a defect based on a computed difference between the detection mask and the model mask.
PCT/US2020/023452 2020-03-18 2020-03-18 Object pose estimation and defect detection Ceased WO2021188104A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2020/023452 WO2021188104A1 (fr) 2020-03-18 2020-03-18 Object pose estimation and defect detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2020/023452 WO2021188104A1 (fr) 2020-03-18 2020-03-18 Object pose estimation and defect detection

Publications (1)

Publication Number Publication Date
WO2021188104A1 true WO2021188104A1 (fr) 2021-09-23

Family

ID=77771526

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2020/023452 Ceased WO2021188104A1 (fr) 2020-03-18 2020-03-18 Object pose estimation and defect detection

Country Status (1)

Country Link
WO (1) WO2021188104A1 (fr)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160148371A1 (en) * 2014-11-24 2016-05-26 Siemens Aktiengesellschaft Synthetic data-driven hemodynamic determination in medical imaging
KR20170001648A (ko) * 2015-06-26 2017-01-04 코그넥스코오포레이션 자동화된 산업 검사용 3d 비전 사용
US20170161590A1 (en) * 2015-12-07 2017-06-08 Dassault Systemes Recognition of a 3d modeled object from a 2d image
US20180211373A1 (en) * 2017-01-20 2018-07-26 Aquifi, Inc. Systems and methods for defect detection
US20190108396A1 (en) * 2017-10-11 2019-04-11 Aquifi, Inc. Systems and methods for object identification

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023085587A1 (fr) * 2021-11-11 2023-05-19 Lg Electronics Inc. Appareil d'intelligence artificielle et procédé de détection d'éléments appartenant à une classe « non vu » associé
US12430885B2 (en) 2021-11-11 2025-09-30 Lg Electronics Inc. Artificial intelligence apparatus and method for detecting unseen class items thereof
WO2023117317A1 (fr) * 2021-12-22 2023-06-29 Endress+Hauser Process Solutions Ag Procédé d'inspection automatisée d'un appareil de terrain
CN115311197A (zh) * 2022-06-21 2022-11-08 福建榕基软件股份有限公司 一种基于热力图的抠图效果评估方法和存储设备
CN116934687A (zh) * 2023-06-12 2023-10-24 浙江大学 一种基于半监督辅助学习语义分割的注塑制品表面缺陷检测方法
CN116934687B (zh) * 2023-06-12 2024-02-09 浙江大学 基于半监督学习语义分割的注塑制品表面缺陷检测方法
CN116721104A (zh) * 2023-08-10 2023-09-08 武汉大学 实景三维模型缺陷检测方法、装置、电子设备及存储介质
CN116721104B (zh) * 2023-08-10 2023-11-07 武汉大学 实景三维模型缺陷检测方法、装置、电子设备及存储介质
CN117274215A (zh) * 2023-10-08 2023-12-22 四川启睿克科技有限公司 一种基于记忆库重构的无监督缺陷检测方法
CN117557622A (zh) * 2023-10-13 2024-02-13 中国人民解放军战略支援部队航天工程大学 一种融合渲染信息的航天器位姿估计方法

Similar Documents

Publication Publication Date Title
WO2021188104A1 (fr) Object pose estimation and defect detection
CN109816049B (zh) 一种基于深度学习的装配监测方法、设备及可读存储介质
CN109615611B (zh) 一种基于巡检影像的绝缘子自爆缺陷检测方法
Liu et al. Identifying image composites through shadow matte consistency
CN112132213A (zh) 样本图像的处理方法及装置、电子设备、存储介质
CN109816725A (zh) 一种基于深度学习的单目相机物体位姿估计方法及装置
WO2014105463A2 (fr) Procédés et systèmes d'inspection visuelle automatisée améliorée d'un actif physique
JP7505866B2 (ja) 点検支援方法、点検支援システム、及び点検支援プログラム
CN115601430A (zh) 基于关键点映射的无纹理高反物体位姿估计方法及系统
CN119399098A (zh) 一种工件瑕疵检测方法、系统及计算机程序
CN119007209B (zh) 一种基于点云数据的自动目标检测预标注方法
CN109934873B (zh) 标注图像获取方法、装置及设备
CN114897982A (zh) 一种弱纹理物体的位姿估计方法及系统
CN113068017B (zh) 增强真实场景的视频通量
CN118351272A (zh) 一种基于扩散模型的异构形状注册方法
Yin et al. [Retracted] Virtual Reconstruction Method of Regional 3D Image Based on Visual Transmission Effect
Jin et al. DOPE++: 6D pose estimation algorithm for weakly textured objects based on deep neural networks
CN119693461A (zh) 一种基于图像分割的非开挖钻机钻杆的定位方法及系统
Zhou et al. RAD: A dataset and benchmark for real-life anomaly detection with robotic observations
Seiler et al. Synthetic Data Generation for AI-based Machine Vision Applications
CN118446982A (zh) 缺陷检测方法、装置、计算机设备及存储介质
CN117372521A (zh) 一种适用于对称物体的实时单目6d位姿估计方法及系统
Hodapp et al. Advances in Automated Generation of Convolutional Neural Networks from Synthetic Data in Industrial Environments.
CN115527008A (zh) 基于混合现实技术的安全模拟体验培训系统
JP2022123733A (ja) 検査装置、検査方法、およびプログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20925062

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20925062

Country of ref document: EP

Kind code of ref document: A1