How does the Machine Perceive Depth for Indoor Single Images with CNN?
Wu et al., 2025
- Document ID
- 17362311635719348373
- Author
- Wu Y
- Heng Y
- Niranjan M
- Kim H
- Publication year
- 2025
- Publication venue
- Proceedings of the Computer Vision and Pattern Recognition Conference
Snippet
Depth estimation from a single image is a challenging problem in computer vision because binocular disparity or motion information is not available in the given input. Whereas impressive performances have been reported in this area recently using end-to-end trained …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
- G06F17/30247—Information retrieval; Database structures therefor; File system structures therefor in image databases based on features automatically derived from the image data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration, e.g. from bit-mapped to bit-mapped creating a similar image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Pittaluga et al. | | Revealing scenes by inverting structure from motion reconstructions |
EP3707676B1 (en) | | Method for estimating the installation of a camera in the reference frame of a three-dimensional scene, device, augmented reality system and associated computer program |
Hu et al. | | Visualization of convolutional neural networks for monocular depth estimation |
Huang et al. | | Indoor depth completion with boundary consistency and self-attention |
Guerry et al. | | Snapnet-r: Consistent 3d multi-view semantic labeling for robotics |
Rozantsev et al. | | On rendering synthetic images for training an object detector |
US9501715B2 (en) | | Method for detecting salient region of stereoscopic image |
Pezzementi et al. | | Putting image manipulations in context: robustness testing for safe perception |
Lore et al. | | Generative adversarial networks for depth map estimation from RGB video |
Protas et al. | | Visualization methods for image transformation convolutional neural networks |
Frintrop et al. | | A cognitive approach for object discovery |
KR101833943B1 (en) | | Method and system for extracting and searching highlight image |
Fan et al. | | Shading-aware shadow detection and removal from a single image |
EP3759649B1 (en) | | Object recognition from images using cad models as prior |
Haji-Esmaeili et al. | | Large-scale monocular depth estimation in the wild |
Zhang et al. | | An object counting network based on hierarchical context and feature fusion |
Benn et al. | | Robot navigation control based on monocular images: an image processing algorithm for obstacle avoidance decisions |
KR101592087B1 (en) | | Method for generating saliency map based background location and medium for recording the same |
Fang et al. | | Learning visual saliency from human fixations for stereoscopic images |
Wu et al. | | Depth Insight--Contribution of Different Features to Indoor Single-image Depth Estimation |
Wu et al. | | How does the Machine Perceive Depth for Indoor Single Images with CNN? |
CN115294162B (en) | | Target identification method, device, equipment and storage medium |
Dubey et al. | | Guidance System for Visually Impaired Persons Using Deep Learning and Optical Flow |
Yuan et al. | | RGB-D saliency detection: Dataset and algorithm for robot vision |
He et al. | | A novel way to organize 3D LiDAR point cloud as 2D depth map height map and surface normal map |