Wu et al., 2025 - Google Patents

How does the Machine Perceive Depth for Indoor Single Images with CNN?

Document ID: 17362311635719348373
Author: Wu Y; Heng Y; Niranjan M; Kim H
Publication year: 2025
Publication venue: Proceedings of the Computer Vision and Pattern Recognition Conference

Snippet

Depth estimation from a single image is a challenging problem in computer vision because binocular disparity or motion information is not available in the given input. Whereas impressive performances have been reported in this area recently using end-to-end trained …

Continue reading at openaccess.thecvf.com (PDF)

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
- G06F17/30247—Information retrieval; Database structures therefor; File system structures therefor in image databases based on features automatically derived from the image data
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration, e.g. from bit-mapped to bit-mapped creating a similar image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation

Similar Documents

Publication	Publication Date	Title
Pittaluga et al.	2019	Revealing scenes by inverting structure from motion reconstructions
EP3707676B1 (en)	2022-05-04	Method for estimating the installation of a camera in the reference frame of a three-dimensional scene, device, augmented reality system and associated computer program
Hu et al.	2019	Visualization of convolutional neural networks for monocular depth estimation
Huang et al.	2019	Indoor depth completion with boundary consistency and self-attention
Guerry et al.	2017	Snapnet-r: Consistent 3d multi-view semantic labeling for robotics
Rozantsev et al.	2015	On rendering synthetic images for training an object detector
US9501715B2 (en)	2016-11-22	Method for detecting salient region of stereoscopic image
Pezzementi et al.	2018	Putting image manipulations in context: robustness testing for safe perception
Lore et al.	2018	Generative adversarial networks for depth map estimation from RGB video
Protas et al.	2018	Visualization methods for image transformation convolutional neural networks
Frintrop et al.	2014	A cognitive approach for object discovery
KR101833943B1 (en)	2018-04-13	Method and system for extracting and searching highlight image
Fan et al.	2020	Shading-aware shadow detection and removal from a single image
EP3759649B1 (en)	2022-04-20	Object recognition from images using cad models as prior
Haji-Esmaeili et al.	2024	Large-scale monocular depth estimation in the wild
Zhang et al.	2019	An object counting network based on hierarchical context and feature fusion
Benn et al.	2012	Robot navigation control based on monocular images: an image processing algorithm for obstacle avoidance decisions
KR101592087B1 (en)	2016-02-04	Method for generating saliency map based background location and medium for recording the same
Fang et al.	2017	Learning visual saliency from human fixations for stereoscopic images
Wu et al.	2023	Depth Insight--Contribution of Different Features to Indoor Single-image Depth Estimation
Wu et al.	2025	How does the Machine Perceive Depth for Indoor Single Images with CNN?
CN115294162B (en)	2022-12-06	Target identification method, device, equipment and storage medium
Dubey et al.	2023	Guidance System for Visually Impaired Persons Using Deep Learning and Optical Flow
Yuan et al.	2018	RGB-D saliency detection: Dataset and algorithm for robot vision
He et al.	2015	A novel way to organize 3D LiDAR point cloud as 2D depth map height map and surface normal map