Zhao, 2024 - Google Patents

Leveraging Depth for 3D Scene Perception

Document ID: 15435703865125205500
Author: Zhao Y
Publication year: 2024

Snippet

Abstract: 3D scene perception aims to understand the geometric and semantic information of the surrounding environment. It is crucial in many downstream applications, such as autonomous driving, robotics, AR/VR, and human-computer interaction. Despite its …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00362—Recognising human body or animal bodies, e.g. vehicle occupant, pedestrian; Recognising body parts, e.g. hand
- G06K9/00369—Recognition of whole body, e.g. static pedestrian or occupant recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation

Similar Documents

Publication	Publication Date	Title
Liang et al.	2019	Stereo matching using multi-level cost volume and multi-scale feature constancy
Sharp et al.	2015	Accurate, robust, and flexible real-time hand tracking
Zioulis et al.	2018	Omnidepth: Dense depth estimation for indoors spherical panoramas
Xu et al.	2015	Autoscanning for coupled scene reconstruction and proactive object analysis
WO2019017985A1 (en)	2019-01-24	Robust mesh tracking and fusion by using part-based key frames and priori model
Zanfir et al.	2023	Hum3dil: Semi-supervised multi-modal 3d human pose estimation for autonomous driving
Pavlakos et al.	2022	The one where they reconstructed 3d humans and environments in tv shows
Luvizon et al.	2023	Scene‐Aware 3D Multi‐Human Motion Capture from a Single Camera
Guo et al.	2025	A Survey of the State of the Art in Monocular 3D Human Pose Estimation: Methods, Benchmarks, and Challenges
Khan et al.	2021	A review of benchmark datasets and training loss functions in neural depth estimation
Khan et al.	2022	Towards monocular neural facial depth estimation: Past, present, and future
Zhao	2024	Leveraging Depth for 3D Scene Perception
Lin et al.	2022	Leveraging deepfakes to close the domain gap between real and synthetic images in facial capture pipelines
Zhang et al.	2023	Survey on controlable image synthesis with deep learning
Bekhit	2021	Computer Vision and Augmented Reality in iOS
Colantoni et al.	2025	When Dance Video Archives Challenge Computer Vision
Koujan	2022	3D Face Modelling, Analysis and Synthesis
Farahanipad	2022	GAN-Based Domain Translation for Hand Pose Estimation and Face Reconstruction
Fu	2024	Long-term Object-based SLAM in Low-dynamic Environments
Zhou	2025	Towards Intelligent Embodied Perception for Indoor Agent
Ranade	2023	Inferring Shape and Appearance of Three-Dimensional Scenes--Advances and Applications
Miu	2022	Computer Vision with Machine Learning on Smartphones for Beauty Applications
Montserrat	2020	Machine Learning-Based Multimedia Analytics
Cai	2025	Pushing the Boundaries of 3D Spatial Understanding
Van Hoorick	2024	Spatial Reasoning in Dynamic Scenes