Leveraging Depth for 3D Scene Perception
Zhao, 2024
- Document ID
- 15435703865125205500
- Author
- Zhao Y
- Publication year
- 2024
Snippet
Abstract: 3D scene perception aims to understand the geometric and semantic information of the surrounding environment. It is crucial in many downstream applications, such as autonomous driving, robotics, AR/VR, and human-computer interaction. Despite its …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30244—Information retrieval; Database structures therefor; File system structures therefor in image databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00362—Recognising human body or animal bodies, e.g. vehicle occupant, pedestrian; Recognising body parts, e.g. hand
- G06K9/00369—Recognition of whole body, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Liang et al. | Stereo matching using multi-level cost volume and multi-scale feature constancy | |
| Sharp et al. | Accurate, robust, and flexible real-time hand tracking | |
| Zioulis et al. | Omnidepth: Dense depth estimation for indoors spherical panoramas | |
| Xu et al. | Autoscanning for coupled scene reconstruction and proactive object analysis | |
| WO2019017985A1 (en) | Robust mesh tracking and fusion by using part-based key frames and priori model | |
| Zanfir et al. | Hum3dil: Semi-supervised multi-modal 3d human pose estimation for autonomous driving | |
| Pavlakos et al. | The one where they reconstructed 3d humans and environments in tv shows | |
| Luvizon et al. | Scene‐Aware 3D Multi‐Human Motion Capture from a Single Camera | |
| Guo et al. | A Survey of the State of the Art in Monocular 3D Human Pose Estimation: Methods, Benchmarks, and Challenges | |
| Khan et al. | A review of benchmark datasets and training loss functions in neural depth estimation | |
| Khan et al. | Towards monocular neural facial depth estimation: Past, present, and future | |
| Zhao | Leveraging Depth for 3D Scene Perception | |
| Lin et al. | Leveraging deepfakes to close the domain gap between real and synthetic images in facial capture pipelines | |
| Zhang et al. | Survey on controlable image synthesis with deep learning | |
| Bekhit | Computer Vision and Augmented Reality in iOS | |
| Colantoni et al. | When Dance Video Archives Challenge Computer Vision | |
| Koujan | 3D Face Modelling, Analysis and Synthesis | |
| Farahanipad | GAN-Based Domain Translation for Hand Pose Estimation and Face Reconstruction | |
| Fu | Long-term Object-based SLAM in Low-dynamic Environments | |
| Zhou | Towards Intelligent Embodied Perception for Indoor Agent | |
| Ranade | Inferring Shape and Appearance of Three-Dimensional Scenes--Advances and Applications | |
| Miu | Computer Vision with Machine Learning on Smartphones for Beauty Applications. | |
| Montserrat | Machine Learning-Based Multimedia Analytics | |
| Cai | Pushing the Boundaries of 3D Spatial Understanding | |
| Van Hoorick | Spatial Reasoning in Dynamic Scenes |