Kató Zoltán
Beyond Point-based 3D Reconstruction and Visual Localization of Objects


Institution: University of Szeged (Szegedi Tudományegyetem)
Field of science: computer science
Doctoral school: Doctoral School of Informatics (Informatika Doktori Iskola)

Supervisor: Kató Zoltán
Location: SZTE

Description of the research topic:

The purely segment-level understanding produced by many current computer vision systems tells us little about where objects are located in 3D space and how agents such as humans or vehicles could interact with them. Recent work, however, has focused on obtaining a geometric understanding of the scene in terms of the 3D volumes and surfaces that compose it. This representation enables reasoning about objects as they exist in the 3D world, rather than merely in the image plane, and has been shown to have a myriad of applications in object detection, autonomous driving, navigation, SmartCity, and cultural heritage. Additionally, the recent development of depth cameras has made these geometric representations practical and has opened up exciting avenues for research on such imagery.

The goal of the proposed research is to develop new algorithms that exploit the imaging capabilities of modern 2D and 3D sensors and provide new visual information for 3D scene understanding. The project addresses three tasks that share the common goal of 3D object reconstruction and localization. We will address the fundamental problem of reconstructing and recognizing 3D data from heterogeneous visual sensors, in particular Lidar and ToF cameras, as well as passive stereo and multiview reconstruction from perspective and non-perspective imagery, using a novel patch-based methodology. We take a unified view of these methods and exploit all of these sensors by simultaneously using 3D (range) and 2D (radiometric) images.

Specific aims of the proposed work

Generalized Camera Calibration and Pose Estimation
This problem arises in the calibration of non-conventional optics and sensors (e.g. omnidirectional optics, Lidar). While standard projective camera calibration is extensively studied and has many working solutions, the calibration of camera systems composed of different sensors (e.g. Lidar, a traditional color camera, or an infrared camera) is far less studied. In such mixed environments, correspondence-free and target-less calibration is particularly important because, due to unusual optical distortions and differing sensory information, correspondences are difficult to establish. Furthermore, target-less calibration is essential when images taken at different times (e.g. a Lidar scan and an infrared image) need to be fused. A strongly related area is image-based navigation, which is becoming increasingly important with the widespread use of smartphones and UAVs.
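As a minimal, self-contained illustration of one ingredient of such multi-sensor calibration, the sketch below estimates the rigid transform between a hypothetical "Lidar" frame and a "camera" frame from corresponding 3D points, using the classical Kabsch/Procrustes algorithm. Note the assumption of known point correspondences is exactly what the target-less methods described above aim to avoid; this is an illustrative baseline, not the proposed method.

```python
import numpy as np

def kabsch(P, Q):
    """Rigid transform (R, t) minimizing sum ||R p_i + t - q_i||^2 (Kabsch)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                 # 3x3 cross-covariance of centered sets
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against a reflection solution
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cQ - R @ cP
    return R, t

# Synthetic check: points in a "Lidar" frame mapped into a "camera" frame
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 3))                  # points in the Lidar frame
a = 0.3
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.5, -1.0, 2.0])
Q = P @ R_true.T + t_true                     # same points in the camera frame

R_est, t_est = kabsch(P, Q)
assert np.allclose(R_est, R_true, atol=1e-8)
assert np.allclose(t_est, t_true, atol=1e-8)
```

In a real Lidar-camera setup the correspondences would be noisy and partly wrong, so such a closed-form step is typically wrapped in a robust estimator (e.g. RANSAC) or replaced by the correspondence-free formulations targeted by this project.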

3D reconstruction
The current mainstream approach to passive stereo reconstruction is based on projective geometry, which provides a full reconstruction framework. The reconstruction theory is well developed and tested for the central perspective camera, but it is less developed for other, nonlinear central camera models. Region-based methods have proved to be highly accurate and robust: given correspondences established between planar regions of the images, the normal and location of the observed patch can be retrieved directly. Given partial depth data of a scene, one can also use it as a prior for efficient image-based (stereo or multiview) 3D reconstruction to produce an accurate 3D scene representation. A key problem in stereo matching is occlusion, which can be formulated within an MRF model whose energy is minimized via expansion moves and graph cuts. While these algorithms are efficient and handle occlusions properly, they typically use a very generic smoothing prior. In this project, we want to introduce a new class of methods that goes beyond point correspondences, in which more complex priors (e.g. reconstructed planar patches or partial 3D measurements of the same scene) can be adopted within a single framework.
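The planar-patch relation mentioned above can be made concrete with the standard plane-induced homography: for a plane n·X = d in the first camera frame and relative motion X2 = R X1 + t, corresponding pixels satisfy x2 ~ K (R + t nᵀ/d) K⁻¹ x1. The numpy sketch below (all numbers synthetic, for illustration only) builds such a scene and verifies the forward relation; recovering n and d from an estimated H is the harder inverse problem that region-based reconstruction solves.

```python
import numpy as np

rng = np.random.default_rng(1)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])       # shared intrinsics (synthetic)

# Plane n.X = d in the first camera frame; sample 3D points lying on it
n = np.array([0.1, -0.2, 1.0]); n /= np.linalg.norm(n)
d = 5.0
XY = rng.uniform(-1.0, 1.0, size=(6, 2))
Z = (d - XY @ n[:2]) / n[2]               # solve n.X = d for the depth
X1 = np.column_stack([XY, Z])

# Second camera: X2 = R X1 + t
a = 0.2
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0,       1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.3, 0.0, 0.1])
X2 = X1 @ R.T + t

def project(K, X):
    x = X @ K.T
    return x[:, :2] / x[:, 2:]

x1, x2 = project(K, X1), project(K, X2)

# Homography induced by the plane transfers pixels from view 1 to view 2
H = K @ (R + np.outer(t, n) / d) @ np.linalg.inv(K)
x1h = np.column_stack([x1, np.ones(len(x1))])
x2_pred = x1h @ H.T
x2_pred = x2_pred[:, :2] / x2_pred[:, 2:]
assert np.allclose(x2_pred, x2, atol=1e-8)
```

Because H encodes (R, t, n, d) jointly, a region correspondence over a planar patch constrains both the patch geometry and the relative pose, which is what makes such patches usable as priors alongside point correspondences.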

Localization and Pose Tracking
Knowing the position and orientation of a camera or camera system mounted on a moving platform (e.g. a UAV, a car, or even a human) makes it possible to localize it in a 3D environment from camera-to-3D-world measurements. With the broad availability of 3D data (e.g. whole-city scans), such algorithms can be used to track the pose of a moving camera system or, alternatively, to identify the pose of an object seen by the camera in the 3D world. Such an approach can be useful for detecting important events that affect a large number of people, helping e.g. to organize rescue plans and to provide faster and more precise information to those affected. Important applications occur in surveillance, industry (visual inspection), autonomous driving, SmartCity (mapping and monitoring of roads and buildings), as well as cultural heritage (precise spatial and spectral documentation of cultural heritage objects and buildings). Environment monitoring and rescue operations typically rely on various sensors (e.g. Lidar, infrared sensors), potentially mounted on moving robots or UAVs, which requires reliable localization of objects or of the camera using these heterogeneous data. The proposed algorithms will be applied to one of these key application areas.
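As a minimal illustration of camera-to-3D-world localization, the sketch below recovers a camera's 3x4 projection matrix from known 2D-3D correspondences with the classical Direct Linear Transform (DLT) and verifies it by reprojection. All quantities are synthetic; a practical localizer against city-scale scans would add coordinate normalization, robust matching (e.g. RANSAC), and nonlinear refinement.

```python
import numpy as np

def dlt_projection(X, x):
    """Estimate P (3x4, up to scale) with x ~ P [X;1] from >= 6 correspondences."""
    A = []
    for Xw, (u, v) in zip(X, x):
        Xh = np.append(Xw, 1.0)
        # Two rows of the cross-product constraint x × (P Xh) = 0
        A.append(np.concatenate([np.zeros(4), -Xh, v * Xh]))
        A.append(np.concatenate([Xh, np.zeros(4), -u * Xh]))
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)            # null vector = stacked rows of P

# Synthetic camera and 3D "map" points
rng = np.random.default_rng(2)
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0,   0.0,   1.0]])
a = 0.1
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(a), -np.sin(a)],
              [0.0, np.sin(a),  np.cos(a)]])
t = np.array([0.2, -0.1, 4.0])
P_true = K @ np.column_stack([R, t])

X = rng.uniform(-1.0, 1.0, size=(8, 3))
Xh = np.column_stack([X, np.ones(len(X))])
xp = Xh @ P_true.T
x = xp[:, :2] / xp[:, 2:]                  # observed pixel coordinates

# Estimate and check by reprojecting the map points
P_est = dlt_projection(X, x)
xr = Xh @ P_est.T
x_reproj = xr[:, :2] / xr[:, 2:]
assert np.allclose(x_reproj, x, atol=1e-5)
```

Once P is known, K can be factored out to obtain the pose (R, t) of the camera in the world frame, which is the quantity tracked over time in the pose-tracking setting described above.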

Number of students who can be admitted: 3

Application deadline: 2020-05-31

All rights reserved © 2007, Országos Doktori Tanács (Hungarian Doctoral Council).