ODT - thesis topic proposal: Kató Zoltán: Geometric Alignment and Fusion ...

Geometric Alignment and Fusion of Visual Objects

THESIS TOPIC PROPOSAL

Institution: Szegedi Tudományegyetem (University of Szeged)
computer science
Informatika Doktori Iskola (Doctoral School of Computer Science)

supervisor: Kató Zoltán
location (Hungarian page): SZTE
location abbreviation: SZTE

Description of the research topic:

The purely segment-level understanding of many current computer vision systems tells us little about where objects are located in 3D space and how agents like humans or vehicles could interact with them. However, recent work has focused on obtaining a geometric understanding of the scene in terms of the 3D volumes and surfaces that compose it. This representation enables reasoning about the objects as they exist in a 3D world, rather than simply in the image plane, and has been shown to have many applications in object detection, autonomous driving, navigation, SmartCity, and cultural heritage. Additionally, the recent development of depth cameras has made these geometric representations possible and has opened up exciting avenues for research on such imagery.

Today, different sensors and approaches are often combined to achieve a detailed, geometrically correct and properly textured 3D or 4D model of an object or a scene. Visual and non-visual sensor data are fused to cope with varying illumination, surface properties, motion and occlusion. This requires good calibration and registration of the modalities such as color images and depth data (LIDAR, hand-held scanners, Kinect, Time-of-Flight (ToF) cameras).

Heterogeneous 3D data registration is becoming a must in several applications, including autonomous navigation, mapping, and even cultural heritage use cases. The main challenge in this type of data fusion is the relaxation of the rigid-transformation constraints. Due to the different physical measurement principles of the different sensors, the resulting data undergoes a non-linear distortion, for which some un-distortion can help, but the overall consistency of the resulting 3D models may still suffer from problems like occlusion, reflectance, and shadow effects of surface unevenness. The core problem of visual sensor data fusion is to transform all images into a common coordinate frame. While an extensive amount of work has been done on this problem, the fundamental question of how to reliably and efficiently estimate the transformation relating two images (possibly of different modality) remains largely unsolved. Here we propose an approach aimed at directly solving this fundamental problem by modeling the transformation as a general continuous mapping approximated by a parametric model. We then propose a method for estimating the parameters of the transformation in a computationally efficient manner. Once the transformation has been estimated, we proceed to map one image onto the other, i.e. to perform visual data fusion.
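The three steps named above (choose a parametric model, estimate its parameters, map one data set onto the other) can be illustrated with a deliberately tiny sketch. It is not the proposed method: it assumes the simplest possible model, a uniform scale `s > 0` plus a translation `(tx, ty)` with no rotation, and it estimates the parameters from first- and second-order moments of two 2D point sets, so no point correspondences are needed. All function names and the toy data are hypothetical.

```python
import math

def apply_map(points, s, tx, ty):
    """Parametric mapping with parameters (s, tx, ty): uniform scale, then shift."""
    return [(s * x + tx, s * y + ty) for x, y in points]

def centroid(points):
    """First-order moment: the mean position of the point set."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def spread(points, c):
    """Second-order moment: mean squared distance of the points from c."""
    return sum((x - c[0]) ** 2 + (y - c[1]) ** 2 for x, y in points) / len(points)

def estimate_params(src, dst):
    """Estimate (s, tx, ty) from moments only; the order of points is irrelevant.

    Assumes dst is a scaled and shifted copy of src with s > 0 and no rotation.
    """
    c_src, c_dst = centroid(src), centroid(dst)
    # Scaling by s multiplies the mean squared distance by s**2.
    s = math.sqrt(spread(dst, c_dst) / spread(src, c_src))
    # The centroid is mapped like any other point: c_dst = s * c_src + t.
    return s, c_dst[0] - s * c_src[0], c_dst[1] - s * c_src[1]

# Toy data: a rectangle and its transformed copy, listed in a different order.
template = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
observed = list(reversed(apply_map(template, 1.5, 4.0, -2.0)))

s, tx, ty = estimate_params(template, observed)
aligned = apply_map(template, s, tx, ty)  # template mapped into the observed frame
```

The sketch only works because the assumed model is so restricted. The non-linear distortions discussed above need a richer parametric family, and estimating its parameters is correspondingly harder.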

Specific aims of the proposed work

Studying a general parametric mathematical model for characterizing the class of deformations/transformations between the images that need to be registered.

Developing computationally efficient techniques for estimating the transformation parameters from the given images. This is very challenging because the problem is inherently nonlinear. Existing methods are computationally intensive since they typically require established point correspondences and the solution of non-convex minimization problems.

Developing efficient visual data fusion techniques which can be used to merge various radiometric (e.g. RGB, infrared modalities) as well as geometric information (e.g. super-resolution of depth information) into a coherent 3D or 4D model of a scene.
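As a minimal illustration of the fusion aim, the sketch below attaches an RGB value from a color image to each 3D point of a depth scan. It assumes the simplest setting rather than the one targeted by this proposal: an ideal pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`) and a known rigid transformation (`R`, `t`) from the depth sensor frame to the camera frame. The function names, parameter values, and toy data are all hypothetical.

```python
def project(point, R, t, fx, fy, cx, cy):
    """Pinhole projection of a 3D point to pixel coordinates (u, v).

    Returns None if the point lies behind the camera.
    """
    # Rigid transform into the camera frame: p_cam = R * p + t
    xc, yc, zc = (sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))
    if zc <= 0:
        return None
    return fx * xc / zc + cx, fy * yc / zc + cy

def colorize(points, image, R, t, fx, fy, cx, cy):
    """Pair every 3D point that projects inside the image with its pixel color."""
    h, w = len(image), len(image[0])
    fused = []
    for p in points:
        uv = project(p, R, t, fx, fy, cx, cy)
        if uv is None:
            continue
        u, v = round(uv[0]), round(uv[1])  # nearest-pixel lookup
        if 0 <= u < w and 0 <= v < h:
            fused.append((p, image[v][u]))
    return fused

# Toy 5x5 image whose color encodes the pixel position: image[v][u] = (10u, 10v, 0).
image = [[(10 * u, 10 * v, 0) for u in range(5)] for v in range(5)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cloud = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 0.0, -1.0), (10.0, 0.0, 1.0)]

fused = colorize(cloud, image, identity, (0.0, 0.0, 0.0), 2.0, 2.0, 2.0, 2.0)
```

In this toy run the third point is dropped because it lies behind the camera and the fourth because it projects outside the image. Occlusion handling and non-rigid sensor distortions are not modeled.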

Important applications occur in surveillance, industry (visual inspection), autonomous driving, SmartCity (mapping and monitoring of roads and buildings), as well as cultural heritage (precise spatial and spectral documentation of cultural heritage objects and buildings). Environment monitoring or rescue operations typically rely on various sensors (e.g. lidar, infrared sensors), potentially mounted on moving robots/UAVs, which requires reliable fusion of these heterogeneous data. The proposed algorithms will be applied to one of these key application areas.

number of students that can be admitted: 2

Application deadline: 2022-03-15