Thesis topic proposal
 
Gábor Szűcs
Artificial intelligence research for visual-based knowledge exploration in multimodal environment

THESIS TOPIC PROPOSAL

Institute: Budapest University of Technology and Economics
computer sciences
Doctoral School of Informatics

Thesis supervisor: Gábor Szűcs
Location of studies (in Hungarian): Department of Telecommunications and Media Informatics
Abbreviation of location of studies: TMIT


Description of the research topic:

Many of our words have a visual origin, as we like to name everything we see, whether it is a specific object or a more abstract visual concept such as dark or light. In machine vision, we use artificial intelligence to teach systems to recognize different living things and objects; however, current methods are suited to identifying objects or concepts from a closed set (a fixed number of classes). The task of the student is to research and develop methods suited to learning under open-set conditions, i.e. methods that can also discover new types of objects or concepts while recognizing the known ones. This discovery can take place with or without human intervention. With human assistance, human-machine communication can be text-based, speech-based, or even visual gesture-based (in a multimodal environment). It is a research task to achieve new scientific results in both directions of communication. In the other direction, the person initiates the dialogue and asks questions about pictures or videos, which the machine has to answer. Research on visual question answering in multimodal environments has already begun, and deep neural networks can already analyse and recognize audio-visual content using RNN, LSTM, RBM, and CNN type networks, but the results are not yet satisfactory, so the research includes the further development and combination of these networks. The aim is to develop a suitable deep neural network architecture and to design a sequence-to-sequence deep neural network encoder and decoder (e.g. a beam search decoder) so that the artificial intelligence system has the ability to interpret the human question, find the appropriate knowledge elements, and give the answer in natural language.
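
To make the closed-set versus open-set distinction above concrete, the following is a minimal sketch of one common baseline: rejecting a prediction as "unknown" when the highest softmax probability falls below a threshold. It is an illustration only, not the method the proposal intends to develop. The function names, the class names, and the threshold value of 0.7 are all assumptions chosen for the example.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def open_set_predict(logits, class_names, threshold=0.7):
    """Return the best known class, or "unknown" if confidence is low.

    A closed-set classifier always returns one of class_names.
    This hypothetical open-set variant may instead flag the input
    as a candidate for a new, not yet learned category.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown"
    return class_names[best]


if __name__ == "__main__":
    names = ["cat", "dog", "bird"]  # assumed known classes
    print(open_set_predict([5.0, 0.0, 0.0], names))  # cat
    print(open_set_predict([1.0, 1.0, 1.0], names))  # unknown
```

Inputs flagged as "unknown" could then be passed to a human (via text, speech, or gesture) or to an automatic discovery step, which corresponds to the two modes of exploration described above.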

Required language skills: English
Further requirements: 
- English language skills
- knowledge of some area of artificial intelligence

Number of students who can be accepted: 1

Deadline for application: 2021-09-01

