Reinforcement learning and self learning systems

THESIS TOPIC PROPOSAL

Institute: Budapest University of Technology and Economics
computer sciences
Doctoral School of Informatics

Thesis supervisor: György Szaszák
Location of studies (in Hungarian): Távközlési és Médiainformatikai Tanszék
Abbreviation of location of studies: TMIT

Description of the research topic:

Research objectives:
Originally inspired by game theory, reinforcement learning (RL) has increasingly penetrated other research fields. In information retrieval and recommendation systems, possibly coupled with interactive user interfaces, RL can be used to optimize rankings and/or the actions taken when communicating with the user. Q-learning, an RL technique that supports model-free learning, is commonly used to find an optimal action-selection rule (policy) for a process that satisfies the Markov assumption. This can be exploited in a wide range of machine learning tasks where no underlying model is available. Automatically learning a kind of rule set for the phenomenon or task to be modeled is an interesting alternative to (or an addition on top of) statistical modeling. For example, a system capable of understanding and executing natural-language commands, or a system carrying out a set of complex actions (robotics), could be applied to several machine learning and artificial intelligence (AI) tasks with little engineering (modeling) effort. Moreover, such systems are able to build a kind of “common sense” knowledge, an aspect of know-how in which humans still outperform most AI systems.

The goal of the research is therefore to develop new RL methods and apply them to real-life tasks outside game theory. Fields of choice include natural language processing, machine vision, robotics (including self-driving cars), information retrieval and trend analysis, with a strong emphasis on automatically extracting and interpreting semantic relations and meaning, and hence providing a kind of mental-conceptual representation as a by-product in the chosen research field.
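
As an illustration of the model-free policy learning mentioned above, the following is a minimal sketch of tabular Q-learning. The environment (a 1-D chain of states with a reward at the right end), the function name q_learning and all hyperparameter values are assumptions made for this example only; they are not part of the proposed research.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain environment.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the rightmost state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # Environment step (the agent never inspects these dynamics).
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Model-free update: only the sampled transition is used.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy for the non-terminal states (0 = left, 1 = right).
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(greedy_policy)
```

In this toy setting the learned greedy policy should move right in every non-terminal state. The agent reaches it purely from sampled transitions and rewards, without an explicit model of the environment.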

Some open problems:
- RL modeling (with Q-learning) in fields such as natural language processing, machine vision, self-driving cars, information retrieval and trend analysis.
- Lifelong learning and self-learning systems: the agent builds up increasingly complex skills as it continuously interacts with its environment.
- Comparison of model-based and model-free approaches.
- Employing complex exploration strategies (beyond epsilon-greedy / softmax exploration) suitable for sparse reward signals.
- Development of RL methods (reward definitions, risk-aware action selection) in security and safety critical scenarios.
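
For reference, the two baseline exploration strategies named in the list above (epsilon-greedy and softmax) can be sketched as follows. The function names, the temperature parameter and the example Q-values are assumptions chosen for illustration; the sketch only shows the baselines that the proposed research aims to go beyond.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probabilities(q_values, temperature=1.0):
    """Boltzmann distribution over actions; lower temperature is greedier."""
    top = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_action(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


example_q = [1.0, 2.0, 3.0]  # hypothetical action values
probs = softmax_probabilities(example_q)
greedy_choice = epsilon_greedy(example_q, epsilon=0.0)
sampled_choice = softmax_action(example_q, rng=random.Random(0))
```

Both strategies explore based only on the current value estimates. When rewards are sparse, those estimates stay uninformative for a long time, which is one reason the topic calls for more elaborate exploration.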

Required language skills: English
Further requirements:
Some experience in deep learning and neural networks; a mathematical background (probability theory, statistics and stochastic processes); C/C++, Python and other scripting languages.

Number of students who can be accepted: 1

Deadline for application: 2018-07-31