ODT - thesis topic proposal: Gyires-Tóth Bálint: Deep Reinforcement Learning in ...

Deep Reinforcement Learning in Complex Environments

THESIS TOPIC PROPOSAL

Institution: Budapest University of Technology and Economics
electrical engineering sciences
Doctoral School of Electrical Engineering

supervisor: Gyires-Tóth Bálint
location (Hungarian page): Department of Telecommunications and Media Informatics
location abbreviation: TMIT

Description of the research topic:

Deep neural networks have proven effective at solving particular problems when a large amount of high-quality training data is available. One of the main advantages of deep learning over other machine learning methods is that it learns representations jointly with the model. Training a deep learning model means numerically minimizing or maximizing an objective function, such as mean squared error or cross entropy. However, models trained this way are unable to develop varied strategies.
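As a minimal illustration of the two objective functions named above, the following Python sketch computes mean squared error for regression targets and cross entropy for a single classification example. The function names and argument layout are assumptions made for this illustration only; deep learning frameworks provide their own implementations.

```python
import math


def mean_squared_error(targets, predictions):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def cross_entropy(true_class, probabilities):
    """Negative log-probability that the model assigns to the correct class.

    `probabilities` is assumed to be a list of class probabilities summing to 1,
    with a non-zero entry at index `true_class`.
    """
    return -math.log(probabilities[true_class])
```

Both values shrink as predictions approach the targets, which is what gradient-based training exploits when it minimizes them.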
In contrast, reinforcement learning defines an environment, the possible actions in it, and the rewards and penalties assigned to those actions in different states, and from these it can develop novel strategies implicitly.
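The ingredients listed above (environment, actions, rewards and penalties per state) can be sketched with a toy example. The corridor environment, its reward values, and the hyperparameters below are all hypothetical choices for illustration, and tabular Q-learning stands in for the deep methods this topic targets.

```python
import random


class ChainEnvironment:
    """Hypothetical 5-state corridor: the agent starts at state 0, the goal is state 4."""

    N_STATES = 5
    ACTIONS = (0, 1)  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Walls at both ends keep the agent inside the corridor.
        self.state = min(max(self.state + move, 0), self.N_STATES - 1)
        done = self.state == self.N_STATES - 1
        # Reward for reaching the goal, small penalty for every other step.
        reward = 1.0 if done else -0.01
        return self.state, reward, done


def train_q_table(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative settings)."""
    rng = random.Random(seed)
    env = ChainEnvironment()
    q = [[0.0, 0.0] for _ in range(env.N_STATES)]
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(env.ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-valued action per state; ties resolve to action 1 (right)."""
    return [0 if left > right else 1 for left, right in q]
```

Calling `greedy_policy(train_q_table())` returns one action per state. No strategy is programmed in: the preference for a particular direction emerges only from the rewards and penalties the environment hands out.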
The deep learning paradigm has recently shown outstanding results in reinforcement learning. A number of deep reinforcement learning methods (such as Deep Deterministic Policy Gradient, Proximal Policy Optimization, and Asynchronous Advantage Actor-Critic) have emerged that converge faster than the baseline algorithms. Still, in complex environments deep reinforcement learning is unstable and regularly fails to find optimal or near-optimal action sequences. Moreover, even when proper action sequences can be learned, state-of-the-art optimization algorithms are likely to require an enormous amount of computational resources.
The goal of this Ph.D. research is to develop novel deep reinforcement learning algorithms that can efficiently learn optimal or near-optimal action sequences in complex environments. The effectiveness of the developed method(s) must be demonstrated in at least one application scenario. Possible application scenarios include (1) virtual environments, (2) small-scale real-world models, and (3) autonomous driving.
The research can be conducted both in English and in Hungarian.

The foreseen research tasks of the Ph.D. student are the following:
• Review the related scientific literature, including the basic building blocks of deep neural networks and the fundamentals and recent results of reinforcement learning, with a focus on deep reinforcement learning.
• Design and implement baseline deep reinforcement learning methods in different application scenarios. Develop at least one simulation environment.
• Conduct research on deep reinforcement learning algorithms in complex environments.
• Focus on scenarios that exhibit special characteristics and challenges prevalent in the autonomous driving domain.
• Propose novel deep reinforcement learning methods that produce flexible strategies and can be trained efficiently in complex environments.
• Demonstrate the effectiveness of the results in at least one application scenario.
• Perform objective and subjective evaluation.

required language skills: English
number of students who can be admitted: 1

Application deadline: 2019-01-07