Thesis topic proposal
Balázs Csanád Csáji
Reinforcement learning


Institute: Budapest University of Technology and Economics
mathematics and computing
Doctoral School of Mathematics and Computer Sciences

Thesis supervisor: Balázs Csanád Csáji
Location of studies (in Hungarian): SZTAKI (Institute for Computer Science and Control)
Abbreviation of location of studies: BME

Description of the research topic:

Reinforcement learning (RL) is one of the main branches of machine learning. It deals with the problem of learning from sequential interactions with an uncertain, dynamic environment based on feedback (e.g., states and immediate costs). Markov decision processes (MDPs) constitute the main mathematical background of RL. However, unlike in classical MDP studies, in RL the model of the system is typically unavailable; therefore, the dynamics and the costs have to be learned (estimated) while the decision maker tries to act efficiently. These two goals (exploring the environment and exploiting the information gathered so far) work against each other, leading to the fundamental problem of exploration vs. exploitation (estimation vs. control). Theoretical guarantees for classical RL methods, such as Q-learning and TD(lambda), are usually asymptotic and presuppose either a tabular representation of the value function or a linear function approximation. Novel challenges in RL include providing methods with non-asymptotic (and distribution-free) guarantees, handling partial observability and changing environments, as well as studying the notorious exploration-exploitation trade-off (even in simplified problems, such as multi-armed or contextual bandits). Distributed RL methods are another possible research direction. The theory of stochastic approximation (especially in Markovian environments) and various distribution-free statistical methods are of high importance for providing guarantees for RL.
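
As a rough illustration of the setting described above, the following is a minimal sketch of tabular Q-learning with epsilon-greedy exploration, written in the cost-minimizing convention used in this description. The environment (a five-state chain with a slipping move), all function names, and all parameter values are hypothetical choices made for this example only; they are not part of the proposed research.

```python
import random

# Hypothetical toy MDP (an assumption for illustration only):
# a chain of states 0..4, where state 4 is a terminal goal.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
SLIP = 0.1         # probability that the move goes the opposite way


def step(state, action, rng):
    """Sample one transition; the learner never sees SLIP directly."""
    if rng.random() < SLIP:
        action = 1 - action
    move = 1 if action == 1 else -1
    next_state = min(max(state + move, 0), GOAL)
    cost = 1.0  # every step has unit immediate cost
    return next_state, cost


def q_learning(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning on the toy chain, minimizing discounted cost."""
    rng = random.Random(seed)
    Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            if state == GOAL:
                break
            if rng.random() < epsilon:
                # Explore: try a uniformly random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: take the action with the lowest estimated cost.
                action = min(ACTIONS, key=lambda a: Q[state][a])
            next_state, cost = step(state, action, rng)
            # Stochastic-approximation update toward the one-step target.
            # Q[GOAL] is never updated, so the terminal value stays 0.
            target = cost + gamma * min(Q[next_state])
            Q[state][action] += alpha * (target - Q[state][action])
            state = next_state
    return Q


def greedy_policy(Q):
    """Return the cost-minimizing action in each state under Q."""
    return [min(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)]
```

Calling `greedy_policy(q_learning())` gives the learned action for each state. The parameter `epsilon` is the simplest possible handle on the exploration-exploitation trade-off: larger values gather more information about rarely tried actions at the price of higher incurred cost during learning. The sketch uses a constant step size `alpha`, whereas the classical asymptotic convergence results typically assume diminishing step sizes.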

Required language skills: English
Further requirements: 
Solid background in probability and statistics, programming skills (e.g., Matlab, Python)

Number of students who can be accepted: 1

Deadline for application: 2024-05-31

