Thesis topic proposal
Deep learning techniques in automatic speech recognition


Institute: Budapest University of Technology and Economics
computer sciences
Doctoral School of Informatics

Thesis supervisor: Péter Mihajlik
Location of studies (in Hungarian): Department of Telecommunications and Media Informatics
Abbreviation of location of studies: TMIT

Description of the research topic:

Automatic Speech Recognition (ASR) is grounded in statistics and uses various Machine Learning (ML) techniques such as Hidden Markov Models, Maximum Likelihood Decision Trees, Weighted Finite State Transducers, etc. In the last decade, the introduction of deep learning revolutionized ASR and boosted its performance in an unprecedented manner. Development is still in progress, and end-to-end ASR built entirely on deep neural networks is now not only competitive with the classical ASR approaches but, in certain tasks, can even outperform human transcription. The applied deep learning techniques, however, demand huge amounts of data, which are not always easy to obtain, not to mention the computational resources needed. Therefore, it is essential to investigate how well deep learning ASR technologies developed for well-resourced languages can be scaled down for the benefit of less-resourced ones. Cross-lingual transfer learning, semi-supervised learning, and data augmentation methods are also to be explored in order to facilitate less-resourced and/or multilingual speech recognition.

Required language skills: English
Number of students who can be accepted: 2

Deadline for application: 2021-09-01
