Thesis supervisor: Márton Ispány
Location of studies (in Hungarian): University of Debrecen Faculty of Informatics
Abbreviation of location of studies: DE IK
Description of the research topic:
Syllabus
Development and investigation of new data mining models, and improvement of existing ones, that can be applied successfully in various fields of science. The optional subtopics include both supervised and unsupervised models. Supervised data mining models include, among others, regression models and regularization, kernel methods and radial basis functions, sparse kernel machines (SVM and RVM), neural networks, graphical models and Bayesian networks, and high-dimensional problems. Unsupervised data mining models include mixture models and the EM algorithm, clustering, Kohonen networks, dimension reduction, principal component analysis and singular value decomposition, non-negative matrix factorization, independent component analysis, and multidimensional scaling. The research topics also include the analysis of sequential data, particularly time series analysis. One important topic is the analysis of non-Gaussian time series, e.g. integer-valued time series. The developed models have to be tested on large datasets. Application areas include web and text mining, among others.
Bibliography
Bishop, C. M., Pattern Recognition and Machine Learning, Springer, 2006.
Hastie, T., Tibshirani, R., Friedman, J., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2009.
Feldman, R., Sanger, J., The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data, Cambridge University Press, 2006.
Liu, B., Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer, 2011.