Advances in data mining  

Traditional data mining techniques focus mainly on classification, regression and clustering problems. However, recent developments in ICT have led to the emergence of new kinds of massive data sets and related data mining problems. Consequently, the field of data mining has rapidly expanded to cover new areas of research, such as:

- processing huge (tera- or petabyte-scale) data sets,
- real-time analysis of data streams (internet traffic, sensor data, electronic transactions, etc.),
- searching for similar pairs of objects (texts, images, songs, etc.) in huge collections of such objects,
- finding anomalies in data,
- clustering of massive sets of records,
- recommendation systems,
- reduction of data dimensionality,
- applications of deep learning to data mining.

During the course you will learn several techniques, algorithms and tools for addressing these new and challenging data mining problems:

- Recommender systems: collaborative filtering, matrix factorization
- Algorithms for dimensionality reduction: LLE, t-SNE, UMAP
- Random Forest and XGBoost: two of the most popular tree-based algorithms for classification and regression
- Algorithms for detecting anomalies in data
- Locality Sensitive Hashing (LSH): a general technique for finding similar items in huge collections of items
- Algorithms for mining data streams: sampling, filtering (Bloom filters), probabilistic counting
- Applications of deep learning to data mining
- Distributed processing of massive data: Hadoop, MapReduce, Spark

Outcome: After completing the course, students should:

- know the most successful algorithms and techniques used in data mining;
- have gained hands-on experience with several algorithms for mining complex data sets;
- be able to apply the acquired knowledge and skills to new problems.
Mode of delivery: In person
Language: English

Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or HaDEA. Neither the European Union nor the granting authority can be held responsible for them. The statements made herein do not necessarily have the consent or agreement of the ASTRAIOS Consortium. These represent the opinion and findings of the author(s).