Prerequisites
Probability and Statistics.
Objectives
- Introduce Applied Statistics and its relevance in Data Science.
- Analyze real data using statistical methods to extract relevant information from it, and solve practical problems using statistical software.
- Know the advantages and limitations of various statistical methodologies, to make the most of them in solving real problems.
- Find statistical evidence in the data based on models fitted to the collected observations, and draw inferences about hypotheses of interest associated with the selected models.
- Solve a real problem using the knowledge accumulated in this course: computational project.
Program
1. Exploratory data analysis: (i) Introduction to R. (ii) Visualization of different types of data. (iii) Treatment of missing values. (iv) Outlier detection.
2. Dimensionality reduction: principal component analysis. Covariance and correlation matrices.
3. Regression models: Gaussian, logistic, Poisson. Variable selection. Diagnostic techniques. Model validation. Prediction.
4. Modeling independent data versus time-dependent data.
5. Resampling methods: jackknife, bootstrap, permutation tests and cross-validation.
6. Elements of the Bayesian methodology: prior representation (conjugate and non-informative distributions), inference via Bayes' theorem and applications to real data problems.
7. Classification: total probability of misclassification, Fisher linear discriminant analysis, Bayes classification rule. Evaluation of the performance of a classification rule.
Evaluation Methodology
A 1h30m test (50%), with a minimum grade of 8.0, and a computational project (50%).
Cross-Competence Component
- Critical and Innovative Thinking: carrying out the project involves strategic thinking, critical thinking, creativity, and problem-solving strategies; these are not explicitly evaluated.
- Intrapersonal Competencies: carrying out the project involves productivity and time management, stress management, proactivity and initiative, intrinsic motivation, and decision making; these are not explicitly evaluated.
- Interpersonal Skills: in assessing the project, 10% of the grade is given to the form of the report and 10% to the oral presentation and discussion of the project.
Laboratory Component
Laboratory work is carried out in R (or equivalent software).
Programming and Computing Component
The laboratory and project work involve R programming. This component accounts for 50% of the evaluation.
More information at: https://fenix.tecnico.ulisboa.pt/cursos/lerc/disciplina-curricular/845953938490004