Multivariate analysis comprises a broad range of techniques for extracting information from large amounts of data, but it comes with an equally broad range of pitfalls. The main objectives of this course are to lower the barriers to multivariate analysis and to smooth the path towards building expertise. At the end of the course, participants will be able to select the appropriate method for different kinds of problems, analyse the data and correctly interpret the results.
INTENDED AUDIENCE AND PRIOR KNOWLEDGE
This course is intended for those who face large data tables in their daily practice and are not familiar with the application of multivariate methods, as well as for those who have already experimented with multivariate methods but do not feel confident interpreting multivariate graphs and numbers. Prior statistical knowledge is not required. The course is at master’s level. Most examples come from chemistry, but the course is also suited to participants from other fields.
The structure of this course mirrors the recommended structured approach to analysing big data sets: start with a thorough qualitative analysis (plotting the data, searching for correlations, outliers, …), then move on to model building, validate the model and quantify its predictive ability. Emphasis is placed on selecting and applying the appropriate multivariate method and on interpreting the results correctly. The theory is illustrated with many practical examples and alternates with computer exercises on real-life cases.
- Visualisation of big datasets
- Principal Component Analysis (PCA)
- Cluster analysis: searching for groups of similar samples
- Multiple Linear Regression (MLR) with uncorrelated variables
- Multiple Linear Regression (MLR) with correlated variables
- Stepwise regression
- The collinearity problem
- An overview of the pitfalls
- Principal Component Regression (PCR)
- Partial Least Squares (PLS)
- Interpretation of PCR and PLS models
- Validation of regression models
- Detection of outliers and non-linearities
- Prediction with regression models
- Some alternatives
- Feasibility study: does a quantitative analysis make sense?
- Classification (supervised pattern recognition): predicting class membership
- Linear Discriminant Analysis (LDA)
- Specific applications:
  - QSAR / QSPR (Quantitative Structure-Activity / Structure-Property Relationships)
  - Multivariate SPC (M-SPC)
  - Principal Properties Design
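
As an illustration of the structured approach described above (explore the data, build a model, validate it, quantify its predictive ability), the following minimal Python sketch runs through those steps on a synthetic, deliberately collinear data table. It is not course material: the data, the helper names (make_collinear_data, pca_explained_variance, pcr_fit_predict, loo_rmsecv) and the choice of leave-one-out cross-validation are assumptions made for this example, and only NumPy is used.

```python
import numpy as np


def make_collinear_data(n_samples=40, n_vars=6, seed=0):
    """Hypothetical data table: correlated columns driven by 2 latent factors."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, 2))
    loadings = rng.normal(size=(2, n_vars))
    X = latent @ loadings + 0.05 * rng.normal(size=(n_samples, n_vars))
    y = latent[:, 0] - 0.5 * latent[:, 1] + 0.05 * rng.normal(size=n_samples)
    return X, y


def pca_explained_variance(X):
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)                      # mean-centre the columns
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values only
    var = s ** 2
    return var / var.sum()


def pcr_fit_predict(X_train, y_train, X_test, n_components):
    """Principal Component Regression: regress y on the first PCA scores."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                      # loadings
    T = Xc @ P                                   # scores
    coef, *_ = np.linalg.lstsq(T, y_train - y_mean, rcond=None)
    return (X_test - x_mean) @ P @ coef + y_mean


def loo_rmsecv(X, y, n_components):
    """Leave-one-out cross-validated root mean squared error."""
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                 # leave sample i out
        pred = pcr_fit_predict(X[keep], y[keep], X[i:i + 1], n_components)
        errors[i] = y[i] - pred[0]
    return float(np.sqrt(np.mean(errors ** 2)))


X, y = make_collinear_data()

# Step 1: qualitative look at the data (here: how many components matter?)
explained = pca_explained_variance(X)
print("explained variance per component:", np.round(explained, 3))

# Steps 2-4: build PCR models, validate them, quantify predictive ability
rmsecv = {k: loo_rmsecv(X, y, k) for k in (1, 2, 3)}
for k, err in rmsecv.items():
    print(f"{k} component(s): RMSECV = {err:.3f}")
```

Comparing the cross-validated error across model sizes is one common way to choose the number of components; it is shown here only as an example of validating a model before using it for prediction.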