NHS_intro_ML

Fundamentals of Machine Learning for Health and Care using R

Slides

Day  Slides
1    Introduction to Machine Learning
2    Linear Regression
3    Data Preparation
4    Feature Engineering
5    Classification
6    Classification using Tree Models
7    Introduction to Regularisation
8    Regression using Tree Models

Summary of the course

In this course we introduce the basic ideas and algorithms of supervised learning and implement them in the R programming language. After a brief theoretical overview of the so-called learning setting, the main focus is on the practical analysis and modelling of healthcare data.

Learning outcomes

Detailed Programme

Day 1: Introduction to Machine Learning

What is machine learning? Types of machine learning. Classification and regression. Training and test sets. Model evaluation. Over-fitting. Overview of machine learning algorithms. No free lunch theorem. Cross-validation.
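
As a flavour of the training/test and cross-validation ideas listed above, here is a minimal sketch in base R. It is not taken from the course slides: the built-in `mtcars` dataset, the 70/30 split, the choice of predictors and the helper `rmse` are all assumptions made for illustration.

```r
# Illustrative only: mtcars is a built-in R dataset, not course data.
set.seed(42)                                   # make the random split reproducible
n <- nrow(mtcars)
train_idx <- sample(n, size = round(0.7 * n))  # roughly 70% of rows for training
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

# Fit the model on the training set only
fit <- lm(mpg ~ wt + hp, data = train)

# Root mean squared error (hypothetical helper defined here)
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# A training error far below the test error is a sign of over-fitting
train_rmse <- rmse(train$mpg, predict(fit, newdata = train))
test_rmse  <- rmse(test$mpg,  predict(fit, newdata = test))
print(c(train = train_rmse, test = test_rmse))

# 5-fold cross-validation: each row is held out exactly once
k <- 5
folds <- sample(rep(1:k, length.out = n))      # random fold label per row
cv_rmse <- sapply(1:k, function(i) {
  fit_i <- lm(mpg ~ wt + hp, data = mtcars[folds != i, ])
  held_out <- mtcars[folds == i, ]
  rmse(held_out$mpg, predict(fit_i, newdata = held_out))
})
print(mean(cv_rmse))                           # average held-out error
```

With a dataset this small the single-split estimate varies a lot with the seed, which is one motivation for averaging over folds.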

Day 2: Linear Regression

Simple and multiple linear regression. Polynomial regression. Parameter estimates. Residual analysis. Metrics for model evaluation. Plots and predictions. Feature selection.
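
The topics above map onto a handful of base R functions. The following is a rough sketch rather than course material: it assumes the built-in `mtcars` dataset, and the predictors and the new observation passed to `predict` are made up for illustration.

```r
# Illustrative only: mtcars is a built-in R dataset, not course data.
simple_fit <- lm(mpg ~ wt, data = mtcars)           # one predictor
multi_fit  <- lm(mpg ~ wt + hp, data = mtcars)      # several predictors
poly_fit   <- lm(mpg ~ poly(wt, 2), data = mtcars)  # quadratic in wt

print(coef(multi_fit))                  # parameter estimates
print(summary(multi_fit)$r.squared)     # proportion of variance explained

# Residuals vs fitted values: look for curvature or changing spread
res <- residuals(multi_fit)
plot(fitted(multi_fit), res, xlab = "Fitted mpg", ylab = "Residual")
abline(h = 0, lty = 2)

# Prediction for a new, made-up observation
new_car <- data.frame(wt = 3, hp = 110)
print(predict(multi_fit, newdata = new_car))
```

Calling `summary(multi_fit)` prints the full coefficient table with standard errors and p-values, which is usually the starting point for interpreting a fitted model.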

Day 3: Data Preparation

Data analysis and pre-processing, exploratory data analysis, handling missing data.

Day 4: Feature Engineering

Feature engineering techniques including but not limited to: transformations, feature extraction, reduction and selection.

Day 5: Classification

Logistic Regression: why logistic regression; logistic function; simple logistic regression; multinomial logistic regression (tentative); ROC curve; feature interpretation; predictions using logistic regression.
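
To make the logistic regression topics concrete, here is a short base R sketch. It is an illustration under stated assumptions, not the course's own code: it uses the built-in `mtcars` dataset with transmission type (`am`) as a stand-in binary outcome, a 0.5 classification threshold, and a rank-based AUC calculation in place of a dedicated ROC package.

```r
# Illustrative only: mtcars is a built-in R dataset, not course data.
# Outcome: am (0 = automatic, 1 = manual); predictor: wt (weight)
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Exponentiated coefficients are odds ratios per unit increase in wt
print(exp(coef(fit)))

# Predicted probabilities, then classes at a 0.5 threshold (an assumption)
prob <- predict(fit, type = "response")
pred_class <- as.integer(prob > 0.5)
print(table(predicted = pred_class, actual = mtcars$am))

# Area under the ROC curve via the rank (Mann-Whitney) formula
y <- mtcars$am
n_pos <- sum(y == 1)
n_neg <- sum(y == 0)
auc <- (sum(rank(prob)[y == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
print(auc)
```

The AUC here is computed on the same data used to fit the model, so it is optimistic; in practice it would be estimated on a held-out test set.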

Day 6: Classification using Tree Models

Decision Trees: classification using decision trees; understanding and visualising decision trees; advantages and disadvantages of decision trees; predictions. Random Forests: from decision trees to random forests; training and tuning random forests; predictions.

Day 7: Regression using Tree Models

Using decision trees and random forests for regression. Variable importance.

Day 8: Introduction to Regularisation

Regularisation and over-fitting. Ridge penalty and LASSO penalty. Elastic Nets. Tuning regularised models.