2. Linear Regression
Objectives
Understand the basic concepts of linear regression.
Apply linear regression to real data.
Evaluate the performance of a linear regression model.
Interpret the results of a linear regression.
Expected time to complete: 4 hours
Linear regression is a fundamental model for regression problems, in which we aim to predict a continuous value (i.e. a quantitative response) from one or more explanatory variables. It assumes a linear relationship between the input variable(s) \(x\) or \(\mathbf{x}\) and the single output variable \(y\), so it can be used both to make predictions and to understand how the variables are related. In this chapter, we will learn the basic concepts of linear regression, apply it to real data, evaluate the performance of a fitted model, and interpret the results.
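Written out, the assumed linear relationship takes the following form. The symbols \(\beta_0\) (intercept), \(\beta_1, \dots, \beta_p\) (slopes), \(p\) (number of input variables) and \(\epsilon\) (error term) are notation assumed here for illustration:

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \epsilon \]

With a single input variable this reduces to \(y = \beta_0 + \beta_1 x + \epsilon\), the equation of a straight line plus an error term.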
We will mainly use the Advertising data to explain the key ideas underlying linear regression. The Advertising dataset contains the sales revenue of a single product in a single market, together with the advertising spend across multiple channels (TV, radio, newspaper). The goal is to predict sales revenue from the advertising spend.
Ingredients
Input: features of data samples
Output: target values of data samples
Model: fit a line (or plane/hyperplane) to the training data and, for each test sample, predict the value on the fitted line (or plane/hyperplane) at that sample's input
Hyperparameter(s): None
Parameters: the intercept(s) and slope(s) of the fitted line (or plane/hyperplane), also known as the bias(es) and weight(s), respectively
Loss function: the sum of squared vertical distances (residuals) between the training data points and the fitted line (or plane/hyperplane), which is minimised
Learning algorithm: closed-form analytical solution based on linear algebra
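For a single input variable, the closed-form solution listed above can be sketched in plain Python. This is a minimal illustration, assuming the loss is the sum of squared vertical distances (ordinary least squares). The function names and the toy spend/sales numbers are invented for this example and are not the Advertising data.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) that minimise the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Closed-form least-squares estimates for one input variable:
    #   slope = S_xy / S_xx,  intercept = y_mean - slope * x_mean
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def predict(intercept, slope, x):
    """Value on the fitted line at input x."""
    return intercept + slope * x


def sum_squared_residuals(intercept, slope, xs, ys):
    """The loss: total squared vertical distance from the points to the line."""
    return sum((y - predict(intercept, slope, x)) ** 2 for x, y in zip(xs, ys))


# Toy data, invented for illustration only (not the Advertising dataset).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

intercept, slope = fit_simple_linear_regression(spend, sales)
loss = sum_squared_residuals(intercept, slope, spend, sales)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, loss={loss:.3f}")
```

With several input variables the same idea is usually expressed in matrix form and solved with a linear algebra routine rather than by hand.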
System transparency
System logic
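Condition to produce a certain output: to produce an output \(y\), locate this \(y\) value on the fitted line (or plane/hyperplane) and then find the corresponding input \(x\) (or \(\mathbf{x}\)) value. With more than one input variable, many different inputs \(\mathbf{x}\) can give the same \(y\).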
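For a single input variable, locating a target output on the fitted line amounts to inverting the line equation. The sketch below assumes a line that has already been fitted. The intercept and slope values are hypothetical, chosen only to make the arithmetic easy to follow, and the function names are invented for this example.

```python
def predict(intercept, slope, x):
    """Output on the fitted line for input x."""
    return intercept + slope * x


def input_for_output(intercept, slope, y):
    """Input x at which the fitted line produces the output y."""
    if slope == 0:
        # A flat line gives the same output for every input, so no single x exists.
        raise ValueError("slope is zero; the output does not depend on the input")
    return (y - intercept) / slope


# Hypothetical fitted parameters (not estimated from real data).
intercept, slope = 1.0, 2.0

target_y = 7.0
x_needed = input_for_output(intercept, slope, target_y)

# Feeding the recovered input back through the model returns the target output.
print(x_needed, predict(intercept, slope, x_needed))
```

This direct read-off is part of what makes the model transparent: the parameters alone are enough to trace any output back to an input.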