Machine Learning – Multivariable Linear Regression


Introduction

Multivariable linear regression is a common machine learning algorithm, and once you are comfortable with the basics it is a great place to go next. If you haven’t read the previous article about Simple Linear Regression, I recommend starting there.

What is Multivariable Linear Regression?

Multivariable linear regression, in simple terms, is a statistical way of measuring the relationship between multiple input variables and an outcome. For example, as time increases, so does cost.

[Figure: Multivariable linear regression example 1]

Why does multivariable linear regression matter? In real life there usually isn’t just one variable that predicts a value; more often, several variables predict it together. Simply put, it lets you predict the future!

Variable vs Feature

In machine learning, you may hear the term “feature” used often. Feature and variable are often used interchangeably. Let’s look at an example. Take an apple: what are its basic features?


The apple is:

  • Red
  • Round
  • Has a stem

Feature Selection

Feature selection is nearly a field of its own. It is the process of choosing the features that best predict the y value. Here are a few tips when trying to select features:

  1. The less correlated the features are with each other, the better – you can check this with the correlation coefficient (see the sketch after this list)
  2. Features must describe the value being predicted
  3. Features must be related to the value being predicted
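
As a quick illustration of tip 1, here is a minimal sketch in Python (assuming NumPy is installed; the feature values are taken from the house dataset used later in this article):

import numpy as np

# Two candidate features: square footage and number of bedrooms
square_footage = np.array([1300, 1300, 1500, 1500])
bedrooms = np.array([2, 3, 2, 3])

# Pearson correlation coefficient between the two features.
# Values near 0 suggest the features carry independent information.
correlation = np.corrcoef(square_footage, bedrooms)[0, 1]
print(correlation)  # 0.0 for this tiny example, so the features are uncorrelated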

The Math

If you recall the simple linear regression formula, y = mx + b, you may notice that the multivariable formula is similar. The basic formula is:

y = m1x1 + m2x2 + b

or another way to write this is:

y = w1x1 + w2x2 + b

  • y – the predicted value
  • w1x1 – the first feature, x1, multiplied by its weight, w1
  • w2x2 – the second feature, x2, multiplied by its weight, w2
  • b – the bias
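
To make the formula concrete, here is a minimal sketch in plain Python (the weight and bias values are hypothetical, chosen only for illustration):

def predict(x1, x2, w1, w2, b):
    # y = w1 * x1 + w2 * x2 + b
    return w1 * x1 + w2 * x2 + b

# Hypothetical weights and bias, purely for illustration
print(predict(x1=4.0, x2=2.0, w1=3.0, w2=5.0, b=1.0))  # 3*4 + 5*2 + 1 = 23.0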

Implement the Math

Let’s say that we are given the following dataset:

House Value (y)    Square Footage (x1)    Number of Bedrooms (x2)
$141,000           1,300                  2
$151,000           1,300                  3
$163,000           1,500                  2
$174,000           1,500                  3

Let’s also say that we have a house with:

  • 3 bedrooms
  • 2,005 square feet

What is the house value?

First, we figure out the slope between feature one, the square footage, and the house value, y.

[Figure: House value vs. square footage]

Holding the number of bedrooms constant, the slope is ($163,000 - $141,000) / (1,500 - 1,300) = $110 per square foot for the 2-bedroom homes and ($174,000 - $151,000) / (1,500 - 1,300) = $115 per square foot for the 3-bedroom homes. Averaging the two gives a slope of $112.50 per square foot.

Next, we figure out the slope between feature two, the number of bedrooms, and the house value, y.

[Figure: House value vs. number of bedrooms]

Holding the square footage constant, adding a bedroom adds $151,000 - $141,000 = $10,000 at 1,300 square feet and $174,000 - $163,000 = $11,000 at 1,500 square feet. Averaging the two gives a slope of $10,500 per bedroom added to the house.
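
To make the two slope calculations concrete, here is a minimal sketch in plain Python that reproduces them from the table by comparing rows where the other feature is held constant (the variable names are my own, not from the article):

# Slope per square foot: compare rows that differ only in square footage
slope_2_bedrooms = (163_000 - 141_000) / (1500 - 1300)  # 110.0 for the 2-bedroom homes
slope_3_bedrooms = (174_000 - 151_000) / (1500 - 1300)  # 115.0 for the 3-bedroom homes
slope_per_sqft = (slope_2_bedrooms + slope_3_bedrooms) / 2  # 112.5

# Slope per bedroom: compare rows that differ only in bedroom count
slope_at_1300 = (151_000 - 141_000) / (3 - 2)  # 10,000 at 1,300 square feet
slope_at_1500 = (174_000 - 163_000) / (3 - 2)  # 11,000 at 1,500 square feet
slope_per_bedroom = (slope_at_1300 + slope_at_1500) / 2  # 10,500.0

print(slope_per_sqft, slope_per_bedroom)  # 112.5 10500.0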

Plug In the Values

Using the same formula as found above, y = w1x1 + w2x2 + b, we now plug the values into the formula.

  1. Plug in feature one – the square footage slope and the 2,005 square feet
    1. y = $112.50 * 2,005 + w2x2 + b
  2. Plug in feature two – the number of bedrooms slope and the 3 bedrooms
    1. y = $112.50 * 2,005 + $10,500 * 3 + b
  3. Finally, plug in the bias – which in our case is $0
    1. y = $112.50 * 2,005 + $10,500 * 3 + 0
  4. Complete the math
    1. y = $257,062.50
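
Putting the whole prediction into code, here is a minimal sketch in plain Python using the slopes worked out above and the article’s assumption that the bias is $0:

w1 = 112.50    # dollars per square foot
w2 = 10_500.0  # dollars per bedroom
b = 0.0        # bias, assumed to be $0 in this example

x1 = 2005  # square footage of the house we want to price
x2 = 3     # number of bedrooms

y = w1 * x1 + w2 * x2 + b
print(f"${y:,.2f}")  # $257,062.50

Note that in practice a library such as scikit-learn would estimate the weights and the bias jointly from the data using least squares, so its fitted bias (and therefore its prediction) can differ from this hand calculation.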

Conclusion

From this article and video, you should now understand what multivariable linear regression is, what the math looks like, and how to apply it to a simple problem. Please leave any comments that could help improve this post or video for future learners.
