4. Linear Regression In R programming language

Working of Linear Regression

We can better understand how linear regression works through an example: a dataset with two fields, Area and Rent, that is used to predict a house’s rent from its area. The dataset is:

area-image
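Because the table above is an image, here is a minimal sketch of how a dataset with the same two fields could be set up in R. The values and the name `housing` are invented for illustration and are not the data from the image.

```r
# Hypothetical stand-in data: values are invented, not taken from the article's table
housing <- data.frame(
  Area = c(500, 750, 1000, 1250, 1500),   # assumed unit: square feet
  Rent = c(900, 1300, 1200, 1700, 1900)   # assumed unit: rent per month
)

# Inspect the structure and basic statistics of the two fields
str(housing)
summary(housing)
```

Each row is one house, with Area as the predictor and Rent as the value to be predicted.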

As you can see, we are using a simple dataset for our example. Using this uncomplicated data, let’s have a look at how linear regression works, step by step:

1. With the available data, we plot a graph with Area on the X-axis and Rent on the Y-axis. The graph will look like the following. Notice that the points follow a roughly linear pattern with a slight dip.

rent-area

2. Next, we find the mean of Area and Rent.

mean-area-rent

3. We then plot the mean on the graph.

4. We draw a candidate line of best fit that passes through the mean.

rent-area

5. But we encounter a problem. As you can see below, multiple lines can be drawn through the mean: 

rent-area-multiple-lines

6. To overcome this problem, we keep adjusting the line until we find the one with the least squared distance from the data points. That line is the best fit line.

best-fit-line

7. The least-square distance is found by adding the squares of the residuals.

adding-square

8. Here, a residual is the distance between the actual value (Y-actual) and the predicted value (Y-pred).

rent-residual

9. The values of m and c for the best fit line, y = mx + c, can be calculated using these formulas:

value-m-c-1

10. This helps us find the corresponding values:

corresponding-values

11. With that, we can obtain the values of m & c.

value-m-c-3

12. Now, we can find the value of Y-pred.

pred-y.

13. After calculating, we find that the least-square value (the sum of the squared residuals) for the line below is 3.02.

least-square

14. Finally, we plot Y-pred, which turns out to be the best fit line.

plot-the-y-pred
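The steps above can be sketched in R roughly as follows. This is a minimal illustration, not the article's own calculation: the data values are invented, so the result will not match the 3.02 quoted in step 13, and the names `housing`, `slope`, `intercept`, `rent_pred`, and `sse` are chosen here for readability (`slope` and `intercept` play the roles of m and c). The formulas used are the standard least-squares ones, m = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and c = ȳ − m·x̄, which may be written in a different but equivalent form in the image for step 9.

```r
# Hypothetical data: values are invented, not taken from the article's table
housing <- data.frame(
  Area = c(500, 750, 1000, 1250, 1500),
  Rent = c(900, 1300, 1200, 1700, 1900)
)

# Step 2: mean of Area and Rent
mean_area <- mean(housing$Area)
mean_rent <- mean(housing$Rent)

# Steps 9 to 11: slope (m) and intercept (c) from the standard least-squares formulas
slope <- sum((housing$Area - mean_area) * (housing$Rent - mean_rent)) /
  sum((housing$Area - mean_area)^2)
intercept <- mean_rent - slope * mean_area

# Step 12: predicted rent (Y-pred) for every observed area
rent_pred <- slope * housing$Area + intercept

# Steps 7, 8 and 13: residuals (Y-actual minus Y-pred) and the sum of their squares
residual <- housing$Rent - rent_pred
sse <- sum(residual^2)
print(c(slope = slope, intercept = intercept, sse = sse))

# Cross-check against R's built-in linear model fit
fit <- lm(Rent ~ Area, data = housing)
print(coef(fit))

# Steps 1, 3 and 14: scatter plot, the mean point, and the fitted line
plot(housing$Area, housing$Rent, xlab = "Area", ylab = "Rent")
points(mean_area, mean_rent, pch = 19)
abline(a = intercept, b = slope)
```

The hand-computed slope and intercept should agree with the coefficients reported by `lm()`, and the fitted line passes through the mean point plotted in step 3.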

This shows how the linear regression algorithm works. Now let’s move on to our use case.
