4. Linear Regression In R programming language

Working of Linear Regression

We can better understand how linear regression works by using the example of a dataset that contains two fields, Area and Rent, and is used to predict the house’s rent based on the area where it is located. The dataset is:


As you can see, we are using a simple dataset for our example. Using this uncomplicated data, let’s have a look at how linear regression works, step by step:

1. With the available data, we plot a graph with Area in the X-axis and Rent on Y-axis. The graph will look like the following. Notice that it is a linear pattern with a slight dip. 


2. Next, we find the mean of Area and Rent.


3. We then plot the mean on the graph.

4. We draw a line of best fit that passes through the mean.


5. But we encounter a problem. As you can see below, multiple lines can be drawn through the mean: 


6. To overcome this problem, we keep moving the line to make sure the best fit line has the least square distance from the data points


7. The least-square distance is found by adding the square of the residuals


8. We now arrive at the relation that, Residual is the distance between Y-actual and Y-pred.


9. The value of m & c for the best fit line, y = mx+ c can be calculated using these formulas:


10. This helps us find the corresponding values:


11. With that, we can obtain the values of m & c.


12. Now, we can find the value of Y-pred.


13. After calculating, we find that the least square value for the below line is 3.02.


14. Finally, we are able to plot the Y-pred and this is found out to be the best fit line.


This shows how the linear regression algorithm works. Now let’s move onto our use case.

Leave a Reply

Your email address will not be published. Required fields are marked *