
Simple Linear Regression

What is Simple Linear Regression?

Simple linear regression is a statistical method that allows us to summarize and study relationships between two continuous (quantitative) variables:

  • The first variable $X$ is regarded as the predictor, explanatory, or independent variable.
  • The second variable $Y$ is regarded as the response, outcome, or dependent variable.

The relationship is modeled as a straight line, where $a$ is the slope and $b$ is the intercept:

$y = ax + b$

You can use simple linear regression when you want to know:

  • How strong the relationship is between two variables (e.g., the relationship between rainfall and soil erosion).
  • The value of the dependent variable at a certain value of the independent variable (e.g., the amount of soil erosion at a certain level of rainfall).
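Both questions can be answered with a few lines of plain Python. The sketch below is illustrative only: the rainfall and erosion numbers are made up for the example (they are not from any dataset in this post), and the helper names `pearson_r` and `fit_line` are hypothetical. It measures the strength of the relationship with the Pearson correlation coefficient and then predicts erosion at a chosen rainfall level using the standard least-squares estimates of $a$ and $b$.

```python
from math import sqrt

# Hypothetical illustrative data (not from the post):
# rainfall in mm and soil erosion in tonnes per hectare.
rainfall = [10.0, 20.0, 30.0, 40.0, 50.0]
erosion = [1.2, 1.9, 3.2, 3.8, 5.1]


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation: strength of the linear relationship, in [-1, 1]."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope a and intercept b for y = a*x + b."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    a = sxy / sxx
    b = my - a * mx
    return a, b


r = pearson_r(rainfall, erosion)
a, b = fit_line(rainfall, erosion)

# Value of the dependent variable at a chosen value of the independent variable.
predicted = a * 35.0 + b

print(f"r = {r:.3f}, R^2 = {r * r:.3f}")
print(f"predicted erosion at 35 mm rainfall: {predicted:.2f}")
```

For simple linear regression, the square of `r` equals the coefficient of determination $R^2$, so a value close to 1 indicates that the line explains most of the variation in the response.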



How is the formula derived?





In the image above, the red crosses ($y_i$) are the actual salaries at each experience level, and the green crosses ($\hat{y}_i$) are the salaries the model predicts at the same experience levels. Each green line is the difference between an actual salary and its modeled value. The goal is to find the straight line that minimizes the sum of the squared lengths of these green lines; that best-fit line is the black line.
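Written out, and assuming the ordinary least squares criterion (the one scikit-learn's `LinearRegression` uses, which minimizes squared differences), the quantity being minimized over $n$ observations is:

$$
S(a, b) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - a x_i - b)^2
$$

Setting the partial derivatives of $S$ with respect to $a$ and $b$ to zero gives the slope and intercept of the best-fit line, where $\bar{x}$ and $\bar{y}$ denote the sample means of the $x_i$ and $y_i$:

$$
a = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad b = \bar{y} - a\,\bar{x}
$$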





Example


Code



from sklearn.linear_model import LinearRegression

# X_train must be a 2D array of shape (n_samples, 1); y_train is a 1D array of targets.
regressor = LinearRegression()
regressor.fit(X_train, y_train)
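The snippet above assumes `X_train` and `y_train` already exist. A minimal end-to-end sketch of how they might be produced is shown below. The data here is synthetic and generated only for illustration (it is not the post's salary dataset), and the slope, intercept, and noise level used to generate it are arbitrary assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data for illustration only (not the post's dataset):
# years of experience vs. salary, generated from a known line plus noise.
rng = np.random.default_rng(0)
experience = rng.uniform(1, 10, size=30)
salary = 9000 * experience + 26000 + rng.normal(0, 3000, size=30)

# scikit-learn expects X as a 2D array of shape (n_samples, n_features).
X = experience.reshape(-1, 1)
y = salary

# Hold out 20% of the samples to evaluate the fitted line.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Slope a and intercept b of the fitted line y = a*x + b.
slope = regressor.coef_[0]
intercept = regressor.intercept_

# Predictions on the held-out data and the R^2 score on that data.
y_pred = regressor.predict(X_test)
r2 = regressor.score(X_test, y_test)

print(f"slope = {slope:.1f}, intercept = {intercept:.1f}, test R^2 = {r2:.3f}")
```

Because the data was generated from a line with slope 9000, the fitted `slope` should land near that value, which is a quick sanity check that the model recovered the underlying relationship.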

Test Result





Prediction Result







Implementation

This post is licensed under CC BY 4.0 by the author.