Data Science
Regression Analysis In Machine Learning
Walkthrough of Linear, Logistic, and Polynomial Regression
Index Of Contents
· Introduction
· Types of Regression
∘ 1. Linear Regression
∘ 2. Logistic Regression
∘ 3. Polynomial Regression
· Conclusion
Introduction
Regression analysis is a statistical technique for determining the relationship between a target variable (e.g., the price of a house) and input parameters (locality, build area, amenities, etc.).
In other words, it shows how the value of the target variable changes in response to changes in the values of the input parameters, i.e., the correlation between the input and output variables.
From a machine learning perspective, it helps us predict the outcomes for a continuous-valued dataset and is used for price prediction, inventory forecasting, time-series modeling, etc.
Regression techniques essentially train machine learning models to look for mathematical patterns and trends in historical data. When a new data point is presented, the model then forecasts the outcome based on what it has learned from that data.
Types of Regression
There is a wide range of regression techniques that a data scientist or an ML engineer may use to make predictions. Nevertheless, the specific type of regression analysis is chosen based on the use case, business assumptions, distribution of the dataset, etc. We’ll now look at some of the must-know types of regression analysis.
1. Linear Regression
Linear regression is one of the most widely used regression techniques.
In this method, we assume a linear relationship between the dependent target variable (Y) and the independent input parameters (X).
Equation:
Y = mX + c
Where:
- “m” denotes the slope or gradient of the line
- “c” denotes the Y-intercept, the point where the line crosses the Y-axis
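To make this concrete, below is a minimal sketch of fitting a simple linear regression with scikit-learn and reading off “m” and “c”; the house-price numbers are made up purely for illustration.

```python
# A minimal sketch of simple linear regression with scikit-learn.
# The data is hypothetical: build area (sq. ft) vs. house price (in thousands).
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[650], [800], [1100], [1400], [1800]])   # independent input variable
y = np.array([70, 95, 130, 160, 210])                   # dependent target variable

model = LinearRegression().fit(X, y)

print("slope m:", model.coef_[0])          # gradient of the fitted line
print("intercept c:", model.intercept_)    # point where the line crosses the Y-axis
print("prediction for 1200 sq. ft:", model.predict([[1200]])[0])
```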
Based on the number of independent input variables, linear regression is classified into two types:
- Simple Linear Regression
- Multiple Linear Regression
Some of the popular real-world applications of linear regression are:
- Sales Forecasting
- ETA (estimated time of arrival) features in ride-hailing and delivery apps
- Price Prediction
Linear Regression Graph
2. Logistic Regression
Logistic regression is a regression method used to solve classification problems. Unlike other regression techniques, where the target/dependent variable is continuous, here the dependent variable is discrete (Yes/No, 1/0, On/Off, Profit/Loss, Heads/Tails, etc.).
Equation:
f(x) = 1 / (1 + e^(-x))
Where:
- “x” is any given real number passed in as input.
- “e” is the base of the natural logarithm.
Based on the type of problem we are solving, logistic regression is classified into three types:
- Binary Output (True/False; Spam/Not Spam)
- Multinomial Output (Dogs, Cats, Rats)
- Ordinal Output (Low, Medium, High)
The sigmoid function produces an S-shaped curve (the sigmoid curve) whose output approaches 0 at its minimum and 1 at its maximum.
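A quick sketch of this behaviour in NumPy (the sample inputs are arbitrary) shows the output staying strictly between 0 and 1.

```python
# Sketch of the sigmoid function used by logistic regression.
import numpy as np

def sigmoid(x):
    # Maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-10, -1, 0, 1, 10])))
# approximately: [0.0000454, 0.2689, 0.5, 0.7311, 0.99995]
```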
Some of the popular real-world applications of logistic regression are:
- Email Spam Filtering
- Sentiment Of Customer Reviews
- Image Classification
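As a concrete example of the binary-output case, here is a minimal scikit-learn sketch; the hours-studied vs. pass/fail data is invented purely for illustration.

```python
# A minimal sketch of binary classification with logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.5], [1.0], [1.5], [3.0], [4.5], [6.0]])  # hours studied
y = np.array([0, 0, 0, 1, 1, 1])                          # 0 = fail, 1 = pass

clf = LogisticRegression().fit(X, y)

print(clf.predict([[2.0], [5.0]]))        # discrete class labels (0 or 1)
print(clf.predict_proba([[2.0], [5.0]]))  # sigmoid-based class probabilities
```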
Logistic Regression Graph
3. Polynomial Regression
Polynomial regression is used to model the relationship between the dependent and independent variables when that relationship is non-linear. Unlike linear regression, where the fitted graph is a straight line, polynomial regression produces a curve that bends to fit as many data points as possible.
Equation:
y = β0 + β1X + β2X^2 + … + βkX^k
Where:
- the above equation represents a single-variable polynomial of degree “k”
- the “β” values are the weights or regression coefficients
- “X” represents the independent input variable
In polynomial regression, each term raises the single input variable “X” to a different power, up to degree “k”. In multiple linear regression, by contrast, the equation contains several different input variables, but every term is of degree one.
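A minimal sketch of this idea with scikit-learn: expand X into powers up to degree k, then fit an ordinary linear regression on those features. The roughly quadratic toy data is an assumption made for illustration.

```python
# A minimal sketch of polynomial regression via feature expansion.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2.1, 8.2, 18.5, 31.9, 50.3])   # roughly y = 2 * x^2

k = 2
poly = PolynomialFeatures(degree=k)
X_poly = poly.fit_transform(X)               # adds the 1, x, x^2 columns
model = LinearRegression().fit(X_poly, y)

print("coefficients (beta):", model.coef_)
print("intercept:", model.intercept_)
print("prediction for X=6:", model.predict(poly.transform([[6]]))[0])
```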
Some of the popular real-world applications of polynomial regression are:
- Rate of spread of diseases
- Casualty prediction in case of a calamity
Polynomial Regression Graph
Conclusion
In this post, we have discussed the general idea of regression analysis along with three of the fundamental regression techniques that will help you get started.
There are several other regression techniques, such as Ridge Regression, Lasso Regression, Elastic Net Regression, and Principal Components Regression. However, each of these techniques has a very specific use case pertaining to the set of features in our dataset, a crucial output parameter, and so on.
I hope you enjoyed reading this article. In one of the upcoming articles, we will see how to build an end-to-end linear regression model.