Logistic regression is one of the first classification techniques one comes across in machine learning. Its name confuses beginners, who read it as a regression technique even though it is used for classification. That said, the word 'regression' in its name is no accident: there is a regression model working behind the scenes. So is it linear regression behind the play? What actually is logistic regression? I came across a few blogs on this topic, listed in the References. Here I will share my notes and a simplified explanation of logistic regression.
Logistic regression is used for binary classification, but what works behind it is linear regression. The unbounded continuous range of linear regression is mapped to the bounded probability range of classification, more precisely via the logit function. Let's try to understand this.
Odds ratio,

O = \frac{\frac{\text{number of successes}}{\text{total trials}}}{\frac{\text{number of failures}}{\text{total trials}}}

Range of O:
O = 0 when there are no successes
O \to \infty when all trials are successes

Range of the success probability:
0 \leq P \leq 1
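To make the range of O concrete, here is a tiny helper (a hypothetical function, not from any library) that maps a success probability P to its odds:

```python
def odds(p):
    """Odds of success for a success probability p, with 0 <= p < 1."""
    return p / (1 - p)

# No successes -> odds of 0; even chances -> odds of 1;
# the odds grow without bound as p approaches 1.
print(odds(0.0))  # 0.0
print(odds(0.5))  # 1.0
print(odds(0.9))  # ~9.0
```

As p sweeps 0 to 1, the odds sweep 0 to infinity, which is exactly the range mismatch the transformations below set out to fix.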
Consider a linear regression

Y = a + b_iX_i, \quad -\infty \leq Y \leq +\infty ~\text{and}~ -\infty \leq X_i \leq +\infty

Here the outcome variable Y and the feature variables X_i range from -\infty to +\infty. In classification, however, the outcome label can't range from -\infty to +\infty, so we need to transform the LHS to match the range of the linear regression RHS. Consider these options:
1. P = a + b_iX_i, \quad 0 \leq P \leq 1 ~\text{and}~ -\infty \leq X_i \leq +\infty. Here the outcome variable P is the probability of the positive outcome in binary classification.
2. O = \frac{P}{1-P} = a + b_iX_i, \quad 0 \leq O \leq +\infty ~\text{and}~ -\infty \leq X_i \leq +\infty. Here the outcome label O is the odds ratio P(yes)/P(no).
3. \ln(O) = \ln\left(\frac{P}{1-P}\right) = a + b_iX_i, \quad -\infty \leq \ln(O) \leq +\infty ~\text{and}~ -\infty \leq X_i \leq +\infty. Here the outcome label is the natural log of O.
In the third option, the range of the LHS matches the range of the linear regression's RHS. This LHS is the logit function, \ln(O). So we can fit a linear regression on the RHS to estimate \ln(O), which is what logistic regression does. To get the classification outcome P, rearrange the equation for \ln(O):

\ln(O) = \ln\left(\frac{P}{1-P}\right) = a + b_iX_i

\frac{P}{1-P} = e^{(a + b_iX_i)}

P = \frac{1}{1 + e^{-(a + b_iX_i)}}
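The last rearrangement is the sigmoid, the inverse of the logit. A minimal sketch in plain Python, checking that the two functions really undo each other:

```python
import math

def logit(p):
    """ln(O) = ln(p / (1 - p)): probability -> log-odds."""
    return math.log(p / (1 - p))

def sigmoid(z):
    """P = 1 / (1 + e^{-z}): log-odds -> probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Round-tripping a probability through logit and back recovers it.
p = 0.8
print(sigmoid(logit(p)))  # ~0.8
```

This is why a linear model fitted on the log-odds scale still yields a valid probability: whatever value a + b_iX_i takes, the sigmoid squashes it into (0, 1).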
Plotting P vs. X with a=0, b=1 gives the familiar S-shaped sigmoid curve.
References