TIL about Logistic Regression
\[\ h_{\theta}(x) = \frac{1}{ 1 + e^{ -{\theta}^{\intercal}x } } \]
Logistic regression is a form of machine learning used to predict a discrete, non-continuous value. Today, I started learning about binary classification, in which a prediction has two possible cases: negative (0) and positive (1). This is accomplished using a sigmoid function, otherwise known as a logistic function (hence the name, logistic regression).
Examples include:
- flagging an email as spam or not spam
- determining if a tumor is benign or malignant
- determining if a series of application metrics indicates an outage
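To make this concrete, here's a minimal NumPy sketch of the hypothesis above. The function names and toy numbers are my own, not from the course:

```python
import numpy as np

def sigmoid(z):
    """The logistic function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, x):
    """The hypothesis h_theta(x): the estimated probability that y = 1."""
    return sigmoid(theta @ x)

# Predictions >= 0.5 are treated as positive (1), otherwise negative (0).
theta = np.array([-1.0, 2.0])
x = np.array([1.0, 0.8])  # first element is the intercept/bias feature
print(predict(theta, x))  # ~0.65, so classify as positive
```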
Like linear regression, logistic regression has a cost function that measures how well a set of parameters fits the training data:
\[\ J({\theta}) = \frac{1}{m} \left( -y^{\intercal} \log(h) - (1 - y)^{\intercal} \log(1 - h) \right) \]
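Here's how that vectorized cost might look in NumPy. This is a sketch under my own naming assumptions, where `X` is the m-by-n design matrix and `y` is the vector of 0/1 labels:

```python
import numpy as np

def cost(theta, X, y):
    """Vectorized logistic regression cost J(theta)."""
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))  # predictions for all m examples
    return (1.0 / m) * (-y @ np.log(h) - (1 - y) @ np.log(1 - h))
```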
And since the cost function is convex, gradient descent is guaranteed to find the global minimum; the vectorized update rule is the same as linear regression's, just with the sigmoid as the hypothesis:
\[\ \theta := \theta - \frac{\alpha}{m} X^{\intercal} \left( \frac{1}{1 + e^{-X\theta}} - \vec{y} \right) \]
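And a sketch of the full descent loop, repeatedly applying that update rule. The learning rate and iteration count here are arbitrary picks of mine:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Learn theta by repeatedly applying the vectorized update rule."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        h = 1.0 / (1.0 + np.exp(-(X @ theta)))  # sigmoid of X @ theta
        theta -= (alpha / m) * (X.T @ (h - y))  # step downhill on J(theta)
    return theta
```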
There is also multiclass classification, in which the prediction can have more than two cases, but I get to learn that tomorrow!