# TIL about Logistic Regression

\[\ h_{\theta}(x) = \frac{1}{ 1 + e^{ -{\theta}^{\intercal}x } } \]

*Logistic regression* is a form of machine learning used to predict a discrete value rather than a continuous one.
Today, I started learning about *binary classification*, in which a prediction has exactly two possible outcomes - negative (0) and positive (1).
This is accomplished using a sigmoid function, also known as a logistic function - hence the name, logistic regression.
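As a quick sketch of the idea (using NumPy; the names `sigmoid` and `predict` are my own, not from any particular library), the sigmoid squashes any real input into the range (0, 1), and a 0.5 threshold turns that probability into a class label:

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, X):
    # h_theta(x) = sigmoid(theta^T x); probabilities >= 0.5 become class 1
    probabilities = sigmoid(X @ theta)
    return (probabilities >= 0.5).astype(int)
```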

Examples include:

- flagging an email as *spam* or *not spam*
- determining if a tumor is *benign* or *malignant*
- determining if a series of application metrics is an *outage*

Like linear regression, logistic regression has a cost function used to fit its parameters to the training data:

\[\ J({\theta}) = \frac{1}{m} \left( -y^{\intercal} \log(h) - (1 - y)^{\intercal} \log(1 - h) \right) \]
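Here's how that vectorized cost might look in NumPy, building on the `sigmoid` above (a sketch, assuming `X` holds one training example per row and `h` is the vector of predictions):

```python
def cost(theta, X, y):
    # J(theta) = (1/m) * (-y^T log(h) - (1 - y)^T log(1 - h))
    m = len(y)
    h = sigmoid(X @ theta)
    return (-y @ np.log(h) - (1 - y) @ np.log(1 - h)) / m
```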

And since the cost function is convex, we can use a slightly modified version of gradient descent to learn the parameters:

\[\ \theta := \theta - \frac{\alpha}{m} X^{\intercal} \left( \frac{1}{1 + e^{-X\theta}} - \vec{y} \right) \]
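And a minimal gradient-descent loop under the same assumptions (I've picked a fixed learning rate `alpha` and iteration count for simplicity, rather than checking for convergence):

```python
def gradient_descent(X, y, alpha=0.1, iterations=1000):
    # Repeatedly apply theta := theta - (alpha/m) * X^T (h - y)
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        h = sigmoid(X @ theta)
        theta -= (alpha / m) * (X.T @ (h - y))
    return theta
```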

There is also *multiclass classification*, in which a prediction can fall into more than two classes, but I get to learn that tomorrow!