Feature regularization is used to prevent an algorithm from overfitting its parameters $$\ \theta$$. This method shrinks the parameter values toward zero, which diminishes the impact of individual features on the hypothesis. The result is a better fit to unseen data and a "smoother" curve when plotted.
Feature regularization is accomplished by extending the cost function and the gradient-descent update for every $$\ \theta_j$$ except $$\ \theta_0$$, as follows:
$\ J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} (h_{\theta}(x^{(i)}) - y^{(i)})^2 + \lambda \sum_{j=1}^{n} \theta_{j}^2 \right]$
$\ \theta_j = \theta_j (1 - \alpha \frac{\lambda}{m}) - \alpha \frac{1}{m} \sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)}) x_{j}^{(i)}$
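The cost function and update rule above can be sketched in NumPy. This is a minimal illustration, not a definitive implementation: it assumes `X` already includes a leading column of ones so that `theta[0]` is the intercept, which is excluded from the penalty and from the shrinkage factor $(1 - \alpha \frac{\lambda}{m})$.

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """J(theta) with an L2 penalty on theta[1:] (theta[0] is not penalized)."""
    m = len(y)
    residuals = X @ theta - y
    penalty = lam * np.sum(theta[1:] ** 2)
    return (residuals @ residuals + penalty) / (2 * m)

def gradient_step(theta, X, y, lam, alpha):
    """One regularized gradient-descent update; theta[0] is not shrunk."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m          # unregularized gradient term
    shrink = np.ones_like(theta)
    shrink[1:] = 1 - alpha * lam / m          # shrinkage applies to j >= 1 only
    return theta * shrink - alpha * grad

# Tiny demo: fit y = 2x with a small regularization term (hypothetical data)
X = np.column_stack([np.ones(5), np.arange(5.0)])  # leading column of ones
y = 2.0 * np.arange(5.0)
theta = np.zeros(2)
for _ in range(2000):
    theta = gradient_step(theta, X, y, lam=0.1, alpha=0.1)
```

Because $\lambda$ penalizes large slopes, the learned `theta[1]` lands slightly below the unregularized value of 2; increasing `lam` pulls it further toward zero.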