Linear Regression
1. Linear Regression Cheat Sheet
High-Level Overview:
Linear Regression is a simple supervised learning technique that models the relationship between one or more input features (independent variables) and a continuous output (dependent variable) using a linear function.
Machine Learning Engineers often use it as a baseline model to quickly understand the data or to compare against more complex methods.
Key Concepts:
- Model: A line (in 2D) or hyperplane (in higher dimensions) that best fits the data.
- Parameters: Weights (coefficients) and intercept that define the model’s line or plane.
- Cost Function: A metric (often Mean Squared Error) to measure how well the model fits.
- Optimization: Finding the parameters that minimize the cost function (e.g., via Gradient Descent).
Linear Model Equation:
\[ \hat{y} = w_0 + w_1 x_1 + \ldots + w_n x_n \]
Here:
- \(\hat{y}\): Predicted value of the target
- \(w_0\): Intercept (bias term)
- \(w_1, \ldots, w_n\): Weights (coefficients) for each feature
- \(x_1, \ldots, x_n\): Input features (independent variables)
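In vectorized form, the prediction is a dot product between the weights and the features plus the bias. A minimal NumPy sketch (the weights and feature values below are made-up illustrations):
import numpy as np
# Hypothetical weights and one sample with two features
w0 = 50.0                    # intercept (bias term)
w = np.array([0.15, 20.0])   # weights w1, w2
x = np.array([1200.0, 3.0])  # features x1, x2 (e.g., size, rooms)
y_hat = w0 + np.dot(w, x)    # y_hat = w0 + w1*x1 + w2*x2
print(y_hat)                 # 290.0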
Cost Function (Mean Squared Error):
\[ J(\mathbf{w}) = \frac{1}{2m}\sum_{i=1}^{m}(y_i - \hat{y}_i)^2 \]
Here:
- \(m\): Number of data samples
- \(y_i\): Actual value of the target for the \(i\)-th sample
- \(\hat{y}_i\): Predicted value for the \(i\)-th sample
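As a quick check of the formula, a minimal NumPy sketch of \(J(\mathbf{w})\) (the sample values are made up):
import numpy as np
def mse_cost(y_true, y_pred):
    # J(w) = 1/(2m) * sum((y_i - y_hat_i)^2)
    m = len(y_true)
    return np.sum((y_true - y_pred) ** 2) / (2 * m)
y_true = np.array([180.0, 200.0, 240.0])
y_pred = np.array([190.0, 195.0, 250.0])
print(mse_cost(y_true, y_pred))  # (100 + 25 + 100) / 6 = 37.5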
Gradient Descent Update Rule:
\[ w_j := w_j - \alpha \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)x_{i,j} \]
Here:
- \(\alpha\): Learning rate, controlling the size of each update step
- \(x_{i,j}\): Value of feature \(j\) for the \(i\)-th sample, with the convention \(x_{i,0} = 1\) so the same rule also updates the bias \(w_0\)
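A minimal batch Gradient Descent sketch in NumPy; the toy data, learning rate, and iteration count are arbitrary choices, and the features are kept in a small range so the fixed learning rate converges:
import numpy as np
def gradient_descent(X, y, alpha=0.1, n_iters=1000):
    # X: (m, n) feature matrix, y: (m,) targets
    m, n = X.shape
    Xb = np.hstack([np.ones((m, 1)), X])  # prepend x0 = 1 for the bias
    w = np.zeros(n + 1)
    for _ in range(n_iters):
        y_hat = Xb @ w
        grad = Xb.T @ (y_hat - y) / m     # (1/m) * sum((y_hat_i - y_i) * x_ij)
        w -= alpha * grad                 # w_j := w_j - alpha * grad_j
    return w
# Toy data generated from y = 2x + 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
print(gradient_descent(X, y))  # approximately [1. 2.]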
Normal Equation (Closed-form Solution):
\[ \mathbf{w} = (X^T X)^{-1} X^T y \]
Here:
- \(X\): Design matrix of input features, with a leading column of ones so the intercept \(w_0\) is included in \(\mathbf{w}\)
- \(y\): Vector of target values
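The same toy problem solved in closed form; np.linalg.pinv is used instead of a plain inverse so the sketch also handles a singular \(X^T X\):
import numpy as np
X = np.array([[0.0], [1.0], [2.0], [3.0]])  # same toy data: y = 2x + 1
y = np.array([1.0, 3.0, 5.0, 7.0])
Xb = np.hstack([np.ones((len(X), 1)), X])   # column of ones for w0
w = np.linalg.pinv(Xb.T @ Xb) @ Xb.T @ y    # w = (X^T X)^{-1} X^T y
print(w)  # [1. 2.]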
Step-by-Step Summary:
- Collect Data: Gather features and continuous target values.
- Define Model: Assume a linear relationship \(y = w_0 + \sum w_j x_j\).
- Choose Cost Function: MSE is common to measure prediction error.
- Optimize Parameters: Use Gradient Descent or the Normal Equation to find \(w_j\) that minimize MSE.
- Evaluate & Refine: Check performance (e.g., RMSE, as in the sketch after this list), adjust the learning rate, add features, or regularize if needed.
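RMSE is the square root of the mean squared error, reported in the same units as the target. A minimal sketch (the prediction values are made up):
import numpy as np
y_true = np.array([180.0, 200.0, 240.0, 300.0, 360.0])
y_pred = np.array([175.0, 205.0, 235.0, 310.0, 355.0])
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # root mean squared error
print(rmse)  # ~6.32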
ML Engineer Perspective:
- Use Linear Regression as a quick baseline.
- Interpret weights to understand each feature's influence (weight magnitudes are only comparable across features when the features are on similar scales).
- Compare with more complex models to see if complexity is justified.
Code Example (Python with scikit-learn):
import numpy as np
from sklearn.linear_model import LinearRegression
# Example: House size vs. House price
X = np.array([[800], [1000], [1200], [1500], [2000]]) # House sizes
y = np.array([180, 200, 240, 300, 360]) # Prices (in thousands)
model = LinearRegression()
model.fit(X, y)
pred_price = model.predict([[1300]])
print("Predicted price for 1300 sq ft:", pred_price[0])
print("Intercept (w0):", model.intercept_)
print("Slope (w1):", model.coef_[0])
For multiple features, provide a 2D array for X. The model finds a weight for each feature plus an intercept.
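For example, a minimal sketch with two made-up features (size and bedroom count):
import numpy as np
from sklearn.linear_model import LinearRegression
# Hypothetical data: [size in sq ft, number of bedrooms]
X = np.array([[800, 2], [1000, 2], [1200, 3], [1500, 3], [2000, 4]])
y = np.array([180, 200, 240, 300, 360])  # prices in thousands
model = LinearRegression()
model.fit(X, y)
print("Intercept (w0):", model.intercept_)
print("Weights (w1, w2):", model.coef_)  # one coefficient per feature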
Key Takeaways:
- Linear Regression fits a linear model to predict a continuous variable.
- MSE is a common cost function measuring prediction error.
- Use Gradient Descent or the Normal Equation for parameter estimation.
- Simple, interpretable baseline for many regression tasks.