
Commit 756ce9b

[Term Entry] Python statsmodels: model residuals (#5903)
* New file has been added. * Update user-input.md * Update user-input.md * File has been added. * Update content/python/concepts/statsmodels/terms/model-residuals/model-residuals.md * Update content/python/concepts/statsmodels/terms/model-residuals/model-residuals.md * Update content/python/concepts/statsmodels/terms/model-residuals/model-residuals.md * Made the changes. * Changes implemented in the file. * Changes have been done. * Updated model-residuals.md and added example image * Updated model-residuals.md with improvements * Removed the extra spaces. * Changes done. ---------
1 parent ac2926b commit 756ce9b

2 files changed

+72
-0
lines changed

@@ -0,0 +1,72 @@
1+
---
2+
Title: 'Model Residuals'
3+
Description: 'Analyze the residuals of Python statistical models to measure model performance, detect patterns, and diagnose problems.'
4+
Subjects:
5+
- 'AI'
6+
- 'Data Science'
7+
- 'Machine Learning'
8+
Tags:
9+
- 'Data'
10+
- 'Linear Regression'
11+
- 'Logistic Regression'
12+
- 'Models'
13+
- 'Statsmodels'
14+
CatalogContent:
15+
- 'learn-python-3'
16+
- 'paths/computer-science'
17+
---
18+
19+
**Model residuals** are the differences between the observed values and the values predicted by a statistical model. Each residual measures the error, or deviation, for a single data point, using the formula:
20+
21+
![Model Residuals formula](https://raw.githubusercontent.com/Codecademy/docs/main/media/model-residuals-example.png)
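
Assuming the image shows the standard definition of a residual, the formula can also be written out as follows. The symbols are chosen here for illustration: `e_i` is the residual, `y_i` the observed value, and `ŷ_i` the predicted value for the i-th data point.

$$
e_i = y_i - \hat{y}_i
$$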
22+
23+
Residuals are a key concept in statistical modeling. They are used to evaluate the goodness of fit, identify patterns, detect outliers, and validate assumptions about the model. Analyzing residuals helps enhance model accuracy and reliability by providing information about areas where the model is underperforming.
24+
25+
## Syntax
26+
27+
Here is the general syntax for retrieving model residuals:
28+
29+
```pseudo
# Fit the model (if not already fitted)
model = sm.OLS(y, X).fit()
# Retrieve the residuals
residuals = model.resid
```
35+
36+
- `sm.OLS(y, X)`: Defines an ordinary least squares (`OLS`) regression model with `y` as the dependent variable and `X` as the independent variable(s).
37+
- `.fit()`: Fits the model to the data.
38+
- `model.resid`: Returns the residuals (observed minus fitted values) of the fitted model.
39+
40+
## Example
41+
42+
In this example, a linear regression model is fitted using statsmodels, and the residuals are calculated:
43+
44+
```py
import statsmodels.api as sm
import numpy as np

# Step 1: Create sample data
X = np.random.rand(5, 1)  # Independent variable (5 samples)
y = 3 * X + np.random.randn(5, 1)  # Dependent variable with noise

# Step 2: Add constant to X for the intercept term
X = sm.add_constant(X)

# Step 3: Fit the OLS model
model = sm.OLS(y, X).fit()

# Step 4: Calculate residuals
residuals = model.resid

# Step 5: Display residuals
print("Model Residuals:\n", residuals)
```
64+
65+
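The code produces output similar to the following. The exact values differ on each run because the sample data is randomly generated: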
66+
67+
```shell
68+
Model Residuals:
69+
[0.07524913 -1.02179262 1.42678355 -1.50131552 1.02107546]
70+
```
71+
72+
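These values indicate how much each prediction deviates from the observed value. A residual close to zero means the prediction is close to the observed value, while a residual with a large absolute value indicates a greater deviation. The sign shows the direction: a positive residual means the model underpredicted, and a negative residual means it overpredicted.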
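
As a rough illustration of how residuals can be read, the sketch below takes the values from the sample output above and finds the data point with the largest absolute residual, along with the residual sum of squares. The variable names `worst_index` and `rss` are illustrative, and this is only one simple way to summarize residuals, not a full diagnostic procedure.

```py
# Residuals copied from the sample output above
residuals = [0.07524913, -1.02179262, 1.42678355, -1.50131552, 1.02107546]

# Absolute size of each residual, ignoring direction
abs_residuals = [abs(r) for r in residuals]

# Index of the data point that the model fits worst
worst_index = max(range(len(residuals)), key=lambda i: abs_residuals[i])

# Residual sum of squares: a single-number summary of the overall error
rss = sum(r ** 2 for r in residuals)

print("Largest absolute residual at index:", worst_index)
print("Residual sum of squares:", round(rss, 4))
```

Here the fourth data point (index 3) has the largest absolute residual, so it is the observation the fitted line describes least well.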

media/model-residuals-example.png

46.1 KB