Commit 23e0f3e

mercury0100mercury0100
andauthored
Improve clarity of TWFE explanation (#522)
* Fixes #516 #516 (comment)

* docs: clarify TWFE explanation in Banking dataset notebook

  - Introduced γ as the global intercept to avoid overloading α
  - Wrapped dataset variable names (district, year, post_treatment) in \text{} for clearer rendering
  - Improved equation layout with consistent notation and salmon labels
  - Revised explanatory bullets for consistency with updated notation

* fix "\" appearing in the rendered docs preview in the post_treatment term of the banking dataset notebook

---------

Co-authored-by: mercury0100 <[email protected]>
1 parent e6e0037 commit 23e0f3e

1 file changed: +16 -22 lines changed

docs/source/notebooks/did_pymc_banks.ipynb

Lines changed: 16 additions & 22 deletions
@@ -1,7 +1,6 @@
 {
 "cells": [
 {
-"attachments": {},
 "cell_type": "markdown",
 "metadata": {},
 "source": [
@@ -38,7 +37,6 @@
 ]
 },
 {
-"attachments": {},
 "cell_type": "markdown",
 "metadata": {},
 "source": [
@@ -72,7 +70,6 @@
 ]
 },
 {
-"attachments": {},
 "cell_type": "markdown",
 "metadata": {},
 "source": [
@@ -108,7 +105,6 @@
 ]
 },
 {
-"attachments": {},
 "cell_type": "markdown",
 "metadata": {},
 "source": [
@@ -313,7 +309,6 @@
 ]
 },
 {
-"attachments": {},
 "cell_type": "markdown",
 "metadata": {},
 "source": [
@@ -665,7 +660,6 @@
 ]
 },
 {
-"attachments": {},
 "cell_type": "markdown",
 "metadata": {},
 "source": [
@@ -829,29 +823,30 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Analysis 4 - Two way fixed effects\n",
+"## Analysis 4 - Two-way fixed effects\n",
 "\n",
-"Finally, we can evaluate the difference in difference model in its two-way fixed effects (TWFE) formulation. The two-way fixed effects model is widely used in econometrics for causal inference in panel data settings. It accounts for both unit-specific effects (e.g., differences between districts) and time-specific effects (e.g., shocks or trends affecting all units simultaneously). \n",
+"Finally, we can evaluate the difference-in-differences model in its two-way fixed effects (TWFE) formulation. The TWFE model is widely used in econometrics for causal inference in panel data. It accounts for both unit-specific effects (e.g., differences between districts) and time-specific effects (e.g., shocks or trends affecting all units simultaneously).\n",
 "\n",
-"The TWFE model is equivalent to the classic 2$\\times$2 DiD model (Model 1) - but only in the situation of two groups and two time periods. Outside of this special case the approach is not equivalent and can potentially have some problems {cite:p}`hill2020limitations,imai2021twfepanel`. Readers should proceed with caution when using the TWFE model outside of the 2$\\times$2 case - see for example {cite:p}`kropko2018two,collischon2020let`.\n",
+"The TWFE model is equivalent to the classic 2$\\times$2 DiD model (Model 1) only in the special case of two groups and two time periods. Outside of this case the approaches diverge, and TWFE can have important limitations {cite:p}`hill2020limitations,imai2021twfepanel`. Readers should proceed with caution when applying TWFE in richer settings {cite:p}`kropko2018two,collischon2020let`.\n",
 "\n",
-"The TWFE approach is similar to the previous model in that the `district:post_treatment` interaction term still gives you a treatment indicator variable and the assiated coefficient $\\beta_{\\Delta}$ is the causal effect of the intervention. But it is different in that there is no _linear_ `year` term, instead we have a _categorical_ `year` variable. This means that the model can capture any temporal trends in the data. These can be thought of as capturing time based schocks that affect all units.\n",
+"The TWFE approach is similar to the previous model in that the `district:post_treatment` interaction term still defines a treatment indicator variable, and its coefficient $\\Delta$ represents the causal effect of the intervention. The difference is that there is no _linear_ `year` term, instead we use a _categorical_ year variable. This allows the model to flexibly capture any time-specific shocks that affect all units.\n",
 "\n",
-"The equation for the expected values is:\n",
+"The expected values can be written as:\n",
 "\n",
 "$$\n",
 "\\begin{aligned}\n",
-"\\mu_i & \\quad + \\alpha[district_i] \\qquad \\textcolor{salmon}{\\text{(unit fixed effect)}}\\\\\n",
-" & \\quad + \\beta[year_i] \\qquad \\textcolor{salmon}{\\text{(time fixed effect)}}\\\\\n",
-" & \\quad + \\Delta \\cdot district_i \\cdot post~treatment_i \\qquad \\textcolor{salmon}{\\text{(treatment indicator)}}\n",
+"\\mu_i &= \\gamma && \\color{#FA8072}{\\text{(global intercept)}} \\\\\n",
+"&+ \\alpha[\\text{district}_i] && \\color{#FA8072}{\\text{(unit fixed effect)}} \\\\\n",
+"&+ \\beta[\\text{year}_i] && \\color{#FA8072}{\\text{(time fixed effect)}} \\\\\n",
+"&+ \\Delta \\cdot \\text{district}_i \\cdot \\text{post_treatment}_i && \\color{#FA8072}{\\text{(treatment indicator)}}\n",
 "\\end{aligned}\n",
 "$$\n",
 "\n",
-"Typically the TWFE model is presented in matrix form, so the equation above might look less familiar. It has been adapted for long format data. In particular, note that:\n",
-"* $\\alpha$ is a scalar intercept term.\n",
-"* $\\alpha[district_i]$ is a vector of fixed effects for each district. There are only 2 districts, so this is a vector of length 2. The $district_i$ indexes the element of $\\alpha$ that corresponds to the district of the $i^{th}$ observation.\n",
-"* $\\beta[year_i]$ is a vector of fixed effects for each year. There are 6 years, so this is a vector of length 6. The $year_i$ indexes the element of $\\beta$ that corresponds to the year of the $i^{th}$ observation.\n",
-"* $\\Delta$ is a scalar representing the treatment effect, which is the same as the coefficient of the `district:post_treatment` interaction term."
+"Here: \n",
+"* $\\gamma$ is a scalar global intercept. \n",
+"* $\\alpha[\\text{district}_i]$ is a vector of unit (district) fixed effects. With 2 districts, this is a vector of length 2. The index $\\text{district}_i$ selects the effect for the $i^{\\text{th}}$ observation. \n",
+"* $\\beta[\\text{year}_i]$ is a vector of time (year) fixed effects. With 6 years, this is a vector of length 6. The index $\\text{year}_i$ selects the effect for the $i^{\\text{th}}$ observation. \n",
+"* $\\Delta$ is a scalar treatment effect, corresponding to the coefficient on the `district:post_treatment` interaction term.\n"
 ]
 },
 {
@@ -1049,7 +1044,6 @@
 ]
 },
 {
-"attachments": {},
 "cell_type": "markdown",
 "metadata": {},
 "source": [
@@ -1062,7 +1056,7 @@
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "CausalPy",
+"display_name": "Python 3 (ipykernel)",
 "language": "python",
 "name": "python3"
 },
@@ -1076,7 +1070,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.13.2"
+"version": "3.9.18"
 }
 },
 "nbformat": 4,
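
As a reading aid for the revised equation in the "Analysis 4" cell, the indexing notation can be sketched in NumPy. This is a minimal illustration, not code from the notebook: every parameter value is made up, the integer coding of `district` and `year` is an assumption, and the choice of year index 3 as the first post-treatment period is hypothetical.

```python
import numpy as np

# Hypothetical parameter values, chosen only to illustrate the notation.
gamma = 1.0                                        # scalar global intercept
alpha = np.array([0.0, 0.5])                       # unit fixed effects, one per district (length 2)
beta = np.array([0.0, 0.1, -0.2, 0.3, 0.0, 0.4])   # time fixed effects, one per year (length 6)
delta = 2.0                                        # scalar treatment effect

# Long-format data: one row per (district, year) observation.
district = np.repeat([0, 1], 6)                    # assumed coding: 0 = control, 1 = treated
year = np.tile(np.arange(6), 2)                    # year index 0..5 within each district
post_treatment = (year >= 3).astype(int)           # assumed: treatment begins at year index 3

# mu_i = gamma + alpha[district_i] + beta[year_i] + Delta * district_i * post_treatment_i
# alpha[district] and beta[year] use integer-array indexing to pick, for each
# observation i, the fixed effect belonging to its district and its year.
mu = gamma + alpha[district] + beta[year] + delta * district * post_treatment


def cell_mean(d, p):
    """Mean expected value for district d in the pre (p=0) or post (p=1) period."""
    return mu[(district == d) & (post_treatment == p)].mean()


# Difference-in-differences contrast on the expected values: the unit and time
# fixed effects cancel, leaving only the treatment effect.
did = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))
```

Under these assumptions `did` equals `delta` up to floating-point error, which is one way to see why the coefficient on the `district:post_treatment` interaction is read as the treatment effect in the two-group case.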