Improve clarity of TWFE explanation (#522)

mercury0100 · mercury0100 · web-flow · commit 23e0f3e9dc00 · 2025-08-29T16:37:11.000+01:00
* Fixes #516 #516 (comment)
* docs: clarify TWFE explanation in Banking dataset notebook
  - Introduced γ as the global intercept to avoid overloading α
  - Wrapped dataset variable names (district, year, post_treatment) in \text{} for clearer rendering
  - Improved equation layout with consistent notation and salmon labels
  - Revised explanatory bullets for consistency with updated notation
* fix "\" appearing in the rendered docs preview in the post_treatment term of the banking dataset notebook

---------

Co-authored-by: mercury0100 <cooper@invoke.network>
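
Reviewer note: the revised notation (γ as the global intercept, α and β as vectors indexed per observation, Δ as the scalar treatment effect) can be sanity-checked with a small NumPy sketch. This is only an illustration of the indexing in the equation, not the notebook's model code. All parameter values, the 0/1 district coding, and the treatment start at year index 2 are assumptions made up for the example.

```python
import numpy as np

# Hypothetical parameter values, chosen only to illustrate the notation.
gamma = 100.0                                     # scalar global intercept
alpha = np.array([0.0, 5.0])                      # unit fixed effects, one per district (length 2)
beta = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])   # time fixed effects, one per year (length 6)
delta = -10.0                                     # scalar treatment effect

# Long-format data: one row per (district, year) observation.
district = np.repeat([0, 1], 6)                   # assumed coding: 0 = control, 1 = treated
year = np.tile(np.arange(6), 2)                   # year index 0..5 within each district
post_treatment = (year >= 2).astype(int)          # assumed: treatment begins at year index 2

# mu_i = gamma + alpha[district_i] + beta[year_i] + delta * district_i * post_treatment_i
mu = gamma + alpha[district] + beta[year] + delta * district * post_treatment

print(mu.reshape(2, 6))  # rows: districts, columns: years
```

Integer-array indexing (`alpha[district]`, `beta[year]`) picks one element of each fixed-effect vector per observation, which is what the bracket notation in the equation expresses. The `delta` term is nonzero only for treated-district rows in post-treatment years.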
diff --git a/docs/source/notebooks/did_pymc_banks.ipynb b/docs/source/notebooks/did_pymc_banks.ipynb
@@ -1,7 +1,6 @@
 {
  "cells": [
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -38,7 +37,6 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -72,7 +70,6 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -108,7 +105,6 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -313,7 +309,6 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -665,7 +660,6 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -829,29 +823,30 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Analysis 4 - Two way fixed effects\n",
+    "## Analysis 4 - Two-way fixed effects\n",
     "\n",
-    "Finally, we can evaluate the difference in difference model in its two-way fixed effects (TWFE) formulation. The two-way fixed effects model is widely used in econometrics for causal inference in panel data settings. It accounts for both unit-specific effects (e.g., differences between districts) and time-specific effects (e.g., shocks or trends affecting all units simultaneously). \n",
+    "Finally, we can evaluate the difference-in-differences model in its two-way fixed effects (TWFE) formulation. The TWFE model is widely used in econometrics for causal inference in panel data. It accounts for both unit-specific effects (e.g., differences between districts) and time-specific effects (e.g., shocks or trends affecting all units simultaneously).\n",
     "\n",
-    "The TWFE model is equivalent to the classic 2$\\times$2 DiD model (Model 1) - but only in the situation of two groups and two time periods. Outside of this special case the approach is not equivalent and can potentially have some problems {cite:p}`hill2020limitations,imai2021twfepanel`. Readers should proceed with caution when using the TWFE model outside of the 2$\\times$2 case - see for example {cite:p}`kropko2018two,collischon2020let`.\n",
+    "The TWFE model is equivalent to the classic 2$\\times$2 DiD model (Model 1) only in the special case of two groups and two time periods. Outside of this case the approaches diverge, and TWFE can have important limitations {cite:p}`hill2020limitations,imai2021twfepanel`. Readers should proceed with caution when applying TWFE in richer settings {cite:p}`kropko2018two,collischon2020let`.\n",
     "\n",
-    "The TWFE approach is similar to the previous model in that the `district:post_treatment` interaction term still gives you a treatment indicator variable and the assiated coefficient $\\beta_{\\Delta}$ is the causal effect of the intervention. But it is different in that there is no _linear_ `year` term, instead we have a _categorical_ `year` variable. This means that the model can capture any temporal trends in the data. These can be thought of as capturing time based schocks that affect all units.\n",
+    "The TWFE approach is similar to the previous model in that the `district:post_treatment` interaction term still defines a treatment indicator variable, and its coefficient $\\Delta$ represents the causal effect of the intervention. The difference is that there is no _linear_ `year` term; instead, we use a _categorical_ `year` variable. This allows the model to flexibly capture any time-specific shocks that affect all units.\n",
     "\n",
-    "The equation for the expected values is:\n",
+    "The expected values can be written as:\n",
     "\n",
     "$$\n",
     "\\begin{aligned}\n",
-    "\\mu_i & \\quad + \\alpha[district_i] \\qquad \\textcolor{salmon}{\\text{(unit fixed effect)}}\\\\\n",
-    " & \\quad + \\beta[year_i] \\qquad \\textcolor{salmon}{\\text{(time fixed effect)}}\\\\\n",
-    " & \\quad + \\Delta \\cdot district_i \\cdot post~treatment_i \\qquad \\textcolor{salmon}{\\text{(treatment indicator)}}\n",
+    "\\mu_i &= \\gamma && \\color{#FA8072}{\\text{(global intercept)}} \\\\\n",
+    "&+ \\alpha[\\text{district}_i] && \\color{#FA8072}{\\text{(unit fixed effect)}} \\\\\n",
+    "&+ \\beta[\\text{year}_i] && \\color{#FA8072}{\\text{(time fixed effect)}} \\\\\n",
+    "&+ \\Delta \\cdot \\text{district}_i \\cdot \\text{post_treatment}_i && \\color{#FA8072}{\\text{(treatment indicator)}}\n",
     "\\end{aligned}\n",
     "$$\n",
     "\n",
-    "Typically the TWFE model is presented in matrix form, so the equation above might look less familiar. It has been adapted for long format data. In particular, note that:\n",
-    "* $\\alpha$ is a scalar intercept term.\n",
-    "* $\\alpha[district_i]$ is a vector of fixed effects for each district. There are only 2 districts, so this is a vector of length 2. The $district_i$ indexes the element of $\\alpha$ that corresponds to the district of the $i^{th}$ observation.\n",
-    "* $\\beta[year_i]$ is a vector of fixed effects for each year. There are 6 years, so this is a vector of length 6. The $year_i$ indexes the element of $\\beta$ that corresponds to the year of the $i^{th}$ observation.\n",
-    "* $\\Delta$ is a scalar representing the treatment effect, which is the same as the coefficient of the `district:post_treatment` interaction term."
+    "Here:  \n",
+    "* $\\gamma$ is a scalar global intercept.  \n",
+    "* $\\alpha$ is a vector of unit (district) fixed effects. With 2 districts, it has length 2. The index $\\text{district}_i$ selects the element of $\\alpha$ for the district of the $i^{\\text{th}}$ observation.  \n",
+    "* $\\beta$ is a vector of time (year) fixed effects. With 6 years, it has length 6. The index $\\text{year}_i$ selects the element of $\\beta$ for the year of the $i^{\\text{th}}$ observation.  \n",
+    "* $\\Delta$ is a scalar treatment effect, corresponding to the coefficient on the `district:post_treatment` interaction term.\n"
    ]
   },
   {
@@ -1049,7 +1044,6 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -1062,7 +1056,7 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "CausalPy",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
@@ -1076,7 +1070,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.13.2"
+   "version": "3.9.18"
   }
  },
  "nbformat": 4,