|
1 | 1 | {
|
2 | 2 | "cells": [
|
3 | 3 | {
|
4 |
| - "attachments": {}, |
5 | 4 | "cell_type": "markdown",
|
6 | 5 | "metadata": {},
|
7 | 6 | "source": [
|
|
38 | 37 | ]
|
39 | 38 | },
|
40 | 39 | {
|
41 |
| - "attachments": {}, |
42 | 40 | "cell_type": "markdown",
|
43 | 41 | "metadata": {},
|
44 | 42 | "source": [
|
|
72 | 70 | ]
|
73 | 71 | },
|
74 | 72 | {
|
75 |
| - "attachments": {}, |
76 | 73 | "cell_type": "markdown",
|
77 | 74 | "metadata": {},
|
78 | 75 | "source": [
|
|
108 | 105 | ]
|
109 | 106 | },
|
110 | 107 | {
|
111 |
| - "attachments": {}, |
112 | 108 | "cell_type": "markdown",
|
113 | 109 | "metadata": {},
|
114 | 110 | "source": [
|
|
313 | 309 | ]
|
314 | 310 | },
|
315 | 311 | {
|
316 |
| - "attachments": {}, |
317 | 312 | "cell_type": "markdown",
|
318 | 313 | "metadata": {},
|
319 | 314 | "source": [
|
|
665 | 660 | ]
|
666 | 661 | },
|
667 | 662 | {
|
668 |
| - "attachments": {}, |
669 | 663 | "cell_type": "markdown",
|
670 | 664 | "metadata": {},
|
671 | 665 | "source": [
|
|
829 | 823 | "cell_type": "markdown",
|
830 | 824 | "metadata": {},
|
831 | 825 | "source": [
|
832 |
| - "## Analysis 4 - Two way fixed effects\n", |
| 826 | + "## Analysis 4 - Two-way fixed effects\n", |
833 | 827 | "\n",
|
834 |
| - "Finally, we can evaluate the difference in difference model in its two-way fixed effects (TWFE) formulation. The two-way fixed effects model is widely used in econometrics for causal inference in panel data settings. It accounts for both unit-specific effects (e.g., differences between districts) and time-specific effects (e.g., shocks or trends affecting all units simultaneously). \n", |
| 828 | + "Finally, we can evaluate the difference-in-differences model in its two-way fixed effects (TWFE) formulation. The TWFE model is widely used in econometrics for causal inference in panel data. It accounts for both unit-specific effects (e.g., differences between districts) and time-specific effects (e.g., shocks or trends affecting all units simultaneously).\n", |
835 | 829 | "\n",
|
836 |
| - "The TWFE model is equivalent to the classic 2$\\times$2 DiD model (Model 1) - but only in the situation of two groups and two time periods. Outside of this special case the approach is not equivalent and can potentially have some problems {cite:p}`hill2020limitations,imai2021twfepanel`. Readers should proceed with caution when using the TWFE model outside of the 2$\\times$2 case - see for example {cite:p}`kropko2018two,collischon2020let`.\n", |
| 830 | + "The TWFE model is equivalent to the classic 2$\\times$2 DiD model (Model 1) only in the special case of two groups and two time periods. Outside of this case the approaches diverge, and TWFE can have important limitations {cite:p}`hill2020limitations,imai2021twfepanel`. Readers should proceed with caution when applying TWFE in richer settings {cite:p}`kropko2018two,collischon2020let`.\n", |
837 | 831 | "\n",
|
838 |
| - "The TWFE approach is similar to the previous model in that the `district:post_treatment` interaction term still gives you a treatment indicator variable and the assiated coefficient $\\beta_{\\Delta}$ is the causal effect of the intervention. But it is different in that there is no _linear_ `year` term, instead we have a _categorical_ `year` variable. This means that the model can capture any temporal trends in the data. These can be thought of as capturing time based schocks that affect all units.\n", |
| 832 | + "The TWFE approach is similar to the previous model in that the `district:post_treatment` interaction term still defines a treatment indicator variable, and its coefficient $\\Delta$ represents the causal effect of the intervention. The difference is that there is no _linear_ `year` term; instead, we use a _categorical_ `year` variable. This allows the model to flexibly capture any time-specific shocks that affect all units.\n", |
839 | 833 | "\n",
|
840 |
| - "The equation for the expected values is:\n", |
| 834 | + "The expected values can be written as:\n", |
841 | 835 | "\n",
|
842 | 836 | "$$\n",
|
843 | 837 | "\\begin{aligned}\n",
|
844 |
| - "\\mu_i & \\quad + \\alpha[district_i] \\qquad \\textcolor{salmon}{\\text{(unit fixed effect)}}\\\\\n", |
845 |
| - " & \\quad + \\beta[year_i] \\qquad \\textcolor{salmon}{\\text{(time fixed effect)}}\\\\\n", |
846 |
| - " & \\quad + \\Delta \\cdot district_i \\cdot post~treatment_i \\qquad \\textcolor{salmon}{\\text{(treatment indicator)}}\n", |
| 838 | + "\\mu_i &= \\gamma && \\color{#FA8072}{\\text{(global intercept)}} \\\\\n", |
| 839 | + "&+ \\alpha[\\text{district}_i] && \\color{#FA8072}{\\text{(unit fixed effect)}} \\\\\n", |
| 840 | + "&+ \\beta[\\text{year}_i] && \\color{#FA8072}{\\text{(time fixed effect)}} \\\\\n", |
| 841 | + "&+ \\Delta \\cdot \\text{district}_i \\cdot \\text{post_treatment}_i && \\color{#FA8072}{\\text{(treatment indicator)}}\n", |
847 | 842 | "\\end{aligned}\n",
|
848 | 843 | "$$\n",
|
849 | 844 | "\n",
|
850 |
| - "Typically the TWFE model is presented in matrix form, so the equation above might look less familiar. It has been adapted for long format data. In particular, note that:\n", |
851 |
| - "* $\\alpha$ is a scalar intercept term.\n", |
852 |
| - "* $\\alpha[district_i]$ is a vector of fixed effects for each district. There are only 2 districts, so this is a vector of length 2. The $district_i$ indexes the element of $\\alpha$ that corresponds to the district of the $i^{th}$ observation.\n", |
853 |
| - "* $\\beta[year_i]$ is a vector of fixed effects for each year. There are 6 years, so this is a vector of length 6. The $year_i$ indexes the element of $\\beta$ that corresponds to the year of the $i^{th}$ observation.\n", |
854 |
| - "* $\\Delta$ is a scalar representing the treatment effect, which is the same as the coefficient of the `district:post_treatment` interaction term." |
| 845 | + "Here: \n", |
| 846 | + "* $\\gamma$ is a scalar global intercept. \n", |
| 847 | + "* $\\alpha[\\text{district}_i]$ is a vector of unit (district) fixed effects. With 2 districts, this is a vector of length 2. The index $\\text{district}_i$ selects the effect for the $i^{\\text{th}}$ observation. \n", |
| 848 | + "* $\\beta[\\text{year}_i]$ is a vector of time (year) fixed effects. With 6 years, this is a vector of length 6. The index $\\text{year}_i$ selects the effect for the $i^{\\text{th}}$ observation. \n", |
| 849 | + "* $\\Delta$ is a scalar treatment effect, corresponding to the coefficient on the `district:post_treatment` interaction term.\n" |
855 | 850 | ]
|
856 | 851 | },
|
857 | 852 | {
|
|
1049 | 1044 | ]
|
1050 | 1045 | },
|
1051 | 1046 | {
|
1052 |
| - "attachments": {}, |
1053 | 1047 | "cell_type": "markdown",
|
1054 | 1048 | "metadata": {},
|
1055 | 1049 | "source": [
|
|
1062 | 1056 | ],
|
1063 | 1057 | "metadata": {
|
1064 | 1058 | "kernelspec": {
|
1065 |
| - "display_name": "CausalPy", |
| 1059 | + "display_name": "Python 3 (ipykernel)", |
1066 | 1060 | "language": "python",
|
1067 | 1061 | "name": "python3"
|
1068 | 1062 | },
|
|
1076 | 1070 | "name": "python",
|
1077 | 1071 | "nbconvert_exporter": "python",
|
1078 | 1072 | "pygments_lexer": "ipython3",
|
1079 |
| - "version": "3.13.2" |
| 1073 | + "version": "3.9.18" |
1080 | 1074 | }
|
1081 | 1075 | },
|
1082 | 1076 | "nbformat": 4,
|
|
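As a rough illustration of the TWFE formulation described in the "Analysis 4" cell of this diff, the sketch below fits the same linear predictor (global intercept, district fixed effects, categorical year fixed effects, and a treatment indicator) by ordinary least squares with NumPy. It is not the notebook's CausalPy model. The panel layout (2 districts, 6 years, treatment starting at year index 3), the parameter values, and all variable names are assumptions made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical balanced panel: 2 districts x 6 years, treatment from year index 3.
n_districts, n_years, first_post_year = 2, 6, 3
district = np.repeat(np.arange(n_districts), n_years)  # 0 = control, 1 = treated
year = np.tile(np.arange(n_years), n_districts)
post_treatment = (year >= first_post_year).astype(float)
treated = district * post_treatment  # treatment indicator

# Made-up parameter values, for illustration only.
gamma = 1.0                                            # global intercept
alpha = np.array([0.0, 0.3])                           # unit (district) fixed effects
beta = np.array([0.0, 0.2, -0.1, 0.4, 0.1, 0.3])       # time (year) fixed effects
true_delta = 0.5                                       # treatment effect
noise = rng.normal(0.0, 0.1, size=district.size)
y = gamma + alpha[district] + beta[year] + true_delta * treated + noise

# Design matrix: intercept, district dummies, year dummies, treatment indicator.
# The first level of each factor is dropped so the intercept stays identified.
district_dummies = (district[:, None] == np.arange(1, n_districts)).astype(float)
year_dummies = (year[:, None] == np.arange(1, n_years)).astype(float)
X = np.column_stack([np.ones(y.size), district_dummies, year_dummies, treated])

# Least-squares fit; the last coefficient is the estimate of Delta.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
delta_hat = coef[-1]


def cell_mean(d, p):
    """Mean outcome for district d in the pre (p=0) or post (p=1) period."""
    return y[(district == d) & (post_treatment == p)].mean()


# Difference-in-differences of the four group-by-period means.
did_of_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

print(f"TWFE estimate of Delta: {delta_hat:.4f}")
print(f"DiD of cell means:      {did_of_means:.4f}")
```

In this particular balanced design, with two districts and a single shared treatment date, the least-squares coefficient on the treatment indicator coincides with the difference-in-differences of the four cell means, which makes for a quick sanity check. That agreement should not be expected in richer settings such as staggered treatment timing.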