When to use which panel data model?
Method | Main Use Case | Assumptions | Handles Time-Invariant Variables | Endogeneity Control | When to Use |
---|---|---|---|---|---|
Pooled OLS | Treats panel data as pure cross-sectional data | Assumes no unobserved heterogeneity (units differ only through the included variables) | Yes | None | When you are confident there is no individual-specific effect, which is rare in real-world panel data |
First Difference (FD) | Controls for unobserved time-invariant heterogeneity by differencing consecutive periods | Assumes strict exogeneity; differencing removes the fixed effects | No | Yes (for time-invariant factors) | When changes over time matter more than levels and the data has few time periods (with exactly two periods, FD and FE give identical estimates) |
Fixed Effects (FE) | Controls for unobserved, time-invariant characteristics unique to each unit | Allows individual effects to be correlated with regressors; assumes strict exogeneity of the regressors conditional on those effects | No | Yes (for time-invariant factors) | When omitted variables are fixed over time and correlated with regressors |
Random Effects (RE) | Treats individual effects as random draws and uses both within-unit and between-unit variation | Unobserved effects must be uncorrelated with explanatory variables | Yes | No | When you want to include time-invariant variables and believe the individual effects are uncorrelated with the regressors |
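
To make the table concrete, here is a minimal simulation sketch using only NumPy. It is an illustration under stated assumptions, not a recommended workflow: the data-generating process, the sample sizes, the true coefficient `beta = 2.0`, and the helper `slope` are all hypothetical choices made for this example. The unit effect `a` is built to be correlated with the regressor `x`, which is the situation where the table recommends FE or FD over Pooled OLS.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods, beta = 500, 5, 2.0  # assumed sizes and true slope

# Time-invariant unit effect a_i, deliberately correlated with x_it
a = rng.normal(size=(n_units, 1))
x = 0.8 * a + rng.normal(size=(n_units, n_periods))
y = beta * x + a + rng.normal(scale=0.5, size=(n_units, n_periods))


def slope(x, y):
    """OLS slope of y on x with an intercept (single regressor)."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())


# Pooled OLS: ignores a_i, so the omitted effect biases the slope
pooled = slope(x, y)

# Fixed effects (within estimator): subtract each unit's time average
fe = slope(x - x.mean(axis=1, keepdims=True),
           y - y.mean(axis=1, keepdims=True))

# First difference: subtract the previous period, which also removes a_i
fd = slope(np.diff(x, axis=1), np.diff(y, axis=1))

print(f"pooled={pooled:.3f}  fe={fe:.3f}  fd={fd:.3f}")
```

In this setup the Pooled OLS slope is pushed above the true value by roughly cov(x, a) / var(x) = 0.8 / 1.64 ≈ 0.49, while the FE and FD estimates stay close to 2.0 because both transformations remove the time-invariant effect. The same transformations would also wipe out any time-invariant regressor, which is why the table marks FE and FD as unable to handle such variables.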
Read further on when to use GMM.