Robust Regression Techniques
Robust regression refers to methods that are less sensitive to outliers, heteroskedasticity, and violations of normality of residuals. These techniques aim to produce reliable estimates even when assumptions like homoskedasticity or normal error distribution are violated.
1. Regression with Robust Standard Errors
This is the simplest and most common robust technique. It does not change the regression coefficients but adjusts the standard errors so that statistical tests (t, F) remain approximately valid in large samples even if:
- Errors are not normally distributed
- There is heteroskedasticity (i.e., non-constant variance)
In Stata:
reg y x1 x2, vce(robust)
- vce(robust) adjusts the standard errors using the Huber-White sandwich estimator.
- Coefficients stay the same, but p-values and confidence intervals are corrected.
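To see that only the standard errors move, the two fits can be placed side by side. The sketch below assumes Stata's bundled auto dataset and the variables price, mpg, and weight purely for illustration; the stored-estimate names m_ols and m_robust are arbitrary.

```stata
* Illustration only: the bundled auto dataset stands in for your own data
sysuse auto, clear

* Conventional OLS standard errors
regress price mpg weight
estimates store m_ols

* Same model with Huber-White (sandwich) standard errors
regress price mpg weight, vce(robust)
estimates store m_robust

* Side by side: the coefficients match, the standard errors differ
estimates table m_ols m_robust, b(%9.3f) se(%9.3f)
```

In the resulting table, the first row of each pair (the coefficient) should be identical across columns, while the second row (the standard error) changes.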
2. Quantile Regression
While OLS estimates the mean effect of predictors on the dependent variable, quantile regression estimates effects at different percentiles (e.g., median, 25th, 75th). It’s useful when:
- The effect of predictors varies across the distribution
- The residuals are not symmetrically distributed
- The mean is not a good summary (e.g., skewed data)
In Stata:
qreg y x1 x2
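With no options, qreg fits the median. A minimal sketch of fitting other percentiles follows, again assuming the bundled auto dataset and the variables price, mpg, and weight as stand-ins; the seed value and the number of replications are arbitrary choices.

```stata
* Illustration only: the bundled auto dataset stands in for your own data
sysuse auto, clear

* Median regression (the default, same as quantile(.5))
qreg price mpg weight

* Lower and upper quartiles, fit one at a time
qreg price mpg weight, quantile(.25)
qreg price mpg weight, quantile(.75)

* All three quantiles in one model, with bootstrapped standard errors
set seed 12345
sqreg price mpg weight, quantiles(.25 .5 .75) reps(100)

* Test whether the effect of weight is the same at the 25th and 75th percentiles
test [q25]weight = [q75]weight
```

Fitting the quantiles jointly with sqreg is what makes the cross-quantile test possible, since separate qreg runs do not give the covariance between the two sets of coefficients.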
3. M-Estimation (Full Robust Regression)
This method minimizes a function that down-weights the influence of outliers on the regression. Unlike vce(robust), it adjusts both:
- The regression coefficients
- And the standard errors
It iteratively reweights observations with large residuals.
In Stata:
rreg y x1 x2
rreg = robust regression using Huber weights followed by biweights, great when you have influential outliers.
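One way to see the reweighting at work is to save the final weights and look at which observations were down-weighted most. The sketch below assumes the bundled auto dataset and the variables price, mpg, and weight for illustration; w is an arbitrary name for the new weight variable.

```stata
* Illustration only: the bundled auto dataset stands in for your own data
sysuse auto, clear

* Robust regression; genwt() saves each observation's final weight in w
rreg price mpg weight, genwt(w)

* Low weights flag the observations that were down-weighted most
sort w
list make price w in 1/5

* Plain OLS for comparison, to see how far the coefficients moved
regress price mpg weight
```

If the rreg and regress coefficients are close, outliers are probably not driving the OLS fit; a large gap is a cue to inspect the low-weight observations.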
4. Bootstrapping
Bootstrapping provides empirical standard errors and confidence intervals by resampling from the data. It’s powerful when the sampling distribution of the estimator is unknown or complex.
In Stata:
bootstrap, reps(1000): reg y x1 x2
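Because the resampling is random, results change from run to run unless a seed is set. A minimal reproducible sketch follows, assuming the bundled auto dataset and the variables price, mpg, and weight for illustration; the seed value is arbitrary.

```stata
* Illustration only: the bundled auto dataset stands in for your own data
sysuse auto, clear

* Fix the seed so the same resamples are drawn on every run
set seed 12345

* 1,000 resamples of the observations, refitting the model each time
bootstrap, reps(1000): regress price mpg weight

* Percentile and bias-corrected confidence intervals from the replications
estat bootstrap, percentile bc
```

The default bootstrap output reports normal-approximation intervals; the percentile and bias-corrected intervals from estat bootstrap rely less on that approximation.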
When to Use What?
Situation | Best Method |
---|---|
Heteroskedasticity | reg ..., vce(robust) |
Outliers or leverage points | rreg, or robust M-estimation |
Skewed distribution, medians | qreg |
Distribution-free inference | bootstrap |