Support for multiple SE adjustments #799

albepal · 2025-01-27T16:02:24Z

Hello,
I am trying to perform a regression that applies both cluster-robust standard errors (by specifying a clustering variable) and Newey-West HAC standard errors, with an optional bandwidth parameter to account for autocorrelation up to a specified number of lags in the error term.

In Stata's reghdfe command, I would use the following options:
cluster(var) bw(lags)
where var represents my clustering variable, and lags is the number of lags I want to include to account for autocorrelation.

I am aware that R's fixest package allows for a similar specification using the option:
NW(lags) ~ vars
I would like to know if pyfixest offers a similar option to combine clustering and HAC standard errors.

Thank you for your help!


s3alfisc · 2025-02-01T10:42:10Z

Hi @albepal, sorry for not responding quicker, I was out sick most of the week!

This can definitely be added, it's even already in the backlog #675. I think implementing a basic HAC estimator should be easy, though I think there are some variants of it that might be a little trickier (different ways of bandwidth selection, etc.; fixest reverts to the sandwich package for this).

Just to get some context (I haven't been in touch with HAC estimators since having to derive limit theorems for them in my econometrics grad classes 😅 ) - in applied econ, how / when / why would you use them instead of clustered errors?

Also posting https://www.jstatsoft.org/article/view/v082i03 as it seems to be an excellent resource.

albepal · 2025-02-03T09:59:27Z

Hi @s3alfisc thanks a lot for your answer!

Usually you need to use HAC for panel time series. HAC accounts for serial autocorrelation (so typically when you have a time dimension and your dependent variable at t might be correlated with its value at t-1). Clustered SEs only account for within-group correlation.
In my case I am using them on top of clustered SEs: I have different groups in my panel, so I want to account for heteroskedasticity within each group, but since I observe these groups at different points in time, I also want to account for serial autocorrelation.
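
To make the idea concrete, here is a minimal NumPy sketch of a panel Newey-West covariance: the scores are summed within each group, with Bartlett-weighted autocovariances up to `maxlags` lags, and never across groups. The function name `panel_newey_west_cov` and the simulated data are hypothetical, made up for illustration. This is not pyfixest's, reghdfe's, or fixest's implementation, and it applies no small-sample correction. It assumes rows are sorted by group and then by time, with no gaps in the time index.

```python
import numpy as np


def panel_newey_west_cov(X, resid, groups, maxlags):
    """Sketch of a panel Newey-West covariance for OLS coefficients.

    Assumes rows are sorted by group and, within each group, by time,
    with no gaps. No small-sample correction is applied.
    """
    X = np.asarray(X, dtype=float)
    resid = np.asarray(resid, dtype=float)
    groups = np.asarray(groups)
    scores = X * resid[:, None]  # per-observation scores x_it * u_it
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        s = scores[groups == g]  # scores of one panel unit, in time order
        meat += s.T @ s  # lag 0: heteroskedasticity-robust part
        for lag in range(1, min(maxlags, len(s) - 1) + 1):
            w = 1.0 - lag / (maxlags + 1.0)  # Bartlett kernel weight
            gamma = s[lag:].T @ s[:-lag]  # within-group autocovariance at this lag
            meat += w * (gamma + gamma.T)
    bread = np.linalg.inv(X.T @ X)
    return bread @ meat @ bread  # sandwich: bread * meat * bread


# Simulated balanced panel (hypothetical data): 20 groups, 15 periods each.
rng = np.random.default_rng(0)
n_groups, n_periods = 20, 15
groups = np.repeat(np.arange(n_groups), n_periods)
x = rng.normal(size=n_groups * n_periods)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 0.5 * x + rng.normal(size=x.size)

beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

cov_nw = panel_newey_west_cov(X, resid, groups, maxlags=2)
se_nw = np.sqrt(np.diag(cov_nw))
print(se_nw)
```

With `maxlags=0` the inner loop is skipped and the result reduces to the plain heteroskedasticity-robust (HC0) sandwich, which is a quick sanity check on the sketch.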

I think this could be done also with a post-estimation command, like in statsmodels' OLS:
https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.get_robustcov_results.html

They allow for different estimators like HAC, hac-panel (Newey-West), hac-groupsum (Driscoll and Kraay). Maybe you could opt for something like that?

s3alfisc · 2025-02-12T21:40:24Z

I've now done some reading, and both time-series & panel Newey-West as well as DK should not be too hard to implement! I have started with Newey-West and hope I will be able to show you a first PR by Sunday =)

s3alfisc added the duplicate label Feb 1, 2025

s3alfisc self-assigned this Feb 12, 2025
