Support for multiple SE adjustments #799

Open
albepal opened this issue Jan 27, 2025 · 3 comments
Labels
duplicate This issue or pull request already exists

Comments

@albepal

albepal commented Jan 27, 2025

Hello,
I am trying to perform a regression that applies both cluster-robust standard errors (by specifying a clustering variable) and Newey-West HAC standard errors, with an optional bandwidth parameter to account for autocorrelation up to a specified number of lags in the error term.

In Stata's reghdfe command, I would use the following options:
cluster(var) bw(lags)
where var represents my clustering variable, and lags is the number of lags I want to include to account for autocorrelation.

I am aware that R's fixest package allows for a similar specification using the option:
NW(lags) ~ vars
I would like to know if pyfixest offers a similar option to combine clustering and HAC standard errors.

Thank you for your help!

@s3alfisc
Member

s3alfisc commented Feb 1, 2025

Hi @albepal, sorry for not responding quicker, I was out sick most of the week!

This can definitely be added; it is already in the backlog as #675. I think implementing a basic HAC estimator should be easy, though some variants might be a little trickier (different ways of selecting the bandwidth, etc.; fixest falls back on the sandwich package for this).

Just to get some context (I haven't been in touch with HAC estimators since having to derive limit theorems for them in my econometrics grad classes 😅 ) - in applied econ, how, when, and why would you use them instead of clustered errors?

Also posting https://www.jstatsoft.org/article/view/v082i03 as it seems to be an excellent resource.

@s3alfisc s3alfisc added the duplicate This issue or pull request already exists label Feb 1, 2025
@albepal
Author

albepal commented Feb 3, 2025

Hi @s3alfisc, thanks a lot for your answer!

Usually you need HAC for panel time series. HAC accounts for serial autocorrelation (typically when you have a time dimension and your dependent variable at t might be correlated with its value at t-1). Clustering the SEs only accounts for within-group correlation.
In my case I am using them on top of clustered SEs: I have different groups in my panel, so I want to account for heteroskedasticity within each group, but since I observe these groups at different points in time I also want to account for serial autocorrelation.
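
To make the idea concrete, here is a minimal NumPy sketch of a panel Newey-West sandwich estimator with Bartlett weights. It is only an illustration of the technique, not the implementation used by reghdfe, fixest, or pyfixest: the function name `newey_west_panel_vcov` is made up, no small-sample correction is applied, and it assumes the rows are sorted by group and then by time with no gaps.

```python
import numpy as np

def newey_west_panel_vcov(X, resid, groups, lags):
    """Sandwich covariance with Bartlett-weighted score autocovariances
    computed within each group (hypothetical helper, no df correction).
    Assumes rows are sorted by group, then by time, without gaps."""
    X = np.asarray(X, dtype=float)
    scores = X * resid[:, None]            # per-observation scores x_it * u_it
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        s = scores[groups == g]
        meat += s.T @ s                    # lag 0: heteroskedasticity-robust part
        for lag in range(1, lags + 1):
            w = 1.0 - lag / (lags + 1.0)   # Bartlett kernel weight
            gamma = s[lag:].T @ s[:-lag]   # lag-l cross-product of scores
            meat += w * (gamma + gamma.T)
    bread = np.linalg.inv(X.T @ X)
    return bread @ meat @ bread

# Simulated toy panel: 10 groups observed for 25 periods each
rng = np.random.default_rng(0)
n_groups, n_periods = 10, 25
groups = np.repeat(np.arange(n_groups), n_periods)
X = np.column_stack([np.ones(groups.size), rng.normal(size=groups.size)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=groups.size)

beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
se_hac = np.sqrt(np.diag(newey_west_panel_vcov(X, resid, groups, lags=2)))
```

With `lags=0` the sketch reduces to the plain heteroskedasticity-robust (HC0) covariance; each additional lag adds down-weighted cross-products between a group's scores at t and t-lag.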

I think this could also be done with a post-estimation command, as in statsmodels' OLS:
https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.get_robustcov_results.html

They allow for different estimators such as HAC, hac-panel (Newey-West), and hac-groupsum (Driscoll and Kraay). Maybe you could opt for something like that?

@s3alfisc
Member

I've now done some reading, and both time-series and panel Newey-West as well as Driscoll-Kraay (DK) should not be too hard to implement! I have started with Newey-West and hope to be able to show you a first PR by Sunday =)

@s3alfisc s3alfisc self-assigned this Feb 12, 2025

2 participants