Mimic fixest::i() by relying on formulaic stateful transforms #782

Open
s3alfisc opened this issue Jan 6, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@s3alfisc
Member

s3alfisc commented Jan 6, 2025

Prompted by matthewwardrop/formulaic#238, implement a proper i() operator and replace the ugly string parsing.

Easily done via formulaic stateful transforms:

import numpy as np
import pandas as pd
from formulaic.transforms import stateful_transform
from formulaic.transforms.contrasts import C, TreatmentContrasts
from formulaic import model_matrix
import pyfixest as pf

data = pf.get_data()

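@stateful_transform
def i(factor_var, ref=None, _state=None, _metadata=None, _spec=None):
    # Encode factor_var with treatment contrasts (reference level `ref`)
    # and cache the result in the transform state on the first call.
    if "i" not in _state:
        _state["i"] = C(data=factor_var, contrasts=TreatmentContrasts(ref))

    return _state["i"]

model_matrix("i(f1, ref = 1.0)", data=data).head()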
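
For context on what the state dictionary is for: a stateful transform is meant to remember what it learned from the data it first saw (here, the factor levels), so that new data is encoded into the same columns. Below is a minimal pure-Python sketch of that idea, independent of formulaic. The `encode_factor` helper and the `"levels"` state key are hypothetical names for illustration, not part of formulaic's API.

```python
def encode_factor(values, ref=None, state=None):
    """Treatment-code `values`, remembering the levels seen on the first call."""
    state = {} if state is None else state
    if "levels" not in state:
        # First call: learn the sorted unique levels and drop the reference.
        levels = sorted(set(values))
        if ref is None:
            ref = levels[0]
        state["levels"] = [lvl for lvl in levels if lvl != ref]
    # Every call: build one 0/1 column per remembered level.
    return {
        lvl: [1 if v == lvl else 0 for v in values]
        for lvl in state["levels"]
    }


state = {}
train = encode_factor([1.0, 2.0, 3.0, 2.0], ref=1.0, state=state)
# New data that lacks level 3.0 still gets the same set of columns.
new = encode_factor([2.0, 1.0], state=state)
print(list(train), list(new))
```

The design point of the sketch is that the state holds the learned levels rather than the encoded output, so re-applying the transform to new rows yields columns that line up with the original ones.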

Challenge: how do we add a second variable that can be interacted with factor_var?

I.e., an API along these lines:

@stateful_transform
def i(factor_var, var=None, ref=None, ref2=None, _state=None, _metadata=None, _spec=None):
    if var is None:
        if "i" not in _state:
            _state["i"] = C(data=factor_var, contrasts=TreatmentContrasts(ref))
    else:
        if "i" not in _state:
            # This does not work yet: need to find where the `:` interaction
            # is implemented in formulaic. Intended result (pseudocode):
            #   C(factor_var, TreatmentContrasts(ref)) : C(var, TreatmentContrasts(ref2))
            raise NotImplementedError("interaction of factor_var with var")

    return _state["i"]
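
To make the target concrete, here is a sketch of what the two-variable form might compute, written in plain Python rather than formulaic: one column per non-reference level of factor_var, equal to var where the level matches and 0 elsewhere. The `interact_factor` helper and the `level:var` column naming are assumptions for illustration only. The sketch covers a numeric var, ignores ref2, and keeps all levels when ref is None.

```python
def interact_factor(factor_values, var_values, ref=None, var_name="var"):
    """Interact each non-reference level of a factor with a numeric variable."""
    if len(factor_values) != len(var_values):
        raise ValueError("factor_values and var_values must have equal length")

    levels = sorted(set(factor_values))
    if ref is not None:
        if ref not in levels:
            raise ValueError(f"reference level {ref!r} not found in factor")
        # Drop the reference level so it serves as the baseline.
        levels = [lvl for lvl in levels if lvl != ref]

    # One column per remaining level: var where the level matches, else 0.
    return {
        f"{lvl}:{var_name}": [
            x if f == lvl else 0.0
            for f, x in zip(factor_values, var_values)
        ]
        for lvl in levels
    }


f1 = [1, 2, 3, 2]
x = [0.5, 1.5, 2.5, 3.5]
cols = interact_factor(f1, x, ref=1, var_name="x")
print(cols)
```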

Maybe too ambitious (but certainly useful for pooling DiD time periods, e.g. months into years): bin and bin2 arguments to combine multiple fixed effects levels.
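
The pooling idea could be sketched as a relabelling step that runs before the factor is encoded. The `pool_levels` helper and the dict-of-lists format for the bins are hypothetical, modelled on the months-to-years example; they are not an existing pyfixest or formulaic API.

```python
def pool_levels(values, bins):
    """Relabel factor levels according to {new_label: [old levels]}."""
    # Invert the mapping into {old level: new label}.
    lookup = {}
    for new_label, old_levels in bins.items():
        for old in old_levels:
            if old in lookup:
                raise ValueError(f"level {old!r} is assigned to more than one bin")
            lookup[old] = new_label
    # Levels not mentioned in any bin keep their original label.
    return [lookup.get(v, v) for v in values]


months = ["2020-11", "2020-12", "2021-01", "2021-02"]
bins = {"2020": ["2020-11", "2020-12"], "2021": ["2021-01", "2021-02"]}
pooled = pool_levels(months, bins)
print(pooled)
```

Inside i(), such a step would presumably be applied to factor_var (with bin) and to var (with bin2) before the contrasts are built.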

@s3alfisc s3alfisc added the enhancement New feature or request label Jan 6, 2025