Handle pandas categorical types for categorical columns in _causal_analysis.py #602

Merged: 6 commits, Jun 13, 2022

@gaugup (Contributor) commented Apr 5, 2022

If a treatment column is explicitly given the pandas categorical type, the `CausalAnalysis` class fails with the following error:

```
~\AppData\Local\Continuum\miniconda3\envs\nhs-hips\lib\site-packages\econml\solutions\causal_analysis\_causal_analysis.py in individualized_policy(self, Xtest, feature_index, n_rows, treatment_costs, alpha)
   1714                 all_costs = np.array([0] + [treatment_costs] * (len(treatment_arr) - 1))
   1715                 # construct index of current treatment
-> 1716                 current_ind = (current_treatment.reshape(-1, 1) ==
   1717                                treatment_arr.reshape(1, -1)) @ np.arange(len(treatment_arr))
   1718                 current_cost = all_costs[current_ind]

~\AppData\Local\Continuum\miniconda3\envs\nhs-hips\lib\site-packages\pandas\core\ops\common.py in new_method(self, other)
     67         other = item_from_zerodim(other)
     68 
---> 69         return method(self, other)
     70 
     71     return new_method

~\AppData\Local\Continuum\miniconda3\envs\nhs-hips\lib\site-packages\pandas\core\arrays\categorical.py in func(self, other)
    131         if is_list_like(other) and len(other) != len(self) and not hashable:
    132             # in hashable case we may have a tuple that is itself a category
--> 133             raise ValueError("Lengths must match.")
    134 
    135         if not self.ordered:
```
The solution is to check whether the categorical column is of type `pd.core.arrays.categorical.Categorical` and, if so, extract the underlying NumPy array using its `to_numpy()` method.
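
The check described above can be sketched roughly as follows. This is a minimal illustration rather than the code from this PR: the helper name `to_plain_array` and the toy data are assumptions made for the example, and only the index expression is taken from the traceback.

```python
import numpy as np
import pandas as pd


def to_plain_array(values):
    """Hypothetical helper: unwrap a pandas Categorical into a NumPy array."""
    # pd.Categorical is the same class as pd.core.arrays.categorical.Categorical
    if isinstance(values, pd.Categorical):
        return values.to_numpy()
    return np.asarray(values)


# Toy treatment column with an explicit categorical dtype (illustrative data)
df = pd.DataFrame({"treatment": pd.Categorical([0, 1, 2, 0])})

# For a categorical column, .values is a pandas Categorical, not an ndarray
current_treatment = to_plain_array(df["treatment"].values)
treatment_arr = np.array([0, 1, 2])

# Same index construction as in the traceback, now on plain NumPy arrays:
# broadcast-compare each row's treatment against every treatment level,
# then turn the one-hot match into a positional index
current_ind = (current_treatment.reshape(-1, 1) ==
               treatment_arr.reshape(1, -1)) @ np.arange(len(treatment_arr))
```

Converting up front means the `==` comparison is handled by NumPy broadcasting instead of the pandas categorical comparison, which is what raises `ValueError("Lengths must match.")` in the traceback.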

@gaugup changed the title from "Handle pandas categorical types for categorical columns in _causal_analsis.py" to "Handle pandas categorical types for categorical columns in _causal_analysis.py" on Apr 5, 2022

@kbattocchi (Collaborator)

Please add a test that only passes with the new code.

@kbattocchi (Collaborator) left a comment

Please add a test that ensures that we don't regress this.

@kbattocchi (Collaborator) left a comment

I've added a test that verifies that the change fixes the behavior.
