Explore the data with continuous output and category input #540

Vu1992 · 2024-05-09T04:39:13Z

Hi,

Thanks for your great work. I have one question regarding the Explore the data feature. Is it possible to use the following code to explore a continuous output with a categorical input:

marginal = Marginal(names).explain_data(X_train, y_train, name='Train Data')
show(marginal)

When I try the above code, it returns a TypeError: Unable to do the formular for 'str'


paulbkoch · 2024-05-11T01:25:14Z

Hi @Vu1992 -- It should handle continuous output and category input. I don't see that error message in our repo or on the internet. Can you include a stack trace? Also, is the data public?

Vu1992 · 2024-05-13T01:45:44Z

Hi @paulbkoch ,

Thanks for your reply. Unfortunately the data is private, but I can show you what I'm trying to do. I have a dataframe df that holds my data as a table, and I do the following:
A=df[['BRANCH']] ; B=df[['Gross_Incurred']]; names=['BRANCH']
So basically A and B have the values shown in the image below.

Then I use your code for the data explorer:
marginal = Marginal(names).explain_data(A, B, name='Train Data'); show(marginal)
Then Python comes back with TypeError: unsupported operand type(s) for -: 'str' and 'str'

paulbkoch · 2024-05-14T19:25:59Z

I tried to replicate this with the following code:

import numpy as np
import pandas as pd
from interpret.data import Marginal
from interpret import show
names=['BRANCH']
A = pd.DataFrame()
A["BRANCH"] = pd.Series(np.array(['VC', 'VC', 'MS', 'VH'], dtype=np.str_))
B = pd.DataFrame()
B["Gross_Incurred"] = pd.Series(np.array([18000000.0, 36200000000.0, 0.0, -50000000.0], dtype=float))
marginal = Marginal(names).explain_data(A, B, name='Train Data'); show(marginal)

My example works though. Any idea what could be different?
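One possible difference, offered as a guess rather than a confirmed diagnosis: the "unsupported operand type(s) for -: 'str' and 'str'" message is what Python raises when two strings are subtracted, so the target column in the private data may have been loaded as text instead of floats. The sketch below uses made-up values (the string-typed Gross_Incurred column is an assumption, not the reporter's actual data) to show how to check the dtypes and convert the target with pandas before passing it to Marginal.

```python
import pandas as pd

# Hypothetical stand-in for the private data: the target column was read in as text.
df = pd.DataFrame({
    "BRANCH": ["VC", "VC", "MS", "VH"],
    "Gross_Incurred": ["18000000.0", "36200000000.0", "0.0", "-50000000.0"],
})

# An object (or string) dtype for Gross_Incurred means the values are text, not floats.
print(df.dtypes)

# Convert the target to numeric; errors="coerce" turns unparseable entries into NaN.
df["Gross_Incurred"] = pd.to_numeric(df["Gross_Incurred"], errors="coerce")

# Same slicing as in the report, now with a float target.
A = df[["BRANCH"]]
B = df[["Gross_Incurred"]]
print(B.dtypes)
```

If the second print shows float64 for Gross_Incurred, A and B have the same dtypes as in the replication above.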

Vu1992 · 2024-05-15T01:34:41Z

Thanks for your help.
I don't know what went wrong last time, but when I tried again it worked. However, the graph does not change when I switch the type to Categorical, even with your replication.
When I add a continuous variable, it shows like this:

But when I want to see the categorical variable, nothing changes.
