How to get word importance #510

Open
nochimake opened this issue Feb 18, 2024 · 1 comment

Comments

@nochimake

I have a text sentiment polarity prediction model, roughly structured as RoBERTa + CNN. Now, I want to use InterpretML to explain its prediction results. My code is as follows:

from interpret.glassbox import ExplainableBoostingClassifier
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

def analysis_interpret(target_name: str, text_list: list, sentiment_list: list):
    ebm = ExplainableBoostingClassifier()
    # DataGenerator is my own text-processing class (defined elsewhere);
    # it runs RoBERTa's tokenizer over the texts to produce input_ids.
    data_generator = DataGenerator(text_list, sentiment_list)
    # Flatten each row of token IDs and pad to a common length so that
    # every column can serve as one feature for the EBM.
    X_train = [np.ravel(arr) for arr in data_generator.input_ids]
    X_train = pad_sequences(X_train)  # already returns a 2-D ndarray
    y_train = sentiment_list
    ebm.fit(X_train, y_train)

    ebm_local = ebm.explain_local(X_train, y_train)
Here, DataGenerator is my model's text-processing class; for now I am using RoBERTa's tokenizer to map each text to the token IDs the model expects, and y_train holds the labels my model predicts. After the statement ebm_local = ebm.explain_local(X_train, y_train), how can I obtain the importance of each word? I have seen people use the ebm_local.get_local_importance_dict() method, but I can't find that method in version 0.5.1.
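A minimal sketch of one way to read per-token scores back out of the local explanation, assuming interpret 0.5.x exposes each row through ebm_local.data(i) as a dict with 'names' and 'scores' keys, and using a stock roberta-base tokenizer as a stand-in for whatever tokenizer DataGenerator wraps (both are assumptions, not confirmed against this exact setup):

from transformers import RobertaTokenizer

# Assumption: roberta-base matches the tokenizer used inside DataGenerator.
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

def local_word_importance(ebm_local, X_train, row_index):
    # data(i) is assumed to return one row's local explanation as a dict;
    # 'scores' holds the additive contribution of each term to the prediction.
    row = ebm_local.data(row_index)
    token_ids = X_train[row_index]
    tokens = tokenizer.convert_ids_to_tokens([int(t) for t in token_ids])
    # EBMs list single-feature terms before interaction terms, so keep only
    # the first len(tokens) scores for a per-position view of the row.
    scores = row["scores"][:len(tokens)]
    return list(zip(tokens, scores))

for token, score in local_word_importance(ebm_local, X_train, 0):
    print(token, score)

Worth noting: the EBM trained this way treats each padded position as an independent feature over token IDs, so these values are per-position contributions rather than word importances in the usual attribution sense.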

@paulbkoch
Collaborator
