How does ROCAUC work in score_array()? #137

janosh · 2022-05-12T11:09:30Z

Seems like there's something wrong with score_array() in the classification case.

Lines 83 to 123 in c3b910e

    
def score_array(true_array, pred_array, task_type):
    """
    Score an array according to multiple metrics.

    Args:
        true_array (list or np.array): The ground truth array
        pred_array (list or np.array): The predicted (test) array
        task_type (str): Either regression or classification.

    Returns:
        (dict): dictionary of the scores, according to all defined
            metrics.
    """
    computed = {}
    if task_type == REG_KEY:
        metrics = REG_METRICS
    elif task_type == CLF_KEY:
        metrics = CLF_METRICS
    else:
        raise ValueError(
            f"'task_type' must be on of {[REG_KEY, CLF_KEY]}, not '{task_type}'"
        )

    for metric in metrics:
        mfunc = METRIC_MAP[metric]

        if metric == "rocauc":
            # Both arrays must be in probability form
            # if pred. array is given in probabilities
            if isinstance(pred_array[0], float):
                true_array = homogenize_clf_array(true_array, to_probs=True)

        # Other clf metrics always be converted to labels
        elif metric in CLF_METRICS:
            if isinstance(pred_array[0], float):
                pred_array = homogenize_clf_array(pred_array, to_labels=True)

        computed[metric] = mfunc(true_array, pred_array)

    return computed

accuracy comes before rocauc in CLF_METRICS:

CLF_METRICS = ["accuracy", "balanced_accuracy", "f1", "rocauc"]

That means this code will convert the predictions to labels:

# Other clf metrics always be converted to labels
elif metric in CLF_METRICS:
    if isinstance(pred_array[0], float):
        pred_array = homogenize_clf_array(pred_array, to_labels=True)

in which case afterwards

if metric == "rocauc":
    # Both arrays must be in probability form
    # if pred. array is given in probabilities
    if isinstance(pred_array[0], float):
        true_array = homogenize_clf_array(true_array, to_probs=True)

will never be true, because pred_array has already been overwritten with labels by the time the loop reaches rocauc. So you'd be computing ROCAUC from true labels vs. predicted labels? Maybe I'm missing something?
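To make the concern concrete, here is a minimal standalone sketch in plain Python that does not import matbench. `to_labels` is a hypothetical stand-in for `homogenize_clf_array(..., to_labels=True)`; its 0.5 threshold and boolean output are assumptions. `roc_auc` is a simple pairwise implementation written for this example, not the function matbench maps to "rocauc". The loop only mimics the rebinding pattern of the snippet above.

```python
def roc_auc(y_true, y_score):
    """Pairwise (Mann-Whitney) ROC AUC; a tie between a positive and a negative counts 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t]
    neg = [s for t, s in zip(y_true, y_score) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def to_labels(probs, threshold=0.5):
    """Hypothetical stand-in for homogenize_clf_array(..., to_labels=True)."""
    return [p > threshold for p in probs]


y_true = [True, True, True, False, False, False]
y_prob = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

# Mimic the loop: "accuracy" runs first and rebinds the predictions to labels.
pred = list(y_prob)
still_float_at_rocauc = None
for metric in ["accuracy", "rocauc"]:
    if metric == "rocauc":
        # By now pred holds labels, so the float check fails.
        still_float_at_rocauc = isinstance(pred[0], float)
    elif isinstance(pred[0], float):
        pred = to_labels(pred)  # this rebinding persists into later iterations

# Ranking information is lost once scores are thresholded.
auc_from_probs = roc_auc(y_true, y_prob)
auc_from_labels = roc_auc(y_true, pred)

print(still_float_at_rocauc, auc_from_probs, auc_from_labels)
```

On this toy data the two AUC values differ, which is the symptom described above: thresholding collapses the score ranking to two levels before the ROC curve is computed.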


ardunn · 2022-05-20T00:32:04Z

@janosh I think you are correct. I will fix this ASAP
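For readers following along, one possible shape for such a fix is to convert into per-metric local variables instead of overwriting the inputs. This is only a sketch under assumptions, not the change made in any commit referenced in this thread: `score_array_sketch`, `accuracy`, `roc_auc`, and `to_labels` are hypothetical names defined here for illustration, and the true array is assumed to already be boolean, so the `to_probs` conversion of the original is left out.

```python
def accuracy(y_true, y_pred):
    """Fraction of matching labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, y_score):
    """Pairwise (Mann-Whitney) ROC AUC; a tie between a positive and a negative counts 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t]
    neg = [s for t, s in zip(y_true, y_score) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def to_labels(probs, threshold=0.5):
    """Hypothetical stand-in for homogenize_clf_array(..., to_labels=True)."""
    return [p > threshold for p in probs]


def score_array_sketch(true_array, pred_array, metrics, metric_map):
    """Score with per-metric copies so no conversion leaks into later metrics."""
    computed = {}
    for metric in metrics:
        # Start every iteration from the untouched inputs.
        y_true, y_pred = true_array, pred_array
        if metric != "rocauc" and isinstance(y_pred[0], float):
            y_pred = to_labels(y_pred)  # local to this iteration only
        computed[metric] = metric_map[metric](y_true, y_pred)
    return computed


y_true = [True, True, True, False, False, False]
y_prob = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

scores = score_array_sketch(
    y_true,
    y_prob,
    metrics=["accuracy", "rocauc"],
    metric_map={"accuracy": accuracy, "rocauc": roc_auc},
)
print(scores)
```

With this structure the order of entries in the metrics list no longer matters, because "rocauc" always receives the original probabilities.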

ardunn added the "high priority" and "code" (anything having to do with matbench python package code) labels Jul 27, 2022

ardunn self-assigned this Jul 27, 2022

ml-evs mentioned this issue Sep 10, 2022

score_array computes roc-auc values on discretized predictions #181

Open

robinruff added a commit to robinruff/matbench that referenced this issue Mar 14, 2023

fix materialsproject#137

88f467c

robinruff mentioned this issue Mar 14, 2023

fix #137 and #181 (faulty ROC AUC scores) #245

Open
