
rfc (api): revisit the option to return a single scalar when calling a metric #1259

Closed
glemaitre opened this issue Jan 29, 2025 · 7 comments

@glemaitre (Member) commented Jan 29, 2025

In EstimatorReport and CrossValidationReport, it can happen that a metric returns a dataframe containing a single scalar. Several discussions have surfaced the following trade-off:

  • Getting a one-cell dataframe can be annoying; one would often prefer the Python float directly
  • Having the output type depend on what is being returned is also annoying

A fully explicit solution would be to introduce a parameter on metrics that can return a single value, for instance as_type_if_scalar, taking "frame" or "float".

However, I'm seeking feedback on whether this is the way to go.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from skore import EstimatorReport

X, y = make_classification()
estimator = LogisticRegression()
report = EstimatorReport(estimator, X_train=X, y_train=y, X_test=X, y_test=y)
report.metrics.accuracy()
         Metric 	Accuracy (↗︎)
LogisticRegression 	0.97

vs.

report.metrics.accuracy().iloc[0, 0]
np.float64(0.97)
@glemaitre glemaitre changed the title api: revisit the option to return a single scalar when calling a metric rfc (api): revisit the option to return a single scalar when calling a metric Jan 29, 2025
@MarieSacksick (Contributor)

I'd be happy with a solution like this:

report.metrics.accuracy(as_float=True)  # False by default; returns a float
report.metrics.precision(as_float=True)  # raises an error if positive_label is not defined
report.metrics.precision(as_float=True, positive_label=True)  # returns a float

Another possible solution, to take the various dataframe libraries into account:

report.metrics.accuracy(render_as=float)  # returns a float
# render_as is "pandas" by default but can be one of ["float", "pandas", "polars"]
report.metrics.precision(as_float=True)  # raises an error if positive_label is not defined
report.metrics.precision(as_float=True, positive_label=True)  # returns a float

But, phew, it seems less fluid for essentially no additional value.
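For concreteness, here is a minimal toy sketch of what an as_float-style flag could look like. This is not the skore implementation; the function body and the frame-like dict stand in for the real metric and dataframe:

```python
# Hypothetical sketch, not the skore API: a metric that returns a
# one-cell, frame-like mapping by default, or a plain float on request.
def accuracy(y_true, y_pred, as_float=False):
    value = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    if as_float:
        return value  # plain Python float
    # stand-in for the one-cell dataframe (row: estimator, column: metric)
    return {"LogisticRegression": {"Accuracy": value}}

as_frame = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
as_scalar = accuracy([0, 1, 1, 0], [0, 1, 0, 0], as_float=True)  # 0.75
```

The flag keeps the default behaviour untouched while giving users an explicit opt-in to the scalar, which is the spirit of both proposals above.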

@auguste-probabl (Contributor)

Personally, I like the consistency of all metrics methods returning things of the same type, and that it makes it possible to combine them using pandas methods. Admittedly, we're limited by the pandas API here (for example, in PyTorch you can convert a single-element tensor to a float using tensor.item()).

@adrinjalali (Contributor)

As a user, I'd find it very odd if report.metrics.accuracy() returns a dataframe, for two reasons:

  • I might not like pandas, and I might want to handle my data in a different dataframe (like polars)
  • I expect a single number out of such a metric, the same way I expect a very complex output from roc, or something like what a fairlearn.MetricFrame would give me

So to me the natural thing here is to return the simplest form that each metric returns.

When it comes to showing several metrics in the same table, we can have a separate method that returns a table, such as:

skore.set_config(dataframe="polars")
report.metrics[["accuracy", "balanced_accuracy", "f1", "precision", "recall"]]
# or
report.metrics.as_frame(["accuracy", "f1", ...], dataframe="polars")

@glemaitre (Member, Author)

Summarizing some thoughts from an IRL discussion with @adrinjalali that could simplify the API and answer these needs. I'll open dedicated follow-up issues reflecting some of the discussions here:

People have expectations when calling:

report.metrics.subsequent_metric(...)

It should most probably be a plain Python object: a float for a scalar, or a dict when more information is needed, e.g. about the pos_label. In this thread, we are also convinced that the output type should be consistent. So here is a proposed API that looks like a middle ground:

For metrics, we should enforce the least ambiguous Python type, and that type should not change depending on the input. Let's look at two examples:

report.metrics.accuracy()

will return a float but

report.metrics.precision()

will return a dict. The number of entries in the dict depends on the parameters of precision: in the binary case, if pos_label=None we report both the positive and negative labels, while if it is specified we report a single value. In the multiclass case, we report the metric for all classes. But the type stays consistent.
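A toy sketch of this convention (illustrative only, not the actual skore code): precision always returns a dict, with one entry per reported label, regardless of whether the input is binary or multiclass and whether pos_label narrows it down:

```python
# Toy implementation of the proposed convention: the return type is
# always a dict, only the number of entries varies with the parameters.
def precision(y_true, y_pred, pos_label=None):
    labels = sorted(set(y_true)) if pos_label is None else [pos_label]
    result = {}
    for label in labels:
        # precision for `label` = true positives / predicted positives
        predicted = [t for t, p in zip(y_true, y_pred) if p == label]
        result[label] = (
            sum(t == label for t in predicted) / len(predicted)
            if predicted else 0.0
        )
    return result  # consistently a dict

both = precision([0, 1, 1, 0], [0, 1, 1, 1])              # two entries
only_pos = precision([0, 1, 1, 0], [0, 1, 1, 1], pos_label=1)  # one entry
```

Callers can then rely on dict access unconditionally instead of branching on the return type.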

When it comes to consolidating the metrics into a report, report.metrics.report_metrics() makes a lot of sense now. It saves you from having to create a dataframe for each individual metric; we do the concatenation for you.
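The consolidation step could be sketched as follows. report_metrics is skore's real method name, but the flattening logic below is purely illustrative:

```python
# Illustrative sketch: merge scalar results (float) and per-class
# results (dict) into one flat row, ready to become a dataframe.
def report_metrics(metric_results):
    row = {}
    for name, value in metric_results.items():
        if isinstance(value, dict):  # per-class metrics like precision
            for label, per_class in value.items():
                row[f"{name}[{label}]"] = per_class
        else:  # scalar metrics like accuracy
            row[name] = value
    return row

row = report_metrics({"accuracy": 0.97, "precision": {0: 1.0, 1: 0.95}})
# → {'accuracy': 0.97, 'precision[0]': 1.0, 'precision[1]': 0.95}
```

This keeps the individual metric methods returning simple Python types while the tabular view is built in one place.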

In the end, we no longer need a new parameter, which is what I like about this. Most probably not perfect for everyone, but it answers the needs of many.

@sylvaincom (Contributor) commented Feb 4, 2025

Thanks!

+1: the pandas dataframe with all metrics accessible via report.metrics.report_metrics(), but for specific metrics, what you suggested above.

@MarieSacksick (Contributor)

What's the usual process for an RFC issue? Can we close it now that we have #1275 to implement the decision?

@auguste-probabl (Contributor)

Yep, the Definition of Done for this is that a decision was made.
