Visualization API changes #265

alexander-held · 2021-08-20T16:18:11Z

This collects information regarding changes in the cabinetry visualization API, and is a follow-up to #251.

feat: make matplotlib core dependency and refactor visualization code #250 made matplotlib a core dependency and refactored the plotting code.
feat: return figures from visualizations #264 made functions in the visualize module return figures (or a list of dictionaries with figures).
feat: model prediction API and yield table changes #267 changed visualize.data_mc to take a model prediction object instead of a model, and added a new channels keyword argument.
feat: close single figures by default #271 fixed the duplicate display in notebooks of figures from functions returning a single figure (instead of a list of dicts); see also the comment from 2021-08-28 below.
feat: support custom colors for data/MC plots #399 added support for custom histogram colors for data/MC plots.

Outstanding items and open questions (including pieces from #381):

allow injecting axes into plotting functions #142 (aiming at v0.4)
- The natural target for this seems to be visualize.plot_model and visualize.plot_result, and those functions should then likely return artists. Calling these functions directly comes with a loss in convenience, e.g. the correlation matrix pruning threshold. Could consider factoring out the convenience functions? Handing axes to the visualize-level functions is more challenging, since several of these can return multiple figures (and the exact number is not easily known for visualize.templates).
Consider supporting callbacks as suggested in feature request: matplotlib visualize log scale #113 (comment).
Consider making the return of figures optional to avoid keeping too many figures in memory for visualize.templates (figures are kept around even with close_figure=True as long as a reference to them exists).
In addition to this it seems useful to not override custom rcParams set by users via the mpl.style.use calls in cabinetry but instead only update values that correspond to the matplotlib default. Then users could do something like
```
import matplotlib as mpl
mpl.rcParams['axes.prop_cycle'] = mpl.cycler(color=["salmon", "tan", "mediumseagreen"])
```
to get a custom color scheme. (from Histogram colors in stacks - user interface creation #381)
- this may actually not override everything as initially expected, see fix: add axis labels and binning to cabinetry config file iris-hep/analysis-grand-challenge#117 (comment)
Another idea: new setting style with default style="cabinetry" that will apply the mpl.style.use call, and the option style=None that will skip it. That allows users to set rcParams in any way they want. Some other styling operations like tick label design and such can probably also be factored from the code and put into a style sheet gathering everything. (from Histogram colors in stacks - user interface creation #381)
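The style setting from the last item could be sketched roughly as follows. This is only an illustration under assumptions: plot_sketch is a hypothetical function (not part of cabinetry), the built-in "default" style stands in for a "cabinetry" style sheet, and mpl.style.context is used here instead of mpl.style.use so that the change is temporary.

```python
import contextlib

import matplotlib as mpl

mpl.use("Agg")  # non-interactive backend so the sketch also runs outside notebooks
import matplotlib.pyplot as plt
import matplotlib.style


def plot_sketch(values, style="default"):
    """Hypothetical plotting function, not part of cabinetry.

    "default" stands in for a "cabinetry" style sheet; style=None skips the
    style sheet entirely so that user-defined rcParams stay in effect.
    """
    if style is None:
        context = contextlib.nullcontext()
    else:
        context = mpl.style.context(style)  # temporary, rcParams restored on exit
    with context:
        fig, ax = plt.subplots()
        ax.plot(values)
    plt.close(fig)  # the figure stays usable through the returned reference
    return fig


# user-defined color cycle, as in the snippet above
mpl.rcParams["axes.prop_cycle"] = mpl.cycler(color=["salmon", "tan", "mediumseagreen"])

# with style=None the user color cycle is picked up
user_color = plot_sketch([1, 2, 3], style=None).axes[0].lines[0].get_color()
# with a style sheet applied, the style sheet decides the color cycle
sheet_color = plot_sketch([1, 2, 3]).axes[0].lines[0].get_color()
```

With style=None the first line is drawn in the user-chosen "salmon", while the style sheet version ignores the user color cycle.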


alexander-held · 2021-08-28T12:39:37Z

As of 5ed199a, a function returning a single figure causes that figure to be rendered twice when the function is called as the last line in a notebook cell. I believe the reason is the following:

The matplotlib inline backend looks for any figures that pyplot knows about (plt.get_fignums()), renders them to png, Base64 encodes them, puts that into the notebook (which makes the figure show up) and then closes all figures (presumably plt.close("all")). This rendering will always happen for any figures that are still open, which is the default behavior in cabinetry as of this commit.
The return value of the last line in a cell will also be shown as the result of the cell, and that happens to be a figure in the case of functions like visualize.pulls which produce a single figure.

Functions producing multiple figures are not affected by this duplication, since the return value is a list of dicts and the figures it contains are not rendered as the cell result.
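The bookkeeping behind this can be checked outside of a notebook. The following is a minimal sketch using only matplotlib (no cabinetry calls): a figure created via pyplot is tracked until it is closed, and a closed figure can still be rendered as long as a reference to it exists.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no notebook needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
number = fig.number  # remember the pyplot figure number before closing

open_before = plt.get_fignums()  # pyplot still tracks the figure here
plt.close(fig)
open_after = plt.get_fignums()  # pyplot no longer knows about the figure

# the closed figure can still be rendered through the remaining reference
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

A figure that is closed before the end of the cell is therefore not picked up by the inline backend, but can still be shown when it is the return value of the last line.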

To solve duplication, there are the following options:

Make figure closing default for all functions returning a single figure. A figure can then be shown in the following ways:
- Figure shows up because it is the return value of the last line in the cell:
```
...
visualize.pulls(...)
```
- Figure is produced earlier and assigned to object, and that is referenced again:
```
fig = visualize.pulls(...)
...
fig
```
- Default closing is disabled, but rendering of return value is avoided. The advantage of this is that multiple figures can be rendered and this can happen anywhere in the cell.
```
visualize.pulls(..., close_figure=False);
```
  or
```
_ = visualize.pulls(..., close_figure=False)
```
The downside of this approach is that visualize.data_mc and visualize.templates would still not have figure closing enabled by default and thereby behave differently. The advantage is that the easiest use case of just calling the single-figure-producing functions without thinking about return values or optional arguments "just works" correctly, and the multi-figure functions also work (via a different method).
Make figure closing default for all functions, including multi-figure functions. Rendering of multi-figure functions could then be achieved via a small helper function:
```
from IPython.display import display

def display_helper(fig_list_dict):
    for fig_dict in fig_list_dict:
        display(fig_dict["figure"])
```
This could be called on the return values of visualize.data_mc and visualize.templates to show all figures at once, even if they have already been closed.
Return a class with _repr_html_ defined to manually handle things (see hist example). This is similar to the suggestion from feat: Notebook support for viz #163.
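The third option could look roughly like the sketch below. FigureCollection is a hypothetical name (not an existing cabinetry class), and the only assumption about the list of dicts is the "figure" key used in the helper above. The stand-in figures are created directly with matplotlib's Figure class so the example is self-contained.

```python
import base64
import io

from matplotlib.figure import Figure


class FigureCollection:
    """Hypothetical container that renders all contained figures in a notebook."""

    def __init__(self, fig_dicts):
        self.fig_dicts = fig_dicts  # list of dicts, each with a "figure" entry

    def _repr_html_(self):
        # render every figure to PNG and embed it as a Base64-encoded image tag
        tags = []
        for fig_dict in self.fig_dicts:
            buffer = io.BytesIO()
            fig_dict["figure"].savefig(buffer, format="png")
            encoded = base64.b64encode(buffer.getvalue()).decode("ascii")
            tags.append(f'<img src="data:image/png;base64,{encoded}"/>')
        return "\n".join(tags)


def make_stand_in_figure(values):
    # stand-in for a figure produced by a plotting function
    fig = Figure()
    ax = fig.subplots()
    ax.plot(values)
    return fig


collection = FigureCollection(
    [{"figure": make_stand_in_figure([1, 2, 3])}, {"figure": make_stand_in_figure([3, 2, 1])}]
)
html = collection._repr_html_()
```

In a notebook, an instance of such a class as the last line of a cell would show all figures once, regardless of whether pyplot still tracks them.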

A reasonable short-term solution seems to be to close figures from single-figure functions by default. There are multiple ways for them to still be rendered anyway. Figures from multi-figure functions can stay open by default, so all of them are rendered there as well. In the longer term a more unified solution could be useful. Feedback from analyzers using cabinetry in notebooks is very welcome!

alexander-held · 2023-07-17T20:48:44Z

Examples of editing a data/MC figure (experiment labels, axis labels, removing existing text on the figure and replacing it): https://gist.github.com/alexander-held/2ca63e4c4c3de2114bf8d903bf28bb4a

edit: now also includes an example for how to add a normalized signal (and re-do the legend)
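For readers without access to the gist, the general pattern of editing a returned figure can be sketched with plain matplotlib. This is not the content of the gist: the figure below is a stand-in built by hand, and the layout of an actual data/MC figure (number of axes, existing text objects) is an assumption that needs to be checked on the real figure.

```python
from matplotlib.figure import Figure

# stand-in for a figure returned by a plotting function (layout is an assumption)
fig = Figure()
ax = fig.subplots()
ax.bar([0, 1, 2], [5, 3, 1], label="Background")
ax.set_xlabel("bin")
ax.text(0.05, 0.9, "placeholder label", transform=ax.transAxes)

# edit the figure after the fact through its axes
ax = fig.axes[0]
ax.set_xlabel("observable [GeV]")  # replace the axis label
for text in list(ax.texts):  # copy the list since removal modifies it
    text.remove()  # remove existing text on the axes
ax.text(0.05, 0.9, "My experiment", transform=ax.transAxes)  # add new text
ax.legend()  # re-create the legend from the labeled artists

remaining_texts = [text.get_text() for text in ax.texts]
```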

alexander-held added enhancement New feature or request visualization Related to visualization labels Aug 20, 2021

alexander-held mentioned this issue Aug 20, 2021

Visualization API changes for v0.3 #251

Closed

alexander-held added the help wanted Extra attention is needed label Aug 28, 2021

alexander-held mentioned this issue Aug 29, 2021

feat: close single figures by default #271

Merged

This was referenced Apr 5, 2023

Histogram colors in stacks - user interface creation #381

Closed

feat: support custom colors for data/MC plots #399

Merged

alexander-held mentioned this issue Jul 18, 2023

Signal overlay in data/MC plots #422

Closed
