[WIP] improvement suggestions for the label guide documentation (until sorting dimensions chapter) #1699

Merged
Changes from 1 commit
29 changes: 14 additions & 15 deletions doc/source/user_guide/label_guide.rst
@@ -20,27 +20,28 @@ Example: Default labelling
...: schools = az.load_arviz_data("centered_eight")
...: az.summary(schools)

ArviZ supports label based indexing.
ArviZ supports label based indexing powered by xarray.
Through label based indexing you can use labels to plot a subset of selected variables.

Example: Label based indexing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For a case where the coordinate values shown for the theta variable correspond to the `school` dimension
you can plot only the ``tau`` variable to better inspect its 1.03 :func:`~arviz.rhat` value,
only the ``theta`` variable to the ``Choate`` and ``St. Paul's`` coordinates, which have higher means, you can ploy only the ``theta`` variable by running the following command:
For a case where the coordinate values shown for the ``theta`` variable coordinate to the ``school`` dimension
you can indicate ArviZ to plot ``tau`` by including it in the ``var_names`` argument to inspect its 1.03 :func:`~arviz.rhat` value.
To inspect the ``theta`` values for the ``Choate`` and ``St. Paul's`` coordinates, you can include ``theta`` in ``var_names`` and use the ``coords`` argument to select only these two coordinate values.
You can generate this plot with the following command:

.. ipython:: python

@savefig label_guide_plot_trace.png
az.plot_trace(schools, var_names=["tau", "theta"], coords={"school": ["Choate", "St. Paul's"]}, compact=False);

With this you can now identify issues for the low ``tau`` values.
With this you can now identify issues for low ``tau`` values.

Example: Using the labeller argument
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To create a report on Deerfield, Hotchkiss and Lawrenceville schools for the probability of ``theta > 5``, use the labeller argument to customize labels.
To create a report on Deerfield, Hotchkiss and Lawrenceville schools for the probability of ``theta > 5`` and use the labeller argument to customize labels.
Member

This sentence now feels incomplete. It starts with "to create" and a description of what to create, but says nothing about how to create it.

A bit more context in case it helps. Here there are two goals: creating a report of theta > 5 and using the labeller to customize the labels.

One can be done without the other, but instead of showing how to use the labeller (which I think should be done in the example in the docstring, which is still WIP), I wanted to show how to use the labeller to solve a more specific, real task. When exploring the model ourselves, we generally won't care much about the labels; having the same names as in the code is fine. However, when generating a report to be published or presented, we'll probably want to take better care of the presentation and match the labels to the ones in the equations of the paper instead of the variables in the code. Hence the MapLabeller.

Contributor Author

Thank you for providing a bit of context to this matter. The way I see it, since this piece of text is inside of the example, we don't even need the first part of the sentence.

Unlike the default labels that show ``theta``, not $\theta$ (generated from ``$\theta$`` using $\LaTeX$), the labeller argument presents the label with proper math notation.

You can use :class:`~arviz.labels.MapLabeller` to rename the variable ``theta`` to ``$\theta$``, as shown in the following example:
@@ -54,26 +55,24 @@ You can use :class:`~arviz.labels.MapLabeller` to rename the variable ``theta``
@savefig label_guide_plot_posterior.png
In [1]: az.plot_posterior(schools, var_names="theta", coords=coords, labeller=labeller, ref_val=5);

Additional Resources
~~~~~~~~~~~~~~~~~~~~
.. seealso::

- For a list of labellers available in ArviZ, see the :ref:`the API reference page <labeller_api>`.
- For common customization classes, see the ``arviz.labels`` module.

Sorting labels
--------------

ArviZ allows labels to be sorted in two ways:

- Using the arguments passed to ArviZ plotting functions
- Sorting the underlying xarray Dataset
- Sorting the underlying :class:`xarray.Dataset`

The first option is more suitable for single time ordering whereas the second option is more suitable for sorting plots consistently.

.. note::

Both ways are limited.
Multidimensional variables can not be separated from each other.
Multidimensional variables can not be separated.
For example, it is possible to sort ``theta, mu,`` or ``tau`` in any order, and within ``theta`` to sort the schools in any order, but it is not possible to sort half of the schools, then ``mu`` and ``tau`` and then the rest of the schools.


@@ -104,7 +103,7 @@ Sorting variable names
Sorting coordinate values
.........................

To sort coordinate values by mean you have to locate the means of each school and then use the DataArray result to sort the coordinate values.
To sort coordinate values you have to define the order, store it, and use the result to sort the coordinate values.
Member

Suggested change
To sort coordinate values you have to define the order, store it, and use the result to sort the coordinate values.
To sort coordinate values you have to define the order, store it, and use the result to sort the coordinate values.
The order can be defined by performing some operations on our xarray objects (like it is shown in the example below)
or by manually creating a list with the desired order.
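
The "define the order, store it, use the result" steps described in this suggestion can be sketched in plain Python. This is only an illustration: ``school_means`` is a hypothetical stand-in for the result of an operation on the xarray objects, and its numbers are invented rather than computed from the ``centered_eight`` data. Only the idea of passing the stored list through ``coords`` comes from the guide itself.

```python
# Hypothetical per-school means. These numbers are invented for illustration
# and are NOT computed from the centered_eight posterior.
school_means = {
    "Choate": 6.5,
    "St. Paul's": 6.4,
    "Deerfield": 5.0,
    "Lawrenceville": 4.1,
    "Hotchkiss": 3.6,
}

# Step 1 and 2: define the order (ascending mean) and store it as a list.
sorted_schools = sorted(school_means, key=school_means.get)

# Alternative: define the order by hand instead of computing it.
manual_order = ["Deerfield", "Hotchkiss", "Lawrenceville"]

# Step 3: use the stored order, for example by passing this dict as the
# ``coords`` argument of an ArviZ plotting function.
coords = {"school": sorted_schools}
```

Either list can be used in the same way; the computed one just avoids typing the order by hand when it follows from the data.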


Example: Sorting the schools by mean
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -121,8 +120,8 @@ There are two ways of sorting:

.. tabbed:: ArviZ args

Sort the coordinate values to pass them as a `coords` argument and choose the order of the rows.
To manually sort the schools, `sorted_schools`, define sorted_schools as a list.
Sort the coordinate values to pass them as a `coords` argument and choose the order of the rows.
To manually sort the schools, `sorted_schools`, define sorted_schools as a list.

.. ipython::

@@ -131,7 +130,7 @@ To manually sort the schools, `sorted_schools`, define sorted_schools as a list.

.. tabbed:: xarray

You can use the :meth:`~xarray.Dataset.sortby` method to order our coordinate values directly at the source.
You can use the :meth:`~xarray.Dataset.sortby` method to order our coordinate values directly at the source.

.. ipython::
