[WIP] improvement suggestions for the label guide documentation (until sorting dimensions chapter) #1699
Merged: OriolAbril merged 6 commits into arviz-devs:main from Eva-Lotte:label-guide-documentation-improvement-suggestions on May 27, 2021
Changes from 2 commits
Commits
5fae4c8
documentation improvement suggestions for the label guide
01fadad
Addressed feedback
549b643
Update doc/source/user_guide/label_guide.rst
Eva-Lotte 2545612
feedback addressed
d08de38
Merge branch 'label-guide-documentation-improvement-suggestions' of h…
49b9468
added changes to CHANGELOG.md
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
@@ -4,42 +4,47 @@ | |||||||||
Label guide | ||||||||||
=========== | ||||||||||
|
||||||||||
Basic labeling | ||||||||||
-------------- | ||||||||||
Basic labelling | ||||||||||
--------------- | ||||||||||
|
||||||||||
All ArviZ plotting functions and some stats functions take an optional ``labeller`` argument. | ||||||||||
By default, labels show the variable name and the coordinate value | ||||||||||
(for multidimensional variables only). | ||||||||||
The first example below uses this default labeling. | ||||||||||
All ArviZ plotting functions and some stats functions can take an optional ``labeller`` argument. | ||||||||||
By default, labels show the variable name. | ||||||||||
Multidimensional variables also show the coordinate value. | ||||||||||
|
||||||||||
.. ipython:: | ||||||||||
|
||||||||||
Example: Default labelling | ||||||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||||||||||
|
||||||||||
In [1]: import arviz as az | ||||||||||
...: schools = az.load_arviz_data("centered_eight") | ||||||||||
...: az.summary(schools) | ||||||||||
|
||||||||||
Thanks to being powered by xarray, ArviZ supports label based indexing. | ||||||||||
We can therefore use the labels we have seen in the summary to plot only a subset of the variables, | ||||||||||
the one we are interested in. | ||||||||||
Provided we know that the coordinate values shown for theta correspond to the `school` dimension, | ||||||||||
we can plot only ``tau`` to better inspect it's 1.03 :func:`~arviz.rhat` and | ||||||||||
``theta`` for ``Choate`` and ``St. Paul's``, the ones with higher means: | ||||||||||
ArviZ supports label based indexing powered by xarray. | ||||||||||
Through label based indexing you can use labels to plot a subset of selected variables. | ||||||||||
|
||||||||||
Example: Label based indexing | ||||||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||||||||||
|
||||||||||
For a case where the coordinate values shown for the ``theta`` variable correspond to the ``school`` dimension, | ||||||||||
you can tell ArviZ to plot ``tau`` by including it in the ``var_names`` argument to inspect its 1.03 :func:`~arviz.rhat` value. | ||||||||||
To inspect the ``theta`` values for the ``Choate`` and ``St. Paul's`` coordinates, you can include ``theta`` in ``var_names`` and use the ``coords`` argument to select only these two coordinate values. | ||||||||||
You can generate this plot with the following command: | ||||||||||
|
||||||||||
.. ipython:: python | ||||||||||
|
||||||||||
@savefig label_guide_plot_trace.png | ||||||||||
az.plot_trace(schools, var_names=["tau", "theta"], coords={"school": ["Choate", "St. Paul's"]}, compact=False); | ||||||||||
|
||||||||||
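The selection made by ``var_names`` and ``coords`` can be illustrated with a plain-Python stand-in. This is only a sketch of the idea of label based indexing, not how ArviZ or xarray implement it: the ``select`` helper is hypothetical, the ``school`` dimension name is hard-coded, and the numbers are made up for illustration.

```python
# Toy data: scalar variables map to a value, the multidimensional
# variable maps school labels to values. All numbers are invented.
posterior_means = {
    "mu": 4.0,
    "tau": 3.0,
    "theta": {"Choate": 6.0, "Deerfield": 5.0, "St. Paul's": 7.0},
}


def select(data, var_names, coords):
    """Keep the requested variables and, within them, the requested school labels."""
    selected = {}
    for name in var_names:
        value = data[name]
        if isinstance(value, dict):
            # Label based selection: pick entries by coordinate value, not by position.
            value = {label: value[label] for label in coords["school"]}
        selected[name] = value
    return selected


subset = select(
    posterior_means,
    var_names=["tau", "theta"],
    coords={"school": ["Choate", "St. Paul's"]},
)
print(subset)
```

Scalar variables such as ``tau`` pass through unchanged, while ``theta`` is reduced to the two requested schools, which mirrors what the ``plot_trace`` call above asks for.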
So far so good, we can identify some issues for low ``tau`` values which is great start. | ||||||||||
But say we want to make a report on Deerfield, Hotchkiss and Lawrenceville schools to | ||||||||||
see the probability of ``theta > 5`` and we have to present it somewhere with math notation. | ||||||||||
Our default labels show ``theta``, not $\theta$ (generated from ``$\theta$`` using $\LaTeX$). | ||||||||||
With this you can now identify issues for low ``tau`` values. | ||||||||||
|
||||||||||
Example: Using the labeller argument | ||||||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||||||||||
|
||||||||||
Fear not, we can use the labeller argument to customize the labels. | ||||||||||
The ``arviz.labels`` module contains some classes that cover some common customization classes. | ||||||||||
To create a report on Deerfield, Hotchkiss and Lawrenceville schools for the probability of ``theta > 5`` and use the labeller argument to customize labels. | ||||||||||
Unlike the default labels that show ``theta``, not $\theta$ (generated from ``$\theta$`` using $\LaTeX$), the labeller argument presents the label with proper math notation. | ||||||||||
|
||||||||||
In this case, we can use :class:`~arviz.labels.MapLabeller` and | ||||||||||
tell it to rename the variable name ``theta`` to ``$\theta$``, like so: | ||||||||||
You can use :class:`~arviz.labels.MapLabeller` to rename the variable ``theta`` to ``$\theta$``, as shown in the following example: | ||||||||||
|
||||||||||
.. ipython:: | ||||||||||
|
||||||||||
|
@@ -50,24 +55,26 @@ tell it to rename the variable name ``theta`` to ``$\theta$``, like so: | |||||||||
@savefig label_guide_plot_posterior.png | ||||||||||
In [1]: az.plot_posterior(schools, var_names="theta", coords=coords, labeller=labeller, ref_val=5); | ||||||||||
|
||||||||||
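As a rough mental model of what a map-based labeller does, the sketch below renames a variable through a dictionary before the label is assembled. ``ToyMapLabeller`` and its ``make_label`` method are hypothetical names invented for this illustration. They are not the :class:`~arviz.labels.MapLabeller` implementation, and the label layout is an assumption.

```python
class ToyMapLabeller:
    """Hypothetical stand-in: rename variables through a dict before building a label."""

    def __init__(self, var_name_map=None):
        # Names missing from the map are shown unchanged.
        self.var_name_map = var_name_map or {}

    def make_label(self, var_name, coord_value=None):
        shown = self.var_name_map.get(var_name, var_name)
        if coord_value is None:
            return shown
        # Assumed layout: variable name on the first line, coordinate value below it.
        return f"{shown}\n{coord_value}"


labeller = ToyMapLabeller(var_name_map={"theta": r"$\theta$"})
label = labeller.make_label("theta", "Deerfield")
default_label = ToyMapLabeller().make_label("theta", "Deerfield")
print(label)
print(default_label)
```

The renamed label carries the math notation wanted for a report, while the default label keeps the name used in the code.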
You can see the labellers available in ArviZ at :ref:`their API reference page <labeller_api>`. | ||||||||||
Their names aim to be descriptive and they all have examples in their docstring. | ||||||||||
For further customization continue reading this guide. | ||||||||||
.. seealso:: | ||||||||||
|
||||||||||
- For a list of labellers available in ArviZ, see :ref:`the API reference page <labeller_api>`. | ||||||||||
|
||||||||||
Sorting labels | ||||||||||
-------------- | ||||||||||
|
||||||||||
Labels in ArviZ can generally be sorted in two ways, | ||||||||||
using the arguments passed to ArviZ plotting functions or | ||||||||||
sorting the underlying xarray Dataset. | ||||||||||
The first one is more convenient for single time ordering | ||||||||||
whereas the second is better if you want plots consistently sorted that way and | ||||||||||
is also more flexible, using ArviZ args is more limited. | ||||||||||
ArviZ allows labels to be sorted in two ways: | ||||||||||
|
||||||||||
- Using the arguments passed to ArviZ plotting functions | ||||||||||
- Sorting the underlying :class:`xarray.Dataset` | ||||||||||
|
||||||||||
The first option is more suitable for single time ordering whereas the second option is more suitable for sorting plots consistently. | ||||||||||
|
||||||||||
.. note:: | ||||||||||
|
||||||||||
Both ways are limited. | ||||||||||
Multidimensional variables cannot be separated. | ||||||||||
For example, it is possible to sort ``theta``, ``mu``, or ``tau`` in any order, and within ``theta`` to sort the schools in any order, but it is not possible to show half of the schools, then ``mu`` and ``tau``, and then the rest of the schools. | ||||||||||
|
||||||||||
Both alternatives have an important limitation though. | ||||||||||
Multidimensional variables are always together. | ||||||||||
We can sort ``theta, mu, tau`` in any order, and within ``theta`` we can sort the schools in any order, | ||||||||||
but it's not possible to show half the schools, then ``mu`` and ``tau`` and then the rest of the schools. | ||||||||||
|
||||||||||
Sorting variable names | ||||||||||
...................... | ||||||||||
|
@@ -78,16 +85,15 @@ Sorting variable names | |||||||||
|
||||||||||
.. tabbed:: ArviZ args | ||||||||||
|
||||||||||
We can pass a list with the variable names sorted to modify the order in which they appear | ||||||||||
when calling ArviZ functions | ||||||||||
To control the order in which variables appear when calling ArviZ functions, pass a list of the variable names in the desired order. | ||||||||||
|
||||||||||
.. ipython:: | ||||||||||
|
||||||||||
In [1]: az.summary(schools, var_names=var_order) | ||||||||||
|
||||||||||
.. tabbed:: xarray | ||||||||||
|
||||||||||
In xarray, subsetting the Datset with a sorted list of variable names will order the Dataset. | ||||||||||
In xarray, subsetting the Dataset with a sorted list of variable names will order the Dataset. | ||||||||||
|
||||||||||
.. ipython:: | ||||||||||
|
||||||||||
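The effect of ordering by a list of names, and the limitation that multidimensional variables stay together, can be sketched in plain Python. The value of ``var_order`` is not visible in this diff, so the one below is an assumption, and ``flat_labels`` is a hypothetical helper rather than an ArviZ or xarray function.

```python
# Variables and the school labels they span (empty list = scalar variable).
variables = {"mu": [], "theta": ["Choate", "Deerfield"], "tau": []}

# Assumed ordering, chosen only for illustration.
var_order = ["theta", "tau", "mu"]

# Subsetting with an ordered list of names yields the variables in that order.
ordered = {name: variables[name] for name in var_order}


def flat_labels(vars_to_schools):
    """Build one label per row; entries of a multidimensional variable stay contiguous."""
    labels = []
    for name, schools in vars_to_schools.items():
        if schools:
            labels.extend(f"{name}[{school}]" for school in schools)
        else:
            labels.append(name)
    return labels


labels = flat_labels(ordered)
print(labels)
```

Because labels are produced one variable at a time, there is no ordering of ``var_order`` that places ``mu`` or ``tau`` between two schools of ``theta``.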
|
@@ -97,22 +103,25 @@ Sorting variable names | |||||||||
Sorting coordinate values | ||||||||||
......................... | ||||||||||
|
||||||||||
We may also want to sort the schools by their mean. | ||||||||||
To do so we first have to get the means of each school: | ||||||||||
To sort coordinate values you have to define the order, store it, and use the result to sort the coordinate values. | ||||||||||
|
||||||||||
|
||||||||||
Example: Sorting the schools by mean | ||||||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||||||||||
|
||||||||||
1) Locate the means of each school by using the following command: | ||||||||||
|
||||||||||
.. ipython:: | ||||||||||
|
||||||||||
In [1]: school_means = schools.posterior["theta"].mean(("chain", "draw")) | ||||||||||
...: school_means | ||||||||||
|
||||||||||
We can then use this DataArray result to sort the coordinate values for ``theta``. | ||||||||||
Again we have two alternatives: | ||||||||||
2) You can use the DataArray result to sort the coordinate values for ``theta``. | ||||||||||
There are two ways of sorting: | ||||||||||
|
||||||||||
.. tabbed:: ArviZ args | ||||||||||
|
||||||||||
Here the first step is to sort the coordinate values so we can pass them as `coords` argument and | ||||||||||
choose the order of the rows. | ||||||||||
If we want to manually sort the schools, `sorted_schools` can be defined straight away as a list | ||||||||||
Sort the coordinate values so that you can pass them as the ``coords`` argument and choose the order of the rows. | ||||||||||
To sort the schools manually, define ``sorted_schools`` directly as a list. | ||||||||||
Eva-Lotte marked this conversation as resolved.
|
||||||||||
|
||||||||||
.. ipython:: | ||||||||||
|
||||||||||
|
@@ -121,7 +130,7 @@ Again we have two alternatives: | |||||||||
|
||||||||||
.. tabbed:: xarray | ||||||||||
|
||||||||||
We can use the :meth:`~xarray.Dataset.sortby` method to order our coordinate values straight at the source | ||||||||||
You can use the :meth:`~xarray.Dataset.sortby` method to order your coordinate values directly at the source. | ||||||||||
|
||||||||||
.. ipython:: | ||||||||||
|
||||||||||
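The two steps described above (compute a mean per school, then use the means to order the school labels) can be sketched without ArviZ or xarray. The draws below are made-up numbers, and the plain dictionaries stand in for the DataArray that ``schools.posterior["theta"].mean(("chain", "draw"))`` returns in the guide.

```python
from statistics import mean

# Invented posterior draws per school, for illustration only.
theta_draws = {
    "Choate": [7.0, 6.0, 8.0],
    "Deerfield": [5.0, 4.0, 6.0],
    "Hotchkiss": [3.0, 4.0, 2.0],
}

# Step 1: one mean per school (the analogue of averaging over chain and draw).
school_means = {school: mean(draws) for school, draws in theta_draws.items()}

# Step 2: order the school labels by their mean, lowest first.
sorted_schools = sorted(school_means, key=school_means.get)
print(sorted_schools)
```

A list like ``sorted_schools`` is what the ArviZ args route passes through ``coords``; the xarray route achieves the same ordering on the data itself.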
|
This sentence now feels incomplete. It starts with "to create" and a description on what to create but nothing on how to create it.
A bit more context in case it helps. Here there are two goals: creating a report of theta > 5 and using the labeller to customize the labels.
One can be done without the other, but instead of showing how to use the labeller (which I think should be done in the example in the docstring that is still WIP), I wanted to show how to use the labeller to solve a more specific and real task. When exploring the model ourselves, we won't generally care much about the labels; having the same thing as the code is fine. However, if generating a report to be published or presented, we'll probably want to take better care of the presentation, and match the labels to the labels in the equations of the paper instead of the variables in the code. Hence the MapLabeller.
Thank you for providing a bit of context to this matter. The way I see it, since this piece of text is inside of the example, we don't even need the first part of the sentence.