Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

filtered counts excluding $this filter #205

Conversation

gogonzo
Copy link
Contributor

@gogonzo gogonzo commented Feb 22, 2023

Closes #166

todo: tests

Problem

image

Let me describe the diagram above to fully explain what is the problem and what solution I propose. Below I'm attaching sequence diagram adapted to the structure we have. As you already know:

  • FilteredData$init initializes and stores FilteredDataset (1 to many)
  • FilteredDataset$init initializes and stores FilterStates. In case of MultiAssayExperiment a relationship is 1..n in but in case of data.frame a relationship is 1..1 (I'm going to focus on df here).
  • FilterStates initializes and stores FilterState through set_filter_state (unlike other classes which are created immediately in the init of parent).

The first and second sequence in the diagram. have been just explained above. In the sequence upper arrow denotes a function call and lower arrow denotes returned value/object. As you can see init and set_filter_state doesn't return anything (no reversed arrow). In the init sequence there are data_filtered() highlighted with blue color which indicate a reactive. This is the key to understand how reactive counts work in the FilterState:

  • in FilteredDataset$init there is a line where private$dataset_filtered is composed and then passed in init_filter_states(data_filtered = self$get_dataset()). On the diagram this situation is represented by data_filtered[i, j] passed from FilteredDataset to FilterStates. In both classes these objects are kept as private$data_filtered which can be represented by the "blue" box.
  • a new FilterState object is created in FilterStates$set_filter_state and reactive filtered dataset mentioned in previous point, is used to create a reactive vector (column) here. This x_reactive object is also saved in the class, specifically in private$x_reactive.

Taking above into account, each FilterState object has access to x_reactive which can be written as this (I'll skip MAE which has more reactive layers):

x_reactive = reactive({
  data_filtered()[varname]
})

So, update counts in FilterState$server_inputs unwraps reactive layers, which are:

  • [5] execute x_reactive() which is data_filtered()[varname] [4]
  • execute data_filtered() [3] evaluates filter-call taken from FilteredDataset [2]
  • FilteredDataset$get_call calls FilterState$get_call [1]

In the return:

  • FilterState$get_call [1] returns filter call
  • FilteredDataset$get_call [2] returns filter call + merge call to the parent
  • FilteredDataset$data_filtered() [3] executes filter call and returning filtered data
  • FilterStates$data_filtered() [4] simply just proxy this dataset further (MAE does more in this step - gets experiment)
  • FilterState$x_reactive [5] extracts column vector from filtered dataset

Side note: situation described above is interesting case which messes up the perspective. Since data_filtered and x_reactive are passed from parent to child class, this creates a reverse dependency. Normally, each function is evaluated from top to bottom (for example init), and because we are de facto passing a function (reactive is a specific case of the function) this means that we are executing from bottom to top (amazing xD).

Above describes what is going on on the filter_panel_refactor@main branch. So what is the problem?
Problem is that x_reactive() calls data_filtered() and at the end get_call() is called which doesn't know anything about the rest of FilterState objects which together compose filter-call. This means that each FilterState within the same dataset receive vector of values based on the same filtered object. This is why deselecting SEX: M updates it's counts to zero (as it gets included in the calculation of the filtered dataset). What we need is delivering data_filtered()[varname] to the FilterState which would exclude condition created in $this FilterState. As you can see on the diagram - it's a hell of the way to way to get there.


Solution

image

Solution seems to be simple:

  • instead of data_filtered() being reactive I've turned this to a function data_filtered_fun(sid = integer(0)) so the FilterState receives a reactive which calls a function with sid = <attribute of the FilterState in the state_list>, which travels all the way around to FilterStates$get_call which picks up filter conditions from all FilterState objects except one which matched sid == $get_id()
  • When FilterStates create FilterState it attaches "sid" attribute to element in the state_list and control that state added later has higher id.
  • sid is controlled entirely in the FilterStates
  • data_reactive delivered to the FilterState is a function now, so it means that this function can't be used to return data with get_data. This is why in the FilteredDataset there are private$data_filtered and private$data_filtered_fun created.

You can use cdisc_data example from #165 and track trace logs from here and [here](https://github.com/insightsengineering/teal.slice/blob/b779d339228513d9949fe52b0c68cc7c90ae3211/R/FilteredDatasetDefault.R#L65]

@gogonzo gogonzo changed the title 166 filtered counts@filter panel refactor@main filtered counts excluding $this filter Feb 22, 2023
@gogonzo gogonzo added the core label Feb 22, 2023
Copy link
Contributor

@nikolas-burkoff nikolas-burkoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few initial comments from me inline

I understand what's going on here, but it is really complicated - it would conceptually be simpler if the FilterState was given the column of data filtered by all other filters and then the application of the call was directly on the FilterState as then the data flow is much clearer - however, I suspect in practice that will be much much harder to do.

R/FilterStates.R Outdated
@@ -123,7 +123,7 @@ FilterStates <- R6::R6Class( # nolint
#'
#' @return `call` or `NULL`
#'
get_call = function() {
get_call = function(sid = integer(0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be explicit with a name with get_call: sid -> excluded_filterstate_ids or something like that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've chosen this name as excluded_id or similar could be quickly changed. We have a straight way now to implement hierarchical filter. Let's assume we have 4 filters and we'd like to update counts of the filter-3:

  • 1
  • 2
  • 3**
  • 4
# 1. my branch here - all filters except this one
states <- private$state_list[where sid != sid]

# 2. future - hierarchical
states <- private$state_list[where sid < sid]

So the phrase excluded doesn't really have anything in common with (2). Other proposition?

R/FilterStates.R Outdated
Comment on lines 136 to 144
filtered_items <- Filter(
f = function(x) x$is_any_filtered() && !identical(attr(x, "sid"), sid),
x = items
)
# hierarchical
#filtered_items <- Filter(
# f = function(x) x$is_any_filtered() && attr(x, "sid") < sid,
# x = items
#)
Copy link
Contributor Author

@gogonzo gogonzo Feb 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uncomment to get hierarchical filtering!

  • @lcd2yyz @donyunardi , we can make counts hierarchical! Initially consulted with @nikolas-burkoff already. If we get further with the API issues and filter-groups -> then it will be easier to decide what and how can have hierarchical counts. But I wanted to inform that backend is ready to do this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@npaszty 🥳 (it'll be a long time until it's in main but thought you'd be happy)

@nikolas-burkoff
Copy link
Contributor

  • The numeric slider from the cdisc example app isn't working:
    image
    (the AVAL one is so I don't know if this is a keep inf/na issue or due to the very small range) - the left end can't be moved and the right side snaps to 0

@gogonzo gogonzo marked this pull request as ready for review March 2, 2023 06:06
@nikolas-burkoff nikolas-burkoff self-assigned this Mar 3, 2023
Copy link
Contributor

@nikolas-burkoff nikolas-burkoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good - a few minor comment in line + making the labels work for keep NA and Inf in filterstate. I'll test apps again and then if I don't find anything I'll approve

I wonder if it makes sense to add some explanation in the PR description into the roxygen as what is going on here is complicated and I expect for someone new coming into this code it's going to be tough

@@ -335,6 +334,7 @@ LogicalFilterState <- R6::R6Class( # nolint
private$keep_na_srv("keep_na")

logger::log_trace("LogicalFilterState$server initialized, dataname: { private$dataname }")
non_missing_values <- reactive(Filter(is.finite, private$x_reactive()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already on line 278?

#' This object is needed for the `FilterState` counts being updated
#' on a change in filters. If `reactive(NULL)` then filtered counts are not shown.
#' on a change in filters. If function returns `NULL` then filtered counts are not shown.
#' Function has to have `sid` argument being a character.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may add why it needs to have this

#' @param data_reactive (`reactive`)\cr
#' should a `data.frame` object or `NULL`.
#' @param data_reactive (`function(sid)`)\cr
#' should return a `SummarizedExperiment` object or `NULL`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this is wrong?

R/FilterStates.R Outdated
#' @param data_reactive (`reactive`)\cr
#' should return an object constistent with the `FilterState` class or `NULL`.
#' @param data_reactive (`function(sid)`)\cr
#' should return a `SummarizedExperiment` object or `NULL`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a summarizedexperiment

} else {
data[, varname]
},
x_reactive = if (!is.array(data)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would add a code comment here

@gogonzo
Copy link
Contributor Author

gogonzo commented Mar 3, 2023

@nikolas-burkoff I'll include this in the vignette with UMLs I made if we get to some stable point.

I wonder if it makes sense to add some explanation in the PR description into the roxygen as what is going on here is complicated and I expect for someone new coming into this code it's going to be tough

Copy link
Contributor

@nikolas-burkoff nikolas-burkoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes this is nice 👍

@gogonzo gogonzo merged commit 03e8b25 into filter_panel_refactor@main Mar 3, 2023
@gogonzo gogonzo deleted the 166_filtered_counts@filter_panel_refactor@main branch March 3, 2023 16:52
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants