Skip to content

Internals bug? will_make_matrix() returns False when given a matrix #786

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
joeycouse opened this issue Aug 14, 2022 · 6 comments
Closed

Comments

@joeycouse
Copy link

Hello, I'm working on a PR to address #765. While doing so I ran into an issue with will_make_matrix(). Shouldn't the return value be True if y is a matrix or a vector?

When y is a numeric matrix it fails this check and converts it to a vector. This is stripping the colname of y whenever it is passed to function later down the line such as xgb_train()

parsnip/R/convert_data.R

Lines 326 to 336 in f081091

will_make_matrix <- function(y) {
if (is.matrix(y) | is.vector(y))
return(FALSE)
cls <- unique(unlist(lapply(y, class)))
if (length(cls) > 1)
return(FALSE)
can_convert <-
vapply(y, function(x)
is.atomic(x) & !is.factor(x), logical(1))
all(can_convert)
}

@simonpcouch
Copy link
Contributor

simonpcouch commented Aug 15, 2022

I believe will_make_matrix may mean "is this a thing that's not a matrix that will be converted to one," and that first if (is.matrix) return(FALSE) line makes me think that may be the case.

Is that interpretation correct, @topepo?

@joeycouse
Copy link
Author

joeycouse commented Aug 15, 2022

Hmm, based on this comment I would assume the intention is to return y as a matrix if it is possible. Right now there doesn't seem to be a case when y is return as a matrix, from the call to .convert_form_to_xy_fit()

parsnip/R/convert_data.R

Lines 120 to 124 in f081091

# Since a matrix is requested, try to convert y but check
# to see if it is possible
if (will_make_matrix(y)) {
y <- as.matrix(y)
}

@topepo
Copy link
Member

topepo commented Aug 30, 2022

@simonpcouch is right about the intent.

It was also designed for data frames so that's why there's no logic to handle vectors. We could do this but would have to know how to name that column (so it feels out of scope for this function).

@joeycouse can you give some context on where this has come up?

@joeycouse
Copy link
Author

joeycouse commented Aug 30, 2022

@topepo this is comes up when there is a call to .covert_form_to_xy_fit() i.e. where the underlying model only has a x_y interface. The return value of y is always a vector/factor and doesn't carry a name. So when passed to the underlying modeling function there is no way to determine what the original column name of y was.

> reg_fit <-
+   boost_tree(trees = 10, mode = "regression") %>%
+   set_engine("xgboost", 
+              eval_metric = "mae", 
+              verbose = 1) %>%
+   fit(mpg ~ ., data = mtcars)

This is an issue because the call to xgb_train() has no name for y which makes it very difficult to extended the validation arg to accept a data frame (#771, #765, #760, #471) because there is no good way to determine what the outcome column is.

parsnip/R/fit_helpers.R

Lines 137 to 155 in e1eb30a

data_obj <- .convert_form_to_xy_fit(
formula = env$formula,
data = env$data,
...,
composition = target,
indicators = indicators,
remove_intercept = remove_intercept
)
env$x <- data_obj$x
env$y <- data_obj$y
check_outcome(env$y, object)
res <- xy_xy(
object = object,
env = env, #weights!
control = control,
target = target
)

@simonpcouch
Copy link
Contributor

Given that we've confirmed that the function, as it's currently scoped, works as intended, and we've closed the PR that prompted the proposed extension in scope, I'm going to go ahead and close this issue.

@github-actions
Copy link

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Sep 14, 2022
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Projects
None yet
Development

No branches or pull requests

3 participants