Add engine specification field for predictor encodings #319

Merged 27 commits on May 29, 2020

Conversation

juliasilge
Member

Closes #290.

This PR adds a field for engine-specific predictor encodings. For now, the only option it records is whether dummy/indicator variables should be created.

For example, the encoding options for ranger are:

set_encoding(
  model = "rand_forest",
  eng = "ranger",
  mode = "regression",
  options = list(predictor_indicators = FALSE)
)

The encoding options for vanilla logistic regression, by contrast, are:

set_encoding(
  model = "logistic_reg",
  eng = "glm",
  mode = "classification",
  options = list(predictor_indicators = TRUE)
)

These changes depend on the handling of data argument names implemented in #315 and #316.

These encodings can be used in workflows so that the user experiences the same behavior around dummy variable creation in both parsnip and workflows.
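
As a rough illustration of how a consumer such as workflows might use this field, here is a minimal base-R sketch. It is not parsnip's or workflows' actual implementation: the `encoding_registry` data frame and the `needs_indicators()` and `prepare_predictors()` helpers are hypothetical names invented for this example, and only the two model/engine combinations shown above are registered.

```r
# Hypothetical, simplified registry (an assumption for illustration only;
# parsnip stores its encodings differently).
encoding_registry <- data.frame(
  model = c("rand_forest", "logistic_reg"),
  engine = c("ranger", "glm"),
  mode = c("regression", "classification"),
  predictor_indicators = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Look up whether a model/engine/mode combination expects indicator columns.
needs_indicators <- function(model, engine, mode, registry = encoding_registry) {
  hit <- registry$model == model &
    registry$engine == engine &
    registry$mode == mode
  if (sum(hit) != 1L) {
    stop("No unique encoding registered for this model/engine/mode.")
  }
  registry$predictor_indicators[hit]
}

# Expand factor predictors into indicator columns only when the engine asks for it.
prepare_predictors <- function(data, model, engine, mode) {
  if (needs_indicators(model, engine, mode)) {
    # model.matrix() builds treatment-contrast indicators; drop the intercept column.
    mm <- model.matrix(~ ., data = data)
    as.data.frame(mm[, colnames(mm) != "(Intercept)", drop = FALSE])
  } else {
    # Pass factors through untouched, e.g. for ranger.
    data
  }
}

predictors <- data.frame(
  size = c(1.2, 3.4, 2.2),
  color = factor(c("red", "blue", "red"))
)

# ranger: factor column is kept as-is.
names(prepare_predictors(predictors, "rand_forest", "ranger", "regression"))

# glm: factor column is replaced by indicator columns.
names(prepare_predictors(predictors, "logistic_reg", "glm", "classification"))
```

Keeping the flag in one lookup keyed by model, engine, and mode means every caller makes the same decision about dummy variable creation, which is the consistency this PR is after.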

I am pretty confident that predictor_indicators = TRUE / FALSE is correct for all the model + engine combinations except for liquidSVM. I was having trouble getting output from those models and could use some double-checking.

topepo and others added 26 commits April 29, 2020 21:29
Merge branch 'master' into encoding-options

# Conflicts:
#	R/linear_reg_data.R
#	R/svm_poly_data.R
#	R/svm_rbf_data.R
#	tests/testthat/test_svm_poly.R
#	tests/testthat/test_svm_rbf.R
@juliasilge
Member Author

This PR also fixes the function used with Spark decision trees 🌳 for regression.

@topepo topepo merged commit aa29bac into master May 29, 2020
@github-actions
github-actions bot commented Mar 7, 2021

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 7, 2021
@juliasilge juliasilge deleted the encoding-options branch June 27, 2021 16:08