ordinal regression model type & polr engine #6
Here is a complete analysis:
library(tidymodels)
library(ordered)
# disaggregated data & partition
house_data <-
  MASS::housing[rep(seq(nrow(MASS::housing)), MASS::housing$Freq), -5]
house_split <- initial_split(house_data, prop = .8)
house_train <- training(house_split)
house_test <- testing(house_split)
# tunable model & analysis specification
house_rec <- recipe(Sat ~ Infl + Type + Cont, data = house_train)
house_spec <- ordinal_reg() |>
  set_engine("polr") |>
  set_args(method = tune())
house_tune <- extract_parameter_set_dials(house_spec)
(house_grid <- grid_regular(house_tune, levels = Inf))
#> # A tibble: 5 × 1
#> method
#> <chr>
#> 1 logistic
#> 2 probit
#> 3 loglog
#> 4 cloglog
#> 5 cauchit
# hyperparameter (link function) optimization
house_res <- tune_grid(
  house_spec,
  preprocessor = house_rec,
  resamples = vfold_cv(house_train),
  grid = house_grid,
  metrics = metric_set(accuracy, roc_auc)
)
(house_link <- select_best(house_res, metric = "accuracy"))
#> # A tibble: 1 × 2
#> method .config
#> <chr> <chr>
#> 1 logistic Preprocessor1_Model1
# final fit
house_prep <- prep(house_rec)
house_final <- finalize_model(house_spec, house_link)
(house_fit <- fit(house_final, formula(house_prep), data = house_train))
#> parsnip model object
#>
#> Call:
#> MASS::polr(formula = Sat ~ Infl + Type + Cont, data = data, method = ~"logistic")
#>
#> Coefficients:
#> InflMedium InflHigh TypeApartment TypeAtrium TypeTerrace
#> 0.5103368 1.2315652 -0.4973120 -0.2740917 -0.9533085
#> ContHigh
#> 0.3576051
#>
#> Intercepts:
#> Low|Medium Medium|High
#> -0.4677984 0.7202062
#>
#> Residual Deviance: 2803.47
#> AIC: 2819.47
# evaluation
house_pred_class <- predict(house_fit, new_data = house_test, type = "class")
bind_cols(house_test, house_pred_class) |>
  accuracy(truth = Sat, estimate = .pred_class)
#> # A tibble: 1 × 3
#> .metric .estimator .estimate
#> <chr> <chr> <dbl>
#> 1 accuracy multiclass 0.528
house_pred_prob <- predict(house_fit, new_data = house_test, type = "prob")
bind_cols(house_test, house_pred_prob) |>
  roc_auc(truth = Sat, starts_with(".pred_"))
#> # A tibble: 1 × 3
#> .metric .estimator .estimate
#> <chr> <chr> <dbl>
#> 1 roc_auc hand_till 0.652
Created on 2024-11-04 with reprex v2.1.1
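For reference, here is a minimal sketch of what the `polr` engine delegates to, calling `MASS::polr()` directly on the same disaggregated data as in the analysis above. It assumes only {MASS}; the object names (`fit_logit`, `fit_probit`, `new_houses`) are mine and not part of the PR. Note that `predict.polr()` names its probability type `"probs"`, whereas the engine exposes it as `"prob"`.

```r
library(MASS)

# Disaggregate the frequency table, as in the analysis above
house_data <- housing[rep(seq(nrow(housing)), housing$Freq), -5]

# The link function is chosen through the `method` argument of polr()
fit_logit  <- polr(Sat ~ Infl + Type + Cont, data = house_data,
                   method = "logistic")
fit_probit <- polr(Sat ~ Infl + Type + Cont, data = house_data,
                   method = "probit")

# A few distinct covariate combinations to predict on (illustrative only)
new_houses <- unique(house_data[, c("Infl", "Type", "Cont")])[1:4, ]

# predict.polr() supports type = "class" and type = "probs"
pred_class <- predict(fit_logit, newdata = new_houses, type = "class")
pred_prob  <- predict(fit_logit, newdata = new_houses, type = "probs")

# Compare the two links by AIC
AIC(fit_logit, fit_probit)
```

The probability predictions come back as a matrix with one column per level of `Sat`, which is presumably what the engine reshapes into `.pred_*` columns.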
I'll try to review this later today. My first thought is that the bare skeleton of
@topepo could this be resumed for a minimal CRAN submission in the next several months? I will join a project in June and hope to make use of this package. :)
This PR addresses #4 by introducing a single model type for ordinal regression and a single deployable engine. My thinking is that we should complete the implementation of one engine before beginning another.
Model type
The model type is `ordinal_reg()`, per this suggestion. However, as noted in the NEWS, this could be replaced with separate `ordinal_*()` types for different model structures, per this suggestion.
Engine
The model type comes with one engine, `'polr'`, which invokes `MASS::polr()`. The engine has one tuning parameter, called `ordinal_link`, which mimics `survival_link` and is passed to the `method` parameter of `polr()`. The engine also provides `class` and `prob` prediction formats; confidence intervals for predictions seem not to be implemented in {MASS}. The engine is registered on load.
The `ordinal_reg` branch of {ordered} is coordinated with cognominal branches of {parsnip} and of {dials}. In {parsnip}, the model type is registered on load, a basic `update()` method is provided, and several other brief files or code chunks analogous to those for other model types are included. In {dials}, the `ordinal_link` parameter tuner is defined.
NB: I am not sure I successfully synchronized `ordinal_link` to `method`; in particular, the `polr_engine_args` tibble is a bit mysterious to me. A unit test with hyperparameter optimization needs to be written. Edit: See the example in a comment below.
Documentation
Package documentation was added to 'ordered-package.R' so that illustrative examples, including of {ordinalForest}, could be included there.
NB: I was unable to install the necessary dependencies to knit 'aaa.Rmd', so I manually wrote 'ordinal_reg_polr.md'.
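As a rough illustration of the `ordinal_link` tuner mentioned in the Engine section, a qualitative parameter of this kind could be defined with `dials::new_qual_param()`. This is a hypothetical sketch, assuming the parameter mirrors `survival_link` as described; the actual definition lives in the {dials} branch and may differ. The `values_ordinal_link` name is my own, and the values are the five `method` options listed in the tuning grid of the example comment.

```r
library(dials)

# Hypothetical: link functions accepted by the `method` argument of MASS::polr()
values_ordinal_link <- c("logistic", "probit", "loglog", "cloglog", "cauchit")

# Hypothetical constructor, modelled on how qualitative dials parameters
# are usually built; not copied from the PR
ordinal_link <- function(values = values_ordinal_link) {
  new_qual_param(
    type = "character",
    values = values,
    label = c(ordinal_link = "Ordinal Link"),
    finalize = NULL
  )
}

# A regular grid over a qualitative parameter enumerates its values
link_grid <- grid_regular(ordinal_link(), levels = 5)
link_grid
```

Because the parameter is qualitative, a regular grid with at least as many levels as values covers every link, which matches the five-row grid in the example comment.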