All of the following are linear transformers: `center()`, `standardize()`, `slide()`, `reverse()`, `rescale()`, and `normalize()`. So hypothetically we should be able to reduce their code to a common underlying function. Ideally, each function would compute the needed shift and scale values and pass them to `standardize(..., center = <shift>, scale = <scale>)`.
This would have no effect on usability, but would make the code much easier to maintain.
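| Function | Supports weights? | Shift | Scale |
| --- | --- | --- | --- |
| `center()` | ✅ | `mean(x)` or `median(x)` | `1` |
| `standardize()` | ✅ | `mean(x)` or `median(x)` | `sd(x)` or `mad(x)` |
| `slide()` | ❌ | `min(x) - lowest` | `1` |
| `reverse()` | ❌ | `-(max(x)+min(x))` | `-1` |
| `rescale()` | ❌ | `min(x)*<scale>-to[1]` | `(to[2]-to[1])/(max(x)-min(x))` |
| `normalize()` | ❌ | see note | see note |

Note: `normalize()` is the same as `rescale(x, to = c(0, 1) + c(-1, 1) * include_bounds)`, where `include_bounds` is a number in [0, 1] or `0.5/length(x)`.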
In any place where a reference= arg is provided, replace x with reference above.
All functions would have to deal with missing values, non-finite values, and labels.
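A minimal sketch of the proposal in base R, to make the shift/scale idea concrete. All names here (`linear_transform()`, `finite_values()`, and the `*_sketch()` wrappers) are hypothetical and not part of the package. The sketch assumes the convention `(x - shift) / scale`; because the table above does not fix a single convention, some shift and scale values below have a different sign or are the reciprocal of what the table lists. Weights and labels are not handled.

```r
# Hypothetical common core: every transformer reduces to (x - shift) / scale.
linear_transform <- function(x, shift = 0, scale = 1) {
  (x - shift) / scale
}

# Hypothetical helper: drop NA, NaN, Inf and -Inf before computing statistics.
finite_values <- function(x) x[is.finite(x)]

# Each wrapper only computes shift and scale, optionally from `reference`.
center_sketch <- function(x, reference = x) {
  linear_transform(x, shift = mean(finite_values(reference)))
}

slide_sketch <- function(x, lowest = 0, reference = x) {
  linear_transform(x, shift = min(finite_values(reference)) - lowest)
}

reverse_sketch <- function(x, reference = x) {
  rng <- range(reference, finite = TRUE)
  # (x - (max + min)) / -1  ==  max + min - x
  linear_transform(x, shift = sum(rng), scale = -1)
}

rescale_sketch <- function(x, to = c(0, 1), reference = x) {
  rng <- range(reference, finite = TRUE)
  scale <- diff(rng) / diff(to)
  # Chosen so that min(reference) maps to to[1] and max(reference) to to[2].
  linear_transform(x, shift = rng[1] - to[1] * scale, scale = scale)
}

x <- c(1, 2, 5, NA)
reverse_sketch(x)              # 5 4 1 NA
slide_sketch(x, lowest = 0)    # 0 1 4 NA
rescale_sketch(x, to = c(0, 100))
```

Missing values pass through unchanged because arithmetic on `NA` yields `NA`; only the statistics used for shift and scale ignore non-finite values.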
I looked at the code, and I'm not sure there's really much we can simplify. Because transformations can be applied inside formulas, we have the `dw_transformer` class, which requires different attributes depending on the transformation function. We need this information anyway, so we can't "standardize" this code. Furthermore, there are a few exceptions that we must handle in any case, e.g. reversing factors (where `min()`/`max()` don't work).
You can try to improve things here, but my first impression is that not much simplification is possible beyond what we already did.
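To illustrate the factor exception in base R: `min()` raises an error for an unordered factor, so a shift computed from `min(x)` and `max(x)` is not available. The `reverse_factor()` helper below is a hypothetical sketch of one way to mirror factor values through their integer codes; it is not the package's implementation.

```r
f <- factor(c("low", "mid", "high", "low"), levels = c("low", "mid", "high"))

# min() is not meaningful for unordered factors and signals an error.
min_result <- tryCatch(min(f), error = function(e) "error")

# Hypothetical helper: map the first level to the last, the second to the
# second-to-last, and so on, keeping the original level order.
reverse_factor <- function(f) {
  lv <- levels(f)
  factor(rev(lv)[as.integer(f)], levels = lv)
}

as.character(reverse_factor(f))  # "high" "mid" "low" "high"
```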