Keyword arg for predicted column name #266

Closed
janosh opened this issue Nov 15, 2019 · 4 comments · Fixed by #267
Labels
enhancement v1.1 Issues and enhancements for upcoming minor release v1.1

Comments

@janosh
Member

janosh commented Nov 15, 2019

The default name target + " predicted" for the new column appended to a dataframe by mat_pipe.predict() contains a space, which prevents the use of dot notation to access the column. Would you accept a PR to make this suffix configurable, or maybe even change the default to target + "_predicted" or target + "_pred"?
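
To illustrate the problem: pandas only exposes a column as an attribute (df.some_col) when its name is a valid Python identifier, so a name containing a space has to be accessed with brackets instead. Below is a minimal pure-Python sketch of that check. Both helper functions are hypothetical and are not part of automatminer's API.

```python
def predicted_col_name(target: str, suffix: str = " predicted") -> str:
    """Build the prediction column name (hypothetical helper for illustration)."""
    return target + suffix


def supports_dot_access(col: str) -> bool:
    """Necessary condition for df.<col> access: the name is a valid identifier.

    Names that clash with Python keywords or DataFrame attributes would
    still need bracket access, so this is not a sufficient condition.
    """
    return col.isidentifier()


current = predicted_col_name("bandgap")
proposed = predicted_col_name("bandgap", suffix="_predicted")

print(repr(current), supports_dot_access(current))
print(repr(proposed), supports_dot_access(proposed))
```

With the current default, the column is only reachable as df["bandgap predicted"]; with an underscore suffix, df.bandgap_predicted would also work.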

@ardunn
Contributor

ardunn commented Nov 15, 2019

Hey @janosh I'd accept a PR to make it configurable!

I think this should take form in two parts:
(1) setting the default in the DFMLAdaptor base class somehow (preferred), or manually adding it to both adaptors' args (not preferred). Either way, it should just be an init arg to each adaptor.

(2) a powerup kwarg in get_preset_config, as used by the MatPipe.from_preset method.
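
A rough sketch of what part (1) could look like, assuming the default lives in a shared base class and each adaptor forwards the argument through its own init. The class names, the pred_suffix parameter, and the output_col method are all hypothetical stand-ins, not the real DFMLAdaptor interface.

```python
class AdaptorBaseSketch:
    """Stand-in for a shared adaptor base class (not the real DFMLAdaptor)."""

    def __init__(self, pred_suffix: str = " predicted"):
        # The default is defined once here so subclasses do not repeat it.
        self.pred_suffix = pred_suffix

    def output_col(self, target: str) -> str:
        """Name of the column that predictions for `target` are written to."""
        return target + self.pred_suffix


class FirstAdaptorSketch(AdaptorBaseSketch):
    """Hypothetical adaptor with its own extra init arg."""

    def __init__(self, n_jobs: int = 1, **kwargs):
        super().__init__(**kwargs)  # forwards pred_suffix if given
        self.n_jobs = n_jobs


class SecondAdaptorSketch(AdaptorBaseSketch):
    """Hypothetical adaptor that adds no init args of its own."""


default_adaptor = FirstAdaptorSketch()
custom_adaptor = SecondAdaptorSketch(pred_suffix="_pred")

print(default_adaptor.output_col("bandgap"))
print(custom_adaptor.output_col("bandgap"))
```

Keeping the default in the base class means a later change to it touches one line rather than every adaptor.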

@ardunn ardunn added enhancement v1.1 Issues and enhancements for upcoming minor release v1.1 labels Nov 15, 2019
@janosh
Member Author

janosh commented Nov 18, 2019

Are you sure this should be part of get_preset_config()? Seems too unimportant to make it part of a preset.

@ardunn
Contributor

ardunn commented Nov 18, 2019

Yes, in the sense it should be a powerup (kwarg), not a preset itself. The "powerups" are intended to be common options that are applied to the pre-defined presets. The idea is that if you want to enable some common options while using a preset (e.g., cache the features in some file, set the number of jobs, etc.) you don't need to specify an entire custom pipeline for MatPipe.

For example, n_jobs is a powerup which can be applied to any of the presets to define the number of jobs for featurization and learning across the entire pipeline. Similarly, the target_output_col (or whatever name) could be a powerup which sets the output col name.

So without making it a powerup, if you wanted to set the output col name, you'd need to either define an entire custom pipeline (bad, lots of work [relatively]) or get the preset config and then replace the AutoFeaturizer with a custom one (less bad, but still not intuitive).

Making it a powerup in get_preset_config, you could just:

pipe = MatPipe.from_preset("express", target_output_col="custom column name")

@janosh
Member Author

janosh commented Nov 18, 2019

I see what you mean. I'll add that.
