More pipeline diagnostics #239

Closed
janosh opened this issue Oct 8, 2019 · 4 comments · Fixed by #246

Comments

@janosh
Member

janosh commented Oct 8, 2019

Besides #238, I think the diagnostics for a fitted pipe could be further improved. In particular, it's too difficult to determine which model actually performed best.

@ardunn
Contributor

ardunn commented Oct 8, 2019

I agree, it could definitely be organized better.

If you're just interested in the underlying tpot model, you can get it with:

pipe.learner.best_pipeline

If you're interested in the best "entire" pipeline, i.e., going from material object to prediction (including featurization, cleaning, reduction, and learning), that is a bit more difficult, because the fitted matpipe itself is the best pipeline lol.

My thought is to add another method that returns only the most important information, e.g., which featurizers were used, what the cleaning rules are in general, what the best autoML pipeline is, etc.
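A minimal sketch of what such a summary method might look like. Everything here is an assumption for illustration: the function name `summarize_pipe`, the stage attribute names (`autofeaturizer`, `cleaner`, `reducer`, `featurizers`, `rules`, `reducers`), and the placeholder values are hypothetical. Only `learner.best_pipeline` comes from this thread. Stand-in objects are used so the snippet runs without any ML library installed.

```python
from types import SimpleNamespace


def summarize_pipe(pipe):
    """Collect the most informative attributes of a fitted pipe into one dict.

    Every attribute name except ``learner.best_pipeline`` is hypothetical.
    Missing attributes fall back to empty values instead of raising.
    """
    return {
        "featurizers": list(getattr(pipe.autofeaturizer, "featurizers", [])),
        "cleaning": dict(getattr(pipe.cleaner, "rules", {})),
        "feature_reduction": list(getattr(pipe.reducer, "reducers", [])),
        "best_ml_pipeline": getattr(pipe.learner, "best_pipeline", None),
    }


# Stand-in for a fitted pipe; all values below are placeholders.
pipe = SimpleNamespace(
    autofeaturizer=SimpleNamespace(featurizers=["featurizer_a", "featurizer_b"]),
    cleaner=SimpleNamespace(rules={"max_na_frac": 0.01, "na_method": "drop"}),
    reducer=SimpleNamespace(reducers=["corr", "tree"]),
    learner=SimpleNamespace(best_pipeline="placeholder sklearn Pipeline"),
)

summary = summarize_pipe(pipe)
for stage, info in summary.items():
    print(f"{stage}: {info}")
```

Returning a plain dict keeps the summary easy to print, log, or serialize, which is roughly what "only the most important information" would need.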

@janosh
Member Author

janosh commented Oct 8, 2019

My thought is to add another method that returns only the most important information, e.g., which featurizers were used, what the cleaning rules are in general, what the best autoML pipeline is, etc.

I think that would be nice!

It took me some time to discover that pipe.learner.best_pipeline and pipe.learner.best_models were what I was looking for. I noticed, however, that these aren't available on saved and loaded pipes.

@ardunn
Contributor

ardunn commented Oct 8, 2019

In the case of tpot pipelines that are saved and loaded, you are correct, because pickling tpot objects didn't work the last time I checked (that may have changed since). The current behavior is to select the best pipeline from the tpot object and save that single sklearn Pipeline as the backend (similar to a SinglePipelineAdaptor learner object). So the entire backend becomes the "best pipeline" and, unfortunately, all the other previously tried models are lost :/

Tl;dr: you can open up the best pipeline from a loaded (tpot-backend) pipe using:

pipe.learner.backend

Only the best pipeline is saved; best_models is not.
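The difference between a freshly fitted pipe and a loaded one can be handled with a small helper, sketched here under assumptions. The helper names `get_best_pipeline` and `get_best_models` are hypothetical, and the pipes are stand-in objects so the snippet runs on its own. Only the attribute names `learner.best_pipeline`, `learner.best_models`, and `learner.backend` come from this thread.

```python
from types import SimpleNamespace


def get_best_pipeline(pipe):
    """Return the best pipeline from a freshly fitted or a loaded pipe."""
    learner = pipe.learner
    best = getattr(learner, "best_pipeline", None)
    if best is not None:
        return best
    # Loaded tpot-backend pipe: the backend itself is the best pipeline.
    return learner.backend


def get_best_models(pipe):
    """Return best_models if present, or None for a loaded pipe."""
    return getattr(pipe.learner, "best_models", None)


# Stand-ins with placeholder values: a fresh pipe keeps the search results,
# while a loaded pipe only keeps the single best pipeline as its backend.
fresh_pipe = SimpleNamespace(
    learner=SimpleNamespace(
        best_pipeline="best sklearn Pipeline",
        best_models={"model_a": 0.91, "model_b": 0.87},
        backend="tpot search object",
    )
)
loaded_pipe = SimpleNamespace(
    learner=SimpleNamespace(backend="best sklearn Pipeline")
)

print(get_best_pipeline(fresh_pipe))
print(get_best_pipeline(loaded_pipe))
print(get_best_models(loaded_pipe))
```

Falling back to `backend` only when `best_pipeline` is missing means the same call works before and after a save/load round trip, while `get_best_models` makes the lost information explicit by returning `None`.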

I've opened an issue to address this: #241

@ardunn
Contributor

ardunn commented Oct 12, 2019

Related to #221
