This repository has been archived by the owner on Dec 4, 2019. It is now read-only.
I'm running 15 combinations of a logistic regression model with spark-sklearn. All of the tasks complete, but then it takes a huge amount of time to collect the results back to the driver. My guess is that it's the number of coefficients being returned to the driver, since I've noticed the same behavior several times when working with wide datasets or deep random forests. Is this just expected due to network traffic?
Dataset size: 31,358 rows × 10,000 columns
Environment:
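For reference, here is a rough back-of-envelope estimate of the coefficient payload being collected. It assumes dense float64 coefficients and a single coefficient vector per model; the 15-combination grid and the 10,000-column width come from the numbers above, and everything else is an assumption for illustration:

```python
# Hypothetical size estimate for the results collected to the driver.
# Assumes one dense float64 coefficient vector per fitted model.
n_models = 15          # grid combinations from the report above
n_features = 10_000    # dataset width from the report above
bytes_per_float = 8    # float64

coef_bytes = n_models * n_features * bytes_per_float
print(f"Coefficient payload: {coef_bytes / 1e6:.1f} MB")  # 1.2 MB
```

Under these assumptions the coefficients alone are only about a megabyte, so if the slowdown is real it may instead come from the full fitted estimators being serialized back to the driver (a deep random forest pickles to far more than its coefficient count would suggest), rather than from the coefficient arrays themselves.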