Is there a better way we could do this? Maybe add something upstream if necessary?
As I'm thinking of it, I don't know that this operation is necessarily well defined. Just as with `limit`, where calling it multiple times on a large dataframe gives different results, I would expect different results from multiple calls here.
If we do put this in, I would suggest adding more text to the description to explain why this is an expensive operation: it performs a `collect` to determine the size of the dataframe.
Originally posted by @timsaucer in apache/datafusion-python#915 (comment)
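To make the two concerns in the quoted comment concrete, here is a minimal sketch in plain Python. It does not use the datafusion API. It assumes the operation under discussion returns the last `n` rows of an unordered, partitioned dataframe; the function `last_rows_via_count` and the list-of-lists partition model are hypothetical and only for illustration.

```python
from typing import List


def last_rows_via_count(partitions: List[List[int]], n: int) -> List[int]:
    """Return the last n rows by first materializing everything.

    Hypothetical stand-in for the operation discussed above: the total row
    count must be known before rows can be skipped, so every partition is
    read in full (the expensive "collect" step).
    """
    # Materialize all rows in whatever order the partitions arrive.
    rows = [row for part in partitions for row in part]
    # Skip everything except the final n rows.
    skip = max(len(rows) - n, 0)
    return rows[skip:]


# Same data, but the partitions arrive in a different order on the second
# call, as can happen when no ordering is specified.
parts = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
in_order = last_rows_via_count(parts, 3)
reordered = last_rows_via_count(list(reversed(parts)), 3)

print(in_order)   # [7, 8, 9]
print(reordered)  # [1, 2, 3]
```

Under these assumptions, the result depends on partition arrival order unless the data is sorted first, and the full pass over the data is what makes the call expensive.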