Allow multicolumn transformations for AbstractDataFrame #2461
Thanks, that's impressive. I haven't looked at the tests in detail yet, feel free to point me at interesting cases that I may have missed.
Could you please clarify what |
it is a type. types in Julia are also values, and we use this fact here
The first is instance of a type the second is a type (
What do you mean by "pattern"? In general the transformation mini-language is DataFrames.jl specific. What is important is that
We do not dispatch on type (actually if you look at the implementation there is a problem with this - we have to dynamically check for
We could use
In summary: it is not a common pattern, but do you have a better proposal for what to use instead? The benefit of this approach is:
(and just to stress - we do not dispatch on This situation is kind-of similar to |
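The distinction drawn above between an instance of a type and the type itself, and the remark about checking dynamically rather than dispatching, can be sketched in plain Julia. This is only an illustration: `Marker` and `classify` are hypothetical names introduced here and are not part of DataFrames.jl or of this PR.

```julia
# Hypothetical marker type, used only for illustration.
struct Marker
    cols::Vector{Symbol}
end

# A single untyped method: the branch is chosen by inspecting the value
# at run time, not by method dispatch on the argument type.
function classify(target)
    if target === Marker
        return "the type itself"
    elseif target isa Marker
        return "an instance"
    else
        return "something else"
    end
end

classify(Marker)             # the type, passed around as an ordinary value
classify(Marker([:a, :b]))   # an instance of that type
```

Because `Marker` is itself a value (of type `DataType`), it can sit on the right-hand side of a `=>` pair just like a `Symbol` or a `Vector` can.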
As an afterthought: we could use
and it would be a valid transformation specification (note that there is no need of parens for trailing |
Thank you for the clarification. I'm glad I understand your thought process here. I think |
@nalimilan - what do you think? I dislike EDIT: sorry, actually |
I prefer |
Let us keep |
Sounds good. Perhaps the best mental model is for it to be a |
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
Is the following expected behavior?
I would have thought with the |
Additionally, should the following work?
|
No - this would be
What is
This is also expected - and follows your request to disallow In general |
Thanks, this is all very clear.
Yes, that is expected. This is really impressive work! I really appreciate it and the thought you've put into this. |
I think a |
Why allow returning matrices at all if we are deprecating the |
Only for backward compatibility reasons. Note that we will not disallow returning them. The only question is what happens with them and we have two options:
I was thinking about which behavior the user would prefer when returning a matrix, and I thought that the second is more natural. Would you prefer the first?
In general - under current rules the only case when we throw an error is
Note that this is a different case from what we discuss with @nalimilan, as he has raised a case when |
The first option reminds me of |
So option 2 is what we currently have 😄. |
I have updated the documentation (so essentially, once we accept this, it should be good to merge). @nalimilan - as usual - feel free to rewrite the docstrings 😄 (and sorry for the mistakes, as there will surely be some). |
Thank you for all the comments. If there are no more issues with this proposal, I will merge the PR tomorrow and follow up with a small |
Thank you! |
This PR partially addresses #2410 and #2457.
It covers select etc. for AbstractDataFrame.
If we are OK with the functionality I will update the documentation.
TODO:
- select etc. for GroupedDataFrame (this will be a separate PR to keep PRs more atomic)
- ByRow with no columns passed to filter (also a separate PR)
CC @nalimilan @pdeffebach @matthieugomez - this is a rather complex PR so independent testing (especially for corner cases) would be welcome (if you have suggestions for types of tests to add, please comment and I will add them).
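For readers who want to try a multicolumn transformation on an AbstractDataFrame, here is a minimal sketch. It assumes the interface documented in later DataFrames.jl releases, where the target of a transformation may be `AsTable` or a vector of column names; the exact form under review in this PR may have differed. The data and the variable names `out_named` and `out_renamed` are made up for illustration.

```julia
using DataFrames

df = DataFrame(a = 1:3)

# The function returns a NamedTuple of vectors; with AsTable as the target
# (assumed syntax), each field becomes its own output column.
out_named = select(df, :a => (x -> (b = x, c = 2 .* x)) => AsTable)

# Alternatively, a vector of names as the target assigns the output
# column names explicitly.
out_renamed = select(df, :a => (x -> (b = x, c = 2 .* x)) => [:p, :q])
```

Under these assumptions `out_named` has columns `b` and `c`, while `out_renamed` holds the same data under the names `p` and `q`.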