Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Add
grouped(by:)
andkeyed(by:)
#197New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Add
grouped(by:)
andkeyed(by:)
#197Changes from all commits
33a2553
f3b1dd8
8c7df22
4ca6551
1a70760
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forward
I explored all this really just for my own edification, but it might be helpful to include some summary of this in your notes here, to explain why the
group(by:)
API doesn't and cannot currently support this functionality. Adding all the necessary infrastructure - whether aCollector
protocol or whatever else - seems like a more fundamental addition to the Swift standard library that should be tackled in its own patch & pitch.Rabbit hole
I like the idea of supporting this, but it doesn't seem like it's possible [elegantly] in Swift today…? Swift's standard library & Foundation don't include an equivalent to Java's
Collector
interface, for example. And even though I'm not a big fan of that approach anyway, you fundamentally do need a way to say "any collection that can be initialised and added to", which Swift does not have.Specifically: you can easily genericise
group(by:)
with someValueCollection
type, but what protocol would it have to conform to?Collection
specifies only the APIs for reading from a given collection, not for constructing one or adding to it.MutableCollection
(surprisingly) has the exact same limitations, and feels inappropriate anyway since it forces the resulting Dictionary to have a mutable type for its values.There is
RangeReplaceableCollection
as a half measure - it does include an init method, but only a limited subset of standard collections conform to it (notably not includingSet
). Still, with that you can do:Semantically it's a hack, of course - there's nothing about the range replacement functionality actually needed here, it's just used because it exposes other APIs almost as a side-effect.
And anyway, the ergonomics are poor since Swift doesn't support specifying a default value for a generic parameter, requiring boilerplate in many cases:
One could work around the initialisation problem by having the caller explicitly provide an initialiser, but you still need a protocol that all collections support which specifies some kind of
append
method, and that doesn't currently exist.It is possible to support arbitrary collection types, but only by having the caller provide a reducer explicitly in order to work around the aforementioned limitations, e.g.:
This is essentially now a functional superset of
group(by:)
andkeyed(by:)
.You can't provide a default reducer since there's no way to generically add a value to any type of collection (per the earlier point). So this'd have to be a special parallel version of group(by:), in addition to the 'simple' one that just hard-codes use of
Array
and doesn't require that a reducer be specified.I'm not a fan of this approach, technically capable as it may be, though admittedly that's from a subjective standpoint. I think it'd be more elegant, and cleaner for library authors, to evolve Swift (or the Swift standard library) to better support generalising across all collection types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very interesting findings, thanks for sharing this with me! How do you think it would be best to communicate these in the library? Perhaps an addendum to the new md files I'm adding in this PR?
Oh wow, I never noticed this, but you're right!
This is what I had in mind (or use a new protocol with method requirements, instead of two separate closure params).
This is more general, because the groups might not be collections at all. The histogram example comes to mind, where you'd want something like:
If there were a protocol for this, then there might be some
CountingCollector
that could be used.C# and Java's APIs are particularly interesting in this area, though I think a lot of their designs were constrained by the lack of tuples (at the time of their design).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how best to summarise that - obviously what I wrote, as a kind of deep-dive on this tangent, is way too much for these documents. Maybe it suffices to say something like "This would require either (a) a new protocol, supported by all standard library collection types, which represents an initialisable object that can be added to, or (b) a variant method which accepts a custom 'reducer' closure". If you want to say anything about it at all. Perhaps this thread here on the pull request is sufficient documentation for posterity (which could be hyperlinked to from the markdown file).
Having used Java's version of this a few times in real-world code, my main comment is that these sorts of APIs - groupby etc - should never require explicit specification of the return types (or a custom reducer). They can allow it, through overloads or optional parameters, but it needs to work effortlessly for the vastly most common case of just returning a Dictionary of Arrays. A lot of Java APIs explicitly require you to pass a collector which is almost always
Collectors.toList()
… it gets very repetitive, and verbose.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, as a counter-argument (Devil's advocate style) for the use of this custom-reducer
grouped(by:)
variant, you can achieve the same thing with:(and the explicit typing of the
Result
type can be omitted in some cases where it's implied by surrounding context, making this even terser)Personally I don't mind if a language or library provides multiple ways to achieve the same goals, but I'm not sure a fancier
group(by:)
adds any benefit over the existingreduce(into:_:)
approach.That said, you can implement
group(by:)
usingreduce(into:_:)
too, e.g.:…yet I still feel like a purpose-built
group(by:)
is worth having. Perhaps it's technically just a "convenience", a bit more ergonomic - but it feels like it's worth it for this simple, common case. I guess I'm just not sure where to draw the line between a more powerfulgroup(by:)
and just having people fall back toreduce(into:_:)
for uncommon needs.Counter-counter-argument: even though
reduce(_:_:)
andreduce(into:_:)
can be used for a huge variety of tasks, I usually find them hard to interpret when reading the code. A large number ofSequence
methods can be implemented using just the reducer methods - e.g.allSatisfy
,min
/max
, and evenmap
! - yet we still have all those purpose-built methods. And I think Swift is better for them - they make code much more readable, even if they're technically redundant.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise. I have conviction about the base
group(by:)
and its semantics, but I get less confident the further out I get on the landscape extensibility. Java's a prime example, where the collectors are maximally flexible, but a complete chore in 99% of the cases.Heh I have an entire little blog post about this: https://github.com/amomchilov/Blog/blob/master/Don't%20abuse%20reduce.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this research something you'd like to commit? You deserve to have your name on your findings :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side thought: If
grouped(by:)
was instead implemented to be lazy, thekeyed(by: \.key)
could be replaced withgrouped(by: \.key).transformValues(\.last)
. It's still wordy though, so I thinkkeyed(by:)
is keeping as-is.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd still be notably less efficient, due (mainly) to the construction of the intermediary arrays. I assume the Swift compiler won't be clever enough to eliminate those.
Aside from that, how would lazy execution work? Wouldn't you have to run through the complete input sequence in order to know when you have all the values for any given key?