Commit a5083f6

Add Grouped and Keyed
1 parent 288d8b0 commit a5083f6

8 files changed

+340
-1
lines changed

Guides/Combinations.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -86,7 +86,7 @@ call to `CombinationsSequence.Iterator.next()` is an O(_n_) operation.
8686
### Naming
8787

8888
The parameter label in `combination(ofCount:)` is the best match for the
89-
Swift API guidelines. A few other options were considered:
89+
[Swift's API Design Guidelines](https://www.swift.org/documentation/api-design-guidelines/). A few other options were considered:
9090

9191
- When the standard library uses `of` as a label, the parameter is generally
9292
the object of the operation, as in `type(of:)` and `firstIndex(of:)`, and

Guides/Grouped.md

+71
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,71 @@
1+
# Grouped
2+
3+
[[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/Grouped.swift) |
4+
[Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/GroupedTests.swift)]
5+
6+
Groups up elements of a sequence into a new Dictionary, whose values are Arrays of grouped elements, each keyed by the result of the given closure.
7+
8+
This operation is available for any sequence by calling the `grouped(by:)`
9+
method.
10+
11+
```swift
12+
let fruits = ["Apricot", "Banana", "Apple", "Cherry", "Avocado", "Coconut"]
13+
let fruitsByLetter = fruits.grouped(by: { $0.first! })
14+
// Results in:
15+
[
16+
"B": ["Banana"],
17+
"A": ["Apricot", "Apple", "Avocado"],
18+
"C": ["Cherry", "Coconut"],
19+
]
20+
```
21+
22+
If you wish to achieve a similar effect but for single values (instead of Arrays of grouped values), see [`keyed(by:)`](Keyed.md).
23+
24+
## Detailed Design
25+
26+
The `grouped(by:)` method is declared as a `Sequence` extension returning
27+
`[GroupKey: [Element]]`.
28+
29+
```swift
30+
extension Sequence {
31+
public func grouped<GroupKey>(
32+
by keyForValue: (Element) throws -> GroupKey
33+
) rethrows -> [GroupKey: [Element]]
34+
}
35+
```
36+
37+
### Complexity
38+
39+
Calling `grouped(by:)` is an O(_n_) operation.
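The implementation in this commit forwards to the standard library's `Dictionary(grouping:by:)`, which visits each element once. A minimal sketch of the same idea, assuming nothing beyond the standard library: `groupedSketch(by:)` is a hypothetical local name used only so the snippet runs without importing the package, and the `Hashable` constraint is written out explicitly for clarity.

```swift
extension Sequence {
  /// Hypothetical stand-in for `grouped(by:)`, for illustration only.
  func groupedSketch<GroupKey: Hashable>(
    by keyForValue: (Element) throws -> GroupKey
  ) rethrows -> [GroupKey: [Element]] {
    // One pass over `self`; each element is appended to its key's array.
    try Dictionary(grouping: self, by: keyForValue)
  }
}

let fruits = ["Apricot", "Banana", "Apple", "Cherry", "Avocado", "Coconut"]
let fruitsByLetter = fruits.groupedSketch(by: { $0.first! })

// Dictionary iteration order is unspecified, so look groups up by key.
// Within a group, elements keep their original relative order.
print(fruitsByLetter["A"] ?? [])  // ["Apricot", "Apple", "Avocado"]
print(fruitsByLetter["C"] ?? [])  // ["Cherry", "Coconut"]
```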
40+
41+
### Comparison with other languages
42+
43+
| Language | Grouping API |
44+
|---------------|--------------|
45+
| Java | [`groupingBy`](https://docs.oracle.com/en/java/javase/20/docs/api/java.base/java/util/stream/Collectors.html#groupingBy(java.util.function.Function)) |
46+
| Kotlin | [`groupBy`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/group-by.html) |
47+
| C# | [`GroupBy`](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.groupby?view=net-7.0#system-linq-enumerable-groupby) |
48+
| Rust | [`group_by`](https://doc.rust-lang.org/std/primitive.slice.html#method.group_by) |
49+
| Ruby | [`group_by`](https://ruby-doc.org/3.2.2/Enumerable.html#method-i-group_by) |
50+
| Python | [`groupby`](https://docs.python.org/3/library/itertools.html#itertools.groupby) |
51+
| PHP (Laravel) | [`groupBy`](https://laravel.com/docs/10.x/collections#method-groupby) |
52+
53+
#### Naming
54+
55+
All the surveyed languages name this operation with a variant of "grouped" or "grouping". The past tense `grouped(by:)` best fits [Swift's API Design Guidelines](https://www.swift.org/documentation/api-design-guidelines/).
56+
57+
#### Customization points
58+
59+
Java and C# are interesting in that they provide multiple overloads with several points of customization:
60+
61+
1. Changing the type of the groups.
62+
1. E.g. the groups can be Sets instead of Arrays.
63+
2. Akin to calling `.mapValues { group in Set(group) }` on the resultant dictionary, but avoiding the intermediate allocation of Arrays of each group.
64+
2. Picking which elements end up in the groupings.
65+
1. The default is the elements of the input sequence, but can be changed.
66+
2. Akin to calling `.mapValues { group in group.map(someTransform) }` on the resultant dictionary, but avoiding the intermediate allocation of Arrays of each group.
67+
3. Changing the type of the outermost collection.
68+
1. E.g. using an `OrderedDictionary`, `SortedDictionary` or `TreeDictionary` instead of the default (hashed, unordered) `Dictionary`.
69+
2. There's no great way to achieve this with `grouped(by:)`. One could pass the resultant dictionary to an initializer of one of the other dictionary types, but that isn't sufficient: once the `Dictionary` loses the ordering, there's no way to get it back when constructing one of the ordered dictionary variants.
70+
71+
It is not clear which of these points of customization are worth supporting, or what the best way to express them might be.
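The first two customization points can be approximated with what the standard library already offers. The sketch below is illustrative only: it calls `Dictionary(grouping:by:)` directly (the initializer that `grouped(by:)` forwards to in this commit) so that it runs without the package, and the variable names are made up for the example.

```swift
let fruits = ["Apricot", "Banana", "Apple", "Cherry", "Avocado", "Coconut"]

// Same result as `fruits.grouped(by:)`, written against the standard library.
let groups = Dictionary(grouping: fruits, by: { $0.first! })

// 1. Changing the type of the groups: Arrays are built first, then converted.
let setGroups: [Character: Set<String>] = groups.mapValues { group in Set(group) }

// 2. Changing which values end up in the groups (here: the name lengths).
let lengthGroups: [Character: [Int]] = groups.mapValues { group in
  group.map { $0.count }
}

// Building the Sets directly avoids the intermediate Arrays.
let directSetGroups = fruits.reduce(into: [Character: Set<String>]()) { result, fruit in
  result[fruit.first!, default: []].insert(fruit)
}

print(setGroups == directSetGroups)  // true
print(lengthGroups["A"] ?? [])       // [7, 5, 7]
```

The `reduce(into:)` form shows what a dedicated overload would save: one Array allocation per group.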

Guides/Keyed.md

+74
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,74 @@
1+
# Keyed
2+
3+
[[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/Keyed.swift) |
4+
[Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/KeyedTests.swift)]
5+
6+
Stores the elements of a sequence as the values of a Dictionary, keyed by the result of the given closure.
7+
8+
```swift
9+
let fruits = ["Apple", "Banana", "Cherry"]
10+
let fruitByLetter = fruits.keyed(by: { $0.first! })
11+
// Results in:
12+
[
13+
"A": "Apple",
14+
"B": "Banana",
15+
"C": "Cherry",
16+
]
17+
```
18+
19+
Duplicate keys will trigger a runtime error by default. To handle this, you can provide a closure which specifies which value to keep:
20+
21+
```swift
22+
let fruits = ["Apricot", "Banana", "Apple", "Cherry", "Blackberry", "Avocado", "Coconut"]
23+
let fruitsByLetter = fruits.keyed(
24+
by: { $0.first! },
25+
uniquingKeysWith: { old, new in new } // Always pick the latest fruit
26+
)
27+
// Results in:
28+
[
29+
"A": "Avocado",
30+
"B": "Blackberry",
31+
"C": "Coconut",
32+
]
33+
```
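The closure passed as `uniquingKeysWith` receives the value already stored for the key first and the newly encountered value second, the same convention as the standard library's `Dictionary(_:uniquingKeysWith:)` that this commit's implementation forwards to. A small sketch using that initializer directly, so it runs without the package (the names `pairs`, `keepFirst` and `keepLast` are illustrative):

```swift
let fruits = ["Apricot", "Banana", "Apple", "Cherry", "Blackberry", "Avocado", "Coconut"]

// Pair each fruit with its derived key, much as `keyed(by:)` does internally.
let pairs = fruits.map { ($0.first!, $0) }

// Keep the value that was stored first for each key.
let keepFirst = Dictionary(pairs, uniquingKeysWith: { old, _ in old })

// Keep the value that was encountered last for each key.
let keepLast = Dictionary(pairs, uniquingKeysWith: { _, new in new })

print(keepFirst["A"] ?? "")  // Apricot
print(keepLast["A"] ?? "")   // Avocado
```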
34+
35+
## Detailed Design
36+
37+
The `keyed(by:)` method is declared as a `Sequence` extension returning `[Key: Element]`.
38+
39+
```swift
40+
extension Sequence {
41+
public func keyed<Key>(
42+
by keyForValue: (Element) throws -> Key,
43+
uniquingKeysWith combine: ((Element, Element) throws -> Element)? = nil
44+
) rethrows -> [Key: Element]
45+
}
46+
```
47+
48+
### Complexity
49+
50+
Calling `keyed(by:)` is an O(_n_) operation.
51+
52+
### Comparison with other languages
53+
54+
| Language | "Keying" API |
55+
|---------------|-------------|
56+
| Java | [`toMap`](https://docs.oracle.com/en/java/javase/20/docs/api/java.base/java/util/stream/Collectors.html#toMap(java.util.function.Function,java.util.function.Function)) |
57+
| Kotlin | [`associateBy`](https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/associate-by.html) |
58+
| C# | [`ToDictionary`](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.todictionary?view=net-7.0#system-linq-enumerable-todictionary) |
59+
| Ruby (ActiveSupport) | [`index_by`](https://rubydoc.info/gems/activesupport/7.0.5/Enumerable#index_by-instance_method) |
60+
| PHP (Laravel) | [`keyBy`](https://laravel.com/docs/10.x/collections#method-keyby) |
61+
62+
#### Rejected alternative names
63+
64+
1. Java's `toMap` refers to `Map`/`HashMap`, Java's names for dictionaries and other associative collections. It's easy to confuse with the transformation function, `Sequence.map(_:)`.
65+
2. C#'s `ToXXX()` naming doesn't suit Swift well, which tends to prefer `Foo.init` over `toFoo()` methods.
66+
3. Ruby's `index_by` naming doesn't fit Swift well, where "index" is a specific term (e.g. the `associatedtype Index` on `Collection`). There is also an [`indexed()`](Indexed.md) method in swift-algorithms, which is specifically to do with matching elements up with their indices, and not any arbitrary derived value.
67+
68+
#### Alternative names
69+
70+
Kotlin's `associateBy` naming is a good alternative. Adapted to the past-tense convention of [Swift's API Design Guidelines](https://www.swift.org/documentation/api-design-guidelines/), it would be spelled `associated(by:)`.
71+
72+
#### Customization points
73+
74+
Java and C# are interesting in that they provide overloads that let you customize the type of the outermost collection. E.g. using an `OrderedDictionary` instead of the default (hashed, unordered) `Dictionary`.
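Without an ordered dictionary type at hand, one way to keep the first-seen key order is to record the keys in an Array alongside the `Dictionary`. The sketch below is only an illustration of that idea: `keyedPreservingOrder(by:)` is a hypothetical helper, not part of this commit or of swift-algorithms, and on duplicate keys it simply keeps the latest value.

```swift
extension Sequence {
  /// Hypothetical helper, for illustration only.
  func keyedPreservingOrder<Key: Hashable>(
    by keyForValue: (Element) throws -> Key
  ) rethrows -> (order: [Key], values: [Key: Element]) {
    var order: [Key] = []
    var values: [Key: Element] = [:]
    for element in self {
      let key = try keyForValue(element)
      // `updateValue` returns nil when the key was not present before,
      // so each key is appended to `order` exactly once.
      if values.updateValue(element, forKey: key) == nil {
        order.append(key)
      }
    }
    return (order, values)
  }
}

let result = ["Cherry", "Apple", "Banana", "Avocado"]
  .keyedPreservingOrder(by: { $0.first! })

print(result.order)                             // ["C", "A", "B"]
print(result.order.map { result.values[$0]! })  // ["Cherry", "Avocado", "Banana"]
```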

README.md

+2
Original file line numberDiff line numberDiff line change
@@ -45,8 +45,10 @@ Read more about the package, and the intent behind it, in the [announcement on s
4545
- [`adjacentPairs()`](https://github.com/apple/swift-algorithms/blob/main/Guides/AdjacentPairs.md): Lazily iterates over tuples of adjacent elements.
4646
- [`chunked(by:)`, `chunked(on:)`, `chunks(ofCount:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Chunked.md): Eager and lazy operations that break a collection into chunks based on either a binary predicate or when the result of a projection changes or chunks of a given count.
4747
- [`firstNonNil(_:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/FirstNonNil.md): Returns the first non-`nil` result from transforming a sequence's elements.
48+
- [`grouped(by:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Grouped.md): Group up elements using the given closure, returning a Dictionary of those groups, keyed by the results of the closure.
4849
- [`indexed()`](https://github.com/apple/swift-algorithms/blob/main/Guides/Indexed.md): Iterate over tuples of a collection's indices and elements.
4950
- [`interspersed(with:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Intersperse.md): Place a value between every two elements of a sequence.
51+
- [`keyed(by:)`, `keyed(by:uniquingKeysWith:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Keyed.md): Returns a Dictionary that associates elements of a sequence with the keys returned by the given closure.
5052
- [`partitioningIndex(where:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Partition.md): Returns the starting index of the partition of a collection that matches a predicate.
5153
- [`reductions(_:)`, `reductions(_:_:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Reductions.md): Returns all the intermediate states of reducing the elements of a sequence or collection.
5254
- [`split(maxSplits:omittingEmptySubsequences:whereSeparator)`, `split(separator:maxSplits:omittingEmptySubsequences)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Split.md): Lazy versions of the Standard Library's eager operations that split sequences and collections into subsequences separated by the specified separator element.

Sources/Algorithms/Grouped.swift

+25
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
//===----------------------------------------------------------------------===//
2+
//
3+
// This source file is part of the Swift Algorithms open source project
4+
//
5+
// Copyright (c) 2021 Apple Inc. and the Swift project authors
6+
// Licensed under Apache License v2.0 with Runtime Library Exception
7+
//
8+
// See https://swift.org/LICENSE.txt for license information
9+
//
10+
//===----------------------------------------------------------------------===//
11+
12+
extension Sequence {
13+
/// Groups up elements of `self` into a new Dictionary,
14+
/// whose values are Arrays of grouped elements,
15+
/// each keyed by the group key returned by the given closure.
16+
/// - Parameters:
17+
/// - keyForValue: A closure that returns a key for each element in
18+
/// `self`.
19+
/// - Returns: A dictionary containing grouped elements of self, keyed by
20+
/// the keys derived by the `keyForValue` closure.
21+
@inlinable
22+
public func grouped<GroupKey>(by keyForValue: (Element) throws -> GroupKey) rethrows -> [GroupKey: [Element]] {
23+
try Dictionary(grouping: self, by: keyForValue)
24+
}
25+
}

Sources/Algorithms/Keyed.swift

+47
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
//===----------------------------------------------------------------------===//
2+
//
3+
// This source file is part of the Swift Algorithms open source project
4+
//
5+
// Copyright (c) 2020 Apple Inc. and the Swift project authors
6+
// Licensed under Apache License v2.0 with Runtime Library Exception
7+
//
8+
// See https://swift.org/LICENSE.txt for license information
9+
//
10+
//===----------------------------------------------------------------------===//
11+
12+
extension Sequence {
13+
/// Creates a new Dictionary from the elements of `self`, keyed by the
14+
/// results returned by the given `keyForValue` closure. As the dictionary is
15+
/// built, this method calls the `combine` closure with the current and
16+
/// new values for any duplicate keys. Pass a closure as `combine` that
17+
/// returns the value to use in the resulting dictionary: The closure can
18+
/// choose between the two values, combine them to produce a new value, or
19+
/// even throw an error.
20+
///
21+
/// If no `combine` closure is provided, deriving the same key for
22+
/// more than one element of `self` results in a runtime error.
23+
///
24+
/// - Parameters:
25+
/// - keyForValue: A closure that returns a key for each element in
26+
/// `self`.
27+
/// - combine: A closure that is called with the values for any duplicate
28+
/// keys that are encountered. The closure returns the desired value for
29+
/// the final dictionary.
30+
@inlinable
31+
public func keyed<Key>(
32+
by keyForValue: (Element) throws -> Key,
33+
// TODO: pass `Key` into `combine`: (Key, Element, Element) throws -> Element
34+
uniquingKeysWith combine: ((Element, Element) throws -> Element)? = nil
35+
) rethrows -> [Key: Element] {
36+
// Note: This implementation is a bit convoluted, but it's just aiming to reuse the existing stdlib logic,
37+
// to ensure consistent behaviour, error messages, etc.
38+
// If this API ends up in the stdlib itself, it could just call the underlying `_NativeDictionary` methods.
39+
try withoutActuallyEscaping(keyForValue) { keyForValue in
40+
if let combine {
41+
return try Dictionary(self.lazy.map { (try keyForValue($0), $0) }, uniquingKeysWith: combine)
42+
} else {
43+
return try Dictionary(uniqueKeysWithValues: self.lazy.map { (try keyForValue($0), $0) } )
44+
}
45+
}
46+
}
47+
}

Tests/SwiftAlgorithmsTests/GroupedTests.swift

+46
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
//===----------------------------------------------------------------------===//
2+
//
3+
// This source file is part of the Swift Algorithms open source project
4+
//
5+
// Copyright (c) 2020 Apple Inc. and the Swift project authors
6+
// Licensed under Apache License v2.0 with Runtime Library Exception
7+
//
8+
// See https://swift.org/LICENSE.txt for license information
9+
//
10+
//===----------------------------------------------------------------------===//
11+
12+
import XCTest
13+
import Algorithms
14+
15+
final class GroupedTests: XCTestCase {
16+
private class SampleError: Error {}
17+
18+
// Based on https://github.com/apple/swift/blob/4d1d8a9de5ebc132a17aee9fc267461facf89bf8/validation-test/stdlib/Dictionary.swift#L1974-L1988
19+
20+
func testGroupedBy() {
21+
let r = 0..<10
22+
23+
let d1 = r.grouped(by: { $0 % 3 })
24+
XCTAssertEqual(3, d1.count)
25+
XCTAssertEqual(d1[0]!, [0, 3, 6, 9])
26+
XCTAssertEqual(d1[1]!, [1, 4, 7])
27+
XCTAssertEqual(d1[2]!, [2, 5, 8])
28+
29+
let d2 = r.grouped(by: { $0 })
30+
XCTAssertEqual(10, d2.count)
31+
32+
let d3 = (0..<0).grouped(by: { $0 })
33+
XCTAssertEqual(0, d3.count)
34+
}
35+
36+
func testThrowingFromKeyFunction() {
37+
let input = ["Apple", "Banana", "Cherry"]
38+
let error = SampleError()
39+
40+
XCTAssertThrowsError(
41+
try input.grouped(by: { (_: String) -> Character in throw error })
42+
) { thrownError in
43+
XCTAssertIdentical(error, thrownError as? SampleError)
44+
}
45+
}
46+
}

Tests/SwiftAlgorithmsTests/KeyedTests.swift

+74
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,74 @@
1+
//===----------------------------------------------------------------------===//
2+
//
3+
// This source file is part of the Swift Algorithms open source project
4+
//
5+
// Copyright (c) 2020 Apple Inc. and the Swift project authors
6+
// Licensed under Apache License v2.0 with Runtime Library Exception
7+
//
8+
// See https://swift.org/LICENSE.txt for license information
9+
//
10+
//===----------------------------------------------------------------------===//
11+
12+
import XCTest
13+
import Algorithms
14+
15+
final class KeyedTests: XCTestCase {
16+
private class SampleError: Error {}
17+
18+
func testUniqueKeys() {
19+
let d = ["Apple", "Banana", "Cherry"].keyed(by: { $0.first! })
20+
XCTAssertEqual(d.count, 3)
21+
XCTAssertEqual(d["A"]!, "Apple")
22+
XCTAssertEqual(d["B"]!, "Banana")
23+
XCTAssertEqual(d["C"]!, "Cherry")
24+
XCTAssertNil(d["D"])
25+
}
26+
27+
func testEmpty() {
28+
let d = EmptyCollection<String>().keyed(by: { $0.first! })
29+
XCTAssertEqual(d.count, 0)
30+
}
31+
32+
func testNonUniqueKeys() throws {
33+
throw XCTSkip("""
34+
TODO: What's the XCTest equivalent to `expectCrashLater()`?
35+
36+
https://github.com/apple/swift/blob/4d1d8a9de5ebc132a17aee9fc267461facf89bf8/validation-test/stdlib/Dictionary.swift#L1914
37+
""")
38+
}
39+
40+
func testNonUniqueKeysWithMergeFunction() {
41+
let d = ["Apple", "Avocado", "Banana", "Cherry", "Coconut"].keyed(
42+
by: { $0.first! },
43+
uniquingKeysWith: { older, newer in "\(older)-\(newer)"}
44+
)
45+
46+
XCTAssertEqual(d.count, 3)
47+
XCTAssertEqual(d["A"]!, "Apple-Avocado")
48+
XCTAssertEqual(d["B"]!, "Banana")
49+
XCTAssertEqual(d["C"]!, "Cherry-Coconut")
50+
XCTAssertNil(d["D"])
51+
}
52+
53+
func testThrowingFromKeyFunction() {
54+
let input = ["Apple", "Banana", "Cherry"]
55+
let error = SampleError()
56+
57+
XCTAssertThrowsError(
58+
try input.keyed(by: { (_: String) -> Character in throw error })
59+
) { thrownError in
60+
XCTAssertIdentical(error, thrownError as? SampleError)
61+
}
62+
}
63+
64+
func testThrowingFromCombineFunction() {
65+
let input = ["Apple", "Avocado", "Banana", "Cherry"]
66+
let error = SampleError()
67+
68+
XCTAssertThrowsError(
69+
try input.keyed(by: { $0.first! }, uniquingKeysWith: { _, _ in throw error })
70+
) { thrownError in
71+
XCTAssertIdentical(error, thrownError as? SampleError)
72+
}
73+
}
74+
}
