Clarify docs for std::collections

root-goblin · workingjubilee · commit 95f9af09fc2f · 2024-09-10T14:25:38.000-07:00
Page affected: https://doc.rust-lang.org/std/collections/index.html#performance Changes: - bulleted conventions - expanded definitions on terms used - more accessible language - merged Sequence and Map performance cost tables
diff --git a/std/src/collections/mod.rs b/std/src/collections/mod.rs
@@ -79,41 +79,49 @@
 //! see each type's documentation, and note that the names of actual methods may
 //! differ from the tables below on certain collections.
 //!
-//! Throughout the documentation, we will follow a few conventions. For all
-//! operations, the collection's size is denoted by n. If another collection is
-//! involved in the operation, it contains m elements. Operations which have an
-//! *amortized* cost are suffixed with a `*`. Operations with an *expected*
-//! cost are suffixed with a `~`.
+//! Throughout the documentation, we will adhere to the following conventions
+//! for operation notation:
 //!
-//! All amortized costs are for the potential need to resize when capacity is
-//! exhausted. If a resize occurs it will take *O*(*n*) time. Our collections never
-//! automatically shrink, so removal operations aren't amortized. Over a
-//! sufficiently large series of operations, the average cost per operation will
-//! deterministically equal the given cost.
+//! * The collection's size is denoted by `n`.
+//! * If a second collection is involved, its size is denoted by `m`.
+//! * Item indices are denoted by `i`.
+//! * Operations which have an *amortized* cost are suffixed with a `*`.
+//! * Operations with an *expected* cost are suffixed with a `~`.
 //!
-//! Only [`HashMap`] has expected costs, due to the probabilistic nature of hashing.
-//! It is theoretically possible, though very unlikely, for [`HashMap`] to
-//! experience worse performance.
+//! Calling operations that add to a collection will occasionally require a
+//! collection to be resized - an extra operation that takes *O*(*n*) time.
 //!
-//! ## Sequences
+//! *Amortized* costs are calculated to account for the time cost of such resize
+//! operations *over a sufficiently large series of operations*. An individual
+//! operation may be slower or faster due to the sporadic nature of collection
+//! resizing, however the average cost per operation will approach the amortized
+//! cost.
 //!
-//! |                | get(i)                 | insert(i)               | remove(i)              | append    | split_off(i)           |
-//! |----------------|------------------------|-------------------------|------------------------|-----------|------------------------|
-//! | [`Vec`]        | *O*(1)                 | *O*(*n*-*i*)*           | *O*(*n*-*i*)           | *O*(*m*)* | *O*(*n*-*i*)           |
-//! | [`VecDeque`]   | *O*(1)                 | *O*(min(*i*, *n*-*i*))* | *O*(min(*i*, *n*-*i*)) | *O*(*m*)* | *O*(min(*i*, *n*-*i*)) |
-//! | [`LinkedList`] | *O*(min(*i*, *n*-*i*)) | *O*(min(*i*, *n*-*i*))  | *O*(min(*i*, *n*-*i*)) | *O*(1)    | *O*(min(*i*, *n*-*i*)) |
+//! Rust's collections never automatically shrink, so removal operations aren't
+//! amortized.
 //!
-//! Note that where ties occur, [`Vec`] is generally going to be faster than [`VecDeque`], and
-//! [`VecDeque`] is generally going to be faster than [`LinkedList`].
+//! [`HashMap`] uses *expected* costs. It is theoretically possible, though very
+//! unlikely, for [`HashMap`] to experience significantly worse performance than
+//! the expected cost. This is due to the probabilistic nature of hashing - i.e.
+//! it is possible to generate a duplicate hash given some input key that will
+//! requires extra computation to correct.
 //!
-//! ## Maps
+//! ## Cost of Collection Operations
 //!
-//! For Sets, all operations have the cost of the equivalent Map operation.
 //!
-//! |              | get           | insert        | remove        | range         | append       |
-//! |--------------|---------------|---------------|---------------|---------------|--------------|
-//! | [`HashMap`]  | *O*(1)~       | *O*(1)~*      | *O*(1)~       | N/A           | N/A          |
-//! | [`BTreeMap`] | *O*(log(*n*)) | *O*(log(*n*)) | *O*(log(*n*)) | *O*(log(*n*)) | *O*(*n*+*m*) |
+//! |                | get(i)                 | insert(i)               | remove(i)              | append(Vec(m))    | split_off(i)           | range           | append       |
+//! |----------------|------------------------|-------------------------|------------------------|-------------------|------------------------|-----------------|--------------|
+//! | [`Vec`]        | *O*(1)                 | *O*(*n*-*i*)*           | *O*(*n*-*i*)           | *O*(*m*)*         | *O*(*n*-*i*)           | N/A             | N/A          |
+//! | [`VecDeque`]   | *O*(1)                 | *O*(min(*i*, *n*-*i*))* | *O*(min(*i*, *n*-*i*)) | *O*(*m*)*         | *O*(min(*i*, *n*-*i*)) | N/A             | N/A          |
+//! | [`LinkedList`] | *O*(min(*i*, *n*-*i*)) | *O*(min(*i*, *n*-*i*))  | *O*(min(*i*, *n*-*i*)) | *O*(1)            | *O*(min(*i*, *n*-*i*)) | N/A             | N/A          |
+//! | [`HashMap`]    | *O*(1)~                | *O*(1)~*                | *O*(1)~                | N/A               | N/A                    | N/A             | N/A          |
+//! | [`BTreeMap`]   | *O*(log(*n*))          | *O*(log(*n*))           | *O*(log(*n*))          | N/A               | N/A                    | *O*(log(*n*))   | *O*(*n*+*m*) |
+//!
+//! Note that where ties occur, [`Vec`] is generally going to be faster than
+//! [`VecDeque`], and [`VecDeque`] is generally going to be faster than
+//! [`LinkedList`].
+//!
+//! For Sets, all operations have the cost of the equivalent Map operation.
 //!
 //! # Correct and Efficient Usage of Collections
 //!