Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

StrExt::splitn should not require a DoubleEndedSearcher #23269

Merged
merged 2 commits into from
Mar 24, 2015

Conversation

shepmaster
Copy link
Member

Closes #23262

@rust-highfive
Copy link
Collaborator

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@shepmaster
Copy link
Member Author

This has a couple of unanswered questions, so consider it a draft commit. I don't like having to add the duplication involved by creating RCharSplitsN. The only other solution I can think of would be to have CharSplitsN take a closure that could abstract out the call to next / next_back. Also, it's possible that rsplitn should only require ReverseSearcher, and not DoubleEndedSearcher.

@@ -1416,11 +1455,12 @@ impl StrExt for str {
}

#[inline]
fn splitn<'a, P: Pattern<'a>>(&'a self, count: usize, pat: P) -> SplitN<'a, P> {
fn splitn<'a, P: Pattern<'a>>(&'a self, count: usize, pat: P) -> SplitN<'a, P>
where P::Searcher: Searcher<'a>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the associated type on the Pattern trait already has this bound, so shouldn't this extra clause be unnecessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly, it is unnecessary, but having it here provide much better error messages. If you have this here, the error message lets you know that this call needs to be changed. Otherwise, the return type simply won't implement Iterator and you end up with "foo doesn't implement a method called {collect,map,etc}, try implementing Iterator".

@alexcrichton
Copy link
Member

cc @Kimundi, I vaguely remember there being a reason there may be a more strict requirement in some places? Not sure if this was intentional or accidental though.

@Kimundi
Copy link
Member

Kimundi commented Mar 13, 2015

I'm sorry, but I'm unhappy with most of this PR. :(

  • rsplitn should only require ReverseSearcher, not DoubleEndedSearcher.
  • It was intentional that the where P::Searcher: Searcher<'a> clauses can and are being omitted because of the inherent P::Searcher bound. The idea is to make the "simple" signatures of the basic string operations not too complex by having all these bounds everywhere, by adding the reasonable constrain that any pattern should least provide a regular forward searcher. Of course, that only works for methods returning iterators and non-iterator returning methods that don't also require ReverseSearcher or DoubleEndedSearcher.
  • If we would add those "early" where clauses, then the inherent bound on Pattern::Searcher should be removed to remove redundancy, and to make the API consistent. As the patch is right now, its weird that rsplitn() has a needed bound, splitn() has a redundant bound, and split() has no bounds.
  • However, those early where bounds on splitn() and rsplitn() only really work for those two methods because there the generic bounds only have one valid option each: P::Search: Searcher -> SplitN: Iterator and P::Searcher: ReverseSearcher -> RSplitN: Iterator. It doesn't work for the general cases like split, because there its both P::Search: Searcher -> Split: Iterator and P::Search: DoubleEndedSearcher -> Split: DoubleEndedIterator.
  • So even if we add the early where clauses to improve error messages, they should not be added for methods returning iterators because it would make their API inconsistent. You could make the case for having the bounds on all impls and non-iterator returning methods though.

So, in summary I'd be fine with a fix to make RSplitN<'a, P> work with P::Searcher: ReverseSearcher and SplitN<'a, P> without P::Searcher: DoubleEndedSearcher, but I think the other changes at least warrant a separate discussion and/or PR.

I'd much rather see the compiler solve the error message problem in a general way, eg by deeply searching for the cause of why a trait might not be implemented:

<anon>:14         "foo".split("o").rev().collect::<Vec<_>>()
                                         ^~~~~~~~~~~~~~~~~~~
error: type `core::iter::Rev<core::str::Split<'_, &str>>` does not implement any method in scope named `collect`
note: Possible trait that provides `collect`: `core::iter::Iterator`.
note: `core::iter::Rev<core::str::Split<'_, &str>>` would implement `core::iter::Iterator` if
      `core::str::Split<'_, &str>` would implement `core::iter::DoubleEndedIterator`
note: `core::str::Split<'_, &str>` would implement `core::iter::DoubleEndedIterator` if
      `<&str as Pattern<'_>>::Searcher` would implement `core::str::DoubleEndedSearcher`
note: `<&str as Pattern<'_>>::Searcher` does not implement `core::str::DoubleEndedSearcher`

@shepmaster
Copy link
Member Author

I'm sorry, but I'm unhappy with most of this PR. :(

No need to be sorry - that's the point of reviews!

rsplitn should only require ReverseSearcher, not DoubleEndedSearcher.

This was a question in my original comment. I actually agree with this decision, but didn't want to change the semantics until someone knowledgable agreed.

fix to make RSplitN<'a, P> work with P::Searcher: ReverseSearcher and SplitN<'a, P> without P::Searcher: DoubleEndedSearcher, but I think the other changes at least warrant a separate discussion and/or PR.

Easily done. I'll amend this commit; should I open a topic on internals for the rest of it? Maybe as proto RFC? Not so sure on the current best process.

@Kimundi
Copy link
Member

Kimundi commented Mar 14, 2015

Hm, I think the motivation is applicable broadly enough that a internals discuss thread would be the best starting point.
Something like "API conventions for improved error messages", with the proposed changes in this PR as an example, and a general discussion of possible alternatives like improved error messages (like in my example).

@sfackler
Copy link
Member

I doesn't sense to me to leave the correct bound off of the splitn function. The documentation as-is is basically lying to people by implying that the only requirement for splitn is P: Pattern. We should strive to have simple signatures for functions, but not to the point that they're incorrect.

@shepmaster
Copy link
Member Author

the only requirement for splitn is P: Pattern

For pedantry's sake, after this PR that will be true. rsplitn (and rsplit, which springs into existence here), will require P::Searcher: ReverseSearcher, just like ends_with or rfind. At least, that's what my code looks like now, it's entirely possible I've misunderstood!

@shepmaster shepmaster force-pushed the split-not-double-ended branch from a7c7f11 to da68868 Compare March 15, 2015 02:47
@shepmaster
Copy link
Member Author

@Kimundi I've got a new version up now, if you'd be so kind as to give it another look-see. The big thing that's new is that I've introduced rsplit. This is required because of the different semantics of ReverseSearcher - the whole splitting foo:::bar on :: issue.

I should add dedicated tests to both commits, however.

@Kimundi
Copy link
Member

Kimundi commented Mar 15, 2015

@sfackler: The signature would not be incorrect - it returns a type that conditionally implements iterator based on whether the pattern has some properties like supporting a reverse searcher.

I agree that if we would just be talking about rsplitn in isolation, it would be less confusing to just add the bound up front. But you can not do that for something like the DoubleEndedIterator impl of split, so it makes sense to me to treat them consistently in that regard.

That said, while writing this I changed mind: I just realized that we can preserve consistency by treating DoubleEndedIterator impls and reverse iterators differently, but uniformly so.

  • Reverse iterators can only work with a ReverseSearcher bound, so having the bound on the method can be done for all of them and does not restrict their usage.
  • Independent from that, both forward and reverse iterators would conditionally implement DoubleEndedIterator without it being obvious from the method signature, which is consistent to other APIs.

... and this, apparently is also how @shepmaster interpreted the situation, so yeah, patch looks good!

Tests should definitely be added, but I wouldn't fuss too much about code layout and the code duplication introduced here: I've started working on a patch that adds the missing methods from the string pattern RFC, which also includes the changes here, and in which I try to reduce the general complexity of forward/backward iterator wrappers with a more specialized macro that generates them.

Not sure when that will be done though, so I'd definitely try to get this PR merged first.

@shepmaster shepmaster force-pushed the split-not-double-ended branch from da68868 to a39209e Compare March 15, 2015 19:09
@shepmaster
Copy link
Member Author

@Kimundi Updated with tests and small doc changes. I think it's good-to-go now. Thanks!

@bors
Copy link
Contributor

bors commented Mar 17, 2015

☔ The latest upstream changes (presumably #23104) made this pull request unmergeable. Please resolve the merge conflicts.

Verified

This commit was signed with the committer’s verified signature.
raggi James Tucker

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
@shepmaster shepmaster force-pushed the split-not-double-ended branch from a39209e to c6ca220 Compare March 20, 2015 00:32
@shepmaster
Copy link
Member Author

@Kimundi @alexcrichton updated to address merge conflicts.

@Kimundi
Copy link
Member

Kimundi commented Mar 21, 2015

Looks good to me

/// assert_eq!(v, ["ghi", "def", "abc"]);
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
pub fn rsplit<'a, P: Pattern<'a>>(&'a self, pat: P) -> RSplit<'a, P>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I missed the context here, but the original motivation for this removal was that it duplicates the functionality of split(foo).rev() (whereas rsplitn has different semantics than splitn(foo).rev()). Was this added back just to get nicer error messages on the .rev() case? If so that may be covered by #23587 instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was that it duplicates the functionality of split(foo).rev()

@alexcrichton With the conceptual difference of Searcher and ReverseSearcher, this is no longer true. For example, these are not the same:

"baaab".split("aa").rev(); 
"baaab".rsplit("aa");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right! I always forget about that...

@alexcrichton
Copy link
Member

@bors: r+ c6ca220

Thanks!

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Mar 23, 2015

Verified

This commit was signed with the committer’s verified signature.
raggi James Tucker
@bors bors merged commit c6ca220 into rust-lang:master Mar 24, 2015
@shepmaster shepmaster deleted the split-not-double-ended branch March 25, 2015 02:17
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

StrExt::splitn requires a DoubleEndedSearcher whereas StrExt::split does not
6 participants