''.split() != ''.split(' ') #80

Closed
koddo opened this issue May 16, 2018 · 7 comments

@koddo

koddo commented May 16, 2018

Hello. Does the following qualify?
When you split an empty string without arguments, you get an empty list.
When you specify a separator, you get a non-empty list, see below.
More of an inconsistency than a wtf, but nevertheless I'd like to share this.
I wonder why it works this way.

>>> ''.split() != ''.split(' ')
True

>>> ''.split()
[]

>>> ''.split(' ')
['']
@snahor

snahor commented May 16, 2018

I don't think so, the docs are pretty clear about what happens when you don't supply a delimiter.
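For anyone following along, here is a minimal sketch of the two documented behaviors side by side. The variable names are only illustrative, and the results are what `str.split` returns in CPython 3.

```python
# No separator: runs of whitespace count as a single separator and
# leading/trailing whitespace is dropped, so no empty strings appear.
no_sep_empty = ''.split()
no_sep_padded = '  a   b  '.split()

# Explicit separator: every single occurrence delimits a field,
# so empty fields are kept.
sep_empty = ''.split(' ')
sep_padded = '  a   b  '.split(' ')

print(no_sep_empty)   # []
print(no_sep_padded)  # ['a', 'b']
print(sep_empty)      # ['']
print(sep_padded)     # ['', '', 'a', '', '', 'b', '', '']
```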

@satwikkansal
Owner

umm, I think this can be added to our "Minor ones" collection (I wasn't expecting it to return [''])

But let's wait for more opinions on this...

@lavishsaluja

lavishsaluja commented Jun 4, 2018

I don't think there is any inconsistency. The split docs clearly explain what happens when you split an empty string without a delimiter and what happens when you split it with one.
Also, the very first Stack Overflow answer on what split returns for an empty string pretty much explains how split works.

@koddo
Author

koddo commented Jun 6, 2018

I've just read the Stack Overflow answer, and I'm not convinced. I don't understand why, with no sep, an empty string is considered a run of consecutive whitespace and gets eaten.

Note, the number of result fields is one greater than the number of delimiters.

To me this should apply to both algorithms.

@koddo
Author

koddo commented Jun 6, 2018

And there's another edge case: a string that consists only of whitespace. Should ' '.split() return [''] or ['', ''] for consistency, like ','.split(',') == ['', ''] does? Right now it returns [], like ''.split(). I don't know. To me this behavior is not obvious.
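The edge case above can be reproduced with a short sketch. The variable names are made up for illustration, and the values shown are what CPython 3 returns.

```python
# Whitespace-only string, no separator: the whole string is one run of
# whitespace, which is discarded, leaving no fields at all.
whitespace_no_sep = ' '.split()

# The same string with an explicit ' ' separator: one delimiter, two empty fields.
whitespace_with_sep = ' '.split(' ')

# The comma analogue mentioned above behaves like the explicit-separator case.
comma_with_sep = ','.split(',')

print(whitespace_no_sep)    # []
print(whitespace_with_sep)  # ['', '']
print(comma_with_sep)       # ['', '']
```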

@koddo
Author

koddo commented Jun 6, 2018

I'm sorry for spamming, I think I get it now.

The non-obvious thing about this function is that when there's no sep, the number of result fields is one less than the number of actual delimiters: both ' a ' and 'a' have two delimiters, and splitting them results in one field. With sep given, the number of result fields is one more than the number of actual delimiters: ',' has one delimiter, and splitting it results in two fields.

So the edge case when the string is empty works like this: ''.split() will have zero fields because the number of delimiters is one, while ''.split(' ') will have one field because the number of delimiters is zero.

I'd rather have two functions with carefully chosen names than one function with behavior like this.
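The counting rules can be checked mechanically. The sketch below is only one way to frame it: the explicit-separator rule is tested as "occurrences of sep plus one", and the no-separator case is tested as "number of runs of non-whitespace characters", which is a restatement rather than the delimiter count used above. `count_non_whitespace_runs` and `samples` are hypothetical names introduced for this example.

```python
from itertools import groupby


def count_non_whitespace_runs(s):
    """Hypothetical helper: count maximal runs of non-whitespace characters."""
    return sum(1 for is_space, _ in groupby(s, key=str.isspace) if not is_space)


samples = ['', ' ', 'a', ' a ', 'a b', '  a   b  ', ',', 'a,b,,c']

# With an explicit separator: fields == occurrences of the separator + 1.
sep_rule_holds = all(
    len(s.split(sep)) == s.count(sep) + 1
    for s in samples
    for sep in (' ', ',')
)

# With no separator: fields == number of non-whitespace runs,
# which is 0 for both '' and ' '.
no_sep_rule_holds = all(
    len(s.split()) == count_non_whitespace_runs(s) for s in samples
)

print(sep_rule_holds, no_sep_rule_holds)  # True True
```

Seen this way, ''.split() gives [] because the empty string contains no non-whitespace run, while ''.split(' ') gives [''] because zero delimiters still leave one (empty) field.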

@satwikkansal satwikkansal added this to the 3.0 milestone Dec 4, 2018
satwikkansal added a commit that referenced this issue Jun 8, 2019
@satwikkansal
Owner

@koddo Added, thanks for the suggestion :) You can check the above commit and let me know if something's incorrect or missing.

satwikkansal added a commit that referenced this issue Oct 28, 2019
muscliary pushed a commit to muscliary/wtfpython that referenced this issue Sep 12, 2023
muscliary pushed a commit to muscliary/wtfpython that referenced this issue Sep 12, 2023