Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Adjust string length for fullwidth characters when end is undefined #27

Merged
merged 1 commit into from
Feb 16, 2020
Merged

Adjust string length for fullwidth characters when end is undefined #27

merged 1 commit into from
Feb 16, 2020

Conversation

kevva
Copy link
Contributor

@kevva kevva commented Sep 18, 2019

Fixes #26.
Closes #29.


IssueHunt Summary

Referenced issues

This pull request has been submitted to:


IssueHunt has been backed by the following sponsors. Become a sponsor

@sindresorhus
Copy link
Member

@TiagoDanin Can you help review?

@Qix-
Copy link
Member

Qix- commented Sep 19, 2019

Can we add a few cases where string.normalize() would change the input string, too? I want to make sure that won't break anything.

@kevva
Copy link
Contributor Author

kevva commented Sep 19, 2019

Good catch, that actually breaks stuff. I wrote the code that included the normalization, but now that I think of it, I don't see why we need it. It'll only modify the string to use a different character than the one inputted. Also, all the tests passes without normalization as well.

'\u006E\u0303test' === '\u006E\u0303test'.normalize();
//=> false

It doesn't break anything, except that if you're going to compare the output with your input some characters might differ.

image

@Qix-
Copy link
Member

Qix- commented Sep 19, 2019

I agree, we shouldn't care about it. It's another linear sweep of the string, anyway, so it's a win-win. Let's remove it ^^

@kevva
Copy link
Contributor Author

kevva commented Sep 19, 2019

I'll do it in a separate PR though since it's unrelated to this.

Copy link
Contributor

@TiagoDanin TiagoDanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@TiagoDanin
Copy link
Contributor

TiagoDanin commented Sep 21, 2019

I found a big problem.
The support fullwidth characters ( #4, #9) does not work with ANSI combination.

DeepinScreenshot_select-area_20190920220205

\u001B[31m古古test\u001B[39m does not return the \u001B[31m


UPDATE:
Sugestion

if (!astralRegex({exact: true}).test(character) && isFullwidthCodePoint(character.codePointAt())) {
			++visible;
			++begin;
			++stringEnd

			if (typeof end !== 'number') {
				++stringEnd;
			}
		}

This should fix.

Bus this break the test 'supports fullwidth characters', I believe that here "안녕하세" (

t.is(sliceAnsi('안녕하세', 0, 4), '안녕');
) has only 4 (안(c548) 녕(b155) 하(d558) 세(c138)) characters and not 8

@kevva
Copy link
Contributor Author

kevva commented Sep 21, 2019

@TiagoDanin, could you provide a test example?

@TiagoDanin
Copy link
Contributor

test.failing('fullwidth characters + ANSI codes', t => {
	t.is(sliceAnsi('\u001B[31m古古test\u001B[39m', 3), '\u001B[31m古test\u001B[39m');
	t.is(sliceAnsi('\u001B[31m古古test\u001B[39m', 3, 4), '\u001B[31m古\u001B[39m');
});

@kevva
Copy link
Contributor Author

kevva commented Sep 30, 2019

@TiagoDanin, that's no bug, you're starting to slice in the middle of a full width character. This works:

t.is(sliceAnsi('\u001B[31m古古test\u001B[39m', 2), '\u001B[31m古test\u001B[39m');
t.is(sliceAnsi('\u001B[31m古古test\u001B[39m', 2, 4), '\u001B[31m古\u001B[39m');

This is also how I'd expect this library to work. If we were to treat full width characters as "single" characters, I'd add an option for it but it's out of scope of this issue.

@sindresorhus sindresorhus changed the title Adjust string length for full width characters when end is undefined Adjust string length for fullwidth characters when end is undefined Feb 16, 2020
@sindresorhus sindresorhus merged commit 013d918 into chalk:master Feb 16, 2020
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lose characters when fullwidth characters exist and end is not specified
4 participants