Misaligned editor cursor movement with Hangul Jamo characters #207794

Closed
ruby3141 opened this issue Mar 15, 2024 · 10 comments
Labels
editor-core Editor basic functionality feature-request Request for new features or functionality

Comments

@ruby3141

ruby3141 commented Mar 15, 2024

Does this issue occur when all extensions are disabled?: Yes

  • VS Code Version: 1.87.2
  • OS Version: Windows 11 23H2 22631.3296

Note

(updated 2024-03-18)
It turns out that most of the problem described below was not a VSCode bug.
VSCodeVim was to blame. Sorry for the misinformation.
Korean written with Hangul Jamo still has the problem, though.

Steps to Reproduce:

  • Example text

    The quick brown fox jumps over a lazy dog. the quick brown fox
    The quick brown fox jumps over a lazy dog. The quick brown fox jumps over a lazy dog. The quick brown fox jumps over a lazy dog.
    とりなくこゑす / ゆめさませ / みよあけわたる / ひんがしを / そらいろはえて / おきつへに / ほふねむれゐぬ / もやのうち
    Victor jagt zwölf Boxkämpfer quer über den großen Sylter Deich Victor jagt zwölf Boxkämpfer quer über den großen Sylter Deich
    정 참판 양반댁 규수 큰 교자 타고 혼례 치른 날. 정 참판 양반댁 규수 큰 교자 타고 혼례 치른 날. 정 참판 양반댁 규수 큰 교자 타고 혼례 치른 날.
    정 참판 양반댁 규수 큰 교자 타고 혼례 치른 날. 정 참판 양반댁 규수 큰 교자 타고 혼례 치른 날. 정 참판 양반댁 규수 큰 교자 타고 혼례 치른 날.
    เป็นมนุษย์สุดประเสริฐเลิศคุณค่า กว่าบรรดาฝูงสัตว์เดรัจฉาน จงฝ่าฟันพัฒนาวิชาการ อย่าล้างผลาญฤๅเข่นฆ่าบีฑาใคร ไม่ถือโทษโกรธแช่งซัดฮึดฮัดด่า
    鉴于对人类家庭所有成员的固有尊严及其平等的和不移的权利的承认,乃是世界自由、正义与和平的基础。鉴于对人类家庭所有成员的固有尊严及其平等的
    鑑於人類社會個成員儕有個固有尊嚴脫仔平等個脫仔勿移個權利承認,是世界自由、正義脫仔和平個基礎。鑑於人類社會個成員儕有個固有尊嚴脫仔平等個
    

    Each line consists of characters used in, respectively:

    • English, with full-width Latin characters
    • English, with half-width Latin characters
    • Japanese
    • German
    • Korean, with Hangul Syllables Unicode block characters
    • Korean, with Hangul Jamo Unicode block characters
    • Thai
    • Mandarin, simplified
    • Mandarin, traditional

    Note that for Thai and Hangul Jamo, some sequences of Unicode characters combine into a single displayed character.

  • Fonts used to create sample images
    NeoDunggeunmo for Latin and Korean glyphs,
    GNU Unifont as fallback.
    Both are bitmap-based monospace fonts on an 8x16 grid, so glyphs are perfectly aligned.
    (I think this is reproducible with Unifont alone.)

  • Steps

    1. Click to the left of the first full-width character to place the cursor there.
    2. Press the up and down arrow keys to move the cursor up and down.
  • Expected behavior
    image
    The cursor moves vertically, without horizontal movement, following the marked line in the image.

  • Current behavior
    image
    The cursor jumps horizontally while moving vertically, ignoring character width and combining sequences.

@ruby3141
Author

Sorry for the huge misinformation.
VSCodeVim was not properly disabled during testing, and it was the main cause of this problem.

With it disabled, everything works as intended EXCEPT for "Korean with Hangul Jamo", as far as I can reproduce.

@ruby3141 ruby3141 changed the title Misaligned editor cursor movement with full-width character and combined character Misaligned editor cursor movement with Hangul Jamo characters Mar 18, 2024
@ruby3141
Author

Recording of the current behavior, with extensions properly disabled

You can see that the cursor clearly jumps on the Hangul Jamo line.

@alexdima
Member

The width of the glyphs is 100% controlled by the font. Have you tried a font like Inconsolata, which is known for making all glyphs equally wide?

@alexdima alexdima added the info-needed Issue requires more information from poster label Jun 12, 2024
@ruby3141
Author

The glyphs of the fonts I used in that reproduction are "uniformly" wide.
More specifically, every "full-width" glyph, such as a CJK glyph, is exactly twice as wide as a "half-width" glyph.
(They are monospaced by design, though not in the strict technical sense. CJK programming fonts usually behave like that.)

And VSCode simply considers full-width characters as "double sized" while calculating vertical movement.
You can test that behavior with a font like Last Resort, a special-purpose fallback font that indicates which Unicode block a "tofu"ed character belongs to, and has "equally" sized square box glyphs for every possible Unicode code point.

image
With that font, the example text looks like this. The highlighted characters are in the same "VSCode vertical line".
As you can see, unlike Hangul Syllables (5th line), Hangul Jamo (6th line) seems to be treated as "half-width": it falls in the same "displayed vertical line" as "half-width" characters such as English, German and Thai.
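
To make the difference between the two Korean lines concrete, here is a minimal sketch. It assumes the 6th sample line is the decomposed (conjoining jamo) spelling of the 5th, which is what Unicode NFD normalization produces. `codePoints` is a hypothetical helper written for this illustration, not part of VS Code.

```typescript
// Hypothetical helper: list the code points of a string in U+XXXX notation.
function codePoints(text: string): string[] {
    return Array.from(text, ch =>
        "U+" + ch.codePointAt(0)!.toString(16).toUpperCase().padStart(4, "0"));
}

// The first syllable of the sample sentence, as one precomposed code point
// from the Hangul Syllables block (AC00 - D7AF).
const precomposed = "\uC815";

// NFD splits it into three conjoining code points from the Hangul Jamo
// block (1100 - 11FF): leading consonant, vowel, trailing consonant.
const decomposed = precomposed.normalize("NFD");

console.log(codePoints(precomposed)); // [ 'U+C815' ]
console.log(codePoints(decomposed));  // [ 'U+110C', 'U+1165', 'U+11BC' ]
```

Both spellings render as the same single glyph, but only the first one uses a code point from the Hangul Syllables block.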

@alexdima
Member

And VSCode simply considers full-width characters as "double sized" while calculating vertical movement

That is correct, that is the current implementation. With our current stack, we delegate font rendering in the editor to the browser, so we don't implement text layout ourselves. To implement true width-based vertical movement, we would basically need to render the line the cursor would land on and do an offset search to find the column closest to the current horizontal offset. The current implementation hard-codes a heuristic that wide characters are twice as wide as narrow ones. This is done at

    /**
     * Returns a visible column from a column.
     * @see {@link CursorColumns}
     */
    public static visibleColumnFromColumn(lineContent: string, column: number, tabSize: number): number {
        const textLen = Math.min(column - 1, lineContent.length);
        const text = lineContent.substring(0, textLen);
        const iterator = new strings.GraphemeIterator(text);
        let result = 0;
        while (!iterator.eol()) {
            const codePoint = strings.getNextCodePoint(text, textLen, iterator.offset);
            iterator.nextGraphemeLength();
            result = this._nextVisibleColumn(codePoint, result, tabSize);
        }
        return result;
    }
and at
    private static _nextVisibleColumn(codePoint: number, visibleColumn: number, tabSize: number): number {
        if (codePoint === CharCode.Tab) {
            return CursorColumns.nextRenderTabStop(visibleColumn, tabSize);
        }
        if (strings.isFullWidthCharacter(codePoint) || strings.isEmojiImprecise(codePoint)) {
            return visibleColumn + 2;
        }
        return visibleColumn + 1;
    }

Because of this, the current implementation also does not work correctly for proportional fonts. I acknowledge this behavior but will mark this as a feature request because the code is working as it was built, allowing for quick vertical movement while sacrificing correctness for non-monospace fonts or mostly non-programming text content.
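
For comparison, the "offset search" alternative mentioned in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `closestColumn` and its `xOffsets` input are hypothetical, and it assumes the caret x-offsets of the target line have already been measured from the rendered line, which is the expensive part the current heuristic avoids.

```typescript
// Hypothetical input: xOffsets[i] is the measured x position (in pixels) of
// the caret at 1-based column i + 1 on the target line, in ascending order.
function closestColumn(xOffsets: number[], targetX: number): number {
    // Binary search for the first caret position at or right of targetX.
    let lo = 0;
    let hi = xOffsets.length - 1;
    while (lo < hi) {
        const mid = (lo + hi) >> 1;
        if (xOffsets[mid] < targetX) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    // Prefer the caret position to the left when it is at least as close.
    if (lo > 0 && targetX - xOffsets[lo - 1] <= xOffsets[lo] - targetX) {
        return lo; // index lo - 1, expressed as a 1-based column
    }
    return lo + 1; // index lo, expressed as a 1-based column
}

// Two narrow glyphs (8px each) followed by two wide glyphs (16px each).
const offsets = [0, 8, 16, 32, 48];
console.log(closestColumn(offsets, 20)); // 3 (caret at x = 16)
console.log(closestColumn(offsets, 30)); // 4 (caret at x = 32)
```

Because the search works on measured offsets, it makes no assumption about which characters are wide, at the cost of having to lay out the destination line first.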

@alexdima alexdima added editor-core Editor basic functionality feature-request Request for new features or functionality and removed info-needed Issue requires more information from poster labels Jun 13, 2024
@alexdima alexdima removed their assignment Jun 13, 2024
@vscodenpa vscodenpa added this to the Backlog Candidates milestone Jun 13, 2024
@vscodenpa

This feature request is now a candidate for our backlog. The community has 60 days to upvote the issue. If it receives 20 upvotes we will move it to our backlog. If not, we will close it. To learn more about how we handle feature requests, please see our documentation.

Happy Coding!

@ruby3141
Author

ruby3141 commented Jun 13, 2024

@alexdima
I'm not talking about moving the cursor vertically according to the actual glyph size of the current font.
My point is that characters in the Hangul Jamo block are not treated as full-width even though they are.

@alexdima
Member

@ruby3141 This is defined here --

    export function isFullWidthCharacter(charCode: number): boolean {
        // Do a cheap trick to better support wrapping of wide characters, treat them as 2 columns
        // http://jrgraphix.net/research/unicode_blocks.php
        //          2E80 - 2EFF   CJK Radicals Supplement
        //          2F00 - 2FDF   Kangxi Radicals
        //          2FF0 - 2FFF   Ideographic Description Characters
        //          3000 - 303F   CJK Symbols and Punctuation
        //          3040 - 309F   Hiragana
        //          30A0 - 30FF   Katakana
        //          3100 - 312F   Bopomofo
        //          3130 - 318F   Hangul Compatibility Jamo
        //          3190 - 319F   Kanbun
        //          31A0 - 31BF   Bopomofo Extended
        //          31F0 - 31FF   Katakana Phonetic Extensions
        //          3200 - 32FF   Enclosed CJK Letters and Months
        //          3300 - 33FF   CJK Compatibility
        //          3400 - 4DBF   CJK Unified Ideographs Extension A
        //          4DC0 - 4DFF   Yijing Hexagram Symbols
        //          4E00 - 9FFF   CJK Unified Ideographs
        //          A000 - A48F   Yi Syllables
        //          A490 - A4CF   Yi Radicals
        //          AC00 - D7AF   Hangul Syllables
        // [IGNORE] D800 - DB7F   High Surrogates
        // [IGNORE] DB80 - DBFF   High Private Use Surrogates
        // [IGNORE] DC00 - DFFF   Low Surrogates
        // [IGNORE] E000 - F8FF   Private Use Area
        //          F900 - FAFF   CJK Compatibility Ideographs
        // [IGNORE] FB00 - FB4F   Alphabetic Presentation Forms
        // [IGNORE] FB50 - FDFF   Arabic Presentation Forms-A
        // [IGNORE] FE00 - FE0F   Variation Selectors
        // [IGNORE] FE20 - FE2F   Combining Half Marks
        // [IGNORE] FE30 - FE4F   CJK Compatibility Forms
        // [IGNORE] FE50 - FE6F   Small Form Variants
        // [IGNORE] FE70 - FEFF   Arabic Presentation Forms-B
        //          FF00 - FFEF   Halfwidth and Fullwidth Forms
        //               [https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms]
        //               of which FF01 - FF5E fullwidth ASCII of 21 to 7E
        // [IGNORE]    and FF65 - FFDC halfwidth of Katakana and Hangul
        // [IGNORE] FFF0 - FFFF   Specials
        return (
            (charCode >= 0x2E80 && charCode <= 0xD7AF)
            || (charCode >= 0xF900 && charCode <= 0xFAFF)
            || (charCode >= 0xFF01 && charCode <= 0xFF5E)
        );
    }
Would you be interested in improving that with a PR?
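
As a rough idea of what such an improvement might look like, the sketch below wraps the range check quoted above in a hypothetical `isFullWidthWithJamo` function. It is not a merged change. The added ranges are my reading of the Unicode East Asian Width data (Hangul Jamo leading consonants, plus the Hangul Jamo Extended-A block) and should be verified against `EastAsianWidth.txt`. It also assumes the grapheme iterator groups a conjoining jamo sequence into one cluster, so that only the leading consonant reaches the width check.

```typescript
// Range check as quoted in this thread (comments omitted).
function isFullWidthCharacter(charCode: number): boolean {
    return (
        (charCode >= 0x2E80 && charCode <= 0xD7AF)
        || (charCode >= 0xF900 && charCode <= 0xFAFF)
        || (charCode >= 0xFF01 && charCode <= 0xFF5E)
    );
}

// Hypothetical extension, not VS Code's actual code.
function isFullWidthWithJamo(charCode: number): boolean {
    return (
        // 1100 - 115F: Hangul Jamo leading consonants (assumed East Asian Wide)
        (charCode >= 0x1100 && charCode <= 0x115F)
        // A960 - A97C: Hangul Jamo Extended-A (assumed East Asian Wide)
        || (charCode >= 0xA960 && charCode <= 0xA97C)
        || isFullWidthCharacter(charCode)
    );
}

// U+110C is the leading consonant of the decomposed form of U+C815.
console.log(isFullWidthCharacter(0x110C)); // false: counted as 1 column today
console.log(isFullWidthWithJamo(0x110C));  // true: would be counted as 2 columns
console.log(isFullWidthWithJamo(0x1165));  // false: vowels are deliberately left out
```

Vowels and trailing consonants (1160 - 11FF) are left out on purpose: under the assumption above they never start a cluster, and marking them wide would double-count any that appear on their own.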


This feature request has not yet received the 20 community upvotes it takes to make it to our backlog. 10 days to go. To learn more about how we handle feature requests, please see our documentation.

Happy Coding!


🙁 In the last 60 days, this feature request has received less than 20 community upvotes and we closed it. Still a big Thank You to you for taking the time to create this issue! To learn more about how we handle feature requests, please see our documentation.

Happy Coding!

@vs-code-engineering vs-code-engineering bot closed this as not planned Aug 30, 2024