If we check for \u200d and force a join with the codepoint following it, we get the expected 17 bytes. I'm not sure we want this: not all sequences are valid, and what if there are two such characters? If nothing else, this does seem to improve the appearance of mojibake...
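A minimal sketch of the idea, to make the tradeoff concrete. This is not the notcurses implementation: `is_extender` and `cluster_bytes` are hypothetical names, and the extender check is a deliberately simplified stand-in (ZWJ, skin tone modifiers, and variation selectors only) for whatever test the real loader applies. It only illustrates how forcing a join after U+200D changes the byte count for the woman-playing-water-polo sequence.

```python
ZWJ = "\u200d"

def is_extender(ch):
    """Hypothetical, simplified check: does ch attach to the preceding codepoint?"""
    cp = ord(ch)
    return (
        ch == ZWJ
        or 0x1F3FB <= cp <= 0x1F3FF  # emoji skin tone modifiers
        or 0xFE00 <= cp <= 0xFE0F    # variation selectors
    )

def cluster_bytes(text, join_after_zwj):
    """Return the UTF-8 byte count of the first cluster in text."""
    if not text:
        return 0
    end = 1  # the first codepoint always belongs to the cluster
    while end < len(text):
        # Optionally absorb whatever follows a ZWJ, valid sequence or not.
        forced = join_after_zwj and text[end - 1] == ZWJ
        if not (is_extender(text[end]) or forced):
            break
        end += 1
    return len(text[:end].encode("utf-8"))

# U+1F93D U+1F3FC U+200D U+2640 U+FE0F
water_polo = "\U0001F93D\U0001F3FC\u200D\u2640\uFE0F"
without_join = cluster_bytes(water_polo, join_after_zwj=False)
with_join = cluster_bytes(water_polo, join_after_zwj=True)
```

Under these assumptions `without_join` stops after the joiner (11 bytes) and `with_join` takes the whole sequence (17 bytes). The sketch also shows the worry raised above: the join is unconditional, so an input like `"a\u200db"` is treated as one cluster even though it is not a defined emoji sequence.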
I've added a new unit test loading 🤽🏼‍♀️ Woman Playing Water Polo: Medium-Light Skin Tone into an `nccell`. We're only loading 11 bytes, but we ought to be loading 17:
U+1F93D WATER POLO
UTF-8: f0 9f a4 bd UTF-16BE: d83edd3d Decimal: 🤽 Octal: \0374475
🤽
Category: So (Symbol, Other); East Asian width: W (wide)
Unicode block: 1F900..1F9FF; Supplemental Symbols and Pictographs
Bidi: ON (Other Neutrals)
U+1F3FC EMOJI MODIFIER FITZPATRICK TYPE-3
UTF-8: f0 9f 8f bc UTF-16BE: d83cdffc Decimal: 🏼 Octal: \0371774
Category: Sk (Symbol, Modifier); East Asian width: W (wide)
Unicode block: 1F300..1F5FF; Miscellaneous Symbols and Pictographs
Bidi: ON (Other Neutrals)
U+200D ZERO WIDTH JOINER
UTF-8: e2 80 8d UTF-16BE: 200d Decimal: &#8205; Octal: \020015
Category: Cf (Other, Format); East Asian width: N (neutral)
Unicode block: 2000..206F; General Punctuation
Bidi: BN (Boundary Neutral)
U+2640 FEMALE SIGN
UTF-8: e2 99 80 UTF-16BE: 2640 Decimal: ♀ Octal: \023100
♀
Category: So (Symbol, Other); East Asian width: A (ambiguous)
Unicode block: 2600..26FF; Miscellaneous Symbols
Bidi: ON (Other Neutrals)
U+FE0F VARIATION SELECTOR-16
UTF-8: ef b8 8f UTF-16BE: fe0f Decimal: &#65039; Octal: \0177017
Category: Mn (Mark, Non-Spacing); East Asian width: A (ambiguous)
Unicode block: FE00..FE0F; Variation Selectors
Bidi: NSM (Non-Spacing Mark)
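The 11 and 17 byte figures can be checked directly from the UTF-8 lengths in the breakdown above. The following is a small sketch for that arithmetic only; `utf8_lengths` is a hypothetical helper, not part of any library.

```python
# Codepoints of the test sequence, in the order listed above.
SEQUENCE = "\U0001F93D\U0001F3FC\u200D\u2640\uFE0F"

def utf8_lengths(text):
    """Return (codepoint label, UTF-8 byte count) for each codepoint in text."""
    return [(f"U+{ord(ch):04X}", len(ch.encode("utf-8"))) for ch in text]

lengths = utf8_lengths(SEQUENCE)

# Whole sequence: 4 + 4 + 3 + 3 + 3 bytes.
total = sum(n for _, n in lengths)

# Bytes consumed if loading stops right after the zero width joiner: 4 + 4 + 3.
through_zwj = sum(n for _, n in lengths[:3])
```

Here `total` is 17 and `through_zwj` is 11, which suggests the load is being cut off immediately after U+200D, before FEMALE SIGN and VARIATION SELECTOR-16.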