This base is a benchmark / test / torture for implementations that want to support Unicode.
Since both buffers and base256 items have 256 permutations per item the encoding is trivial, there is a one to one correspondence between one UTF-32 character and one byte value and you don't need to deal with any overflow or padding.
First, allocate a UTF-32 output string with a codepoint length of your input buffer.
Then, for each index lookup in the correspondence table using the current byte value as an index and write the codepoint you found to your output buffer at the same index.
You can find out the correspondence using this table:
Emoji | Unicode codepoint | Byte Value |
---|---|---|
🚀 | U+1F680 | 0 |
🪐 | U+1FA90 | 1 |
☄ | U+2604 | 2 |
🛰 | U+1F6F0 | 3 |
🌌 | U+1F30C | 4 |
🌑 | U+1F311 | 5 |
🌒 | U+1F312 | 6 |
🌓 | U+1F313 | 7 |
🌔 | U+1F314 | 8 |
🌕 | U+1F315 | 9 |
🌖 | U+1F316 | 10 |
🌗 | U+1F317 | 11 |
🌘 | U+1F318 | 12 |
🌍 | U+1F30D | 13 |
🌏 | U+1F30F | 14 |
🌎 | U+1F30E | 15 |
🐉 | U+1F409 | 16 |
☀ | U+2600 | 17 |
💻 | U+1F4BB | 18 |
🖥 | U+1F5A5 | 19 |
💾 | U+1F4BE | 20 |
💿 | U+1F4BF | 21 |
😂 | U+1F602 | 22 |
❤ | U+2764 | 23 |
😍 | U+1F60D | 24 |
🤣 | U+1F923 | 25 |
😊 | U+1F60A | 26 |
🙏 | U+1F64F | 27 |
💕 | U+1F495 | 28 |
😭 | U+1F62D | 29 |
😘 | U+1F618 | 30 |
👍 | U+1F44D | 31 |
😅 | U+1F605 | 32 |
👏 | U+1F44F | 33 |
😁 | U+1F601 | 34 |
🔥 | U+1F525 | 35 |
🥰 | U+1F970 | 36 |
💔 | U+1F494 | 37 |
💖 | U+1F496 | 38 |
💙 | U+1F499 | 39 |
😢 | U+1F622 | 40 |
🤔 | U+1F914 | 41 |
😆 | U+1F606 | 42 |
🙄 | U+1F644 | 43 |
💪 | U+1F4AA | 44 |
😉 | U+1F609 | 45 |
☺ | U+263A | 46 |
👌 | U+1F44C | 47 |
🤗 | U+1F917 | 48 |
💜 | U+1F49C | 49 |
😔 | U+1F614 | 50 |
😎 | U+1F60E | 51 |
😇 | U+1F607 | 52 |
🌹 | U+1F339 | 53 |
🤦 | U+1F926 | 54 |
🎉 | U+1F389 | 55 |
💞 | U+1F49E | 56 |
✌ | U+270C | 57 |
✨ | U+2728 | 58 |
🤷 | U+1F937 | 59 |
😱 | U+1F631 | 60 |
😌 | U+1F60C | 61 |
🌸 | U+1F338 | 62 |
🙌 | U+1F64C | 63 |
😋 | U+1F60B | 64 |
💗 | U+1F497 | 65 |
💚 | U+1F49A | 66 |
😏 | U+1F60F | 67 |
💛 | U+1F49B | 68 |
🙂 | U+1F642 | 69 |
💓 | U+1F493 | 70 |
🤩 | U+1F929 | 71 |
😄 | U+1F604 | 72 |
😀 | U+1F600 | 73 |
🖤 | U+1F5A4 | 74 |
😃 | U+1F603 | 75 |
💯 | U+1F4AF | 76 |
🙈 | U+1F648 | 77 |
👇 | U+1F447 | 78 |
🎶 | U+1F3B6 | 79 |
😒 | U+1F612 | 80 |
🤭 | U+1F92D | 81 |
❣ | U+2763 | 82 |
😜 | U+1F61C | 83 |
💋 | U+1F48B | 84 |
👀 | U+1F440 | 85 |
😪 | U+1F62A | 86 |
😑 | U+1F611 | 87 |
💥 | U+1F4A5 | 88 |
🙋 | U+1F64B | 89 |
😞 | U+1F61E | 90 |
😩 | U+1F629 | 91 |
😡 | U+1F621 | 92 |
🤪 | U+1F92A | 93 |
👊 | U+1F44A | 94 |
🥳 | U+1F973 | 95 |
😥 | U+1F625 | 96 |
🤤 | U+1F924 | 97 |
👉 | U+1F449 | 98 |
💃 | U+1F483 | 99 |
😳 | U+1F633 | 100 |
✋ | U+270B | 101 |
😚 | U+1F61A | 102 |
😝 | U+1F61D | 103 |
😴 | U+1F634 | 104 |
🌟 | U+1F31F | 105 |
😬 | U+1F62C | 106 |
🙃 | U+1F643 | 107 |
🍀 | U+1F340 | 108 |
🌷 | U+1F337 | 109 |
😻 | U+1F63B | 110 |
😓 | U+1F613 | 111 |
⭐ | U+2B50 | 112 |
✅ | U+2705 | 113 |
🥺 | U+1F97A | 114 |
🌈 | U+1F308 | 115 |
😈 | U+1F608 | 116 |
🤘 | U+1F918 | 117 |
💦 | U+1F4A6 | 118 |
✔ | U+2714 | 119 |
😣 | U+1F623 | 120 |
🏃 | U+1F3C3 | 121 |
💐 | U+1F490 | 122 |
☹ | U+2639 | 123 |
🎊 | U+1F38A | 124 |
💘 | U+1F498 | 125 |
😠 | U+1F620 | 126 |
☝ | U+261D | 127 |
😕 | U+1F615 | 128 |
🌺 | U+1F33A | 129 |
🎂 | U+1F382 | 130 |
🌻 | U+1F33B | 131 |
😐 | U+1F610 | 132 |
🖕 | U+1F595 | 133 |
💝 | U+1F49D | 134 |
🙊 | U+1F64A | 135 |
😹 | U+1F639 | 136 |
🗣 | U+1F5E3 | 137 |
💫 | U+1F4AB | 138 |
💀 | U+1F480 | 139 |
👑 | U+1F451 | 140 |
🎵 | U+1F3B5 | 141 |
🤞 | U+1F91E | 142 |
😛 | U+1F61B | 143 |
🔴 | U+1F534 | 144 |
😤 | U+1F624 | 145 |
🌼 | U+1F33C | 146 |
😫 | U+1F62B | 147 |
⚽ | U+26BD | 148 |
🤙 | U+1F919 | 149 |
☕ | U+2615 | 150 |
🏆 | U+1F3C6 | 151 |
🤫 | U+1F92B | 152 |
👈 | U+1F448 | 153 |
😮 | U+1F62E | 154 |
🙆 | U+1F646 | 155 |
🍻 | U+1F37B | 156 |
🍃 | U+1F343 | 157 |
🐶 | U+1F436 | 158 |
💁 | U+1F481 | 159 |
😲 | U+1F632 | 160 |
🌿 | U+1F33F | 161 |
🧡 | U+1F9E1 | 162 |
🎁 | U+1F381 | 163 |
⚡ | U+26A1 | 164 |
🌞 | U+1F31E | 165 |
🎈 | U+1F388 | 166 |
❌ | U+274C | 167 |
✊ | U+270A | 168 |
👋 | U+1F44B | 169 |
😰 | U+1F630 | 170 |
🤨 | U+1F928 | 171 |
😶 | U+1F636 | 172 |
🤝 | U+1F91D | 173 |
🚶 | U+1F6B6 | 174 |
💰 | U+1F4B0 | 175 |
🍓 | U+1F353 | 176 |
💢 | U+1F4A2 | 177 |
🤟 | U+1F91F | 178 |
🙁 | U+1F641 | 179 |
🚨 | U+1F6A8 | 180 |
💨 | U+1F4A8 | 181 |
🤬 | U+1F92C | 182 |
✈ | U+2708 | 183 |
🎀 | U+1F380 | 184 |
🍺 | U+1F37A | 185 |
🤓 | U+1F913 | 186 |
😙 | U+1F619 | 187 |
💟 | U+1F49F | 188 |
🌱 | U+1F331 | 189 |
😖 | U+1F616 | 190 |
👶 | U+1F476 | 191 |
🥴 | U+1F974 | 192 |
▶ | U+25B6 | 193 |
➡ | U+27A1 | 194 |
❓ | U+2753 | 195 |
💎 | U+1F48E | 196 |
💸 | U+1F4B8 | 197 |
⬇ | U+2B07 | 198 |
😨 | U+1F628 | 199 |
🌚 | U+1F31A | 200 |
🦋 | U+1F98B | 201 |
😷 | U+1F637 | 202 |
🕺 | U+1F57A | 203 |
⚠ | U+26A0 | 204 |
🙅 | U+1F645 | 205 |
😟 | U+1F61F | 206 |
😵 | U+1F635 | 207 |
👎 | U+1F44E | 208 |
🤲 | U+1F932 | 209 |
🤠 | U+1F920 | 210 |
🤧 | U+1F927 | 211 |
📌 | U+1F4CC | 212 |
🔵 | U+1F535 | 213 |
💅 | U+1F485 | 214 |
🧐 | U+1F9D0 | 215 |
🐾 | U+1F43E | 216 |
🍒 | U+1F352 | 217 |
😗 | U+1F617 | 218 |
🤑 | U+1F911 | 219 |
🌊 | U+1F30A | 220 |
🤯 | U+1F92F | 221 |
🐷 | U+1F437 | 222 |
☎ | U+260E | 223 |
💧 | U+1F4A7 | 224 |
😯 | U+1F62F | 225 |
💆 | U+1F486 | 226 |
👆 | U+1F446 | 227 |
🎤 | U+1F3A4 | 228 |
🙇 | U+1F647 | 229 |
🍑 | U+1F351 | 230 |
❄ | U+2744 | 231 |
🌴 | U+1F334 | 232 |
💣 | U+1F4A3 | 233 |
🐸 | U+1F438 | 234 |
💌 | U+1F48C | 235 |
📍 | U+1F4CD | 236 |
🥀 | U+1F940 | 237 |
🤢 | U+1F922 | 238 |
👅 | U+1F445 | 239 |
💡 | U+1F4A1 | 240 |
💩 | U+1F4A9 | 241 |
👐 | U+1F450 | 242 |
📸 | U+1F4F8 | 243 |
👻 | U+1F47B | 244 |
🤐 | U+1F910 | 245 |
🤮 | U+1F92E | 246 |
🎼 | U+1F3BC | 247 |
🥵 | U+1F975 | 248 |
🚩 | U+1F6A9 | 249 |
🍎 | U+1F34E | 250 |
🍊 | U+1F34A | 251 |
👼 | U+1F47C | 252 |
💍 | U+1F48D | 253 |
📣 | U+1F4E3 | 254 |
🥂 | U+1F942 | 255 |
It is the same as encoding but the other way around.
Note it is not recommended to use a 8 gigabytes UTF-32 codepoint
->
struct {bool, byte}
, it might be wise to a hash map instead.