home..
Converter (UTF-8 to Unicode)
Mia Vrhovnik / April 2023
C
bit manipulation
FULL SOURCE CODE:
Description
- The program converts UTF-8 encoded text to Unicode, it does this by reading the input file byte by byte and converting it to Unicode according to the UTF-8 standard. Table for reference:
| UTF-8 | Unicode |
|---|---|
0xxxxxxx |
0xxxxxxx |
110xxxxx 10xxxxxx |
00000xxx xxxxxxxx |
1110xxxx 10xxxxxx 10xxxxxx |
xxxxxxyy yyyyyyyy |
11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |
110110www_wxxxxxx_110111yy_yyyyyyyy |