Skip to main content

Write characters by Unicode code point in Python

Unicode code points are often written in the form of U+XXXX, where xxxx is an integer in hexadecimal. For instance character of '?' is 'U+003f' in unicode. To convert character to Unicode code point by ord() is as follow:
i = ord('?')
h = hex(i)
print(i, h)
# 62 0x3f
The ord() gives digit in decimal, so to convert decimal I used the hex(). In Python an integer with the prefix 0x is a number in hexadecimal.
To write '?' in unicode is as follow:
print('\u003f')
# ?
When you write a hexadecimal Unicode code point with the prefix of \x, \u, or \U in a string literal, it is treated as that character. It should be 2, 4, or 8 digits like \xXX, \uXXXX, and \UXXXXXX, respectively.

Comments