python – What does the u symbol mean in front of string values?
The 'u' in front of the string values means the string is a Unicode string. Unicode is a way to represent more characters than normal ASCII can manage. The fact that you're seeing the u
means you're on Python 2. Strings are Unicode by default on Python 3, but on Python 2, the u
in front distinguishes Unicode strings. The rest of this answer will focus on Python 2.
You can create a Unicode string in multiple ways:
>>> u'foo'
u'foo'
>>> unicode('foo') # Python 2 only
u'foo'
But the real reason is to represent something like this (Russian for "Familiarize yourself with the documentation"):
>>> val = u'Ознакомьтесь с документацией'
>>> val
u'\u041e\u0437\u043d\u0430\u043a\u043e\u043c\u044c\u0442\u0435\u0441\u044c \u0441 \u0434\u043e\u043a\u0443\u043c\u0435\u043d\u0442\u0430\u0446\u0438\u0435\u0439'
>>> print val
Ознакомьтесь с документацией
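As a rough illustration of the escaped form versus the underlying text, here is a minimal Python 3 sketch (Python 3 rather than the Python 2 used in the rest of this answer, so no u prefix is needed). It relies on the built-in ascii() to produce an escaped representation similar to what the Python 2 prompt displays; the variable names are only illustrative.

```python
# Python 3 sketch: every str is already Unicode, so no u prefix is required.
val = "Ознакомьтесь с документацией"

# ascii() escapes non-ASCII characters, much like Python 2's repr of a unicode string.
escaped = ascii(val)

# Encoding turns code points into bytes; each Cyrillic letter takes 2 bytes in UTF-8.
encoded = val.encode("utf-8")

print(escaped)                  # the \uXXXX form
print(len(val), len(encoded))   # number of code points vs. number of UTF-8 bytes
```

The length of the string counts characters (code points), while the length of the encoded value counts bytes, which is why the two numbers differ for non-ASCII text.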
For the most part, Unicode and non-Unicode strings are interoperable on Python 2.
There are other symbols you will see, such as the "raw" symbol r, which tells Python not to interpret backslash escapes in the string literal. This is extremely useful for writing regular expressions.
>>> 'foo\"'
'foo"'
>>> r'foo\"'
'foo\\"'
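To see why the r prefix matters for regular expressions, here is a small sketch using the standard re module. The pattern and sample text are made up for illustration. Without the r prefix, Python turns \b into a backspace character before the regex engine ever sees it.

```python
import re

# Without r, "\b" is the backspace character (0x08), not a regex word boundary.
plain = "\bfoo\b"
# With r, the backslashes are kept, so the regex engine sees \b (word boundary).
raw = r"\bfoo\b"

text = "a foo b"  # hypothetical sample text

match_raw = re.search(raw, text)      # matches the word "foo"
match_plain = re.search(plain, text)  # looks for literal backspace characters

print(len(plain), len(raw))
print(match_raw is not None, match_plain is not None)
```

The raw pattern is two characters longer per \b because each backslash survives as its own character.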
Unicode and non-Unicode strings can be equal on Python 2:
>>> bird1 = unicode('unladen swallow')
>>> bird2 = 'unladen swallow'
>>> bird1 == bird2
True
but not on Python 3:
>>> x = u'asdf' # Python 3
>>> y = b'asdf' # b indicates bytestring
>>> x == y
False
This is a feature, not a bug.
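If you do need to compare text with bytes on Python 3, one common approach is to convert one side explicitly so both have the same type. A minimal sketch, assuming the bytes are ASCII-encoded text (the encoding is an assumption here; use whatever encoding your data actually has):

```python
x = "asdf"    # str: Unicode text on Python 3
y = b"asdf"   # bytes: raw byte values

# Different types never compare equal, even when the contents look the same.
different_types = (x == y)

# Decode the bytes to text, or encode the text to bytes, then compare.
equal_as_text = (x == y.decode("ascii"))
equal_as_bytes = (x.encode("ascii") == y)

print(different_types, equal_as_text, equal_as_bytes)
```

Making the conversion explicit forces you to name the encoding, which is the point of the stricter Python 3 behavior.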
See http://docs.python.org/howto/unicode.html, specifically the unicode type section.