python – What does the u symbol mean in front of string values?

The u in front of the string values means the string is a Unicode string. Unicode is a way to represent more characters than normal ASCII can manage. The fact that you're seeing the u means you're on Python 2: strings are Unicode by default on Python 3, but on Python 2, the u prefix distinguishes Unicode strings from byte strings. The rest of this answer will focus on Python 2.

You can create a Unicode string in multiple ways:

>>> u'foo'
u'foo'
>>> unicode('foo') # Python 2 only
u'foo'

But the real reason Unicode strings exist is to represent text like this:

>>> val = u'Ознакомьтесь с документацией'
>>> val
u'\u041e\u0437\u043d\u0430\u043a\u043e\u043c\u044c\u0442\u0435\u0441\u044c \u0441 \u0434\u043e\u043a\u0443\u043c\u0435\u043d\u0442\u0430\u0446\u0438\u0435\u0439'
>>> print val
Ознакомьтесь с документацией
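
To make the escaped output above less mysterious, here is a minimal sketch (written so it should run on both Python 2.7 and Python 3) showing that each \uXXXX escape is simply the code point of one character. The variable names are illustrative and not part of the original answer.

```python
# -*- coding: utf-8 -*-
text = u"Ознакомьтесь"

# Rebuild the escaped form by hand: one \uXXXX escape per character,
# where XXXX is the character's code point in hexadecimal.
escaped = u"".join(u"\\u%04x" % ord(ch) for ch in text)

# Encoding turns the Unicode string into bytes.
# In UTF-8, each of these Cyrillic letters takes two bytes.
encoded = text.encode("utf-8")

print(len(text))     # number of characters
print(len(encoded))  # number of bytes after UTF-8 encoding
```

The character count and the byte count differ, which is the practical distinction between a Unicode string and its encoded bytes.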

For the most part, Unicode and non-Unicode strings are interoperable on Python 2.

There are other prefixes you will see, such as r, which marks a raw string in which backslashes are not interpreted as escape characters. This is extremely useful for writing regular expressions.

>>> 'foo\"'
'foo"'
>>> r'foo\"'
'foo\\"'
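
As a rough illustration of why raw strings matter for regular expressions, the sketch below compares the same pattern written with and without the r prefix. The pattern and sample sentence are made up for this example.

```python
import re

# Without the r prefix, "\b" is the backspace character (one character).
plain = "\bcat\b"
# With the r prefix, the backslash is kept, so the regex engine sees \b,
# which it treats as a word boundary.
raw = r"\bcat\b"

sentence = "the cat sat"

plain_match = re.search(plain, sentence)  # looks for literal backspaces
raw_match = re.search(raw, sentence)      # looks for the whole word "cat"
```

Here plain_match is None because the sentence contains no backspace characters, while raw_match finds the word.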

Unicode and non-Unicode strings can be equal on Python 2:

>>> bird1 = unicode('unladen swallow')
>>> bird2 = 'unladen swallow'
>>> bird1 == bird2
True

but not on Python 3:

>>> x = u'asdf' # Python 3
>>> y = b'asdf' # b indicates bytestring
>>> x == y
False
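
If you do need to compare text with bytes on Python 3, one common approach is to convert one side explicitly first. This is a minimal sketch assuming the bytes are ASCII-encoded; real data may use a different encoding, which you would have to know in advance.

```python
x = u"asdf"
y = b"asdf"

# Decode the bytes into text, then compare text with text.
same_after_decode = (x == y.decode("ascii"))

# Or go the other way: encode the text, then compare bytes with bytes.
same_after_encode = (x.encode("ascii") == y)
```

Both comparisons are True, because each one compares two values of the same type.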

This is a feature, not a bug.

See http://docs.python.org/howto/unicode.html, specifically the unicode type section.
