Code Answer: What does 'u' mean in a list?

This is the first time I've come across this. I just printed a list and each element seems to have a u in front of it, e.g.

[u'hello', u'hi', u'hey']

What does it mean and why would a list have this in front of each element?

As I don't know how common this is, if you'd like to see how I came across it, I'll happily edit the post.

From stackoverflow

The u just means that the following string is a unicode string (as opposed to a plain byte string, which is what an unprefixed literal gives you in Python 2). It has nothing to do with the list that happens to contain the (unicode) strings.
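To make that concrete, here is a small sketch using the list from the question (the names `words` and `joined` are just illustrative). The prefix only shows up when Python 2 prints the repr of each element, which is what printing a whole list does; printing or joining the elements shows plain text. On Python 3 every string is unicode, so the prefix is not shown at all.

```python
# Assumes the list from the question; `words` and `joined` are illustrative names.
words = [u'hello', u'hi', u'hey']

# print(words) shows each element's repr(), which is where Python 2 adds the u.
# Printing the elements themselves shows only their text.
for word in words:
    print(word)

# Joining works the same way: the result contains no prefix characters.
joined = u', '.join(words)
print(joined)
```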
I believe the u' prefix creates a unicode string instead of a regular ASCII one.
It's an indication of a unicode string, similar to r'' for a raw string.
```
>>> type(u'abc')
<type 'unicode'>
>>> r'ab\c'
'ab\\c'
```
day_trader : Ah, I thought r'' meant something to do with a regular expression?

Samir Talwar : It's generally used for regular expressions so we can write things like `r'/[ \t]+/'` instead of `'/[ \\t]+/'` (note the double backslash - you don't have to escape things in raw strings unless you're escaping the closing quote).

SilentGhost : it's often used in regex to avoid all the escaping backslashes
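As a rough illustration of the point about regular expressions, the sketch below compares a raw pattern with its manually escaped equivalent. The sample text and variable names are made up for the example.

```python
import re

# The same pattern written two ways: raw, and with the backslash escaped by hand.
raw_pattern = r'[ \t]+'
escaped_pattern = '[ \\t]+'

# Both literals produce exactly the same string object contents.
same = raw_pattern == escaped_pattern

# Split on runs of spaces and tabs (sample text is hypothetical).
parts = re.split(raw_pattern, 'one \t two   three')
print(same, parts)
```

The raw form matters more as patterns grow, since every backslash in an ordinary literal would otherwise have to be doubled.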

day_trader : I see. If I iterate through a list of unicode strings and check if some string is 'in' the list, will that recognise the string? I'm currently checking each element to see if it matches a certain string and it keeps escaping every time. Is this because it's Unicode?

Mike Graham : r and u are a bit different. u indicates the type of the string, whereas r (or ur, if you want to use raw unicode literals) makes a normal str (or unicode, if u and r are both used) but one that is parsed differently at compile time. `repr(r'foo')` gives `"'foo'"`, while `repr(u'foo')` gives `"u'foo'"`. Notice how the r goes away (it only affects what backslashes do) and the u does not (because it makes an object of a different type).
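A minimal sketch of the "parsed differently, same type" half of that comment, using made-up variable names. It only demonstrates the r prefix, so it behaves the same on Python 2 and Python 3.

```python
# r changes how the literal is read, not what kind of object you get.
raw = r'a\nb'     # backslash and n are kept as two separate characters
cooked = 'a\nb'   # \n is turned into a single newline character

same_type = type(raw) is type(cooked)
lengths = (len(raw), len(cooked))

# Without any backslashes, the raw and ordinary literals are identical.
identical = r'foo' == 'foo'
print(same_type, lengths, identical)
```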

SilentGhost : If your string is a unicode string that uses only ASCII characters (as in your example), the `in` operation would coerce the strings implicitly and you'll get `True`: `'abc' in [u'abc']` results in `True`. If your unicode string uses characters outside the ASCII charset, you would naturally get `False` in such a test.
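Here is a hedged sketch of that membership test. The list and the byte string are invented for illustration, and it assumes the non-ASCII input arrives as UTF-8 bytes. Decoding explicitly before the test sidesteps the implicit conversion described above.

```python
# Illustrative data: one entry contains a non-ASCII character (e-acute).
greetings = [u'hello', u'hi', u'hey', u'h\xe9']

# ASCII-only text is found directly.
found_ascii = 'hello' in greetings

# UTF-8 encoded bytes for the last entry (assumed encoding).
raw = b'h\xc3\xa9'

# Undecoded bytes do not match any unicode entry.
found_raw = raw in greetings

# Decoding first makes the comparison unicode-to-unicode.
found_decoded = raw.decode('utf-8') in greetings
print(found_ascii, found_raw, found_decoded)
```

If a lookup like this fails unexpectedly, comparing `repr()` of both values is a quick way to see whether the types or the contents differ.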
Unicode.

Tuesday, March 1, 2011
