Python - unicode().decode('utf-8', 'ignore') raising UnicodeEncodeError


Here is the code:

>>> z = u'\u2022'.decode('utf-8', 'ignore')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2022' in position 0: ordinal not in range(256)

Why is UnicodeEncodeError raised when I am using .decode?

Why is any error raised when I am using 'ignore'?

When I first started messing around with Python strings and Unicode, it took me a while to understand the jargon of decode and encode too, so here is an explanation from an earlier post that may help:


Think of decoding as what you do to go from a regular bytestring to unicode, and encoding as what you do to get back from unicode. In other words:

You de - code a str to produce a unicode string

and en - code a unicode string to produce a str.

So:

unicode_char = u'\xb0'
encodedchar = unicode_char.encode('utf-8')

encodedchar will contain the unicode character as a bytestring in the selected encoding (in this case, utf-8).
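To connect this back to the question: in Python 2, calling .decode on an object that is already unicode in effect first encodes it to a bytestring with the default encoding (normally ascii, apparently latin-1 in the session above), and only then decodes the result. That implicit encode step is what fails, and 'ignore' applies only to the decode step that comes after it. The sketch below is written for Python 3 (an assumption, since the original session is Python 2.6), where str is the unicode type and bytes is the bytestring type. The helper py2_style_decode is a hypothetical name used only to imitate the Python 2 behaviour; it is not a real library function.

```python
# Python 3 sketch: str is the unicode type, bytes is the bytestring type.
unicode_char = '\xb0'  # same character as in the example above

encodedchar = unicode_char.encode('utf-8')   # unicode -> bytestring
decodedchar = encodedchar.decode('utf-8')    # bytestring -> unicode


def py2_style_decode(text, default_encoding='latin-1'):
    """Hypothetical helper imitating what Python 2's unicode.decode() does.

    The unicode text is first encoded implicitly with the default encoding
    (strict error handling), and only then are the bytes decoded.
    """
    # This implicit step can raise UnicodeEncodeError.
    implicit_bytes = text.encode(default_encoding)
    # The 'ignore' error handler only applies to this decode step.
    return implicit_bytes.decode('utf-8', 'ignore')


try:
    py2_style_decode('\u2022')
    error_name = None
except UnicodeEncodeError as exc:
    # Raised by the implicit encode, before 'ignore' is ever consulted.
    error_name = type(exc).__name__
```

The practical takeaway for the original code is that a unicode string does not need decoding at all; call .encode on it when a bytestring is wanted, and .decode only on bytestrings.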

