python – Writing Unicode text to a text file?

python – Writing Unicode text to a text file?

Deal exclusively with unicode objects as much as possible by decoding things to unicode objects when you first get them and encoding them as necessary on the way out.

If your string is actually a unicode object, youll need to convert it to a unicode-encoded string object before writing it to a file:

foo = uΔ, Й, ק, ‎ م, ๗, あ, 叶, 葉, and 말.
f = open(test, w)
f.write(foo.encode(utf8))
f.close()

When you read that file again, youll get a unicode-encoded string that you can decode to a unicode object:

f = file(test, r)
print f.read().decode(utf8)

In Python 2.6+, you could use io.open() that is default (builtin open()) on Python 3:

import io

with io.open(filename, w, encoding=character_encoding) as file:
    file.write(unicode_text)

It might be more convenient if you need to write the text incrementally (you dont need to call unicode_text.encode(character_encoding) multiple times). Unlike codecs module, io module has a proper universal newlines support.

python – Writing Unicode text to a text file?

Unicode string handling is already standardized in Python 3.

  1. chars are already stored in Unicode (32-bit) in memory
  2. You only need to open file in utf-8
    (32-bit Unicode to variable-byte-length utf-8 conversion is automatically performed from memory to file.)

    out1 = (嘉南大圳 ㄐㄧㄚ ㄋㄢˊ ㄉㄚˋ ㄗㄨㄣˋ )
    fobj = open(t1.txt, w, encoding=utf-8)
    fobj.write(out1)
    fobj.close()
    

Leave a Reply

Your email address will not be published. Required fields are marked *