utf 8 – Correctly reading text from Windows-1252 (cp1252) file in Python
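cp1252 cannot represent ā (a with macron, U+0101); your input contains the similar-looking character â (a with circumflex, U+00E2). In Python 2.x, repr just displays an ASCII-only representation of a unicode string, so seeing the \xe2 escape is expected: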
>>> print(repr(b'J\xe2nis'.decode('cp1252')))
u'J\xe2nis'
>>> print(b'J\xe2nis'.decode('cp1252'))
Jânis
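For readers on Python 3, here is a minimal sketch of the same check. The helper name fits_cp1252 is made up for illustration; ascii() is used here because it gives an ASCII-only representation comparable to Python 2's repr.

```python
# Python 3 sketch; names are illustrative only.
raw = b"J\xe2nis"              # bytes as stored in a cp1252 file
text = raw.decode("cp1252")    # byte 0xE2 decodes to U+00E2 (a with circumflex)

escaped = ascii(text)          # ASCII-only representation of the string
print(escaped)                 # 'J\xe2nis'


def fits_cp1252(s):
    """Return True if every character of s can be encoded as cp1252."""
    try:
        s.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


print(fits_cp1252("J\u00e2nis"))   # True: a with circumflex exists in cp1252
print(fits_cp1252("J\u0101nis"))   # False: a with macron does not
```

If the name was supposed to contain ā, the character was already lost or replaced before the file was saved as cp1252; decoding cannot bring it back.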
I think u'J\xe2nis' is correct, see:
>>> print u'J\xe2nis'.encode('utf-8')
Jânis
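In Python 3, printing the encoded value shows the raw bytes rather than the rendered text, which makes the UTF-8 form easier to inspect. The sketch below also shows what the same bytes look like when they are wrongly decoded as cp1252, which is a common symptom worth ruling out; the variable names are illustrative.

```python
# Python 3 sketch; names are illustrative only.
text = "J\xe2nis"                        # the correctly decoded string

utf8_bytes = text.encode("utf-8")
print(utf8_bytes)                        # b'J\xc3\xa2nis'

# Decoding with the matching codec round-trips cleanly.
round_trip = utf8_bytes.decode("utf-8")
print(round_trip == text)                # True

# Decoding UTF-8 bytes as cp1252 yields two wrong characters instead of one.
mojibake = utf8_bytes.decode("cp1252")
print(ascii(mojibake))                   # 'J\xc3\xa2nis'
```

If the output somewhere shows two odd characters where â should be, the text was probably encoded as UTF-8 and then decoded as cp1252 at a later step.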
Are you getting actual errors from SQLAlchemy, or is the problem only in your application's output?
I had the same problem with some XML files. I solved it by reading the file with the ANSI encoding (Windows-1252) and writing a new file with UTF-8 encoding:
import os

# Input and output files are expected next to this script.
path = os.path.dirname(__file__)
file_name = 'my_input_file.xml'

if __name__ == '__main__':
    # Decode the source file as Windows-1252.
    with open(os.path.join(path, file_name), 'r', encoding='cp1252') as f1:
        lines = f1.read()
    # Write the same text back out encoded as UTF-8.
    with open(os.path.join(path, 'my_output_file.xml'), 'w', encoding='utf-8') as f2:
        f2.write(lines)
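The same idea can be wrapped in a small reusable function. This is only a sketch under a few assumptions: the function name transcode, its parameters, and the temporary file names are made up for illustration, and it copies in chunks with shutil.copyfileobj instead of reading the whole file into memory.

```python
import os
import shutil
import tempfile


def transcode(src_path, dst_path, src_encoding="cp1252", dst_encoding="utf-8"):
    """Copy a text file, decoding with src_encoding and re-encoding with dst_encoding."""
    # newline="" turns off newline translation, so line endings are kept as-is.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        shutil.copyfileobj(src, dst)  # copies chunk by chunk


# Small demonstration using a throwaway directory.
tmp_dir = tempfile.mkdtemp()
src_file = os.path.join(tmp_dir, "input.xml")
dst_file = os.path.join(tmp_dir, "output.xml")

with open(src_file, "wb") as f:
    f.write(b"<name>J\xe2nis</name>\r\n")   # cp1252 bytes

transcode(src_file, dst_file)

with open(dst_file, "rb") as f:
    result = f.read()
print(result)                                # b'<name>J\xc3\xa2nis</name>\r\n'

shutil.rmtree(tmp_dir)
```

One caveat for XML in particular: if the file starts with a declaration that names the old encoding, that declaration likely needs to be updated as well, since this sketch copies the text unchanged.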