Simon Willison’s Weblog

Subscribe
Atom feed for text

1 item tagged “text”

2010

Reexamining Python 3 Text I/O. Python 3.1’s IO performance is a huge improvement over 3.0, but still considerably slower than 2.6. It turns out it’s all to do with Python 3’s unicode support: When you read a file in to a string, you’re asking Python to decode the bytes in to UTF-8 (the new default encoding) at the same time. If you open the file in binary mode Python 3 will read raw bytes in to a bytestring instead, avoiding the conversion overhead and performing only 4% slower than the equivalent code in Python 2.6.4.

# 28th January 2010, 1:28 pm / david-beazley, io, performance, python, python3, text, unicode