UnicodeEncodeError
Python Programming Language
Severity: Moderate
What Does This Error Mean?
UnicodeEncodeError happens when Python tries to convert text (a string) into bytes, but the encoding you chose cannot represent one of the characters in that text. For example, saving a string that contains a Japanese character with the ASCII encoding fails: ASCII defines only 128 characters (basic English letters, digits, and punctuation), so it cannot represent Japanese characters. The solution is to use an encoding that supports those characters, such as UTF-8.
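A minimal sketch that reproduces the error, assuming a sample string chosen only for illustration. The name error_message is hypothetical, and ascii() is used so the printout itself stays ASCII-safe on any terminal:

```python
text = "日本語"  # sample string, chosen only for illustration

try:
    text.encode("ascii")  # ASCII has no code for these characters
    error_message = None
except UnicodeEncodeError as exc:
    # exc.start and exc.end mark the slice of text that could not be encoded
    error_message = f"{exc.encoding} cannot encode {ascii(text[exc.start:exc.end])}"

print(error_message)

# The same string encodes without error in UTF-8
utf8_bytes = text.encode("utf-8")
print(len(utf8_bytes), "bytes in UTF-8")
```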
Affected Models
- Python 2.x
- Python 3.x
- All Python versions
Common Causes
- Writing a string containing non-English characters to a file opened without specifying encoding='utf-8'
- Printing Unicode text in a terminal or console that does not support Unicode (common on Windows with older settings)
- Sending text over a network connection or to a system that expects ASCII-only data
- Using str.encode() with 'ascii' encoding on a string that contains accented letters, emoji, or non-Latin characters
- The default system encoding is not UTF-8 — this happens on some Windows systems
How to Fix It
- When opening a file for writing, always specify the encoding explicitly: open('file.txt', 'w', encoding='utf-8')
  If you do not specify an encoding, Python uses the system default, which on Windows is often 'cp1252' and cannot handle many Unicode characters.
- When encoding a string manually with .encode(), use UTF-8: my_string.encode('utf-8')
  UTF-8 can represent every Unicode character, including emoji, Chinese, Arabic, and every other script.
- If you must target a limited encoding (like ASCII) and only want to skip or replace the characters it cannot handle, use the 'errors' parameter: my_string.encode('ascii', errors='ignore') or my_string.encode('ascii', errors='replace')
  'ignore' silently drops characters that cannot be encoded. 'replace' substitutes each of them with a question mark. Both lose data, so use UTF-8 when possible.
- If the error happens when printing to the terminal on Windows, set the environment variable PYTHONIOENCODING=utf-8 before running your script, or run chcp 65001 in the Command Prompt to switch the console code page to UTF-8.
  On modern Windows 10/11 you can also go to System Settings > Region > Administrative > Change system locale and enable 'Beta: Use Unicode UTF-8 for worldwide language support'.
- At the top of your script, you can also set the encoding for stdout: import sys; sys.stdout.reconfigure(encoding='utf-8'). This is useful when you cannot control how the terminal is set up.
  This only works in Python 3.7 and later.
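The fixes above can be sketched together as follows. This is an illustration under stated assumptions, not a definitive recipe: the sample text, the temporary file name example.txt, and the variable names are hypothetical, and the hasattr guard is an added precaution for environments where sys.stdout has been replaced by an object without reconfigure().

```python
import os
import sys
import tempfile

text = "café ☕"  # sample text, for illustration only

# Name the encoding explicitly when writing, and again when reading back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

# Encode manually with UTF-8: nothing is lost.
utf8_bytes = text.encode("utf-8")

# Lossy fallbacks for an ASCII-only target.
ignored = text.encode("ascii", errors="ignore")    # unencodable characters dropped
replaced = text.encode("ascii", errors="replace")  # each one becomes b"?"

# Switch stdout to UTF-8 (Python 3.7+); skipped if stdout lacks reconfigure().
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8")

print(ignored, replaced)
```

Comparing ignored and replaced with utf8_bytes shows why the error handlers are a last resort: only the UTF-8 bytes decode back to the original string.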
When to Call a Professional
UnicodeEncodeError is always something you can fix yourself. The fix is almost always to specify UTF-8 encoding wherever you write text. Modern software should always use UTF-8 — it supports every character in every language.
Frequently Asked Questions
What is the difference between UnicodeEncodeError and UnicodeDecodeError?
UnicodeEncodeError happens when converting text (a string) INTO bytes — for example, when saving to a file. UnicodeDecodeError happens when converting bytes BACK into text — for example, when reading a file. Both are caused by a mismatch between the characters in the data and the encoding being used.
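A small sketch of the two directions, assuming a sample word picked only for illustration. The names encode_error and decode_error are hypothetical:

```python
text = "naïve"  # sample word containing one non-ASCII character

# str -> bytes with an encoding that lacks "ï" raises UnicodeEncodeError
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    encode_error = exc

# bytes -> str with the wrong encoding raises UnicodeDecodeError
data = text.encode("utf-8")
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    decode_error = exc

print(type(encode_error).__name__)
print(type(decode_error).__name__)
```

Both exception classes are subclasses of UnicodeError, so a single except UnicodeError clause catches either one when the direction does not matter.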
Why does this work on my Mac but fail on Windows?
Because the default system encoding is different. macOS and most Linux systems default to UTF-8, which handles all Unicode characters. Windows often defaults to a regional code page such as cp1252 (Western Europe), which cannot represent characters from most other scripts. The fix is to always specify encoding='utf-8' explicitly instead of relying on the system default.
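To see which defaults a given machine uses, a quick check along these lines may help. It is a sketch: the variable names are hypothetical, and the exact lookup that open() performs when no encoding is given varies slightly between Python versions, so treat the first value as a close indicator rather than a guarantee.

```python
import locale
import sys

# Locale-based default, which open() generally falls back to without encoding=
preferred = locale.getpreferredencoding(False)

# Encoding used when print() writes to the terminal (None if stdout has none)
stdout_encoding = getattr(sys.stdout, "encoding", None)

print("locale preferred encoding:", preferred)
print("stdout encoding:", stdout_encoding)
```

If either value is something other than UTF-8, code that relies on the default can fail on that machine while working elsewhere.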
Should I always use UTF-8?
Yes, for almost all new code. UTF-8 is the universal standard — it can represent every character in every human language. The only reason to use a different encoding is when you need to work with old files or systems that were created with a specific encoding.