The tr command is used to translate or delete characters in a text stream. When its input contains a byte sequence that is not valid in the character encoding of the current locale, it stops with an “Illegal byte sequence” error. This usually means the input data is in a different encoding than the one tr expects. The message is most commonly reported with the BSD implementation of tr, such as the one shipped with macOS.
Example of tr command “Illegal Byte Sequence” Error
Let’s take, for example, a scenario where you attempt to translate lowercase letters to uppercase using the tr command:
echo "Hello, World!" | tr 'a-z' 'A-Z'
For most inputs, this works perfectly. However, if your input contains non-ASCII characters, you might encounter the “Illegal byte sequence” error:
echo "Héllo, Wørld!" | tr 'a-z' 'A-Z'
Output:
tr: Illegal byte sequence
Now, why did this error occur? Some of the most common causes are:
- Encoding Mismatch: The input data might be in a different encoding than the one the tr command expects.
- Non-ASCII Characters: The input may contain special characters or symbols that are not supported by the tr command’s default settings.
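One way to tell which cause applies is to check whether the input is valid UTF-8 in the first place. The following is a minimal sketch, assuming iconv is installed; is_valid_utf8 is a hypothetical helper name used only for illustration, and the sample bytes are made up for the demonstration.

```shell
# Hypothetical helper: succeeds only if everything on stdin is valid UTF-8.
# iconv exits with a non-zero status when it meets an invalid sequence.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# "é" encoded as the two UTF-8 bytes 0xC3 0xA9 (octal 303 251).
printf 'H\303\251llo\n' | is_valid_utf8 && echo "valid UTF-8"

# A lone 0xE9 byte (octal 351) is how Latin-1 encodes "é";
# on its own it is not a valid UTF-8 sequence.
printf 'H\351llo\n' | is_valid_utf8 || echo "not valid UTF-8"
```

If the check fails, the data is probably in another encoding (Latin-1 is a common one), and changing the locale to UTF-8 alone may not be enough.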
SOLVED: tr “Illegal byte sequence” Error
Step 1: Set the Locale
As noted above, one of the common reasons for the tr command throwing an “Illegal byte sequence” error is a mismatch of encoding settings. Setting the locale can often resolve the encoding mismatch. Use the following command to set the locale to UTF-8:
export LC_ALL=C.UTF-8
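The C.UTF-8 locale is not installed on every system, so it can help to confirm that a UTF-8 locale exists before exporting it. The sketch below assumes the locale command is available; pick_utf8_locale and its candidate list are hypothetical and only illustrate the idea.

```shell
# Hypothetical helper: print the first candidate locale that is installed.
# Systems spell the codeset differently ("C.utf8" vs "C.UTF-8"), so both
# sides are lowercased and stripped of hyphens before comparing.
pick_utf8_locale() {
  installed=$(locale -a 2>/dev/null | tr 'A-Z' 'a-z' | sed 's/-//g')
  for candidate in C.UTF-8 en_US.UTF-8; do
    key=$(printf '%s' "$candidate" | tr 'A-Z' 'a-z' | sed 's/-//g')
    if printf '%s\n' "$installed" | grep -qxF "$key"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Export the locale only if one of the candidates is actually installed.
if chosen=$(pick_utf8_locale); then
  export LC_ALL="$chosen"
  echo "Using locale: $LC_ALL"
else
  echo "No candidate UTF-8 locale found; LC_ALL left unchanged" >&2
fi
```

Running locale afterwards shows the values now in effect for the session.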
Step 2: Re-run the tr command after setting the locale:
echo "Héllo, Wørld!" | tr 'a-z' 'A-Z'
Alternatively, you can also run both commands in one go:
export LC_ALL=C.UTF-8; echo "Héllo, Wørld!" | tr 'a-z' 'A-Z'
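Both forms above use export, which changes the locale for the rest of the shell session. A narrower variant, offered here as a sketch rather than as part of the original steps, prefixes the variable to the tr command alone. If the input is not valid UTF-8 at all, a commonly used fallback is the byte-oriented C locale, which makes tr process raw bytes instead of decoding characters. The helper name upper_bytes is hypothetical.

```shell
# Hypothetical helper: uppercase ASCII letters byte by byte.
# LC_ALL=C applies to this one tr invocation only; tr then treats its
# input as raw bytes, so bytes outside a-z pass through unchanged.
upper_bytes() {
  LC_ALL=C tr 'a-z' 'A-Z'
}

# Scope the UTF-8 locale to a single command instead of exporting it.
echo "Héllo, Wørld!" | LC_ALL=C.UTF-8 tr 'a-z' 'A-Z'

# Latin-1 style input: 0xE9 (octal 351) is not valid UTF-8 on its own,
# but the C locale lets tr pass it through. od shows the resulting bytes.
printf 'H\351llo\n' | upper_bytes | od -An -tx1
```

The trade-off is that in the C locale tr only converts the ASCII letters; accented characters are left exactly as they were.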