qapi: force a UTF-8 locale for running Python

Python2 did not validate locale correctness when reading input data, so
would happily read UTF-8 data in non-UTF-8 locales. Python3 is strict so
if you try to read UTF-8 data in the C locale, it will raise an error
for any UTF-8 bytes that aren't representable in 7-bit ascii encoding.
e.g.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 54: ordinal not in range(128)
Traceback (most recent call last):
File "/tmp/qemu-test/src/scripts/qapi-commands.py", line 317, in <module>
schema = QAPISchema(input_file)
File "/tmp/qemu-test/src/scripts/qapi.py", line 1468, in __init__
parser = QAPISchemaParser(open(fname, 'r'))
File "/tmp/qemu-test/src/scripts/qapi.py", line 301, in __init__
previously_included)
File "/tmp/qemu-test/src/scripts/qapi.py", line 348, in _include
exprs_include = QAPISchemaParser(fobj, previously_included, info)
File "/tmp/qemu-test/src/scripts/qapi.py", line 271, in __init__
self.src = fp.read()
File "/usr/lib64/python3.5/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]

More background on this can be seen in

https://www.python.org/dev/peps/pep-0538/

Many distros support a new C.UTF-8 locale that is like the C locale,
but with UTF-8 instead of 7-bit ASCII. That is not entirely portable
though. This patch thus sets the LANG to "C", but overrides LC_CTYPE
to be en_US.UTF-8 locale. This gets us pretty close to C.UTF-8, but
in a way that should be portable to everywhere QEMU builds.

This patch only forces UTF-8 for QAPI scripts, since that is the one
showing the immediate error under Python3 with C locale, but potentially
we ought to force this for all python scripts used in the build process.

Backports commit d4e5ec877ca698a87dabe68814c6f93668f50c60 from qemu
This commit is contained in:
Daniel P. Berrange 2018-03-06 11:32:46 -05:00 committed by Lioncash
parent 4fb711df46
commit 42b238b45b
No known key found for this signature in database
GPG key ID: 4E3C3CC1031BA9C7

View file

@ -9,6 +9,8 @@ ifneq ($(wildcard config-host.mak),)
all:
include config-host.mak
PYTHON_UTF8 = LC_ALL= LANG=C LC_CTYPE=en_US.UTF-8 $(PYTHON)
# Check that we're not trying to do an out-of-tree build from
# a tree that's been used for an in-tree build.
ifneq ($(realpath $(SRC_PATH)),$(realpath .))
@ -147,12 +149,12 @@ qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json
qapi-types.c qapi-types.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o "." -b $<, \
" GEN $@")
qapi-visit.c qapi-visit.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o "." -b $<, \
" GEN $@")