This is identical for each target. So, move the initialization to
common code. Move the variable itself out of tcg_ctx and name it
cpu_env to minimize changes within targets.
This also means we can remove tcg_global_reg_new_{ptr,i32,i64},
since there are no longer global-register temps created by targets.
Backports commit 1c2adb958fc07e5b3e81ed21b801c04a15f41f4f from qemu
This is a prerequisite for supporting multiple TCG contexts, since
we will have threads generating code in separate regions of
code_gen_buffer.
For this we need a new field (.size) in struct tb_tc to keep
track of the size of the translated code. This field uses a size_t
to avoid adding a hole to the struct, although really an unsigned
int would have been enough.
The comparison function we use is optimized for the common case:
insertions. Profiling shows that upon booting debian-arm, 98%
of comparisons are between existing tb's (i.e. a->size and b->size
are both !0), which happens during insertions (and removals, but
those are rare). The remaining cases are lookups. From reading the glib
sources we see that the first key is always the lookup key. However,
the code does not assume this to always be the case because this
behaviour is not guaranteed in the glib docs. However, we embed
this knowledge in the code as a branch hint for the compiler.
Note that tb_free does not free space in the code_gen_buffer anymore,
since we cannot easily know whether the tb is the last one inserted
in code_gen_buffer. The next patch in this series renames tb_free
to tb_remove to reflect this.
Performance-wise, lookups in tb_find_pc are the same as before:
O(log n). However, insertions are O(log n) instead of O(1), which
results in a small slowdown when booting debian-arm:
Performance counter stats for 'build/arm-softmmu/qemu-system-arm \
-machine type=virt -nographic -smp 1 -m 4096 \
-netdev user,id=unet,hostfwd=tcp::2222-:22 \
-device virtio-net-device,netdev=unet \
-drive file=img/arm/jessie-arm32.qcow2,id=myblock,index=0,if=none \
-device virtio-blk-device,drive=myblock \
-kernel img/arm/aarch32-current-linux-kernel-only.img \
-append console=ttyAMA0 root=/dev/vda1 \
-name arm,debug-threads=on -smp 1' (10 runs):
- Before:
8048.598422 task-clock (msec) # 0.931 CPUs utilized ( +- 0.28% )
16,974 context-switches # 0.002 M/sec ( +- 0.12% )
0 cpu-migrations # 0.000 K/sec
10,125 page-faults # 0.001 M/sec ( +- 1.23% )
35,144,901,879 cycles # 4.367 GHz ( +- 0.14% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
65,758,252,643 instructions # 1.87 insns per cycle ( +- 0.33% )
10,871,298,668 branches # 1350.707 M/sec ( +- 0.41% )
192,322,212 branch-misses # 1.77% of all branches ( +- 0.32% )
8.640869419 seconds time elapsed ( +- 0.57% )
- After:
8146.242027 task-clock (msec) # 0.923 CPUs utilized ( +- 1.23% )
17,016 context-switches # 0.002 M/sec ( +- 0.40% )
0 cpu-migrations # 0.000 K/sec
18,769 page-faults # 0.002 M/sec ( +- 0.45% )
35,660,956,120 cycles # 4.378 GHz ( +- 1.22% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
65,095,366,607 instructions # 1.83 insns per cycle ( +- 1.73% )
10,803,480,261 branches # 1326.192 M/sec ( +- 1.95% )
195,601,289 branch-misses # 1.81% of all branches ( +- 0.39% )
8.828660235 seconds time elapsed ( +- 0.38% )
Backports commit 2ac01d6dafabd4a726254eea98824c798d416ee4 from qemu
Thereby decoupling the resulting translated code from the current state
of the system.
Backports commit 87d757d60d66d5ee1608460b0f1e07e2b758db9c from qemu
Thereby decoupling the resulting translated code from the current state
of the system.
Backports commit f0ddf11b23260f0af84fb529486a8f9ba2d19401 from qemu
Thereby decoupling the resulting translated code from the current state
of the system.
Backports commit b5e3b4c2aca8eb5a9cfeedfb273af623f17c3731 from qemu
Thereby decoupling the resulting translated code from the current state
of the system.
Backports commit 2399d4e7cec22ecf1c51062d2ebfd45220dbaace from qemu
Convert all existing readers of tb->cflags to tb_cflags, so that we
use atomic_read and therefore avoid undefined behaviour in C11.
Note that the remaining setters/getters of the field are protected
by tb_lock, and therefore do not need conversion.
Luckily all readers access the field via 'tb->cflags' (so no foo.cflags,
bar->cflags in the code base), which makes the conversion easily
scriptable:
FILES=$(git grep 'tb->cflags' target include/exec/gen-icount.h \
accel/tcg/translator.c | cut -f1 -d':' | sort | uniq)
perl -pi -e 's/([^.>])tb->cflags/$1tb_cflags(tb)/g' $FILES
perl -pi -e 's/([a-z->.]*)(->|\.)tb->cflags/tb_cflags($1$2tb)/g' $FILES
Then manually fixed the few errors that checkpatch reported.
Compile-tested for all targets.
Backports commit c5a49c63fa26e8825ad101dfe86339ae4c216539 from qemu
Now we have a working '-cpu max', the linux-user-only
'any' CPU is pretty much the same thing, so implement it
that way.
For the moment we don't add any of the extra feature bits
to the system-emulation "max", because we don't set the
ID register bits we would need to to advertise those
features as present.
Backports commit a0032cc5427d0d396aa0a9383ad9980533448ea4 from qemu
Add support for "-cpu max" for ARM guests. This CPU type behaves
like "-cpu host" when KVM is enabled, and like a system CPU with
the maximum possible feature set otherwise. (Note that this means
it won't be migratable across versions, as we will likely add
features to it in future.)
Backports commit bab52d4bba3f22921a690a887b4bd0342f2754cd from qemu
The cortex A53 TRM specifies that bits 24 and 25 of the L2CTLR register
specify the number of cores in the processor, not the total number of
cores in the system. To report this correctly on machines with multiple
CPU clusters (ARM's big.LITTLE or Xilinx's ZynqMP) we need to allow
the machine to overwrite this value. To do this let's add an optional
property.
Backports commit f9a697112ee64180354f98309a5d6b691cc8699d from qemu
Using a local m68k floatx80_tentox()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Backports commit 6c25be6e30bda0e470f8f0b6b93d53a6efe469e8 from qemu
Using a local m68k floatx80_twotox()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Backports commit 068f161536d9a28a5bc482f3de9c387b2fe5908d from qemu
Using a local m68k floatx80_etox()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Backports commit 40ad087330bee5394c9e78c97f909f580be69b58 from qemu
Using a local m68k floatx80_log2()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Backports commit 67b453ed73fe65949c24e6ca2b43f6816a89a301 from qemu
Using a local m68k floatx80_log10()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Backports commit 248efb66fb88bc17c04a0d0f09a3539a43c80769 from qemu
Using a local m68k floatx80_logn()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Backports commit 50067bd16fead5d78a283130efbf3e3b026de450 from qemu
Using a local m68k floatx80_lognp1()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Backports commit 4b5c65b8f02a057bc1b77839b5012544f96fec80 from qemu
This functions is needed by upcoming m68k softfloat functions.
Source code copied for WinUAE (tag 3500)
(The WinUAE file has been copied from QEMU and has
the QEMU licensing notice)
Backports commit 9a069775a8087cbd6fa8c479b69be8d37bd90351 from qemu
The script used for converting from QEMUMachine had used one
DEFINE_MACHINE() per machine registered. In cases where multiple
machines are registered from one source file, avoid the excessive
generation of module init functions by reverting this unrolling.
Backports commit 8a661aea0e7f6e776c6ebc9abe339a85b34fea1d from qemu
Convert all machines to use DEFINE_MACHINE() instead of QEMUMachine
automatically using a script.
Backports commit e264d29de28c5b0be3d063307ce9fb613b427cc3 from qemu