unicorn/qemu
Emilio G. Cota 210d13ec49
tcg: consolidate TB lookups in tb_lookup__cpu_state
This avoids duplicating code. cpu_exec_step will also use the
new common function once we integrate parallel_cpus into tb->cflags.

Note that in this commit we also fix a race, described by Richard Henderson
during review. Think of this scenario with threads A and B:

(A) Lookup succeeds for TB in hash without tb_lock
(B) Sets the TB's tb->invalid flag
(B) Removes the TB from tb_htable
(B) Clears all CPU's tb_jmp_cache
(A) Store TB into local tb_jmp_cache

Given that order of events, (A) will keep executing that invalid TB until
another flush of its tb_jmp_cache happens, which in theory might never happen.
We can fix this by checking the tb->invalid flag every time we look up a TB
from tb_jmp_cache, so that in the above scenario, next time we try to find
that TB in tb_jmp_cache, we won't, and will therefore be forced to look it
up in tb_htable.

Performance-wise, I measured a small improvement when booting debian-arm.
Note that inlining pays off:

Performance counter stats for 'taskset -c 0 qemu-system-arm \
-machine type=virt -nographic -smp 1 -m 4096 \
-netdev user,id=unet,hostfwd=tcp::2222-:22 \
-device virtio-net-device,netdev=unet \
-drive file=jessie.qcow2,id=myblock,index=0,if=none \
-device virtio-blk-device,drive=myblock \
-kernel kernel.img -append console=ttyAMA0 root=/dev/vda1 \
-name arm,debug-threads=on -smp 1' (10 runs):

Before:
18714.917392 task-clock # 0.952 CPUs utilized ( +- 0.95% )
23,142 context-switches # 0.001 M/sec ( +- 0.50% )
1 CPU-migrations # 0.000 M/sec
10,558 page-faults # 0.001 M/sec ( +- 0.95% )
53,957,727,252 cycles # 2.883 GHz ( +- 0.91% ) [83.33%]
24,440,599,852 stalled-cycles-frontend # 45.30% frontend cycles idle ( +- 1.20% ) [83.33%]
16,495,714,424 stalled-cycles-backend # 30.57% backend cycles idle ( +- 0.95% ) [66.66%]
76,267,572,582 instructions # 1.41 insns per cycle
12,692,186,323 branches # 678.186 M/sec ( +- 0.92% ) [83.35%]
263,486,879 branch-misses # 2.08% of all branches ( +- 0.73% ) [83.34%]

19.648474449 seconds time elapsed ( +- 0.82% )

After, w/ inline (this patch):
18471.376627 task-clock # 0.955 CPUs utilized ( +- 0.96% )
23,048 context-switches # 0.001 M/sec ( +- 0.48% )
1 CPU-migrations # 0.000 M/sec
10,708 page-faults # 0.001 M/sec ( +- 0.81% )
53,208,990,796 cycles # 2.881 GHz ( +- 0.98% ) [83.34%]
23,941,071,673 stalled-cycles-frontend # 44.99% frontend cycles idle ( +- 0.95% ) [83.34%]
16,161,773,848 stalled-cycles-backend # 30.37% backend cycles idle ( +- 0.76% ) [66.67%]
75,786,269,766 instructions # 1.42 insns per cycle
12,573,617,143 branches # 680.708 M/sec ( +- 1.34% ) [83.33%]
260,235,550 branch-misses # 2.07% of all branches ( +- 0.66% ) [83.33%]

19.340502161 seconds time elapsed ( +- 0.56% )

After, w/o inline:
18791.253967 task-clock # 0.954 CPUs utilized ( +- 0.78% )
23,230 context-switches # 0.001 M/sec ( +- 0.42% )
1 CPU-migrations # 0.000 M/sec
10,563 page-faults # 0.001 M/sec ( +- 1.27% )
54,168,674,622 cycles # 2.883 GHz ( +- 0.80% ) [83.34%]
24,244,712,629 stalled-cycles-frontend # 44.76% frontend cycles idle ( +- 1.37% ) [83.33%]
16,288,648,572 stalled-cycles-backend # 30.07% backend cycles idle ( +- 0.95% ) [66.66%]
77,659,755,503 instructions # 1.43 insns per cycle
12,922,780,045 branches # 687.702 M/sec ( +- 1.06% ) [83.34%]
261,962,386 branch-misses # 2.03% of all branches ( +- 0.71% ) [83.35%]

19.700174670 seconds time elapsed ( +- 0.56% )

Backports commit f6bb84d53110398f4899c19dab4e0fe9908ec060 from qemu
2018-03-05 02:42:46 -05:00
..
accel target/arm: [tcg] Port to generic translation framework 2018-03-04 20:28:06 -05:00
crypto crypto: Clean up includes 2018-02-19 00:47:40 -05:00
default-configs arm64eb: add support for ARM64 big endian. 2017-04-24 23:30:01 +08:00
docs docs: clarify memory region lifecycle 2018-02-12 15:11:21 -05:00
fpu softfloat: define floatx80_round() 2018-03-03 20:57:27 -05:00
hw mips: replace cpu_mips_init() with cpu_generic_init() 2018-03-05 00:49:10 -05:00
include tcg: consolidate TB lookups in tb_lookup__cpu_state 2018-03-05 02:42:46 -05:00
qapi qapi: add explicit null to string input and output visitors 2018-03-03 20:32:50 -05:00
qobject qnum: add uint type 2018-03-03 18:37:56 -05:00
qom qom/cpu: move cpu_model null check to cpu_class_by_name() 2018-03-05 02:02:29 -05:00
scripts scripts: use build_ prefix for string not piped through cgen() 2018-03-03 22:11:28 -05:00
target tcg: remove addr argument from lookup_tb_ptr 2018-03-05 02:16:34 -05:00
tcg tcg: remove addr argument from lookup_tb_ptr 2018-03-05 02:16:34 -05:00
util bitmap: provide to_le/from_le helpers 2018-03-05 01:11:13 -05:00
aarch64.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
aarch64eb.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
accel.c clean-up: removed duplicate #includes 2018-02-28 08:51:56 -05:00
arm.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
armeb.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
atomic_template.h tcg: Add atomic128 helpers 2018-02-27 21:43:48 -05:00
CODING_STYLE import 2015-08-21 15:04:50 +08:00
configure configure: Drop AIX host support 2018-03-04 21:32:40 -05:00
COPYING import 2015-08-21 15:04:50 +08:00
COPYING.LIB import 2015-08-21 15:04:50 +08:00
cpu-exec-common.c tcg: Add EXCP_ATOMIC 2018-02-27 11:57:58 -05:00
cpu-exec.c tcg: consolidate TB lookups in tb_lookup__cpu_state 2018-03-05 02:42:46 -05:00
cpus.c tcg: handle EXCP_ATOMIC exception for system emulation 2018-03-02 09:56:43 -05:00
cputlb.c cputlb: Support generating CPU exceptions on memory transaction failures 2018-03-04 13:14:50 -05:00
exec.c memory: Open code FlatView rendering 2018-03-04 02:06:48 -05:00
gen_all_header.sh arm64eb: add support for ARM64 big endian. 2017-04-24 23:30:01 +08:00
glib_compat.c qapi: Improve qobject input visitor error reporting 2018-03-02 12:05:53 -05:00
HACKING import 2015-08-21 15:04:50 +08:00
header_gen.py target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
ioport.c hw: remove pio_addr_t 2018-02-24 02:43:16 -05:00
LICENSE import 2015-08-21 15:04:50 +08:00
m68k.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
Makefile Makefile: Add a FORCE target 2018-02-24 17:03:51 -05:00
Makefile.objs tcg: Add atomic helpers 2018-02-27 15:57:47 -05:00
Makefile.target tcg: Add generic translation framework 2018-03-04 14:31:16 -05:00
memory.c memory: avoid a name clash with access macro 2018-03-05 01:13:01 -05:00
memory_ldst.inc.c exec: introduce memory_ldst.inc.c 2018-03-01 09:59:34 -05:00
memory_mapping.c include/qemu/osdep.h: Don't include qapi/error.h 2018-02-21 23:08:18 -05:00
mips.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
mips64.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
mips64el.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
mipsel.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
powerpc.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
qapi-schema.json qapi: Update scripts to commit 01b2ffcedd94ad7b42bc870e4c6936c87ad03429 2018-03-03 18:32:12 -05:00
qemu-timer.c timer/cpus: fix some typos and update some comments 2018-02-25 23:21:57 -05:00
rules.mak rules.mak: Don't extract libs from .mo-libs in link command 2018-02-26 02:08:03 -05:00
softmmu_template.h cputlb: Support generating CPU exceptions on memory transaction failures 2018-03-04 13:14:50 -05:00
sparc.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
sparc64.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00
tcg-runtime.c tcg: consolidate TB lookups in tb_lookup__cpu_state 2018-03-05 02:42:46 -05:00
translate-all.c tcg: Infrastructure for managing constant pools 2018-03-04 22:17:33 -05:00
translate-all.h translate-all.c: Compute L1 page table properties at runtime 2018-02-26 11:46:58 -05:00
translate-common.c exec: Clean up includes 2018-02-19 00:49:55 -05:00
unicorn_common.h qom/cpu: Add MemoryRegion property 2018-02-18 21:54:50 -05:00
VERSION import 2015-08-21 15:04:50 +08:00
vl.c util: add cacheinfo 2018-03-03 16:58:28 -05:00
vl.h import 2015-08-21 15:04:50 +08:00
x86_64.h target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode 2018-03-05 01:29:54 -05:00