unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2025-12-18 00:01:43 +00:00

Emilio G. Cota ec14a00925 target-arm: emulate LL/SC using cmpxchg helpers

Emulating LL/SC with cmpxchg is not correct, since it can suffer from the ABA problem. Portable parallel code, however, is written assuming only cmpxchg--and not LL/SC--is available. This means that in practice emulating LL/SC with cmpxchg is a viable alternative.

The appended emulates LL/SC pairs in ARM with cmpxchg helpers. This works in both user and system mode. In usermode, it avoids pausing all other CPUs to perform the LL/SC pair. The subsequent performance and scalability improvement is significant, as the plots below show. They plot the throughput of atomic_add-bench compiled for ARM and executed on a 64-core x86 machine.

Hi-res plots: http://imgur.com/a/aNQpB

[ASCII plots not reproduced: four panels of atomic_add-bench, 1000000 ops/thread, for ranges [0,1], [0,2], [0,128] and [0,1024]; x-axis: Number of threads (0 to 60); series: cmpxchg (E) and master (H).]

Backports commit 354161b37c6465a32073eac5f16fa35939af2bb4 from qemu

2018-02-28 00:07:44 -05:00
..
arm_ldst.h	cpu: move exec-all.h inclusion out of cpu.h	2018-02-24 02:39:08 -05:00
cpu-qom.h	target-arm: make cpu-qom.h not target specific	2018-02-24 00:48:59 -05:00
cpu.c	arm: add Cortex A7 CPU parameters	2018-02-26 03:44:24 -05:00
cpu.h	target-arm: Implement new HLT trap for semihosting	2018-02-26 15:28:45 -05:00
cpu64.c	target-arm: Get rid of unused variable warnings	2018-02-23 12:43:09 -05:00
crypto_helper.c	target-arm: Clean up includes	2018-02-17 21:09:32 -05:00
helper-a64.c	softfloat: Implement run-time-configurable meaning of signaling NaN bit	2018-02-24 20:27:12 -05:00
helper-a64.h	import	2015-08-21 15:04:50 +08:00
helper.c	target-arm: Implement new HLT trap for semihosting	2018-02-26 15:28:45 -05:00
helper.h	target-arm: Implement MRS (banked) and MSR (banked) instructions	2018-02-21 21:50:42 -05:00
internals.h	Fix confusing argument names in some common functions	2018-02-25 03:58:27 -05:00
iwmmxt_helper.c	target-arm: Clean up includes	2018-02-17 21:09:32 -05:00
kvm-consts.h	import	2015-08-21 15:04:50 +08:00
Makefile.objs	delete sparc32_dma.h & arm-semi.c	2017-01-19 15:10:41 +08:00
neon_helper.c	target-arm: Fix warn about implicit conversion	2018-02-25 22:44:43 -05:00
op_addsub.h	import	2015-08-21 15:04:50 +08:00
op_helper.c	Fix masking of PC lower bits when doing exception returns	2018-02-26 08:09:28 -05:00
psci.c	Use #include "..." for our own headers, <...> for others	2018-02-25 04:10:33 -05:00
translate-a64.c	target-arm: Comments added to identify cases in a switch	2018-02-26 08:05:49 -05:00
translate.c	target-arm: emulate LL/SC using cmpxchg helpers	2018-02-28 00:07:44 -05:00
translate.h	target-arm: Infrastucture changes to enable handling of tagged address loading into PC	2018-02-26 07:58:17 -05:00
unicorn.h	arm64eb: add support for ARM64 big endian.	2017-04-24 23:30:01 +08:00
unicorn_aarch64.c	qemu-common: push cpu.h inclusion out of qemu-common.h	2018-02-24 01:50:56 -05:00
unicorn_arm.c	qemu-common: push cpu.h inclusion out of qemu-common.h	2018-02-24 01:50:56 -05:00