unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2025-12-15 15:41:25 +00:00


Emilio G. Cota 3dc16ebca3 target-i386: remove helper_lock()

It's been superseded by the atomic helpers.

The use of the atomic helpers provides a significant performance and scalability improvement. Below is the result of running the atomic_add-test microbenchmark with:

$ x86_64-linux-user/qemu-x86_64 tests/atomic_add-bench -o 5000000 -r $r -n $n

where $n is the number of threads and $r is the allowed range for the additions.

The scenarios measured are:
- atomic: implements x86's ADDL with the atomic_add helper (i.e. this patchset)
- cmpxchg: implements x86's ADDL with a TCG loop using the cmpxchg helper
- master: before this patchset

Results are sorted in ascending range, i.e. descending degree of contention. The Y axis is throughput in Mops/s. Tests are run on an AMD machine with 64 Opteron 6376 cores.

[Five ASCII plots omitted: atomic_add-bench, 5000000 ops/thread, for ranges [0,1], [0,2], [0,8], [0,128] and [0,1024]. X axis: number of threads (0 to 60). Y axis: throughput in Mops/s. Series: atomic (E), cmpxchg (H), master (N).]

hi-res: http://imgur.com/a/fMRmq

For master I stopped measuring after 8 threads, because there is little point in measuring the well-known performance collapse of a contended lock.

Backports commit 37b995f6e7a1cb6fa378c5cd4217b9dd9e1fc98b from qemu		2018-02-27 23:43:22 -05:00
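
The difference between the "atomic" and "cmpxchg" scenarios in the commit message above can be sketched in plain C11. This is only an illustration under stated assumptions: the function names add_atomic and add_cmpxchg are hypothetical, and the code uses <stdatomic.h> as a stand-in rather than QEMU's TCG helpers, which are not reproduced here.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the "atomic" scenario: the addition is a
 * single atomic read-modify-write, so it never has to retry. */
static uint32_t add_atomic(_Atomic uint32_t *p, uint32_t v)
{
    /* atomic_fetch_add returns the previous value; add v to get the new one. */
    return atomic_fetch_add(p, v) + v;
}

/* Hypothetical stand-in for the "cmpxchg" scenario: read the value, then
 * try to swap in old + v, retrying whenever another thread got there first. */
static uint32_t add_cmpxchg(_Atomic uint32_t *p, uint32_t v)
{
    uint32_t old = atomic_load(p);

    while (!atomic_compare_exchange_weak(p, &old, old + v)) {
        /* On failure, old has been refreshed with the current value of *p,
         * so the loop condition recomputes old + v and tries again. */
    }
    return old + v;
}
```

Both functions return the updated value. Under heavy contention the compare-and-swap loop may retry repeatedly, which is one plausible reading of why the cmpxchg series trails the atomic series at the narrow ranges reported in the commit message.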
crypto	crypto: Clean up includes	2018-02-19 00:47:40 -05:00
default-configs	arm64eb: add support for ARM64 big endian.	2017-04-24 23:30:01 +08:00
docs	docs: clarify memory region lifecycle	2018-02-12 15:11:21 -05:00
fpu	fpu: add mechanism to check for invalid long double formats	2018-02-26 02:27:40 -05:00
hw	qdev: Fix object reference leak in case device.realize() fails	2018-02-25 21:00:26 -05:00
include	tcg: Add atomic128 helpers	2018-02-27 21:43:48 -05:00
qapi	qapi: rename QmpOutputVisitor to QObjectOutputVisitor	2018-02-27 08:05:33 -05:00
qobject	qapi: rename QmpOutputVisitor to QObjectOutputVisitor	2018-02-27 08:05:33 -05:00
qom	qapi: rename QmpOutputVisitor to QObjectOutputVisitor	2018-02-27 08:05:33 -05:00
scripts	qapi: rename QmpOutputVisitor to QObjectOutputVisitor	2018-02-27 08:05:33 -05:00
target-arm	target-arm: Implement new HLT trap for semihosting	2018-02-26 15:28:45 -05:00
target-i386	target-i386: remove helper_lock()	2018-02-27 23:43:22 -05:00
target-m68k	target-m68k: Optimize gen_flush_flags	2018-02-27 10:19:54 -05:00
target-mips	softmmu: Add probe_write()	2018-02-27 12:20:50 -05:00
target-sparc	sparc: Use g_memdup() instead of g_new0() + memcpy()	2018-02-25 23:19:44 -05:00
tcg	tcg: Emit barriers with parallel_cpus	2018-02-27 22:28:33 -05:00
util	qapi: rename QmpOutputVisitor to QObjectOutputVisitor	2018-02-27 08:05:33 -05:00
aarch64.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
aarch64eb.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
accel.c	accel: make configure_accelerator return void	2018-02-24 00:31:28 -05:00
arm.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
armeb.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
atomic_template.h	tcg: Add atomic128 helpers	2018-02-27 21:43:48 -05:00
CODING_STYLE	import	2015-08-21 15:04:50 +08:00
configure	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
COPYING	import	2015-08-21 15:04:50 +08:00
COPYING.LIB	import	2015-08-21 15:04:50 +08:00
cpu-exec-common.c	tcg: Add EXCP_ATOMIC	2018-02-27 11:57:58 -05:00
cpu-exec.c	tcg: Add EXCP_ATOMIC	2018-02-27 11:57:58 -05:00
cpus.c	tcg: Add EXCP_ATOMIC	2018-02-27 11:57:58 -05:00
cputlb.c	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
exec.c	exec: Avoid direct references to Int128 parts	2018-02-27 11:01:43 -05:00
gen_all_header.sh	arm64eb: add support for ARM64 big endian.	2017-04-24 23:30:01 +08:00
glib_compat.c	qapi: Fix memleak in string visitors on int lists	2018-02-25 00:20:34 -05:00
HACKING	import	2015-08-21 15:04:50 +08:00
header_gen.py	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
ioport.c	hw: remove pio_addr_t	2018-02-24 02:43:16 -05:00
LICENSE	import	2015-08-21 15:04:50 +08:00
m68k.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
Makefile	Makefile: Add a FORCE target	2018-02-24 17:03:51 -05:00
Makefile.objs	tcg: Add atomic helpers	2018-02-27 15:57:47 -05:00
Makefile.target	tcg: Add atomic helpers	2018-02-27 15:57:47 -05:00
memory.c	exec.c: Remove static allocation of sub_section of sub_page	2018-02-26 10:50:04 -05:00
memory_mapping.c	include/qemu/osdep.h: Don't include qapi/error.h	2018-02-21 23:08:18 -05:00
mips.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
mips64.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
mips64el.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
mipsel.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
powerpc.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
qapi-schema.json	qapi: Lazy creation of array types	2018-02-19 18:55:35 -05:00
qemu-timer.c	timer/cpus: fix some typos and update some comments	2018-02-25 23:21:57 -05:00
rules.mak	rules.mak: Don't extract libs from .mo-libs in link command	2018-02-26 02:08:03 -05:00
softmmu_template.h	cputlb: Remove includes from softmmu_template.h	2018-02-27 12:40:43 -05:00
sparc.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
sparc64.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
tcg-runtime.c	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00
translate-all.c	tcg: Add EXCP_ATOMIC	2018-02-27 11:57:58 -05:00
translate-all.h	translate-all.c: Compute L1 page table properties at runtime	2018-02-26 11:46:58 -05:00
translate-common.c	exec: Clean up includes	2018-02-19 00:49:55 -05:00
unicorn_common.h	qom/cpu: Add MemoryRegion property	2018-02-18 21:54:50 -05:00
VERSION	import	2015-08-21 15:04:50 +08:00
vl.c	cpu: Support a target CPU having a variable page size	2018-02-26 12:29:08 -05:00
vl.h	import	2015-08-21 15:04:50 +08:00
x86_64.h	tcg: Add CONFIG_ATOMIC64	2018-02-27 22:25:36 -05:00