unicorn/qemu
Emilio G. Cota 8276a4dc66
hardfloat: implement float32/64 comparison
Performance results for fp-bench:

Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
cmp-single: 110.98 MFlops
cmp-double: 107.12 MFlops
- after:
cmp-single: 506.28 MFlops
cmp-double: 524.77 MFlops

Note that flattening both eq and eq_signaling versions
would give us extra performance (695v506, 615v524 Mflops
for single/double, respectively) but this would emit two
essentially identical functions for each eq/signaling pair,
which is a waste.

Aggregate performance improvement for the last few patches:
[ all charts in png: https://imgur.com/a/4yV8p ]

1. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

qemu-aarch64 NBench score; higher is better
Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

16 +-+-----------+-------------+----===-------+---===-------+-----------+-+
14 +-+..........................@@@&&.=.......@@@&&.=...................+-+
12 +-+..........................@.@.&.=.......@.@.&.=.....+befor=== +-+
10 +-+..........................@.@.&.=.......@.@.&.=.....+ad@@&& = +-+
8 +-+.......................$$$%.@.&.=.......@.@.&.=.....+ @@u& = +-+
6 +-+............@@@&&=+***##.$%.@.&.=***##$$%+@.&.=..###$$%%@i& = +-+
4 +-+.......###$%%.@.&=.*.*.#.$%.@.&.=*.*.#.$%.@.&.=+**.#+$ +@m& = +-+
2 +-+.....***.#$.%.@.&=.*.*.#.$%.@.&.=*.*.#.$%.@.&.=.**.#+$+sqr& = +-+
0 +-+-----***##$%%@@&&=-***##$$%@@&&==***##$$%@@&&==-**##$$%+cmp==-----+-+
FOURIER NEURAL NELU DECOMPOSITION gmean

qemu-aarch64 SPEC06fp (test set) speedup over QEMU 4c2c1015905
Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
error bars: 95% confidence interval

4.5 +-+---+-----+----+-----+-----+-&---+-----+----+-----+-----+-----+----+-----+-----+-----+-----+----+-----+---+-+
4 +-+..........................+@@+...........................................................................+-+
3.5 +-+..............%%@&.........@@..............%%@&............................................+++dsub +-+
2.5 +-+....&&+.......%%@&.......+%%@..+%%&+..@@&+.%%@&....................................+%%&+.+%@&++%%@& +-+
2 +-+..+%%&..+%@&+.%%@&...+++..%%@...%%&.+$$@&..%%@&..%%@&.......+%%&+.%%@&+......+%%@&.+%%&++$$@&++d%@& %%@&+-+
1.5 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&*+f%@&**$%@&+-+
0.5 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&+sqr@&**$%@&+-+
0 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&*+cmp&**$%@&+-+
410.bw416.gam433.434.z435.436.cac437.lesli444.447.de450.so453454.ca459.GemsF465.tont470.lb4482.sphinxgeomean

2. Host: ARM Aarch64 A57 @ 2.4GHz

qemu-aarch64 NBench score; higher is better
Host: Applied Micro X-Gene, Aarch64 A57 @ 2.4 GHz

5 +-+-----------+-------------+-------------+-------------+-----------+-+
4.5 +-+........................................@@@&==...................+-+
3 4 +-+..........................@@@&==........@.@&.=.....+before +-+
3 +-+..........................@.@&.=........@.@&.=.....+ad@@@&== +-+
2.5 +-+.....................##$$%%.@&.=........@.@&.=.....+ @m@& = +-+
2 +-+............@@@&==.***#.$.%.@&.=.***#$$%%.@&.=.***#$$%%d@& = +-+
1.5 +-+.....***#$$%%.@&.=.*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#+$ +f@& = +-+
0.5 +-+.....*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#+$+sqr& = +-+
0 +-+-----***#$$%%@@&==-***#$$%%@@&==-***#$$%%@@&==-***#$$%+cmp==-----+-+
FOURIER NEURAL NLU DECOMPOSITION gmean
2018-12-19 10:45:22 -05:00
..
accel tcg: Support MMU protection regions smaller than TARGET_PAGE_SIZE 2018-11-16 21:35:54 -05:00
crypto crypto: Clean up includes 2018-02-19 00:47:40 -05:00
default-configs arm64eb: add support for ARM64 big endian. 2017-04-24 23:30:01 +08:00
docs docs/devel/memory.txt: Document _with_attrs accessors 2018-10-04 04:46:26 -04:00
fpu hardfloat: implement float32/64 comparison 2018-12-19 10:45:22 -05:00
hw hw/mips/mips_r4k: Fix initialization of MIPS target CPUs 2018-09-03 17:40:08 -04:00
include softfloat: add float{32,64}_is_zero_or_normal 2018-12-19 10:31:10 -05:00
qapi qapi: Rewrite string-input-visitor's integer and list parsing 2018-12-18 04:57:25 -05:00
qobject qstring: Move qstring_from_substr()'s @end one to the right 2018-08-02 21:24:19 -04:00
qom tcg: access cpu->icount_decr.u16.high with atomics 2018-10-23 14:36:46 -04:00
scripts qapi: Do not define enumeration value explicitly 2018-12-18 05:03:22 -05:00
target target/arm: Free name string in ARMCPRegInfo hashtable entries' 2018-12-18 05:09:59 -05:00
tcg tcg: Drop nargs from tcg_op_insert_{before,after} 2018-12-18 06:00:13 -05:00
util cutils: Fix qemu_strtosz() & friends to reject non-finite sizes 2018-12-18 04:48:12 -05:00
aarch64.h target/arm: Introduce arm_hcr_el2_eff 2018-12-18 04:27:34 -05:00
aarch64eb.h target/arm: Introduce arm_hcr_el2_eff 2018-12-18 04:27:34 -05:00
accel.c clean-up: removed duplicate #includes 2018-02-28 08:51:56 -05:00
arm.h target/arm: Introduce arm_hcr_el2_eff 2018-12-18 04:27:34 -05:00
armeb.h target/arm: Introduce arm_hcr_el2_eff 2018-12-18 04:27:34 -05:00
CODING_STYLE
configure configure: Remove old -fno-gcse workaround for GCC 4.6.x and 4.7.[012] 2018-12-18 03:52:36 -05:00
COPYING
COPYING.LIB
cpus.c Include qapi/error.h exactly where needed 2018-03-07 12:26:38 -05:00
exec.c Partial backport of: exec.c: Handle IOMMUs in address_space_translate_for_iotlb() 2018-11-16 21:24:55 -05:00
gen_all_header.sh arm64eb: add support for ARM64 big endian. 2017-04-24 23:30:01 +08:00
glib_compat.c Use cpu_create(type) instead of cpu_init(cpu_model) 2018-03-20 14:20:30 -04:00
HACKING HACKING: document preference for g_new instead of g_malloc 2018-05-22 00:30:50 -04:00
header_gen.py target/arm: Introduce arm_hcr_el2_eff 2018-12-18 04:27:34 -05:00
ioport.c hw: remove pio_addr_t 2018-02-24 02:43:16 -05:00
LICENSE
m68k.h target/arm: Correctly implement handling of HCR_EL2.{VI, VF} 2018-11-16 21:53:53 -05:00
Makefile Revert "Makefile: Rename TARGET_DIRS to TARGET_LIST" 2018-07-05 17:40:24 -04:00
Makefile.objs qapi: Move qapi-schema.json to qapi/, rename generated files 2018-03-09 11:35:11 -05:00
Makefile.target configure: Remove old -fno-gcse workaround for GCC 4.6.x and 4.7.[012] 2018-12-18 03:52:36 -05:00
memory.c memory: learn about non-volatile memory region 2018-11-11 08:50:39 -05:00
memory_ldst.inc.c exec: Fix MAP_RAM for cached access 2018-07-03 01:11:12 -04:00
memory_mapping.c include/qemu/osdep.h: Don't include qapi/error.h 2018-02-21 23:08:18 -05:00
mips.h target/arm: Correctly implement handling of HCR_EL2.{VI, VF} 2018-11-16 21:53:53 -05:00
mips64.h target/arm: Correctly implement handling of HCR_EL2.{VI, VF} 2018-11-16 21:53:53 -05:00
mips64el.h target/arm: Correctly implement handling of HCR_EL2.{VI, VF} 2018-11-16 21:53:53 -05:00
mipsel.h target/arm: Correctly implement handling of HCR_EL2.{VI, VF} 2018-11-16 21:53:53 -05:00
powerpc.h target/arm: Correctly implement handling of HCR_EL2.{VI, VF} 2018-11-16 21:53:53 -05:00
qemu-timer.c timer/cpus: fix some typos and update some comments 2018-02-25 23:21:57 -05:00
riscv32.h target/arm: Add v8M stack checks on ADD/SUB/MOV of SP 2018-10-08 14:15:15 -04:00
riscv64.h target/arm: Add v8M stack checks on ADD/SUB/MOV of SP 2018-10-08 14:15:15 -04:00
rules.mak build-sys: silence make by default or V=0 2018-03-06 08:58:03 -05:00
sparc.h target/arm: Correctly implement handling of HCR_EL2.{VI, VF} 2018-11-16 21:53:53 -05:00
sparc64.h target/arm: Correctly implement handling of HCR_EL2.{VI, VF} 2018-11-16 21:53:53 -05:00
unicorn_common.h unicorn_common: Fix unicorn memory functions failing 2018-09-03 10:40:14 -04:00
VERSION Open 4.0 development tree 2018-12-11 20:33:45 -05:00
vl.c Use cpu_create(type) instead of cpu_init(cpu_model) 2018-03-20 14:20:30 -04:00
vl.h
x86_64.h target/arm: Correctly implement handling of HCR_EL2.{VI, VF} 2018-11-16 21:53:53 -05:00