unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2025-09-08 19:17:08 +00:00

Author	SHA1	Message	Date
Richard Henderson	254f882efc	target/arm: SVE brk[ab] merging does not have s bit While brk[ab] zeroing has a flags setting option, the merging variant does not. Retain the same argument structure, to share expansion but force the flag zero and do not decode bit 22. Backports commit 407e6ce7f1f428cb242d424cd35381a77b5b2071 from qemu	2019-01-13 19:39:34 -05:00
Richard Henderson	4d8b7a9967	target/arm: Convert ARM_TBFLAG_* to FIELDs Use "register" TBFLAG_ANY to indicate shared state between A32 and A64, and "registers" TBFLAG_A32 & TBFLAG_A64 for fields that are specific to the given cpu state. Move ARM_TBFLAG_BE_DATA to shared state, instead of its current placement within "Bit usage when in AArch32 state". Backports commit aad821ac4faad369fad8941d25e59edf2514246b from qemu	2019-01-13 19:21:18 -05:00
Fredrik Noring	ee4b59e981	target/mips: Support R5900 three-operand MADD1 and MADDU1 instructions The three-operand MADD and MADDU are specific to R5900 cores. Backports commit a95c4c26f1dc233987350e7cb1cf62d46ade5ce5 from qemu	2019-01-05 08:07:56 -05:00
Philippe Mathieu-Daudé	76bc93690f	target/mips: Support R5900 three-operand MADD and MADDU instructions The three-operand MADD and MADDU are specific to Sony R5900 core, and Toshiba TX19/TX39/TX79 cores as well. The "32-Bit TX System RISC TX39 Family Architecture manual" is available at https://wiki.qemu.org/File:DSAE0022432.pdf Backports commit 3b948f053fc588154d95228da8a6561c61c66104 from qemu	2019-01-05 08:03:43 -05:00
Aleksandar Markovic	5729c803a7	target/mips: MXU: Add handler for an align instruction Add translation handler for S32ALNI MXU instruction. Backports commit 79f5fee7a3c53494c7ca4bc18c72944f5e2d5c2f from qemu	2019-01-05 08:00:09 -05:00
Aleksandar Markovic	94956d81f6	target/mips: MXU: Add handlers for max/min instructions Add translation handlers for six max/min MXU instructions. Backports commit bb84cbf38505bd1d800fdddcd81407a99e5c2142 from qemu	2019-01-05 07:55:39 -05:00
Aleksandar Markovic	bf7da7bf57	target/mips: MXU: Add handlers for logic instructions Add translation handlers for four logic MXU instructions. It should be noted that there is an error in MXU documentation (dated June 2017) regarding opcodes for this group of instructions. This was confirmed by running tests on hardware, and also by looking up other related public source trees (binutils, Android NDK). In initial MXU patches to QEMU, opcodes for MXU logic instructions were created to be in accordance with the MXU documentation, therefore the error from the documentation was propagated. This patch corrects that, changing the involved code. Besides that, as MXU was designed and implemented only for 32-bit CPUs, corresponding preprocessor conditions were added around MXU code, which allows more flexible implementation of MXU handlers. Backports commit b621f0187ef789aeef733cf79e5ac83984752394 from qemu	2019-01-05 07:48:08 -05:00
Aleksandar Markovic	ba253dd0d3	target/mips: MXU: Improve the comment containing MXU overview Improve textual description of MXU extension. These are mostly comment formatting changes. Backports commit 84e2c895b12fb7056daeb7e5094656eae7b50d3d from qemu	2019-01-05 07:39:47 -05:00
Aleksandar Markovic	57bb979ce8	target/mips: MXU: Add generic naming for optn2 constants Add generic naming involving generic suffixes OPTN0, OPTN1, OPTN2, OPTN3 for four optn2 constants. Existing suffixes WW, LW, HW, XW are not quite appropriate for some instructions using optn2.	2019-01-05 07:35:49 -05:00
Aleksandar Markovic	b5e1ea2e08	target/mips: MXU: Add missing opcodes/decoding for LX* instructions Add missing opcodes and decoding engine for LXB, LXH, LXW, LXBU, and LXHU instructions. They were for some reason forgotten in previous commits. The MXU opcode list and decoding engine should be now complete. Backports commit c233bf07af7cf2358b69c38150dbd2e3e4a399b6 from qemu	2019-01-05 07:34:07 -05:00
Paul Burton	1c6732b053	atomics: Set ATOMIC_REG_SIZE=8 for MIPS n32 ATOMIC_REG_SIZE is currently defined as the default sizeof(void *) for all MIPS host builds, including those using the n32 ABI. n32 is the MIPS64 ILP32 ABI and as such tcg/mips/tcg-target.h defines TCG_TARGET_REG_BITS as 64 for n32 builds. If we attempt to build QEMU for an n32 host with support for a 64b target architecture then TCG_OVERSIZED_GUEST is 0 and accel/tcg/cputlb.c attempts to use atomic_* functions. This fails because ATOMIC_REG_SIZE is 4, causing the calls to QEMU_BUILD_BUG_ON(sizeof(*ptr) > ATOMIC_REG_SIZE) in the various atomic_* functions to generate errors. Fix this by defining ATOMIC_REG_SIZE as 8 for all MIPS64 builds, which will cover both n32 (ILP32) & n64 (LP64) ABIs in much the same way as we already do for x86_64/x32. Backports commit c5b00c1684f3317e887c7401b58dde54c2b05354 from qemu	2019-01-05 07:26:14 -05:00
Richard Henderson	6ed82f77b4	tcg: Improve call argument loading Free the argument register only after we have verified that the temporary is not already in that register. This case is likely now that we are back propagating the preferred register. Backports commit 4250da10923347c9ee907f8d72bd93dfa5ee8742 from qemu	2019-01-05 07:24:08 -05:00
Richard Henderson	64843e8c09	tcg: Record register preferences during liveness With these preferences, we can arrange for function call arguments to be computed into the proper registers instead of requiring extra moves. Backports commit 25f49c5f1508ddf081ce89fa6bbfd87a51eea37b from qemu	2019-01-05 07:22:57 -05:00
Richard Henderson	c2be1cee79	tcg: Add TCG_OPF_BB_EXIT Use this to notice the opcodes that exit the TB, which implies that local temps are really dead and need not be synced. Previously we so marked the true end of the TB, but that was immediately overwritten by the la_bb_end invoked by any TCG_OPF_BB_END opcode, like exit_tb. Backports commit ae36a246ed1a0e96c6c4f478f03d047dfa3a8898 from qemu	2019-01-05 07:09:38 -05:00
Richard Henderson	63cf164724	tcg: Split out more subroutines from liveness_pass_1 Backports commit f65a061c39cc4f9d088201031050e42eb23d5b2a from qemu	2019-01-05 07:07:49 -05:00
Richard Henderson	c348ceba56	tcg: Rename and adjust liveness_pass_1 helpers No need for a "tcg_" prefix for a static function; we already have another "la_" prefix for indicating liveness analysis. Pass in nb_globals and nb_temps, as we will already have them in registers for other loops within the parent function. Backports commit 2616c8082143373e794b62444bf81754f50dbf6b from qemu	2019-01-05 07:05:58 -05:00
Richard Henderson	b356212b33	tcg: Dump register preference info with liveness Backports commit 1894f69a612b35c2a39b44a824da04d74bfe324a from qemu	2019-01-05 07:00:21 -05:00
Richard Henderson	494d802781	tcg: Improve register allocation for matching constraints Try harder to honor the output_pref. When we're forced to allocate a second register for the input, it does not need to use the input constraint; that will be honored by the register we allocate for the output and a move is already required. Backports commit d62816f2db439b2dd761c674f0256f21d9dd2ed0 from qemu	2019-01-05 06:57:56 -05:00
Richard Henderson	83a7de2566	tcg: Add output_pref to TCGOp Allocate storage for, but do not yet fill in, per-opcode preferences for the output operands. Pass it in to the register allocation routines for output operands. Backports commit 69e3706d2b473815e382552e729d12590339e0ac from qemu	2019-01-05 06:54:40 -05:00
Richard Henderson	19bde1a9cf	tcg: Add preferred_reg argument to tcg_reg_alloc_do_movi Pass this through to temp_sync. Backports commit ba87719cd267e6f07b17f6cda08246bf483146d4 from qemu	2019-01-05 06:51:55 -05:00
Richard Henderson	c3aa567b03	tcg: Add preferred_reg argument to temp_sync Pass this through to tcg_reg_alloc. Backports commit 98b4e186c1ccb8f1868c61a33a3be8c2b82654f3 from qemu	2019-01-05 06:50:22 -05:00
Richard Henderson	96b6640f3b	tcg: Add preferred_reg argument to temp_load Pass this through to tcg_reg_alloc. Backports commit b722452aefb089e003b16946a4d73bad1fd3b79b from qemu	2019-01-05 06:48:19 -05:00
Richard Henderson	5e73b27607	tcg: Add preferred_reg argument to tcg_reg_alloc This new argument will aid register allocation by indicating how the temporary will be used in future. If the preference cannot be satisfied, fall back to the constraints of the current insn. Short circuit the preference when it cannot be satisfied or if it does not further constrain the operation. With an eye toward optimizing function call sequences, optimize for the preferred_reg set containing a single register. For the moment, all users pass 0 for preference. Backports commit b016486e7baddb43cfc1e51909b05cde9cf82e0c from qemu	2019-01-05 06:45:15 -05:00
Richard Henderson	6aea2880d2	tcg: Add reachable_code_pass Delete trivially dead code that follows unconditional branches and noreturn helpers. These can occur either via optimization or via the structure of a target's translator following an exception. Backports commit b4fc67c7afd2c338d6e7c73a7f428dfe05ae0603 from qemu	2019-01-05 06:41:16 -05:00
Richard Henderson	26ab4d6560	tcg: Reference count labels Increment when adding branches, and decrement when removing them. Backports commit d88a117eaa39b1d0eb1a79fe84c81840a39eb233 from qemu	2019-01-05 06:39:20 -05:00
Richard Henderson	80b4bef1cc	tcg: Add TCG_CALL_NO_RETURN Remember which helpers have been marked noreturn. Backports commit 15d7409260498505e991e7b9d87118627165e613 from qemu	2019-01-05 06:35:21 -05:00
Richard Henderson	7dbbf58653	tcg: Renumber TCG_CALL_* flags Previously, the low 4 bits were used for TCG_CALL_TYPE_MASK, which was removed in 6a18ae2d2947532d5c26439548afa0481c4529f9. Backports commit 3b50352b05eeafeb95cccd770f7aaba00bbdf6fe from qemu	2019-01-05 06:32:52 -05:00
Marc-André Lureau	ba1f54804a	qapi: fix flat union on uncovered branches conditionals Default branches variant should use the member conditional. This fixes compilation with --disable-replication. Fixes: 335d10cd8e2c3bb6067804b095aaf6371fc1983e Backports commit ce1a1aec47877a281d69dbc2e65f86bfe8fea231 from qemu	2018-12-19 10:53:29 -05:00
Lioncash	f8435ca3a6	Temporarily disable tcg_debug_assert() Backporting 6fa2cef205a60b5c5c3b058f53852416b885c455 by Thomas Huth started invoking assertions on clang. This means Unicorn is doing something silly. This should be tracked down, but in the meantime, restore behavior to allow tests to still be run.	2018-12-19 10:50:48 -05:00
Emilio G. Cota	8276a4dc66	hardfloat: implement float32/64 comparison Performance results for fp-bench: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: cmp-single: 110.98 MFlops cmp-double: 107.12 MFlops - after: cmp-single: 506.28 MFlops cmp-double: 524.77 MFlops Note that flattening both eq and eq_signaling versions would give us extra performance (695v506, 615v524 Mflops for single/double, respectively) but this would emit two essentially identical functions for each eq/signaling pair, which is a waste. Aggregate performance improvement for the last few patches: [ all charts in png: https://imgur.com/a/4yV8p ] 1. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz [ASCII chart: qemu-aarch64 NBench score; higher is better] [ASCII chart: qemu-aarch64 SPEC06fp (test set) speedup over QEMU 4c2c1015905; error bars: 95% confidence interval] 2. Host: ARM Aarch64 A57 @ 2.4GHz (Applied Micro X-Gene) [ASCII chart: qemu-aarch64 NBench score; higher is better]	2018-12-19 10:45:22 -05:00
Emilio G. Cota	f7549fc13e	hardfloat: implement float32/64 square root Performance results for fp-bench: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: sqrt-single: 42.30 MFlops sqrt-double: 22.97 MFlops - after: sqrt-single: 311.42 MFlops sqrt-double: 311.08 MFlops Here USE_FP makes a huge difference for f64's, with throughput going from ~200 MFlops to ~300 MFlops. Backports commit f131bae8a7b7ed1928cc94c69df291db609c316a from qemu	2018-12-19 10:43:23 -05:00
Emilio G. Cota	3cf836ca83	hardfloat: implement float32/64 fused multiply-add Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: fma-single: 74.73 MFlops fma-double: 74.54 MFlops - after: fma-single: 203.37 MFlops fma-double: 169.37 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: fma-single: 23.24 MFlops fma-double: 23.70 MFlops - after: fma-single: 66.14 MFlops fma-double: 63.10 MFlops 3. IBM POWER8E @ 2.1 GHz - before: fma-single: 37.26 MFlops fma-double: 37.29 MFlops - after: fma-single: 48.90 MFlops fma-double: 59.51 MFlops Here having 3FP64 set to 1 pays off for x86_64: [1] 170.15 vs [0] 153.12 MFlops Backports commit ccf770ba7396c240ca8a1564740083742dd04c08 from qemu	2018-12-19 10:42:00 -05:00
Emilio G. Cota	95781d2bb5	hardfloat: implement float32/64 division Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: div-single: 34.84 MFlops div-double: 34.04 MFlops - after: div-single: 275.23 MFlops div-double: 216.38 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: div-single: 9.33 MFlops div-double: 9.30 MFlops - after: div-single: 51.55 MFlops div-double: 15.09 MFlops 3. IBM POWER8E @ 2.1 GHz - before: div-single: 25.65 MFlops div-double: 24.91 MFlops - after: div-single: 96.83 MFlops div-double: 31.01 MFlops Here setting 2FP64_USE_FP to 1 pays off for x86_64: [1] 215.97 vs [0] 62.15 MFlops Backports commit 4a6295613f533a6841de5968c50e1ca36748807e from qemu	2018-12-19 10:40:00 -05:00
Emilio G. Cota	93991714fb	hardfloat: implement float32/64 multiplication Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: mul-single: 126.91 MFlops mul-double: 118.28 MFlops - after: mul-single: 258.02 MFlops mul-double: 197.96 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: mul-single: 37.42 MFlops mul-double: 38.77 MFlops - after: mul-single: 73.41 MFlops mul-double: 76.93 MFlops 3. IBM POWER8E @ 2.1 GHz - before: mul-single: 58.40 MFlops mul-double: 59.33 MFlops - after: mul-single: 60.25 MFlops mul-double: 94.79 MFlops Backports commit 2dfabc86e656e835c67954c60e143ecd33e15817 from qemu	2018-12-19 10:38:33 -05:00
Emilio G. Cota	0862d9c462	hardfloat: implement float32/64 addition and subtraction Performance results (single and double precision) for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: add-single: 135.07 MFlops add-double: 131.60 MFlops sub-single: 130.04 MFlops sub-double: 133.01 MFlops - after: add-single: 443.04 MFlops add-double: 301.95 MFlops sub-single: 411.36 MFlops sub-double: 293.15 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: add-single: 44.79 MFlops add-double: 49.20 MFlops sub-single: 44.55 MFlops sub-double: 49.06 MFlops - after: add-single: 93.28 MFlops add-double: 88.27 MFlops sub-single: 91.47 MFlops sub-double: 88.27 MFlops 3. IBM POWER8E @ 2.1 GHz - before: add-single: 72.59 MFlops add-double: 72.27 MFlops sub-single: 75.33 MFlops sub-double: 70.54 MFlops - after: add-single: 112.95 MFlops add-double: 201.11 MFlops sub-single: 116.80 MFlops sub-double: 188.72 MFlops Note that the IBM and ARM machines benefit from having HARDFLOAT_2F{32,64}_USE_FP set to 0. Otherwise their performance can suffer significantly: - IBM Power8: add-single: [1] 54.94 vs [0] 116.37 MFlops add-double: [1] 58.92 vs [0] 201.44 MFlops - Aarch64 A57: add-single: [1] 80.72 vs [0] 93.24 MFlops add-double: [1] 82.10 vs [0] 88.18 MFlops On the Intel machine, having 2F64 set to 1 pays off, but it doesn't for 2F32: - Intel i7-6700K: add-single: [1] 285.79 vs [0] 426.70 MFlops add-double: [1] 302.15 vs [0] 278.82 MFlops Backports commit 1b615d482094e0123d187f0ad3c676ba8eb9d0a3 from qemu	2018-12-19 10:36:55 -05:00
Emilio G. Cota	bca8e39e3c	fpu: introduce hardfloat The appended paves the way for leveraging the host FPU for a subset of guest FP operations. For most guest workloads (e.g. FP flags aren't ever cleared, inexact occurs often and rounding is set to the default [to nearest]) this will yield sizable performance speedups. The approach followed here avoids checking the FP exception flags register. See the added comment for details. This assumes that QEMU is running on an IEEE754-compliant FPU and that the rounding is set to the default (to nearest). The implementation-dependent specifics of the FPU should not matter; things like tininess detection and snan representation are still dealt with in soft-fp. However, this approach will break on most hosts if we compile QEMU with flags that break IEEE compatibility. There is no way to detect all of these flags at compilation time, but at least we check for -ffast-math (which defines __FAST_MATH__) and disable hardfloat (plus emit a #warning) when it is set. This patch just adds common code. Some operations will be migrated to hardfloat in subsequent patches to ease bisection. Note: some architectures (at least PPC, there might be others) clear the status flags passed to softfloat before most FP operations. This precludes the use of hardfloat, so to avoid introducing a performance regression for those targets, we add a flag to disable hardfloat. In the long run though it would be good to fix the targets so that at least the inexact flag passed to softfloat is indeed sticky. Backports commit a94b783952cc493cb241aabb1da8c7a830385baa from qemu	2018-12-19 10:32:32 -05:00
Emilio G. Cota	5d3ccde625	softfloat: add float{32,64}_is_zero_or_normal These will gain some users very soon. Backports commit 315df0d193929b167b9d7be4665d5f2c0e2427e0 from qemu	2018-12-19 10:31:10 -05:00
Emilio G. Cota	a9d9005399	softfloat: rename canonicalize to sf_canonicalize glibc >= 2.25 defines canonicalize in commit eaf5ad0 (Add canonicalize, canonicalizef, canonicalizel., 2016-10-26). Given that we'll be including <math.h> soon, prepare for this by prefixing our canonicalize() with sf_ to avoid clashing with the libc's canonicalize(). Backports commit f9943c7f766678af36d31076b78e466256f4871b from qemu	2018-12-19 10:30:38 -05:00
Emilio G. Cota	3a8f7d6d84	softfloat: add float{32,64}_is_{de,}normal This paves the way for upcoming work. Backports commit 588e6dfd8774e6da56b6995611655fbe59ff564a from qemu	2018-12-19 10:30:33 -05:00
Emilio G. Cota	3d0359c0f5	xxhash: match output against the original xxhash32 Change the order in which we extract a/b and c/d to match the output of the upstream xxhash32. Tested with: https://github.com/cota/xxhash/tree/qemu Backports commit b7c2cd08a6f68010ad27c9c0bf2fde02fb743a0e from qemu	2018-12-18 06:09:01 -05:00
Emilio G. Cota	308f4c1e0c	include: move exec/tb-hash-xx.h to qemu/xxhash.h Backports commit fe656e3185fa10973d43492c867643e80fa433cd from qemu	2018-12-18 06:07:55 -05:00
Emilio G. Cota	63082a4d20	exec: introduce qemu_xxhash{2,4,5,6,7} Before moving them all to include/qemu/xxhash.h. Backports commit c971d8fa73ff92996d751fa87d90f220cf3c8194 from qemu	2018-12-18 06:04:57 -05:00
Emilio G. Cota	0567c69235	tcg: Drop nargs from tcg_op_insert_{before,after} It's unused since 75e8b9b7aa0b95a761b9add7e2f09248b101a392. Backports commit ac1043f6d607aaac206c8aac42bc32f634f59395 from qemu	2018-12-18 06:00:13 -05:00
Alistair Francis	7219548fbd	tcg/mips: Improve the add2/sub2 command to use TCG_TARGET_REG_BITS Instead of hard coding 31 for the shift right use TCG_TARGET_REG_BITS - 1. Backports commit 161dec9d1b03552e78e5728186eae9cf1dfbe035 from qemu	2018-12-18 05:58:09 -05:00
Richard Henderson	5c4e852c6e	tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP For now, defined universally as true, since we previously required backends to implement swapped memory operations. Future patches may now remove that support where it is onerous. Backports commit e1dcf3529d0797b25bb49a20e94b62eb93e7276a from qemu	2018-12-18 05:56:58 -05:00
Richard Henderson	fdb3d6488e	tcg/optimize: Optimize bswap Somehow we forgot these operations, once upon a time. This will allow immediate stores to have their bswap optimized away. Backports commit 6498594c8eda83c5f5915afc34bd03396f8de6df from qemu	2018-12-18 05:49:29 -05:00
Richard Henderson	1bcbdc2f1b	tcg: Clean up generic bswap64 Based on the only current user, Sparc: New code uses 2 constants that take 2 insns to load from constant pool, plus 13. Old code used 6 constants that took 1 or 2 insns to create, plus 21. The result is a new total of 17 vs an old total of 29. Backports commit 9e821eab0ab708add35fa0446d880086e845ee3e from qemu	2018-12-18 05:48:05 -05:00
Richard Henderson	f68b4aa896	tcg: Clean up generic bswap32 Based on the only current user, Sparc: New code uses 1 constant that takes 2 insns to create, plus 8. Old code used 2 constants that took 2 insns to create, plus 9. The result is a new total of 10 vs an old total of 13. Backports commit a686dc71d89b1d7934becd95c843aa1375cdb7e7 from qemu	2018-12-18 05:46:27 -05:00
Richard Henderson	3b85c29bb9	tcg/i386: Assume 32-bit values are zero-extended We now have an invariant that all TCG_TYPE_I32 values are zero-extended, which means that we do not need to extend them again during qemu_ld/st, either explicitly via a separate tcg_out_ext32u or implicitly via P_ADDR32. Backports commit 4810d96f03be4d3820563e3c6bf13dfc0627f205 from qemu	2018-12-18 05:42:52 -05:00
Richard Henderson	b7b142ed79	tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests This preserves the invariant that all TCG_TYPE_I32 values are zero-extended in the 64-bit host register. Backports commit 75478279a0c1eafc7b69d5382356da138f58f1bd from qemu	2018-12-18 05:38:55 -05:00
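
The bswap32 clean-up listed above (f68b4aa896) describes the new sequence only by its cost: one constant plus 8 operations. The following is a minimal sketch in plain C of a mask-and-shift byte swap with that shape. It assumes the single constant is 0x00ff00ff, and the name bswap32_sketch is hypothetical, not a QEMU or Unicorn function. It illustrates the technique and is not the backported code.

```c
#include <stdint.h>

/* Hypothetical illustration: byte-swap a 32-bit value using one mask
 * constant and eight shift/and/or operations. Bytes are written a, b, c, d
 * from most to least significant; '.' marks a zero byte. */
static uint32_t bswap32_sketch(uint32_t arg)
{
    const uint32_t mask = 0x00ff00ffu;  /* the single constant (assumed) */
    uint32_t t0, t1, ret;

    t0 = arg >> 8;      /* .abc */
    t0 = t0 & mask;     /* .a.c */
    t1 = arg & mask;    /* .b.d */
    t1 = t1 << 8;       /* b.d. */
    ret = t0 | t1;      /* badc: bytes swapped within each half */

    t0 = ret >> 16;     /* ..ba */
    t1 = ret << 16;     /* dc.. */
    ret = t0 | t1;      /* dcba: halves swapped */

    return ret;
}
```

For example, bswap32_sketch(0x11223344) yields 0x44332211, and applying the function twice returns the original value.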

5540 commits