unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2024-12-26 02:05:40 +00:00

Author	SHA1	Message	Date
Richard Henderson	67f0af4282	tcg/aarch64: Allow immediates for vector ORR and BIC This allows immediates to be used for ORR and BIC, as well as the trivial inversions, ORC and AND. Backports commit 9e27f58b9902834dffc0d66d9eb62f78d9c2a632 from qemu	2019-05-24 18:47:07 -04:00
Richard Henderson	5ecfba4fe6	tcg/aarch64: Build vector immediates with two insns Use MOVI+ORR or MVNI+BIC in order to build some vector constants, as opposed to dropping them to the constant pool. This includes all 16-bit constants and a similar set of 32-bit constants. Backports commit 02f3a5b4744885258758d07ebe09cf965de78bcf from qemu	2019-05-24 18:43:54 -04:00
Richard Henderson	06058ef648	tcg/aarch64: Use MVNI in tcg_out_dupi_vec The complement of a subset of immediates can be computed with a single instruction. Backports commit 7e308e003e5b6ddd3130e09711e1d33693230696 from qemu	2019-05-24 18:42:40 -04:00
Richard Henderson	c18ec586dc	tcg/aarch64: Split up is_fimm There are several sub-classes of vector immediate, and only MOVI can use them all. This will enable usage of MVNI and ORRI, which use progressively fewer sub-classes. This patch adds no new functionality, merely splits the function and moves part of the logic into tcg_out_dupi_vec. Backports commit 984fdcee342473dfe797897758929dad654693c8 from qemu	2019-05-24 18:41:37 -04:00
Richard Henderson	0ea4c05dc3	tcg/aarch64: Support vector bitwise select value The instruction set has 3 insns that perform the same operation, only varying in which operand must overlap the destination. We can represent the operation without overlap and choose based on the operands seen. Backports commit a9e434a5dc16f71ee156428619fc3c3765b68f26 from qemu	2019-05-24 18:38:37 -04:00
Richard Henderson	de260cfbd6	tcg/aarch64: Do not advertise minmax for MO_64 The min/max instructions are not available for 64-bit elements. Backports commit a7b6d286cfb5205b9f5330aefc5727269b3d810f from qemu	2019-05-16 16:44:34 -04:00
Richard Henderson	7c9b3a9021	tcg/aarch64: Support vector absolute value Backports commit a456394ae540f852cd0d10fd693fe9f33598dc01 from qemu	2019-05-16 16:39:14 -04:00
Richard Henderson	0217ee7b24	tcg/aarch64: Support vector variable shift opcodes Backports commit 79525dfd08262d8de10d271f17e5a4096ef96d16 from qemu	2019-05-16 15:58:54 -04:00
Richard Henderson	66e6bea084	tcg: Add INDEX_op_dupm_vec Allow the backend to expand dup from memory directly, instead of forcing the value into a temp first. This is especially important if integer/vector register moves do not exist. Note that officially tcg_out_dupm_vec is allowed to fail. If it did, we could fix this up relatively easily: VECE == 32/64: Load the value into a vector register, then dup. Both of these must work. VECE == 8/16: If the value happens to be at an offset such that an aligned load would place the desired value in the least significant end of the register, go ahead and load w/garbage in high bits. Load the value w/INDEX_op_ld{8,16}_i32. Attempt a move directly to vector reg, which may fail. Store the value into the backing store for OTS. Load the value into the vector reg w/TCG_TYPE_I32, which must work. Duplicate from the vector reg into itself, which must work. All of which is well and good, except that all supported hosts can support dupm for all vece, so all of the failure paths would be dead code and untestable. Backports commit 37ee55a081b7863ffab2151068dd1b2f11376914 from qemu	2019-05-16 15:38:02 -04:00
Richard Henderson	fd7a67e4a7	tcg/aarch64: Implement tcg_out_dupm_vec The LD1R instruction does all the work. Note that the only useful addressing mode is a base register with no offset. Backports commit f23e5e15edfd49d5dd72cab2ed2d85ac354b2eeb from qemu	2019-05-16 15:29:04 -04:00
Richard Henderson	d4e7c6a8c5	tcg: Add tcg_out_dupm_vec to the backend interface Currently stubbed out in all backends that support vectors. Backports commit d6ecb4a978b718dbe108a9fa9ecccc8b7f7cb579 from qemu	2019-05-16 15:24:48 -04:00
Richard Henderson	cf238d3544	tcg: Manually expand INDEX_op_dup_vec This case is similar to INDEX_op_mov_* in that we need to do different things depending on the current location of the source. Backports commit bab1671f0fa928fd678a22f934739f06fd5fd035 from qemu	2019-05-16 15:22:29 -04:00
Richard Henderson	3d20e1678c	tcg: Promote tcg_out_{dup,dupi}_vec to backend interface The i386 backend already has these functions, and the aarch64 backend could easily split out one. Nothing is done with these functions yet, but this will aid register allocation of INDEX_op_dup_vec in a later patch. Adjust the aarch64 tcg_out_dupi_vec signature to match the new interface. Backports commit e7632cfa8b76cdbbc1c76e8737338ef5844e7d60 from qemu	2019-05-16 15:18:48 -04:00
Richard Henderson	f86bd1c5d6	tcg: Return bool success from tcg_out_mov This patch merely changes the interface, aborting on all failures, of which there are currently none. Backports commit 78113e83e0007e869c9f0cb4c0497a77538988e3 from qemu	2019-05-16 15:14:42 -04:00
Richard Henderson	6145e3fdd7	tcg: Restart TB generation after out-of-line ldst overflow This is part c of relocation overflow handling. Backports commit aeee05f53a5d67304a521d2644dc0a607e3c8b28 from qemu	2019-04-30 10:06:53 -04:00
Richard Henderson	2479bbd3b2	tcg/aarch64: Support INDEX_op_extract2_{i32,i64} Backports commit 464c2969d5d7a0a5d38d2aa5d930986df876d3fb from qemu	2019-04-30 09:40:40 -04:00
Richard Henderson	3f0781e39b	tcg/aarch64: Implement vector minmax arithmetic Backports commit 93f332a50371936ea02392bdb748c8140ef3f06a from qemu	2019-01-29 16:44:09 -05:00
Richard Henderson	8a012c3929	tcg/aarch64: Implement vector saturating arithmetic Backports commit d32648d445c534cea7e2ad7ed8608208aa8831c1 from qemu	2019-01-29 16:42:50 -05:00
Richard Henderson	a22387f919	tcg/aarch64: Return false on failure from patch_reloc This does require an extra two checks within the slow paths to replace the assert that we're moving. Backports commit 214bfe83d5a5af70bac2b8d0bd649b018c33c03b from qemu	2018-12-18 05:28:45 -05:00
Richard Henderson	46189d87b3	tcg: Return success from patch_reloc This will move the assert for success from within (subroutines of) patch_reloc into the callers. It will also let new code do something different when a relocation is out of range. For the moment, all backends are trivially converted to return true. Backports commit 6ac1778676f4259c10b0629ccd9df319a5d1baeb from qemu	2018-12-18 05:25:45 -05:00
Richard Henderson	0a8bc142d3	tcg/aarch64: Fold away noaddr branch routines There is one use apiece for these. There is no longer a need for preserving branch offset operands, as we no longer re-translate. Backports commit 733589b3382afcb0ae9f43e72e083a5ddd38abd5 from qemu	2018-12-18 05:15:41 -05:00
Richard Henderson	cbe1065e83	tcg/aarch64: Remove reloc_pc26_atomic It is unused since b68686bd4bfeb70040b4099df993dfa0b4f37b03. Backports commit 90d6cb781130891f96eb54f8315e29fbd4e99a71 from qemu	2018-12-18 05:14:22 -05:00
Alex Bennée	11948dd1cc	tcg/aarch64: limit mul_vec size In AdvSIMD we can only do 32x32 integer multiplies although SVE is capable of larger 64 bit multiplies. As a result we can end up generating invalid opcodes. Fix this by only reporting we can emit mul vector ops if the size is small enough. Fixes a crash on: sve-all-short-v8.3+sve@vq3/insn_mul_z_zi___INC.risu.bin When running on AArch64 hardware. Backports commit e65a5f227d77a5dbae7a7123c3ee915ee4bd80cf from qemu	2018-07-21 14:15:59 -04:00
Richard Henderson	d1da0b8f6d	tcg/aarch64: Add vector operations Backports commit 14e4c1e2355473ccb2939afc69ac8f25de103b92 from qemu	2018-03-07 08:07:58 -05:00
Richard Henderson	47ed20fdd4	tcg/aarch64: Fully convert tcg_target_op_def Backports commit 1897cc2eb8be2d8be23380b45a2d3c1a2808723f from qemu	2018-03-04 23:46:38 -05:00
Richard Henderson	fc8b4316a9	tcg: Remove tcg_regset_set32 It's not even clear what the interface REG and VAL32 were supposed to mean. All uses had REG = 0 and VAL32 was the bitset assigned to the destination. Backports commit f46934df662182097dce07d57ec00f37e4d2abf1 from qemu	2018-03-04 23:42:59 -05:00
Richard Henderson	49d09d6888	tcg: Remove tcg_regset_clear Backports commit ccb1bb66ea2a42e773bfa04178d8b383ff86d4d8 from qemu	2018-03-04 23:24:45 -05:00
Richard Henderson	0c3781e7eb	tcg/aarch64: Use constant pool for movi Backports commit 55129955e92ec164ee2d778f20070dc214109bc6 from qemu	2018-03-04 22:46:50 -05:00
Richard Henderson	f96514a99c	tcg: Rearrange ldst label tracking Dispense with TCGBackendData, as it has never been used for more than holding a single pointer. Use a define in the cpu/tcg-target.h to signal requirement for TCGLabelQemuLdst, so that we can drop the no-op tcg-be-null.h stubs. Rename tcg-be-ldst.h to tcg-ldst.inc.c. Backports commit 659ef5cbb893872d25e9d95191cc23b16546c8a1 from qemu	2018-03-04 22:13:13 -05:00
Richard Henderson	31b8b67cd3	tcg: Move USE_DIRECT_JUMP discriminator to tcg/cpu/tcg-target.h Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump boolean test. Replace the tb_set_jmp_target1 ifdef with an unconditional function tb_target_set_jmp_target. While we're touching all backends, add a parameter for tb->tc_ptr; we're going to need it shortly for some backends. Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c. Backports commit a85833933628384d74ec412024d55cf012640287 from qemu	2018-03-04 21:52:35 -05:00
Pranith Kumar	57f8eec080	tcg/aarch64: Enable indirect jump path using LDR (literal) This patch enables the indirect jump path using an LDR (literal) instruction. It will be interesting to test and see which performs better among the two paths. Backports commit 2acee8b2b5e6bba2935bb6ce5be92d0f0f9799cb from qemu	2018-03-03 22:03:39 -05:00
Pranith Kumar	5e9e39cafd	tcg/aarch64: Use ADRP+ADD to compute target address We use ADRP+ADD to compute the target address for goto_tb. This patch introduces the NOP instruction which is used to align the above instruction pair so that we can use one atomic instruction to patch the destination offsets. Backports commit b68686bd4bfeb70040b4099df993dfa0b4f37b03 from qemu	2018-03-03 22:01:38 -05:00
Pranith Kumar	0998ba8259	tcg/aarch64: Introduce and use long branch to register We can use a branch to register instruction for exit_tb for offsets greater than 128MB. Backports commit 23b7aa1d2af04ba57cc94f74d9f0ab25dce72fa0 from qemu	2018-03-03 21:59:58 -05:00
Richard Henderson	9a85cb0a26	tcg/aarch64: Use ADR in tcg_out_movi The new placement of the TB means that we can use one insn to load the return value for exit_tb returning the TB pointer. Backports commit cc74d332ff9a78684374847375ef63fc4bd10436 from qemu	2018-03-03 17:09:42 -05:00
Richard Henderson	81f1aae572	tcg/aarch64: Implement goto_ptr Measurements: SPECint06 (test set), x86_64-linux-user. Host: APM 64-bit ARMv8 (Atlas/A57) @ 2.4 GHz [Bar chart: speedup with goto-ptr, y-axis from 1x to 1.45x, for astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, and hmean] png: http://imgur.com/en9HE8L Backports commit b19f0c2e7d344d4d62daf554951acdb6c94a34b0 from qemu	2018-03-03 14:13:09 -05:00
Pranith Kumar	ee609fa59f	aarch64: Change ext type to TCGType to fix warnings To fix the following warnings: In file included from /users/pranith/qemu/tcg/tcg.c:255: /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:879:24: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType') [-Wenum-conversion] tcg_out_cmp(s, ext, a, b, b_const); ~~~~~~~~~~~ ^~~ /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:893:36: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType') [-Wenum-conversion] tcg_out_insn(s, 3201, CBZ, ext, a, offset); ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~ /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:389:65: note: expanded from macro 'tcg_out_insn' glue(tcg_out_insn_,FMT)(S, glue(glue(glue(I,FMT),_),OP), ## __VA_ARGS__) ^ /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:895:37: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType') [-Wenum-conversion] tcg_out_insn(s, 3201, CBNZ, ext, a, offset); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~ /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:389:65: note: expanded from macro 'tcg_out_insn' glue(tcg_out_insn_,FMT)(S, glue(glue(glue(I,FMT),_),OP), ## __VA_ARGS__) ^ /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:1610:27: warning: implicit conversion from enumeration type 'TCGType' (aka 'enum TCGType') to different enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') [-Wenum-conversion] tcg_out_brcond(s, ext, a2, a0, a1, const_args[1], arg_label(args[3])); ~~~~~~~~~~~~~~ ^~~ backports commit dc1eccd661ada3b746ca4438e444993c36a0f04f from qemu	2018-03-02 10:48:56 -05:00
Richard Henderson	2b87ddda35	tcg/aarch64: Handle ctz and clz opcodes Backports commit 53c76c19904983d2c81e4f5e77027c241918a479 from qemu	2018-03-01 16:19:34 -05:00
Richard Henderson	3f38611159	tcg: Pass the opcode width to target_parse_constraint This will let us choose how to interpret a given constraint depending on whether the opcode is 32- or 64-bit. Which will let us share more constraint combinations between opcodes. At the same time, change the interface to return the advanced pointer instead of passing it in/out by reference. Backports commit 069ea736b50b75fdec99c9b8cc603b97bd98419e from qemu	2018-03-01 15:45:40 -05:00
Richard Henderson	b8c93597b4	tcg: Transition flat op_defs array to a target callback This will allow the target to tailor the constraints to the auto-detected ISA extensions. Backports commit f69d277ece43c42c7ab0144c2ff05ba740f6706b from qemu	2018-03-01 15:40:11 -05:00
Richard Henderson	fbea4130fc	tcg/aarch64: Implement field extraction opcodes Backports commit e2179f94a17bf0933df29ce1b4f6bc93cbe7dbd3 from qemu	2018-03-01 13:30:55 -05:00
Richard Henderson	6820964e2f	tcg/aarch64: Fix tcg_out_movi There were some patterns, like 0x0000_ffff_ffff_00ff, for which we would select to begin a multi-insn sequence with MOVN, but would fail to set the 0x0000 lane back from 0xffff. Backports commit 50b468d42107a2c646b1c566ed17d9ec362c51c4 from qemu	2018-03-01 09:15:34 -05:00
Richard Henderson	a03666f2f2	tcg/aarch64: Fix addsub2 for 0+C When al == xzr, we cannot use addi/subi because that encodes xsp. Force a zero into the temp register for that (rare) case. Backports commit 028fbea47713f909d6ea761a457779a82b276247 from qemu	2018-03-01 09:13:54 -05:00
Pranith Kumar	907060b865	tcg/aarch64: Add support for fence Backports commit c7a59c2a92592e556b9361437c9c4229917bd1e3 from qemu	2018-02-26 03:11:03 -05:00
Richard Henderson	91f5cf0417	tcg: Support arbitrary size + alignment Previously we allowed fully unaligned operations, but not operations that are aligned but with less alignment than the operation size. In addition, arm32, ia64, mips, and sparc had been omitted from the previous overalignment patch, which would have led to that alignment being enforced. Backports commit 85aa80813dd9f5c1f581c743e45678a3bee220f8 from qemu	2018-02-26 02:47:26 -05:00
Sergey Sorokin	e4d123caa9	tcg: Improve the alignment check infrastructure Some architectures (e.g. ARMv8) need the address to be aligned to a size larger than the size of the memory access. The current costless alignment check implementation in QEMU is enough to support such a check, but we need to support specifying an alignment size. Backports commit 1f00b27f17518a1bcb4cedca49eaec96a4d560bd from qemu	2018-02-25 02:23:28 -05:00
Richard Henderson	23586e2674	tcg: Optimize spills of constants While we can store constants via constraints on INDEX_op_st_i32 et al, we weren't able to spill constants to backing store. Add a new backend interface, tcg_out_sti, which may store the constant (and is allowed to fail). Rearrange the temp_* helpers so that we only attempt to directly store a constant when the temp is becoming dead/free. Backports commit 59d7c14eeff8d2ad7f61aed86ce5a176113bc153 from qemu	2018-02-25 01:45:29 -05:00
Sergey Fedorov	e60c24cecf	tcg: Clean up direct block chaining data fields Briefly describe in a comment how direct block chaining is done. It should help in understanding of the following data fields. Rename some fields in TranslationBlock and TCGContext structures to better reflect their purpose (dropping excessive 'tb_' prefix in TranslationBlock but keeping it in TCGContext): tb_next_offset => jmp_reset_offset tb_jmp_offset => jmp_insn_offset tb_next => jmp_target_addr jmp_next => jmp_list_next jmp_first => jmp_list_first Avoid using a magic constant as an invalid offset which is used to indicate that there's no n-th jump generated. Backports commit f309101c26b59641fc1aa8fb2a98a5441cdaea03 from qemu	2018-02-23 21:28:19 -05:00
Sergey Fedorov	a45f8cb49d	tcg/aarch64: Make direct jump patching thread-safe Ensure direct jump patching in AArch64 is atomic by using atomic_read()/atomic_set() for code patching. Backports commit 9e269112953be4d670cb0d25042bd6546fcf3e45 from qemu	2018-02-23 21:28:18 -05:00
Aurelien Jarno	6060ab6596	tcg: check for CONFIG_DEBUG_TCG instead of NDEBUG Check for CONFIG_DEBUG_TCG instead of NDEBUG, drop now useless code. Backports commit 8d8fdbae010aa75a23f0307172e81034125aba6e from qemu	2018-02-23 13:55:21 -05:00
Aurelien Jarno	355ed7cd08	tcg: use tcg_debug_assert instead of assert (fix performance regression) The TCG code is quite performance sensitive, but at the same time can also be quite tricky. That is why it has asserts that can be enabled with the --enable-debug-tcg configure option. This used to work the following way: \| #include "config.h" \| \| ... \| \| #if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG) \| /* define it to suppress various consistency checks (faster) */ \| #define NDEBUG \| #endif \| \| ... \| \| #include <assert.h> Since commit 757e725b (tcg: Clean up includes) "config.h" has been replaced by "qemu/osdep.h" which itself includes <assert.h>. As a consequence the assertions are always enabled, even when using --disable-debug-tcg, causing a performance regression, especially on targets with many registers. For instance on qemu-system-ppc the speed difference is about 15%. tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and is already used in some places. This patch replaces all the calls to assert with calls to tcg_debug_assert. Backports commit eabb7b91b36b202b4dac2df2d59d698e3aff197a from qemu	2018-02-23 13:52:13 -05:00
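
The last entry above (355ed7cd08) describes moving from assert, whose behaviour depends on when NDEBUG is defined relative to the first include of <assert.h>, to a macro controlled directly by a configuration symbol. A minimal sketch of that idea follows. It is not QEMU's actual definition of tcg_debug_assert: the names SKETCH_DEBUG_TCG, sketch_debug_assert, sketch_counted and sketch_sum_regs are hypothetical, and the disabled variant here simply compiles the check away.

```c
#include <assert.h>

/* Hypothetical switch standing in for CONFIG_DEBUG_TCG; uncomment to enable checks. */
/* #define SKETCH_DEBUG_TCG 1 */

#ifdef SKETCH_DEBUG_TCG
#define sketch_debug_assert(X) assert(X)
#else
/* Check compiled out: X is never evaluated, no matter how NDEBUG is set. */
#define sketch_debug_assert(X) ((void)0)
#endif

/* Counts how many times a check expression was actually evaluated. */
int sketch_checks_evaluated;

/* Hypothetical helper: records that a check ran, then passes the condition through. */
int sketch_counted(int cond)
{
    sketch_checks_evaluated++;
    return cond;
}

/* Hot-path style loop: sums n values with one debug-only check per iteration. */
int sketch_sum_regs(const int *regs, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        sketch_debug_assert(sketch_counted(regs[i] >= 0));
        total += regs[i];
    }
    return total;
}
```

With the switch left undefined, the per-iteration check costs nothing and sketch_checks_evaluated stays at zero. Defining SKETCH_DEBUG_TCG turns every check back on without depending on the order in which headers are included.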


52 commits