unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2024-12-23 18:55:49 +00:00

Author	SHA1	Message	Date
Richard Henderson	e3356f9bad	tcg: Drop union from TCGArgConstraint The union is unused; let "regs" appear in the main structure without the "u.regs" wrapping. Backports 9be0d08019465b38e2f1a605960961a491430c21	2021-03-01 19:29:19 -05:00
Richard Henderson	6b91e9bae1	tcg/i386: Implement INDEX_op_rotl{i,s,v}_vec For immediates, we must continue the special casing of 8-bit elements. The other element sizes and shift types are trivially implemented with shifts. Backports commit 885b1706df6f0211a22e120fac910fb3abf3e733 from qemu	2020-06-14 22:09:24 -04:00
Richard Henderson	cc3187b1e4	tcg: Implement gvec support for rotate by scalar No host backend support yet, but the interfaces for rotls are in place. Only implement left-rotate for now, as the only known use of vector rotate by scalar is s390x, so any right-rotate would be unused and untestable. Backports commit 23850a74afb641102325b4b7f74071d929fc4594 from qemu	2020-06-14 22:00:50 -04:00
Richard Henderson	be78062fd8	tcg: Implement gvec support for rotate by vector No host backend support yet, but the interfaces for rotlv and rotrv are in place. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- v3: Drop the generic expansion from rot to shift; we can do better for each backend, and then this code becomes unused. Backports commit 5d0ceda902915e3f0e21c39d142c92c4e97c3ebb from qemu	2020-06-14 21:43:46 -04:00
Richard Henderson	5cce52a04b	tcg: Implement gvec support for rotate by immediate No host backend support yet, but the interfaces for rotli are in place. Canonicalize immediate rotate to the left, based on a survey of architectures, but provide both left and right shift interfaces to the translators. Backports commit b0f7e7444c03da17e41bf327c8aea590104a28ab from qemu	2020-06-14 21:26:58 -04:00
Richard Henderson	299ba4e867	tcg/i386: Fix INDEX_op_dup2_vec We were only constructing the 64-bit element, and not replicating the 64-bit element across the rest of the vector. Backports commit e20cb81d9c5a3d0f9c08f3642728a210a1c162c9 from qemu	2020-04-30 07:15:08 -04:00
Richard Henderson	b358f771f6	tcg/i386: Bound shift count expanding sari_vec A given RISU testcase for SVE can produce tcg-op-vec.c:511: do_shifti: Assertion `i >= 0 && i < (8 << vece)' failed. because expand_vec_sari gave a shift count of 32 to a MO_32 vector shift. In 44f1441dbe1, we changed from direct expansion of vector opcodes to re-use of the tcg expanders. So while the comment correctly notes that the hw will handle such a shift count, we now have to take our own sanity checks into account. Which is easy in this particular case. Fixes: 44f1441dbe1 Backports commit 312b426fea4d6dd322d7472c80010a8ba7a166d2 from qemu	2020-04-30 06:26:42 -04:00
Tony Nguyen	f75368cd0f	tcg: TCGMemOp is now accelerator independent MemOp Preparation for collapsing the two byte swaps, adjust_endianness and handle_bswap, along the I/O path. Target dependent attributes are conditionalized upon NEED_CPU_H. Backports commit 14776ab5a12972ea439c7fb2203a4c15a09094b4 from qemu	2019-11-28 03:01:12 -05:00
Richard Henderson	c79510378f	tcg/i386: Use umin/umax in expanding unsigned compare Using umin(a, b) == a as an expansion for TCG_COND_LEU is a better alternative to (a - INT_MIN) <= (b - INT_MIN). Backports commit ebcfb91abed8c0fb180a968b9004419c208dcc02 from qemu	2019-05-24 18:36:32 -04:00
Richard Henderson	ffdbc1a233	tcg/i386: Remove expansion for missing minmax This is now handled by code within tcg-op-vec.c. Backports commit 3ec3538a45f2fead475b0cca6945092c87927b4f from qemu	2019-05-24 18:34:44 -04:00
Richard Henderson	68cb096196	tcg/i386: Support vector comparison select value We already had backend support for this feature. Expand the new cmpsel opcode using vpblendb. The combination allows us to avoid an extra NOT for some comparison codes. Backports commit 904c5e19672778cc3349f4975437cfdf3371abb6 from qemu	2019-05-24 18:33:16 -04:00
Richard Henderson	2ea6dfbd63	tcg: Add support for vector compare select Perform a per-element conditional move. This combination operation is easier to implement on some host vector units than plain cmp+bitsel. Omit the usual gvec interface, as this is intended to be used by target-specific gvec expansion call-backs. Backports commit f75da2988eb2457fa23d006d573220c5c680ec4e from qemu	2019-05-24 18:21:13 -04:00
Richard Henderson	ca58be9cb4	tcg: Add support for vector bitwise select This operation performs d = (b & a) | (c & ~a), and is present on a majority of host vector units. Include gvec expanders. Backports commit 38dc12947ec9106237f9cdbd428792c985cd86ae from qemu	2019-05-24 18:15:10 -04:00
Richard Henderson	60cfe541b2	tcg/i386: Fix dupi/dupm for avx1 and 32-bit hosts The VBROADCASTSD instruction only allows %ymm registers as destination. Rather than forcing VEX.L and writing to the entire 256-bit register, revert to using MOVDDUP with an %xmm register. This is sufficient for an avx1 host since we do not support TCG_TYPE_V256 for that case. Also fix the 32-bit avx2, which should have used VPBROADCASTW. Fixes: 1e262b49b533 Backports commit 7b60ef3264e9627ac6efb34e9a6130647e9b55c0 from qemu	2019-05-24 18:04:08 -04:00
Lioncash	fcaa52c1fe	tcg: Synchronize with qemu Resolves any formatting discrepancies and bad merges that slipped through.	2019-05-16 18:11:08 -04:00
Richard Henderson	fd35490991	tcg/i386: Support vector absolute value Backports commit 18f9b65f1a4225dd314cb9b0a8dea968c5bc2ef3 from qemu	2019-05-16 16:37:33 -04:00
Richard Henderson	6d5e7856ff	tcg: Add support for vector absolute value Backports commit bcefc90208f8a1d6f619d61c2647281d92277015 from qemu	2019-05-16 16:33:43 -04:00
Richard Henderson	18b3df6e4e	tcg/i386: Support vector scalar shift opcodes Backports commit 0a8d7a3bf5a149a82450eef555fd61728703dd84 from qemu	2019-05-16 16:19:44 -04:00
Richard Henderson	f793ec847d	tcg/i386: Support vector variable shift opcodes Backports commit a2ce146a06807fe1d1a81e878b8f249ff1e14038 from qemu	2019-05-16 15:53:33 -04:00
Richard Henderson	66e6bea084	tcg: Add INDEX_op_dupm_vec Allow the backend to expand dup from memory directly, instead of forcing the value into a temp first. This is especially important if integer/vector register moves do not exist. Note that officially tcg_out_dupm_vec is allowed to fail. If it did, we could fix this up relatively easily: VECE == 32/64: Load the value into a vector register, then dup. Both of these must work. VECE == 8/16: If the value happens to be at an offset such that an aligned load would place the desired value in the least significant end of the register, go ahead and load w/garbage in high bits. Load the value w/INDEX_op_ld{8,16}_i32. Attempt a move directly to vector reg, which may fail. Store the value into the backing store for OTS. Load the value into the vector reg w/TCG_TYPE_I32, which must work. Duplicate from the vector reg into itself, which must work. All of which is well and good, except that all supported hosts can support dupm for all vece, so all of the failure paths would be dead code and untestable. Backports commit 37ee55a081b7863ffab2151068dd1b2f11376914 from qemu	2019-05-16 15:38:02 -04:00
Richard Henderson	a6fd4e2345	tcg/i386: Implement tcg_out_dupm_vec At the same time, improve tcg_out_dupi_vec wrt broadcast from the constant pool. Backports commit 1e262b49b5331441f697461e4305fe06719758a7 from qemu	2019-05-16 15:27:15 -04:00
Richard Henderson	d4e7c6a8c5	tcg: Add tcg_out_dupm_vec to the backend interface Currently stubbed out in all backends that support vectors. Backports commit d6ecb4a978b718dbe108a9fa9ecccc8b7f7cb579 from qemu	2019-05-16 15:24:48 -04:00
Richard Henderson	cf238d3544	tcg: Manually expand INDEX_op_dup_vec This case is similar to INDEX_op_mov_* in that we need to do different things depending on the current location of the source. Backports commit bab1671f0fa928fd678a22f934739f06fd5fd035 from qemu	2019-05-16 15:22:29 -04:00
Richard Henderson	3d20e1678c	tcg: Promote tcg_out_{dup,dupi}_vec to backend interface The i386 backend already has these functions, and the aarch64 backend could easily split out one. Nothing is done with these functions yet, but this will aid register allocation of INDEX_op_dup_vec in a later patch. Adjust the aarch64 tcg_out_dupi_vec signature to match the new interface. Backports commit e7632cfa8b76cdbbc1c76e8737338ef5844e7d60 from qemu	2019-05-16 15:18:48 -04:00
Richard Henderson	f86bd1c5d6	tcg: Return bool success from tcg_out_mov This patch merely changes the interface, aborting on all failures, of which there are currently none. Backports commit 78113e83e0007e869c9f0cb4c0497a77538988e3 from qemu	2019-05-16 15:14:42 -04:00
Richard Henderson	6145e3fdd7	tcg: Restart TB generation after out-of-line ldst overflow This is part c of relocation overflow handling. Backports commit aeee05f53a5d67304a521d2644dc0a607e3c8b28 from qemu	2019-04-30 10:06:53 -04:00
Richard Henderson	0f20a26b36	tcg/i386: Support INDEX_op_extract2_{i32,i64} Backports commit c6fb8c0cf704c4a1a48c3e99e995ad4c58150dab from qemu	2019-04-30 09:37:39 -04:00
Richard Henderson	269fa0daba	tcg: Add INDEX_op_extract2_{i32,i64} This will let backends implement the double-word shift operation. Backports commit fce1296f135669eca85dc42154a2a352c818ad76 from qemu	2019-04-30 09:29:05 -04:00
Lioncash	96c52ea053	tcg: Synchronize with qemu	2019-04-22 02:03:01 -04:00
Mark Cave-Ayland	576df55076	tcg/i386: fix unsigned vector saturating arithmetic Due to a cut/paste error in the original implementation, the unsigned vector saturating arithmetic was erroneously being calculated as signed vector saturating arithmetic. Fixes: 8ffafbcec2 ("tcg/i386: Implement vector saturating arithmetic") Backports commit 3115584d39afe8cf2a84a40549029f53792abca5 from qemu	2019-02-12 11:37:12 -05:00
Richard Henderson	63d1aae6b2	tcg/i386: Implement vector minmax arithmetic The avx instruction set does not directly provide MO_64. We can still implement 64-bit with comparison and vpblendvb. Backports commit bc37faf4cb2baa77c44298c01558970b88d32808 from qemu	2019-01-29 16:41:11 -05:00
Richard Henderson	5518b543ed	tcg/i386: Implement vector saturating arithmetic Only MO_8 and MO_16 are implemented, since that's all the instruction set provides. Backports commit 8ffafbcec275e61f6a1a17ac1d0bd918d5b23db3 from qemu	2019-01-29 16:37:55 -05:00
Richard Henderson	24e65f60ed	tcg/i386: Split subroutines out of tcg_expand_vec_op This routine was becoming too large. Backports commit 44f1441dbe14e7174a707d7e7ecbc2c8e080bfda from qemu	2019-01-29 16:33:59 -05:00
Richard Henderson	fb684825c8	tcg: Add opcodes for vector minmax arithmetic Backports commit dd0a0fcdd8848c2a18970c44a62bd8f394c2b495 from qemu	2019-01-29 16:24:52 -05:00
Richard Henderson	e0266239ea	tcg: Add opcodes for vector saturated arithmetic Backports commit 8afaf0506606f8003ef696df849c5a98637a7a83 from qemu	2019-01-29 16:14:34 -05:00
Richard Henderson	5c4e852c6e	tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP For now, defined universally as true, since we previously required backends to implement swapped memory operations. Future patches may now remove that support where it is onerous. Backports commit e1dcf3529d0797b25bb49a20e94b62eb93e7276a from qemu	2018-12-18 05:56:58 -05:00
Richard Henderson	3b85c29bb9	tcg/i386: Assume 32-bit values are zero-extended We now have an invariant that all TCG_TYPE_I32 values are zero-extended, which means that we do not need to extend them again during qemu_ld/st, either explicitly via a separate tcg_out_ext32u or implicitly via P_ADDR32. Backports commit 4810d96f03be4d3820563e3c6bf13dfc0627f205 from qemu	2018-12-18 05:42:52 -05:00
Richard Henderson	b7b142ed79	tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests This preserves the invariant that all TCG_TYPE_I32 values are zero-extended in the 64-bit host register. Backports commit 75478279a0c1eafc7b69d5382356da138f58f1bd from qemu	2018-12-18 05:38:55 -05:00
Richard Henderson	4e882a95f3	tcg/i386: Propagate is64 to tcg_out_qemu_ld_slow_path This helps preserve the invariant that all TCG_TYPE_I32 values are stored zero-extended in the 64-bit host registers. Backports commit 3dbc8c61de4e0d0a2afe0897cda7ab28cd37a164 from qemu	2018-12-18 05:36:58 -05:00
Richard Henderson	bdd6118105	tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct This helps preserve the invariant that all TCG_TYPE_I32 values are stored zero-extended in the 64-bit host registers. Backports commit 1d21d95b6101786d44d3b4a12400eb80a1ecc647 from qemu	2018-12-18 05:35:34 -05:00
Richard Henderson	fc86fd34ff	tcg/i386: Return false on failure from patch_reloc Backports commit bec3afd5fc6ab0b6e9d8a01575d58db8d1ad82ce from qemu	2018-12-18 05:27:14 -05:00
Richard Henderson	46189d87b3	tcg: Return success from patch_reloc This will move the assert for success from within (subroutines of) patch_reloc into the callers. It will also let new code do something different when a relocation is out of range. For the moment, all backends are trivially converted to return true. Backports commit 6ac1778676f4259c10b0629ccd9df319a5d1baeb from qemu	2018-12-18 05:25:45 -05:00
Richard Henderson	091b4fa1ff	tcg/i386: Move TCG_REG_CALL_STACK from define to enum Backports commit 66c0285df4270d184afce5ac8b97ac175c89562f from qemu	2018-12-18 05:13:47 -05:00
Richard Henderson	f3a8a4a306	tcg/i386: Always use %ebp for TCG_AREG0 For x86_64, this can remove a REX prefix resulting in smaller code when manipulating globals of type i32, as we move them between backing store via cpu_env, aka TCG_AREG0. Backports commit 5740d9f714835964873325d1210b26811252843f from qemu	2018-12-18 05:13:05 -05:00
Roman Kapl	33e69342e3	tcg/i386: fix vector operations on 32-bit hosts The TCG backend uses LOWREGMASK to get the low 3 bits of register numbers. This was defined as no-op for 32-bit x86, with the assumption that we have eight registers anyway. This assumption is not true once we have xmm regs. Since LOWREGMASK was a no-op, xmm register indices were wrong in opcodes and have overflown into other opcode fields, wreaking havoc. To trigger these problems, you can try running the "movi d8, #0x0" AArch64 instruction on 32-bit x86. "vpxor %xmm0, %xmm0, %xmm0" should be generated, but instead TCG generated "vpxor %xmm0, %xmm0, %xmm2". Fixes: 770c2fc7bb ("Add vector operations") Backports commit 93bf9a42733321fb632bcb9eafd049ef0e3d9417 from qemu	2018-10-02 04:22:35 -04:00
Richard Henderson	a4c2dbef3e	tcg/i386: Mark xmm registers call-clobbered When host vector registers and operations were introduced, I failed to mark the registers call clobbered as required by the ABI. Fixes: 770c2fc7bb7 Backports commit 672189cd586ea38a2c1d8ab91eb1f9dcff5ceb05 from qemu	2018-07-23 20:00:26 -04:00
John Arbuckle	22c3206738	tcg/i386: Use byte form of xgetbv instruction The assembler in most versions of Mac OS X is pretty old and does not support the xgetbv instruction. To go around this problem, the raw encoding of the instruction is used instead. Backports commit 1019242af11400252f6735ca71a35f81ac23a66d from qemu	2018-06-28 13:23:32 -05:00
Richard Henderson	33f7f6f09a	tcg/i386: Fix dup_vec in non-AVX2 codepath The VPUNPCKLD* instructions are all "non-destructive source", indicated by "NDS" in the encoding string in the x86 ISA manual. This means that they take two source operands, one of which is encoded in the VEX.vvvv field. We were incorrectly treating them as if they were destructive-source and passing 0 as the 'v' argument of tcg_out_vex_modrm(). This meant we were always using %xmm0 as one of the source operands, causing incorrect results if the register allocator happened to want to use something else. For instance the input AArch64 insn: DUP v26.16b, w21 which becomes TCG IR ops: dup_vec v128,e8,tmp2,x21 st_vec v128,e8,tmp2,env,$0xa40 was assembled to: 0x607c568c: c4 c1 7a 7e 86 e8 00 00 vmovq 0xe8(%r14), %xmm0 0x607c5694: 00 0x607c5695: c5 f9 60 c8 vpunpcklbw %xmm0, %xmm0, %xmm1 0x607c5699: c5 f9 61 c9 vpunpcklwd %xmm1, %xmm0, %xmm1 0x607c569d: c5 f9 70 c9 00 vpshufd $0, %xmm1, %xmm1 0x607c56a2: c4 c1 7a 7f 8e 40 0a 00 vmovdqu %xmm1, 0xa40(%r14) 0x607c56aa: 00 when the vpunpcklwd insn should be "%xmm1, %xmm1, %xmm1". This resulted in our incorrectly setting the output vector to q26=0000320000003200:0000320000003200 when given an input of x21 == 0000000002803200 rather than the expected all-zeroes. Pass the correct source register number to tcg_out_vex_modrm() for these insns. Backports commit 7eb30ef0ba2eb59e7430d4848ae8d4bf4e50f768 from qemu	2018-05-11 11:22:38 -04:00
Lioncash	6bdfeb35ec	tcg/i386: Perform comparison pass against qemu Ensures formatting and code are consistent.	2018-03-20 06:29:06 -04:00
Richard Henderson	2310bd4887	tcg/i386: Support INDEX_op_dup2_vec for -m32 Unknown why -m32 was passing with gcc but not clang; it should have failed for both. This would be used for tcg_gen_dup_i64_vec, and visible with the right TB and an aarch64 guest. Backports commit 7f34ed4bcdfda55f978f51aadca64aa970c9f4b6 from qemu	2018-03-17 20:22:24 -04:00


106 commits