unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2025-03-28 11:26:53 +00:00


Peter Maydell edf81eb214 target/arm: Convert VFP VMLA to decodetree

Convert the VFP VMLA instruction to decodetree.

This is the first of the VFP 3-operand data processing instructions, so we include in this patch the code which loops over the elements for an old-style VFP vector operation. The existing code to do this looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since we are going to be converting instructions one at a time anyway, we can take the opportunity to make the new loop use TCG temporaries, which means we can do that conversion one operation at a time rather than needing to do it all in one go.

We include an UNDEF check which was missing in the old code: short-vector operations (with stride or length non-zero) were deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec field does not indicate that support for short vectors is present, we UNDEF the operations that would use them. (This is a change of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which previously were all incorrectly allowing short-vector operations.)

Note that the conversion fixes a bug in the old code for the case of VFP short-vector "mixed scalar/vector operations". These happen where the destination register is in a vector bank but the second operand is in a scalar bank. For example
  vmla.f64 d10, d1, d16   with length 2 stride 2
is equivalent to the pair of scalar operations
  vmla.f64 d10, d1, d16
  vmla.f64 d8, d3, d16
where the destination and first input register cycle through their vector but the second input is scalar (d16). In the old decoder the gen_vfp_F1_mul() operation uses cpu_F1{s,d} as a temporary output for the multiply, which trashes the second input operand. For the fully-scalar case (where we never do a second iteration) and the fully-vector case (where the loop loads the new second input operand) this doesn't matter, but for the mixed scalar/vector case we will end up using the wrong value for later loop iterations. In the new code we use TCG temporaries and so avoid the bug. This bug is present for all the multiply-accumulate insns that operate on short vectors: VMLA, VMLS, VNMLA, VNMLS.

Note 2: the expression used to calculate the next register number in the vector bank is not in fact correct; we leave this behaviour unchanged from the old decoder and will fix this bug later in the series.

Backports commit 266bd25c485597c94209bfdb3891c1d0c573c164 from qemu		2019-06-13 17:59:16 -04:00
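The mixed scalar/vector bug described in the commit message can be illustrated with a small stand-alone C model. This is only a sketch under stated assumptions, not the QEMU or unicorn code: the functions vmla_shared_globals() and vmla_temporaries() and the helper next_in_bank() are hypothetical names, the register file is a plain array of doubles, and the bank wraparound is a simple four-register wrap chosen to reproduce the d10 -> d8 and d1 -> d3 stepping from the example (it is not the decoder's actual expression, which the message itself says is incorrect).

```c
#include <assert.h>
#include <stdio.h>

#define NUM_DREGS 32

/* Hypothetical helper (assumption): step a double-precision register number
 * by `stride`, wrapping inside its bank of four registers. */
static int next_in_bank(int reg, int stride)
{
    return (reg & ~3) | ((reg + stride) & 3);
}

/* Model of the old scheme: f0 and f1 stand in for the shared globals.
 * The multiply writes its product into f1, overwriting the scalar operand,
 * and only the first operand is reloaded for later iterations. */
static void vmla_shared_globals(double *d, int rd, int rn, int rm,
                                int len, int stride)
{
    assert(rd < NUM_DREGS && rn < NUM_DREGS && rm < NUM_DREGS);
    double f0 = d[rn];
    double f1 = d[rm];
    for (int i = 0; i < len; i++) {
        f1 = f0 * f1;          /* product clobbers the scalar second operand */
        f0 = d[rd] + f1;       /* accumulate into the destination value */
        d[rd] = f0;
        if (i + 1 < len) {
            rd = next_in_bank(rd, stride);
            rn = next_in_bank(rn, stride);
            f0 = d[rn];        /* f1 is NOT reloaded: it still holds the product */
        }
    }
}

/* Model of the new scheme: every iteration reads its operands into fresh
 * temporaries, so the scalar operand is never overwritten. */
static void vmla_temporaries(double *d, int rd, int rn, int rm,
                             int len, int stride)
{
    assert(rd < NUM_DREGS && rn < NUM_DREGS && rm < NUM_DREGS);
    for (int i = 0; i < len; i++) {
        double a = d[rn];
        double b = d[rm];
        d[rd] = d[rd] + a * b;
        rd = next_in_bank(rd, stride);
        rn = next_in_bank(rn, stride);
    }
}

/* Print the two destination registers used by the example. */
static void print_result(const char *label, const double *d)
{
    printf("%s: d10=%g d8=%g\n", label, d[10], d[8]);
}
```

With d1 = 2, d3 = 3, d16 = 5 and d10 = d8 = 1, calling either function with rd = 10, rn = 1, rm = 16, length 2 and stride 2 gives d10 = 11. The temporaries model then gives d8 = 1 + 3 * 5 = 16, while the shared-globals model gives d8 = 1 + 3 * 10 = 31, because the second iteration multiplies by the leftover product instead of d16.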
arm-powerctl.c	arm: Clarify the logic of set_pc()	2019-02-03 17:55:30 -05:00
arm-powerctl.h	ARM: Factor out ARM on/off PSCI control functions	2018-03-01 23:31:47 -05:00
arm_ldst.h	Fix Thumb-1 BE32 execution and disassembly.	2018-03-02 00:20:11 -05:00
cpu-param.h	tcg: Split out target/arch/cpu-param.h	2019-06-10 19:35:46 -04:00
cpu-qom.h	target/arm: Add "-cpu max" support	2018-03-12 10:11:49 -04:00
cpu.c	target/arm: Explicitly enable VFP short-vectors for aarch32 -cpu max	2019-06-13 16:38:01 -04:00
cpu.h	target/arm: Convert VFP VMLA to decodetree	2019-06-13 17:59:16 -04:00
cpu64.c	target/arm: Use env_cpu, env_archcpu	2019-06-12 11:34:08 -04:00
crypto_helper.c	target/arm/cpu and crypto_helper: Correct bad merge and adjust to qemu code style	2018-03-12 11:57:24 -04:00
helper-a64.c	target/arm: Use env_cpu, env_archcpu	2019-06-12 11:34:08 -04:00
helper-a64.h	target/arm: check CF_PARALLEL instead of parallel_cpus	2019-05-04 22:44:32 -04:00
helper-sve.h	target/arm: Rewrite vector gather first-fault loads	2018-10-08 14:15:15 -04:00
helper.c	target/arm: Implement NSACR gating of floating point	2019-06-13 16:15:28 -04:00
helper.h	target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs	2019-05-16 16:43:02 -04:00
internals.h	target/arm: Convert to CPUClass::tlb_fill	2019-05-16 16:55:12 -04:00
iwmmxt_helper.c	target/arm: Untabify iwmmxt_helper.c	2018-08-25 04:33:44 -04:00
kvm-consts.h	arm: better stub version for MISMATCH_CHECK	2018-03-02 00:13:45 -05:00
Makefile.objs	target/arm: Add stubs for AArch32 VFP decodetree	2019-06-13 16:24:37 -04:00
neon_helper.c	target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs	2019-05-16 16:43:02 -04:00
op_addsub.h	Move target-* CPU file into a target/ folder	2018-03-01 22:50:58 -05:00
op_helper.c	target/arm: Use env_cpu, env_archcpu	2019-06-12 11:34:08 -04:00
pauth_helper.c	target/arm: Fix output of PAuth Auth	2019-06-13 16:17:00 -04:00
psci.c	fix WFI/WFE length in syndrome register	2018-03-05 11:21:51 -05:00
sve.decode	target/arm: Sychronize with qemu	2019-04-18 04:49:11 -04:00
sve_helper.c	tcg: Use tlb_fill probe from tlb_vaddr_to_host	2019-05-16 18:27:03 -04:00
translate-a64.c	target/arm: Use tcg_gen_gvec_bitsel	2019-06-13 16:12:56 -04:00
translate-a64.h	target/arm: Use tcg_gen_gvec_bitsel	2019-06-13 16:12:56 -04:00
translate-sve.c	tcg: Specify optional vector requirements with a list	2019-05-16 15:05:02 -04:00
translate-vfp.inc.c	target/arm: Convert VFP VMLA to decodetree	2019-06-13 17:59:16 -04:00
translate.c	target/arm: Convert VFP VMLA to decodetree	2019-06-13 17:59:16 -04:00
translate.h	target/arm: Use tcg_gen_gvec_bitsel	2019-06-13 16:12:56 -04:00
unicorn.h	Move target-* CPU file into a target/ folder	2018-03-01 22:50:58 -05:00
unicorn_aarch64.c	unicorn_aarch64: Use aa64_vfp_qreg instead of aa32_vfp_dreg	2018-09-03 07:47:40 +01:00
unicorn_arm.c	unicorn_arm: Treat registers as unsigned values in casts	2019-04-26 08:48:31 -04:00
vec_helper.c	target/arm: Add helpers for FMLAL	2019-02-28 15:31:48 -05:00
vfp-uncond.decode	target/arm: Convert VCVTA/VCVTN/VCVTP/VCVTM to decodetree	2019-06-13 16:54:42 -04:00
vfp.decode	target/arm: Convert VFP VMLA to decodetree	2019-06-13 17:59:16 -04:00
vfp_helper.c	target/arm: Use env_cpu, env_archcpu	2019-06-12 11:34:08 -04:00