unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2024-12-26 11:55:33 +00:00

Author	SHA1	Message	Date
Peter Maydell	7a16bc6876	target/arm: Convert VMOV (imm) to decodetree Convert the VFP VMOV (immediate) instruction to decodetree. Backports commit b518c753f0b94e14e01e97b4ec42c100dafc0cc2 from qemu	2019-06-13 18:37:58 -04:00
Peter Maydell	0ebb6b8b90	target/arm: Convert VFP fused multiply-add insns to decodetree Convert the VFP fused multiply-add instructions (VFNMA, VFNMS, VFMA, VFMS) to decodetree. Note that in the old decode structure we were implementing these to honour the VFP vector stride/length. These instructions were introduced in VFPv4, and in the v7A architecture they are UNPREDICTABLE if the vector stride or length are non-zero. In v8A they must UNDEF if stride or length are non-zero, like all VFP instructions; we choose to UNDEF always. Backports commit d4893b01d23060845ee3855bc96626e16aad9ab5 from qemu	2019-06-13 18:24:36 -04:00
Peter Maydell	321bcc822b	target/arm: Convert VDIV to decodetree Convert the VDIV instruction to decodetree. Backports commit 519ee7ae31e050eb0ff9ad35c213f0bd7ab1c03e from qemu	2019-06-13 18:19:47 -04:00
Peter Maydell	76c74bc657	target/arm: Convert VSUB to decodetree Convert the VSUB instruction to decodetree. Backports commit 8fec9a119264b7936503abce3c106fad7e3ccb76 from qemu.	2019-06-13 18:18:00 -04:00
Peter Maydell	f56f0342ad	target/arm: Convert VADD to decodetree Convert the VADD instruction to decodetree. Backports commit ce28b303716e7eca3f3765bf6776d722ebbe1122 from qemu	2019-06-13 18:15:52 -04:00
Peter Maydell	06584edf61	target/arm: Convert VNMUL to decodetree Convert the VNMUL instruction to decodetree. Backports commit 43c4be1236c105090d134540da1036073d157cd4 from qemu	2019-06-13 18:14:16 -04:00
Peter Maydell	2c5e102017	target/arm: Convert VMUL to decodetree Convert the VMUL instruction to decodetree. Backports commit 88c5188ced60e9f2b8cc3af3b9bc4a8031c8c996 from qemu	2019-06-13 18:12:03 -04:00
Peter Maydell	b26b6a12a2	target/arm: Convert VFP VNMLA to decodetree Convert the VFP VNMLA instruction to decodetree. Backports commit 8a483533adc1bdc2decb8f456dbe930a2d245a8b from qemu	2019-06-13 18:09:57 -04:00
Peter Maydell	638b90de31	target/arm: Convert VFP VNMLS to decodetree Convert the VFP VNMLS instruction to decodetree. Backports commit c54a416cc6d60efbc79dd37aaf0c8918c05b5815 from qemu	2019-06-13 18:06:59 -04:00
Peter Maydell	67ad40ffa4	target/arm: Convert VFP VMLS to decodetree Convert the VFP VMLS instruction to decodetree. Backports commit e7258280d46af4ab6a0cc93ccfe8f6614defb4b7 from qemu	2019-06-13 18:02:37 -04:00
Peter Maydell	edf81eb214	target/arm: Convert VFP VMLA to decodetree Convert the VFP VMLA instruction to decodetree. This is the first of the VFP 3-operand data processing instructions, so we include in this patch the code which loops over the elements for an old-style VFP vector operation. The existing code to do this looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since we are going to be converting instructions one at a time anyway we can take the opportunity to make the new loop use TCG temporaries, which means we can do that conversion one operation at a time rather than needing to do it all in one go. We include an UNDEF check which was missing in the old code: short-vector operations (with stride or length non-zero) were deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec field does not indicate that support for short vectors is present we UNDEF the operations that would use them. (This is a change of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which previously were all incorrectly allowing short-vector operations.) Note that the conversion fixes a bug in the old code for the case of VFP short-vector "mixed scalar/vector operations". These happen where the destination register is in a vector bank but the second operand is in a scalar bank. For example vmla.f64 d10, d1, d16 with length 2 stride 2 is equivalent to the pair of scalar operations vmla.f64 d10, d1, d16 and vmla.f64 d8, d3, d16, where the destination and first input register cycle through their vector but the second input is scalar (d16). In the old decoder the gen_vfp_F1_mul() operation uses cpu_F1{s,d} as a temporary output for the multiply, which trashes the second input operand. For the fully-scalar case (where we never do a second iteration) and the fully-vector case (where the loop loads the new second input operand) this doesn't matter, but for the mixed scalar/vector case we will end up using the wrong value for later loop iterations. 
In the new code we use TCG temporaries and so avoid the bug. This bug is present for all the multiply-accumulate insns that operate on short vectors: VMLA, VMLS, VNMLA, VNMLS. Note 2: the expression used to calculate the next register number in the vector bank is not in fact correct; we leave this behaviour unchanged from the old decoder and will fix this bug later in the series. Backports commit 266bd25c485597c94209bfdb3891c1d0c573c164 from qemu	2019-06-13 17:59:16 -04:00
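The mixed scalar/vector example in the VMLA commit above can be reproduced with a small model of how register numbers step through a bank. This is a minimal sketch, not the decoder's code. The helper names are hypothetical. It assumes double-precision banks of four registers, with d0-d3 and d16-d19 acting as scalar banks, and it wraps within the bank, which the commit says the decoder at this point in the series does not yet compute correctly.

```python
def dreg_is_scalar(reg):
    """Assumed rule: d0-d3 and d16-d19 form the scalar banks."""
    return (reg & 0xC) == 0


def advance_dreg(reg, delta):
    """Step inside a bank of four double registers, wrapping at the bank edge."""
    return ((reg + delta) & 0x3) | (reg & ~0x3)


def expand_short_vector(vd, vn, vm, length, stride):
    """Return the (vd, vn, vm) triples that one short-vector op expands to."""
    if dreg_is_scalar(vd):
        return [(vd, vn, vm)]  # fully scalar: a single iteration
    # Mixed scalar/vector: a second operand in a scalar bank never advances.
    delta_m = 0 if dreg_is_scalar(vm) else stride
    ops = []
    for _ in range(length):
        ops.append((vd, vn, vm))
        vd = advance_dreg(vd, stride)
        vn = advance_dreg(vn, stride)
        vm = advance_dreg(vm, delta_m)
    return ops


# vmla.f64 d10, d1, d16 with length 2, stride 2
print(expand_short_vector(10, 1, 16, length=2, stride=2))
```

The call expands to (d10, d1, d16) followed by (d8, d3, d16), matching the pair of scalar operations quoted in the commit message. Because each iteration reads its operands from the register numbers rather than from a shared temporary, the scalar second operand cannot be overwritten between iterations.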
Peter Maydell	93fe4cbe9e	target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0d Expand out the sequences in the new decoder VLDR/VSTR/VLDM/VSTM trans functions which perform the memory accesses by going via the TCG globals cpu_F0s and cpu_F0d, to use local TCG temps instead. Backports commit 3993d0407dff7233e42f2251db971e126a0497e9 from qemu	2019-06-13 17:31:28 -04:00
Peter Maydell	ff7042567e	target/arm: Convert the VFP load/store multiple insns to decodetree Convert the VFP load/store multiple insns to decodetree. This includes tightening up the UNDEF checking for pre-VFPv3 CPUs which only have D0-D15: they now UNDEF for any access to D16-D31, not merely when the smallest register in the transfer list is in D16-D31. This conversion does not try to share code between the single precision and the double precision versions; this looks a bit duplicative of code, but it leaves the door open for a future refactoring which gets rid of the use of the "F0" registers by inlining the various functions like gen_vfp_ld() and gen_mov_F0_reg() which are hiding "if (dp) { ... } else { ... }" conditionalisation. Backports commit fa288de272c5c8a66d5eb683b123706a52bc7ad6 from qemu	2019-06-13 17:26:52 -04:00
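The tightened UNDEF check described in the load/store multiple commit above can be illustrated by comparing the two conditions side by side. This is a sketch under assumptions: the function and parameter names are hypothetical, and the transfer list is modelled simply as a first register number plus a count of double registers.

```python
def undef_old(first, count, has_d16_d31):
    """Old behaviour as described: only the lowest register is checked."""
    return not has_d16_d31 and first >= 16


def undef_new(first, count, has_d16_d31):
    """New behaviour as described: any register reaching D16-D31 is rejected."""
    return not has_d16_d31 and first + count > 16


# A transfer of D14-D17 on a CPU that only has D0-D15.
print(undef_old(14, 4, has_d16_d31=False), undef_new(14, 4, has_d16_d31=False))
```

For a list starting at D14 with four registers, the old condition lets the access through while the new one rejects it, which is the difference the commit message describes.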
Peter Maydell	6f0633ce80	target/arm: Convert VFP VLDR and VSTR to decodetree Convert the VFP single load/store insns VLDR and VSTR to decodetree. Backports commit 79b02a3b5231c5b8cd31e50cd549968dd0a05c49 from qemu	2019-06-13 17:22:48 -04:00
Peter Maydell	fe98885ff2	target/arm: Convert VFP two-register transfer insns to decodetree Convert the VFP two-register transfer instructions to decodetree (in the v8 Arm ARM these are the "Advanced SIMD and floating-point 64-bit move" encoding group). Again, we expand out the sequences involving gen_vfp_msr() and gen_msr_vfp(). Backports commit 81f681106eabe21c55118a5a41999fb7387fb714 from qemu	2019-06-13 17:20:00 -04:00
Peter Maydell	3fb3403b82	target/arm: Convert single-precision register moves to decodetree Convert the "single-precision" register moves to decodetree: * VMSR * VMRS * VMOV between general purpose register and single precision Note that the VMSR/VMRS conversions make our handling of the "should this UNDEF?" checks consistent between the two instructions: * VMSR to MVFR0, MVFR1, MVFR2 now UNDEF from EL0 (previously was a nop) * VMSR to FPSID now UNDEFs from EL0 or if VFPv3 or better (previously was a nop) * VMSR to FPINST and FPINST2 now UNDEF if VFPv3 or better (previously would write to the register, which had no guest-visible effect because we always UNDEF reads) We also tighten up the decode: we were previously underdecoding some SBZ or SBO bits. The conversion of VMOV_single includes the expansion out of the gen_mov_F0_vreg()/gen_vfp_mrs() and gen_mov_vreg_F0()/gen_vfp_msr() sequences into the simpler direct load/store of the TCG temp via neon_{load,store}_reg32(): we know in the new function that we're always single-precision, we don't need to use the old-and-deprecated cpu_F0* TCG globals, and we don't happen to have the declaration of gen_vfp_msr() and gen_vfp_mrs() at the point in the file where the new function is. Backports commit a9ab50011aeda2dd012da99069e078379315ea18 from qemu	2019-06-13 17:16:38 -04:00
Peter Maydell	694058da94	target/arm: Convert double-precision register moves to decodetree Convert the "double-precision" register moves to decodetree: this covers VMOV scalar-to-gpreg, VMOV gpreg-to-scalar and VDUP. Note that the conversion process has tightened up a few of the UNDEF encoding checks: we now correctly forbid: * VMOV-to-gpr with U:opc1:opc2 == 10x00 or x0x10 * VMOV-from-gpr with opc1:opc2 == 0x10 * VDUP with B:E == 11 * VDUP with Q == 1 and Vn<0> == 1 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> --- The accesses of elements < 32 bits could be improved by doing direct ld/st of the right size rather than 32-bit read-and-shift or read-modify-write, but we leave this for later cleanup, since this series is generally trying to stick to fixing the decode. Backports commit 9851ed9269d214c0c6feba960dd14ff09e6c34b4 from qemu	2019-06-13 17:11:56 -04:00
Peter Maydell	7265161108	target/arm: Add helpers for VFP register loads and stores The current VFP code has two different idioms for loading and storing from the VFP register file: 1 using the gen_mov_F0_vreg() and similar functions, which load and store to a fixed set of TCG globals cpu_F0s, CPU_F0d, etc 2 by direct calls to tcg_gen_ld_f64() and friends We want to phase out idiom 1 (because the use of the fixed globals is a relic of a much older version of TCG), but idiom 2 is quite longwinded: tcg_gen_ld_f64(tmp, cpu_env, vfp_reg_offset(true, reg)) requires us to specify the 64-bitness twice, once in the function name and once by passing 'true' to vfp_reg_offset(). There's no guard against accidentally passing the wrong flag. Instead, let's move to a convention of accessing 64-bit registers via the existing neon_load_reg64() and neon_store_reg64(), and provide new neon_load_reg32() and neon_store_reg32() for the 32-bit equivalents. Implement the new functions and use them in the code in translate-vfp.inc.c. We will convert the rest of the VFP code as we do the decodetree conversion in subsequent commits. Backports commit 160f3b64c5cc4c8a09a1859edc764882ce6ad6bf from qemu	2019-06-13 17:01:59 -04:00
Peter Maydell	033a386ffb	target/arm: Move the VFP trans_* functions to translate-vfp.inc.c Move the trans_*() functions we've just created from translate.c to translate-vfp.inc.c. This is pure code motion with no textual changes (this can be checked with 'git show --color-moved'). Backports commit f7bbb8f31f0761edbf0c64b7ab3c3f49c13612ea from qemu	2019-06-13 16:56:24 -04:00
Peter Maydell	e55d31a5ac	target/arm: Convert VCVTA/VCVTN/VCVTP/VCVTM to decodetree Convert the VCVTA/VCVTN/VCVTP/VCVTM instructions to decodetree. trans_VCVT() is temporarily left in translate.c. Backports commit c2a46a914cd5c38fd0ee57ff0befc1c5bde27bcf from qemu	2019-06-13 16:54:42 -04:00
Peter Maydell	9fb01cb526	target/arm: Convert VRINTA/VRINTN/VRINTP/VRINTM to decodetree Convert the VRINTA/VRINTN/VRINTP/VRINTM instructions to decodetree. Again, trans_VRINT() is temporarily left in translate.c. Backports commit e3bb599d16e4678b228d80194cee328f894b1ceb from qemu	2019-06-13 16:50:36 -04:00
Peter Maydell	4501daf010	target/arm: Convert VMINNM, VMAXNM to decodetree Convert the VMINNM and VMAXNM instructions to decodetree. As with VSEL, we leave the trans_VMINMAXNM() function in translate.c for the moment. Backports commit f65988a1efdb42f9058db44297591491842e697c from qemu	2019-06-13 16:43:50 -04:00
Peter Maydell	3994dfd079	target/arm: Convert the VSEL instructions to decodetree Convert the VSEL instructions to decodetree. We leave trans_VSEL() in translate.c for now as this allows the patch to show just the changes from the old handle_vsel(). In the old code the check for "do D16-D31 exist" was hidden in the VFP_DREG macro, and assumed that VFPv3 always implied that D16-D31 exist. In the new code we do the correct ID register test. This gives identical behaviour for most of our CPUs, and fixes previously incorrect handling for Cortex-R5F, Cortex-M4 and Cortex-M33, which all implement VFPv3 or better with only 16 double-precision registers. Backports commit b3ff4b87b4ae08120a51fe12592725e1dca8a085 from qemu	2019-06-13 16:41:22 -04:00
Lioncash	b3cfede44f	target/arm: Make load_cpu_offset() take a DisasContext* instead of uc_struct* Keeps it consistent with store_cpu_offset	2019-06-13 16:35:31 -04:00
Peter Maydell	78997058e4	target/arm: Factor out VFP access checking code Factor out the VFP access checking code so that we can use it in the leaf functions of the decodetree decoder. We call the function full_vfp_access_check() so we can keep the more natural vfp_access_check() for a version which doesn't have the 'ignore_vfp_enabled' flag -- that way almost all VFP insns will be able to use vfp_access_check(s) and only the special-register access function will have to use full_vfp_access_check(s, ignore_vfp_enabled). Backports commit 06db8196bba34776829020192ed623a0b22e6557 from qemu	2019-06-13 16:33:38 -04:00
Peter Maydell	9732ebba5c	target/arm: Add stubs for AArch32 VFP decodetree Add the infrastructure for building and invoking a decodetree decoder for the AArch32 VFP encodings. At the moment the new decoder covers nothing, so we always fall back to the existing hand-written decode. We need to have one decoder for the unconditional insns and one for the conditional insns, as otherwise the patterns for conditional insns would incorrectly match against the unconditional ones too. Since translate.c is over 14,000 lines long and we're going to be touching pretty much every line of the VFP code as part of the decodetree conversion, we create a new translate-vfp.inc.c to hold the code which deals with VFP in the new scheme. It should be possible to convert this into a standalone translation unit eventually, but the conversion process will be much simpler if we simply #include it midway through translate.c to start with. Backports commit 78e138bc1f672c145ef6ace74617db00eebaa2ba from qemu	2019-06-13 16:24:37 -04:00
Richard Henderson	7c32498b7f	target/arm: Use tcg_gen_gvec_bitsel This replaces 3 target-specific implementations for BIT, BIF, and BSL. Backports commit 3a7a2b4e5cf0d49cd8b14e8225af0310068b7d20 from qemu	2019-06-13 16:12:56 -04:00
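The reason one bit-select primitive can replace three target-specific implementations is that BSL, BIT and BIF differ only in which operand plays the selector role. The sketch below models that on 64-bit integers. The function names are hypothetical and the operand ordering follows my reading of the instruction descriptions, not the commit itself.

```python
MASK64 = (1 << 64) - 1


def bitsel(sel, if_one, if_zero):
    """Take bits from if_one where sel is 1 and from if_zero where sel is 0."""
    return ((if_one & sel) | (if_zero & ~sel)) & MASK64


def vbsl(vd, vn, vm):
    return bitsel(vd, vn, vm)  # the destination is the selector


def vbit(vd, vn, vm):
    return bitsel(vm, vn, vd)  # insert vn where vm is 1, else keep vd


def vbif(vd, vn, vm):
    return bitsel(vm, vd, vn)  # insert vn where vm is 0, else keep vd


print(hex(vbsl(0xF0, 0xAA, 0x55)), hex(vbit(0x00, 0xFF, 0x0F)), hex(vbif(0x00, 0xFF, 0x0F)))
```

Each instruction becomes a single call with its arguments permuted, so the front end needs no per-instruction expansion logic.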
Richard Henderson	b8bd543390	target/arm: Use env_cpu, env_archcpu Cleanup in the boilerplate that each target must define. Replace arm_env_get_cpu with env_archcpu. The combination CPU(arm_env_get_cpu) should have used ENV_GET_CPU to begin; use env_cpu now. Backports commit 2fc0cc0e1e034582f4718b1a2d57691474ccb6aa from qemu	2019-06-12 11:34:08 -04:00
Alistair Francis	f8f3e50372	target/arm: Fix vector operation segfault Commit 89e68b575 "target/arm: Use vector operations for saturation" causes this abort() when booting QEMU ARM with a Cortex-A15: 0 0x00007ffff4c2382f in raise () at /usr/lib/libc.so.6 1 0x00007ffff4c0e672 in abort () at /usr/lib/libc.so.6 2 0x00005555559c1839 in disas_neon_data_insn (insn=<optimized out>, s=<optimized out>) at ./target/arm/translate.c:6673 3 0x00005555559c1839 in disas_neon_data_insn (s=<optimized out>, insn=<optimized out>) at ./target/arm/translate.c:6386 4 0x00005555559cd8a4 in disas_arm_insn (insn=4081107068, s=0x7fffe59a9510) at ./target/arm/translate.c:9289 5 0x00005555559cd8a4 in arm_tr_translate_insn (dcbase=0x7fffe59a9510, cpu=<optimized out>) at ./target/arm/translate.c:13612 6 0x00005555558d1d39 in translator_loop (ops=0x5555561cc580 <arm_translator_ops>, db=0x7fffe59a9510, cpu=0x55555686a2f0, tb=<optimized out>, max_insns=<optimized out>) at ./accel/tcg/translator.c:96 7 0x00005555559d10d4 in gen_intermediate_code (cpu=cpu@entry=0x55555686a2f0, tb=tb@entry=0x7fffd7840080 <code_gen_buffer+126091347>, max_insns=max_insns@entry=512) at ./target/arm/translate.c:13901 8 0x00005555558d06b9 in tb_gen_code (cpu=cpu@entry=0x55555686a2f0, pc=3067096216, cs_base=0, flags=192, cflags=-16252928, cflags@entry=524288) at ./accel/tcg/translate-all.c:1736 9 0x00005555558ce467 in tb_find (cf_mask=524288, tb_exit=1, last_tb=0x7fffd783e640 <code_gen_buffer+126084627>, cpu=0x1) at ./accel/tcg/cpu-exec.c:407 10 0x00005555558ce467 in cpu_exec (cpu=cpu@entry=0x55555686a2f0) at ./accel/tcg/cpu-exec.c:728 11 0x000055555588b0cf in tcg_cpu_exec (cpu=0x55555686a2f0) at ./cpus.c:1431 12 0x000055555588d223 in qemu_tcg_cpu_thread_fn (arg=0x55555686a2f0) at ./cpus.c:1735 13 0x000055555588d223 in qemu_tcg_cpu_thread_fn (arg=arg@entry=0x55555686a2f0) at ./cpus.c:1709 14 0x0000555555d2629a in qemu_thread_start (args=<optimized out>) at ./util/qemu-thread-posix.c:502 15 0x00007ffff4db8a92 in start_thread 
() at /usr/lib/libpthread. This patch ensures that we don't hit the abort() in the second switch case in disas_neon_data_insn() as we will return from the first case. Backports commit 2f143d3ad1c05e91cf2cdf5de06d59a80a95e6c8 from qemu	2019-05-24 18:02:32 -04:00
Richard Henderson	552e48f14e	target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs Backports commit 4e027a710673f5d4dc6cff88728bcfd32e4c47b0 from qemu	2019-05-16 16:43:02 -04:00
Richard Henderson	6d1730048d	tcg: Add support for integer absolute value Remove a function of the same name from target/arm/. Use a branchless implementation of abs gleaned from gcc. Backports commit ff1f11f7f8710a768f9313f24bd7f509d3db27e5 from qemu	2019-05-16 16:25:15 -04:00
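The commit above does not spell out the branchless sequence it uses, so the following shows one common formulation (arithmetic shift, XOR, subtract) purely as an illustration. It models 32-bit two's-complement arithmetic in Python; the helper names are hypothetical.

```python
def wrap32(x):
    """Reduce a Python int to a signed 32-bit two's-complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def abs32_branchless(x):
    """Absolute value without a conditional: (x ^ mask) - mask."""
    x = wrap32(x)
    mask = x >> 31  # arithmetic shift: 0 for x >= 0, -1 for x < 0
    return wrap32((x ^ mask) - mask)


print(abs32_branchless(-5), abs32_branchless(7))
```

As with any fixed-width two's-complement abs, the most negative value maps to itself because its magnitude is not representable in 32 bits.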
Richard Henderson	c54b2776f6	tcg: Specify optional vector requirements with a list Replace the single opcode in .opc with a null-terminated array in .opt_opc. We still require that all opcodes be used with the same .vece. Validate the contents of this list with CONFIG_DEBUG_TCG. All tcg_gen_*_vec functions will check any list active during .fniv expansion. Swap the active list in and out as we expand other opcodes, or take control away from the front-end function. Convert all existing vector aware front ends. Backports commit 53229a7703eeb2bbe101a19a33ef22aaf960c65b from qemu	2019-05-16 15:05:02 -04:00
Emilio G. Cota	1715f382b4	target/arm: check CF_PARALLEL instead of parallel_cpus Thereby decoupling the resulting translated code from the current state of the system. Backports commit 2399d4e7cec22ecf1c51062d2ebfd45220dbaace from qemu	2019-05-04 22:44:32 -04:00
Peter Maydell	77ae3982b4	target/arm: Implement VLLDM for v7M CPUs with an FPU Implement the VLLDM instruction for v7M for the FPU present case. Backports commit 956fe143b4f254356496a0a1c479fa632376dfec from qemu	2019-04-30 11:27:54 -04:00
Peter Maydell	b483951046	target/arm: Implement VLSTM for v7M CPUs with an FPU Implement the VLSTM instruction for v7M for the FPU present case. Backports commit 019076b036da4444494de38388218040d9d3a26c from qemu	2019-04-30 11:25:44 -04:00
Peter Maydell	a976d7642a	target/arm: Implement M-profile lazy FP state preservation The M-profile architecture floating point system supports lazy FP state preservation, where FP registers are not pushed to the stack when an exception occurs but are instead only saved if and when the first FP instruction in the exception handler is executed. Implement this in QEMU, corresponding to the check of LSPACT in the pseudocode ExecuteFPCheck(). Backports commit e33cf0f8d8c9998a7616684f9d6aa0d181b88803 from qemu	2019-04-30 11:21:50 -04:00
Peter Maydell	719231b4c0	target/arm: Activate M-profile floating point context when FPCCR.ASPEN is set The M-profile FPCCR.ASPEN bit indicates that automatic floating-point context preservation is enabled. Before executing any floating-point instruction, if FPCCR.ASPEN is set and the CONTROL FPCA/SFPA bits indicate that there is no active floating point context then we must create a new context (by initializing FPSCR and setting FPCA/SFPA to indicate that the context is now active). In the pseudocode this is handled by ExecuteFPCheck(). Implement this with a new TB flag which tracks whether we need to create a new FP context. Backports commit 6000531e19964756673a5f4b694a649ef883605a from qemu	2019-04-30 10:51:31 -04:00
Peter Maydell	87c8c0fde7	target/arm: Set FPCCR.S when executing M-profile floating point insns The M-profile FPCCR.S bit indicates the security status of the floating point context. In the pseudocode ExecuteFPCheck() function it is unconditionally set to match the current security state whenever a floating point instruction is executed. Implement this by adding a new TB flag which tracks whether FPCCR.S is different from the current security state, so that we only need to emit the code to update it in the less-common case when it is not already set correctly. Note that we will add the handling for the other work done by ExecuteFPCheck() in later commits. Backports commit 6d60c67a1a03be32c3342aff6604cdc5095088d1 from qemu	2019-04-30 10:50:17 -04:00
Peter Maydell	8d726490ff	target/arm: Overlap VECSTRIDE and XSCALE_CPAR TB flags We are close to running out of TB flags for AArch32; we could start using the cs_base word, but before we do that we can economise on our usage by sharing the same bits for the VFP VECSTRIDE field and the XScale XSCALE_CPAR field. This works because no XScale CPU ever had VFP. Backports commit ea7ac69d124c94c6e5579145e727adec9ccbefef from qemu	2019-04-30 10:45:14 -04:00
Peter Maydell	89baa5cffa	target/arm: Decode FP instructions for M profile Correct the decode of the M-profile "coprocessor and floating-point instructions" space: * op0 == 0b11 is always unallocated * if the CPU has an FPU then all insns with op1 == 0b101 are floating point and go to disas_vfp_insn() For the moment we leave VLLDM and VLSTM as NOPs; in a later commit we will fill in the proper implementation for the case where an FPU is present. Backports commit 8859ba3c9625e7ceb5599f457a344bcd7c5e112b from qemu	2019-04-30 10:19:45 -04:00
Peter Maydell	18bb21c035	target/arm: Honour M-profile FP enable bits Like AArch64, M-profile floating point has no FPEXC enable bit to gate floating point; so always set the VFPEN TB flag. M-profile also has CPACR and NSACR similar to A-profile; they behave slightly differently: * the CPACR is banked between Secure and Non-Secure * if the NSACR forces a trap then this is taken to the Secure state, not the Non-Secure state Honour the CPACR and NSACR settings. The NSACR handling requires us to borrow the exception.target_el field (usually meaningless for M profile) to distinguish the NOCP UsageFault taken to Secure state from the more usual fault taken to the current security state. Backports commit d87513c0abcbcd856f8e1dee2f2d18903b2c3ea2 from qemu	2019-04-30 10:18:21 -04:00
Peter Maydell	c6bb8d483d	target/arm: Disable most VFP sysregs for M-profile The only "system register" that M-profile floating point exposes via the VMRS/VMSR instructions is FPSCR, and it does not have the odd special case for rd==15. Add a check to ensure we only expose FPSCR. Backports commit ef9aae2522c22c05df17dd898099dd5c3f20d688 from qemu	2019-04-30 10:15:25 -04:00
Richard Henderson	bca82cde84	tcg: Hoist max_insns computation to tb_gen_code In order to handle TB's that translate to too much code, we need to place the control of the length of the translation in the hands of the code gen master loop. Backports commit 8b86d6d25807e13a63ab6ea879f976b9f18cc45a from qemu	2019-04-30 09:49:57 -04:00
Lioncash	c3df12e534	target/arm/translate: Synchronize with Qemu	2019-04-27 10:13:01 -04:00
Lioncash	d844d7cc9d	exec: Backport tb_cflags accessor	2019-04-22 06:12:59 -04:00
Lioncash	5968b3d96f	target/arm: Synchronize with qemu	2019-04-19 15:31:18 -04:00
Lioncash	bf6dfeb175	target/arm/translate: Synchronize with qemu Backports a few other missing pieces from mainline qemu.	2019-04-18 06:22:36 -04:00
Lioncash	5b062dacf2	target/arm: Simplify and correct thumb instruction tracing This wasn't subtracting the size of the instruction from the PC the way the ARM mode tracing does. This simplifies it and makes the behavior identical.	2019-04-18 06:00:15 -04:00
Lioncash	5d6ddec7fb	target/arm/translate: Subtract PC value properly for thumb tracecode calls	2019-04-18 05:44:48 -04:00
Lioncash	3521e72580	target/arm: Synchronize with qemu Synchronizes with bits and pieces that were missed due to merging incorrectly (sorry :<)	2019-04-18 04:49:11 -04:00
Lioncash	ddcf400955	arm: Always enable access to coprocessors initially Allows non-AArch64 environments to always access coprocessors initially. Removes the need to do avoidable register management when testing floating-point code.	2019-04-13 19:49:43 -04:00
Richard Henderson	45c297c99b	target/arm: Add set/clear_pstate_bits, share gen_ss_advance We do not need an out-of-line helper for manipulating bits in pstate. While changing things, share the implementation of gen_ss_advance. Backports commit 22ac3c49641f6eed93dca5b852030b4d3eacf6c4 from qemu	2019-03-05 22:55:22 -05:00
Richard Henderson	1721e429c2	target/arm: Implement ARMv8.0-SB Backports commit 9888bd1e20425dfe4dcca5dcd1ca2fac8e90ad19 from qemu	2019-03-05 22:35:16 -05:00
Richard Henderson	fa70a2bc69	target/arm: Fix PC test for LDM (exception return) Found by inspection: Rn is the base register against which the load began; I is the register within the mask being processed. The exception return should of course be processed from the loaded PC. Backports commit 9d090d17234058f55c3c439d285db78c94d7d4de from qemu	2019-03-05 22:27:38 -05:00
Lioncash	0868015992	target/arm: Move TCGContext variable within arm_post_translate_insn into a narrower scope This is only used within the scope of the if statement, so we can just move it there.	2019-02-28 18:53:33 -05:00
Lioncash	15440a83c5	target/arm: Fix execution of ARM instructions Previously we'd be checking prior to the actual decoding if we were at the ending address. This worked fine using the old model of the translation process in qemu. However, this causes the wrong behavior to occur in both ARM and Thumb/Thumb-2 modes using the newer translator model. Given the translator itself checks for the end address already, this needs to be placed within arm_post_translate_insn(). This prevents the emulation process being off-by-one as well when it comes to actually executing the instructions.	2019-02-28 18:49:22 -05:00
Richard Henderson	4ae3ff8e61	target/arm: Implement VFMAL and VFMSL for aarch32 Backports commit 87732318c5d68a366fc2d6fc394d9c20412099fa from qemu	2019-02-28 15:44:59 -05:00
Peter Maydell	82b8e97f76	target/arm: Gate "miscellaneous FP" insns by ID register field There is a set of VFP instructions which we implement in disas_vfp_v8_insn() and gate on the ARM_FEATURE_V8 bit. These were all first introduced in v8 for A-profile, but in M-profile they appeared in v7M. Gate them on the MVFR2 FPMisc field instead, and rename the function appropriately. Backports commit c0c760afe800b60b48c80ddf3509fec413594778 from qemu	2019-02-28 15:26:27 -05:00
Peter Maydell	118a2bde5c	target/arm: Use MVFR1 feature bits to gate A32/T32 FP16 instructions Instead of gating the A32/T32 FP16 conversion instructions on the ARM_FEATURE_VFP_FP16 flag, switch to our new approach of looking at ID register bits. In this case MVFR1 fields FPHP and SIMDHP indicate the presence of these insns. This change doesn't alter behaviour for any of our CPUs. Backports commit 602f6e42cfbfe9278be34e9b91d2ceb695837e02 from qemu	2019-02-28 15:23:51 -05:00
Richard Henderson	c9ad233678	target/arm: Implement ARMv8.3-JSConv Backports commit 6c1f6f2733a7692793135ea5ce72b829add99a50 from qemu	2019-02-22 19:08:57 -05:00
Richard Henderson	f16dcbe226	target/arm: Rearrange Floating-point data-processing (2 regs) There are lots of special cases within these insns. Split the major argument decode/loading/saving into no_output (compares), rd_is_dp, and rm_is_dp. We still need to special case argument load for compare (rd as input, rm as zero) and vcvt fixed (rd as input+output), but lots of special cases do disappear. Now that we have a full switch at the beginning, hoist the ISA checks from the code generation. Backports commit e80941bd64cc388554770fd72334e9e7d459a1ef from qemu	2019-02-22 18:57:25 -05:00
Richard Henderson	f3cb92c86c	target/arm: Use vector operations for saturation For same-sign saturation, we have tcg vector operations. We can compute the QC bit by comparing the saturated value against the unsaturated value. Backports commit 89e68b575e138d0af1435f11a8ffcd8779c237bd from qemu	2019-02-15 18:14:09 -05:00
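The QC computation mentioned in the saturation commit above can be shown on a list of unsigned lanes: do the operation once with saturation and once with plain wraparound, and flag QC if any lane differs. This is a sketch only. The function name and the 8-bit lane width are assumptions for illustration.

```python
def uqadd_lanes(a, b, bits=8):
    """Unsigned saturating add per lane; returns (result lanes, QC flag)."""
    limit = (1 << bits) - 1
    saturated = [min(x + y, limit) for x, y in zip(a, b)]
    wrapped = [(x + y) & limit for x, y in zip(a, b)]
    # A lane saturated exactly when the two results disagree.
    qc = any(s != w for s, w in zip(saturated, wrapped))
    return saturated, qc


print(uqadd_lanes([250, 1], [10, 2]))
print(uqadd_lanes([1, 2], [3, 4]))
```

The comparison works because an overflowing lane wraps to a value strictly below the saturation limit, so the two results can only match when no saturation occurred.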
Richard Henderson	4e44043956	target/arm: Fix arm_cpu_dump_state vs FPSCR Backports commit ec527e4eeccc31e3beadf3b61b66c61bbd873811 from qemu	2019-02-15 17:58:25 -05:00
Richard Henderson	198befc50e	target/arm: Use tcg integer min/max primitives for neon The 32-bit PMIN/PMAX has been decomposed to scalars, and so can be trivially expanded inline. Backports commit 9ecd3c5c1651fa7f9adbedff4806a2da0b50490c from qemu	2019-02-15 17:55:11 -05:00
Richard Henderson	eee33bd692	target/arm: Use vector minmax expanders for aarch32 Backports commit 6f2782218230bbb33fa22f9a2f73f8a570046007 from qemu	2019-02-15 17:54:05 -05:00
Richard Henderson	d147946edc	target/arm: Rely on optimization within tcg_gen_gvec_or Since we're now handling a == b generically, we no longer need to do it by hand within target/arm/. Backports commit 2900847ff4c862887af750935a875059615f509a from qemu	2019-02-15 17:50:28 -05:00
Peter Maydell	55bc017af4	target/arm: Emit barriers for A32/T32 load-acquire/store-release insns Now that MTTCG is here, the comment in the 32-bit Arm decoder that "Since the emulation does not have barriers, the acquire/release semantics need no special handling" is no longer true. Emit the correct barriers for the load-acquire/store-release insns, as we already do in the A64 decoder. Backports commit 96c552958dbb63453b5f02bea6e704006d50e39a from qemu	2019-01-13 19:48:27 -05:00
Richard Henderson	4d8b7a9967	target/arm: Convert ARM_TBFLAG_* to FIELDs Use "register" TBFLAG_ANY to indicate shared state between A32 and A64, and "registers" TBFLAG_A32 & TBFLAG_A64 for fields that are specific to the given cpu state. Move ARM_TBFLAG_BE_DATA to shared state, instead of its current placement within "Bit usage when in AArch32 state". Backports commit aad821ac4faad369fad8941d25e59edf2514246b from qemu	2019-01-13 19:21:18 -05:00
Richard Henderson	1bcba0737e	target/arm: Reorg NEON VLD/VST single element to one lane Instead of shifts and masks, use direct loads and stores from the neon register file. Backports commit 2d6ac920837f558be214ad2ddd28cad7f3b15e5c from qemu	2018-11-10 11:24:37 -05:00
Richard Henderson	37103f1bc4	target/arm: Promote consecutive memory ops for aa32 For a sequence of loads or stores from a single register, little-endian operations can be promoted to an 8-byte op. This can reduce the number of operations by a factor of 8. Backports commit e23f12b3a252352b575908ca7b94587acd004641 from qemu	2018-11-10 11:19:15 -05:00
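The promotion described in the commit above relies on the fact that, for little-endian data, several consecutive narrow accesses see the same bytes as one wide access. The sketch below checks that equivalence for eight one-byte elements. It models memory as a Python bytes object, and the function names are hypothetical.

```python
def load_elements_separately(mem, addr, count=8):
    """Eight independent one-byte loads."""
    return [mem[addr + i] for i in range(count)]


def load_elements_promoted(mem, addr, count=8):
    """One wide little-endian load, then split into byte elements."""
    word = int.from_bytes(mem[addr:addr + count], "little")
    return [(word >> (8 * i)) & 0xFF for i in range(count)]


mem = bytes(range(32))
print(load_elements_separately(mem, 4) == load_elements_promoted(mem, 4))
```

With byte-sized elements, one 8-byte access replaces eight separate ones, which is where the factor of 8 in the commit message comes from.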
Richard Henderson	1cab7a41ac	target/arm: Reorg NEON VLD/VST all elements Instead of shifts and masks, use direct loads and stores from the neon register file. Mirror the iteration structure of the ARM pseudocode more closely. Correct the parameters of the VLD2 A2 insn. Note that this includes a bugfix for handling of the insn "VLD2 (multiple 2-element structures)" -- we were using an incorrect stride value. Backports commit ac55d00709e78cd39dfa298dcaac7aecb58762e8 from qemu	2018-11-10 11:18:45 -05:00
Richard Henderson	a2239b9f5b	target/arm: Use gvec for NEON VLD all lanes Backports commit 7377c2c97e20e64ed9b481eb2d9b9084bfd5b7e9 from qemu	2018-11-10 11:08:29 -05:00
Richard Henderson	985acb9cde	target/arm: Use gvec for NEON_3R_VTST_VCEQ, NEON_3R_VCGT, NEON_3R_VCGE Move cmtst_op expanders from translate-a64.c. Backports commit ea580fa312674c1ba82a8b137caf42b0609ce3e3 from qemu	2018-11-10 11:03:42 -05:00
Richard Henderson	5d9c0e52bf	target/arm: Use gvec for NEON_3R_VML Move mla_op and mls_op expanders from translate-a64.c. Backports commit 4a7832b095b9ce97a815749a13516f5cfb3c5dd4 from qemu	2018-11-10 10:58:44 -05:00
Richard Henderson	79bbb7c730	target/arm: Use gvec for VSRI, VSLI Move shi_op and sli_op expanders from translate-a64.c. Backports commit f3cd8218d1d3e534877ce3f3cb61c6757d10f9df from qemu	2018-11-10 10:53:28 -05:00
Lioncash	edb36c7505	target/arm: Use gvec for VSRA	2018-11-10 10:32:29 -05:00
Richard Henderson	b5877f1dfb	target/arm: Use gvec for VSHR, VSHL Backports commit 1dc8425e551be1371d657e94367f37130cd7aede from qemu	2018-11-10 10:14:31 -05:00
Lioncash	7790ca1ccb	target/arm: Use gvec for NEON_3R_VMUL	2018-11-10 10:11:10 -05:00
Richard Henderson	dfdc6bc05c	target/arm: Use gvec for NEON_2RM_VMVN, NEON_2RM_VNEG Backports commit 4bf940bebad273e4b3534ae3f83f2c9d1191d3a2 from qemu	2018-11-10 10:09:38 -05:00
Richard Henderson	7b4b5ac249	target/arm: Use gvec for NEON_3R_VADD_VSUB insns Backports commit e4717ae02dd0c2e544a07302c1ed473775209aba from qemu	2018-11-10 10:08:23 -05:00
Richard Henderson	0965b9513a	target/arm: Use gvec for NEON_3R_LOGIC insns Move expanders for VBSL, VBIT, and VBIF from translate-a64.c. Backports commit eabcd6faa90461e0b7463f4ebe75b8d050487c9c from qemu	2018-11-10 10:06:13 -05:00
Richard Henderson	9f767248a2	target/arm: Use gvec for NEON VMOV, VMVN, VBIC & VORR (immediate) Backports commit 246fa4aca95e213fba10c8222dbc6bd0a9a2a8d4 from qemu	2018-11-10 09:56:30 -05:00
Richard Henderson	c1251a19e1	target/arm: Use gvec for NEON VDUP Also introduces neon_element_offset to find the env offset of a specific element within a neon register. Backports commit 32f91fb71f4c32113ec8c2af5f74f14abe6c7162 from qemu	2018-11-10 09:51:40 -05:00
Richard Henderson	3d5f040608	target/arm: Mark some arrays const Backports commit 308e5636152594daa4c5597b1188d44d7266db04 from qemu	2018-11-10 09:49:25 -05:00
Richard Henderson	74aba4ba51	target/arm: Don't call tcg_clear_temp_count This is done generically in translator_loop. Backports commit 7108e255c2d95b44c9dfee8075d0d6fb391281a8 from qemu	2018-11-10 09:40:06 -05:00
Peter Maydell	d60fe610bb	target/arm: Report correct syndrome for FP/SIMD traps to Hyp mode For traps of FP/SIMD instructions to AArch32 Hyp mode, the syndrome provided in HSR has more information than is reported to AArch64. Specifically, there are extra fields TA and coproc which indicate whether the trapped instruction was FP or SIMD. Add this extra information to the syndromes we construct, and mask it out when taking the exception to AArch64. Backports commit 4be42f4013fa1a9df47b48aae5148767bed8e80c from qemu	2018-11-10 09:36:41 -05:00
Lioncash	a0358202a7	target/arm: Improve debug logging of AArch32 exception return For AArch32, exception return happens through certain kinds of CPSR write. We don't currently have any CPU_LOG_INT logging of these events (unlike AArch64, where we log in the ERET instruction). Add some suitable logging. This will log exception returns like this: Exception return from AArch32 hyp to usr PC 0x80100374 paralleling the existing logging in the exception_return helper for AArch64 exception returns: Exception return from AArch64 EL2 to AArch64 EL0 PC 0x8003045c Exception return from AArch64 EL2 to AArch32 EL0 PC 0x8003045c (Note that an AArch32 exception return can only be AArch32->AArch32, never to AArch64.) Backports commit 81e3728407bf4a12f83e14fd410d5f0a7d29b5b4 from qemu	2018-11-10 09:09:52 -05:00
Richard Henderson	03ec90f39b	target/arm: Convert v8.2-fp16 from feature bit to aa64pfr0 test Backports commit 5763190fa8705863b4b725aa1657661a97113eb4 from qemu	2018-11-10 08:34:32 -05:00
Richard Henderson	03e2d64aed	target/arm: Convert jazelle from feature bit to isar1 test Having V6 alone imply jazelle was wrong for cortex-m0. Change to an assertion for V6 & !M. This was harmless, because the only place we tested ARM_FEATURE_JAZELLE was for 'bxj' in disas_arm(), which is unreachable for M-profile cores. Backports commit 09cbd50198d5dcac8bea2e47fa5dd641ec505fae from qemu	2018-11-10 08:24:11 -05:00
Richard Henderson	4a58a81e31	target/arm: Convert division from feature bits to isar0 tests Both arm and thumb2 division are controlled by the same ISAR field, which takes care of the arm implies thumb case. Having M imply thumb2 division was wrong for cortex-m0, which is v6m and does not have thumb2 at all, much less thumb2 division. Backports commit 7e0cf8b47f0e67cebbc3dfa73f304e56ad1a090f from qemu	2018-11-10 08:21:02 -05:00
Richard Henderson	4221703f18	target/arm: Convert v8 extensions from feature bits to isar tests Most of the v8 extensions are self-contained within the ISAR registers and are not implied by other feature bits, which makes them the easiest to convert. Backports commit 962fcbf2efe57231a9f5df0ae0f40c05e35628ba from qemu	2018-11-10 08:17:57 -05:00
Peter Maydell	76f521e6c3	target/arm: Add v8M stack checks for VLDM/VSTM Add the v8M stack checks for the VLDM/VSTM (aka VPUSH/VPOP) instructions. This code is currently unreachable because we haven't yet implemented M profile floating point support, but since the change is simple, we add it now because otherwise we're likely to forget to do it later. Backports commit 8a954faf5412d5073d585d85a1da63a09bb5d84e from qemu	2018-10-08 14:23:02 -04:00
Peter Maydell	37d0c7fcf1	target/arm: Add v8M stack checks for Thumb push/pop Add v8M stack checks for the 16-bit Thumb push/pop encodings: STMDB, STMFD, LDM, LDMIA, LDMFD. Backports commit aa369e5c08bbe2748d2be96f13f4ef469a4d3080 from qemu	2018-10-08 14:22:08 -04:00
Peter Maydell	ef9afb1855	target/arm: Add v8M stack checks for T32 load/store single Add v8M stack checks for the instructions in the T32 "load/store single" encoding class: these are the "immediate pre-indexed" and "immediate, post-indexed" LDR and STR instructions. Backports commit 0bc003bad9752afc61624cb680226c922f34f82c from qemu	2018-10-08 14:20:58 -04:00
Peter Maydell	de30651f5e	target/arm: Add v8M stack checks for Thumb2 LDM/STM Add the v8M stack checks for: * LDM (T2 encoding) * STM (T2 encoding) This includes the 32-bit encodings of the instructions listed in v8M ARM ARM rule R_YVWT as * LDM, LDMIA, LDMFD * LDMDB, LDMEA * POP (multiple registers) * PUSH (multiple registers) * STM, STMIA, STMEA * STMDB, STMFD We perform the stack limit check before doing any other part of the load or store. Backports commit 7c0ed88e7d6bee3e55c3d8935c46226cb544191a from qemu	2018-10-08 14:19:14 -04:00
Peter Maydell	bb97240df6	target/arm: Add v8M stack checks for LDRD/STRD (imm) Add the v8M stack checks for: * LDRD (immediate) * STRD (immediate) Loads and stores are more complicated than ADD/SUB/MOV, because we must ensure that memory accesses below the stack limit are not performed, so we can't simply do the check when we actually update SP. For these instructions, if the stack limit check triggers we must not: * perform any memory access below the SP limit * update PC, SP or the load/store base register but it is IMPDEF whether we: * perform any accesses above or equal to the SP limit * update destination registers for loads For QEMU we choose to always check the limit before doing any other part of the load or store, so we won't update any registers or perform any memory accesses. It is UNKNOWN whether the limit check triggers for a load or store where the initial SP value is below the limit and one of the stores would be below the limit, but the writeback moves SP to above the limit. For QEMU we choose to trigger the check in this situation. Note that limit checks happen only for loads and stores which update SP via writeback; they do not happen for loads and stores which simply use SP as a base register. Backports commit 910d7692e5b60f2c2d08cc3d6d36076e85b6a69d from qemu	2018-10-08 14:17:27 -04:00
Peter Maydell	0fc6e2c183	target/arm: Add some comments in Thumb decode Add some comments to the Thumb decoder indicating what bits of the instruction have been decoded at various points in the code. This is not an exhaustive set of comments; we're gradually adding comments as we work with particular bits of the code. Backports commit a2d12f0f34e9c5ef8a193556fde983aa186fa73a from qemu	2018-10-08 14:15:15 -04:00
Peter Maydell	ca5d7b8fd2	target/arm: Add v8M stack checks on ADD/SUB/MOV of SP Add code to insert calls to a helper function to do the stack limit checking when we handle these forms of instruction that write to SP: * ADD (SP plus immediate) * ADD (SP plus register) * SUB (SP minus immediate) * SUB (SP minus register) * MOV (register) Backports commit 5520318939fea5d659bf808157cd726cb967b761 from qemu	2018-10-08 14:15:15 -04:00
Peter Maydell	8b3b548961	target/arm: Define new TBFLAG for v8M stack checking The Arm v8M architecture includes hardware stack limit checking. When certain instructions update the stack pointer, if the new value of SP is below the limit set in the associated limit register then an exception is taken. Add a TB flag that tracks whether the limit-checking code needs to be emitted. Backports commit 4730fb85035e99c909db7d14ef76cd17f28f4423 from qemu	2018-10-08 14:15:15 -04:00
Lioncash	47b45f1bc2	arm: Take DisasContext as a parameter instead of TCGContext where applicable This is more future-friendly with qemu, as it's more generic.	2018-10-06 04:17:12 -04:00

252 commits
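
The v8M stack-check commits above (notably bb97240df6) describe a policy of checking the limit before any other part of a load or store that writes back to SP. The sketch below is a minimal Python model of that ordering, not the QEMU or unicorn C code. The names `StackLimitFault` and `strd_sp_writeback`, the dictionary-based register and memory model, and the exact comparison used (lowest of the access address and the written-back SP) are assumptions made for illustration.

```python
class StackLimitFault(Exception):
    """Hypothetical stand-in for the exception taken on a stack limit violation."""


def strd_sp_writeback(regs, mem, limit, offset, values, pre_index=True):
    """Model an STRD (immediate) that uses SP as base and writes SP back.

    regs and mem are plain dicts (illustrative only). The limit check is
    performed first, so a fault leaves registers and memory untouched.
    """
    sp = regs["sp"]
    new_sp = sp + offset
    # Pre-indexed forms access memory at the updated address,
    # post-indexed forms access memory at the original SP.
    addr = new_sp if pre_index else sp

    # Assumption: fault if either the access address or the new SP is below
    # the limit. This also covers the case where SP starts below the limit
    # and the writeback would move it back above.
    if min(addr, new_sp) < limit:
        raise StackLimitFault(hex(min(addr, new_sp)))

    # Only reached when the check passed: two 32-bit stores, then writeback.
    mem[addr] = values[0]
    mem[addr + 4] = values[1]
    regs["sp"] = new_sp


if __name__ == "__main__":
    regs, mem = {"sp": 0x1010}, {}
    strd_sp_writeback(regs, mem, limit=0x1000, offset=-8, values=(1, 2))
    print(hex(regs["sp"]), mem)
```

Checking before the first store keeps the model simple: there is no partially completed access to undo when the check fails, which matches the choice the commit message states for QEMU.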