unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2024-12-25 15:05:32 +00:00

Author	SHA1	Message	Date
Richard Henderson	f19b4df20d	target/arm: Replace offset with pc in gen_exception_internal_insn The offset is variable depending on the instruction set. Passing in the actual value is clearer in intent. Backports commit aee828e7541a5895669ade3a4b6978382b6b094a from qemu	2019-11-18 20:05:23 -05:00
Richard Henderson	00fbadf637	target/arm: Replace s->pc with s->base.pc_next We must update s->base.pc_next when we return from the translate_insn hook to the main translator loop. By incrementing s->base.pc_next immediately after reading the insn word, "pc_next" contains the address of the next instruction throughout translation. All remaining uses of s->pc are referencing the address of the next insn, so this is now a simple global replacement. Remove the "s->pc" field. Backports commit a04159166b880b505ccadc16f2fe84169806883d from qemu	2019-11-18 17:32:53 -05:00
Richard Henderson	7d1fcef722	target/arm: Remove redundant s->pc & ~1 The thumb bit has already been removed from s->pc, and is always even. Backports commit 4818c3743b0e0095fdcecd24457da9b3443730ab from qemu	2019-11-18 17:32:53 -05:00
Richard Henderson	a2e60445de	target/arm: Introduce add_reg_for_lit Provide a common routine for the places that require ALIGN(PC, 4) as the base address as opposed to plain PC. The two are always the same for A32, but the difference is meaningful for thumb mode. Backports commit 16e0d8234ef9291747332d2c431e46808a060472 from qemu	2019-11-18 17:32:49 -05:00
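The ALIGN(PC, 4) base described in the entry above can be sketched as follows. This is a minimal illustration, not the QEMU routine: `sketch_lit_base` and its parameters are hypothetical names, and the PC offsets (+8 for A32, +4 for Thumb) are the architectural values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU's add_reg_for_lit): compute the address
 * used by a PC-relative literal access.
 *   pc_curr - address of the instruction being translated
 *   thumb   - true for Thumb encodings, false for A32
 *   ofs     - signed offset encoded in the instruction
 */
static uint32_t sketch_lit_base(uint32_t pc_curr, bool thumb, int32_t ofs)
{
    /* Architectural PC: current insn + 8 in A32, + 4 in Thumb. */
    uint32_t pc = pc_curr + (thumb ? 4u : 8u);

    /* ALIGN(PC, 4): round down to a word boundary, then add the offset. */
    return (pc & ~(uint32_t)3) + (uint32_t)ofs;
}
```

For A32 the rounding is a no-op because instructions are word aligned; for a Thumb instruction at a halfword-only-aligned address it changes the result.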
Richard Henderson	1c0914e58c	target/arm: Introduce read_pc We currently have 3 different ways of computing the architectural value of "PC" as seen in the ARM ARM. The value of s->pc has been incremented past the current insn, but that is all. Thus for a32, PC = s->pc + 4; for t32, PC = s->pc; for t16, PC = s->pc + 2. These differing computations make it impossible at present to unify the various code paths. With the newly introduced s->pc_curr, we can compute the correct value for all cases, using the formula given in the ARM ARM. This changes the behaviour for load_reg() and load_reg_var() when called with reg==15 from a 32-bit Thumb instruction: previously they would have returned the incorrect value of pc_curr + 6, and now they will return the architecturally correct value of PC, which is pc_curr + 4. This will not affect well-behaved guest software, because all of the places we call these functions from T32 code are instructions where using r15 is UNPREDICTABLE. Using the architectural PC value here is more consistent with the T16 and A32 behaviour. Backports commit fdbcf6329d0c2984c55d7019419a72bf8e583c36 from qemu	2019-11-18 17:04:50 -05:00
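The relationship between the three older formulas and the single pc_curr-based formula in the entry above can be checked with a small sketch. The struct and function names here are hypothetical stand-ins, not QEMU's DisasContext or read_pc; only the arithmetic is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the translator state. */
typedef struct {
    uint32_t pc_curr;  /* address of the instruction being translated */
    bool thumb;        /* true for T16/T32, false for A32 */
} SketchCtx;

/* Unified formula: PC is current insn + 8 in A32, + 4 in Thumb. */
static uint32_t sketch_read_pc(const SketchCtx *s)
{
    return s->pc_curr + (s->thumb ? 4u : 8u);
}

/*
 * The older per-encoding formulas, expressed via the old s->pc, which
 * had already been advanced past the current insn of length insn_len.
 */
static uint32_t sketch_old_pc(uint32_t pc_curr, bool thumb, unsigned insn_len)
{
    uint32_t pc_after = pc_curr + insn_len;   /* old s->pc */

    if (!thumb) {
        return pc_after + 4;                  /* a32 */
    }
    return insn_len == 4 ? pc_after           /* t32 */
                         : pc_after + 2;      /* t16 */
}
```

All three older cases agree with the unified formula; the pc_curr + 6 value mentioned for T32 came from applying the t16 adjustment to a 4-byte instruction.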
Richard Henderson	0048f3e887	target/arm: Introduce pc_curr Add a new field to retain the address of the instruction currently being translated. The 32-bit uses are all within subroutines used by a32 and t32. This will become less obvious when t16 support is merged with a32+t32, and having a clear definition will help. Convert aarch64 as well for consistency. Note that there is one instance of a pre-assert fprintf that used the wrong value for the address of the current instruction. Backports commit 43722a6d4f0c92f7e7e1e291580039b0f9789df1 from qemu	2019-11-18 16:58:40 -05:00
Richard Henderson	1aa3c685a8	target/arm: Pass in pc to thumb_insn_is_16bit This function is used in two different contexts, and it will be clearer if the function is given the address to which it applies. Backports commit 331b1ca616cb708db30dab68e3262d286e687f24 from qemu	2019-11-18 16:52:35 -05:00
Peter Maydell	c61e22627d	target/arm: Fix routing of singlestep exceptions When generating an architectural single-step exception we were routing it to the "default exception level", which is to say the same exception level we execute at except that EL0 exceptions go to EL1. This is incorrect because the debug exception level can be configured by the guest for situations such as single stepping of EL0 and EL1 code by EL2. We have to track the target debug exception level in the TB flags, because it is dependent on CPU state like HCR_EL2.TGE and MDCR_EL2.TDE. (That we were previously calling the arm_debug_target_el() function to determine dc->ss_same_el is itself a bug, though one that would only have manifested as incorrect syndrome information.) Since we are out of TB flag bits unless we want to expand into the cs_base field, we share some bits with the M-profile only HANDLER and STACKCHECK bits, since only A-profile has this singlestep. Fixes: https://bugs.launchpad.net/qemu/+bug/1838913 Backports commit 8bd587c1066f4456ddfe611b571d9439a947d74c from qemu	2019-11-18 16:50:15 -05:00
Peter Maydell	3f531fac61	target/arm: Factor out 'generate singlestep exception' function Factor out code to 'generate a singlestep exception', which is currently repeated in four places. To do this we need to also pull the identical copies of the gen_exception() function out of translate-a64.c and translate.c into translate.h. (There is a bug in the code: we're taking the exception to the wrong target EL. This will be simpler to fix if there's only one place to do it.) Backports commit c1d5f50f094ab204accfacc2ee6aafc9601dd5c4 from qemu	2019-11-18 16:47:08 -05:00
Peter Maydell	3fc86e1901	target/arm: Don't abort on M-profile exception return in linux-user mode An attempt to do an exception-return (branch to one of the magic addresses) in linux-user mode for M-profile should behave like a normal branch, because linux-user mode is always going to be in 'handler' mode. This used to work, but we broke it when we added support for the M-profile security extension in commit d02a8698d7ae2bfed. In that commit we allowed even handler-mode calls to magic return values to be checked for and dealt with by causing an EXCP_EXCEPTION_EXIT exception to be taken, because this is needed for the FNC_RETURN return-from-non-secure-function-call handling. For system mode we added a check in do_v7m_exception_exit() to make any spurious calls from Handler mode behave correctly, but forgot that linux-user mode would also be affected. How an attempted return-from-non-secure-function-call in linux-user mode should be handled is not clear -- on real hardware it would result in return to secure code (not to the Linux kernel) which could then handle the error in any way it chose. For QEMU we take the simple approach of treating this erroneous return the same way it would be handled on a CPU without the security extensions -- treat it as a normal branch. The upshot of all this is that for linux-user mode we should never do any of the bx_excret magic, so the code change is simple. This ought to be a weird corner case that only affects broken guest code (because Linux user processes should never be attempting to do exception returns or NS function returns), except that the code that assigns addresses in RAM for the process and stack in our linux-user code does not attempt to avoid this magic address range, so legitimate code attempting to return to a trampoline routine on the stack can fall into this case. 
This change fixes those programs, but we should also look at restricting the range of memory we use for M-profile linux-user guests to the area that would be real RAM in hardware. Backports commit 9027d3fba605d8f6093342ebe4a1da450d374630 from qemu	2019-11-18 16:30:43 -05:00
Peter Maydell	0d89bce217	target/arm: Execute Thumb instructions when their condbits are 0xf Thumb instructions in an IT block are set up to be conditionally executed depending on a set of condition bits encoded into the IT bits of the CPSR/XPSR. The architecture specifies that if the condition bits are 0b1111 this means "always execute" (like 0b1110), not "never execute"; we were treating it as "never execute". (See the ConditionHolds() pseudocode in both the A-profile and M-profile Arm ARM.) This is a bit of an obscure corner case, because the only legal way to get to an 0b1111 set of condbits is to do an exception return which sets the XPSR/CPSR up that way. An IT instruction which encodes a condition sequence that would include an 0b1111 is UNPREDICTABLE, and for v8A the CONSTRAINED UNPREDICTABLE choices for such an IT insn are to NOP, UNDEF, or treat 0b1111 like 0b1110. Add a comment noting that we take the latter option. Backports commit 5529de1e5512c05276825fa8b922147663fd6eac from qemu	2019-08-08 18:07:57 -04:00
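The condition-bit rule in the entry above follows the shape of the ConditionHolds() pseudocode and can be sketched as below. The type and function names are hypothetical; this is an illustration of the rule, not the translator code touched by the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the N, Z, C, V condition flags. */
typedef struct {
    bool n, z, c, v;
} SketchFlags;

/* Evaluate a 4-bit condition code against the flags. */
static bool sketch_condition_holds(unsigned cond, SketchFlags f)
{
    bool result;

    switch ((cond >> 1) & 7) {
    case 0: result = f.z; break;                  /* EQ / NE */
    case 1: result = f.c; break;                  /* CS / CC */
    case 2: result = f.n; break;                  /* MI / PL */
    case 3: result = f.v; break;                  /* VS / VC */
    case 4: result = f.c && !f.z; break;          /* HI / LS */
    case 5: result = f.n == f.v; break;           /* GE / LT */
    case 6: result = f.n == f.v && !f.z; break;   /* GT / LE */
    default: result = true; break;                /* AL */
    }

    /* Bit 0 inverts the test, except for 0b1111, which still means "always". */
    if ((cond & 1) && (cond & 0xf) != 0xf) {
        result = !result;
    }
    return result;
}
```

The guard on the inversion is the point of the fix: without it, 0b1111 would be read as the inverse of 0b1110 and never execute.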
Philippe Mathieu-Daudé	f77b60d7e9	target/arm: Fix coding style issues Since we'll move this code around, fix its style first. Backports commit 9798ac7162c8a720c5d28f4d1fc9e03c7ab4f015 from qemu	2019-08-08 15:05:57 -04:00
Lioncash	76d33b34e1	target/arm: Fix bad patch merge in arm_tr_init_disas_context	2019-08-08 14:37:38 -04:00
Peter Maydell	318a1ddf39	target/arm: Remove unused cpu_F0s, cpu_F0d, cpu_F1s, cpu_F1d Remove the now unused TCG globals cpu_F0s, cpu_F0d, cpu_F1s, cpu_F1d. cpu_M0 is still used by the iwmmxt code, and cpu_V0 and cpu_V1 are used by both iwmmxt and Neon. Backports commit d9eea52c67c04c58ecceba6ffe5a93d1d02051fa from qemu	2019-06-25 18:45:53 -05:00
Peter Maydell	74168c20f2	target/arm: Stop using deprecated functions in NEON_2RM_VCVT_F32_F16 Remove some old constructs from NEON_2RM_VCVT_F32_F16 code: * don't use cpu_F0s * don't use tcg_gen_st_f32 Backports commit b66f6b9981004bbf120b8d17c20f92785179bdf2 from qemu	2019-06-25 18:43:40 -05:00
Peter Maydell	8ae25f6e4c	target/arm: stop using deprecated functions in NEON_2RM_VCVT_F16_F32 Remove some old constructs from NEON_2RM_VCVT_F16_F32 code: * don't use cpu_F0s * don't use tcg_gen_ld_f32 Backports commit 58f2682eee738e8890f9cfe858e0f4f68b00d45d from qemu	2019-06-25 18:39:43 -05:00
Peter Maydell	d419fbc270	target/arm: Stop using cpu_F0s in Neon VCVT fixed-point ops Stop using cpu_F0s in the Neon VCVT fixed-point operations. Backports commit c253dd7832bc6b4e140a0da56410a9336cce05bc from qemu	2019-06-25 18:35:33 -05:00
Peter Maydell	46216ae382	target/arm: Stop using cpu_F0s for Neon f32/s32 VCVT Stop using cpu_F0s for the Neon f32/s32 VCVT operations. Since this is the last user of cpu_F0s in the Neon 2rm-op loop, we can remove the handling code for it too. Backports commit 60737ed5785b9c1c6f1c85575dfdd1e9eec91878 from qemu	2019-06-25 18:32:32 -05:00
Peter Maydell	2fbe9c1d1d	target/arm: Stop using cpu_F0s for NEON_2RM_VRECPE_F and NEON_2RM_VRSQRTE_F Stop using cpu_F0s for NEON_2RM_VRECPE_F and NEON_2RM_VRSQRTE_F. Backports commit 9a011fece7201f8e268c982df8c7836f3335bbe6 from qemu	2019-06-25 18:29:22 -05:00
Peter Maydell	f82ea34369	target/arm: Stop using cpu_F0s for NEON_2RM_VCVT[ANPM][US] Stop using cpu_F0s for the NEON_2RM_VCVT[ANPM][US] ops. Backports commit 30bf0a018f6c706913c8c0ea57b386907f4229be from qemu	2019-06-25 18:28:03 -05:00
Peter Maydell	0d4535bf16	target/arm: Stop using cpu_F0s for NEON_2RM_VRINT* Switch NEON_2RM_VRINT* away from using cpu_F0s. Backports commit 3b52ad1fae804acdc2fdc41b418a65249beae430 from qemu	2019-06-25 18:26:24 -05:00
Peter Maydell	a62cbc7ac5	target/arm: Stop using cpu_F0s for NEON_2RM_VNEG_F Switch NEON_2RM_VNEG_F away from using cpu_F0s. Backports commit cedcc96fc7c8e520a190a010ac97dbb53e57d7d2 from qemu	2019-06-25 18:24:01 -05:00
Peter Maydell	63d7f92eba	target/arm: Stop using cpu_F0s for NEON_2RM_VABS_F Where Neon instructions are floating point operations, we mostly use the old VFP utility functions like gen_vfp_abs() which work on the TCG globals cpu_F0s and cpu_F1s. The Neon for-each-element loop conditionally loads the inputs into either a plain old TCG temporary for most operations or into cpu_F0s for float operations, and similarly stores back either cpu_F0s or the temporary. Switch NEON_2RM_VABS_F away from using cpu_F0s, and update neon_2rm_is_float_op() accordingly. Backports commit fd8a68cdcf81d70eebf866a132e9780d4108da9c from qemu	2019-06-25 18:22:05 -05:00
Peter Maydell	1a0d31c05e	target/arm: Convert float-to-integer VCVT insns to decodetree Convert the float-to-integer VCVT instructions to decodetree. Since these are the last unconverted instructions, we can delete the old decoder structure entirely now. Backports commit 3111bfc2da6ba0c8396dc97ca479942d711c6146 from qemu	2019-06-13 19:40:02 -04:00
Peter Maydell	f6c67559d4	target/arm: Convert VCVT fp/fixed-point conversion insns to decodetree Convert the VCVT (between floating-point and fixed-point) instructions to decodetree. Backports commit e3d6f4290c788e850c64815f0b3e331600a4bcc0 from qemu	2019-06-13 19:35:51 -04:00
Peter Maydell	c66d477359	target/arm: Convert VJCVT to decodetree Convert the VJCVT instruction to decodetree. Backports commit 92073e947487e2109f3dfebfeaa48d6323cbd981 from qemu	2019-06-13 19:31:35 -04:00
Peter Maydell	7be9e6f9b4	target/arm: Convert integer-to-float insns to decodetree Convert the VCVT integer-to-float instructions to decodetree. Backports commit 8fc9d8918cde342c71923e361b9f2193e36ed18b from qemu	2019-06-13 19:20:41 -04:00
Peter Maydell	e0e4f99103	target/arm: Convert double-single precision conversion insns to decodetree Convert the VCVT double/single precision conversion insns to decodetree. Backports commit 6ed7e49c3693ed8411773c4880f42b2932beb12d from qemu	2019-06-13 19:18:01 -04:00
Peter Maydell	ab9d0235ed	target/arm: Convert VFP round insns to decodetree Convert the VFP round-to-integer instructions VRINTR, VRINTZ and VRINTX to decodetree. These instructions were only introduced as part of the "VFP misc" additions in v8A, so we check this. The old decoder's implementation was incorrectly providing them even for v7A CPUs. Backports commit e25155f55dc4abb427a88dfe58bbbc550fe7d643 from qemu	2019-06-13 19:15:05 -04:00
Peter Maydell	9e842a0f2a	target/arm: Convert the VCVT-to-f16 insns to decodetree Convert the VCVTT and VCVTB instructions which convert from f32 and f64 to f16 to decodetree. Since we're no longer constrained to the old decoder's style using cpu_F0s and cpu_F0d we can perform a direct 16 bit store of the right half of the input single-precision register rather than doing a load/modify/store sequence on the full 32 bits. Backports commit cdfd14e86ab0b1ca29a702d13a8e4af2e902a9bf from qemu	2019-06-13 19:03:59 -04:00
Peter Maydell	7d927b2d0e	target/arm: Convert the VCVT-from-f16 insns to decodetree Convert the VCVTT, VCVTB instructions that deal with conversion from half-precision floats to f32 or f64 to decodetree. Since we're no longer constrained to the old decoder's style using cpu_F0s and cpu_F0d we can perform a direct 16 bit load of the right half of the input single-precision register rather than loading the full 32 bits and then doing a separate shift or sign-extension. Backports commit b623d803dda805f07aadcbf098961fde27315c19 from qemu	2019-06-13 19:00:23 -04:00
Peter Maydell	e6cc2616d2	target/arm: Convert VFP comparison insns to decodetree Convert the VFP comparison instructions to decodetree. Note that comparison instructions should not honour the VFP short-vector length and stride information: they are scalar-only operations. This applies to all the 2-operand instructions except for VMOV, VABS, VNEG and VSQRT. (In the old decoder this is implemented via the "if (op == 15 && rn > 3) { veclen = 0; }" check.) Backports commit 386bba2368842fc74388a3c1651c6c0c0c70adbd from qemu	2019-06-13 18:55:53 -04:00
Peter Maydell	a75a3e321f	target/arm: Convert VMOV (register) to decodetree Backports commit 17552b979ebb9848a534c25ebed18a1072710058 from qemu	2019-06-13 18:49:49 -04:00
Peter Maydell	ee30962891	target/arm: Convert VSQRT to decodetree Convert the VSQRT instruction to decodetree. Backports commit b8474540cbce4e2fa45010416375d1bcbe86dc15 from qemu	2019-06-13 18:47:32 -04:00
Peter Maydell	7aea3da6b7	target/arm: Convert VNEG to decodetree Convert the VNEG instruction to decodetree. Backports commit 1882651afdb0ca44f0631192fbe65a71c660d809 from qemu	2019-06-13 18:43:50 -04:00
Peter Maydell	1032d86ad3	target/arm: Convert VABS to decodetree Convert the VFP VABS instruction to decodetree. Unlike the 3-op versions, we don't pass fpst to the VFPGen2OpSPFn or VFPGen2OpDPFn because none of the operations which use this format and support short vectors will need it. Backports commit 90287e22c987e9840704345ed33d237cbe759dd9 from qemu	2019-06-13 18:41:43 -04:00
Peter Maydell	7a16bc6876	target/arm: Convert VMOV (imm) to decodetree Convert the VFP VMOV (immediate) instruction to decodetree. Backports commit b518c753f0b94e14e01e97b4ec42c100dafc0cc2 from qemu	2019-06-13 18:37:58 -04:00
Peter Maydell	0ebb6b8b90	target/arm: Convert VFP fused multiply-add insns to decodetree Convert the VFP fused multiply-add instructions (VFNMA, VFNMS, VFMA, VFMS) to decodetree. Note that in the old decode structure we were implementing these to honour the VFP vector stride/length. These instructions were introduced in VFPv4, and in the v7A architecture they are UNPREDICTABLE if the vector stride or length are non-zero. In v8A they must UNDEF if stride or length are non-zero, like all VFP instructions; we choose to UNDEF always. Backports commit d4893b01d23060845ee3855bc96626e16aad9ab5 from qemu	2019-06-13 18:24:36 -04:00
Peter Maydell	321bcc822b	target/arm: Convert VDIV to decodetree Convert the VDIV instruction to decodetree. Backports commit 519ee7ae31e050eb0ff9ad35c213f0bd7ab1c03e from qemu	2019-06-13 18:19:47 -04:00
Peter Maydell	76c74bc657	target/arm: Convert VSUB to decodetree Convert the VSUB instruction to decodetree. Backports commit 8fec9a119264b7936503abce3c106fad7e3ccb76 from qemu	2019-06-13 18:18:00 -04:00
Peter Maydell	f56f0342ad	target/arm: Convert VADD to decodetree Convert the VADD instruction to decodetree. Backports commit ce28b303716e7eca3f3765bf6776d722ebbe1122 from qemu	2019-06-13 18:15:52 -04:00
Peter Maydell	06584edf61	target/arm: Convert VNMUL to decodetree Convert the VNMUL instruction to decodetree. Backports commit 43c4be1236c105090d134540da1036073d157cd4 from qemu	2019-06-13 18:14:16 -04:00
Peter Maydell	2c5e102017	target/arm: Convert VMUL to decodetree Convert the VMUL instruction to decodetree. Backports commit 88c5188ced60e9f2b8cc3af3b9bc4a8031c8c996 from qemu	2019-06-13 18:12:03 -04:00
Peter Maydell	b26b6a12a2	target/arm: Convert VFP VNMLA to decodetree Convert the VFP VNMLA instruction to decodetree. Backports commit 8a483533adc1bdc2decb8f456dbe930a2d245a8b from qemu	2019-06-13 18:09:57 -04:00
Peter Maydell	638b90de31	target/arm: Convert VFP VNMLS to decodetree Convert the VFP VNMLS instruction to decodetree. Backports commit c54a416cc6d60efbc79dd37aaf0c8918c05b5815 from qemu	2019-06-13 18:06:59 -04:00
Peter Maydell	67ad40ffa4	target/arm: Convert VFP VMLS to decodetree Convert the VFP VMLS instruction to decodetree. Backports commit e7258280d46af4ab6a0cc93ccfe8f6614defb4b7 from qemu	2019-06-13 18:02:37 -04:00
Peter Maydell	edf81eb214	target/arm: Convert VFP VMLA to decodetree Convert the VFP VMLA instruction to decodetree. This is the first of the VFP 3-operand data processing instructions, so we include in this patch the code which loops over the elements for an old-style VFP vector operation. The existing code to do this looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since we are going to be converting instructions one at a time anyway we can take the opportunity to make the new loop use TCG temporaries, which means we can do that conversion one operation at a time rather than needing to do it all in one go. We include an UNDEF check which was missing in the old code: short-vector operations (with stride or length non-zero) were deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec field does not indicate that support for short vectors is present we UNDEF the operations that would use them. (This is a change of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which previously were all incorrectly allowing short-vector operations.) Note that the conversion fixes a bug in the old code for the case of VFP short-vector "mixed scalar/vector operations". These happen where the destination register is in a vector bank but the second operand is in a scalar bank. For example vmla.f64 d10, d1, d16 with length 2 stride 2 is equivalent to the pair of scalar operations vmla.f64 d10, d1, d16; vmla.f64 d8, d3, d16 where the destination and first input register cycle through their vector but the second input is scalar (d16). In the old decoder the gen_vfp_F1_mul() operation uses cpu_F1{s,d} as a temporary output for the multiply, which trashes the second input operand. For the fully-scalar case (where we never do a second iteration) and the fully-vector case (where the loop loads the new second input operand) this doesn't matter, but for the mixed scalar/vector case we will end up using the wrong value for later loop iterations. 
In the new code we use TCG temporaries and so avoid the bug. This bug is present for all the multiply-accumulate insns that operate on short vectors: VMLA, VMLS, VNMLA, VNMLS. Note 2: the expression used to calculate the next register number in the vector bank is not in fact correct; we leave this behaviour unchanged from the old decoder and will fix this bug later in the series. Backports commit 266bd25c485597c94209bfdb3891c1d0c573c164 from qemu	2019-06-13 17:59:16 -04:00
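The mixed scalar/vector example in the entry above (vmla.f64 d10, d1, d16 with length 2, stride 2) can be reproduced with a small sketch. The wraparound within a bank of four double-precision registers is an assumption chosen because it matches the example; it is not necessarily the expression used in this commit, which the message itself says is fixed later. All names are hypothetical.

```c
#include <assert.h>

/* One expanded scalar operation: destination, first and second input. */
typedef struct {
    int vd, vn, vm;
} SketchOp;

/* Assumed stepping rule: advance within a bank of four D registers. */
static int sketch_advance_dreg(int reg, int delta)
{
    return ((reg + delta) & 0x3) | (reg & ~0x3);
}

/*
 * Expand a mixed scalar/vector operation into `len` scalar operations.
 * vd and vn step through their banks; vm is scalar and stays fixed.
 * Returns the number of entries written to out.
 */
static int sketch_expand_mixed(int vd, int vn, int vm,
                               int len, int stride, SketchOp *out)
{
    for (int i = 0; i < len; i++) {
        out[i].vd = vd;
        out[i].vn = vn;
        out[i].vm = vm;
        vd = sketch_advance_dreg(vd, stride);
        vn = sketch_advance_dreg(vn, stride);
    }
    return len;
}
```

Because vm is never reloaded between iterations in the mixed case, any code that overwrites the second input during the first iteration produces a wrong result in the second, which is the bug the message describes.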
Peter Maydell	93fe4cbe9e	target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0d Expand out the sequences in the new decoder VLDR/VSTR/VLDM/VSTM trans functions which perform the memory accesses by going via the TCG globals cpu_F0s and cpu_F0d, to use local TCG temps instead. Backports commit 3993d0407dff7233e42f2251db971e126a0497e9 from qemu	2019-06-13 17:31:28 -04:00
Peter Maydell	ff7042567e	target/arm: Convert the VFP load/store multiple insns to decodetree Convert the VFP load/store multiple insns to decodetree. This includes tightening up the UNDEF checking for pre-VFPv3 CPUs which only have D0-D15: they now UNDEF for any access to D16-D31, not merely when the smallest register in the transfer list is in D16-D31. This conversion does not try to share code between the single precision and the double precision versions; this looks a bit duplicative of code, but it leaves the door open for a future refactoring which gets rid of the use of the "F0" registers by inlining the various functions like gen_vfp_ld() and gen_mov_F0_reg() which are hiding "if (dp) { ... } else { ... }" conditionalisation. Backports commit fa288de272c5c8a66d5eb683b123706a52bc7ad6 from qemu	2019-06-13 17:26:52 -04:00
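The tightened UNDEF check in the entry above can be illustrated by comparing the two rules. This is a sketch under assumptions: the function names and the `has_d32` flag are hypothetical, and the transfer list is modelled as `count` consecutive D registers starting at `first`.

```c
#include <assert.h>
#include <stdbool.h>

/* Old rule: UNDEF only when the lowest register is in D16-D31. */
static bool sketch_undef_old(int first, bool has_d32)
{
    return !has_d32 && first > 15;
}

/* New rule: UNDEF when any register in the list is in D16-D31. */
static bool sketch_undef_new(int first, int count, bool has_d32)
{
    return !has_d32 && first + count > 16;
}
```

A list starting at D14 with four registers reaches D17, so it passes the old check but is rejected by the new one on a CPU that only has D0-D15.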
Peter Maydell	6f0633ce80	target/arm: Convert VFP VLDR and VSTR to decodetree Convert the VFP single load/store insns VLDR and VSTR to decodetree. Backports commit 79b02a3b5231c5b8cd31e50cd549968dd0a05c49 from qemu	2019-06-13 17:22:48 -04:00


238 commits