unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2024-12-26 01:05:32 +00:00

Author	SHA1	Message	Date
Peter Maydell	56b54f361e	target/arm: Allow ARMCPRegInfo read/write functions to throw exceptions Currently the only part of an ARMCPRegInfo which is allowed to cause a CPU exception is the access function, which returns a value indicating that some flavour of UNDEF should be generated. For the ATS system instructions, we would like to conditionally generate exceptions as part of the writefn, because some faults during the page table walk (like external aborts) should cause an exception to be raised rather than returning a value. There are several ways we could do this: * plumb the GETPC() value from the top level set_cp_reg/get_cp_reg helper functions through into the readfn and writefn hooks * add extra readfn_with_ra/writefn_with_ra hooks that take the GETPC() value * require the ATS instructions to provide a dummy accessfn, which serves no purpose except to cause the code generation to emit TCG ops to sync the CPU state * add an ARM_CP_ flag to mark the ARMCPRegInfo as possibly throwing an exception in its read/write hooks, and make the codegen sync the CPU state before calling the hooks if the flag is set This patch opts for the last of these, as it is fairly simple to implement and doesn't require invasive changes like updating the readfn/writefn hook function prototype signature. Backports commit 37ff584c15bc3e1dd2c26b1998f00ff87189538c from qemu	2019-11-20 17:24:37 -05:00
Richard Henderson	87c06b7fae	target/arm: Factor out unallocated_encoding for aarch32 Make this a static function private to translate.c. Thus we can use the same idiom between aarch64 and aarch32 without actually sharing function implementations. Backports commit 1ce21ba1eaf08b22da5925f3e37fc0b4322da858 from qemu	2019-11-18 23:51:45 -05:00
Richard Henderson	1f59a43544	Revert "target/arm: Use unallocated_encoding for aarch32" Despite the fact that the text for the call to gen_exception_insn is identical for aarch64 and aarch32, the implementation inside gen_exception_insn is totally different. This fixes exceptions raised from aarch64. This reverts commit `fb2d3c9a9a`.	2019-11-18 23:49:47 -05:00
Richard Henderson	9d2a3064af	target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word Separate shift + extract low will result in one extra insn for hosts like RISC-V, MIPS, and Sparc. Backports commit 664b7e3b97d6376f3329986c465b3782458b0f8b from qemu	2019-11-18 20:36:19 -05:00
Richard Henderson	93c016a3e7	target/arm: Simplify SMMLA, SMMLAR, SMMLS, SMMLSR All of the inputs to these instructions are 32-bits. Rather than extend each input to 64-bits and then extract the high 32-bits of the output, use tcg_gen_muls2_i32 and other 32-bit generator functions. Backports commit 5f8cd06ebcf57420be8fea4574de2e074de46709 from qemu	2019-11-18 20:31:12 -05:00
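The arithmetic behind this simplification can be sketched in plain C. This is only an illustration, not the TCG code from the commit: `muls2_i32` and `smmla_ref` are hypothetical names, and the rounding flag stands in for the "R" variants. Because the accumulator is added as `acc << 32`, it only ever affects the high word, so the whole computation stays in 32-bit pieces.

```c
#include <stdint.h>

/* Signed 32x32 -> 64 multiply returned as separate low and high words,
 * which is the shape of result a muls2-style operation produces. */
static void muls2_i32(uint32_t *lo, uint32_t *hi, int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;
    *lo = (uint32_t)p;
    *hi = (uint32_t)((uint64_t)p >> 32);
}

/* High word of ((acc << 32) + a * b), optionally rounded by adding
 * 0x80000000 to the 64-bit value before taking the high word. */
static uint32_t smmla_ref(int32_t a, int32_t b, uint32_t acc, int round)
{
    uint32_t lo, hi;

    muls2_i32(&lo, &hi, a, b);
    hi += acc;              /* acc << 32 leaves the low word untouched */
    if (round) {
        hi += lo >> 31;     /* 0x80000000 carries out iff bit 31 of lo is set */
    }
    return hi;
}
```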
Richard Henderson	4a1cc16eef	target/arm: Use tcg_gen_rotri_i32 for gen_swap_half Rotate is the more compact and obvious way to swap 16-bit elements of a 32-bit word. Backports commit adefba76e8bf10dfb342094d2f5debfeedb1a74d from qemu	2019-11-18 20:27:12 -05:00
Richard Henderson	751ab7b24b	target/arm: Use ror32 instead of open-coding the operation The helper function is more documentary, and also already handles the case of rotate by zero. Backports commit dd861b3f29be97a9e3cdb9769dcbc0c7d7825185 from qemu	2019-11-18 20:25:51 -05:00
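A minimal C sketch of the rotate idiom used by this commit and by the gen_swap_half change. The `ror32` below is a local stand-in written for illustration (the helper in the real tree may differ in detail), and `swap_half` is a hypothetical name. Masking the shift counts is what makes a rotate by zero well defined.

```c
#include <stdint.h>

/* Rotate right by n bits; both shift counts are masked so n == 0 is safe. */
static uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Swapping the two 16-bit halves of a word is a rotate by 16. */
static uint32_t swap_half(uint32_t x)
{
    return ror32(x, 16);
}
```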
Richard Henderson	df4c773ed2	target/arm: Remove redundant shift tests The immediate shift generator functions already test for, and eliminate, the case of a shift by zero. Backports commit 464eaa9571fae5867d9aea7d7209c091c8a50223 from qemu	2019-11-18 20:24:39 -05:00
Richard Henderson	4dd30ebfbd	target/arm: Use tcg_gen_deposit_i32 for PKHBT, PKHTB Use deposit as the composite operation to merge the bits from the two inputs. Backports commit d1f8755fc93911f5b27246b1da794542d222fa1b from qemu	2019-11-18 20:22:00 -05:00
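As a rough illustration of why deposit fits these instructions, here is a plain C model. It assumes the usual definitions (PKHBT keeps the bottom half of Rn and the top half of the left-shifted Rm; PKHTB keeps the top half of Rn and the bottom half of the arithmetically right-shifted Rm). The names `deposit32`, `asr32`, `pkhbt` and `pkhtb` are local to this sketch, and shift amounts are restricted to 0..31 for simplicity.

```c
#include <stdint.h>

/* Replace `length` bits of `value` starting at bit `start` with the low
 * bits of `field`. Requires 1 <= length and start + length <= 32. */
static uint32_t deposit32(uint32_t value, unsigned start, unsigned length,
                          uint32_t field)
{
    uint32_t mask = (length == 32 ? 0xffffffffu : ((1u << length) - 1)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* Portable arithmetic shift right for n in 0..31. */
static uint32_t asr32(uint32_t x, unsigned n)
{
    uint32_t sign_fill = (x & 0x80000000u) ? ~(0xffffffffu >> n) : 0;
    return (x >> n) | sign_fill;
}

/* Bottom half from rn, top half from (rm << sh). */
static uint32_t pkhbt(uint32_t rn, uint32_t rm, unsigned sh)
{
    return deposit32(rm << sh, 0, 16, rn);
}

/* Top half from rn, bottom half from (rm ASR sh). */
static uint32_t pkhtb(uint32_t rn, uint32_t rm, unsigned sh)
{
    return deposit32(rn, 0, 16, asr32(rm, sh));
}
```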
Richard Henderson	25ccd28e78	target/arm: Use tcg_gen_extract_i32 for shifter_out_im Extract is a compact combination of shift + and. Backports commit 191f4bfe8d6cf0c7d5cd7f84cd7076e32e3745dd from qemu	2019-11-18 20:19:40 -05:00
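"Shift + and" in C terms, as a small sketch. `extract32` here is written locally for illustration, and `shifter_carry_out` is a hypothetical name for pulling a single bit out of the shifted operand as the carry.

```c
#include <stdint.h>

/* Extract `length` bits of `value` starting at bit `start`.
 * Requires 1 <= length and start + length <= 32. */
static uint32_t extract32(uint32_t value, unsigned start, unsigned length)
{
    return (value >> start) & (0xffffffffu >> (32 - length));
}

/* Carry out taken as the single bit at position `bit` (0..31). */
static uint32_t shifter_carry_out(uint32_t var, unsigned bit)
{
    return extract32(var, bit, 1);
}
```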
Richard Henderson	3d3d56056b	target/arm: Remove helper_double_saturate Replace x = double_saturate(y) with x = add_saturate(y, y). There is no need for a separate more specialized helper. Backports commit 640581a06d14e2d0d3c3ba79b916de6bc43578b0 from qemu	2019-11-18 20:13:21 -05:00
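The equivalence this commit relies on can be checked with a small C model. It is a sketch only: the real helpers also record saturation in the Q flag, which is omitted here, and the function names are reused purely for illustration.

```c
#include <stdint.h>

/* Signed 32-bit saturating add (Q flag side effect omitted). */
static int32_t add_saturate(int32_t a, int32_t b)
{
    int64_t r = (int64_t)a + (int64_t)b;

    if (r > INT32_MAX) {
        return INT32_MAX;
    }
    if (r < INT32_MIN) {
        return INT32_MIN;
    }
    return (int32_t)r;
}

/* Saturating doubling needs no helper of its own: it is y + y. */
static int32_t double_saturate(int32_t y)
{
    return add_saturate(y, y);
}
```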
Richard Henderson	fb2d3c9a9a	target/arm: Use unallocated_encoding for aarch32 Promote this function from aarch64 to fully general use. Use it to unify the code sequences for generating illegal opcode exceptions. Backports commit 3cb36637157088892e9e33ddb1034bffd1251d3b from qemu	2019-11-18 20:10:50 -05:00
Richard Henderson	d562bea784	target/arm: Remove offset argument to gen_exception_bkpt_insn Unlike the other more generic gen_exception{,_internal}_insn interfaces, breakpoints always refer to the current instruction. Backports commit 06bcbda3f64d464b6ecac789bce4bd69f199cd68 from qemu	2019-11-18 20:05:45 -05:00
Richard Henderson	f19b4df20d	target/arm: Replace offset with pc in gen_exception_internal_insn The offset is variable depending on the instruction set. Passing in the actual value is clearer in intent. Backports commit aee828e7541a5895669ade3a4b6978382b6b094a from qemu	2019-11-18 20:05:23 -05:00
Richard Henderson	00fbadf637	target/arm: Replace s->pc with s->base.pc_next We must update s->base.pc_next when we return from the translate_insn hook to the main translator loop. By incrementing s->base.pc_next immediately after reading the insn word, "pc_next" contains the address of the next instruction throughout translation. All remaining uses of s->pc are referencing the address of the next insn, so this is now a simple global replacement. Remove the "s->pc" field. Backports commit a04159166b880b505ccadc16f2fe84169806883d from qemu	2019-11-18 17:32:53 -05:00
Richard Henderson	7d1fcef722	target/arm: Remove redundant s->pc & ~1 The thumb bit has already been removed from s->pc, and is always even. Backports commit 4818c3743b0e0095fdcecd24457da9b3443730ab from qemu	2019-11-18 17:32:53 -05:00
Richard Henderson	a2e60445de	target/arm: Introduce add_reg_for_lit Provide a common routine for the places that require ALIGN(PC, 4) as the base address as opposed to plain PC. The two are always the same for A32, but the difference is meaningful for thumb mode. Backports commit 16e0d8234ef9291747332d2c431e46808a060472 from qemu	2019-11-18 17:32:49 -05:00
Richard Henderson	1c0914e58c	target/arm: Introduce read_pc We currently have 3 different ways of computing the architectural value of "PC" as seen in the ARM ARM. The value of s->pc has been incremented past the current insn, but that is all. Thus for a32, PC = s->pc + 4; for t32, PC = s->pc; for t16, PC = s->pc + 2. These differing computations make it impossible at present to unify the various code paths. With the newly introduced s->pc_curr, we can compute the correct value for all cases, using the formula given in the ARM ARM. This changes the behaviour for load_reg() and load_reg_var() when called with reg==15 from a 32-bit Thumb instruction: previously they would have returned the incorrect value of pc_curr + 6, and now they will return the architecturally correct value of PC, which is pc_curr + 4. This will not affect well-behaved guest software, because all of the places we call these functions from T32 code are instructions where using r15 is UNPREDICTABLE. Using the architectural PC value here is more consistent with the T16 and A32 behaviour. Backports commit fdbcf6329d0c2984c55d7019419a72bf8e583c36 from qemu	2019-11-18 17:04:50 -05:00
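Working through the three cases in this message relative to pc_curr gives one formula: A32 reads PC as pc_curr + 8, and both Thumb encodings read it as pc_curr + 4. The sketch below models that and the ALIGN(PC, 4) base described in the add_reg_for_lit commit. `read_pc_value` and `lit_base` are hypothetical names for this illustration, not the functions from the patches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural PC as seen by the instruction at pc_curr. */
static uint32_t read_pc_value(uint32_t pc_curr, bool thumb)
{
    return pc_curr + (thumb ? 4 : 8);
}

/* Base address for literal accesses: ALIGN(PC, 4).
 * Identical to PC for A32; can differ by 2 in Thumb mode. */
static uint32_t lit_base(uint32_t pc_curr, bool thumb)
{
    return read_pc_value(pc_curr, thumb) & ~(uint32_t)3;
}
```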
Richard Henderson	0048f3e887	target/arm: Introduce pc_curr Add a new field to retain the address of the instruction currently being translated. The 32-bit uses are all within subroutines used by a32 and t32. This will become less obvious when t16 support is merged with a32+t32, and having a clear definition will help. Convert aarch64 as well for consistency. Note that there is one instance of a pre-assert fprintf that used the wrong value for the address of the current instruction. Backports commit 43722a6d4f0c92f7e7e1e291580039b0f9789df1 from qemu	2019-11-18 16:58:40 -05:00
Richard Henderson	1aa3c685a8	target/arm: Pass in pc to thumb_insn_is_16bit This function is used in two different contexts, and it will be clearer if the function is given the address to which it applies. Backports commit 331b1ca616cb708db30dab68e3262d286e687f24 from qemu	2019-11-18 16:52:35 -05:00
Peter Maydell	c61e22627d	target/arm: Fix routing of singlestep exceptions When generating an architectural single-step exception we were routing it to the "default exception level", which is to say the same exception level we execute at except that EL0 exceptions go to EL1. This is incorrect because the debug exception level can be configured by the guest for situations such as single stepping of EL0 and EL1 code by EL2. We have to track the target debug exception level in the TB flags, because it is dependent on CPU state like HCR_EL2.TGE and MDCR_EL2.TDE. (That we were previously calling the arm_debug_target_el() function to determine dc->ss_same_el is itself a bug, though one that would only have manifested as incorrect syndrome information.) Since we are out of TB flag bits unless we want to expand into the cs_base field, we share some bits with the M-profile only HANDLER and STACKCHECK bits, since only A-profile has this singlestep. Fixes: https://bugs.launchpad.net/qemu/+bug/1838913 Backports commit 8bd587c1066f4456ddfe611b571d9439a947d74c from qemu	2019-11-18 16:50:15 -05:00
Peter Maydell	3f531fac61	target/arm: Factor out 'generate singlestep exception' function Factor out code to 'generate a singlestep exception', which is currently repeated in four places. To do this we need to also pull the identical copies of the gen-exception() function out of translate-a64.c and translate.c into translate.h. (There is a bug in the code: we're taking the exception to the wrong target EL. This will be simpler to fix if there's only one place to do it.) Backports commit c1d5f50f094ab204accfacc2ee6aafc9601dd5c4 from qemu	2019-11-18 16:47:08 -05:00
Peter Maydell	3fc86e1901	target/arm: Don't abort on M-profile exception return in linux-user mode An attempt to do an exception-return (branch to one of the magic addresses) in linux-user mode for M-profile should behave like a normal branch, because linux-user mode is always going to be in 'handler' mode. This used to work, but we broke it when we added support for the M-profile security extension in commit d02a8698d7ae2bfed. In that commit we allowed even handler-mode calls to magic return values to be checked for and dealt with by causing an EXCP_EXCEPTION_EXIT exception to be taken, because this is needed for the FNC_RETURN return-from-non-secure-function-call handling. For system mode we added a check in do_v7m_exception_exit() to make any spurious calls from Handler mode behave correctly, but forgot that linux-user mode would also be affected. How an attempted return-from-non-secure-function-call in linux-user mode should be handled is not clear -- on real hardware it would result in return to secure code (not to the Linux kernel) which could then handle the error in any way it chose. For QEMU we take the simple approach of treating this erroneous return the same way it would be handled on a CPU without the security extensions -- treat it as a normal branch. The upshot of all this is that for linux-user mode we should never do any of the bx_excret magic, so the code change is simple. This ought to be a weird corner case that only affects broken guest code (because Linux user processes should never be attempting to do exception returns or NS function returns), except that the code that assigns addresses in RAM for the process and stack in our linux-user code does not attempt to avoid this magic address range, so legitimate code attempting to return to a trampoline routine on the stack can fall into this case. 
This change fixes those programs, but we should also look at restricting the range of memory we use for M-profile linux-user guests to the area that would be real RAM in hardware. Backports commit 9027d3fba605d8f6093342ebe4a1da450d374630 from qemu	2019-11-18 16:30:43 -05:00
Peter Maydell	0d89bce217	target/arm: Execute Thumb instructions when their condbits are 0xf Thumb instructions in an IT block are set up to be conditionally executed depending on a set of condition bits encoded into the IT bits of the CPSR/XPSR. The architecture specifies that if the condition bits are 0b1111 this means "always execute" (like 0b1110), not "never execute"; we were treating it as "never execute". (See the ConditionHolds() pseudocode in both the A-profile and M-profile Arm ARM.) This is a bit of an obscure corner case, because the only legal way to get to an 0b1111 set of condbits is to do an exception return which sets the XPSR/CPSR up that way. An IT instruction which encodes a condition sequence that would include an 0b1111 is UNPREDICTABLE, and for v8A the CONSTRAINED UNPREDICTABLE choices for such an IT insn are to NOP, UNDEF, or treat 0b1111 like 0b1110. Add a comment noting that we take the latter option. Backports commit 5529de1e5512c05276825fa8b922147663fd6eac from qemu	2019-08-08 18:07:57 -04:00
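The rule can be modelled in C along the lines of the ConditionHolds() pseudocode mentioned above. This is an illustrative sketch, not the translator code: the `Flags` struct and `condition_holds` are names made up here. The point is the last step, where the low bit inverts the result for every encoding except 0b1111.

```c
#include <stdbool.h>

/* Hypothetical container for the NZCV condition flags. */
typedef struct {
    bool n, z, c, v;
} Flags;

static bool condition_holds(unsigned cond, Flags f)
{
    bool result;

    cond &= 0xf;
    switch (cond >> 1) {
    case 0: result = f.z; break;                    /* EQ / NE */
    case 1: result = f.c; break;                    /* CS / CC */
    case 2: result = f.n; break;                    /* MI / PL */
    case 3: result = f.v; break;                    /* VS / VC */
    case 4: result = f.c && !f.z; break;            /* HI / LS */
    case 5: result = (f.n == f.v); break;           /* GE / LT */
    case 6: result = (f.n == f.v) && !f.z; break;   /* GT / LE */
    default: result = true; break;                  /* AL */
    }
    /* Odd encodings invert the test, except 0b1111 which stays "always". */
    if ((cond & 1) && cond != 0xf) {
        result = !result;
    }
    return result;
}
```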
Philippe Mathieu-Daudé	f77b60d7e9	target/arm: Fix coding style issues Since we'll move this code around, fix its style first. Backports commit 9798ac7162c8a720c5d28f4d1fc9e03c7ab4f015 from qemu	2019-08-08 15:05:57 -04:00
Lioncash	76d33b34e1	target/arm: Fix bad patch merge in arm_tr_init_disas_context	2019-08-08 14:37:38 -04:00
Peter Maydell	318a1ddf39	target/arm: Remove unused cpu_F0s, cpu_F0d, cpu_F1s, cpu_F1d Remove the now unused TCG globals cpu_F0s, cpu_F0d, cpu_F1s, cpu_F1d. cpu_M0 is still used by the iwmmxt code, and cpu_V0 and cpu_V1 are used by both iwmmxt and Neon. Backports commit d9eea52c67c04c58ecceba6ffe5a93d1d02051fa from qemu	2019-06-25 18:45:53 -05:00
Peter Maydell	74168c20f2	target/arm: Stop using deprecated functions in NEON_2RM_VCVT_F32_F16 Remove some old constructs from NEON_2RM_VCVT_F32_F16 code: * don't use cpu_F0s * don't use tcg_gen_st_f32 Backports commit b66f6b9981004bbf120b8d17c20f92785179bdf2 from qemu	2019-06-25 18:43:40 -05:00
Peter Maydell	8ae25f6e4c	target/arm: stop using deprecated functions in NEON_2RM_VCVT_F16_F32 Remove some old constructs from NEON_2RM_VCVT_F16_F32 code: * don't use cpu_F0s * don't use tcg_gen_ld_f32 Backports commit 58f2682eee738e8890f9cfe858e0f4f68b00d45d from qemu	2019-06-25 18:39:43 -05:00
Peter Maydell	d419fbc270	target/arm: Stop using cpu_F0s in Neon VCVT fixed-point ops Stop using cpu_F0s in the Neon VCVT fixed-point operations. Backports commit c253dd7832bc6b4e140a0da56410a9336cce05bc from qemu	2019-06-25 18:35:33 -05:00
Peter Maydell	46216ae382	target/arm: Stop using cpu_F0s for Neon f32/s32 VCVT Stop using cpu_F0s for the Neon f32/s32 VCVT operations. Since this is the last user of cpu_F0s in the Neon 2rm-op loop, we can remove the handling code for it too. Backports commit 60737ed5785b9c1c6f1c85575dfdd1e9eec91878 from qemu	2019-06-25 18:32:32 -05:00
Peter Maydell	2fbe9c1d1d	target/arm: Stop using cpu_F0s for NEON_2RM_VRECPE_F and NEON_2RM_VRSQRTE_F Stop using cpu_F0s for NEON_2RM_VRECPE_F and NEON_2RM_VRSQRTE_F. Backports commit 9a011fece7201f8e268c982df8c7836f3335bbe6 from qemu	2019-06-25 18:29:22 -05:00
Peter Maydell	f82ea34369	target/arm: Stop using cpu_F0s for NEON_2RM_VCVT[ANPM][US] Stop using cpu_F0s for the NEON_2RM_VCVT[ANPM][US] ops. Backports commit 30bf0a018f6c706913c8c0ea57b386907f4229be from qemu	2019-06-25 18:28:03 -05:00
Peter Maydell	0d4535bf16	target/arm: Stop using cpu_F0s for NEON_2RM_VRINT* Switch NEON_2RM_VRINT* away from using cpu_F0s. Backports commit 3b52ad1fae804acdc2fdc41b418a65249beae430 from qemu	2019-06-25 18:26:24 -05:00
Peter Maydell	a62cbc7ac5	target/arm: Stop using cpu_F0s for NEON_2RM_VNEG_F Switch NEON_2RM_VNEG_F away from using cpu_F0s. Backports commit cedcc96fc7c8e520a190a010ac97dbb53e57d7d2 from qemu	2019-06-25 18:24:01 -05:00
Peter Maydell	63d7f92eba	target/arm: Stop using cpu_F0s for NEON_2RM_VABS_F Where Neon instructions are floating point operations, we mostly use the old VFP utility functions like gen_vfp_abs() which work on the TCG globals cpu_F0s and cpu_F1s. The Neon for-each-element loop conditionally loads the inputs into either a plain old TCG temporary for most operations or into cpu_F0s for float operations, and similarly stores back either cpu_F0s or the temporary. Switch NEON_2RM_VABS_F away from using cpu_F0s, and update neon_2rm_is_float_op() accordingly. Backports commit fd8a68cdcf81d70eebf866a132e9780d4108da9c from qemu	2019-06-25 18:22:05 -05:00
Peter Maydell	1a0d31c05e	target/arm: Convert float-to-integer VCVT insns to decodetree Convert the float-to-integer VCVT instructions to decodetree. Since these are the last unconverted instructions, we can delete the old decoder structure entirely now. Backports commit 3111bfc2da6ba0c8396dc97ca479942d711c6146 from qemu	2019-06-13 19:40:02 -04:00
Peter Maydell	f6c67559d4	target/arm: Convert VCVT fp/fixed-point conversion insns to decodetree Convert the VCVT (between floating-point and fixed-point) instructions to decodetree. Backports commit e3d6f4290c788e850c64815f0b3e331600a4bcc0 from qemu	2019-06-13 19:35:51 -04:00
Peter Maydell	c66d477359	target/arm: Convert VJCVT to decodetree Convert the VJCVT instruction to decodetree. Backports commit 92073e947487e2109f3dfebfeaa48d6323cbd981 from qemu	2019-06-13 19:31:35 -04:00
Peter Maydell	7be9e6f9b4	target/arm: Convert integer-to-float insns to decodetree Convert the VCVT integer-to-float instructions to decodetree. Backports commit 8fc9d8918cde342c71923e361b9f2193e36ed18b from qemu	2019-06-13 19:20:41 -04:00
Peter Maydell	e0e4f99103	target/arm: Convert double-single precision conversion insns to decodetree Convert the VCVT double/single precision conversion insns to decodetree. Backports commit 6ed7e49c3693ed8411773c4880f42b2932beb12d from qemu	2019-06-13 19:18:01 -04:00
Peter Maydell	ab9d0235ed	target/arm: Convert VFP round insns to decodetree Convert the VFP round-to-integer instructions VRINTR, VRINTZ and VRINTX to decodetree. These instructions were only introduced as part of the "VFP misc" additions in v8A, so we check this. The old decoder's implementation was incorrectly providing them even for v7A CPUs. Backports commit e25155f55dc4abb427a88dfe58bbbc550fe7d643 from qemu	2019-06-13 19:15:05 -04:00
Peter Maydell	9e842a0f2a	target/arm: Convert the VCVT-to-f16 insns to decodetree Convert the VCVTT and VCVTB instructions which convert from f32 and f64 to f16 to decodetree. Since we're no longer constrained to the old decoder's style using cpu_F0s and cpu_F0d we can perform a direct 16 bit store of the right half of the input single-precision register rather than doing a load/modify/store sequence on the full 32 bits. Backports commit cdfd14e86ab0b1ca29a702d13a8e4af2e902a9bf from qemu	2019-06-13 19:03:59 -04:00
Peter Maydell	7d927b2d0e	target/arm: Convert the VCVT-from-f16 insns to decodetree Convert the VCVTT, VCVTB instructions that deal with conversion from half-precision floats to f32 or f64 to decodetree. Since we're no longer constrained to the old decoder's style using cpu_F0s and cpu_F0d we can perform a direct 16 bit load of the right half of the input single-precision register rather than loading the full 32 bits and then doing a separate shift or sign-extension. Backports commit b623d803dda805f07aadcbf098961fde27315c19 from qemu	2019-06-13 19:00:23 -04:00
Peter Maydell	e6cc2616d2	target/arm: Convert VFP comparison insns to decodetree Convert the VFP comparison instructions to decodetree. Note that comparison instructions should not honour the VFP short-vector length and stride information: they are scalar-only operations. This applies to all the 2-operand instructions except for VMOV, VABS, VNEG and VSQRT. (In the old decoder this is implemented via the "if (op == 15 && rn > 3) { veclen = 0; }" check.) Backports commit 386bba2368842fc74388a3c1651c6c0c0c70adbd from qemu	2019-06-13 18:55:53 -04:00
Peter Maydell	a75a3e321f	target/arm: Convert VMOV (register) to decodetree Backports commit 17552b979ebb9848a534c25ebed18a1072710058 from qemu	2019-06-13 18:49:49 -04:00
Peter Maydell	ee30962891	target/arm: Convert VSQRT to decodetree Convert the VSQRT instruction to decodetree. Backports commit b8474540cbce4e2fa45010416375d1bcbe86dc15 from qemu	2019-06-13 18:47:32 -04:00
Peter Maydell	7aea3da6b7	target/arm: Convert VNEG to decodetree Convert the VNEG instruction to decodetree. Backports commit 1882651afdb0ca44f0631192fbe65a71c660d809 from qemu	2019-06-13 18:43:50 -04:00
Peter Maydell	1032d86ad3	target/arm: Convert VABS to decodetree Convert the VFP VABS instruction to decodetree. Unlike the 3-op versions, we don't pass fpst to the VFPGen2OpSPFn or VFPGen2OpDPFn because none of the operations which use this format and support short vectors will need it. Backports commit 90287e22c987e9840704345ed33d237cbe759dd9 from qemu	2019-06-13 18:41:43 -04:00
Peter Maydell	7a16bc6876	target/arm: Convert VMOV (imm) to decodetree Convert the VFP VMOV (immediate) instruction to decodetree. Backports commit b518c753f0b94e14e01e97b4ec42c100dafc0cc2 from qemu	2019-06-13 18:37:58 -04:00
Peter Maydell	0ebb6b8b90	target/arm: Convert VFP fused multiply-add insns to decodetree Convert the VFP fused multiply-add instructions (VFNMA, VFNMS, VFMA, VFMS) to decodetree. Note that in the old decode structure we were implementing these to honour the VFP vector stride/length. These instructions were introduced in VFPv4, and in the v7A architecture they are UNPREDICTABLE if the vector stride or length are non-zero. In v8A they must UNDEF if stride or length are non-zero, like all VFP instructions; we choose to UNDEF always. Backports commit d4893b01d23060845ee3855bc96626e16aad9ab5 from qemu	2019-06-13 18:24:36 -04:00
Peter Maydell	321bcc822b	target/arm: Convert VDIV to decodetree Convert the VDIV instruction to decodetree. Backports commit 519ee7ae31e050eb0ff9ad35c213f0bd7ab1c03e from qemu	2019-06-13 18:19:47 -04:00
Peter Maydell	76c74bc657	target/arm: Convert VSUB to decodetree Convert the VSUB instruction to decodetree. Backports commit 8fec9a119264b7936503abce3c106fad7e3ccb76 from qemu.	2019-06-13 18:18:00 -04:00
Peter Maydell	f56f0342ad	target/arm: Convert VADD to decodetree Convert the VADD instruction to decodetree. Backports commit ce28b303716e7eca3f3765bf6776d722ebbe1122 from qemu	2019-06-13 18:15:52 -04:00
Peter Maydell	06584edf61	target/arm: Convert VNMUL to decodetree Convert the VNMUL instruction to decodetree. Backports commit 43c4be1236c105090d134540da1036073d157cd4 from qemu	2019-06-13 18:14:16 -04:00
Peter Maydell	2c5e102017	target/arm: Convert VMUL to decodetree Convert the VMUL instruction to decodetree. Backports commit 88c5188ced60e9f2b8cc3af3b9bc4a8031c8c996 from qemu	2019-06-13 18:12:03 -04:00
Peter Maydell	b26b6a12a2	target/arm: Convert VFP VNMLA to decodetree Convert the VFP VNMLA instruction to decodetree. Backports commit 8a483533adc1bdc2decb8f456dbe930a2d245a8b from qemu	2019-06-13 18:09:57 -04:00
Peter Maydell	638b90de31	target/arm: Convert VFP VNMLS to decodetree Convert the VFP VNMLS instruction to decodetree. Backports commit c54a416cc6d60efbc79dd37aaf0c8918c05b5815 from qemu	2019-06-13 18:06:59 -04:00
Peter Maydell	67ad40ffa4	target/arm: Convert VFP VMLS to decodetree Convert the VFP VMLS instruction to decodetree. Backports commit e7258280d46af4ab6a0cc93ccfe8f6614defb4b7 from qemu	2019-06-13 18:02:37 -04:00
Peter Maydell	edf81eb214	target/arm: Convert VFP VMLA to decodetree Convert the VFP VMLA instruction to decodetree. This is the first of the VFP 3-operand data processing instructions, so we include in this patch the code which loops over the elements for an old-style VFP vector operation. The existing code to do this looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since we are going to be converting instructions one at a time anyway we can take the opportunity to make the new loop use TCG temporaries, which means we can do that conversion one operation at a time rather than needing to do it all in one go. We include an UNDEF check which was missing in the old code: short-vector operations (with stride or length non-zero) were deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec field does not indicate that support for short vectors is present we UNDEF the operations that would use them. (This is a change of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which previously were all incorrectly allowing short-vector operations.) Note that the conversion fixes a bug in the old code for the case of VFP short-vector "mixed scalar/vector operations". These happen where the destination register is in a vector bank but the second operand is in a scalar bank. For example vmla.f64 d10, d1, d16 with length 2 stride 2 is equivalent to the pair of scalar operations vmla.f64 d10, d1, d16 vmla.f64 d8, d3, d16 where the destination and first input register cycle through their vector but the second input is scalar (d16). In the old decoder the gen_vfp_F1_mul() operation uses cpu_F1{s,d} as a temporary output for the multiply, which trashes the second input operand. For the fully-scalar case (where we never do a second iteration) and the fully-vector case (where the loop loads the new second input operand) this doesn't matter, but for the mixed scalar/vector case we will end up using the wrong value for later loop iterations.
In the new code we use TCG temporaries and so avoid the bug. This bug is present for all the multiply-accumulate insns that operate on short vectors: VMLA, VMLS, VNMLA, VNMLS. Note 2: the expression used to calculate the next register number in the vector bank is not in fact correct; we leave this behaviour unchanged from the old decoder and will fix this bug later in the series. Backports commit 266bd25c485597c94209bfdb3891c1d0c573c164 from qemu	2019-06-13 17:59:16 -04:00
Peter Maydell	93fe4cbe9e	target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0d Expand out the sequences in the new decoder VLDR/VSTR/VLDM/VSTM trans functions which perform the memory accesses by going via the TCG globals cpu_F0s and cpu_F0d, to use local TCG temps instead. Backports commit 3993d0407dff7233e42f2251db971e126a0497e9 from qemu	2019-06-13 17:31:28 -04:00
Peter Maydell	ff7042567e	target/arm: Convert the VFP load/store multiple insns to decodetree Convert the VFP load/store multiple insns to decodetree. This includes tightening up the UNDEF checking for pre-VFPv3 CPUs which only have D0-D15 : they now UNDEF for any access to D16-D31, not merely when the smallest register in the transfer list is in D16-D31. This conversion does not try to share code between the single precision and the double precision versions; this looks a bit duplicative of code, but it leaves the door open for a future refactoring which gets rid of the use of the "F0" registers by inlining the various functions like gen_vfp_ld() and gen_mov_F0_reg() which are hiding "if (dp) { ... } else { ... }" conditionalisation. Backports commit fa288de272c5c8a66d5eb683b123706a52bc7ad6 from qemu	2019-06-13 17:26:52 -04:00
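The tightened check can be pictured as a range test over the whole transfer list instead of only its first register. The function below is a hypothetical sketch of just that idea (the name, parameters and the separate "empty or past D31" rejection are assumptions for illustration, not the code from the commit).

```c
#include <stdbool.h>

/* vd: first D register in the list, n: number of D registers transferred,
 * has_d16_d31: whether the CPU implements D16-D31.
 * Returns false where this sketch would treat the access as invalid. */
static bool dp_reglist_ok(unsigned vd, unsigned n, bool has_d16_d31)
{
    if (n == 0 || vd + n > 32) {
        return false;           /* empty list or running past D31 */
    }
    if (!has_d16_d31 && vd + n > 16) {
        return false;           /* any register in D16-D31, not just the first */
    }
    return true;
}
```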
Peter Maydell	6f0633ce80	target/arm: Convert VFP VLDR and VSTR to decodetree Convert the VFP single load/store insns VLDR and VSTR to decodetree. Backports commit 79b02a3b5231c5b8cd31e50cd549968dd0a05c49 from qemu	2019-06-13 17:22:48 -04:00
Peter Maydell	fe98885ff2	target/arm: Convert VFP two-register transfer insns to decodetree Convert the VFP two-register transfer instructions to decodetree (in the v8 Arm ARM these are the "Advanced SIMD and floating-point 64-bit move" encoding group). Again, we expand out the sequences involving gen_vfp_msr() and gen_msr_vfp(). Backports commit 81f681106eabe21c55118a5a41999fb7387fb714 from qemu	2019-06-13 17:20:00 -04:00
Peter Maydell	3fb3403b82	target/arm: Convert single-precision register moves to decodetree Convert the "single-precision" register moves to decodetree: * VMSR * VMRS * VMOV between general purpose register and single precision Note that the VMSR/VMRS conversions make our handling of the "should this UNDEF?" checks consistent between the two instructions: * VMSR to MVFR0, MVFR1, MVFR2 now UNDEF from EL0 (previously was a nop) * VMSR to FPSID now UNDEFs from EL0 or if VFPv3 or better (previously was a nop) * VMSR to FPINST and FPINST2 now UNDEF if VFPv3 or better (previously would write to the register, which had no guest-visible effect because we always UNDEF reads) We also tighten up the decode: we were previously underdecoding some SBZ or SBO bits. The conversion of VMOV_single includes the expansion out of the gen_mov_F0_vreg()/gen_vfp_mrs() and gen_mov_vreg_F0()/gen_vfp_msr() sequences into the simpler direct load/store of the TCG temp via neon_{load,store}_reg32(): we know in the new function that we're always single-precision, we don't need to use the old-and-deprecated cpu_F0* TCG globals, and we don't happen to have the declaration of gen_vfp_msr() and gen_vfp_mrs() at the point in the file where the new function is. Backports commit a9ab50011aeda2dd012da99069e078379315ea18 from qemu	2019-06-13 17:16:38 -04:00
Peter Maydell	694058da94	target/arm: Convert double-precision register moves to decodetree Convert the "double-precision" register moves to decodetree: this covers VMOV scalar-to-gpreg, VMOV gpreg-to-scalar and VDUP. Note that the conversion process has tightened up a few of the UNDEF encoding checks: we now correctly forbid: * VMOV-to-gpr with U:opc1:opc2 == 10x00 or x0x10 * VMOV-from-gpr with opc1:opc2 == 0x10 * VDUP with B:E == 11 * VDUP with Q == 1 and Vn<0> == 1 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> --- The accesses of elements < 32 bits could be improved by doing direct ld/st of the right size rather than 32-bit read-and-shift or read-modify-write, but we leave this for later cleanup, since this series is generally trying to stick to fixing the decode. Backports commit 9851ed9269d214c0c6feba960dd14ff09e6c34b4 from qemu	2019-06-13 17:11:56 -04:00
Peter Maydell	7265161108	target/arm: Add helpers for VFP register loads and stores The current VFP code has two different idioms for loading and storing from the VFP register file: (1) using the gen_mov_F0_vreg() and similar functions, which load and store to a fixed set of TCG globals cpu_F0s, cpu_F0d, etc; (2) by direct calls to tcg_gen_ld_f64() and friends. We want to phase out idiom 1 (because the use of the fixed globals is a relic of a much older version of TCG), but idiom 2 is quite longwinded: tcg_gen_ld_f64(tmp, cpu_env, vfp_reg_offset(true, reg)) requires us to specify the 64-bitness twice, once in the function name and once by passing 'true' to vfp_reg_offset(). There's no guard against accidentally passing the wrong flag. Instead, let's move to a convention of accessing 64-bit registers via the existing neon_load_reg64() and neon_store_reg64(), and provide new neon_load_reg32() and neon_store_reg32() for the 32-bit equivalents. Implement the new functions and use them in the code in translate-vfp.inc.c. We will convert the rest of the VFP code as we do the decodetree conversion in subsequent commits. Backports commit 160f3b64c5cc4c8a09a1859edc764882ce6ad6bf from qemu	2019-06-13 17:01:59 -04:00
Peter Maydell	033a386ffb	target/arm: Move the VFP trans_* functions to translate-vfp.inc.c Move the trans_*() functions we've just created from translate.c to translate-vfp.inc.c. This is pure code motion with no textual changes (this can be checked with 'git show --color-moved'). Backports commit f7bbb8f31f0761edbf0c64b7ab3c3f49c13612ea from qemu	2019-06-13 16:56:24 -04:00
Peter Maydell	e55d31a5ac	target/arm: Convert VCVTA/VCVTN/VCVTP/VCVTM to decodetree Convert the VCVTA/VCVTN/VCVTP/VCVTM instructions to decodetree. trans_VCVT() is temporarily left in translate.c. Backports commit c2a46a914cd5c38fd0ee57ff0befc1c5bde27bcf from qemu	2019-06-13 16:54:42 -04:00
Peter Maydell	9fb01cb526	target/arm: Convert VRINTA/VRINTN/VRINTP/VRINTM to decodetree Convert the VRINTA/VRINTN/VRINTP/VRINTM instructions to decodetree. Again, trans_VRINT() is temporarily left in translate.c. Backports commit e3bb599d16e4678b228d80194cee328f894b1ceb from qemu	2019-06-13 16:50:36 -04:00
Peter Maydell	4501daf010	target/arm: Convert VMINNM, VMAXNM to decodetree Convert the VMINNM and VMAXNM instructions to decodetree. As with VSEL, we leave the trans_VMINMAXNM() function in translate.c for the moment. Backports commit f65988a1efdb42f9058db44297591491842e697c from qemu	2019-06-13 16:43:50 -04:00
Peter Maydell	3994dfd079	target/arm: Convert the VSEL instructions to decodetree Convert the VSEL instructions to decodetree. We leave trans_VSEL() in translate.c for now as this allows the patch to show just the changes from the old handle_vsel(). In the old code the check for "do D16-D31 exist" was hidden in the VFP_DREG macro, and assumed that VFPv3 always implied that D16-D31 exist. In the new code we do the correct ID register test. This gives identical behaviour for most of our CPUs, and fixes previously incorrect handling for Cortex-R5F, Cortex-M4 and Cortex-M33, which all implement VFPv3 or better with only 16 double-precision registers. Backports commit b3ff4b87b4ae08120a51fe12592725e1dca8a085 from qemu	2019-06-13 16:41:22 -04:00
Lioncash	b3cfede44f	target/arm: Make load_cpu_offset() take a DisasContext* instead of uc_struct* Keeps it consistent with store_cpu_offset	2019-06-13 16:35:31 -04:00
Peter Maydell	78997058e4	target/arm: Factor out VFP access checking code Factor out the VFP access checking code so that we can use it in the leaf functions of the decodetree decoder. We call the function full_vfp_access_check() so we can keep the more natural vfp_access_check() for a version which doesn't have the 'ignore_vfp_enabled' flag -- that way almost all VFP insns will be able to use vfp_access_check(s) and only the special-register access function will have to use full_vfp_access_check(s, ignore_vfp_enabled). Backports commit 06db8196bba34776829020192ed623a0b22e6557 from qemu	2019-06-13 16:33:38 -04:00
Peter Maydell	9732ebba5c	target/arm: Add stubs for AArch32 VFP decodetree Add the infrastructure for building and invoking a decodetree decoder for the AArch32 VFP encodings. At the moment the new decoder covers nothing, so we always fall back to the existing hand-written decode. We need to have one decoder for the unconditional insns and one for the conditional insns, as otherwise the patterns for conditional insns would incorrectly match against the unconditional ones too. Since translate.c is over 14,000 lines long and we're going to be touching pretty much every line of the VFP code as part of the decodetree conversion, we create a new translate-vfp.inc.c to hold the code which deals with VFP in the new scheme. It should be possible to convert this into a standalone translation unit eventually, but the conversion process will be much simpler if we simply #include it midway through translate.c to start with. Backports commit 78e138bc1f672c145ef6ace74617db00eebaa2ba from qemu	2019-06-13 16:24:37 -04:00
Richard Henderson	7c32498b7f	target/arm: Use tcg_gen_gvec_bitsel This replaces 3 target-specific implementations for BIT, BIF, and BSL. Backports commit 3a7a2b4e5cf0d49cd8b14e8225af0310068b7d20 from qemu	2019-06-13 16:12:56 -04:00
Richard Henderson	b8bd543390	target/arm: Use env_cpu, env_archcpu Cleanup in the boilerplate that each target must define. Replace arm_env_get_cpu with env_archcpu. The combination CPU(arm_env_get_cpu) should have used ENV_GET_CPU to begin; use env_cpu now. Backports commit 2fc0cc0e1e034582f4718b1a2d57691474ccb6aa from qemu	2019-06-12 11:34:08 -04:00
Alistair Francis	f8f3e50372	target/arm: Fix vector operation segfault Commit 89e68b575 "target/arm: Use vector operations for saturation" causes this abort() when booting QEMU ARM with a Cortex-A15: 0 0x00007ffff4c2382f in raise () at /usr/lib/libc.so.6 1 0x00007ffff4c0e672 in abort () at /usr/lib/libc.so.6 2 0x00005555559c1839 in disas_neon_data_insn (insn=<optimized out>, s=<optimized out>) at ./target/arm/translate.c:6673 3 0x00005555559c1839 in disas_neon_data_insn (s=<optimized out>, insn=<optimized out>) at ./target/arm/translate.c:6386 4 0x00005555559cd8a4 in disas_arm_insn (insn=4081107068, s=0x7fffe59a9510) at ./target/arm/translate.c:9289 5 0x00005555559cd8a4 in arm_tr_translate_insn (dcbase=0x7fffe59a9510, cpu=<optimized out>) at ./target/arm/translate.c:13612 6 0x00005555558d1d39 in translator_loop (ops=0x5555561cc580 <arm_translator_ops>, db=0x7fffe59a9510, cpu=0x55555686a2f0, tb=<optimized out>, max_insns=<optimized out>) at ./accel/tcg/translator.c:96 7 0x00005555559d10d4 in gen_intermediate_code (cpu=cpu@entry=0x55555686a2f0, tb=tb@entry=0x7fffd7840080 <code_gen_buffer+126091347>, max_insns=max_insns@entry=512) at ./target/arm/translate.c:13901 8 0x00005555558d06b9 in tb_gen_code (cpu=cpu@entry=0x55555686a2f0, pc=3067096216, cs_base=0, flags=192, cflags=-16252928, cflags@entry=524288) at ./accel/tcg/translate-all.c:1736 9 0x00005555558ce467 in tb_find (cf_mask=524288, tb_exit=1, last_tb=0x7fffd783e640 <code_gen_buffer+126084627>, cpu=0x1) at ./accel/tcg/cpu-exec.c:407 10 0x00005555558ce467 in cpu_exec (cpu=cpu@entry=0x55555686a2f0) at ./accel/tcg/cpu-exec.c:728 11 0x000055555588b0cf in tcg_cpu_exec (cpu=0x55555686a2f0) at ./cpus.c:1431 12 0x000055555588d223 in qemu_tcg_cpu_thread_fn (arg=0x55555686a2f0) at ./cpus.c:1735 13 0x000055555588d223 in qemu_tcg_cpu_thread_fn (arg=arg@entry=0x55555686a2f0) at ./cpus.c:1709 14 0x0000555555d2629a in qemu_thread_start (args=<optimized out>) at ./util/qemu-thread-posix.c:502 15 0x00007ffff4db8a92 in start_thread () at /usr/lib/libpthread. This patch ensures that we don't hit the abort() in the second switch case in disas_neon_data_insn() as we will return from the first case. Backports commit 2f143d3ad1c05e91cf2cdf5de06d59a80a95e6c8 from qemu	2019-05-24 18:02:32 -04:00
Richard Henderson	552e48f14e	target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs Backports commit 4e027a710673f5d4dc6cff88728bcfd32e4c47b0 from qemu	2019-05-16 16:43:02 -04:00
Richard Henderson	6d1730048d	tcg: Add support for integer absolute value Remove a function of the same name from target/arm/. Use a branchless implementation of abs gleaned from gcc. Backports commit ff1f11f7f8710a768f9313f24bd7f509d3db27e5 from qemu	2019-05-16 16:25:15 -04:00
Richard Henderson	c54b2776f6	tcg: Specify optional vector requirements with a list Replace the single opcode in .opc with a null-terminated array in .opt_opc. We still require that all opcodes be used with the same .vece. Validate the contents of this list with CONFIG_DEBUG_TCG. All tcg_gen_*_vec functions will check any list active during .fniv expansion. Swap the active list in and out as we expand other opcodes, or take control away from the front-end function. Convert all existing vector aware front ends. Backports commit 53229a7703eeb2bbe101a19a33ef22aaf960c65b from qemu	2019-05-16 15:05:02 -04:00
Emilio G. Cota	1715f382b4	target/arm: check CF_PARALLEL instead of parallel_cpus Thereby decoupling the resulting translated code from the current state of the system. Backports commit 2399d4e7cec22ecf1c51062d2ebfd45220dbaace from qemu	2019-05-04 22:44:32 -04:00
Peter Maydell	77ae3982b4	target/arm: Implement VLLDM for v7M CPUs with an FPU Implement the VLLDM instruction for v7M for the FPU present case. Backports commit 956fe143b4f254356496a0a1c479fa632376dfec from qemu	2019-04-30 11:27:54 -04:00
Peter Maydell	b483951046	target/arm: Implement VLSTM for v7M CPUs with an FPU Implement the VLSTM instruction for v7M for the FPU present case. Backports commit 019076b036da4444494de38388218040d9d3a26c from qemu	2019-04-30 11:25:44 -04:00
Peter Maydell	a976d7642a	target/arm: Implement M-profile lazy FP state preservation The M-profile architecture floating point system supports lazy FP state preservation, where FP registers are not pushed to the stack when an exception occurs but are instead only saved if and when the first FP instruction in the exception handler is executed. Implement this in QEMU, corresponding to the check of LSPACT in the pseudocode ExecuteFPCheck(). Backports commit e33cf0f8d8c9998a7616684f9d6aa0d181b88803 from qemu	2019-04-30 11:21:50 -04:00
Peter Maydell	719231b4c0	target/arm: Activate M-profile floating point context when FPCCR.ASPEN is set The M-profile FPCCR.ASPEN bit indicates that automatic floating-point context preservation is enabled. Before executing any floating-point instruction, if FPCCR.ASPEN is set and the CONTROL FPCA/SFPA bits indicate that there is no active floating point context then we must create a new context (by initializing FPSCR and setting FPCA/SFPA to indicate that the context is now active). In the pseudocode this is handled by ExecuteFPCheck(). Implement this with a new TB flag which tracks whether we need to create a new FP context. Backports commit 6000531e19964756673a5f4b694a649ef883605a from qemu	2019-04-30 10:51:31 -04:00
Peter Maydell	87c8c0fde7	target/arm: Set FPCCR.S when executing M-profile floating point insns The M-profile FPCCR.S bit indicates the security status of the floating point context. In the pseudocode ExecuteFPCheck() function it is unconditionally set to match the current security state whenever a floating point instruction is executed. Implement this by adding a new TB flag which tracks whether FPCCR.S is different from the current security state, so that we only need to emit the code to update it in the less-common case when it is not already set correctly. Note that we will add the handling for the other work done by ExecuteFPCheck() in later commits. Backports commit 6d60c67a1a03be32c3342aff6604cdc5095088d1 from qemu	2019-04-30 10:50:17 -04:00
Peter Maydell	8d726490ff	target/arm: Overlap VECSTRIDE and XSCALE_CPAR TB flags We are close to running out of TB flags for AArch32; we could start using the cs_base word, but before we do that we can economise on our usage by sharing the same bits for the VFP VECSTRIDE field and the XScale XSCALE_CPAR field. This works because no XScale CPU ever had VFP. Backports commit ea7ac69d124c94c6e5579145e727adec9ccbefef from qemu	2019-04-30 10:45:14 -04:00
Peter Maydell	89baa5cffa	target/arm: Decode FP instructions for M profile Correct the decode of the M-profile "coprocessor and floating-point instructions" space: * op0 == 0b11 is always unallocated * if the CPU has an FPU then all insns with op1 == 0b101 are floating point and go to disas_vfp_insn() For the moment we leave VLLDM and VLSTM as NOPs; in a later commit we will fill in the proper implementation for the case where an FPU is present. Backports commit 8859ba3c9625e7ceb5599f457a344bcd7c5e112b from qemu	2019-04-30 10:19:45 -04:00
Peter Maydell	18bb21c035	target/arm: Honour M-profile FP enable bits Like AArch64, M-profile floating point has no FPEXC enable bit to gate floating point; so always set the VFPEN TB flag. M-profile also has CPACR and NSACR similar to A-profile; they behave slightly differently: * the CPACR is banked between Secure and Non-Secure * if the NSACR forces a trap then this is taken to the Secure state, not the Non-Secure state Honour the CPACR and NSACR settings. The NSACR handling requires us to borrow the exception.target_el field (usually meaningless for M profile) to distinguish the NOCP UsageFault taken to Secure state from the more usual fault taken to the current security state. Backports commit d87513c0abcbcd856f8e1dee2f2d18903b2c3ea2 from qemu	2019-04-30 10:18:21 -04:00
Peter Maydell	c6bb8d483d	target/arm: Disable most VFP sysregs for M-profile The only "system register" that M-profile floating point exposes via the VMRS/VMSR instructions is FPSCR, and it does not have the odd special case for rd==15. Add a check to ensure we only expose FPSCR. Backports commit ef9aae2522c22c05df17dd898099dd5c3f20d688 from qemu	2019-04-30 10:15:25 -04:00
Richard Henderson	bca82cde84	tcg: Hoist max_insns computation to tb_gen_code In order to handle TB's that translate to too much code, we need to place the control of the length of the translation in the hands of the code gen master loop. Backports commit 8b86d6d25807e13a63ab6ea879f976b9f18cc45a from qemu	2019-04-30 09:49:57 -04:00
Lioncash	c3df12e534	target/arm/translate: Synchronize with Qemu	2019-04-27 10:13:01 -04:00
Lioncash	d844d7cc9d	exec: Backport tb_cflags accessor	2019-04-22 06:12:59 -04:00
Lioncash	5968b3d96f	target/arm: Synchronize with qemu	2019-04-19 15:31:18 -04:00
Lioncash	bf6dfeb175	target/arm/translate: Synchronize with qemu Backports a few other missing pieces from mainline qemu.	2019-04-18 06:22:36 -04:00
Lioncash	5b062dacf2	target/arm: Simplify and correct thumb instruction tracing This wasn't subtracting the size of the instruction off the PC like how the ARM mode tracing was performing the tracing. This simplifies it and makes the behavior identical.	2019-04-18 06:00:15 -04:00
Lioncash	5d6ddec7fb	target/arm/translate: Subtract PC value properly for thumb tracecode calls	2019-04-18 05:44:48 -04:00
Lioncash	3521e72580	target/arm: Synchronize with qemu Synchronizes with bits and pieces that were missed due to merging incorrectly (sorry :<)	2019-04-18 04:49:11 -04:00
Lioncash	ddcf400955	arm: Always enable access to coprocessors initially Allows non-AArch64 environments to always access coprocessors initially. Removes the need to do avoidable register management when testing floating-point code.	2019-04-13 19:49:43 -04:00
Richard Henderson	45c297c99b	target/arm: Add set/clear_pstate_bits, share gen_ss_advance We do not need an out-of-line helper for manipulating bits in pstate. While changing things, share the implementation of gen_ss_advance. Backports commit 22ac3c49641f6eed93dca5b852030b4d3eacf6c4 from qemu	2019-03-05 22:55:22 -05:00
Richard Henderson	1721e429c2	target/arm: Implement ARMv8.0-SB Backports commit 9888bd1e20425dfe4dcca5dcd1ca2fac8e90ad19 from qemu	2019-03-05 22:35:16 -05:00
Richard Henderson	fa70a2bc69	target/arm: Fix PC test for LDM (exception return) Found by inspection: Rn is the base register against which the load began; I is the register within the mask being processed. The exception return should of course be processed from the loaded PC. Backports commit 9d090d17234058f55c3c439d285db78c94d7d4de from qemu	2019-03-05 22:27:38 -05:00
Lioncash	0868015992	target/arm: Move TCGContext variable within arm_post_translate_insn into a narrower scope This is only used within the scope of the if statement, so we can just move it there.	2019-02-28 18:53:33 -05:00
Lioncash	15440a83c5	target/arm: Fix execution of ARM instructions Previously we'd be checking prior to the actual decoding if we were at the ending address. This worked fine using the old model of the translation process in qemu. However, this causes the wrong behavior to occur in both ARM and Thumb/Thumb-2 modes using the newer translator model. Given the translator itself checks for the end address already, this needs to be placed within arm_post_translate_insn(). This prevents the emulation process being off-by-one as well when it comes to actually executing the instructions.	2019-02-28 18:49:22 -05:00
Richard Henderson	4ae3ff8e61	target/arm: Implement VFMAL and VFMSL for aarch32 Backports commit 87732318c5d68a366fc2d6fc394d9c20412099fa from qemu	2019-02-28 15:44:59 -05:00
Peter Maydell	82b8e97f76	target/arm: Gate "miscellaneous FP" insns by ID register field There is a set of VFP instructions which we implement in disas_vfp_v8_insn() and gate on the ARM_FEATURE_V8 bit. These were all first introduced in v8 for A-profile, but in M-profile they appeared in v7M. Gate them on the MVFR2 FPMisc field instead, and rename the function appropriately. Backports commit c0c760afe800b60b48c80ddf3509fec413594778 from qemu	2019-02-28 15:26:27 -05:00
Peter Maydell	118a2bde5c	target/arm: Use MVFR1 feature bits to gate A32/T32 FP16 instructions Instead of gating the A32/T32 FP16 conversion instructions on the ARM_FEATURE_VFP_FP16 flag, switch to our new approach of looking at ID register bits. In this case MVFR1 fields FPHP and SIMDHP indicate the presence of these insns. This change doesn't alter behaviour for any of our CPUs. Backports commit 602f6e42cfbfe9278be34e9b91d2ceb695837e02 from qemu	2019-02-28 15:23:51 -05:00
Richard Henderson	c9ad233678	target/arm: Implement ARMv8.3-JSConv Backports commit 6c1f6f2733a7692793135ea5ce72b829add99a50 from qemu	2019-02-22 19:08:57 -05:00
Richard Henderson	f16dcbe226	target/arm: Rearrange Floating-point data-processing (2 regs) There are lots of special cases within these insns. Split the major argument decode/loading/saving into no_output (compares), rd_is_dp, and rm_is_dp. We still need to special case argument load for compare (rd as input, rm as zero) and vcvt fixed (rd as input+output), but lots of special cases do disappear. Now that we have a full switch at the beginning, hoist the ISA checks from the code generation. Backports commit e80941bd64cc388554770fd72334e9e7d459a1ef from qemu	2019-02-22 18:57:25 -05:00
Richard Henderson	f3cb92c86c	target/arm: Use vector operations for saturation For same-sign saturation, we have tcg vector operations. We can compute the QC bit by comparing the saturated value against the unsaturated value. Backports commit 89e68b575e138d0af1435f11a8ffcd8779c237bd from qemu	2019-02-15 18:14:09 -05:00
Richard Henderson	4e44043956	target/arm: Fix arm_cpu_dump_state vs FPSCR Backports commit ec527e4eeccc31e3beadf3b61b66c61bbd873811 from qemu	2019-02-15 17:58:25 -05:00
Richard Henderson	198befc50e	target/arm: Use tcg integer min/max primitives for neon The 32-bit PMIN/PMAX has been decomposed to scalars, and so can be trivially expanded inline. Backports commit 9ecd3c5c1651fa7f9adbedff4806a2da0b50490c from qemu	2019-02-15 17:55:11 -05:00
Richard Henderson	eee33bd692	target/arm: Use vector minmax expanders for aarch32 Backports commit 6f2782218230bbb33fa22f9a2f73f8a570046007 from qemu	2019-02-15 17:54:05 -05:00
Richard Henderson	d147946edc	target/arm: Rely on optimization within tcg_gen_gvec_or Since we're now handling a == b generically, we no longer need to do it by hand within target/arm/. Backports commit 2900847ff4c862887af750935a875059615f509a from qemu	2019-02-15 17:50:28 -05:00
Peter Maydell	55bc017af4	target/arm: Emit barriers for A32/T32 load-acquire/store-release insns Now that MTTCG is here, the comment in the 32-bit Arm decoder that "Since the emulation does not have barriers, the acquire/release semantics need no special handling" is no longer true. Emit the correct barriers for the load-acquire/store-release insns, as we already do in the A64 decoder. Backports commit 96c552958dbb63453b5f02bea6e704006d50e39a from qemu	2019-01-13 19:48:27 -05:00
Richard Henderson	4d8b7a9967	target/arm: Convert ARM_TBFLAG_* to FIELDs Use "register" TBFLAG_ANY to indicate shared state between A32 and A64, and "registers" TBFLAG_A32 & TBFLAG_A64 for fields that are specific to the given cpu state. Move ARM_TBFLAG_BE_DATA to shared state, instead of its current placement within "Bit usage when in AArch32 state". Backports commit aad821ac4faad369fad8941d25e59edf2514246b from qemu	2019-01-13 19:21:18 -05:00
Richard Henderson	1bcba0737e	target/arm: Reorg NEON VLD/VST single element to one lane Instead of shifts and masks, use direct loads and stores from the neon register file. Backports commit 2d6ac920837f558be214ad2ddd28cad7f3b15e5c from qemu	2018-11-10 11:24:37 -05:00
Richard Henderson	37103f1bc4	target/arm: Promote consecutive memory ops for aa32 For a sequence of loads or stores from a single register, little-endian operations can be promoted to an 8-byte op. This can reduce the number of operations by a factor of 8. Backports commit e23f12b3a252352b575908ca7b94587acd004641 from qemu	2018-11-10 11:19:15 -05:00
Richard Henderson	1cab7a41ac	target/arm: Reorg NEON VLD/VST all elements Instead of shifts and masks, use direct loads and stores from the neon register file. Mirror the iteration structure of the ARM pseudocode more closely. Correct the parameters of the VLD2 A2 insn. Note that this includes a bugfix for handling of the insn "VLD2 (multiple 2-element structures)" -- we were using an incorrect stride value. Backports commit ac55d00709e78cd39dfa298dcaac7aecb58762e8 from qemu	2018-11-10 11:18:45 -05:00
Richard Henderson	a2239b9f5b	target/arm: Use gvec for NEON VLD all lanes Backports commit 7377c2c97e20e64ed9b481eb2d9b9084bfd5b7e9 from qemu	2018-11-10 11:08:29 -05:00
Richard Henderson	985acb9cde	target/arm: Use gvec for NEON_3R_VTST_VCEQ, NEON_3R_VCGT, NEON_3R_VCGE Move cmtst_op expanders from translate-a64.c. Backports commit ea580fa312674c1ba82a8b137caf42b0609ce3e3 from qemu	2018-11-10 11:03:42 -05:00
Richard Henderson	5d9c0e52bf	target/arm: Use gvec for NEON_3R_VML Move mla_op and mls_op expanders from translate-a64.c. Backports commit 4a7832b095b9ce97a815749a13516f5cfb3c5dd4 from qemu	2018-11-10 10:58:44 -05:00
Richard Henderson	79bbb7c730	target/arm: Use gvec for VSRI, VSLI Move shi_op and sli_op expanders from translate-a64.c. Backports commit f3cd8218d1d3e534877ce3f3cb61c6757d10f9df from qemu	2018-11-10 10:53:28 -05:00
Lioncash	edb36c7505	target/arm: Use gvec for VSRA	2018-11-10 10:32:29 -05:00
Richard Henderson	b5877f1dfb	target/arm: Use gvec for VSHR, VSHL Backports commit 1dc8425e551be1371d657e94367f37130cd7aede from qemu	2018-11-10 10:14:31 -05:00
Lioncash	7790ca1ccb	target/arm: Use gvec for NEON_3R_VMUL	2018-11-10 10:11:10 -05:00
Richard Henderson	dfdc6bc05c	target/arm: Use gvec for NEON_2RM_VMN, NEON_2RM_VNEG Backports commit 4bf940bebad273e4b3534ae3f83f2c9d1191d3a2 from qemu	2018-11-10 10:09:38 -05:00
Richard Henderson	7b4b5ac249	target/arm: Use gvec for NEON_3R_VADD_VSUB insns Backports commit e4717ae02dd0c2e544a07302c1ed473775209aba from qemu	2018-11-10 10:08:23 -05:00
Richard Henderson	0965b9513a	target/arm: Use gvec for NEON_3R_LOGIC insns Move expanders for VBSL, VBIT, and VBIF from translate-a64.c. Backports commit eabcd6faa90461e0b7463f4ebe75b8d050487c9c from qemu	2018-11-10 10:06:13 -05:00
Richard Henderson	9f767248a2	target/arm: Use gvec for NEON VMOV, VMVN, VBIC & VORR (immediate) Backports commit 246fa4aca95e213fba10c8222dbc6bd0a9a2a8d4 from qemu	2018-11-10 09:56:30 -05:00
Richard Henderson	c1251a19e1	target/arm: Use gvec for NEON VDUP Also introduces neon_element_offset to find the env offset of a specific element within a neon register. Backports commit 32f91fb71f4c32113ec8c2af5f74f14abe6c7162 from qemu	2018-11-10 09:51:40 -05:00
Richard Henderson	3d5f040608	target/arm: Mark some arrays const Backports commit 308e5636152594daa4c5597b1188d44d7266db04 from qemu	2018-11-10 09:49:25 -05:00
Richard Henderson	74aba4ba51	target/arm: Don't call tcg_clear_temp_count This is done generically in translator_loop. Backports commit 7108e255c2d95b44c9dfee8075d0d6fb391281a8 from qemu	2018-11-10 09:40:06 -05:00
Peter Maydell	d60fe610bb	target/arm: Report correct syndrome for FP/SIMD traps to Hyp mode For traps of FP/SIMD instructions to AArch32 Hyp mode, the syndrome provided in HSR has more information than is reported to AArch64. Specifically, there are extra fields TA and coproc which indicate whether the trapped instruction was FP or SIMD. Add this extra information to the syndromes we construct, and mask it out when taking the exception to AArch64. Backports commit 4be42f4013fa1a9df47b48aae5148767bed8e80c from qemu	2018-11-10 09:36:41 -05:00
Lioncash	a0358202a7	target/arm: Improve debug logging of AArch32 exception return For AArch32, exception return happens through certain kinds of CPSR write. We don't currently have any CPU_LOG_INT logging of these events (unlike AArch64, where we log in the ERET instruction). Add some suitable logging. This will log exception returns like this: Exception return from AArch32 hyp to usr PC 0x80100374 paralleling the existing logging in the exception_return helper for AArch64 exception returns: Exception return from AArch64 EL2 to AArch64 EL0 PC 0x8003045c Exception return from AArch64 EL2 to AArch32 EL0 PC 0x8003045c (Note that an AArch32 exception return can only be AArch32->AArch32, never to AArch64.) Backports commit 81e3728407bf4a12f83e14fd410d5f0a7d29b5b4 from qemu	2018-11-10 09:09:52 -05:00
Richard Henderson	03ec90f39b	target/arm: Convert v8.2-fp16 from feature bit to aa64pfr0 test Backports commit 5763190fa8705863b4b725aa1657661a97113eb4 from qemu	2018-11-10 08:34:32 -05:00
Richard Henderson	03e2d64aed	target/arm: Convert jazelle from feature bit to isar1 test Having V6 alone imply jazelle was wrong for cortex-m0. Change to an assertion for V6 & !M. This was harmless, because the only place we tested ARM_FEATURE_JAZELLE was for 'bxj' in disas_arm(), which is unreachable for M-profile cores. Backports commit 09cbd50198d5dcac8bea2e47fa5dd641ec505fae from qemu	2018-11-10 08:24:11 -05:00
Richard Henderson	4a58a81e31	target/arm: Convert division from feature bits to isar0 tests Both arm and thumb2 division are controlled by the same ISAR field, which takes care of the arm implies thumb case. Having M imply thumb2 division was wrong for cortex-m0, which is v6m and does not have thumb2 at all, much less thumb2 division. Backports commit 7e0cf8b47f0e67cebbc3dfa73f304e56ad1a090f from qemu	2018-11-10 08:21:02 -05:00
Richard Henderson	4221703f18	target/arm: Convert v8 extensions from feature bits to isar tests Most of the v8 extensions are self-contained within the ISAR registers and are not implied by other feature bits, which makes them the easiest to convert. Backports commit 962fcbf2efe57231a9f5df0ae0f40c05e35628ba from qemu	2018-11-10 08:17:57 -05:00
Peter Maydell	76f521e6c3	target/arm: Add v8M stack checks for VLDM/VSTM Add the v8M stack checks for the VLDM/VSTM (aka VPUSH/VPOP) instructions. This code is currently unreachable because we haven't yet implemented M profile floating point support, but since the change is simple, we add it now because otherwise we're likely to forget to do it later. Backports commit 8a954faf5412d5073d585d85a1da63a09bb5d84e from qemu	2018-10-08 14:23:02 -04:00
Peter Maydell	37d0c7fcf1	target/arm: Add v8M stack checks for Thumb push/pop Add v8M stack checks for the 16-bit Thumb push/pop encodings: STMDB, STMFD, LDM, LDMIA, LDMFD. Backports commit aa369e5c08bbe2748d2be96f13f4ef469a4d3080 from qemu	2018-10-08 14:22:08 -04:00
Peter Maydell	ef9afb1855	target/arm: Add v8M stack checks for T32 load/store single Add v8M stack checks for the instructions in the T32 "load/store single" encoding class: these are the "immediate pre-indexed" and "immediate, post-indexed" LDR and STR instructions. Backports commit 0bc003bad9752afc61624cb680226c922f34f82c from qemu	2018-10-08 14:20:58 -04:00
Peter Maydell	de30651f5e	target/arm: Add v8M stack checks for Thumb2 LDM/STM Add the v8M stack checks for: * LDM (T2 encoding) * STM (T2 encoding) This includes the 32-bit encodings of the instructions listed in v8M ARM ARM rule R_YVWT as * LDM, LDMIA, LDMFD * LDMDB, LDMEA * POP (multiple registers) * PUSH (multiple registers) * STM, STMIA, STMEA * STMDB, STMFD We perform the stack limit check before doing any other part of the load or store. Backports commit 7c0ed88e7d6bee3e55c3d8935c46226cb544191a from qemu	2018-10-08 14:19:14 -04:00
Peter Maydell	bb97240df6	target/arm: Add v8M stack checks for LDRD/STRD (imm) Add the v8M stack checks for: * LDRD (immediate) * STRD (immediate) Loads and stores are more complicated than ADD/SUB/MOV, because we must ensure that memory accesses below the stack limit are not performed, so we can't simply do the check when we actually update SP. For these instructions, if the stack limit check triggers we must not: * perform any memory access below the SP limit * update PC, SP or the load/store base register but it is IMPDEF whether we: * perform any accesses above or equal to the SP limit * update destination registers for loads For QEMU we choose to always check the limit before doing any other part of the load or store, so we won't update any registers or perform any memory accesses. It is UNKNOWN whether the limit check triggers for a load or store where the initial SP value is below the limit and one of the stores would be below the limit, but the writeback moves SP to above the limit. For QEMU we choose to trigger the check in this situation. Note that limit checks happen only for loads and stores which update SP via writeback; they do not happen for loads and stores which simply use SP as a base register. Backports commit 910d7692e5b60f2c2d08cc3d6d36076e85b6a69d from qemu	2018-10-08 14:17:27 -04:00
Peter Maydell	0fc6e2c183	target/arm: Add some comments in Thumb decode Add some comments to the Thumb decoder indicating what bits of the instruction have been decoded at various points in the code. This is not an exhaustive set of comments; we're gradually adding comments as we work with particular bits of the code. Backports commit a2d12f0f34e9c5ef8a193556fde983aa186fa73a from qemu	2018-10-08 14:15:15 -04:00
Peter Maydell	ca5d7b8fd2	target/arm: Add v8M stack checks on ADD/SUB/MOV of SP Add code to insert calls to a helper function to do the stack limit checking when we handle these forms of instruction that write to SP: * ADD (SP plus immediate) * ADD (SP plus register) * SUB (SP minus immediate) * SUB (SP minus register) * MOV (register) Backports commit 5520318939fea5d659bf808157cd726cb967b761 from qemu	2018-10-08 14:15:15 -04:00
Peter Maydell	8b3b548961	target/arm: Define new TBFLAG for v8M stack checking The Arm v8M architecture includes hardware stack limit checking. When certain instructions update the stack pointer, if the new value of SP is below the limit set in the associated limit register then an exception is taken. Add a TB flag that tracks whether the limit-checking code needs to be emitted. Backports commit 4730fb85035e99c909db7d14ef76cd17f28f4423 from qemu	2018-10-08 14:15:15 -04:00
Lioncash	47b45f1bc2	arm: Take DisasContext as a parameter instead of TCGContext where applicable This is more future-friendly with qemu, as it's more generic.	2018-10-06 04:17:12 -04:00
Lioncash	766c70f608	arm: Move cpu_M0 to DisasContext	2018-10-06 03:32:39 -04:00
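
Note on commit 6d1730048d ("tcg: Add support for integer absolute value"): the message mentions a branchless abs but the log does not show it. The sketch below is the well-known mask/xor/subtract idiom in plain C, and it is only assumed to be the one the commit refers to. The function name abs32_branchless and the self-check are hypothetical and are not part of TCG.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a TCG function.
 * mask is all ones when x is negative and zero otherwise.
 * (ux ^ mask) - mask then inverts the bits and adds one for negative
 * inputs (a two's complement negate) and leaves other inputs unchanged.
 * The result is returned as unsigned so that INT32_MIN is representable. */
static uint32_t abs32_branchless(int32_t x)
{
    uint32_t ux = (uint32_t)x;
    uint32_t mask = (uint32_t)0 - (ux >> 31); /* 0x00000000 or 0xFFFFFFFF */
    return (ux ^ mask) - mask;
}

/* Small self-check of the idiom on a few representative inputs. */
static void abs32_selfcheck(void)
{
    assert(abs32_branchless(0) == 0u);
    assert(abs32_branchless(42) == 42u);
    assert(abs32_branchless(-42) == 42u);
}
```

The mask is built from the sign bit with unsigned arithmetic rather than a signed right shift, since shifting a negative signed value is implementation-defined in C.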


351 commits
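
Note on commit 7c32498b7f ("target/arm: Use tcg_gen_gvec_bitsel"): the message says one generic bit-select replaces three target-specific implementations for BIT, BIF, and BSL. A rough scalar illustration of why that works is sketched below, using the architectural definitions of the three NEON instructions. The names bitsel64, vbsl, vbit and vbif are hypothetical and operate on one 64-bit value; they are not the gvec API.

```c
#include <assert.h>
#include <stdint.h>

/* Generic select: take bits from b where sel is 1, from c where sel is 0. */
static uint64_t bitsel64(uint64_t sel, uint64_t b, uint64_t c)
{
    return (b & sel) | (c & ~sel);
}

/* VBSL: the old destination value is the selector between rn and rm. */
static uint64_t vbsl(uint64_t rd, uint64_t rn, uint64_t rm)
{
    return bitsel64(rd, rn, rm);
}

/* VBIT: insert rn bits into rd where rm is 1. */
static uint64_t vbit(uint64_t rd, uint64_t rn, uint64_t rm)
{
    return bitsel64(rm, rn, rd);
}

/* VBIF: insert rn bits into rd where rm is 0. */
static uint64_t vbif(uint64_t rd, uint64_t rn, uint64_t rm)
{
    return bitsel64(rm, rd, rn);
}

/* Small self-check: with an all-ones or all-zeros selector the result
 * is exactly one of the two inputs. */
static void bitsel_selfcheck(void)
{
    assert(bitsel64(UINT64_MAX, 0x1234, 0x5678) == 0x1234);
    assert(bitsel64(0, 0x1234, 0x5678) == 0x5678);
}
```

The three instructions differ only in which operand plays the selector role, so a single primitive with permuted arguments covers all of them.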