unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2025-10-13 14:27:12 +00:00

Author	SHA1	Message	Date
Peter Maydell	4850377f01	target/arm: Implement FP16 for Neon VADD, VSUB, VABD, VMUL Implement FP16 support for the Neon insns which use the DO_3S_FP_GVEC macro: VADD, VSUB, VABD, VMUL. For VABD this requires us to implement a new gvec_fabd_h helper using the machinery we have already for the other helpers. Backport e4a6d4a69e239becfd83bdcd996476e7b8e1138d	2021-03-01 16:31:54 -05:00
Peter Maydell	08b70267d0	target/arm: Implement VFP fp16 VMOV between gp and halfprec registers Implement the VFP fp16 variant of VMOV that transfers a 16-bit value between a general purpose register and a VFP register. Note that Rt == 15 is UNPREDICTABLE; since this insn is v8 and later only we have no need to replicate the old "updates CPSR.NZCV" behaviour that the singleprec version of this insn does Backports commit 46a4b854525cb9f34a611f6ada6cdff1eab0ac2d	2021-03-01 16:26:34 -05:00
Peter Maydell	58485bca97	target/arm: Implement new VFP fp16 insn VMOVX The fp16 extension includes a new instruction VMOVX, which copies the upper 16 bits of a 32-bit source VFP register into the lower 16 bits of the destination and zeroes the high half of the destination. Implement it. Backports f61e5c43b86907dea17f431b528d806659d62bcb	2021-03-01 16:24:50 -05:00
Peter Maydell	3dd587e3df	target/arm: Implement new VFP fp16 insn VINS The fp16 extension includes a new instruction VINS, which copies the lower 16 bits of a 32-bit source VFP register into the upper 16 bits of the destination. Implement it. Backports commit e4875e3bcc3a9c54d7e074c8f51e04c2e6364e2e	2021-03-01 16:22:27 -05:00
Peter Maydell	90aa9647e0	target/arm: Implement VFP fp16 VRINT* Implement the fp16 version of the VFP VRINT* insns. Backports 0a6f4b4cb338665b81ad824d9a6868932461b7f7	2021-03-01 16:15:21 -05:00
Peter Maydell	1c8088b48a	target/arm: Implement VFP fp16 VSEL Implement the fp16 versions of the VFP VSEL instruction. Backports commit 11e78fecdf2d605cfed33aa09bbcf0cc4fb95886	2021-03-01 16:08:51 -05:00
Peter Maydell	beee4ad7f3	target/arm: Implement VFP fp16 VCVT-with-specified-rounding-mode Implement the fp16 versions of the VFP VCVT instruction forms which convert between floating point and integer with a specified rounding mode. Backports c505bc6a9d50a48f9d89d6cf930e863838a5b367	2021-02-28 05:18:07 -05:00
Peter Maydell	74a6af4e23	target/arm: Implement VFP fp16 VCVT between float and fixed-point Implement the fp16 versions of the VFP VCVT instruction forms which convert between floating point and fixed-point. Backports a149e2de0b63e3906729ed1d3df7d9ecdb6de5e6	2021-02-28 05:15:40 -05:00
Peter Maydell	9c5b6f06a2	target/arm: Use macros instead of open-coding fp16 conversion helpers Now the VFP_CONV_FIX macros can handle fp16's distinction between the width of the operation and the width of the type used to pass operands, use the macros rather than the open-coded functions. This creates an extra six helper functions, all of which we are going to need for the AArch32 VFP fp16 instructions. Backports commit 414ba270c4fb758d987adf37ae9bfe531715c604	2021-02-28 05:08:44 -05:00
Peter Maydell	dd6e11eaa7	target/arm: Make VFP_CONV_FIX macros take separate float type and float size Currently the VFP_CONV_FIX macros take a single fsz argument for the size of the float type, which is used both to select the name of the functions to call (eg float32_is_any_nan()) and also for the type to use for the float inputs and outputs (eg float32). Separate these into fsz and ftype arguments, so that we can use them for fp16, which uses 'float16' in the function names but is still passing inputs and outputs in a 32-bit sized type. Backports 5366f6ad7da4f6def2733ec7ee24495430256839	2021-02-28 05:05:53 -05:00
Peter Maydell	f8241ae22f	target/arm: Implement VFP fp16 VCVT between float and integer Backports 0094e9f475a5a742d10d2f1e1beceea82b69f982	2021-02-28 05:02:25 -05:00
Peter Maydell	ac9ae5cbe7	target/arm: Implement VFP fp16 VLDR and VSTR Implement the fp16 versions of the VFP VLDR/VSTR (immediate). Backports commit 274afbb121107b8aaeaa11b3e7904d5f8ae38a94	2021-02-28 04:58:32 -05:00
Peter Maydell	5d98e14545	target/arm: Implement VFP fp16 VCMP Implement fp16 version of VCMP. Backports 1b88b054c5b201e8581114d29527c6a5a7e088c9	2021-02-28 04:56:24 -05:00
Peter Maydell	25d95570f3	target/arm: Implement VFP fp16 for VMOV immediate Implement VFP fp16 support for the VMOV immediate insn. Backports commit 28c28728e53c9f4c13a5cd50f313788c7ec2f9ad	2021-02-28 04:51:11 -05:00
Peter Maydell	2d9abf7c0b	target/arm: Implement VFP fp16 for VABS, VNEG, VSQRT Implement VFP fp16 for VABS, VNEG and VSQRT. This is all the fp16 insns that use the DO_VFP_2OP macro, because there is no fp16 version of VMOV_reg. Notes: * the gen_helper_vfp_negh already exists as we needed to create it for the fp16 multiply-add insns * as usual we need to use the f16 version of the fp_status; this is only relevant for VSQRT Backports ce2d65a5d191380756cdac7a1fd1ba76bd1621cf	2021-02-28 04:48:28 -05:00
Peter Maydell	f3af6b8c25	target/arm: Macroify uses of do_vfp_2op_sp() and do_vfp_2op_dp() Macroify the uses of do_vfp_2op_sp() and do_vfp_2op_dp(); this will make it easier to add the halfprec support. Backports 009a07335b8ff492d940e1eb229a1b0d302c2512	2021-02-28 04:43:01 -05:00
Peter Maydell	6ac2c597ab	target/arm: Implement VFP fp16 for fused-multiply-add Implement VFP fp16 support for fused multiply-add insns VFNMA, VFNMS, VFMA, VFMS. Backports 9886fe2834b064a3cf0675a4659942ed547aed42	2021-02-28 04:39:21 -05:00
Peter Maydell	f86c84425b	target/arm: Macroify trans functions for VFMA, VFMS, VFNMA, VFNMS Macroify creation of the trans functions for single and double precision VFMA, VFMS, VFNMA, VFNMS. The repetition was OK for two sizes, but we're about to add halfprec and it will get a bit more than seems reasonable. Backports 2aa8dcfa14558fe2a63ed0496d60b02565c9a225	2021-02-28 04:36:07 -05:00
Peter Maydell	a42ecfe203	target/arm: Implement VFP fp16 VMLA, VMLS, VNMLS, VNMLA, VNMUL Implement fp16 versions of the VFP VMLA, VMLS, VNMLS, VNMLA, VNMUL instructions. (These are all the remaining ones which we implement via do_vfp_3op_[hsd]p().) Backports commit e7cb0ded52c6d7b86585b09935fe7caeb9e38b69	2021-02-28 04:29:37 -05:00
Peter Maydell	eae621098d	target/arm: Implement VFP fp16 for VFP_BINOP operations Implement VFP fp16 support for simple binary-operator VFP insns VADD, VSUB, VMUL, VDIV, VMINNM and VMAXNM: * make the VFP_BINOP() macro generate float16 helpers as well as float32 and float64 * implement a do_vfp_3op_hp() function similar to the existing do_vfp_3op_sp() * add decode for the half-precision insn patterns Note that the VFP_BINOP macro use creates a couple of unused helper functions vfp_maxh and vfp_minh, but they're small so it's not worth splitting the BINOP operations into "needs halfprec" and "no halfprec" groups. Backports commit 120a0eb3ea23a5b06fae2f3daebd46a4035864cf	2021-02-28 04:24:39 -05:00
Peter Maydell	1afb240134	target/arm: Use correct ID register check for aa32_fp16_arith The aa32_fp16_arith feature check function currently looks at the AArch64 ID_AA64PFR0 register. This is (as the comment notes) not correct. The bogus check was put in mostly to allow testing of the fp16 variants of the VCMLA instructions and it was something of a mistake that we allowed them to exist in master. Switch the feature check function to testing MVFR1.FPHP, which is what it ought to be. This will remove emulation of the VCMLA and VCADD insns from AArch32 code running on an AArch64 '-cpu max' using system emulation. (They were never enabled for aarch32 linux-user and system-emulation.) Since we weren't advertising their existence via the AArch32 ID register, well-behaved guests wouldn't have been using them anyway. Once we have implemented all the AArch32 support for the FP16 extension we will advertise it in the MVFR1 ID register field, which will reenable these insns along with all the others. Backports 02bc236d0131a666d4ac2bb7197bbad2897c336a	2021-02-27 16:47:48 -05:00
Peter Maydell	b93ca1fca6	target/arm: Remove local definitions of float constants In several places the target/arm code defines local float constants for 2, 3 and 1.5, which are also provided by include/fpu/softfloat.h. Remove the unnecessary local duplicate versions. Backports b684e49a17da39539b0ac6e4c4c98b28b38feb76	2021-02-27 16:47:10 -05:00
Chen Qun	46af765bbb	target/arm/translate-a64:Remove redundant statement in disas_simd_two_reg_misc_fp16() Clang static code analyzer show warning: target/arm/translate-a64.c:13007:5: warning: Value stored to 'rd' is never read rd = extract32(insn, 0, 5); ^ ~~~~~~~~~~~~~~~~~~~~~ target/arm/translate-a64.c:13008:5: warning: Value stored to 'rn' is never read rn = extract32(insn, 5, 5); ^ ~~~~~~~~~~~~~~~~~~~~~ Backports fa71dd531c12ad9a05cdd78392e9fc2a30ea921d	2021-02-27 16:45:25 -05:00
Chen Qun	9bac2113cd	target/arm/translate-a64:Remove dead assignment in handle_scalar_simd_shli() Clang static code analyzer show warning: target/arm/translate-a64.c:8635:14: warning: Value stored to 'tcg_rn' during its initialization is never read TCGv_i64 tcg_rn = new_tmp_a64(s); ^~~~~~ ~~~~~~~~~~~~~~ target/arm/translate-a64.c:8636:14: warning: Value stored to 'tcg_rd' during its initialization is never read TCGv_i64 tcg_rd = new_tmp_a64(s); ^~~~~~ ~~~~~~~~~~~~~~ Backports 07174c86b41e91d98ed2ee0ee12e516694853c6b	2021-02-27 16:44:29 -05:00
LIU Zhiwei	ad78fc2df5	softfloat: Define comparison operations for bfloat16 Backports c53b1079334c41b342a8ad3b7ccfd51bf5427f5	2021-02-27 16:43:10 -05:00
LIU Zhiwei	d26cd63ad6	softfloat: Define misc operations for bfloat16 Backports 5ebf5f4be66c378fd5f3dee85f54dd4942171d57	2021-02-27 16:41:46 -05:00
LIU Zhiwei	d8168a8142	softfloat: Define convert operations for bfloat16 Backports 34f0c0a98a5f3bb6706088c0384f937f7a294d3e	2021-02-27 16:37:11 -05:00
LIU Zhiwei	b0be0d28cc	softfloat: Define operations for bfloat16 Backports 8282310d8535cc2a8431c516e907da79f92df6eb	2021-02-26 15:20:30 -05:00
Stephen Long	95a0837f2d	softfloat: Add float16_is_normal This float16 predicate was missing from the normal set. Backports a03e924cf8a22888060fc0de4d91de053cd5cde4	2021-02-26 15:12:37 -05:00
Frank Chang	d97454eb63	softfloat: Add fp16 and uint8/int8 conversion functions Backports 0d93d8ec632154dea2627a9e989972ee09721187	2021-02-26 15:11:57 -05:00
Kito Cheng	76d123efee	softfloat: Implement the full set of comparisons for float16 Backports dd205025a048ef6f53ff51eb86ddc58e7a82a771	2021-02-26 15:04:12 -05:00
Lioncash	f5a21abc0b	target/arm: Convert sq{, r}dmulh to gvec for aa64 advsimd	2021-02-26 15:01:44 -05:00
Richard Henderson	aa97b6b755	target/arm: Convert integer multiply-add (indexed) to gvec for aa64 advsimd Backports 3607440c4df6498585a570cfc1041e4972b41b56	2021-02-26 14:51:17 -05:00
Richard Henderson	732674b868	target/arm: Convert integer multiply (indexed) to gvec for aa64 advsimd Backports 2e5a265e6a9e7169c4a3e87db261b2fa92582590	2021-02-26 14:46:29 -05:00
Richard Henderson	80325ac866	target/arm: Generalize inl_qrdmlah_* helper functions Unify add/sub helpers and add a parameter for rounding. This will allow saturating non-rounding to reuse this code. Backports d21798856b227a20a0a41640236af445f4f4aeb0	2021-02-26 14:41:32 -05:00
Richard Henderson	1bedcfbda3	target/arm: Tidy SVE tszimm shift formats Rather than require the user to fill in the immediate (shl or shr), create full formats that include the immediate.	2021-02-26 14:35:53 -05:00
Richard Henderson	da41a23a1b	target/arm: Split out gen_gvec_ool_zz Backports 40e32e5a8a379baf6e0d49d83cf19950cfbaf96b	2021-02-26 14:32:36 -05:00
Richard Henderson	5bd98feed9	target/arm: Split out gen_gvec_ool_zzz Backports e645d1a17a359156c6047006d760ca176d493edb	2021-02-26 14:29:48 -05:00
Richard Henderson	aa3819c396	target/arm: Split out gen_gvec_ool_zzp Model after gen_gvec_fn_zzz et al. Backports 96a461f7c12587d3a64a71e4d90cda5c09ca3eb4	2021-02-26 14:26:33 -05:00
Lioncash	2da89a626c	target/arm: Merge helper_sve_clr_* and helper_sve_movz_*	2021-02-26 14:23:06 -05:00
Richard Henderson	8eb3642d96	target/arm: Split out gen_gvec_ool_zzzp Model after gen_gvec_fn_zzz et al. Backports 36cbb7a8e7100864c488a1153cecba90b1c33a4c	2021-02-26 14:14:13 -05:00
Richard Henderson	9b3671e9ad	target/arm: Use tcg_gen_gvec_bitsel for trans_SEL_pppp The gvec operation was added after the initial implementation of the SEL instruction and was missed in the conversion. Backports d4bc623254b55e2f9613c9450216fa7e50c03929	2021-02-26 14:12:25 -05:00
Richard Henderson	c8c247410f	target/arm: Clean up 4-operand predicate expansion Move the check for !S into do_pppp_flags, which allows to merge in do_vecop4_p. Split out gen_gvec_fn_ppp without sve_access_check, to mirror gen_gvec_fn_zzz. Backport dd81a8d7cf5c90963603806e58a217bbe759f75e	2021-02-26 14:07:14 -05:00
Richard Henderson	7bef6489a8	target/arm: Merge do_vector2_p into do_mov_p This is the only user of the function Backports d0b2df5a01eeccbac71d4d883158b91e7f9a6a29	2021-02-26 13:59:00 -05:00
Richard Henderson	f329d428f3	target/arm: Rearrange {sve,fp}_check_access assert We want to ensure that access is checked by the time we ask for a specific fp/vector register. We want to ensure that we do not emit two lots of code to raise an exception. But sometimes it's difficult to cleanly organize the code such that we always pass through sve_check_access exactly once. Allow multiple calls so long as the result is true, that is, no exception to be raised. Backports 8a40fe5f1bf3837ae3f9961efe1d51e7214f2664	2021-02-26 13:56:27 -05:00
Richard Henderson	64822511dd	target/arm: Split out gen_gvec_fn_zzz, do_zzz_fn Model gen_gvec_fn_zzz on gen_gvec_fn3 in translate-a64.c, but indicating which kind of register and in which order. Model do_zzz_fn on the other do_foo functions that take an argument set and verify sve enabled. Backports 28c4da31be6a5e501b60b77bac17652dd3211378	2021-02-26 13:53:10 -05:00
Richard Henderson	3146cbb64e	target/arm: Split out gen_gvec_fn_zz Model the new function on gen_gvec_fn2 in translate-a64.c, but indicating which kind of register and in which order. Since there is only one user of do_vector2_z, fold it into do_mov_z Backports f7d79c41fa4bd0f0d27dcd14babab8575fbed39f	2021-02-26 13:50:05 -05:00
Richard Henderson	234a22803d	qemu/int128: Add int128_lshift Add left-shift to match the existing right-shift. Backports 5be4dd043f5beb5e7587d1ef8dd4e3716ec05639	2021-02-26 13:45:44 -05:00
Richard Henderson	6f341e0199	target/arm: Fill in the WnR syndrome bit in mte_check_fail According to AArch64.TagCheckFault, none of the other ISS values are provided, so we do not need to go so far as merge_syn_data_abort. But we were missing the WnR bit. Backports commit 9a4670be7f0734d27bf4058db3becf83cd0cc9d5 from qemu	2021-02-26 12:26:15 -05:00
Richard Henderson	6969435fb8	target/arm: Pass the entire mte descriptor to mte_check_fail We need more information than just the mmu_idx in order to create the proper exception syndrome. Only change the function signature so far. Backports dbf8c32178291169e111a6a9fd7ae17af4a3039d	2021-02-26 12:19:51 -05:00
Philippe Mathieu-Daudé	d4c59cce4e	target/arm: Clarify HCR_EL2 ARMCPRegInfo type In commit ce4afed839 ("target/arm: Implement AArch32 HCR and HCR2") the HCR_EL2 register has been changed from type NO_RAW (no underlying state and does not support raw access for state saving/loading) to type CONST (TCG can assume the value to be constant), removing the read/write accessors. We forgot to remove the previous type ARM_CP_NO_RAW. This is not really a problem since the field is overwritten. However it makes code review confusing, so remove it. Backports 0e5aac18bc31dbdfab51f9784240d0c31a4c5579	2021-02-26 12:18:15 -05:00
Max Filippov	d9e561ab2a	softfloat: add xtensa specialization for pickNaNMulAdd pickNaNMulAdd logic on Xtensa is to apply pickNaN to the inputs of the expression (a * b) + c. However if default NaN is produced as a result of (a * b) calculation it is not considered when c is NaN. So with two pickNaN variants there must be two pickNaNMulAdd variants. In addition the invalid flag is always set when (a * b) produces NaN. Backports commit fbcc38e4cb1b539b8615ec9b0adc285351d77628 from qemu	2021-02-26 12:16:51 -05:00
Max Filippov	fee4c62fe4	softfloat: pass float_status pointer to pickNaN Pass float_status structure pointer to the pickNaN so that machine-specific settings are available to NaN selection code. Add use_first_nan property to float_status and use it in Xtensa-specific pickNaN. Backports commit 913602e3ffe6bf50b869a14028a55cb267645ba3	2021-02-26 12:16:05 -05:00
Max Filippov	db780eff66	softfloat: make NO_SIGNALING_NANS runtime property target/xtensa, the only user of the NO_SIGNALING_NANS macro, has FPU implementations with and without the corresponding property. With NO_SIGNALING_NANS being a macro they cannot be a part of the same QEMU executable. Replace macro with new property in float_status to allow cores with different FPU implementations to coexist. Backports cc43c6925113c5bc8f1a0205375931d2e4807c99	2021-02-26 12:11:40 -05:00
Peter Maydell	3e5aa58139	target/arm: Use correct FPST for VCMLA, VCADD on fp16 When we implemented the VCMLA and VCADD insns we put in the code to handle fp16, but left it using the standard fp status flags. Correct them to use FPST_STD_F16 for fp16 operations. Backports commit b34aa5129e9c3aff890b4f4bcc84962e94185629	2021-02-26 12:02:23 -05:00
Peter Maydell	61377ce01c	target/arm: Implement FPST_STD_F16 fpstatus Architecturally, Neon FP16 operations use the "standard FPSCR" like all other Neon operations. However, this is defined in the Arm ARM pseudocode as "a fixed value, except that FZ16 (and AHP) follow the FPSCR bits". In QEMU, the softfloat float_status doesn't include separate flush-to-zero for FP16 operations, so we must keep separate fp_status for "Neon non-FP16" and "Neon fp16" operations, in the same way we do already for the non-Neon "fp_status" vs "fp_status_f16". Add the extra float_status field to the CPU state structure, ensure it is correctly initialized and updated on FPSCR writes, and make fpstatus_ptr(FPST_STD_F16) return a pointer to it. Backports commit aaae563bc73de0598bbc09a102e68f27fafe704a	2021-02-26 12:00:25 -05:00
Peter Maydell	b1b0a41507	target/arm: Make A32/T32 use new fpstatus_ptr() API Make A32/T32 code use the new fpstatus_ptr() API: get_fpstatus_ptr(0) -> fpstatus_ptr(FPST_FPCR) get_fpstatus_ptr(1) -> fpstatus_ptr(FPST_STD) Backports a84d1d1316726704edd2617b2c30c921d98a8137	2021-02-26 11:55:55 -05:00
Peter Maydell	79359e3a69	target/arm: Replace A64 get_fpstatus_ptr() with generic fpstatus_ptr() We currently have two versions of get_fpstatus_ptr(), which both take an effectively boolean argument: * the one for A64 takes "bool is_f16" to distinguish fp16 from other ops * the one for A32/T32 takes "int neon" to distinguish Neon from other ops This is confusing, and to implement ARMv8.2-FP16 the A32/T32 one will need to make a four-way distinction between "non-Neon, FP16", "non-Neon, single/double", "Neon, FP16" and "Neon, single/double". The A64 version will then be a strict subset of the A32/T32 version. To clean this all up, we want to go to a single implementation which takes an enum argument with values FPST_FPCR, FPST_STD, FPST_FPCR_F16, and FPST_STD_F16. We rename the function to fpstatus_ptr() so that unconverted code gets a compilation error rather than silently passing the wrong thing to the new function. This commit implements that new API, and converts A64 to use it: get_fpstatus_ptr(false) -> fpstatus_ptr(FPST_FPCR) get_fpstatus_ptr(true) -> fpstatus_ptr(FPST_FPCR_F16) Backports commit cdfb22bb7326fee607d9553358856cca341dbc9a	2021-02-26 11:46:51 -05:00
Peter Maydell	e9240f0f54	target/arm: Delete unused ARM_FEATURE_CRC In commit 962fcbf2efe57231a9f5df we converted the uses of the ARM_FEATURE_CRC bit to use the aa32_crc32 isar_feature test instead. However we forgot to remove the now-unused definition of the feature name in the enum. Delete it now. Backports commit cf6303d262e31f4812dfeb654c6c6803e52000af	2021-02-26 11:24:40 -05:00
Peter Maydell	e0000d1700	target/arm/translate.c: Delete/amend incorrect comments In arm_tr_init_disas_context() we have a FIXME comment that suggests "cpu_M0 can probably be the same as cpu_V0". This isn't in fact possible: cpu_V0 is used as a temporary inside gen_iwmmxt_shift(), and that function is called in various places where cpu_M0 contains a live value (i.e. between gen_op_iwmmxt_movq_M0_wRn() and gen_op_iwmmxt_movq_wRn_M0() calls). Remove the comment. We also have a comment on the declarations of cpu_V0/V1/M0 which claims they're "for efficiency". This isn't true with modern TCG, so replace this comment with one which notes that they're only used with the iwmmxt decode Backports 8b4c9a50dc9531a729ae4b5941d287ad0422db48	2021-02-26 11:23:52 -05:00
Peter Maydell	0759bb8eaf	target/arm: Delete unused VFP_DREG macros As part of the Neon decodetree conversion we removed all the uses of the VFP_DREG macros, but forgot to remove the macro definitions. Do so now. Backports e60527c5d501e5015a119a0388a27abeae4dac09	2021-02-26 11:22:01 -05:00
Peter Maydell	368323b03f	target/arm: Remove ARCH macro The ARCH() macro was used a lot in the legacy decoder, but there are now just two uses of it left. Since a macro which expands out to a goto is liable to be confusing when reading code, replace the last two uses with a simple open-coded equivalent. Backports ce51c7f522ca488c795c3510413e338021141c96	2021-02-26 11:21:20 -05:00
Peter Maydell	5d9c0addcf	target/arm: Convert T32 coprocessor insns to decodetree Convert the T32 coprocessor instructions to decodetree. As with the A32 conversion, this corrects an underdecoding where we did not check that MRRC/MCRR [24:21] were 0b0010 and so treated some kinds of LDC/STC as MRRC/MCRR rather than UNDEFing them. Backports commit 4c498dcfd84281f20bd55072630027d1b3c115fd	2021-02-26 11:19:35 -05:00
Peter Maydell	bdaaac68f5	target/arm: Do M-profile NOCP checks early and via decodetree For M-profile CPUs, the architecture specifies that the NOCP exception when a coprocessor is not present or disabled should cover the entire wide range of coprocessor-space encodings, and should take precedence over UNDEF exceptions. (This is the opposite of A-profile, where checking for a disabled FPU has to happen last.) Implement this with decodetree patterns that cover the specified ranges of the encoding space. There are a few instructions (VLLDM, VLSTM, and in v8.1 also VSCCLRM) which are in copro-space but must not be NOCP'd: these must be handled also in the new m-nocp.decode so they take precedence. This is a minor behaviour change: for unallocated insn patterns in the VFP area (cp=10,11) we will now NOCP rather than UNDEF when the FPU is disabled. As well as giving us the correct architectural behaviour for v8.1M and the recommended behaviour for v8.0M, this refactoring also removes the old NOCP handling from the remains of the 'legacy decoder' in disas_thumb2_insn(), paving the way for cleaning that up. Since we don't currently have a v8.1M feature bit or any v8.1M CPUs, the minor changes to this logic that we'll need for v8.1M are marked up with TODO comments. Backports commit a3494d4671797c291c88bd414acb0aead15f7239 from qemu	2021-02-26 11:17:23 -05:00
Peter Maydell	c675b73b1f	target/arm: Tidy up disas_arm_insn() The only thing left in the "legacy decoder" is the handling of disas_xscale_insn(), and we can simplify the code. Backports commit 8198c071bc55bee55ef4f104a5b125f541b51096	2021-02-26 10:59:09 -05:00
Peter Maydell	fc4cc9d95f	target/arm: Convert A32 coprocessor insns to decodetree Convert the A32 coprocessor instructions to decodetree. Note that this corrects an underdecoding: for the 64-bit access case (MRRC/MCRR) we did not check that bits [24:21] were 0b0010, so we would incorrectly treat LDC/STC as MRRC/MCRR rather than UNDEFing them. The decodetree versions of these insns assume the coprocessor is in the range 0..7 or 14..15. This is architecturally sensible (as per the comments) and OK in practice for QEMU because the only uses of the ARMCPRegInfo infrastructure we have that aren't for coprocessors 14 or 15 are the pxa2xx use of coprocessor 6. We add an assertion to the define_one_arm_cp_reg_with_opaque() function to catch any accidental future attempts to use it to define coprocessor registers for invalid coprocessors. Backports commit cd8be50e58f63413c033531d3273c0e44851684f from qemu	2021-02-26 10:57:00 -05:00
Peter Maydell	ef0e23f1f9	target/arm: Separate decode from handling of coproc insns As a prelude to making coproc insns use decodetree, split out the part of disas_coproc_insn() which does instruction decoding from the part which does the actual work, and make do_coproc_insn() handle the UNDEF-on-bad-permissions and similar cases itself rather than returning 1 to eventually percolate up to a callsite that calls unallocated_encoding() for it. Backports 19c23a9baafc91dd3881a7a4e9bf454e42d24e4e	2021-02-26 10:53:52 -05:00
Peter Maydell	2944a75b98	target/arm: Pull handling of XScale insns out of disas_coproc_insn() At the moment we check for XScale/iwMMXt insns inside disas_coproc_insn(): for CPUs with ARM_FEATURE_XSCALE all copro insns with cp 0 or 1 are handled specially. This works, but is an odd place for this check, because disas_coproc_insn() is called from both the Arm and Thumb decoders but the XScale case never applies for Thumb (all the XScale CPUs were ARMv5, which has only Thumb1, not Thumb2 with the 32-bit coprocessor insn encodings). It also makes it awkward to convert the real copro access insns to decodetree. Move the identification of XScale out to its own function which is only called from disas_arm_insn(). Backports commit 7b4f933db865391a90a3b4518bb2050a83f2a873 from qemu	2021-02-26 10:50:32 -05:00
LIU Zhiwei	9b7f4b72fc	target/riscv: vector single-width integer multiply instructions	2021-02-26 10:46:26 -05:00
LIU Zhiwei	ab81642440	target/riscv: vector integer min/max instructions 558fa7797c919c4f21ac10980f3ed28160d6d3cb	2021-02-26 10:43:13 -05:00
LIU Zhiwei	965af9986a	target/riscv: vector integer comparison instructions 1366fc79be04fa56a0e3f078ba4f26c27ac67e89	2021-02-26 10:40:33 -05:00
LIU Zhiwei	244793c4e8	target/riscv: vector single-width bit shift instructions Backports 3277d955d21d8943d80062b4cfd8547f831dbd51	2021-02-26 10:37:09 -05:00
LIU Zhiwei	56c0e253c2	target/riscv: vector bitwise logical instructions Backports d3842924cf93d104f691c5ea9090d6700ccef281	2021-02-26 10:30:33 -05:00
LIU Zhiwei	05153c6d7c	target/riscv: vector integer add-with-carry / subtract-with-borrow instructions 3a6f8f68ad2f4a22d9ae8287f336b5dcc80b6448	2021-02-26 10:19:48 -05:00
LIU Zhiwei	b9814de4c3	target/riscv: vector widening integer add and subtract Backports 8fcdf77630290591a6068c2d82ca2935338c3b0c	2021-02-26 10:05:43 -05:00
LIU Zhiwei	f564388e89	target/riscv: vector single-width integer add and subtract Backports 43740e3a3b3bb66456103684e622ba4e9baae297	2021-02-26 09:58:31 -05:00
LIU Zhiwei	7d0d7338c2	target/riscv: add vector amo operations Vector AMOs operate as if aq and rl bits were zero on each element with regard to ordering relative to other instructions in the same hart. Vector AMOs provide no ordering guarantee between element operations in the same vector AMO instruction Backports 268fcca66bde62257960ec8d859de374315a5e3d	2021-02-26 09:47:32 -05:00
LIU Zhiwei	152934bade	target/riscv: add fault-only-first unit stride load The unit-stride fault-only-first load instructions are used to vectorize loops with data-dependent exit conditions (while loops). These instructions execute as a regular load except that they will only take a trap on element 0. Backports commit 022b4ecf775ffeff522eaea4f0d94edcfe00a0a9 from qemu	2021-02-26 09:28:19 -05:00
LIU Zhiwei	887c29bc79	target/riscv: add vector index load and store instructions Vector indexed operations add the contents of each element of the vector offset operand specified by vs2 to the base effective address to give the effective address of each element. Backports f732560e3551c0823cee52efba993fbb8f689a36	2021-02-26 03:00:45 -05:00
LIU Zhiwei	c7a17d04a2	target/riscv: add vector stride load and store instructions Vector strided operations access the first memory element at the base address, and then access subsequent elements at address increments given by the byte offset contained in the x register specified by rs2. Vector unit-stride operations access elements stored contiguously in memory starting from the base effective address. It can be seen as a special case of strided operations. Backports 751538d5da557e5c10e5045c2d27639580ea54a7	2021-02-26 02:55:14 -05:00
LIU Zhiwei	e4bc5056cd	target/riscv: add an internals.h header The internals.h keeps things that are not relevant to the actual architecture, only to the implementation, separate. Backports f476f17740ad42288d42dd8fedcdae8ca7007a16	2021-02-26 02:39:29 -05:00
LIU Zhiwei	9db3b70869	target/riscv: add vector configure instruction vsetvl and vsetvli are two configure instructions for vl, vtype. TB flags should update after configure instructions. The (ill, lmul, sew ) of vtype and the bit of (VSTART == 0 && VL == VLMAX) will be placed within tb_flags. Backports 2b7168fc43fb270fb89e1dddc17ef54714712f3a from qemu	2021-02-26 02:37:59 -05:00
LIU Zhiwei	0554e79ad1	target/riscv: support vector extension csr The v0.7.1 specification does not define vector status within mstatus. A future revision will define the privileged portion of the vector status. Backports 8e3a1f18871e0ea251b95561fe1ec5a9bc896c4a from qemu	2021-02-26 02:25:58 -05:00
LIU Zhiwei	bff31d8822	target/riscv: implementation-defined constant parameters vlen is the vector register length in bits. elen is the max element size in bits. vext_spec is the vector specification version, default value is v0.7.1. Backports 32931383270e2ca8209267ca99f23f3c5f780982 from qemu	2021-02-26 02:23:28 -05:00
LIU Zhiwei	0968caa249	target/riscv: add vector extension field in CPURISCVState The 32 vector registers will be viewed as a continuous memory block. It avoids the conversion between element index and (regno, offset). Thus elements can be directly accessed by offset from the first vector base address. Backports ad9e5aa2ae8032f19a8293b6b8f4661c06167bf0 from qemu	2021-02-26 02:17:49 -05:00
Peter Maydell	fceb5e309a	Open 5.2 development tree Backports commit 672b2f2695891b6d818bddc3ce0df964c7627969 from qemu	2021-02-25 23:52:17 -05:00
Peter Maydell	1f497fc74a	Update version for v5.1.0 release Backports commit d0ed6a69d399ae193959225cdeaa9382746c91cc from qemu	2021-02-25 23:51:51 -05:00
Peter Maydell	3c229a2b9e	Update version for v5.1.0-rc3 release	2021-02-25 23:51:33 -05:00
Peter Maydell	0718459fb3	target/arm: Fix Rt/Rt2 in ESR_ELx for copro traps from AArch32 to 64 When a coprocessor instruction in an AArch32 guest traps to AArch32 Hyp mode, the syndrome register (HSR) includes Rt and Rt2 fields which are simply copies of the Rt and Rt2 fields from the trapped instruction. However, if the instruction is trapped from AArch32 to an AArch64 higher exception level, the Rt and Rt2 fields in the syndrome register (ESR_ELx) must be the AArch64 view of the register. This makes a difference if the AArch32 guest was in a mode other than User or System and it was using r13 or r14, or if it was in FIQ mode and using r8-r14. We don't know at translate time which AArch32 CPU mode we are in, so we leave the values we generate in our prototype syndrome register value at translate time as the raw Rt/Rt2 from the instruction, and instead correct them to the AArch64 view when we find we need to take an exception from AArch32 to AArch64 with one of these syndrome values. Fixes: https://bugs.launchpad.net/qemu/+bug/1879587 Backports commit a65dabf71a9f9b949d556b1b57fd72595df92398 from qemu	2021-02-25 23:50:18 -05:00
Peter Collingbourne	7de60dfa51	target/arm: Fix decode of LDRA[AB] instructions These instructions use zero as the discriminator, not SP. Backports commit d250bb19ced3b702c7c37731855f6876d0cc7995 from qemu	2021-02-25 23:47:25 -05:00
Kaige Li	3004cc1f97	target/arm: Avoid maybe-uninitialized warning with gcc 4.9 GCC version 4.9.4 isn't clever enough to figure out that all execution paths in disas_ldst() that use 'fn' will have initialized it first, and so it warns: /home/LiKaige/qemu/target/arm/translate-a64.c: In function ‘disas_ldst’: /home/LiKaige/qemu/target/arm/translate-a64.c:3392:5: error: ‘fn’ may be used uninitialized in this function [-Werror=maybe-uninitialized] fn(cpu_reg(s, rt), clean_addr, tcg_rs, get_mem_index(s), ^ /home/LiKaige/qemu/target/arm/translate-a64.c:3318:22: note: ‘fn’ was declared here AtomicThreeOpFn *fn; ^ Make it happy by initializing the variable to NULL. Backports commit 88a90e3de6ae99cbcfcc04c862c51f241fdf685f from qemu	2021-02-25 23:45:13 -05:00
Richard Henderson	ce8282d9cd	target/arm: Fix AddPAC error indication The definition of top_bit used in this function is one higher than that used in the Arm ARM pseudo-code, which put the error indication at top_bit - 1 at the wrong place, which meant that it wasn't visible to Auth. Fixing the definition of top_bit requires more changes, because its most common use is for the count of bits in top_bit:bot_bit, which would then need to be computed as top_bit - bot_bit + 1. For now, prefer the minimal fix to the error indication alone. Fixes: 63ff0ca94cb Backports commit 8796fe40dd30cd9ffd3c958906471715c923b341 from qemu	2021-02-25 23:44:28 -05:00
Peter Maydell	4952920d4d	Update version for v5.1.0-rc2 release Backports commit 5772f2b1fc5d00e7e04e01fa28e9081d6550440a from qemu	2021-02-25 23:43:39 -05:00
Lioncash	a1e8e0adff	target/arm: Fix bad rebase within do_mem_zpz	2021-02-25 23:43:16 -05:00
Richard Henderson	5e1316a92e	target/arm: Always pass cacheattr in S1_ptw_translate When we changed the interface of get_phys_addr_lpae to require the cacheattr parameter, this spot was missed. The compiler is unable to detect the use of NULL vs the nonnull attribute here. Fixes: 7e98e21c098 Backports commit a6d6f37aed4b171d121cd4a9363fbb41e90dcb53 from qemu	2021-02-25 23:40:32 -05:00
Laszlo Ersek	40c04c73b0	target/i386: floatx80: avoid compound literals in static initializers Quoting ISO C99 6.7.8p4, "All the expressions in an initializer for an object that has static storage duration shall be constant expressions or string literals". The compound literal produced by the make_floatx80() macro is not such a constant expression, per 6.6p7-9. (An implementation may accept it, according to 6.6p10, but is not required to.) Therefore using "floatx80_zero" and make_floatx80() for initializing "f2xm1_table" and "fpatan_table" is not portable. And gcc-4.8 in RHEL-7.6 actually chokes on them: > target/i386/fpu_helper.c:871:5: error: initializer element is not constant > { make_floatx80(0xbfff, 0x8000000000000000ULL), > ^ We've had the make_floatx80_init() macro for this purpose since commit 3bf7e40ab914 ("softfloat: fix for C99", 2012-03-17), so let's use that macro again. Fixes: eca30647fc0 ("target/i386: reimplement f2xm1 using floatx80 operations") Fixes: ff57bb7b632 ("target/i386: reimplement fpatan using floatx80 operations") Backports commit 163b3d1af2552845a60967979aca8d78a6b1b088 from qemu	2021-02-25 23:38:54 -05:00
Richard Henderson	6390789a09	target/i386: Save cc_op before loop insns We forgot to update cc_op before these branch insns, which led to losing track of the current eflags. Buglink: https://bugs.launchpad.net/qemu/+bug/1888165 Backports commit 3cb3a7720b01830abd5fbb81819dbb9271bf7821 from qemu	2021-02-25 23:36:43 -05:00
Zong Li	001d2e6a29	target/riscv: Fix the range of pmpcfg of CSR function table Backports commit 8ba26b0b2b00dd5849a6c0981e358dc7a7cc315d from qemu	2021-02-25 23:35:21 -05:00
Peter Maydell	08ce565d7c	Update version for v5.1.0-rc1 release Backports commit c8004fe6bbfc0d9c2e7b942c418a85efb3ac4b00 from qemu	2021-02-25 23:34:20 -05:00
Richard Henderson	55369d710c	tcg: Save/restore vecop_list around minmax fallback Forgetting this asserts when tcg_gen_cmp_vec is called from within tcg_gen_cmpsel_vec. Fixes: 72b4c792c7a Backports commit 69c918d2ef319ac63cd759c527debc2a2bdf3a0c from qemu	2021-02-25 23:33:24 -05:00
Chenyi Qiang	e5d9e0ed53	target/i386: add fast short REP MOV support For CPUs that support fast short REP MOV [CPUID.(EAX=7,ECX=0):EDX(bit4)], e.g. Icelake and Tigerlake, expose it to the guest VM. Backports commit 5cb287d2bd578dfe4897458793b4fce35bc4f744 from qemu	2021-02-25 23:31:42 -05:00
Peter Maydell	113dc25fbf	Update version for v5.1.0-rc0 release Backports commit 8746309137ba470d1b2e8f5ce86ac228625db940 from qemu	2021-02-25 23:30:37 -05:00
Aaron Lindsay	e532ce610e	target/arm: Don't do raw writes for PMINTENCLR Raw writes to this register when in KVM mode can cause interrupts to be raised (even when the PMU is disabled). Because the underlying state is already aliased to PMINTENSET (which already provides raw write functions), we can safely disable raw accesses to PMINTENCLR entirely. Backports commit 887c0f1544991f567543b7c214aa11ab0cea0a29 from qemu	2021-02-25 23:27:47 -05:00
Richard Henderson	f403c1f54f	target/arm: Fix mtedesc for do_mem_zpz The mtedesc that was constructed was not actually passed in. Found by Coverity (CID 1429996). Backports commit cdecb3fc1eb182d90666348a47afe63c493686e7 from qemu	2021-02-25 23:25:54 -05:00
Paolo Bonzini	5b794349d3	target/i386: implement undocumented 'smsw r32' behavior In 32-bit mode, the higher 16 bits of the destination register are undefined. In practice CR0[31:0] is stored, just like in 64-bit mode, so just remove the "if" that currently differentiates the behavior. Backports commit c0c8445255b2b5b440c355431c8b01b7b7b7c8cf from qemu	2021-02-25 23:23:51 -05:00
Joseph Myers	cf54c51869	target/i386: fix IEEE SSE floating-point exception raising The SSE instruction implementations all fail to raise the expected IEEE floating-point exceptions because they do nothing to convert the exception state from the softfloat machinery into the exception flags in MXCSR. Fix this by adding such conversions. Unlike for x87, emulated SSE floating-point operations might be optimized using hardware floating point on the host, and so a different approach is taken that is compatible with such optimizations. The required invariant is that all exceptions set in env->sse_status (other than "denormal operand", for which the SSE semantics are different from those in the softfloat code) are ones that are set in the MXCSR; the emulated MXCSR is updated lazily when code reads MXCSR, while when code sets MXCSR, the exceptions in env->sse_status are set accordingly. A few instructions do not raise all the exceptions that would be raised by the softfloat code, and those instructions are made to save and restore the softfloat exception state accordingly. Nothing is done about "denormal operand"; setting that (only for the case when input denormals are not flushed to zero, the opposite of the logic in the softfloat code for such an exception) will require custom code for relevant instructions, or else architecture-specific conditionals in the softfloat code for when to set such an exception together with custom code for various SSE conversion and rounding instructions that do not set that exception. Nothing is done about trapping exceptions (for which there is minimal and largely broken support in QEMU's emulation in the x87 case and no support at all in the SSE case). Backports commit 418b0f93d12a1589d5031405de857844f32e9ccc from qemu	2021-02-25 23:21:32 -05:00
Joseph Myers	fd5b0dd456	target/i386: set SSE FTZ in correct floating-point state The code to set floating-point state when MXCSR changes calls set_flush_to_zero on &env->fp_status, so affecting the x87 floating-point state rather than the SSE state. Fix to call it for &env->sse_status instead. Backports commit 3ddc0eca2229846bfecc3485648a6cb85a466dc7 from qemu	2021-02-25 23:15:53 -05:00
Laurent Vivier	c15ddf11dd	softfloat,m68k: disable floatx80_invalid_encoding() for m68k According to the comment, this definition of invalid encoding is given by intel developer's manual, and doesn't comply with 680x0 FPU. With m68k, the explicit integer bit can be zero in the case of: - zeros (exp == 0, mantissa == 0) - denormalized numbers (exp == 0, mantissa != 0) - unnormalized numbers (exp != 0, exp < 0x7FFF) - infinities (exp == 0x7FFF, mantissa == 0) - not-a-numbers (exp == 0x7FFF, mantissa != 0) For infinities and NaNs, the explicit integer bit can be either one or zero. The IEEE 754 standard does not define a zero integer bit. Such a number is an unnormalized number. Hardware does not directly support denormalized and unnormalized numbers, but implicitly supports them by trapping them as unimplemented data types, allowing efficient conversion in software. See "M68000 FAMILY PROGRAMMER’S REFERENCE MANUAL", "1.6 FLOATING-POINT DATA TYPES" We will implement in the m68k TCG emulator the FP_UNIMP exception to trap into the kernel to normalize the number. In case of linux-user, the number will be normalized by QEMU. Backports commit d159dd058c7dc48a9291fde92eaae52a9f26a4d1 from qemu	2021-02-25 23:14:47 -05:00
Mark Cave-Ayland	db742bec00	target/m68k: consolidate physical translation offset into get_physical_address() Since all callers to get_physical_address() now apply the same page offset to the translation result, move the logic into get_physical_address() itself to avoid duplication. Backports commit 852002b5664bf079da05c5201dbf2345b870e5ed from qemu	2021-02-25 23:13:48 -05:00
Mark Cave-Ayland	3b2bc4b0c8	target/m68k: fix physical address translation in m68k_cpu_get_phys_page_debug() The result of the get_physical_address() function should be combined with the offset of the original page access before being returned. Otherwise the m68k_cpu_get_phys_page_debug() function can round to the wrong page causing incorrect lookups in gdbstub and various "Disassembler disagrees with translator over instruction decoding" warnings to appear at translation time. Fixes: 88b2fef6c3 ("target/m68k: add MC68040 MMU")	2021-02-25 23:12:12 -05:00
Richard Henderson	65d5288563	tcg: Fix do_nonatomic_op_* vs signed operations The smin/smax/umin/umax operations require the operands to be properly sign extended. Do not drop the MO_SIGN bit from the load, and additionally extend the val input. Backports commit 852f933e482518797f7785a2e017a215b88df815 from qemu	2021-02-25 23:10:40 -05:00
Richard Henderson	57c66389c2	target/arm: Fix temp double-free in sve ldr/str The temp that gets assigned to clean_addr has been allocated with new_tmp_a64, which means that it will be freed at the end of the instruction. Freeing it earlier leads to assertion failure. The loop creates a complication, in which we allocate a new local temp, which does need freeing, and the final code path is shared between the loop and non-loop. Fix this complication by adding new_tmp_a64_local so that the new local temp is freed at the end, and can be treated exactly like the non-loop path. Fixes: bba87d0a0f4 Backports commit 4b4dc9750a0aa0b9766bd755bf6512a84744ce8a from qemu	2021-02-25 23:10:37 -05:00
Richard Henderson	54e2107bdf	target/arm: Enable MTE We now implement all of the components of MTE, without actually supporting any tagged memory. All MTE instructions will work, trivially, so we can enable support. Backports commit c7459633baa71d1781fde4a245d6ec9ce2f008cf from qemu	2021-02-25 23:00:27 -05:00
Richard Henderson	a34fda25b0	target/arm: Add allocation tag storage for system mode Look up the physical address for the given virtual address, convert that to a tag physical address, and finally return the host address that backs it. Backports commit e4d5bf4fbd5abfc3727e711eda64a583cab4d637 from qemu	2021-02-25 22:58:56 -05:00
Richard Henderson	9b6c64f8f8	target/arm: Create tagged ram when MTE is enabled Backports commit 8bce44a2f6beb388a3f157652b46e99929839a96 from qemu	2021-02-25 22:51:23 -05:00
Richard Henderson	2ea0b53c1a	target/arm: Cache the Tagged bit for a page in MemTxAttrs This "bit" is a particular value of the page's MemAttr. Backports commit 337a03f07ff0f9e6295662f4094e03a045b60bdc from qemu	2021-02-25 22:48:04 -05:00
Richard Henderson	28cd096d67	target/arm: Always pass cacheattr to get_phys_addr We need to check the memattr of a page in order to determine whether it is Tagged for MTE. Between Stage1 and Stage2, this becomes simpler if we always collect this data, instead of occasionally being presented with NULL. Use the nonnull attribute to allow the compiler to check that all pointer arguments are non-null. Backports commit 7e98e21c09871cddc20946c8f3f3595e93154ecb from qemu	2021-02-25 22:46:00 -05:00
Richard Henderson	e2456a83a4	target/arm: Set PSTATE.TCO on exception entry D1.10 specifies that exception handlers begin with tag checks overridden. Backports commit 34669338bd9d66255fceaa84c314251ca49ca8d5 from qemu	2021-02-25 22:41:26 -05:00
Richard Henderson	35d0443056	target/arm: Implement data cache set allocation tags This is DC GVA and DC GZVA, and the tag check for DC ZVA. Backports commit eb821168db798302bd124a3b000cebc23bd0a395 from qemu	2021-02-25 22:40:08 -05:00
Richard Henderson	33f5bdabb1	target/arm: Complete TBI clearing for user-only for SVE There are a number of paths by which the TBI is still intact for user-only in the SVE helpers. Because we currently always set TBI for user-only, we do not need to pass down the actual TBI setting from above, and we can remove the top byte in the inner-most primitives, so that none are forgotten. Moreover, this keeps the "dirty" pointer around at the higher levels, where we need it for any MTE checking. Since the normal case, especially for user-only, goes through RAM, this clearing merely adds two insns per page lookup, which will be completely in the noise. Backports commit c4af8ba19b9d22aac79cab679a20b159af9d6809 from qemu	2021-02-25 22:37:12 -05:00
Richard Henderson	732efce958	target/arm: Add mte helpers for sve scatter/gather memory ops Because the elements are non-sequential, we cannot eliminate many tests straight away like we can for sequential operations. But we often have the PTE details handy, so we can test for Tagged. Backports commit d28d12f008ee44dc2cc2ee5d8f673be9febc951e from qemu	2021-02-25 22:34:24 -05:00
Richard Henderson	5698b7badb	target/arm: Handle TBI for sve scalar + int memory ops We still need to handle tbi for user-only when mte is inactive. Backports commit 9473d0ecafcffc8b258892b1f9f18e037bdba958 from qemu	2021-02-25 22:17:46 -05:00
Richard Henderson	586235d02d	target/arm: Add mte helpers for sve scalar + int ff/nf loads Because the elements are sequential, we can eliminate many tests all at once when the tag hits TCMA, or if the page(s) are not Tagged. Backports commit aa13f7c3c378fa41366b9fcd6c29af1c3d81126a from qemu	2021-02-25 22:09:17 -05:00
Richard Henderson	cb31d54b18	target/arm: Add mte helpers for sve scalar + int stores Because the elements are sequential, we can eliminate many tests all at once when the tag hits TCMA, or if the page(s) are not Tagged. Backports commit 71b9f3948c75bb97641a3c8c7de96d1cb47cdc07 from qemu	2021-02-25 21:53:55 -05:00
Richard Henderson	670b25c5fa	target/arm: Add mte helpers for sve scalar + int loads Because the elements are sequential, we can eliminate many tests all at once when the tag hits TCMA, or if the page(s) are not Tagged. Backports commit 206adacfb8d35e671e3619591608c475aa046b63 from qemu	2021-02-25 21:45:32 -05:00
Richard Henderson	6a78133659	target/arm: Remove sve_memopidx None of the sve helpers use TCGMemOpIdx any longer, so we can stop passing it. Backports commit ba080b8682fc6bde7f2d9dedddb519d63cbe138f from qemu	2021-02-25 21:33:44 -05:00
Richard Henderson	1f306230d4	target/arm: Reuse sve_probe_page for gather loads Backports commit 10a85e2c8ab6e004e7f3f1dcfea8cb0bf58fb9fb from qemu	2021-02-25 21:30:13 -05:00
Richard Henderson	585da952ec	target/arm: Reuse sve_probe_page for scatter stores Backports commit 88a660a48ef513ce9875b595e19b2a820b3f3fca from qemu	2021-02-25 21:27:14 -05:00
Richard Henderson	3eee880c2a	target/arm: Reuse sve_probe_page for gather first-fault loads This avoids the need for a separate set of helpers to implement no-fault semantics, and will enable MTE in the future. Backports commit 50de9b78cec06e6d16e92a114a505779359ca532 from qemu	2021-02-25 21:22:16 -05:00
Richard Henderson	b1e31f3bf3	target/arm: Use SVEContLdSt for contiguous stores Follow the model set up for contiguous loads. This handles watchpoints correctly for contiguous stores, recognizing the exception before any changes to memory. Backports commit 0fa476c1bb37a70df7eeff1e5bfb4791feb37e0e from qemu	2021-02-25 21:15:14 -05:00
Richard Henderson	3591c2f548	target/arm: Update contiguous first-fault and no-fault loads With sve_cont_ldst_pages, the differences between first-fault and no-fault are minimal, so unify the routines. With cpu_probe_watchpoint, we are able to make progress through pages with TLB_WATCHPOINT set when the watchpoint does not actually fire. Backports commit c647673ce4d72a8789703c62a7f3cbc732cb1ea8 from qemu	2021-02-25 21:06:14 -05:00
Richard Henderson	6c9304448e	target/arm: Use SVEContLdSt for multi-register contiguous loads Backports commit 5c9b8458a0b3008d24d84b67e1c9b6d5f39f4d66 from qemu	2021-02-25 20:50:22 -05:00
Richard Henderson	3979c8f73e	target/arm: Handle watchpoints in sve_ld1_r Handle all of the watchpoints for active elements all at once, before we've modified the vector register. This removes the TLB_WATCHPOINT bit from page[].flags, which means that we can use the normal fast path via RAM. Backports commit 4bcc3f0ff8e5ae2b17b5aab9aa613ff1b8025896 from qemu	2021-02-25 20:44:13 -05:00
Richard Henderson	0e5aa37c9a	target/arm: Use SVEContLdSt in sve_ld1_r First use of the new helper functions, so we can remove the unused markup. No longer need a scratch for user-only, as we completely probe the page set before reading; system mode still requires a scratch for MMIO. Backports commit b854fd06a868e0308bcfe05ad0a71210705814c7 from qemu	2021-02-25 20:41:53 -05:00
Richard Henderson	d363c3d0ba	target/arm: Adjust interface of sve_ld1_host_fn The current interface includes a loop; change it to load a single element. We will then be able to use the function for ld{2,3,4} where individual vector elements are not adjacent. Replace each call with the simplest possible loop over active elements. Backports commit cf4a49b71b1712142d7122025a8ca7ea5b59d73f from qemu	2021-02-25 20:34:18 -05:00
Richard Henderson	94b0876f15	target/arm: Add sve infrastructure for page lookup For contiguous predicated memory operations, we want to minimize the number of tlb lookups performed. We have open-coded this for sve_ld1_r, but for correctness with MTE we will need this for all of the memory operations. Create a structure that holds the bounds of active elements, and metadata for two pages. Add routines to find those active elements, lookup the pages, and run watchpoints for those pages. Temporarily mark the functions unused to avoid Werror. Backports commit b4cd95d2f4c7197b844f51b29871d888063ea3e7 from qemu	2021-02-25 20:28:23 -05:00
Richard Henderson	f430a399d4	target/arm: Drop manual handling of set/clear_helper_retaddr Since we converted back to cpu_*_data_ra, we do not need to do this ourselves. Backports commit f32e2ab65f3a0fc03d58936709e5a565c4b0db50 from qemu	2021-02-25 20:20:29 -05:00
Richard Henderson	2e03f74a53	target/arm: Use cpu_*_data_ra for sve_ldst_tlb_fn Use the "normal" memory access functions, rather than the softmmu internal helper functions directly. Since fb901c9, cpu_mem_index is now a simple extract from env->hflags and not a large computation. Which means that it's now more work to pass around this value than it is to recompute it. This only adjusts the primitives, and does not clean up all of the uses within sve_helper.c.	2021-02-25 20:16:38 -05:00
Richard Henderson	84012be55c	target/arm: Add arm_tlb_bti_gp Introduce an lvalue macro to wrap target_tlb_bit0. Backports commit 149d3b31f3f0f7f9e1c3a77043450a95c7a7e93d from qemu	2021-02-25 17:45:50 -05:00
Richard Henderson	a9eb62d211	target/arm: Tidy trans_LD1R_zpri Move the variable declarations to the top of the function, but do not create a new label before sve_access_check. Backports commit c0ed9166b1aea86a2fbaada1195aacd1049f9e85 from qemu	2021-02-25 17:42:40 -05:00
Richard Henderson	49bd9a5c68	target/arm: Use mte_checkN for sve unpredicated stores Backports commit bba87d0a0f480805223a6428a7942a51733c488a from qemu	2021-02-25 17:40:43 -05:00
Richard Henderson	3ce14ebc78	target/arm: Use mte_checkN for sve unpredicated loads Backports commit b2aa8879b884cd66acde4123899dd92a38fe6527 from qemu	2021-02-25 17:26:37 -05:00
Richard Henderson	4fdd05e1aa	target/arm: Add helper_mte_check_zva Use a special helper for DC_ZVA, rather than the more general mte_checkN. Backports commit 46dc1bc0601554823a42ad27f236da2ad8f3bdc6 from qemu	2021-02-25 17:17:54 -05:00
Richard Henderson	9a05ca01e7	target/arm: Implement helper_mte_checkN Fill out the stub that was added earlier. Backports commit 5add8248556a3c1006018d7d8e601c9572b280a9 from qemu	2021-02-25 17:10:56 -05:00
Richard Henderson	91e2f55b69	target/arm: Implement helper_mte_check1 Fill out the stub that was added earlier. Backports commit 2e34ff45f32cb032883616a1cc5ea8ac96f546d5 from qemu	2021-02-25 17:02:34 -05:00
Richard Henderson	3e786526cf	target/arm: Add gen_mte_checkN Replace existing uses of check_data_tbi in translate-a64.c that perform multiple logical memory access. Leave the helper blank for now to reduce the patch size. Backports commit 73ceeb0011b23bac8bd2c09ebe3c18d034aa69ce from qemu	2021-02-25 16:40:16 -05:00
Richard Henderson	582e64f348	target/arm: Add gen_mte_check1 Replace existing uses of check_data_tbi in translate-a64.c that perform a single logical memory access. Leave the helper blank for now to reduce the patch size. Backports commit 0a405be2b8fd9506a009b10d7d2d98c394b36db6 from qemu	2021-02-25 16:13:31 -05:00
Richard Henderson	4488858072	target/arm: Move regime_tcr to internals.h We will shortly need this in mte_helper.c as well. Backports commit 38659d311d05e6c5feff6bddcc1c33b60d3b86a1 from qemu	2021-02-25 16:04:54 -05:00
Richard Henderson	e46cec8543	target/arm: Move regime_el to internals.h We will shortly need this in mte_helper.c as well. Backports commit 9c7ab8fc8cb6d6e2fb7a82c1088691c7c23fa1b9 from qemu	2021-02-25 16:04:12 -05:00
Richard Henderson	7147f5f28a	target/arm: Implement the access tag cache flushes Like the regular data cache flushes, these are nops within qemu. Backports commit 5463df160ecee510e78493993eb1bd38b4838a10 from qemu	2021-02-25 16:02:30 -05:00
Richard Henderson	4bb37fc3c1	target/arm: Implement the LDGM, STGM, STZGM instructions Backports commit 5f716a82388eb09754dd900e7dbb8ffa15897a28 from qemu	2021-02-25 16:00:50 -05:00
Richard Henderson	5b3ddcf2e2	target/arm: Simplify DC_ZVA Now that we know that the operation is on a single page, we need not loop over pages while probing. Backports commit e26d0d226892f67435cadcce86df0ddfb9943174 from qemu	2021-02-25 15:55:46 -05:00
Richard Henderson	7de60598d5	target/arm: Restrict the values of DCZID.BS under TCG We can simplify our DC_ZVA if we recognize that the largest BS that we actually use in system mode is 64. Let us just assert that it fits within TARGET_PAGE_SIZE. For DC_GVA and STZGM, we want to be able to write whole bytes of tag memory, so assert that BS is >= 2 * TAG_GRANULE, or 32. Backports commit a4157b80242bf1c8aa0ee77aae7458ba79012d5d from qemu	2021-02-25 15:12:20 -05:00
Richard Henderson	e15aa7c5a7	target/arm: Implement the STGP instruction Backports commit 6439d67fc944cf29de94a160e9450a2063c7b515 from qemu	2021-02-25 15:10:40 -05:00
Richard Henderson	e8b9cb8b4a	target/arm: Implement LDG, STG, ST2G instructions Backports commit c15294c1e36a7dd9b25bd54d98178e80f4b64bc1 from qemu	2021-02-25 15:08:44 -05:00
Richard Henderson	448fc3ae4a	target/arm: Define arm_cpu_do_unaligned_access for user-only Use the same code as system mode, so that we generate the same exception + syndrome for the unaligned access. For the moment, if MTE is enabled so that this path is reachable, this would generate a SIGSEGV in the user-only cpu_loop. Decoding the syndrome to produce the proper SIGBUS will be done later. Backports commit 0d1762e931f8a694f261c604daba605bcda70928 from qemu	2021-02-25 14:51:19 -05:00
Richard Henderson	43ea7aa828	target/arm: Implement the SUBP instruction Backports commit dad3015f55f8d48f84f0eae36021a9c6f9587e57 from qemu	2021-02-25 14:46:00 -05:00
Richard Henderson	13fd83fcc9	target/arm: Implement the GMI instruction Backports commit 438efea0bb639c9c2dfb42c8d9459e21aa183c8a from qemu	2021-02-25 14:44:29 -05:00
Richard Henderson	911a6b57ed	target/arm: Implement the ADDG, SUBG instructions Backports commit efbc78ad978763aedd11cb718eb1ff8db3fc9152 from qemu	2021-02-25 14:42:33 -05:00
Richard Henderson	acd7e4cb18	target/arm: Revise decoding for disas_add_sub_imm The current Arm ARM has adjusted the official decode of "Add/subtract (immediate)" so that the shift field is only bit 22, and bit 23 is part of the op1 field of the parent category "Data processing - immediate". Backports commit 21a8b343eaae63f6984f9a200092b0ea167647f1 from qemu	2021-02-25 14:38:46 -05:00
Richard Henderson	58f3dd2cc7	target/arm: Implement the IRG instruction Backports commit da54941f45b820cbaca72aa6efd5669b3dc86e2f from qemu	2021-02-25 14:36:11 -05:00
Richard Henderson	6bec295bf8	target/arm: Add MTE bits to tb_flags Cache the composite ATA setting. Cache when MTE is fully enabled, i.e. access to tags are enabled and tag checks affect the PE. Do this for both the normal context and the UNPRIV context. Backports commit 81ae05fa2d21ac1a0054935b74342aa38a5ecef7 from qemu	2021-02-25 14:31:41 -05:00
Richard Henderson	f6be2a1a42	target/arm: Add MTE system registers This is TFSRE0_EL1, TFSR_EL1, TFSR_EL2, TFSR_EL3, RGSR_EL1, GCR_EL1, GMID_EL1, and PSTATE.TCO. Backports commit 4b779cebb3e5ab30b945181f1ba3932f5f8a1cb5 from qemu	2021-02-25 14:12:24 -05:00
Richard Henderson	179a3aacdf	target/arm: Add DISAS_UPDATE_NOCHAIN Add an option that writes back the PC, like DISAS_UPDATE_EXIT, but does not exit back to the main loop. Backports commit 329833286d7a1b0ef8c7daafe13c6ae32429694e from qemu	2021-02-25 14:08:08 -05:00
Richard Henderson	eaa6291aa7	target/arm: Rename DISAS_UPDATE to DISAS_UPDATE_EXIT Emphasize that the is_jmp option exits to the main loop. Backports commit 14407ec2007e18536ed34772eef46f6e0a0e3d0e from qemu	2021-02-25 14:02:46 -05:00
Richard Henderson	2540911bdd	target/arm: Add support for MTE to SCTLR_ELx target/arm: Add support for MTE to HCR_EL2 and SCR_EL3 This does not attempt to rectify all of the res0 bits, but does clear the mte bits when not enabled. Since there is no high-part mapping of SCTLR, aa32 mode cannot write to these bits. Backports commits f00faf130d5dcf64b04f71a95f14745845ca1014, and 8ddb300bf60a5f3d358dd6fbf81174f6c03c1d9f from qemu.	2021-02-25 13:59:11 -05:00
Richard Henderson	d81feac642	target/arm: Improve masking of SCR RES0 bits Protect reads of aa64 id registers with ARM_CP_STATE_AA64. Use this as a simpler test than arm_el_is_aa64, since EL3 cannot change mode. Backports commit 252e8c69669599b4bcff802df300726300292f47 from qemu	2021-02-25 13:56:35 -05:00
Richard Henderson	1a35600453	target/arm: Add isar tests for mte Backports commit c7fd0baac0c24defec66263799faa8618327b352 from qemu	2021-02-25 13:55:52 -05:00
Joseph Myers	c01b7432a1	target/i386: reimplement fpatan using floatx80 operations The x87 fpatan emulation is currently based around conversion to double. This is inherently unsuitable for a good emulation of any floatx80 operation. Reimplement using the soft-float operations, as for other such instructions. Backports commit ff57bb7b63267dabd60f88354c8c29ea5e1eb3ec from qemu	2021-02-25 13:48:32 -05:00
Joseph Myers	ddb2f1d4dd	target/i386: reimplement fyl2x using floatx80 operations The x87 fyl2x emulation is currently based around conversion to double. This is inherently unsuitable for a good emulation of any floatx80 operation. Reimplement using the soft-float operations, building on top of the reimplementation of fyl2xp1 and factoring out code to be shared between the two instructions. The included test assumes that the result in round-to-nearest mode should always be one of the two closest floating-point numbers to the mathematically exact result (including that it should be exact, in the exact cases which cover more cases than for fyl2xp1). Backports commit 1f18a1e6ab8368a4eab2d22894d3b2ae75250cd3 from qemu	2021-02-25 13:46:29 -05:00
Joseph Myers	ac2f3fa0f2	target/i386: reimplement fyl2xp1 using floatx80 operations The x87 fyl2xp1 emulation is currently based around conversion to double. This is inherently unsuitable for a good emulation of any floatx80 operation, even before considering that it is a particularly naive implementation using double (adding 1 then using log rather than attempting a better emulation using log1p). Reimplement using the soft-float operations, as was done for f2xm1; as in that case, m68k has related operations but not exactly this one and it seemed safest to implement directly rather than reusing the m68k code to avoid accumulation of errors. A test is included with many randomly generated inputs. The assumption of the test is that the result in round-to-nearest mode should always be one of the two closest floating-point numbers to the mathematical value of y * log2(x + 1); the implementation aims to do somewhat better than that (about 70 correct bits before rounding). I haven't investigated how accurate hardware is. Intel manuals describe a narrower range of valid arguments to this instruction than AMD manuals. The implementation accepts the wider range (it's needed anyway for the core code to be reusable in a subsequent patch reimplementing fyl2x), but the test only has inputs in the narrower range so that it's valid on hardware that may reject or produce poor results for inputs outside that range. Code in the previous implementation that sets C2 for some out-of-range arguments is not carried forward to the new implementation; C2 is undefined for this instruction and I suspect that code was just cut-and-pasted from the trigonometric instructions (fcos, fptan, fsin, fsincos) where C2 is defined to be set for out-of-range arguments. Backports commit 5eebc49d2d0aa5fc7e90eeac97533051bb7b72fa from qemu	2021-02-25 13:43:46 -05:00
Joseph Myers	0a790f9937	target/i386: reimplement fprem, fprem1 using floatx80 operations The x87 fprem and fprem1 emulation is currently based around conversion to double, which is inherently unsuitable for a good emulation of any floatx80 operation. Reimplement using the soft-float floatx80 remainder operations. Backports commit 5ef396e2ba865f34a4766dbd60c739fb4bcb4fcc from qemu	2021-02-25 13:41:54 -05:00
Joseph Myers	8d0bf2d6e1	softfloat: return low bits of quotient from floatx80_modrem Both x87 and m68k need the low parts of the quotient for their remainder operations. Arrange for floatx80_modrem to track those bits and return them via a pointer. The architectures using float32_rem and float64_rem do not appear to need this information, so the *_rem interface is left unchanged and the information returned only from floatx80_modrem. The logic used to determine the low 7 bits of the quotient for m68k (target/m68k/fpu_helper.c:make_quotient) appears completely bogus (it looks at the result of converting the remainder to integer, the quotient having been discarded by that point); this patch does not change that, but the m68k maintainers may wish to do so. Backports commit 445810ec915687d37b8ae0ef8d7340ab4a153efa from qemu	2021-02-25 13:39:10 -05:00
Joseph Myers	e4cfbc1f06	softfloat: do not set denominator high bit for floatx80 remainder The floatx80 remainder implementation unnecessarily sets the high bit of bSig explicitly. By that point in the function, arguments that are invalid, zero, infinity or NaN have already been handled and subnormals have been through normalizeFloatx80Subnormal, so the high bit will already be set. Remove the unnecessary code. Backports commit 566601f1f9d972e44214696d3cb320e6c18880aa from qemu	2021-02-25 13:37:13 -05:00
Joseph Myers	2d50384633	softfloat: do not return pseudo-denormal from floatx80 remainder The floatx80 remainder implementation sometimes returns the numerator unchanged when the denominator is sufficiently larger than the numerator. But if the value to be returned unchanged is a pseudo-denormal, that is incorrect. Fix it to normalize the numerator in that case. Backports commit b662495dca0a2a36008cf8def91e2566519ed3f2 from qemu	2021-02-25 13:36:42 -05:00
Joseph Myers	6b63555a00	softfloat: fix floatx80 remainder pseudo-denormal check for zero The floatx80 remainder implementation ignores the high bit of the significand when checking whether an operand (numerator) with zero exponent is zero. This means it mishandles a pseudo-denormal representation of 0x1p-16382L by treating it as zero. Fix this by checking the whole significand instead. Backports commit 499a2f7b554a295cfc10f8cd026d9b20a38fe664 from qemu	2021-02-25 13:35:17 -05:00
Joseph Myers	b08d204a37	softfloat: merge floatx80_mod and floatx80_rem The m68k-specific softfloat code includes a function floatx80_mod that is extremely similar to floatx80_rem, but computing the remainder based on truncating the quotient toward zero rather than rounding it to nearest integer. This is also useful for emulating the x87 fprem and fprem1 instructions. Change the floatx80_rem implementation into floatx80_modrem that can perform either operation, with both floatx80_rem and floatx80_mod as thin wrappers available for all targets. There does not appear to be any use for the _mod operation for other floating-point formats in QEMU (the only other architectures using _rem at all are linux-user/arm/nwfpe, for FPA emulation, and openrisc, for instructions that have been removed in the latest version of the architecture), so no change is made to the code for other formats. Backports commit 6b8b0136ab3018e4b552b485f808bf66bcf19ead from qemu	2021-02-25 13:34:05 -05:00
Joseph Myers	2aee4714ab	target/i386: reimplement f2xm1 using floatx80 operations The x87 f2xm1 emulation is currently based around conversion to double. This is inherently unsuitable for a good emulation of any floatx80 operation, even before considering that it is a particularly naive implementation using double (computing with pow and then subtracting 1 rather than attempting a better emulation using expm1). Reimplement using the soft-float operations, including additions and multiplications with higher precision where appropriate to limit accumulation of errors. I considered reusing some of the m68k code for transcendental operations, but the instructions don't generally correspond exactly to x87 operations (for example, m68k has 2^x and e^x - 1, but not 2^x - 1); to avoid possible accumulation of errors from applying multiple such operations each rounding to floatx80 precision, I wrote a direct implementation of 2^x - 1 instead. It would be possible in principle to make the implementation more efficient by doing the intermediate operations directly with significands, signs and exponents and not packing / unpacking floatx80 format for each operation, but that would make it significantly more complicated and it's not clear that's worthwhile; the m68k emulation doesn't try to do that. A test is included with many randomly generated inputs. The assumption of the test is that the result in round-to-nearest mode should always be one of the two closest floating-point numbers to the mathematical value of 2^x - 1; the implementation aims to do somewhat better than that (about 70 correct bits before rounding). I haven't investigated how accurate hardware is. Backports commit eca30647fc078f4d9ed1b455bd67960f99dbeb7a from qemu	2021-02-25 13:31:13 -05:00
Peter Maydell	4a1996502f	target/arm: Remove dead code relating to SABA and UABA In commit cfdb2c0c95ae9205b0 ("target/arm: Vectorize SABA/UABA") we replaced the old handling of SABA/UABA with a vectorized implementation which returns early rather than falling into the loop-over-elements code. We forgot to delete the part of the old looping code that did the accumulate step, and Coverity correctly warns (CID 1428955) that this code is now dead. Delete it. Fixes: cfdb2c0c95ae9205b0 Backports commit ced7e8edb282765685d2ba0206a11f8692d8ec1c from qemu	2021-02-25 13:18:51 -05:00
Peter Maydell	167ed57625	target/arm: Remove unnecessary gen_io_end() calls Since commit ba3e7926691ed3 it has been unnecessary for target code to call gen_io_end() after an IO instruction in icount mode; it is sufficient to call gen_io_start() before it and to force the end of the TB. Many now-unnecessary calls to gen_io_end() were removed in commit 9e9b10c6491153b, but some were missed or accidentally added later. Remove unneeded calls from the arm target: * the call in the handling of exception-return-via-LDM is unnecessary, and the code is already forcing end-of-TB * the call in the VFP access check code is more complicated: we weren't ending the TB, so we need to add the code to force that by setting DISAS_UPDATE * the doc comment for ARM_CP_IO doesn't need to mention gen_io_end() any more Backports commit 55c812b74289863c348449135812027d188f040a from qemu	2021-02-25 13:17:32 -05:00
Peter Maydell	083d207fb0	target/arm: Move some functions used only in translate-neon.inc.c to that file The functions neon_element_offset(), neon_load_element(), neon_load_element64(), neon_store_element() and neon_store_element64() are used only in the translate-neon.inc.c file, so move their definitions there. Since the .inc.c file is #included in translate.c this doesn't make much difference currently, but it's a more logical place to put the functions and it might be helpful if we ever decide to try to make the .inc.c files genuinely separate compilation units. Backports commit 6fb5787898aab6aa04887fed9cf3220dd4c3f36a from qemu	2021-02-25 13:15:23 -05:00
Peter Maydell	0b06317dc4	target/arm: Convert Neon VTRN to decodetree Convert the Neon VTRN insn to decodetree. This is the last insn in the Neon data-processing group, so we can remove all the now-unused old decoder framework. It's possible that there's a more efficient implementation of VTRN, but for this conversion we just copy the existing approach. Backports commit d4366190f84fe89cc5d46da995dac1e7d541b98e from qemu	2021-02-25 13:12:28 -05:00
Peter Maydell	b7584069dd	target/arm: Convert Neon VSWP to decodetree Convert the Neon VSWP insn to decodetree. Since the new implementation doesn't have to share a pass-loop with the other 2-reg-misc operations we can implement the swap with 64-bit accesses rather than 32-bits (which brings us into line with the pseudocode and is more efficient). Backports commit 8ab3a227a0f13f0ff85846f36f7c466769aef4fc from qemu	2021-02-25 13:07:56 -05:00
Peter Maydell	73abdfea53	target/arm: Convert Neon 2-reg-misc VCVT insns to decodetree Convert the VCVT instructions in the 2-reg-misc grouping to decodetree. Backports commit a183d5fb38b07bab2a840196186c4806f3c67c0d from qemu	2021-02-25 13:07:15 -05:00
Peter Maydell	7e705fdc8c	target/arm: Convert Neon 2-reg-misc VRINT insns to decodetree Convert the Neon 2-reg-misc VRINT insns to decodetree. Giving these insns their own do_vrint() function allows us to change the rounding mode just once at the start and end rather than doing it for every element in the vector. Backports commit 128123ea34e9e6afe4842aefcb9cf84b9642ac22 from qemu	2021-02-25 13:02:24 -05:00
Peter Maydell	3eddb77327	target/arm: Convert Neon 2-reg-misc fp-compare-with-zero insns to decodetree Convert the fp-compare-with-zero insns in the Neon 2-reg-misc group to decodetree. Backports commit baa59323e841f76523f6ad4d746cdeb47ea574cd from qemu	2021-02-25 12:59:22 -05:00
Peter Maydell	6eb852ec1c	target/arm: Convert simple fp Neon 2-reg-misc insns Convert the Neon 2-reg-misc insns which are implemented with simple calls to functions that take the input, output and fpstatus pointer. Backports commit 3e96b205286dfb8bbf363229709e4f8648fce379 from qemu	2021-02-25 12:56:28 -05:00
Peter Maydell	3dcee11013	target/arm: Convert Neon VQABS, VQNEG to decodetree Convert the Neon VQABS and VQNEG insns to decodetree. Since these are the only ones which need cpu_env passing to the helper, we wrap the helper rather than creating a whole new do_2misc_env() function. Backports commit 4936f38abe6db0a9d23fd04e4cb0cf4d51cff174 from qemu	2021-02-25 12:53:18 -05:00
Peter Maydell	4033a3ca5c	target/arm: Convert remaining simple 2-reg-misc Neon ops Convert the remaining ops in the Neon 2-reg-misc group which can be implemented simply with our do_2misc() helper. Backports commit 84eae770af69c37a92496a4c4248875c070d5ee3 from qemu	2021-02-25 12:50:55 -05:00
Peter Maydell	88f8111500	target/arm: Convert Neon 2-reg-misc VREV32 and VREV16 to decodetree Convert the VREV32 and VREV16 insns in the Neon 2-reg-misc group to decodetree. Backports commit 8966808205b59d6c196b380b638475bcd1657ef4 from qemu	2021-02-25 12:49:16 -05:00
Peter Maydell	db1e503708	target/arm: Make gen_swap_half() take separate src and dest Make gen_swap_half() take a source and destination TCGv_i32 rather than modifying the input TCGv_i32; we're going to want to be able to use it with the more flexible function signature, and this also brings it into line with other functions like gen_rev16() and gen_revsh(). Backports commit 8ec3de7018a8198624aae49eef5568256114a829 from qemu	2021-02-25 12:40:23 -05:00
Peter Maydell	3c1289c594	target/arm: Fix capitalization in NeonGenTwo{Single, Double}OPFn typedefs All the other typedefs like these spell "Op" with a lowercase 'p'; rename the NeonGenTwoSingleOPFn and NeonGenTwoDoubleOPFn typedefs to match. Backports commit 5de3fd045be11b74cd0fbf36c6d4fb8387d5463b from qemu	2021-02-25 12:38:30 -05:00
Peter Maydell	fa6727ebba	target/arm: Rename NeonGenOneOpFn to NeonGenOne64OpFn The NeonGenOneOpFn typedef breaks with the pattern of the other NeonGen*Fn typedefs, because it is a TCGv_i64 -> TCGv_i64 operation but it does not have '64' in its name. Rename it to NeonGenOne64OpFn, so that the old name is available for a TCGv_i32 -> TCGv_i32 operation (which we will need in a subsequent commit). Backports commit 039f4e809ad2772fb33de4511ff68a485d875618 from qemu	2021-02-25 12:34:51 -05:00
Peter Maydell	27e74962e5	target/arm: Convert Neon 2-reg-misc crypto operations to decodetree Convert the Neon-2-reg misc crypto ops (AESE, AESMC, SHA1H, SHA1SU1) to decodetree. Backports commit 0b30dd5b85e20aba259768cb7aaa952b3e319468 from qemu	2021-02-25 12:32:39 -05:00
Peter Maydell	4354448f57	target/arm: Convert vectorised 2-reg-misc Neon ops to decodetree Convert to decodetree the insns in the Neon 2-reg-misc grouping which we implement using gvec. Backports commit 75153179e9928775d5333243ea4b278f438d75ae from qemu	2021-02-25 12:28:31 -05:00
Peter Maydell	6301f9acaa	target/arm: Convert Neon VCVT f16/f32 insns to decodetree Convert the Neon insns in the 2-reg-misc group which are VCVT between f32 and f16 to decodetree. Backports commit 654a517355e249435505ae5ff14a7520410cf7a4 from qemu	2021-02-25 12:25:32 -05:00
Peter Maydell	4ca33c54a2	target/arm: Convert Neon 2-reg-misc VSHLL to decodetree Convert the VSHLL insn in the 2-reg-misc Neon group to decodetree. Backports commit 749e2be36d75f11d5fa8f8277e2a0569bd2a1c97 from qemu	2021-02-25 12:20:57 -05:00
Peter Maydell	48d57d0dc7	target/arm: Convert Neon narrowing moves to decodetree Convert the Neon narrowing moves VMOVN, VQMOVN, VQMOVUN in the 2-reg-misc group to decodetree. Backports commit 3882bdacb0ad548864b9f2582a32bb5c785e3165 from qemu	2021-02-25 12:18:01 -05:00
Peter Maydell	35d8a3e83f	target/arm: Convert VZIP, VUZP to decodetree Convert the Neon VZIP and VUZP insns in the 2-reg-misc group to decodetree. Backports commit 567663a2af2457da8aa74f221b1f3f8a6d2eddf6 from qemu	2021-02-25 12:14:29 -05:00
Peter Maydell	d21fae82ba	target/arm: Convert Neon 2-reg-misc pairwise ops to decodetree Convert the pairwise ops VPADDL and VPADAL in the 2-reg-misc grouping to decodetree. At this point we can get rid of the weird CPU_V001 #define that was used to avoid having to explicitly list all the arguments being passed to some TCG gen/helper functions. Backports commit 6106af3aa2304fccee91a3a90138352b0c2af998 from qemu	2021-02-25 12:12:11 -05:00
Peter Maydell	505923e676	target/arm: Convert Neon 2-reg-misc VREV64 to decodetree Convert the Neon VREV64 insn from the 2-reg-misc grouping to decodetree. Backports commit 353d2b85058711a5e44c2dc63eb5b620db50a602 from qemu	2021-02-25 12:07:06 -05:00
Alistair Francis	e1f49dc888	target/riscv: Implement checks for hfence Call the helper_hyp_tlb_flush() function on hfence instructions, which will generate an illegal instruction exception if we don't have permission to flush the Hypervisor level TLBs. Backports commit 2761db5fc20943bbd606b6fd49640ac000398de6 from qemu	2021-02-25 12:03:57 -05:00
Alistair Francis	8eb8bc290f	target/riscv: Move the hfence instructions to the rvh decode Also correct the name of the VVMA instruction. Backports commit b8429ded723ec52568e05f6a24ed78c93224687c from qemu	2021-02-25 11:59:49 -05:00
Alistair Francis	39ff690eff	target/riscv: Report errors validating 2nd-stage PTEs Backports commit 88914473e748db20d8e18b9735f647a683319fa6 from qemu	2021-02-25 11:55:53 -05:00
Alistair Francis	a6c323c912	target/riscv: Set access as data_load when validating stage-2 PTEs Backports commit efe9f9c820d1322729957a60ff785c9527a79ddf from qemu	2021-02-25 11:54:31 -05:00
Ian Jiang	5c3a2f391c	riscv: Add helper to make NaN-boxing for FP register The code that NaN-boxes a 32-bit value when it is assigned to a 64-bit FP register is split out into a helper, gen_nanbox_fpr(). The helper is then used when translating the FLW instruction. Backports commit 354908cee1f7ff761b5fedbdb6376c378c10f941 from qemu	2021-02-25 11:53:27 -05:00
MerryMage	92243aefd4	arm/translate: Do not tracecode when in an IT block	2021-02-07 19:14:32 +00:00
Sunho Kim	d56c79776e	unicorn: fix uc_emu_start until if end instruction is in another tlb	2020-08-05 03:18:51 +09:00
MerryMage	9ac17104b8	arm: Add missing file vec_internal.h Missing from commit `1df7314dc3`. Ported from qemu a04b68e1d4c4f0cd5cd7542697b1b230b84532f5.	2020-06-20 00:12:09 +01:00
Philippe Mathieu-Daudé	4465ff9c93	fpu/softfloat: Silence 'bitwise negation of boolean expression' warning When building with clang version 10.0.0-4ubuntu1, we get: CC lm32-softmmu/fpu/softfloat.o fpu/softfloat.c:3365:13: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation] absZ &= ~ ( ( ( roundBits ^ 0x40 ) == 0 ) & roundNearestEven ); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fpu/softfloat.c:3423:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation] absZ0 &= ~ ( ( (uint64_t) ( absZ1<<1 ) == 0 ) & roundNearestEven ); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ... fpu/softfloat.c:4273:18: error: bitwise negation of a boolean expression; did you mean logical negation? [-Werror,-Wbool-operation] zSig1 &= ~ ( ( zSig2 + zSig2 == 0 ) & roundNearestEven ); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix by rewriting the fishy bitwise AND of two bools as an int. Backports commit 4066288694c3bdd175df813cad675a3b5191956b from qemu	2020-06-18 23:56:27 -04:00
Peter Maydell	709610e606	target/arm: Convert Neon VDUP (scalar) to decodetree Convert the Neon VDUP (scalar) insn to decodetree. (Note that we can't call this just "VDUP" as we used that already in vfp.decode for the "VDUP (general purpose register)" insn.) Backports commit 9aaa23c2ae18e6fb9a291b81baf91341db76dfa0 from qemu	2020-06-17 00:43:19 -04:00
Peter Maydell	8de8a4500a	target/arm: Convert Neon VTBL, VTBX to decodetree Convert the Neon VTBL, VTBX instructions to decodetree. The actual implementation of the insn is copied across to the new trans function unchanged except for renaming 'tmp5' to 'tmp4'. Backports commit 54e96c744b70a5d19f14b212a579dd3be8fcaad9 from qemu	2020-06-17 00:39:27 -04:00
Peter Maydell	4731a69d66	target/arm: Convert Neon VEXT to decodetree Convert the Neon VEXT insn to decodetree. Rather than keeping the old implementation which used fixed temporaries cpu_V0 and cpu_V1 and did the extraction with by-hand shift and logic ops, we use the TCG extract2 insn. We don't need to special case 0 or 8 immediates any more as the optimizer is smart enough to throw away the dead code. Backports commit 0aad761fb0aed40c99039eacac470cbd03d07019 from qemu	2020-06-17 00:29:04 -04:00
Peter Maydell	1aa9046120	target/arm: Convert Neon 2-reg-scalar long multiplies to decodetree Convert the Neon 2-reg-scalar long multiplies to decodetree. These are the last instructions in the group. Backports commit 77e576a9281825fc170f3b3af83f47e110549b5c from qemu	2020-06-17 00:24:12 -04:00
Peter Maydell	088a1e8ba9	target/arm: Convert Neon 2-reg-scalar VQRDMLAH, VQRDMLSH to decodetree Convert the VQRDMLAH and VQRDMLSH insns in the 2-reg-scalar group to decodetree. Backports commit aa318f5b9b4ab3b6744b5305dd8ae9b96676f20e from qemu	2020-06-17 00:15:18 -04:00
Peter Maydell	c0551804d4	target/arm: Convert Neon 2-reg-scalar VQDMULH, VQRDMULH to decodetree Convert the VQDMULH and VQRDMULH insns in the 2-reg-scalar group to decodetree. Backports commit b2fc7be972b94872f6a6dd32d9bda1b88ddbcaad from qemu	2020-06-17 00:11:56 -04:00
Peter Maydell	2e8ae1130e	target/arm: Convert Neon 2-reg-scalar float multiplies to decodetree Convert the float versions of VMLA, VMLS and VMUL in the Neon 2-reg-scalar group to decodetree. Backports commit 85ac9aef9a5418de3168df569e21258e853840a2 from qemu	2020-06-17 00:09:32 -04:00
Peter Maydell	bf1b0374b9	target/arm: Convert Neon 2-reg-scalar integer multiplies to decodetree Convert the VMLA, VMLS and VMUL insns in the Neon "2 registers and a scalar" group to decodetree. These are 32x32->32 operations where one of the inputs is the scalar, followed by a possible accumulate operation of the 32-bit result. The refactoring removes some of the oddities of the old decoder: * operands to the operation and accumulation were often reversed (taking advantage of the fact that most of these ops are commutative); the new code follows the pseudocode order * the Q bit in the insn was in a local variable 'u'; in the new code it is decoded into a->q Backports commit 96fc80f5f186decd1a649f6c04252faceb057ad2 from qemu	2020-06-17 00:04:29 -04:00
Peter Maydell	1817f28afd	target/arm: Add missing TCG temp free in do_2shift_env_64() In commit 37bfce81b10450071 we accidentally introduced a leak of a TCG temporary in do_2shift_env_64(); free it. Backports commit a4f67e180def790ff0bbb33fc93bb6e80382f041 from qemu	2020-06-16 23:57:17 -04:00
Peter Maydell	06dfc2ada6	target/arm: Add 'static' and 'const' annotations to VSHLL function arrays Mark the arrays of function pointers in trans_VSHLL_S_2sh() and trans_VSHLL_U_2sh() as both 'static' and 'const'. Backports commit 448f0e5f3ecfbd089b934e5e3aa0ccd1f51a6174 from qemu	2020-06-16 23:56:30 -04:00
Peter Maydell	6383a2bd15	target/arm: Convert Neon 3-reg-diff polynomial VMULL Convert the Neon 3-reg-diff insn polynomial VMULL. This is the last insn in this group to be converted. Backports commit 18fb58d588898550919392277787979ee7d0d84e from qemu	2020-06-16 23:54:51 -04:00
Peter Maydell	090426b120	target/arm: Convert Neon 3-reg-diff saturating doubling multiplies Convert the Neon 3-reg-diff insns VQDMULL, VQDMLAL and VQDMLSL: these are all saturating doubling long multiplies with a possible accumulate step. These are the last insns in the group which use the pass-over-each elements loop, so we can delete that code. Backports commit 9546ca5998d3cbd98a81b2d46a2e92a11b0f78a4 from qemu	2020-06-16 23:51:56 -04:00
Peter Maydell	5464405d5c	target/arm: Convert Neon 3-reg-diff long multiplies Convert the Neon 3-reg-diff insns VMULL, VMLAL and VMLSL; these perform a 32x32->64 multiply with possible accumulate. Note that for VMLSL we do the accumulate directly with a subtraction rather than doing a negate-then-add as the old code did. Backports commit 3a1d9eb07b767a7592abca642af80906f9eab0ed from qemu	2020-06-16 23:47:28 -04:00
Peter Maydell	21044a1d11	target/arm: Convert Neon 3-reg-diff VABAL, VABDL to decodetree Convert the Neon 3-reg-diff insns VABAL and VABDL to decodetree. Like almost all the remaining insns in this group, these are a combination of a two-input operation which returns a double width result and then a possible accumulation of that double width result into the destination. Backports commit f5b28401200ec95ba89552df3ecdcdc342f6b90b from qemu	2020-06-16 23:41:20 -04:00
Peter Maydell	34418f1998	target/arm: Convert Neon 3-reg-diff narrowing ops to decodetree Convert the narrow-to-high-half insns VADDHN, VSUBHN, VRADDHN, VRSUBHN in the Neon 3-registers-different-lengths group to decodetree. Backports commit 0fa1ab0302badabc3581aefcbb2f189ef52c4985 from qemu	2020-06-16 23:36:18 -04:00
Peter Maydell	d25998ba7d	target/arm: Convert Neon 3-reg-diff prewidening ops to decodetree Convert the "pre-widening" insns VADDL, VSUBL, VADDW and VSUBW in the Neon 3-registers-different-lengths group to decodetree. These insns work by widening one or both inputs to double their size, performing an add or subtract at the doubled size and then storing the double-size result. As usual, rather than copying the loop of the original decoder (which needs awkward code to avoid problems when source and destination registers overlap) we just unroll the two passes. Backports commit b28be09570d0827969b62b8f82b0f720a9915427 from qemu	2020-06-16 23:29:53 -04:00
Peter Maydell	a9d0e36bcf	target/arm: Fix missing temp frees in do_vshll_2sh The widenfn() in do_vshll_2sh() does not free the input 32-bit TCGv, so we need to do this in the calling code. Backports commit 9593a3988c3e788790aa107d778386b09f456a6d from qemu	2020-06-16 23:26:04 -04:00
Thomas Huth	6053203c1c	target/i386: Remove obsolete TODO file The last real change to this file is from 2012, so it is very likely that this file is completely out-of-date and ignored today. Let's simply remove it to avoid confusion if someone finds it by accident. Backports commit 3575b0aea983ad57804c9af739ed8ff7bc168393 from qemu	2020-06-15 13:22:56 -04:00
Joseph Myers	18b0ae9ebd	target/i386: correct fix for pcmpxstrx substring search This corrects a bug introduced in my previous fix for SSE4.2 pcmpestri / pcmpestrm / pcmpistri / pcmpistrm substring search, commit ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60. That commit fixed a bug that showed up in four GCC tests with one libc implementation. The tests in question generate random inputs to the intrinsics and compare results to a C implementation, but they only test 1024 possible random inputs, and when the tests use the cases of those instructions that work with word rather than byte inputs, it's easy to have problematic cases that show up much less frequently than that. Thus, testing with a different libc implementation, and so a different random number generator, showed up a problem with the previous patch. When investigating the previous test failures, I found the description of these instructions in the Intel manuals (starting from computing a 16x16 or 8x8 set of comparison results) confusing and hard to match up with the more optimized implementation in QEMU, and referred to AMD manuals which described the instructions in a different way. Those AMD descriptions are very explicit that the whole of the string being searched for must be found in the other operand, not running off the end of that operand; they say "If the prototype and the SUT are equal in length, the two strings must be identical for the comparison to be TRUE.". However, that statement is incorrect. In my previous commit message, I noted: The operation in this case is a search for a string (argument d to the helper) in another string (argument s to the helper); if a copy of d at a particular position would run off the end of s, the resulting output bit should be 0 whether or not the strings match in the region where they overlap, but the QEMU implementation was wrongly comparing only up to the point where s ends and counting it as a match if an initial segment of d matched a terminal segment of s. 
Here, "run off the end of s" means that some byte of d would overlap some byte outside of s; thus, if d has zero length, it is considered to match everywhere, including after the end of s. The description "some byte of d would overlap some byte outside of s" is accurate only when understood to refer to overlapping some byte within the 16-byte operand but at or after the zero terminator; it is valid to run over the end of s if the end of s is the end of the 16-byte operand. So the fix in the previous patch for the case of d being empty was correct, but the other part of that patch was not correct (as it never allowed partial matches even at the end of the 16-byte operand). Nor was the code before the previous patch correct for the case of d nonempty, as it would always have allowed partial matches at the end of s. Fix with a partial revert of my previous change, combined with inserting a check for the special case of s having maximum length to determine where it is necessary to check for matches. In the added test, test 1 is for the case of empty strings, which failed before my 2017 patch, test 2 is for the bug introduced by my 2017 patch and test 3 deals with the case where a match of an initial segment at the end of the string is not valid when the string ends before the end of the 16-byte operand (that is, the case that would be broken by a simple revert of the non-empty-string part of my 2017 patch). Backports commit bc921b2711c4e2e8ab99a3045f6c0f134a93b535 from qemu	2020-06-15 13:20:48 -04:00
Joseph Myers	e79024e0cf	target/i386: fix IEEE x87 floating-point exception raising Most x87 instruction implementations fail to raise the expected IEEE floating-point exceptions because they do nothing to convert the exception state from the softfloat machinery into the exception flags in the x87 status word. There is special-case handling of division to raise the divide-by-zero exception, but that handling is itself buggy: it raises the exception in inappropriate cases (inf / 0 and nan / 0, which should not raise any exceptions, and 0 / 0, which should raise "invalid" instead). Fix this by converting the floating-point exceptions raised during an operation by the softfloat machinery into exceptions in the x87 status word (passing through the existing fpu_set_exception function for handling related to trapping exceptions). There are special cases where some functions convert to integer internally but exceptions from that conversion are not always correct exceptions for the instruction to raise. There might be scope for some simplification if the softfloat exception state either could always be assumed to be in sync with the state in the status word, or could always be ignored at the start of each instruction and just set to 0 then; I haven't looked into that in detail, and it might run into interactions with the various ways the emulation does not yet handle trapping exceptions properly. I think the approach taken here, of saving the softfloat state, setting exceptions there to 0 and then merging the old exceptions back in after carrying out the operation, is conservatively safe. Backports commit 975af797f1e04e4d1b1a12f1731141d3770fdbce from qemu	2020-06-15 13:19:27 -04:00
Joseph Myers	cb50df6aae	target/i386: fix fisttpl, fisttpll handling of out-of-range values The fist / fistt family of instructions should all store the most negative integer in the destination format when the rounded / truncated integer result is out of range or the input is an invalid encoding, infinity or NaN. The fisttpl and fisttpll implementations (32-bit and 64-bit results, truncate towards zero) failed to do this, producing the most positive integer in some cases instead. Fix this by copying the code used to handle this issue for fistpl and fistpll, adjusted to use the _round_to_zero functions for the actual conversion (but without any other changes to that code). Backports commit c8af85b10c818709755f5dc8061c69920611fd4c from qemu	2020-06-15 13:10:23 -04:00
Joseph Myers	ceaa77e576	target/i386: fix fbstp handling of out-of-range values The fbstp implementation fails to check for out-of-range and invalid values, instead just taking the result of conversion to int64_t and storing its sign and low 18 decimal digits. Fix this by checking for an out-of-range result (invalid conversions always result in INT64_MAX or INT64_MIN from the softfloat code, which are large enough to be considered as out-of-range by this code) and storing the packed BCD indefinite encoding in that case. Backports commit 374ff4d0a3c2cce2bc6e4ba8a77eaba55c165252 from qemu	2020-06-15 13:09:23 -04:00
Joseph Myers	477a0af161	target/i386: fix fbstp handling of negative zero The fbstp implementation stores +0 when the rounded result should be -0 because it compares an integer value with 0 to determine the sign. Fix this by checking the sign bit of the operand instead. Backports commit 18c53e1e73197a24f9f4b66b1276eb9868db5bf0 from qemu	2020-06-15 13:08:38 -04:00
Joseph Myers	c796ee5e13	target/i386: fix fxam handling of invalid encodings The fxam implementation does not check for invalid encodings, instead treating them like NaN or normal numbers depending on the exponent. Fix it to check that the high bit of the significand is set before treating an encoding as NaN or normal, thus resulting in correct handling (all of C0, C2 and C3 cleared) for invalid encodings. Backports commit 34b9cc076ff423023a779a04a9f7cd7c17372cbf from qemu	2020-06-15 13:07:54 -04:00
Joseph Myers	5a01ea31eb	target/i386: fix floating-point load-constant rounding The implementations of the fldl2t, fldl2e, fldpi, fldlg2 and fldln2 instructions load fixed constants independent of the rounding mode. Fix them to load a value correctly rounded for the current rounding mode (but always rounded to 64-bit precision independent of the precision control, and without setting "inexact") as specified. Backports commit 80b4008c805ebcfd4c0d302ac31c1689e34571e0 from qemu	2020-06-15 13:07:06 -04:00
Joseph Myers	95368d250b	target/i386: fix fscale handling of rounding precision The fscale implementation uses floatx80_scalbn for the final scaling operation. floatx80_scalbn ends up rounding the result using the dynamic rounding precision configured for the FPU. But only a limited set of x87 floating-point instructions are supposed to respect the dynamic rounding precision, and fscale is not in that set. Fix the implementation to save and restore the rounding precision around the call to floatx80_scalbn. Backports commit c535d68755576bfa33be7aef7bd294a601f776e0 from qemu	2020-06-15 13:05:31 -04:00
Joseph Myers	ad83656acc	target/i386: fix fscale handling of infinite exponents The fscale implementation passes infinite exponents through to generic code that rounds the exponent to a 32-bit integer before using floatx80_scalbn. In round-to-nearest mode, and ignoring exceptions, this works in many cases. But it fails to handle the special cases of scaling 0 by a +Inf exponent or an infinity by a -Inf exponent, which should produce a NaN, and because it produces an inexact result for finite nonzero numbers being scaled, the result is sometimes incorrect in other rounding modes. Add appropriate handling of infinite exponents to produce a NaN or an appropriately signed exact zero or infinity as a result. Backports commit c1c5fb8f9067c830e36830c2b82c0ec146c03d7b from qemu	2020-06-15 13:04:46 -04:00
Joseph Myers	bbbf25fdd9	target/i386: fix fscale handling of invalid exponent encodings The fscale implementation does not check for invalid encodings in the exponent operand, thus treating them like INT_MIN (the value returned for invalid encodings by floatx80_to_int32_round_to_zero). Fix it to treat them similarly to signaling NaN exponents, thus generating a quiet NaN result. Backports commit b40eec96b26028b68c3594fbf34b6d6f029df26a from qemu	2020-06-15 13:03:54 -04:00
Joseph Myers	d96c218664	target/i386: fix fscale handling of signaling NaN The implementation of the fscale instruction returns a NaN exponent unchanged. Fix it to return a quiet NaN when the provided exponent is a signaling NaN. Backports commit 0d48b436327955c69e2eb53f88aba9aa1e0dbaa0 from qemu	2020-06-15 13:03:16 -04:00
Joseph Myers	18fc17ca25	target/i386: implement special cases for fxtract The implementation of the fxtract instruction treats all nonzero operands as normal numbers, so yielding incorrect results for invalid formats, infinities, NaNs and subnormal and pseudo-denormal operands. Implement appropriate handling of all those cases. Backports commit c415f2c58296d86e9abb7e4a133111acf7031da3 from qemu	2020-06-15 13:02:33 -04:00
Liran Alon	7373942623	i386/cpu: Store LAPIC bus frequency in CPU structure No functional change. This information will be used by following patches. Backports commit 73b994f6d74ec00a1d78daf4145096ff9f0e2982 from qemu	2020-06-15 13:00:58 -04:00
Janne Grunau	6f41687234	target/i386: fix phadd* with identical destination and source register Detected by asm test suite failures in dav1d (https://code.videolan.org/videolan/dav1d). Can be reproduced by `qemu-x86_64 -cpu core2duo ./tests/checkasm --test=mc_8bpc 1659890620`. Backports commit 2dfbea1a872727fb747ca6adf2390e09956cdc6e from qemu	2020-06-15 12:59:49 -04:00
Philippe Mathieu-Daudé	34930da196	target/i386: Fix OUTL debug output Fix OUTL instructions incorrectly displayed as OUTW. Backports commit ce8540fde2cb535923a52a012f57b418eea85e1b from qemu	2020-06-15 12:56:33 -04:00
Richard Henderson	a93d01c61d	target/arm: Use a non-overlapping group for misc control The miscellaneous control instructions are mutually exclusive within the t32 decode sub-group. Backports commit d6084fba47bb9aef79775c1102d4b647eb58c365 from qemu	2020-06-15 12:52:48 -04:00
Richard Henderson	b45a02e2f7	decodetree: Multi-cleanup Includes multiple changes by Richard Henderson as follows: - Use proper varargs to print the arguments. (2fd51b19c9) - Rename MultiPattern to IncMultiPattern (040145c4f8) - Split out MultiPattern from IncMultiPattern (df63044d02) - Allow group covering the entire insn space (b44b3449a0) - Move semantic propagation into classes (08561fc128) - Implement non-overlapping groups (067e8b0f45) - Drop check for less than 2 patterns in a group (fe079aa13d)	2020-06-15 12:49:02 -04:00
Peter Maydell	7427cca6cc	target/arm: Convert Neon one-register-and-immediate insns to decodetree Convert the insns in the one-register-and-immediate group to decodetree. In the new decode, our asimd_imm_const() function returns a 64-bit value rather than a 32-bit one, which means we don't need to treat cmode=14 op=1 as a special case in the decoder (it is the only encoding where the two halves of the 64-bit value are different). Backports commit 2c35a39eda0b16c2ed85c94cec204bf5efb97812 from qemu	2020-06-15 12:44:54 -04:00
Peter Maydell	93e6d464c8	target/arm: Convert VCVT fixed-point ops to decodetree Convert the VCVT fixed-point conversion operations in the Neon 2-regs-and-shift group to decodetree. Backports commit 3da26f11711caeaa18318b6afa14dfb81d7650ab from qemu	2020-06-15 12:40:59 -04:00
Peter Maydell	a5f903b2a5	target/arm: Convert Neon VSHLL, VMOVL to decodetree Convert the VSHLL and VMOVL insns from the 2-reg-shift group to decodetree. Since the loop always has two passes, we unroll it to avoid the awkward reassignment of one TCGv to another. Backports commit 968bf842742a5ffbb0041cb31089e61a9f7a833d from qemu	2020-06-15 12:35:32 -04:00
Peter Maydell	6fc8fdaa2b	target/arm: Convert Neon narrowing shifts with op==9 to decodetree Convert the remaining Neon narrowing shifts to decodetree: * VQSHRN * VQRSHRN Backports commit b4a3a77bb7a0dff1cc5673fe3be467d9e3635d44 from qemu	2020-06-15 12:31:35 -04:00
Peter Maydell	ef29b91a43	target/arm: Convert Neon narrowing shifts with op==8 to decodetree Convert the Neon narrowing shifts where op==8 to decodetree: * VSHRN * VRSHRN * VQSHRUN * VQRSHRUN Backports commit 712182d340e33c2ce86143f25fb2f04ae23d90de from qemu	2020-06-15 12:29:09 -04:00
Peter Maydell	69a3312e3a	target/arm: Convert VQSHLU, VQSHL 2-reg-shift insns to decodetree Convert the VQSHLU and VQSHL 2-reg-shift insns to decodetree. These are the last of the simple shift-by-immediate insns. Backports commit 37bfce81b10450071193c8495a07f182ec652e2a from qemu	2020-06-15 12:21:10 -04:00
Peter Maydell	055c96f985	target/arm: Convert Neon VSHR 2-reg-shift insns to decodetree Convert the VSHR 2-reg-shift insns to decodetree. Note that unlike the legacy decoder, we present the right shift amount to the trans_ function as a positive integer. Backports commit 66432d6b8294e3508218b360acfdf7c244eea993 from qemu	2020-06-15 12:15:29 -04:00
Peter Maydell	bf18bf983d	target/arm: Convert Neon VSHL and VSLI 2-reg-shift insn to decodetree Convert the VSHL and VSLI insns from the Neon 2-registers-and-a-shift group to decodetree. Backports commit d3c8c736f8b4bdd02831076286b1788232f46ced from qemu	2020-06-15 12:07:02 -04:00
Richard Henderson	1d95dd1c89	target/arm: Split helper_crypto_sm3tt Rather than passing an opcode to a helper, fully decode the operation at translate time. Use clear_tail_16 to zap the balance of the SVE register with the AdvSIMD write. Backports commit 43fa36c96c24349145497adc1b451f9caf74e344 from qemu	2020-06-14 23:24:21 -04:00
Richard Henderson	5ca8caf656	target/arm: Split helper_crypto_sha1_3reg Rather than passing an opcode to a helper, fully decode the operation at translate time. Use clear_tail_16 to zap the balance of the SVE register with the AdvSIMD write. Backports commit afc8b7d32668547308bdd654a63cf5228936e0ba from qemu	2020-06-14 23:18:45 -04:00
Richard Henderson	41c4efdb22	target/arm: Convert sha1 and sha256 to gvec helpers Do not yet convert the helpers to loop over opr_sz, but the descriptor allows the vector tail to be cleared, which fixes an existing bug vs SVE. Backports commit effa992f153f5e7ab97ab843b565690748c5b402 from qemu	2020-06-14 23:11:28 -04:00
Richard Henderson	2c6c4da80c	target/arm: Convert sha512 and sm3 to gvec helpers Do not yet convert the helpers to loop over opr_sz, but the descriptor allows the vector tail to be cleared, which fixes an existing bug vs SVE. Backports commit aaffebd6d3135b8aed7e61932af53b004d261579 from qemu	2020-06-14 23:01:49 -04:00
Richard Henderson	894f2168da	target/arm: Convert rax1 to gvec helpers With this conversion, we will be able to use the same helpers with sve. This also fixes a bug in which we failed to clear the high bits of the SVE register after an AdvSIMD operation. Backports commit 1738860d7e60dec5dbeba17f8b44d31aae3accac from qemu	2020-06-14 22:49:36 -04:00
Richard Henderson	1df7314dc3	target/arm: Convert aes and sm4 to gvec helpers With this conversion, we will be able to use the same helpers with sve. In particular, pass 3 vector parameters for the 3-operand operations; for advsimd the destination register is also an input. This also fixes a bug in which we failed to clear the high bits of the SVE register after an AdvSIMD operation. Backports commit a04b68e1d4c4f0cd5cd7542697b1b230b84532f5 from qemu	2020-06-14 22:41:33 -04:00
Alistair Francis	2b2f91f82c	target/riscv: Add the lowRISC Ibex CPU The reset vector is set in the init function; don't set it again in realize. Backports commit 36b80ad99f7ea4979a4c5fc6e4072619b405e3b0 from qemu	2020-06-14 22:28:55 -04:00
Alistair Francis	2584ab8ee5	target/riscv: Drop support for ISA spec version 1.09.1 The RISC-V ISA spec version 1.09.1 has been deprecated in QEMU since 4.1. It's not commonly used so let's remove support for it. Backports commit 1a9540d1f1a9c5022d9273d0244e5809679dd33b from qemu	2020-06-14 22:23:26 -04:00
Alistair Francis	e35d56a146	target/riscv: Remove the deprecated CPUs	2020-06-14 22:15:16 -04:00
Richard Henderson	0e68fa345e	tcg: Improve move ops in liveness_pass_2 If the output of the move is dead, then the last use is in the store. If we propagate the input to the store, then we can remove the move opcode entirely. Backports commit 61f15c487fc2aea14f6b0e52c459ae8b7d252a65 from qemu	2020-06-14 22:13:04 -04:00
Richard Henderson	6b91e9bae1	tcg/i386: Implement INDEX_op_rotl{i,s,v}_vec For immediates, we must continue the special casing of 8-bit elements. The other element sizes and shift types are trivially implemented with shifts. Backports commit 885b1706df6f0211a22e120fac910fb3abf3e733 from qemu	2020-06-14 22:09:24 -04:00
Richard Henderson	cc3187b1e4	tcg: Implement gvec support for rotate by scalar No host backend support yet, but the interfaces for rotls are in place. Only implement left-rotate for now, as the only known use of vector rotate by scalar is s390x, so any right-rotate would be unused and untestable. Backports commit 23850a74afb641102325b4b7f74071d929fc4594 from qemu	2020-06-14 22:00:50 -04:00
Richard Henderson	2aa9d13120	tcg: Remove expansion to shift by vector from do_shifts We do not reflect this expansion in tcg_can_emit_vecop_list, so it is unused and unusable. However, we actually perform the same expansion in do_gvec_shifts, so it is also unneeded. Backports commit 3d5bb2ea5cc9ed54f65a6929a6e6baa01cabd98b from qemu	2020-06-14 21:53:36 -04:00
Richard Henderson	be78062fd8	tcg: Implement gvec support for rotate by vector No host backend support yet, but the interfaces for rotlv and rotrv are in place. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- v3: Drop the generic expansion from rot to shift; we can do better for each backend, and then this code becomes unused. Backports commit 5d0ceda902915e3f0e21c39d142c92c4e97c3ebb from qemu	2020-06-14 21:43:46 -04:00
Richard Henderson	5cce52a04b	tcg: Implement gvec support for rotate by immediate No host backend support yet, but the interfaces for rotli are in place. Canonicalize immediate rotate to the left, based on a survey of architectures, but provide both left and right shift interfaces to the translators. Backports commit b0f7e7444c03da17e41bf327c8aea590104a28ab from qemu	2020-06-14 21:26:58 -04:00
Laurent Vivier	50aa85e560	target/m68k: implement opcode fetoxm1 Example provided in the launchpad bug fails with: qemu: uncaught target signal 4 (Illegal instruction) - core dumped Illegal instruction (core dumped) It appears fetoxm1 is not implemented: IN: expm1f 0x800005cc: fetoxm1x %fp2,%fp0 Disassembler disagrees with translator over instruction decoding Please report this to qemu-devel@nongnu.org (gdb) x/2hx 0x800005cc 0x800005cc: 0xf200 0x0808 This patch adds the instruction. Backports commit 250b1da35d579f42319af234f36207902ca4baa4 from qemu	2020-06-14 21:13:29 -04:00
Laurent Vivier	aa69ab54ad	target/m68k: implement fmove.l #<data>,FPCR The immediate value mode was ignored and instruction execution ends to an invalid access mode. This was found running 'R' that set FPSR to 0 at startup with a 'fmove.l #0,FPSR' in qemu-system-m68k emulation and triggers a kernel crash: [ 56.640000] * ADDRESS ERROR * FORMAT=2 [ 56.640000] Current process id is 728 [ 56.640000] BAD KERNEL TRAP: 00000000 [ 56.640000] Modules linked in: sg evdev mac_hid ip_tables x_tables sha1_generic hmac ipv6 nf_defrag_ipv6 autofs4 ext4 crc16 mbcache jbd2 crc32c_generic sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_common sr_mod cdrom mac_esp macsonic esp_scsi [ 56.640000] PC: [<00016a2c>] X_UNSUPP+0x2c/0x3c [ 56.640000] SR: 2004 SP: 3eb5e68c a2: c02e239a [ 56.640000] d0: 00000040 d1: 00000002 d2: 8002adec d3: 8002ad50 [ 56.640000] d4: 8002c768 d5: 0000000d a0: ffffffc2 a1: ffffffc1 [ 56.640000] Process R (pid: 728, task=a3dfda5d) [ 56.640000] Frame format=2 instr addr=00000000 [ 56.650000] Stack from 3a4d9f30: [ 56.650000] 41000000 00000002 00000002 ffffffc2 ffffffc1 1fff0000 80000000 00000000 [ 56.650000] 3fbf0000 80000000 00000000 00000000 20000000 00000000 7fff0000 ffffffff [ 56.650000] ffffffff 00000000 00050008 00000000 8000067c c02c2000 efffee20 000002d8 [ 56.650000] 00002a28 3a4d9f98 00000002 00000014 fffffffe 8002c768 00000002 00000041 [ 56.650000] 00000002 c041fc58 c0743758 ffffffff 00000000 0008c075 00002b24 00000012 [ 56.650000] 000007d0 00000024 00000002 c05bef04 c05bef04 0000005e 00000077 c28aca70 [ 56.650000] Call Trace: [<00050008>] copy_overflow+0x10/0x28 [ 56.650000] [<00002a28>] buserr+0x20/0x28 [ 56.650000] [<0008c075>] bpf_check+0x57f/0x1cfa [ 56.650000] [<00002b24>] syscall+0x8/0xc [ 56.650000] [<0000c019>] dn_sched_init+0x75/0x88 [ 56.650000] Code: 1017 0200 00f0 0c00 0040 66ff 0000 05ac <f23c> 8800 0000 0000 f23c 9000 0000 0000 222e ff84 082e 0005 ff1c 6600 000a 0281 [ 56.650000] Disabling lock debugging due to kernel taint ... 
Backports commit 6a0e8bb4956c34328f4624e20bd3a6c2b1d90adc from qemu	2020-06-14 21:11:54 -04:00
Huacai Chen	504946fb79	target/mips: Support variable page size Traditionally, MIPS uses a 4KB page size, but Loongson prefers a 16KB page size in the system emulator. So, let's define TARGET_PAGE_BITS_VARY and TARGET_PAGE_BITS_MIN to support variable page size. Backports commit ee3863b9d414f0b4a59a88f2a79b496a99d4f6dd from qemu	2020-06-14 21:09:51 -04:00
Peter Maydell	1c6b0339e6	target/arm: Allow user-mode code to write CPSR.E via MSR Using the MSR instruction to write to CPSR.E is deprecated, but it is required to work from any mode including unprivileged code. We were incorrectly forbidding usermode code from writing it because CPSR_USER did not include the CPSR_E bit. We use CPSR_USER in only three places: * as the mask of what to allow userspace MSR to write to CPSR * when deciding what bits a linux-user signal-return should be able to write from the sigcontext structure * in target_user_copy_regs() when we set up the initial registers for the linux-user process In the first two cases not being able to update CPSR.E is a bug, and in the third case it doesn't matter because CPSR.E is always 0 there. So we can fix both bugs by adding CPSR_E to CPSR_USER. Because the cpsr_write() in restore_sigcontext() is now changing a CPSR bit which is cached in hflags, we need to add an arm_rebuild_hflags() call there; the callsite in target_user_copy_regs() was already rebuilding hflags for other reasons. (The recommended way to change CPSR.E is to use the 'SETEND' instruction, which we do correctly allow from usermode code.) Backports commit 268b1b3dfbb92a9348406f728a33f39e3d8dcd8a from qemu	2020-06-14 21:08:03 -04:00
Richard Henderson	acdd5c6065	target/arm: Use clear_vec_high more effectively Do not explicitly store zero to the NEON high part when we can pass !is_q to clear_vec_high. Backports commit e1f778596ebfa8782276f4dd4651f2b285d734ff from qemu	2020-06-14 21:06:40 -04:00
Richard Henderson	3ac9b9b206	target/arm: Use tcg_gen_gvec_mov for clear_vec_high The 8-byte store for the end of a !is_q operation can be merged with the other stores. Use a no-op vector move to trigger the expand_clr portion of tcg_gen_gvec_mov. Backports commit 5c27392dd08bd8534893abf25ef501f1bd8680fe from qemu	2020-06-14 21:00:57 -04:00
Richard Henderson	22004b8106	softfloat: Return bool from all classification predicates This includes _is_any_nan, _is_neg, *_is_inf, etc. Backports commit 150c7a91ce7862bcaf7422f6038dcf0ba4a7eee3 from qemu	2020-05-21 18:23:11 -04:00
Richard Henderson	afd8d05aa2	softfloat: Inline floatx80 compare specializations Replace the floatx80 compare specializations with inline functions that call the standard floatx80_compare{,_quiet} functions. Use bool as the return type. Backports commit c6baf65000f826a713e8d9b5b35e617b0ca9ab5d from qemu	2020-05-21 18:17:53 -04:00
Richard Henderson	57d2419cd3	softfloat: Inline float128 compare specializations Replace the float128 compare specializations with inline functions that call the standard float128_compare{,_quiet} functions. Use bool as the return type. Backports commit b7b1ac684fea49c6bfe1ad8b706aed7b09116d15 from qemu	2020-05-21 18:15:55 -04:00
Richard Henderson	18a46c4d79	softfloat: Inline float64 compare specializations Replace the float64 compare specializations with inline functions that call the standard float64_compare{,_quiet} functions. Use bool as the return type. Backports commit 0673ecdf6cb2b1445a85283db8cbacb251c46516 from qemu	2020-05-21 18:13:44 -04:00
Richard Henderson	a35333741a	softfloat: Inline float32 compare specializations Replace the float32 compare specializations with inline functions that call the standard float32_compare{,_quiet} functions. Use bool as the return type. Backports commit 5da2d2d8e53d80e92a61720ea995c86b33cbf25d from qemu	2020-05-21 18:11:25 -04:00
Richard Henderson	d960523cbd	softfloat: Name compare relation enum Give the previously unnamed enum a typedef name. Use it in the prototypes of compare functions. Use it to hold the results of the compare functions. Backports commit 71bfd65c5fcd72f8af2735905415c7ce4220f6dc from qemu	2020-05-21 18:08:52 -04:00
Richard Henderson	8adc704058	softfloat: Name rounding mode enum Give the previously unnamed enum a typedef name. Use the packed attribute so that we do not affect the layout of the float_status struct. Use it in the prototypes of relevant functions. Adjust switch statements as necessary to avoid compiler warnings. Backports commit 3dede407cc61b64997f0c30f6dbf4df09949abc9 from qemu	2020-05-21 18:02:05 -04:00
Richard Henderson	a5c8178e35	softfloat: Change tininess_before_rounding to bool Slightly tidies the usage within softfloat.c and the representation in float_status. Backports commit a828b373bdabc7e53d1e218e3fc76f85b6674688 from qemu	2020-05-21 17:52:50 -04:00
Richard Henderson	a417227674	softfloat: Replace flag with bool We have had this on the to-do list for quite some time. Backports commit c120391c0090d9c40425c92cdb00f38ea8588ff6 from qemu	2020-05-21 17:48:12 -04:00
Richard Henderson	6530d6342f	softfloat: Use post test for floatN_mul The existing f{32,64}_addsub_post test, which checks for zero inputs, is identical to f{32,64}_mul_fast_test. Which means we can eliminate the fast_test/fast_op hooks in favor of reusing the same post hook. This means we have one fewer test along the fast path for multiply. Backports commit b240c9c497b9880ac0ba29465907d5ebecd48083 from qemu	2020-05-21 17:24:00 -04:00
Joseph Myers	c675454b27	softfloat: fix floatx80 pseudo-denormal round to integer The softfloat function floatx80_round_to_int incorrectly handles the case of a pseudo-denormal where only the high bit of the significand is set, ignoring that bit (treating the number as an exact zero) rather than treating the number as an alternative representation of +/- 2^-16382 (which may round to +/- 1 depending on the rounding mode) as hardware does. Fix this check (simplifying the code in the process). Backports commit 9ecaf5ccec13ff2e8fe1e72f6e0f3367d2169c1c from qemu	2020-05-15 23:59:23 -04:00
Joseph Myers	3d4a7e34e1	softfloat: fix floatx80 pseudo-denormal comparisons The softfloat floatx80 comparisons fail to allow for pseudo-denormals, which should compare equal to corresponding values with biased exponent 1 rather than 0. Add an adjustment for that case when comparing numbers with the same sign. Backports commit be53fa785ab766d2722628403edee75b3e6ab599 from qemu	2020-05-15 23:58:49 -04:00
Joseph Myers	85964d48d2	softfloat: fix floatx80 pseudo-denormal addition / subtraction The softfloat function addFloatx80Sigs, used for addition of values with the same sign and subtraction of values with opposite sign, fails to handle the case where the two values both have biased exponent zero and there is a carry resulting from adding the significands, which can occur if one or both values are pseudo-denormals (biased exponent zero, explicit integer bit 1). Add a check for that case, so making the results match those seen on x86 hardware for pseudo-denormals. Backports commit 41602807766e253ccb6fb761f3ff12767f786e2c from qemu	2020-05-15 23:56:24 -04:00
Joseph Myers	2ea23a5bbd	softfloat: silence sNaN for conversions to/from floatx80 Conversions between IEEE floating-point formats should convert signaling NaNs to quiet NaNs. Most of those in QEMU's softfloat code do so, but those for floatx80 fail to. Fix those conversions to silence signaling NaNs as well. Backports commit 7537c2b4a363237534c96d089a02b0712b49d890 from qemu	2020-05-15 23:54:32 -04:00
Peter Maydell	7b2fb5bc63	target/arm: Convert NEON VFMA, VFMS 3-reg-same insns to decodetree Convert the Neon floating point VFMA and VFMS insn to decodetree. These are the last insns in the 3-reg-same group so we can remove all the support/loop code from the old decoder. Backports commit e95485f85657be21135c17a9226e297c21e73360 from qemu	2020-05-15 23:49:20 -04:00
Peter Maydell	82484db863	target/arm: Convert Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS to decodetree Convert the Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS 3-reg-same insns to decodetree. (These are all the remaining non-accumulation instructions in this group.) Backports commit d5fdf9e9e1c6f2bbb0a4bcaafd85d344cce9c298 from qemu	2020-05-15 23:44:52 -04:00
Peter Maydell	a593866af6	target/arm: Move 'env' argument of recps_f32 and rsqrts_f32 helpers to usual place The usual location for the env argument in the argument list of a TCG helper is immediately after the return-value argument. recps_f32 and rsqrts_f32 differ in that they put it at the end. Move the env argument to its usual place; this will allow us to more easily use these helper functions with the gvec APIs. Backports commit 26c6f695cfd2a3ccddb4d015a25b56f56aa62928 from qemu	2020-05-15 23:41:37 -04:00
Peter Maydell	05e72483f4	target/arm: Convert Neon 3-reg-same compare insns to decodetree Convert the Neon integer 3-reg-same compare insns VCGE, VCGT, VCEQ, VACGE and VACGT to decodetree. Backports commit 727ff1d63213e6666e511956903b9e97a339ec7e from qemu	2020-05-15 23:37:53 -04:00
Peter Maydell	042df686ca	target/arm: Convert Neon fp VMUL, VMLA, VMLS 3-reg-same insns to decodetree Convert the Neon fp VMUL, VMLA, and VMLS 3-reg-same insns to decodetree. We don't have a gvec helper for multiply-accumulate, so VMLA and VMLS need a loop function do_3same_fp(). This takes a reads_vd parameter to do_3same_fp() which tells it to load the old value into vd before calling the callback function, in the same way that the do_vfp_3op_sp() and do_vfp_3op_dp() functions in translate-vfp.inc.c work. (The only uses in this patch pass reads_vd == true, but later commits will use reads_vd == false.) This conversion fixes in passing an underdecoding for VMUL. Backports commit 8aa71ead912ca0a9c0d29b74e0976f91952f950a from qemu	2020-05-15 23:35:21 -04:00
Peter Maydell	2527e76926	target/arm: Convert Neon VPMIN/VPMAX/VPADD float 3-reg-same insns to decodetree Convert the Neon float VPMIN, VPMAX and VPADD 3-reg-same insns to decodetree. These are the only remaining 'pairwise' operations, so we can delete the pairwise-specific bits of the old decoder's for-each-element loop now. Backports commit ab978335a56e3618212868fdce3a54217c6e71e6 from qemu	2020-05-15 23:31:15 -04:00
Peter Maydell	bb0aa79847	target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree. We already have gvec helpers for addition and subtraction, but must add one for fabd. Backports commit a26a352bb498662cd0c205cb433a352f86fac7d2 from qemu	2020-05-15 23:26:51 -04:00
Peter Maydell	1df5d57e8a	target/arm: Convert Neon VQDMULH/VQRDMULH 3-reg-same to decodetree Convert the Neon VQDMULH and VQRDMULH 3-reg-same insns to decodetree. These are the last integer operations in the 3-reg-same group. Backports commit 7ecc28bc72b8033cf4e0c6332135ec20d4125dfb from qemu	2020-05-15 23:06:44 -04:00
Peter Maydell	59818edb3c	target/arm: Convert Neon VPADD 3-reg-same insns to decodetree Convert the Neon integer VPADD 3-reg-same insns to decodetree. These are 'pairwise' operations. (Note that VQRDMLAH, which shares the same primary opcode but has U=1, has already been converted.) Backports commit fa22827d4eb078b6c58cd3d19af0b50ed951e832 from qemu	2020-05-15 23:01:25 -04:00
Peter Maydell	1cc6451cb6	target/arm: Convert Neon VPMAX/VPMIN 3-reg-same insns to decodetree Convert the Neon integer VPMAX and VPMIN 3-reg-same insns to decodetree. These are 'pairwise' operations. Backports commit 059c2398a2b1ae86c6722c45e79fb0d0f4d95b1d from qemu	2020-05-15 22:59:10 -04:00
Peter Maydell	f35ae14ab4	target/arm: Convert Neon VQSHL, VRSHL, VQRSHL 3-reg-same insns to decodetree Convert the VQSHL, VRSHL and VQRSHL insns in the 3-reg-same group to decodetree. We have already implemented the size==0b11 case of these insns; this commit handles the remaining sizes. Backports commit 6812dfdc6b0286730d6f903ebfbdc4f81b80c29b from qemu	2020-05-15 22:53:27 -04:00
Peter Maydell	5308fb324e	target/arm: Convert Neon VRHADD, VHSUB 3-reg-same insns to decodetree Convert the Neon VRHADD and VHSUB 3-reg-same insns to decodetree. (These are all the other insns in 3-reg-same which were using GEN_NEON_INTEGER_OP() and which are not pairwise or reversed-operands.) Backports commit 8e44d03f4b5590e19a4f7910ca1c327609933dd7 from qemu	2020-05-15 22:50:02 -04:00
Peter Maydell	ec327c7fc8	target/arm: Convert Neon VABA/VABD 3-reg-same to decodetree Convert the Neon VABA and VABD insns in the 3-reg-same group to decodetree. Backports commit 7715098f93ff5205334edf161e5fe156346122b0 from qemu	2020-05-15 22:46:02 -04:00
Peter Maydell	f1028fe4a7	target/arm: Convert Neon VHADD 3-reg-same insns Convert the Neon VHADD insns in the 3-reg-same group to decodetree. Backports commit cb294bca866f1cd776e44e03e5e432942bc676e8 from qemu	2020-05-15 22:43:01 -04:00
Peter Maydell	4098e0b80a	target/arm: Convert Neon 64-bit element 3-reg-same insns Convert the 64-bit element insns in the 3-reg-same group to decodetree. This covers VQSHL, VRSHL and VQRSHL where size==0b11. Backports commit 35d4352fa9e94b35bf17f58181cb16c184b98d56 from qemu	2020-05-15 22:40:48 -04:00
Peter Maydell	e2b703a82c	target/arm: Convert Neon 3-reg-same SHA to decodetree Convert the Neon SHA instructions in the 3-reg-same group to decodetree. Backports commit 21290edfc29d8929741c0ed043733c23c69bc3b9 from qemu	2020-05-15 22:34:40 -04:00
Richard Henderson	1740e018f4	target/arm: Convert Neon 3-reg-same VQRDMLAH/VQRDMLSH to decodetree Convert the Neon VQRDMLAH and VQRDMLSH insns in the 3-reg-same group to decodetree. These don't use do_3same() because they want to operate on VFP double registers, whose offsets are different from the neon_reg_offset() calculations do_3same does. Backports commit a063569508af8295cf6271e06700e5b956bb402d from qemu	2020-05-15 22:20:23 -04:00
Richard Henderson	451683ee79	target/arm: Vectorize SABA/UABA Include 64-bit element size in preparation for SVE2. Backports commit cfdb2c0c95ae9205b0dd7f0f5e970cdec50fef20 from qemu	2020-05-15 22:15:14 -04:00
Richard Henderson	98c79f9afc	target/arm: Vectorize SABD/UABD Include 64-bit element size in preparation for SVE2. Backports commit 50c160d44eb059c7fc7f348ae2c3b0cb41437044 from qemu	2020-05-15 22:01:29 -04:00
Richard Henderson	765dbb57f0	target/arm: Clear tail in gvec_fmul_idx_, gvec_fmla_idx_ Must clear the tail for AdvSIMD when SVE is enabled. Fixes: ca40a6e6e39 Backports commit 525d9b6d42844e187211d25b69be8b378785bc24 from qemu	2020-05-15 21:50:30 -04:00
Richard Henderson	73d08253a2	target/arm: Pass pointer to qc to qrdmla/qrdmls Pass a pointer directly to env->vfp.qc[0], rather than env. This will allow SVE2, which does not modify QC, to pass a pointer to dummy storage. Change the return type of inl_qrdml.h_s16 to match the sense of the operation: signed. Backports commit e286bf4a72fe3a60490b8d6e3f28d6335677e08c from qemu	2020-05-15 21:48:35 -04:00
Richard Henderson	3c4f226e00	target/arm: Create gen_gvec_{qrdmla,qrdmls} Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Backports commit 146aa66ce58b686b8037d0eb3921c1125942dbde from qemu	2020-05-15 21:43:22 -04:00
Richard Henderson	efdcad70b1	target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32 These operations do not touch fp_status. Backports commit fe6fb4beb2f9bb0afc813e565504b66a92bbf04b from qemu	2020-05-15 21:32:03 -04:00
Richard Henderson	9dfc0479ff	target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub} Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Backports commit c7715b6b51a6f7a5412c5fcb40a4c8586105e597 from qemu	2020-05-15 21:25:06 -04:00
Richard Henderson	4abfe5156d	target/arm: Create gen_gvec_{cmtst,ushl,sshl} Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Backports commit 8161b75357095fef54c76b1a6ed1e54d0e8655e0 from qemu	2020-05-15 21:15:49 -04:00
Richard Henderson	15b2850f4d	target/arm: Swap argument order for VSHL during decode Rather than perform the argument swap during code generation, perform it during decode. This means it doesn't have to be special cased later, and we can share code with aarch64 code generation. Hopefully the decode comment addresses any confusion that might arise in between. Backports commit e9eee5316ffec5f37643de806b2e5577c5c189cf from qemu	2020-05-15 21:07:59 -04:00
Richard Henderson	546db9089c	target/arm: Create gen_gvec_{mla,mls} Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Backports commit 271063206a46062a45fc6bab8dabe45f0b88159d from qemu	2020-05-15 21:06:06 -04:00
Richard Henderson	340f97bf4c	target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0 Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Macro-ize the 5 nearly identical comparisons. Backports commit 69d5e2bf8c3cefedbfa1c1670137e636dbd7faa5 from qemu	2020-05-15 20:57:33 -04:00
Richard Henderson	e08c2b8ece	target/arm: Tidy handle_vec_simd_shri Now that we've converted all cases to gvec, there is quite a bit of dead code at the end of the function. Remove it. Sink the call to gen_gvec_fn2i to the end, loading a function pointer within the switch statement. Backports commit 3f08f0bce841e7857ec98ce7909629d0c335005e from qemu	2020-05-15 20:47:47 -04:00
Richard Henderson	7a1750d691	target/arm: Remove unnecessary range check for VSHL In 1dc8425e551, while converting to gvec, I added an extra range check against the shift count. This was unnecessary because the encoding of the shift count produces 0 to the element size - 1. Backports commit 2f27c5244db300387f15d9ffa5067a204ffd625d from qemu	2020-05-15 20:42:12 -04:00
Richard Henderson	6190be3191	target/arm: Create gen_gvec_{sri,sli} The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef. Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths. Backports commit 893ab0542aa385a287cbe46d5535c8b9e95ce699 from qemu	2020-05-15 20:39:28 -04:00
Richard Henderson	2609e6f319	target/arm: Create gen_gvec_{u,s}{rshr,rsra} Create vectorized versions of handle_shri_with_rndacc for shift+round and shift+round+accumulate. Add out-of-line helpers in preparation for longer vector lengths from SVE. Backports commit 6ccd48d4ea244c1c46a24dfa50bfb547f11422dd from qemu	2020-05-15 20:28:44 -04:00
Richard Henderson	5d7c46204d	target/arm: Create gen_gvec_[us]sra The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef. Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths. Backports commit 631e565450c483e0622eec3d8b61d7fa41d16bca from qemu	2020-05-15 20:10:32 -04:00
Richard Henderson	4be4ca57b1	target/arm: Fix tcg_gen_gvec_dup_imm vs DUP (indexed) DUP (indexed) can duplicate 128-bit elements, so using esz unconditionally can assert in tcg_gen_gvec_dup_imm. Fixes: 8711e71f9cbb Backports commit 7e17d50ebd359ee5fa3d65d7fdc0fe0336d60694 from qemu	2020-05-11 17:22:52 -04:00
Lioncash	5c03efd5d6	arm/helper: Amend sign conversion warning	2020-05-11 17:21:25 -04:00
Lioncash	08cc2c6dcc	arm/cpu64: Remove unused variable	2020-05-11 17:18:13 -04:00
Richard Henderson	f93deb0786	target/arm: Use tcg_gen_gvec_5_ptr for sve FMLA/FCMLA Now that we can pass 7 parameters, do not encode register operands within simd_data. Backports commit 08975da9f0bfcfa654628cae71201a351ba5449a from qemu	2020-05-11 17:17:17 -04:00
Thomas Huth	dfe548117e	target/arm: Make set_feature() available for other files Move the common set_feature() and unset_feature() functions from cpu.c and cpu64.c to cpu.h. Backports commit 5fda95041d7237ab35733ceb66e0cb89f6107169 from qemu	2020-05-11 17:02:21 -04:00
Philippe Mathieu-Daudé	cfe94f63f3	target/arm/cpu: Use ARRAY_SIZE() to iterate over ARMCPUInfo[] Since on the aarch64-linux-user build, arm_cpus[] is empty, add the cpu_count variable and only iterate when it is non-zero. Backports commit 92b6a659388ab3735e5fbb17ac486923b681f57f from qemu	2020-05-11 16:59:54 -04:00
Richard Henderson	4016b667f3	accel/tcg: Add block comment for probe_access Backports commit 857129b34190a4c2e782006dc255352a6cd3934b from qemu	2020-05-11 16:42:10 -04:00
Edgar E. Iglesias	91dbd53f77	target/arm: Drop access_el3_aa32ns_aa64any() Calling access_el3_aa32ns() works for AArch32 only cores but it does not handle 32-bit EL2 on top of 64-bit EL3 for mixed 32/64-bit cores. Merge access_el3_aa32ns_aa64any() into access_el3_aa32ns() and only use the latter. Fixes: 68e9c2fe65 ("target-arm: Add VTCR_EL2") Backports commit 93dd1e6140e2652347cfe7208591d4cd32762d08 from qemu	2020-05-11 16:39:40 -04:00
MerryMage	9255fbce96	target/arm: Introduce add_reg_for_lit (fixup) Backports commit 16e0d8234ef9291747332d2c431e46808a060472 from qemu. Missed from original backporting commit `a2e60445de`.	2020-05-10 12:30:52 +01:00
Richard Henderson	742301a7c1	tcg: Fix integral argument type to tcg_gen_rot[rl]i_i{32,64} For the benefit of compatibility of function pointer types, we have standardized on int32_t and int64_t as the integral argument to tcg expanders. We converted most of them in 474b2e8f0f7, but missed the rotates. Backports commit 07dada0336a83002dfa8673a9220a88e13d9a45c from qemu	2020-05-07 10:41:01 -04:00
Richard Henderson	0bcd0ca93d	tcg: Add load_dest parameter to GVecGen2 We have this same parameter for GVecGen2i, GVecGen3, and GVecGen3i. This will make some SVE2 insns easier to parameterize. Backports commit ac09ae627e9a2c65c8a452b69c3dac33c29d0719 from qemu	2020-05-07 10:35:47 -04:00
Richard Henderson	f02f71f38f	tcg: Improve vector tail clearing Better handling of non-power-of-2 tails as seen with Arm 8-byte vector operations. Backports commit f47db80cc073c0a7a22136c8296b5eca20c0e199 from qemu	2020-05-07 10:24:00 -04:00
Richard Henderson	549b0ec3c5	tcg: Add tcg_gen_gvec_dup_tl For use when a target needs to pass a configure-specific target_ulong value to duplicate. Backports commit 0f039e3ad9131966d9fe509c231b756868b015e2 from qemu	2020-05-07 10:12:09 -04:00
Richard Henderson	e65806c356	tcg: Remove tcg_gen_gvec_dup{8,16,32,64}i These interfaces are now unused. Backports commit 398f21412aeec158338963e3f71c9313bc126a71 from qemu	2020-05-07 10:11:00 -04:00
Richard Henderson	43a72b0540	tcg: Use tcg_gen_gvec_dup_imm in logical simplifications Replace the outgoing interface. Backports commit 03ddb6f315ca6d02dfdba0aecc43aa97c728c428 from qemu	2020-05-07 10:09:53 -04:00
Richard Henderson	b0f6374149	target/arm: Use tcg_gen_gvec_dup_imm In a few cases, we're able to remove some manual replication. Backports commit 8711e71f9cbb692d614e6ecf5d51222372f7b77e from qemu	2020-05-07 10:05:49 -04:00
Richard Henderson	07f622e57d	tcg: Add tcg_gen_gvec_dup_imm Add a version of tcg_gen_dup_* that takes both immediate and a vector element size operand. This will replace the set of tcg_gen_gvec_dup{8,16,32,64}i functions that encode the element size within the function name. Backports commit 44c94677febd15488f9190b11eaa4a08e8ac696b from qemu	2020-05-07 09:55:25 -04:00
Peter Maydell	d350125eab	target/arm: Move gen_ function typedefs to translate.h We're going to want at least some of the NeonGen* typedefs for the refactored 32-bit Neon decoder, so move them all to translate.h since it makes more sense to keep them in one group. Backports commit 9aefc6cf9b73f66062d2f914a0136756e7a28211 from qemu	2020-05-07 09:51:52 -04:00
Peter Maydell	652165d671	target/arm: Convert Neon 3-reg-same VMUL, VMLA, VMLS, VSHL to decodetree Convert the Neon VMUL, VMLA, VMLS and VSHL insns in the 3-reg-same grouping to decodetree. Backports commit 0de34fd48ad4e44bf5caa2330657ebefa93cea7d from qemu	2020-05-07 09:50:44 -04:00
Peter Maydell	17bd8930fc	target/arm: Convert Neon 3-reg-same VQADD/VQSUB to decodetree Convert the Neon VQADD/VQSUB insns in the 3-reg-same grouping to decodetree. Backports commit 7a9497f1cf73667a4744d09673b808c20e067915 from qemu	2020-05-07 09:47:18 -04:00
Peter Maydell	d52b830ce3	target/arm: Convert Neon 3-reg-same comparisons to decodetree Convert the Neon comparison ops in the 3-reg-same grouping to decodetree. Backports commit 02bd0cdb64b3e79419ba3a8746cb86430883b3ae from qemu	2020-05-07 09:45:03 -04:00
Peter Maydell	c6f9fb54fd	target/arm: Convert Neon 3-reg-same VMAX/VMIN to decodetree Convert the Neon 3-reg-same VMAX and VMIN insns to decodetree. Backports commit 36b59310c38d45213bf860affa90618aa5eeca93 from qemu	2020-05-07 09:42:04 -04:00
Peter Maydell	d30f99ca79	target/arm: Convert Neon 3-reg-same logic ops to decodetree Convert the Neon logic ops in the 3-reg-same grouping to decodetree. Note that for the logic ops the 'size' field forms part of their decode and the actual operations are always bitwise. Backports commit 35a548edb6f5043386183b9f6b4139d99d1f130a from qemu	2020-05-07 09:40:10 -04:00
Peter Maydell	eae3ce9899	target/arm: Convert Neon 3-reg-same VADD/VSUB to decodetree Convert the Neon 3-reg-same VADD and VSUB insns to decodetree. Note that we don't need the neon_3r_sizes[op] check here because all size values are OK for VADD and VSUB; we'll add this when we convert the first insn that has size restrictions. For this we need one of the GVecGen*Fn typedefs currently in translate-a64.h; move them all to translate.h as a block so they are visible to the 32-bit decoder. Backports commit a4e143ac5b9185f670d2f17ee9cc1a430047cb65 from qemu	2020-05-07 09:36:28 -04:00
Peter Maydell	c7a31355fc	target/arm: Convert Neon 'load/store single structure' to decodetree Convert the Neon "load/store single structure to one lane" insns to decodetree. As this is the last set of insns in the neon load/store group, we can remove the whole disas_neon_ls_insn() function. Backports commit 123ce4e3daba26b760b472687e1fb1ad82cf1993 from qemu	2020-05-07 09:32:17 -04:00
Peter Maydell	302506f2f6	target/arm: Convert Neon 'load single structure to all lanes' to decodetree Convert the Neon "load single structure to all lanes" insns to decodetree. Backports commit 3698747c48db871d876a398592c5a23d7580ed4a from qemu	2020-05-07 09:29:03 -04:00
Peter Maydell	7aad825fa6	target/arm: Convert Neon load/store multiple structures to decodetree Convert the Neon "load/store multiple structures" insns to decodetree. Backports commit a27b46304352a0eced45e560e96515dbe3cc174f from qemu	2020-05-07 09:25:51 -04:00
Peter Maydell	9814c1722f	target/arm: Convert VFM[AS]L (scalar) to decodetree Convert the VFM[AS]L (scalar) insns in the 2reg-scalar-ext group to decodetree. These are the last ones in the group so we can remove all the legacy decode for the group. Note that in disas_thumb2_insn() the parts of this encoding space where the decodetree decoder returns false will correctly be directed to illegal_op by the "(insn & (1 << 28))" check so they won't fall into disas_coproc_insn() by mistake. Backports commit d27e82f7d02f35e5919bd9cbbcb157f3537069a0 from qemu	2020-05-07 09:20:35 -04:00
Peter Maydell	49cdb7e2db	target/arm: Convert V[US]DOT (scalar) to decodetree Convert the V[US]DOT (scalar) insns in the 2reg-scalar-ext group to decodetree. Backports commit 35f5d4d1747558c6af2d914bcd848dcc30c3b531 from qemu	2020-05-07 09:17:32 -04:00

5864 commits