Commit graph

1474 commits

Author SHA1 Message Date
Peter Maydell dd6e11eaa7 target/arm: Make VFP_CONV_FIX macros take separate float type and float size
Currently the VFP_CONV_FIX macros take a single fsz argument for the
size of the float type, which is used both to select the name of
the functions to call (eg float32_is_any_nan()) and also for the
type to use for the float inputs and outputs (eg float32).

Separate these into fsz and ftype arguments, so that we can use them
for fp16, which uses 'float16' in the function names but is still
passing inputs and outputs in a 32-bit sized type.

Backports 5366f6ad7da4f6def2733ec7ee24495430256839
2021-02-28 05:05:53 -05:00
Peter Maydell f8241ae22f target/arm: Implement VFP fp16 VCVT between float and integer
Backports 0094e9f475a5a742d10d2f1e1beceea82b69f982
2021-02-28 05:02:25 -05:00
Peter Maydell ac9ae5cbe7 target/arm: Implement VFP fp16 VLDR and VSTR
Implement the fp16 versions of the VFP VLDR/VSTR (immediate).

Backports commit 274afbb121107b8aaeaa11b3e7904d5f8ae38a94
2021-02-28 04:58:32 -05:00
Peter Maydell 5d98e14545 target/arm: Implement VFP fp16 VCMP
Implement fp16 version of VCMP.

Backports 1b88b054c5b201e8581114d29527c6a5a7e088c9
2021-02-28 04:56:24 -05:00
Peter Maydell 25d95570f3 target/arm: Implement VFP fp16 for VMOV immediate
Implement VFP fp16 support for the VMOV immediate insn.

Backports commit 28c28728e53c9f4c13a5cd50f313788c7ec2f9ad
2021-02-28 04:51:11 -05:00
Peter Maydell 2d9abf7c0b target/arm: Implement VFP fp16 for VABS, VNEG, VSQRT
Implement VFP fp16 for VABS, VNEG and VSQRT. This is all
the fp16 insns that use the DO_VFP_2OP macro, because there
is no fp16 version of VMOV_reg.

Notes:
* the gen_helper_vfp_negh already exists as we needed to create
it for the fp16 multiply-add insns
* as usual we need to use the f16 version of the fp_status;
this is only relevant for VSQRT

Backports ce2d65a5d191380756cdac7a1fd1ba76bd1621cf
2021-02-28 04:48:28 -05:00
Peter Maydell f3af6b8c25 target/arm: Macroify uses of do_vfp_2op_sp() and do_vfp_2op_dp()
Macroify the uses of do_vfp_2op_sp() and do_vfp_2op_dp(); this will
make it easier to add the halfprec support.

Backports 009a07335b8ff492d940e1eb229a1b0d302c2512
2021-02-28 04:43:01 -05:00
Peter Maydell 6ac2c597ab target/arm: Implement VFP fp16 for fused-multiply-add
Implement VFP fp16 support for fused multiply-add insns
VFNMA, VFNMS, VFMA, VFMS.

Backports 9886fe2834b064a3cf0675a4659942ed547aed42
2021-02-28 04:39:21 -05:00
Peter Maydell f86c84425b target/arm: Macroify trans functions for VFMA, VFMS, VFNMA, VFNMS
Macroify creation of the trans functions for single and double
precision VFMA, VFMS, VFNMA, VFNMS. The repetition was OK for
two sizes, but we're about to add halfprec and it will get a bit
more than seems reasonable.

Backports 2aa8dcfa14558fe2a63ed0496d60b02565c9a225
2021-02-28 04:36:07 -05:00
Peter Maydell a42ecfe203 target/arm: Implement VFP fp16 VMLA, VMLS, VNMLS, VNMLA, VNMUL
Implement fp16 versions of the VFP VMLA, VMLS, VNMLS, VNMLA, VNMUL
instructions. (These are all the remaining ones which we implement
via do_vfp_3op_[hsd]p().)

Backports commit e7cb0ded52c6d7b86585b09935fe7caeb9e38b69
2021-02-28 04:29:37 -05:00
Peter Maydell eae621098d target/arm: Implement VFP fp16 for VFP_BINOP operations
Implmeent VFP fp16 support for simple binary-operator VFP insns VADD,
VSUB, VMUL, VDIV, VMINNM and VMAXNM:

* make the VFP_BINOP() macro generate float16 helpers as well as
float32 and float64
* implement a do_vfp_3op_hp() function similar to the existing
do_vfp_3op_sp()
* add decode for the half-precision insn patterns

Note that the VFP_BINOP macro use creates a couple of unused helper
functions vfp_maxh and vfp_minh, but they're small so it's not worth
splitting the BINOP operations into "needs halfprec" and "no
halfprec" groups.

Backports commit 120a0eb3ea23a5b06fae2f3daebd46a4035864cf
2021-02-28 04:24:39 -05:00
Peter Maydell 1afb240134 target/arm: Use correct ID register check for aa32_fp16_arith
The aa32_fp16_arith feature check function currently looks at the
AArch64 ID_AA64PFR0 register. This is (as the comment notes) not
correct. The bogus check was put in mostly to allow testing of the
fp16 variants of the VCMLA instructions and it was something of
a mistake that we allowed them to exist in master.

Switch the feature check function to testing VMFR1.FPHP, which is
what it ought to be.

This will remove emulation of the VCMLA and VCADD insns from
AArch32 code running on an AArch64 '-cpu max' using system emulation.
(They were never enabled for aarch32 linux-user and system-emulation.)
Since we weren't advertising their existence via the AArch32 ID
register, well-behaved guests wouldn't have been using them anyway.

Once we have implemented all the AArch32 support for the FP16 extension
we will advertise it in the MVFR1 ID register field, which will reenable
these insns along with all the others.

Backports 02bc236d0131a666d4ac2bb7197bbad2897c336a
2021-02-27 16:47:48 -05:00
Peter Maydell b93ca1fca6 target/arm: Remove local definitions of float constants
In several places the target/arm code defines local float constants
for 2, 3 and 1.5, which are also provided by include/fpu/softfloat.h.
Remove the unnecessary local duplicate versions.

Backports b684e49a17da39539b0ac6e4c4c98b28b38feb76
2021-02-27 16:47:10 -05:00
Chen Qun 46af765bbb target/arm/translate-a64:Remove redundant statement in disas_simd_two_reg_misc_fp16()
Clang static code analyzer show warning:
target/arm/translate-a64.c:13007:5: warning: Value stored to 'rd' is never read
rd = extract32(insn, 0, 5);
^ ~~~~~~~~~~~~~~~~~~~~~
target/arm/translate-a64.c:13008:5: warning: Value stored to 'rn' is never read
rn = extract32(insn, 5, 5);
^ ~~~~~~~~~~~~~~~~~~~~~

Backports fa71dd531c12ad9a05cdd78392e9fc2a30ea921d
2021-02-27 16:45:25 -05:00
Chen Qun 9bac2113cd target/arm/translate-a64:Remove dead assignment in handle_scalar_simd_shli()
Clang static code analyzer show warning:
target/arm/translate-a64.c:8635:14: warning: Value stored to 'tcg_rn' during its
initialization is never read
TCGv_i64 tcg_rn = new_tmp_a64(s);
^~~~~~ ~~~~~~~~~~~~~~
target/arm/translate-a64.c:8636:14: warning: Value stored to 'tcg_rd' during its
initialization is never read
TCGv_i64 tcg_rd = new_tmp_a64(s);
^~~~~~ ~~~~~~~~~~~~~~

Backports 07174c86b41e91d98ed2ee0ee12e516694853c6b
2021-02-27 16:44:29 -05:00
Lioncash f5a21abc0b target/arm: Convert sq{, r}dmulh to gvec for aa64 advsimd 2021-02-26 15:01:44 -05:00
Richard Henderson aa97b6b755 target/arm: Convert integer multiply-add (indexed) to gvec for aa64 advsimd
Backports 3607440c4df6498585a570cfc1041e4972b41b56
2021-02-26 14:51:17 -05:00
Richard Henderson 732674b868 target/arm: Convert integer multiply (indexed) to gvec for aa64 advsimd
Backports 2e5a265e6a9e7169c4a3e87db261b2fa92582590
2021-02-26 14:46:29 -05:00
Richard Henderson 80325ac866 target/arm: Generalize inl_qrdmlah_* helper functions
Unify add/sub helpers and add a parameter for rounding.
This will allow saturating non-rounding to reuse this code.

Backports d21798856b227a20a0a41640236af445f4f4aeb0
2021-02-26 14:41:32 -05:00
Richard Henderson 1bedcfbda3 target/arm: Tidy SVE tszimm shift formats
Rather than require the user to fill in the immediate (shl or shr),
create full formats that include the immediate.
2021-02-26 14:35:53 -05:00
Richard Henderson da41a23a1b target/arm: Split out gen_gvec_ool_zz
Backports 40e32e5a8a379baf6e0d49d83cf19950cfbaf96b
2021-02-26 14:32:36 -05:00
Richard Henderson 5bd98feed9 target/arm: Split out gen_gvec_ool_zzz
Backports e645d1a17a359156c6047006d760ca176d493edb
2021-02-26 14:29:48 -05:00
Richard Henderson aa3819c396 target/arm: Split out gen_gvec_ool_zzp
Model after gen_gvec_fn_zzz et al.

Backports 96a461f7c12587d3a64a71e4d90cda5c09ca3eb4
2021-02-26 14:26:33 -05:00
Lioncash 2da89a626c target/arm: Merge helper_sve_clr_* and helper_sve_movz_* 2021-02-26 14:23:06 -05:00
Richard Henderson 8eb3642d96 target/arm: Split out gen_gvec_ool_zzzp
Model after gen_gvec_fn_zzz et al.

Backports 36cbb7a8e7100864c488a1153cecba90b1c33a4c
2021-02-26 14:14:13 -05:00
Richard Henderson 9b3671e9ad target/arm: Use tcg_gen_gvec_bitsel for trans_SEL_pppp
The gvec operation was added after the initial implementation
of the SEL instruction and was missed in the conversion.

Backports d4bc623254b55e2f9613c9450216fa7e50c03929
2021-02-26 14:12:25 -05:00
Richard Henderson c8c247410f target/arm: Clean up 4-operand predicate expansion
Move the check for !S into do_pppp_flags, which allows to merge in
do_vecop4_p. Split out gen_gvec_fn_ppp without sve_access_check,
to mirror gen_gvec_fn_zzz.

Backport dd81a8d7cf5c90963603806e58a217bbe759f75e
2021-02-26 14:07:14 -05:00
Richard Henderson 7bef6489a8 target/arm: Merge do_vector2_p into do_mov_p
This is the only user of the function

Backports d0b2df5a01eeccbac71d4d883158b91e7f9a6a29
2021-02-26 13:59:00 -05:00
Richard Henderson f329d428f3 target/arm: Rearrange {sve,fp}_check_access assert
We want to ensure that access is checked by the time we ask
for a specific fp/vector register. We want to ensure that
we do not emit two lots of code to raise an exception.

But sometimes it's difficult to cleanly organize the code
such that we never pass through sve_check_access exactly once.
Allow multiple calls so long as the result is true, that is,
no exception to be raised.

Backports 8a40fe5f1bf3837ae3f9961efe1d51e7214f2664
2021-02-26 13:56:27 -05:00
Richard Henderson 64822511dd target/arm: Split out gen_gvec_fn_zzz, do_zzz_fn
Model gen_gvec_fn_zzz on gen_gvec_fn3 in translate-a64.c, but
indicating which kind of register and in which order.

Model do_zzz_fn on the other do_foo functions that take an
argument set and verify sve enabled.

Backports 28c4da31be6a5e501b60b77bac17652dd3211378
2021-02-26 13:53:10 -05:00
Richard Henderson 3146cbb64e target/arm: Split out gen_gvec_fn_zz
Model the new function on gen_gvec_fn2 in translate-a64.c, but
indicating which kind of register and in which order. Since there
is only one user of do_vector2_z, fold it into do_mov_z

Backports f7d79c41fa4bd0f0d27dcd14babab8575fbed39f
2021-02-26 13:50:05 -05:00
Richard Henderson 6f341e0199 target/arm: Fill in the WnR syndrome bit in mte_check_fail
According to AArch64.TagCheckFault, none of the other ISS values are
provided, so we do not need to go so far as merge_syn_data_abort.
But we were missing the WnR bit.

Backports commit 9a4670be7f0734d27bf4058db3becf83cd0cc9d5 from qemu
2021-02-26 12:26:15 -05:00
Richard Henderson 6969435fb8 target/arm: Pass the entire mte descriptor to mte_check_fail
We need more information than just the mmu_idx in order
to create the proper exception syndrome. Only change the
function signature so far.

Backports dbf8c32178291169e111a6a9fd7ae17af4a3039d
2021-02-26 12:19:51 -05:00
Philippe Mathieu-Daudé d4c59cce4e target/arm: Clarify HCR_EL2 ARMCPRegInfo type
In commit ce4afed839 ("target/arm: Implement AArch32 HCR and HCR2")
the HCR_EL2 register has been changed from type NO_RAW (no underlying
state and does not support raw access for state saving/loading) to
type CONST (TCG can assume the value to be constant), removing the
read/write accessors.
We forgot to remove the previous type ARM_CP_NO_RAW. This is not
really a problem since the field is overwritten. However it makes
code review confuse, so remove it.

Backports 0e5aac18bc31dbdfab51f9784240d0c31a4c5579
2021-02-26 12:18:15 -05:00
Peter Maydell 3e5aa58139 target/arm: Use correct FPST for VCMLA, VCADD on fp16
When we implemented the VCMLA and VCADD insns we put in the
code to handle fp16, but left it using the standard fp status
flags. Correct them to use FPST_STD_F16 for fp16 operations.

Bacports commit b34aa5129e9c3aff890b4f4bcc84962e94185629
2021-02-26 12:02:23 -05:00
Peter Maydell 61377ce01c target/arm: Implement FPST_STD_F16 fpstatus
Architecturally, Neon FP16 operations use the "standard FPSCR" like
all other Neon operations. However, this is defined in the Arm ARM
pseudocode as "a fixed value, except that FZ16 (and AHP) follow the
FPSCR bits". In QEMU, the softfloat float_status doesn't include
separate flush-to-zero for FP16 operations, so we must keep separate
fp_status for "Neon non-FP16" and "Neon fp16" operations, in the
same way we do already for the non-Neon "fp_status" vs "fp_status_f16".

Add the extra float_status field to the CPU state structure,
ensure it is correctly initialized and updated on FPSCR writes,
and make fpstatus_ptr(FPST_STD_F16) return a pointer to it.

Backports commit aaae563bc73de0598bbc09a102e68f27fafe704a
2021-02-26 12:00:25 -05:00
Peter Maydell b1b0a41507 target/arm: Make A32/T32 use new fpstatus_ptr() API
Make A32/T32 code use the new fpstatus_ptr() API:
get_fpstatus_ptr(0) -> fpstatus_ptr(FPST_FPCR)
get_fpstatus_ptr(1) -> fpstatus_ptr(FPST_STD)

Backports a84d1d1316726704edd2617b2c30c921d98a8137
2021-02-26 11:55:55 -05:00
Peter Maydell 79359e3a69 target/arm: Replace A64 get_fpstatus_ptr() with generic fpstatus_ptr()
We currently have two versions of get_fpstatus_ptr(), which both take
an effectively boolean argument:
* the one for A64 takes "bool is_f16" to distinguish fp16 from other ops
* the one for A32/T32 takes "int neon" to distinguish Neon from other ops

This is confusing, and to implement ARMv8.2-FP16 the A32/T32 one will
need to make a four-way distinction between "non-Neon, FP16",
"non-Neon, single/double", "Neon, FP16" and "Neon, single/double".
The A64 version will then be a strict subset of the A32/T32 version.

To clean this all up, we want to go to a single implementation which
takes an enum argument with values FPST_FPCR, FPST_STD,
FPST_FPCR_F16, and FPST_STD_F16. We rename the function to
fpstatus_ptr() so that unconverted code gets a compilation error
rather than silently passing the wrong thing to the new function.

This commit implements that new API, and converts A64 to use it:
get_fpstatus_ptr(false) -> fpstatus_ptr(FPST_FPCR)
get_fpstatus_ptr(true) -> fpstatus_ptr(FPST_FPCR_F16)

Backports commit cdfb22bb7326fee607d9553358856cca341dbc9a
2021-02-26 11:46:51 -05:00
Peter Maydell e9240f0f54 target/arm: Delete unused ARM_FEATURE_CRC
In commit 962fcbf2efe57231a9f5df we converted the uses of the
ARM_FEATURE_CRC bit to use the aa32_crc32 isar_feature test
instead. However we forgot to remove the now-unused definition
of the feature name in the enum. Delete it now.

Backports commit cf6303d262e31f4812dfeb654c6c6803e52000af
2021-02-26 11:24:40 -05:00
Peter Maydell e0000d1700 target/arm/translate.c: Delete/amend incorrect comments
In arm_tr_init_disas_context() we have a FIXME comment that suggests
"cpu_M0 can probably be the same as cpu_V0". This isn't in fact
possible: cpu_V0 is used as a temporary inside gen_iwmmxt_shift(),
and that function is called in various places where cpu_M0 contains a
live value (i.e. between gen_op_iwmmxt_movq_M0_wRn() and
gen_op_iwmmxt_movq_wRn_M0() calls). Remove the comment.

We also have a comment on the declarations of cpu_V0/V1/M0 which
claims they're "for efficiency". This isn't true with modern TCG, so
replace this comment with one which notes that they're only used with
the iwmmxt decode

Backports 8b4c9a50dc9531a729ae4b5941d287ad0422db48
2021-02-26 11:23:52 -05:00
Peter Maydell 0759bb8eaf target/arm: Delete unused VFP_DREG macros
As part of the Neon decodetree conversion we removed all
the uses of the VFP_DREG macros, but forgot to remove the
macro definitions. Do so now.

Backports e60527c5d501e5015a119a0388a27abeae4dac09
2021-02-26 11:22:01 -05:00
Peter Maydell 368323b03f target/arm: Remove ARCH macro
The ARCH() macro was used a lot in the legacy decoder, but
there are now just two uses of it left. Since a macro which
expands out to a goto is liable to be confusing when reading
code, replace the last two uses with a simple open-coded
qeuivalent.

Backports ce51c7f522ca488c795c3510413e338021141c96
2021-02-26 11:21:20 -05:00
Peter Maydell 5d9c0addcf target/arm: Convert T32 coprocessor insns to decodetree
Convert the T32 coprocessor instructions to decodetree.
As with the A32 conversion, this corrects an underdecoding
where we did not check that MRRC/MCRR [24:21] were 0b0010
and so treated some kinds of LDC/STC and MRRC/MCRR rather
than UNDEFing them.

Backports commit 4c498dcfd84281f20bd55072630027d1b3c115fd
2021-02-26 11:19:35 -05:00
Peter Maydell bdaaac68f5 target/arm: Do M-profile NOCP checks early and via decodetree
For M-profile CPUs, the architecture specifies that the NOCP
exception when a coprocessor is not present or disabled should cover
the entire wide range of coprocessor-space encodings, and should take
precedence over UNDEF exceptions. (This is the opposite of
A-profile, where checking for a disabled FPU has to happen last.)

Implement this with decodetree patterns that cover the specified
ranges of the encoding space. There are a few instructions (VLLDM,
VLSTM, and in v8.1 also VSCCLRM) which are in copro-space but must
not be NOCP'd: these must be handled also in the new m-nocp.decode so
they take precedence.

This is a minor behaviour change: for unallocated insn patterns in
the VFP area (cp=10,11) we will now NOCP rather than UNDEF when the
FPU is disabled.

As well as giving us the correct architectural behaviour for v8.1M
and the recommended behaviour for v8.0M, this refactoring also
removes the old NOCP handling from the remains of the 'legacy
decoder' in disas_thumb2_insn(), paving the way for cleaning that up.

Since we don't currently have a v8.1M feature bit or any v8.1M CPUs,
the minor changes to this logic that we'll need for v8.1M are marked
up with TODO comments.

Backports commit a3494d4671797c291c88bd414acb0aead15f7239 from qemu
2021-02-26 11:17:23 -05:00
Peter Maydell c675b73b1f target/arm: Tidy up disas_arm_insn()
The only thing left in the "legacy decoder" is the handling
of disas_xscale_insn(), and we can simplify the code.

Backports commit 8198c071bc55bee55ef4f104a5b125f541b51096
2021-02-26 10:59:09 -05:00
Peter Maydell fc4cc9d95f target/arm: Convert A32 coprocessor insns to decodetree
Convert the A32 coprocessor instructions to decodetree.

Note that this corrects an underdecoding: for the 64-bit access case
(MRRC/MCRR) we did not check that bits [24:21] were 0b0010, so we
would incorrectly treat LDC/STC as MRRC/MCRR rather than UNDEFing
them.

The decodetree versions of these insns assume the coprocessor
is in the range 0..7 or 14..15. This is architecturally sensible
(as per the comments) and OK in practice for QEMU because the only
uses of the ARMCPRegInfo infrastructure we have that aren't
for coprocessors 14 or 15 are the pxa2xx use of coprocessor 6.
We add an assertion to the define_one_arm_cp_reg_with_opaque()
function to catch any accidental future attempts to use it to
define coprocessor registers for invalid coprocessors.

Backports commit cd8be50e58f63413c033531d3273c0e44851684f from qemu
2021-02-26 10:57:00 -05:00
Peter Maydell ef0e23f1f9 target/arm: Separate decode from handling of coproc insns
As a prelude to making coproc insns use decodetree, split out the
part of disas_coproc_insn() which does instruction decoding from the
part which does the actual work, and make do_coproc_insn() handle the
UNDEF-on-bad-permissions and similar cases itself rather than
returning 1 to eventually percolate up to a callsite that calls
unallocated_encoding() for it.

Backports 19c23a9baafc91dd3881a7a4e9bf454e42d24e4e
2021-02-26 10:53:52 -05:00
Peter Maydell 2944a75b98 target/arm: Pull handling of XScale insns out of disas_coproc_insn()
At the moment we check for XScale/iwMMXt insns inside
disas_coproc_insn(): for CPUs with ARM_FEATURE_XSCALE all copro insns
with cp 0 or 1 are handled specially. This works, but is an odd
place for this check, because disas_coproc_insn() is called from both
the Arm and Thumb decoders but the XScale case never applies for
Thumb (all the XScale CPUs were ARMv5, which has only Thumb1, not
Thumb2 with the 32-bit coprocessor insn encodings). It also makes it
awkward to convert the real copro access insns to decodetree.

Move the identification of XScale out to its own function
which is only called from disas_arm_insn().

Backports commit 7b4f933db865391a90a3b4518bb2050a83f2a873 from qemu
2021-02-26 10:50:32 -05:00
Peter Maydell 0718459fb3 target/arm: Fix Rt/Rt2 in ESR_ELx for copro traps from AArch32 to 64
When a coprocessor instruction in an AArch32 guest traps to AArch32
Hyp mode, the syndrome register (HSR) includes Rt and Rt2 fields
which are simply copies of the Rt and Rt2 fields from the trapped
instruction. However, if the instruction is trapped from AArch32 to
an AArch64 higher exception level, the Rt and Rt2 fields in the
syndrome register (ESR_ELx) must be the AArch64 view of the register.
This makes a difference if the AArch32 guest was in a mode other than
User or System and it was using r13 or r14, or if it was in FIQ mode
and using r8-r14.

We don't know at translate time which AArch32 CPU mode we are in, so
we leave the values we generate in our prototype syndrome register
value at translate time as the raw Rt/Rt2 from the instruction, and
instead correct them to the AArch64 view when we find we need to take
an exception from AArch32 to AArch64 with one of these syndrome
values.

Fixes: https://bugs.launchpad.net/qemu/+bug/1879587

Backports commit a65dabf71a9f9b949d556b1b57fd72595df92398 from qemu
2021-02-25 23:50:18 -05:00
Peter Collingbourne 7de60dfa51 target/arm: Fix decode of LDRA[AB] instructions
These instructions use zero as the discriminator, not SP.

Backports commit d250bb19ced3b702c7c37731855f6876d0cc7995 from qemu
2021-02-25 23:47:25 -05:00