Commit graph

6953 commits

Author SHA1 Message Date
Peter Maydell beee4ad7f3 target/arm: Implement VFP vp16 VCVT-with-specified-rounding-mode
Implement the fp16 versions of the VFP VCVT instruction forms
which convert between floating point and integer with a specified
rounding mode.

Backports c505bc6a9d50a48f9d89d6cf930e863838a5b367
2021-02-28 05:18:07 -05:00
Peter Maydell 74a6af4e23 target/arm: Implement VFP fp16 VCVT between float and fixed-point
Implement the fp16 versions of the VFP VCVT instruction forms which
convert between floating point and fixed-point.

Backports a149e2de0b63e3906729ed1d3df7d9ecdb6de5e6
2021-02-28 05:15:40 -05:00
Peter Maydell 9c5b6f06a2 target/arm: Use macros instead of open-coding fp16 conversion helpers
Now the VFP_CONV_FIX macros can handle fp16's distinction between the
width of the operation and the width of the type used to pass operands,
use the macros rather than the open-coded functions.

This creates an extra six helper functions, all of which we are going
to need for the AArch32 VFP fp16 instructions.

Backports commit 414ba270c4fb758d987adf37ae9bfe531715c604
2021-02-28 05:08:44 -05:00
Peter Maydell dd6e11eaa7 target/arm: Make VFP_CONV_FIX macros take separate float type and float size
Currently the VFP_CONV_FIX macros take a single fsz argument for the
size of the float type, which is used both to select the name of
the functions to call (eg float32_is_any_nan()) and also for the
type to use for the float inputs and outputs (eg float32).

Separate these into fsz and ftype arguments, so that we can use them
for fp16, which uses 'float16' in the function names but is still
passing inputs and outputs in a 32-bit sized type.

Backports 5366f6ad7da4f6def2733ec7ee24495430256839
2021-02-28 05:05:53 -05:00
Peter Maydell f8241ae22f target/arm: Implement VFP fp16 VCVT between float and integer
Backports 0094e9f475a5a742d10d2f1e1beceea82b69f982
2021-02-28 05:02:25 -05:00
Peter Maydell ac9ae5cbe7 target/arm: Implement VFP fp16 VLDR and VSTR
Implement the fp16 versions of the VFP VLDR/VSTR (immediate).

Backports commit 274afbb121107b8aaeaa11b3e7904d5f8ae38a94
2021-02-28 04:58:32 -05:00
Peter Maydell 5d98e14545 target/arm: Implement VFP fp16 VCMP
Implement fp16 version of VCMP.

Backports 1b88b054c5b201e8581114d29527c6a5a7e088c9
2021-02-28 04:56:24 -05:00
Peter Maydell 25d95570f3 target/arm: Implement VFP fp16 for VMOV immediate
Implement VFP fp16 support for the VMOV immediate insn.

Backports commit 28c28728e53c9f4c13a5cd50f313788c7ec2f9ad
2021-02-28 04:51:11 -05:00
Peter Maydell 2d9abf7c0b target/arm: Implement VFP fp16 for VABS, VNEG, VSQRT
Implement VFP fp16 for VABS, VNEG and VSQRT. This is all
the fp16 insns that use the DO_VFP_2OP macro, because there
is no fp16 version of VMOV_reg.

Notes:
* the gen_helper_vfp_negh already exists as we needed to create
it for the fp16 multiply-add insns
* as usual we need to use the f16 version of the fp_status;
this is only relevant for VSQRT

Backports ce2d65a5d191380756cdac7a1fd1ba76bd1621cf
2021-02-28 04:48:28 -05:00
Peter Maydell f3af6b8c25 target/arm: Macroify uses of do_vfp_2op_sp() and do_vfp_2op_dp()
Macroify the uses of do_vfp_2op_sp() and do_vfp_2op_dp(); this will
make it easier to add the halfprec support.

Backports 009a07335b8ff492d940e1eb229a1b0d302c2512
2021-02-28 04:43:01 -05:00
Peter Maydell 6ac2c597ab target/arm: Implement VFP fp16 for fused-multiply-add
Implement VFP fp16 support for fused multiply-add insns
VFNMA, VFNMS, VFMA, VFMS.

Backports 9886fe2834b064a3cf0675a4659942ed547aed42
2021-02-28 04:39:21 -05:00
Peter Maydell f86c84425b target/arm: Macroify trans functions for VFMA, VFMS, VFNMA, VFNMS
Macroify creation of the trans functions for single and double
precision VFMA, VFMS, VFNMA, VFNMS. The repetition was OK for
two sizes, but we're about to add halfprec and it will get a bit
more than seems reasonable.

Backports 2aa8dcfa14558fe2a63ed0496d60b02565c9a225
2021-02-28 04:36:07 -05:00
Peter Maydell a42ecfe203 target/arm: Implement VFP fp16 VMLA, VMLS, VNMLS, VNMLA, VNMUL
Implement fp16 versions of the VFP VMLA, VMLS, VNMLS, VNMLA, VNMUL
instructions. (These are all the remaining ones which we implement
via do_vfp_3op_[hsd]p().)

Backports commit e7cb0ded52c6d7b86585b09935fe7caeb9e38b69
2021-02-28 04:29:37 -05:00
Peter Maydell eae621098d target/arm: Implement VFP fp16 for VFP_BINOP operations
Implmeent VFP fp16 support for simple binary-operator VFP insns VADD,
VSUB, VMUL, VDIV, VMINNM and VMAXNM:

* make the VFP_BINOP() macro generate float16 helpers as well as
float32 and float64
* implement a do_vfp_3op_hp() function similar to the existing
do_vfp_3op_sp()
* add decode for the half-precision insn patterns

Note that the VFP_BINOP macro use creates a couple of unused helper
functions vfp_maxh and vfp_minh, but they're small so it's not worth
splitting the BINOP operations into "needs halfprec" and "no
halfprec" groups.

Backports commit 120a0eb3ea23a5b06fae2f3daebd46a4035864cf
2021-02-28 04:24:39 -05:00
Peter Maydell 1afb240134 target/arm: Use correct ID register check for aa32_fp16_arith
The aa32_fp16_arith feature check function currently looks at the
AArch64 ID_AA64PFR0 register. This is (as the comment notes) not
correct. The bogus check was put in mostly to allow testing of the
fp16 variants of the VCMLA instructions and it was something of
a mistake that we allowed them to exist in master.

Switch the feature check function to testing VMFR1.FPHP, which is
what it ought to be.

This will remove emulation of the VCMLA and VCADD insns from
AArch32 code running on an AArch64 '-cpu max' using system emulation.
(They were never enabled for aarch32 linux-user and system-emulation.)
Since we weren't advertising their existence via the AArch32 ID
register, well-behaved guests wouldn't have been using them anyway.

Once we have implemented all the AArch32 support for the FP16 extension
we will advertise it in the MVFR1 ID register field, which will reenable
these insns along with all the others.

Backports 02bc236d0131a666d4ac2bb7197bbad2897c336a
2021-02-27 16:47:48 -05:00
Peter Maydell b93ca1fca6 target/arm: Remove local definitions of float constants
In several places the target/arm code defines local float constants
for 2, 3 and 1.5, which are also provided by include/fpu/softfloat.h.
Remove the unnecessary local duplicate versions.

Backports b684e49a17da39539b0ac6e4c4c98b28b38feb76
2021-02-27 16:47:10 -05:00
Chen Qun 46af765bbb target/arm/translate-a64:Remove redundant statement in disas_simd_two_reg_misc_fp16()
Clang static code analyzer show warning:
target/arm/translate-a64.c:13007:5: warning: Value stored to 'rd' is never read
rd = extract32(insn, 0, 5);
^ ~~~~~~~~~~~~~~~~~~~~~
target/arm/translate-a64.c:13008:5: warning: Value stored to 'rn' is never read
rn = extract32(insn, 5, 5);
^ ~~~~~~~~~~~~~~~~~~~~~

Backports fa71dd531c12ad9a05cdd78392e9fc2a30ea921d
2021-02-27 16:45:25 -05:00
Chen Qun 9bac2113cd target/arm/translate-a64:Remove dead assignment in handle_scalar_simd_shli()
Clang static code analyzer show warning:
target/arm/translate-a64.c:8635:14: warning: Value stored to 'tcg_rn' during its
initialization is never read
TCGv_i64 tcg_rn = new_tmp_a64(s);
^~~~~~ ~~~~~~~~~~~~~~
target/arm/translate-a64.c:8636:14: warning: Value stored to 'tcg_rd' during its
initialization is never read
TCGv_i64 tcg_rd = new_tmp_a64(s);
^~~~~~ ~~~~~~~~~~~~~~

Backports 07174c86b41e91d98ed2ee0ee12e516694853c6b
2021-02-27 16:44:29 -05:00
LIU Zhiwei ad78fc2df5 softfloat: Define comparison operations for bfloat16
Backports c53b1079334c41b342a8ad3b7ccfd51bf5427f5
2021-02-27 16:43:10 -05:00
LIU Zhiwei d26cd63ad6 softfloat: Define misc operations for bfloat16
Backports 5ebf5f4be66c378fd5f3dee85f54dd4942171d57
2021-02-27 16:41:46 -05:00
LIU Zhiwei d8168a8142 softfloat: Define convert operations for bfloat16
Backports 34f0c0a98a5f3bb6706088c0384f937f7a294d3e
2021-02-27 16:37:11 -05:00
LIU Zhiwei b0be0d28cc softfloat: Define operations for bfloat16
Backports 8282310d8535cc2a8431c516e907da79f92df6eb
2021-02-26 15:20:30 -05:00
Stephen Long 95a0837f2d softfloat: Add float16_is_normal
This float16 predicate was missing from the normal set.

Backports a03e924cf8a22888060fc0de4d91de053cd5cde4
2021-02-26 15:12:37 -05:00
Frank Chang d97454eb63 softfloat: Add fp16 and uint8/int8 conversion functions
Backports 0d93d8ec632154dea2627a9e989972ee09721187
2021-02-26 15:11:57 -05:00
Kito Cheng 76d123efee softfloat: Implement the full set of comparisons for float16
Backports dd205025a048ef6f53ff51eb86ddc58e7a82a771
2021-02-26 15:04:12 -05:00
Lioncash f5a21abc0b target/arm: Convert sq{, r}dmulh to gvec for aa64 advsimd 2021-02-26 15:01:44 -05:00
Richard Henderson aa97b6b755 target/arm: Convert integer multiply-add (indexed) to gvec for aa64 advsimd
Backports 3607440c4df6498585a570cfc1041e4972b41b56
2021-02-26 14:51:17 -05:00
Richard Henderson 732674b868 target/arm: Convert integer multiply (indexed) to gvec for aa64 advsimd
Backports 2e5a265e6a9e7169c4a3e87db261b2fa92582590
2021-02-26 14:46:29 -05:00
Richard Henderson 80325ac866 target/arm: Generalize inl_qrdmlah_* helper functions
Unify add/sub helpers and add a parameter for rounding.
This will allow saturating non-rounding to reuse this code.

Backports d21798856b227a20a0a41640236af445f4f4aeb0
2021-02-26 14:41:32 -05:00
Richard Henderson 1bedcfbda3 target/arm: Tidy SVE tszimm shift formats
Rather than require the user to fill in the immediate (shl or shr),
create full formats that include the immediate.
2021-02-26 14:35:53 -05:00
Richard Henderson da41a23a1b target/arm: Split out gen_gvec_ool_zz
Backports 40e32e5a8a379baf6e0d49d83cf19950cfbaf96b
2021-02-26 14:32:36 -05:00
Richard Henderson 5bd98feed9 target/arm: Split out gen_gvec_ool_zzz
Backports e645d1a17a359156c6047006d760ca176d493edb
2021-02-26 14:29:48 -05:00
Richard Henderson aa3819c396 target/arm: Split out gen_gvec_ool_zzp
Model after gen_gvec_fn_zzz et al.

Backports 96a461f7c12587d3a64a71e4d90cda5c09ca3eb4
2021-02-26 14:26:33 -05:00
Lioncash 2da89a626c target/arm: Merge helper_sve_clr_* and helper_sve_movz_* 2021-02-26 14:23:06 -05:00
Richard Henderson 8eb3642d96 target/arm: Split out gen_gvec_ool_zzzp
Model after gen_gvec_fn_zzz et al.

Backports 36cbb7a8e7100864c488a1153cecba90b1c33a4c
2021-02-26 14:14:13 -05:00
Richard Henderson 9b3671e9ad target/arm: Use tcg_gen_gvec_bitsel for trans_SEL_pppp
The gvec operation was added after the initial implementation
of the SEL instruction and was missed in the conversion.

Backports d4bc623254b55e2f9613c9450216fa7e50c03929
2021-02-26 14:12:25 -05:00
Richard Henderson c8c247410f target/arm: Clean up 4-operand predicate expansion
Move the check for !S into do_pppp_flags, which allows to merge in
do_vecop4_p. Split out gen_gvec_fn_ppp without sve_access_check,
to mirror gen_gvec_fn_zzz.

Backport dd81a8d7cf5c90963603806e58a217bbe759f75e
2021-02-26 14:07:14 -05:00
Richard Henderson 7bef6489a8 target/arm: Merge do_vector2_p into do_mov_p
This is the only user of the function

Backports d0b2df5a01eeccbac71d4d883158b91e7f9a6a29
2021-02-26 13:59:00 -05:00
Richard Henderson f329d428f3 target/arm: Rearrange {sve,fp}_check_access assert
We want to ensure that access is checked by the time we ask
for a specific fp/vector register. We want to ensure that
we do not emit two lots of code to raise an exception.

But sometimes it's difficult to cleanly organize the code
such that we never pass through sve_check_access exactly once.
Allow multiple calls so long as the result is true, that is,
no exception to be raised.

Backports 8a40fe5f1bf3837ae3f9961efe1d51e7214f2664
2021-02-26 13:56:27 -05:00
Richard Henderson 64822511dd target/arm: Split out gen_gvec_fn_zzz, do_zzz_fn
Model gen_gvec_fn_zzz on gen_gvec_fn3 in translate-a64.c, but
indicating which kind of register and in which order.

Model do_zzz_fn on the other do_foo functions that take an
argument set and verify sve enabled.

Backports 28c4da31be6a5e501b60b77bac17652dd3211378
2021-02-26 13:53:10 -05:00
Richard Henderson 3146cbb64e target/arm: Split out gen_gvec_fn_zz
Model the new function on gen_gvec_fn2 in translate-a64.c, but
indicating which kind of register and in which order. Since there
is only one user of do_vector2_z, fold it into do_mov_z

Backports f7d79c41fa4bd0f0d27dcd14babab8575fbed39f
2021-02-26 13:50:05 -05:00
Richard Henderson 234a22803d qemu/int128: Add int128_lshift
Add left-shift to match the existing right-shift.

Backports 5be4dd043f5beb5e7587d1ef8dd4e3716ec05639
2021-02-26 13:45:44 -05:00
Richard Henderson 6f341e0199 target/arm: Fill in the WnR syndrome bit in mte_check_fail
According to AArch64.TagCheckFault, none of the other ISS values are
provided, so we do not need to go so far as merge_syn_data_abort.
But we were missing the WnR bit.

Backports commit 9a4670be7f0734d27bf4058db3becf83cd0cc9d5 from qemu
2021-02-26 12:26:15 -05:00
Richard Henderson 6969435fb8 target/arm: Pass the entire mte descriptor to mte_check_fail
We need more information than just the mmu_idx in order
to create the proper exception syndrome. Only change the
function signature so far.

Backports dbf8c32178291169e111a6a9fd7ae17af4a3039d
2021-02-26 12:19:51 -05:00
Philippe Mathieu-Daudé d4c59cce4e target/arm: Clarify HCR_EL2 ARMCPRegInfo type
In commit ce4afed839 ("target/arm: Implement AArch32 HCR and HCR2")
the HCR_EL2 register has been changed from type NO_RAW (no underlying
state and does not support raw access for state saving/loading) to
type CONST (TCG can assume the value to be constant), removing the
read/write accessors.
We forgot to remove the previous type ARM_CP_NO_RAW. This is not
really a problem since the field is overwritten. However it makes
code review confuse, so remove it.

Backports 0e5aac18bc31dbdfab51f9784240d0c31a4c5579
2021-02-26 12:18:15 -05:00
Max Filippov d9e561ab2a softfloat: add xtensa specialization for pickNaNMulAdd
pickNaNMulAdd logic on Xtensa is to apply pickNaN to the inputs of the
expression (a * b) + c. However if default NaN is produces as a result
of (a * b) calculation it is not considered when c is NaN.
So with two pickNaN variants there must be two pickNaNMulAdd variants.
In addition the invalid flag is always set when (a * b) produces NaN.

Backports commit fbcc38e4cb1b539b8615ec9b0adc285351d77628 from qemu
2021-02-26 12:16:51 -05:00
Max Filippov fee4c62fe4 softfloat: pass float_status pointer to pickNaN
Pass float_status structure pointer to the pickNaN so that
machine-specific settings are available to NaN selection code.
Add use_first_nan property to float_status and use it in Xtensa-specific
pickNaN.

Backports commit 913602e3ffe6bf50b869a14028a55cb267645ba3
2021-02-26 12:16:05 -05:00
Max Filippov db780eff66 softfloat: make NO_SIGNALING_NANS runtime property
target/xtensa, the only user of NO_SIGNALING_NANS macro has FPU
implementations with and without the corresponding property. With
NO_SIGNALING_NANS being a macro they cannot be a part of the same QEMU
executable.
Replace macro with new property in float_status to allow cores with
different FPU implementations coexist.

Backports cc43c6925113c5bc8f1a0205375931d2e4807c99
2021-02-26 12:11:40 -05:00
Peter Maydell 3e5aa58139 target/arm: Use correct FPST for VCMLA, VCADD on fp16
When we implemented the VCMLA and VCADD insns we put in the
code to handle fp16, but left it using the standard fp status
flags. Correct them to use FPST_STD_F16 for fp16 operations.

Bacports commit b34aa5129e9c3aff890b4f4bcc84962e94185629
2021-02-26 12:02:23 -05:00
Peter Maydell 61377ce01c target/arm: Implement FPST_STD_F16 fpstatus
Architecturally, Neon FP16 operations use the "standard FPSCR" like
all other Neon operations. However, this is defined in the Arm ARM
pseudocode as "a fixed value, except that FZ16 (and AHP) follow the
FPSCR bits". In QEMU, the softfloat float_status doesn't include
separate flush-to-zero for FP16 operations, so we must keep separate
fp_status for "Neon non-FP16" and "Neon fp16" operations, in the
same way we do already for the non-Neon "fp_status" vs "fp_status_f16".

Add the extra float_status field to the CPU state structure,
ensure it is correctly initialized and updated on FPSCR writes,
and make fpstatus_ptr(FPST_STD_F16) return a pointer to it.

Backports commit aaae563bc73de0598bbc09a102e68f27fafe704a
2021-02-26 12:00:25 -05:00