Commit graph

6980 commits

Author SHA1 Message Date
Frank Chang d97454eb63 softfloat: Add fp16 and uint8/int8 conversion functions
Backports 0d93d8ec632154dea2627a9e989972ee09721187
2021-02-26 15:11:57 -05:00
Kito Cheng 76d123efee softfloat: Implement the full set of comparisons for float16
Backports dd205025a048ef6f53ff51eb86ddc58e7a82a771
2021-02-26 15:04:12 -05:00
Lioncash f5a21abc0b target/arm: Convert sq{, r}dmulh to gvec for aa64 advsimd 2021-02-26 15:01:44 -05:00
Richard Henderson aa97b6b755 target/arm: Convert integer multiply-add (indexed) to gvec for aa64 advsimd
Backports 3607440c4df6498585a570cfc1041e4972b41b56
2021-02-26 14:51:17 -05:00
Richard Henderson 732674b868 target/arm: Convert integer multiply (indexed) to gvec for aa64 advsimd
Backports 2e5a265e6a9e7169c4a3e87db261b2fa92582590
2021-02-26 14:46:29 -05:00
Richard Henderson 80325ac866 target/arm: Generalize inl_qrdmlah_* helper functions
Unify add/sub helpers and add a parameter for rounding.
This will allow saturating non-rounding to reuse this code.

Backports d21798856b227a20a0a41640236af445f4f4aeb0
2021-02-26 14:41:32 -05:00
Richard Henderson 1bedcfbda3 target/arm: Tidy SVE tszimm shift formats
Rather than require the user to fill in the immediate (shl or shr),
create full formats that include the immediate.
2021-02-26 14:35:53 -05:00
Richard Henderson da41a23a1b target/arm: Split out gen_gvec_ool_zz
Backports 40e32e5a8a379baf6e0d49d83cf19950cfbaf96b
2021-02-26 14:32:36 -05:00
Richard Henderson 5bd98feed9 target/arm: Split out gen_gvec_ool_zzz
Backports e645d1a17a359156c6047006d760ca176d493edb
2021-02-26 14:29:48 -05:00
Richard Henderson aa3819c396 target/arm: Split out gen_gvec_ool_zzp
Model after gen_gvec_fn_zzz et al.

Backports 96a461f7c12587d3a64a71e4d90cda5c09ca3eb4
2021-02-26 14:26:33 -05:00
Lioncash 2da89a626c target/arm: Merge helper_sve_clr_* and helper_sve_movz_* 2021-02-26 14:23:06 -05:00
Richard Henderson 8eb3642d96 target/arm: Split out gen_gvec_ool_zzzp
Model after gen_gvec_fn_zzz et al.

Backports 36cbb7a8e7100864c488a1153cecba90b1c33a4c
2021-02-26 14:14:13 -05:00
Richard Henderson 9b3671e9ad target/arm: Use tcg_gen_gvec_bitsel for trans_SEL_pppp
The gvec operation was added after the initial implementation
of the SEL instruction and was missed in the conversion.

Backports d4bc623254b55e2f9613c9450216fa7e50c03929
2021-02-26 14:12:25 -05:00
Richard Henderson c8c247410f target/arm: Clean up 4-operand predicate expansion
Move the check for !S into do_pppp_flags, which allows to merge in
do_vecop4_p. Split out gen_gvec_fn_ppp without sve_access_check,
to mirror gen_gvec_fn_zzz.

Backport dd81a8d7cf5c90963603806e58a217bbe759f75e
2021-02-26 14:07:14 -05:00
Richard Henderson 7bef6489a8 target/arm: Merge do_vector2_p into do_mov_p
This is the only user of the function

Backports d0b2df5a01eeccbac71d4d883158b91e7f9a6a29
2021-02-26 13:59:00 -05:00
Richard Henderson f329d428f3 target/arm: Rearrange {sve,fp}_check_access assert
We want to ensure that access is checked by the time we ask
for a specific fp/vector register. We want to ensure that
we do not emit two lots of code to raise an exception.

But sometimes it's difficult to cleanly organize the code
such that we never pass through sve_check_access exactly once.
Allow multiple calls so long as the result is true, that is,
no exception to be raised.

Backports 8a40fe5f1bf3837ae3f9961efe1d51e7214f2664
2021-02-26 13:56:27 -05:00
Richard Henderson 64822511dd target/arm: Split out gen_gvec_fn_zzz, do_zzz_fn
Model gen_gvec_fn_zzz on gen_gvec_fn3 in translate-a64.c, but
indicating which kind of register and in which order.

Model do_zzz_fn on the other do_foo functions that take an
argument set and verify sve enabled.

Backports 28c4da31be6a5e501b60b77bac17652dd3211378
2021-02-26 13:53:10 -05:00
Richard Henderson 3146cbb64e target/arm: Split out gen_gvec_fn_zz
Model the new function on gen_gvec_fn2 in translate-a64.c, but
indicating which kind of register and in which order. Since there
is only one user of do_vector2_z, fold it into do_mov_z

Backports f7d79c41fa4bd0f0d27dcd14babab8575fbed39f
2021-02-26 13:50:05 -05:00
Richard Henderson 234a22803d qemu/int128: Add int128_lshift
Add left-shift to match the existing right-shift.

Backports 5be4dd043f5beb5e7587d1ef8dd4e3716ec05639
2021-02-26 13:45:44 -05:00
Richard Henderson 6f341e0199 target/arm: Fill in the WnR syndrome bit in mte_check_fail
According to AArch64.TagCheckFault, none of the other ISS values are
provided, so we do not need to go so far as merge_syn_data_abort.
But we were missing the WnR bit.

Backports commit 9a4670be7f0734d27bf4058db3becf83cd0cc9d5 from qemu
2021-02-26 12:26:15 -05:00
Richard Henderson 6969435fb8 target/arm: Pass the entire mte descriptor to mte_check_fail
We need more information than just the mmu_idx in order
to create the proper exception syndrome. Only change the
function signature so far.

Backports dbf8c32178291169e111a6a9fd7ae17af4a3039d
2021-02-26 12:19:51 -05:00
Philippe Mathieu-Daudé d4c59cce4e target/arm: Clarify HCR_EL2 ARMCPRegInfo type
In commit ce4afed839 ("target/arm: Implement AArch32 HCR and HCR2")
the HCR_EL2 register has been changed from type NO_RAW (no underlying
state and does not support raw access for state saving/loading) to
type CONST (TCG can assume the value to be constant), removing the
read/write accessors.
We forgot to remove the previous type ARM_CP_NO_RAW. This is not
really a problem since the field is overwritten. However it makes
code review confuse, so remove it.

Backports 0e5aac18bc31dbdfab51f9784240d0c31a4c5579
2021-02-26 12:18:15 -05:00
Max Filippov d9e561ab2a softfloat: add xtensa specialization for pickNaNMulAdd
pickNaNMulAdd logic on Xtensa is to apply pickNaN to the inputs of the
expression (a * b) + c. However if default NaN is produces as a result
of (a * b) calculation it is not considered when c is NaN.
So with two pickNaN variants there must be two pickNaNMulAdd variants.
In addition the invalid flag is always set when (a * b) produces NaN.

Backports commit fbcc38e4cb1b539b8615ec9b0adc285351d77628 from qemu
2021-02-26 12:16:51 -05:00
Max Filippov fee4c62fe4 softfloat: pass float_status pointer to pickNaN
Pass float_status structure pointer to the pickNaN so that
machine-specific settings are available to NaN selection code.
Add use_first_nan property to float_status and use it in Xtensa-specific
pickNaN.

Backports commit 913602e3ffe6bf50b869a14028a55cb267645ba3
2021-02-26 12:16:05 -05:00
Max Filippov db780eff66 softfloat: make NO_SIGNALING_NANS runtime property
target/xtensa, the only user of NO_SIGNALING_NANS macro has FPU
implementations with and without the corresponding property. With
NO_SIGNALING_NANS being a macro they cannot be a part of the same QEMU
executable.
Replace macro with new property in float_status to allow cores with
different FPU implementations coexist.

Backports cc43c6925113c5bc8f1a0205375931d2e4807c99
2021-02-26 12:11:40 -05:00
Peter Maydell 3e5aa58139 target/arm: Use correct FPST for VCMLA, VCADD on fp16
When we implemented the VCMLA and VCADD insns we put in the
code to handle fp16, but left it using the standard fp status
flags. Correct them to use FPST_STD_F16 for fp16 operations.

Bacports commit b34aa5129e9c3aff890b4f4bcc84962e94185629
2021-02-26 12:02:23 -05:00
Peter Maydell 61377ce01c target/arm: Implement FPST_STD_F16 fpstatus
Architecturally, Neon FP16 operations use the "standard FPSCR" like
all other Neon operations. However, this is defined in the Arm ARM
pseudocode as "a fixed value, except that FZ16 (and AHP) follow the
FPSCR bits". In QEMU, the softfloat float_status doesn't include
separate flush-to-zero for FP16 operations, so we must keep separate
fp_status for "Neon non-FP16" and "Neon fp16" operations, in the
same way we do already for the non-Neon "fp_status" vs "fp_status_f16".

Add the extra float_status field to the CPU state structure,
ensure it is correctly initialized and updated on FPSCR writes,
and make fpstatus_ptr(FPST_STD_F16) return a pointer to it.

Backports commit aaae563bc73de0598bbc09a102e68f27fafe704a
2021-02-26 12:00:25 -05:00
Peter Maydell b1b0a41507 target/arm: Make A32/T32 use new fpstatus_ptr() API
Make A32/T32 code use the new fpstatus_ptr() API:
get_fpstatus_ptr(0) -> fpstatus_ptr(FPST_FPCR)
get_fpstatus_ptr(1) -> fpstatus_ptr(FPST_STD)

Backports a84d1d1316726704edd2617b2c30c921d98a8137
2021-02-26 11:55:55 -05:00
Peter Maydell 79359e3a69 target/arm: Replace A64 get_fpstatus_ptr() with generic fpstatus_ptr()
We currently have two versions of get_fpstatus_ptr(), which both take
an effectively boolean argument:
* the one for A64 takes "bool is_f16" to distinguish fp16 from other ops
* the one for A32/T32 takes "int neon" to distinguish Neon from other ops

This is confusing, and to implement ARMv8.2-FP16 the A32/T32 one will
need to make a four-way distinction between "non-Neon, FP16",
"non-Neon, single/double", "Neon, FP16" and "Neon, single/double".
The A64 version will then be a strict subset of the A32/T32 version.

To clean this all up, we want to go to a single implementation which
takes an enum argument with values FPST_FPCR, FPST_STD,
FPST_FPCR_F16, and FPST_STD_F16. We rename the function to
fpstatus_ptr() so that unconverted code gets a compilation error
rather than silently passing the wrong thing to the new function.

This commit implements that new API, and converts A64 to use it:
get_fpstatus_ptr(false) -> fpstatus_ptr(FPST_FPCR)
get_fpstatus_ptr(true) -> fpstatus_ptr(FPST_FPCR_F16)

Backports commit cdfb22bb7326fee607d9553358856cca341dbc9a
2021-02-26 11:46:51 -05:00
Peter Maydell e9240f0f54 target/arm: Delete unused ARM_FEATURE_CRC
In commit 962fcbf2efe57231a9f5df we converted the uses of the
ARM_FEATURE_CRC bit to use the aa32_crc32 isar_feature test
instead. However we forgot to remove the now-unused definition
of the feature name in the enum. Delete it now.

Backports commit cf6303d262e31f4812dfeb654c6c6803e52000af
2021-02-26 11:24:40 -05:00
Peter Maydell e0000d1700 target/arm/translate.c: Delete/amend incorrect comments
In arm_tr_init_disas_context() we have a FIXME comment that suggests
"cpu_M0 can probably be the same as cpu_V0". This isn't in fact
possible: cpu_V0 is used as a temporary inside gen_iwmmxt_shift(),
and that function is called in various places where cpu_M0 contains a
live value (i.e. between gen_op_iwmmxt_movq_M0_wRn() and
gen_op_iwmmxt_movq_wRn_M0() calls). Remove the comment.

We also have a comment on the declarations of cpu_V0/V1/M0 which
claims they're "for efficiency". This isn't true with modern TCG, so
replace this comment with one which notes that they're only used with
the iwmmxt decode

Backports 8b4c9a50dc9531a729ae4b5941d287ad0422db48
2021-02-26 11:23:52 -05:00
Peter Maydell 0759bb8eaf target/arm: Delete unused VFP_DREG macros
As part of the Neon decodetree conversion we removed all
the uses of the VFP_DREG macros, but forgot to remove the
macro definitions. Do so now.

Backports e60527c5d501e5015a119a0388a27abeae4dac09
2021-02-26 11:22:01 -05:00
Peter Maydell 368323b03f target/arm: Remove ARCH macro
The ARCH() macro was used a lot in the legacy decoder, but
there are now just two uses of it left. Since a macro which
expands out to a goto is liable to be confusing when reading
code, replace the last two uses with a simple open-coded
qeuivalent.

Backports ce51c7f522ca488c795c3510413e338021141c96
2021-02-26 11:21:20 -05:00
Peter Maydell 5d9c0addcf target/arm: Convert T32 coprocessor insns to decodetree
Convert the T32 coprocessor instructions to decodetree.
As with the A32 conversion, this corrects an underdecoding
where we did not check that MRRC/MCRR [24:21] were 0b0010
and so treated some kinds of LDC/STC and MRRC/MCRR rather
than UNDEFing them.

Backports commit 4c498dcfd84281f20bd55072630027d1b3c115fd
2021-02-26 11:19:35 -05:00
Peter Maydell bdaaac68f5 target/arm: Do M-profile NOCP checks early and via decodetree
For M-profile CPUs, the architecture specifies that the NOCP
exception when a coprocessor is not present or disabled should cover
the entire wide range of coprocessor-space encodings, and should take
precedence over UNDEF exceptions. (This is the opposite of
A-profile, where checking for a disabled FPU has to happen last.)

Implement this with decodetree patterns that cover the specified
ranges of the encoding space. There are a few instructions (VLLDM,
VLSTM, and in v8.1 also VSCCLRM) which are in copro-space but must
not be NOCP'd: these must be handled also in the new m-nocp.decode so
they take precedence.

This is a minor behaviour change: for unallocated insn patterns in
the VFP area (cp=10,11) we will now NOCP rather than UNDEF when the
FPU is disabled.

As well as giving us the correct architectural behaviour for v8.1M
and the recommended behaviour for v8.0M, this refactoring also
removes the old NOCP handling from the remains of the 'legacy
decoder' in disas_thumb2_insn(), paving the way for cleaning that up.

Since we don't currently have a v8.1M feature bit or any v8.1M CPUs,
the minor changes to this logic that we'll need for v8.1M are marked
up with TODO comments.

Backports commit a3494d4671797c291c88bd414acb0aead15f7239 from qemu
2021-02-26 11:17:23 -05:00
Peter Maydell c675b73b1f target/arm: Tidy up disas_arm_insn()
The only thing left in the "legacy decoder" is the handling
of disas_xscale_insn(), and we can simplify the code.

Backports commit 8198c071bc55bee55ef4f104a5b125f541b51096
2021-02-26 10:59:09 -05:00
Peter Maydell fc4cc9d95f target/arm: Convert A32 coprocessor insns to decodetree
Convert the A32 coprocessor instructions to decodetree.

Note that this corrects an underdecoding: for the 64-bit access case
(MRRC/MCRR) we did not check that bits [24:21] were 0b0010, so we
would incorrectly treat LDC/STC as MRRC/MCRR rather than UNDEFing
them.

The decodetree versions of these insns assume the coprocessor
is in the range 0..7 or 14..15. This is architecturally sensible
(as per the comments) and OK in practice for QEMU because the only
uses of the ARMCPRegInfo infrastructure we have that aren't
for coprocessors 14 or 15 are the pxa2xx use of coprocessor 6.
We add an assertion to the define_one_arm_cp_reg_with_opaque()
function to catch any accidental future attempts to use it to
define coprocessor registers for invalid coprocessors.

Backports commit cd8be50e58f63413c033531d3273c0e44851684f from qemu
2021-02-26 10:57:00 -05:00
Peter Maydell ef0e23f1f9 target/arm: Separate decode from handling of coproc insns
As a prelude to making coproc insns use decodetree, split out the
part of disas_coproc_insn() which does instruction decoding from the
part which does the actual work, and make do_coproc_insn() handle the
UNDEF-on-bad-permissions and similar cases itself rather than
returning 1 to eventually percolate up to a callsite that calls
unallocated_encoding() for it.

Backports 19c23a9baafc91dd3881a7a4e9bf454e42d24e4e
2021-02-26 10:53:52 -05:00
Peter Maydell 2944a75b98 target/arm: Pull handling of XScale insns out of disas_coproc_insn()
At the moment we check for XScale/iwMMXt insns inside
disas_coproc_insn(): for CPUs with ARM_FEATURE_XSCALE all copro insns
with cp 0 or 1 are handled specially. This works, but is an odd
place for this check, because disas_coproc_insn() is called from both
the Arm and Thumb decoders but the XScale case never applies for
Thumb (all the XScale CPUs were ARMv5, which has only Thumb1, not
Thumb2 with the 32-bit coprocessor insn encodings). It also makes it
awkward to convert the real copro access insns to decodetree.

Move the identification of XScale out to its own function
which is only called from disas_arm_insn().

Backports commit 7b4f933db865391a90a3b4518bb2050a83f2a873 from qemu
2021-02-26 10:50:32 -05:00
LIU Zhiwei 9b7f4b72fc target/riscv: vector single-width integer multiply instructions 2021-02-26 10:46:26 -05:00
LIU Zhiwei ab81642440 target/riscv: vector integer min/max instructions
558fa7797c919c4f21ac10980f3ed28160d6d3cb
2021-02-26 10:43:13 -05:00
LIU Zhiwei 965af9986a target/riscv: vector integer comparison instructions
1366fc79be04fa56a0e3f078ba4f26c27ac67e89
2021-02-26 10:40:33 -05:00
LIU Zhiwei 244793c4e8 target/riscv: vector single-width bit shift instructions
Backports 3277d955d21d8943d80062b4cfd8547f831dbd51
2021-02-26 10:37:09 -05:00
LIU Zhiwei 56c0e253c2 target/riscv: vector bitwise logical instructions
Backports d3842924cf93d104f691c5ea9090d6700ccef281
2021-02-26 10:30:33 -05:00
LIU Zhiwei 05153c6d7c target/riscv: vector integer add-with-carry / subtract-with-borrow instructions
3a6f8f68ad2f4a22d9ae8287f336b5dcc80b6448
2021-02-26 10:19:48 -05:00
LIU Zhiwei b9814de4c3 target/riscv: vector widening integer add and subtract
Backports 8fcdf77630290591a6068c2d82ca2935338c3b0c
2021-02-26 10:05:43 -05:00
LIU Zhiwei f564388e89 target/riscv: vector single-width integer add and subtract
Backports 43740e3a3b3bb66456103684e622ba4e9baae297
2021-02-26 09:58:31 -05:00
LIU Zhiwei 7d0d7338c2 target/riscv: add vector amo operations
Vector AMOs operate as if aq and rl bits were zero on each element
with regard to ordering relative to other instructions in the same hart.
Vector AMOs provide no ordering guarantee between element operations
in the same vector AMO instruction

Backports 268fcca66bde62257960ec8d859de374315a5e3d
2021-02-26 09:47:32 -05:00
LIU Zhiwei 152934bade target/riscv: add fault-only-first unit stride load
The unit-stride fault-only-fault load instructions are used to
vectorize loops with data-dependent exit conditions(while loops).
These instructions execute as a regular load except that they
will only take a trap on element 0.

Backports commit 022b4ecf775ffeff522eaea4f0d94edcfe00a0a9 from qemu
2021-02-26 09:28:19 -05:00
LIU Zhiwei 887c29bc79 target/riscv: add vector index load and store instructions
Vector indexed operations add the contents of each element of the
vector offset operand specified by vs2 to the base effective address
to give the effective address of each element.

Backports f732560e3551c0823cee52efba993fbb8f689a36
2021-02-26 03:00:45 -05:00