Commit graph

1441 commits

Author SHA1 Message Date
Philippe Mathieu-Daudé d4c59cce4e target/arm: Clarify HCR_EL2 ARMCPRegInfo type
In commit ce4afed839 ("target/arm: Implement AArch32 HCR and HCR2")
the HCR_EL2 register has been changed from type NO_RAW (no underlying
state and does not support raw access for state saving/loading) to
type CONST (TCG can assume the value to be constant), removing the
read/write accessors.
We forgot to remove the previous type ARM_CP_NO_RAW. This is not
really a problem since the field is overwritten. However it makes
code review confuse, so remove it.

Backports 0e5aac18bc31dbdfab51f9784240d0c31a4c5579
2021-02-26 12:18:15 -05:00
Peter Maydell 3e5aa58139 target/arm: Use correct FPST for VCMLA, VCADD on fp16
When we implemented the VCMLA and VCADD insns we put in the
code to handle fp16, but left it using the standard fp status
flags. Correct them to use FPST_STD_F16 for fp16 operations.

Bacports commit b34aa5129e9c3aff890b4f4bcc84962e94185629
2021-02-26 12:02:23 -05:00
Peter Maydell 61377ce01c target/arm: Implement FPST_STD_F16 fpstatus
Architecturally, Neon FP16 operations use the "standard FPSCR" like
all other Neon operations. However, this is defined in the Arm ARM
pseudocode as "a fixed value, except that FZ16 (and AHP) follow the
FPSCR bits". In QEMU, the softfloat float_status doesn't include
separate flush-to-zero for FP16 operations, so we must keep separate
fp_status for "Neon non-FP16" and "Neon fp16" operations, in the
same way we do already for the non-Neon "fp_status" vs "fp_status_f16".

Add the extra float_status field to the CPU state structure,
ensure it is correctly initialized and updated on FPSCR writes,
and make fpstatus_ptr(FPST_STD_F16) return a pointer to it.

Backports commit aaae563bc73de0598bbc09a102e68f27fafe704a
2021-02-26 12:00:25 -05:00
Peter Maydell b1b0a41507 target/arm: Make A32/T32 use new fpstatus_ptr() API
Make A32/T32 code use the new fpstatus_ptr() API:
get_fpstatus_ptr(0) -> fpstatus_ptr(FPST_FPCR)
get_fpstatus_ptr(1) -> fpstatus_ptr(FPST_STD)

Backports a84d1d1316726704edd2617b2c30c921d98a8137
2021-02-26 11:55:55 -05:00
Peter Maydell 79359e3a69 target/arm: Replace A64 get_fpstatus_ptr() with generic fpstatus_ptr()
We currently have two versions of get_fpstatus_ptr(), which both take
an effectively boolean argument:
* the one for A64 takes "bool is_f16" to distinguish fp16 from other ops
* the one for A32/T32 takes "int neon" to distinguish Neon from other ops

This is confusing, and to implement ARMv8.2-FP16 the A32/T32 one will
need to make a four-way distinction between "non-Neon, FP16",
"non-Neon, single/double", "Neon, FP16" and "Neon, single/double".
The A64 version will then be a strict subset of the A32/T32 version.

To clean this all up, we want to go to a single implementation which
takes an enum argument with values FPST_FPCR, FPST_STD,
FPST_FPCR_F16, and FPST_STD_F16. We rename the function to
fpstatus_ptr() so that unconverted code gets a compilation error
rather than silently passing the wrong thing to the new function.

This commit implements that new API, and converts A64 to use it:
get_fpstatus_ptr(false) -> fpstatus_ptr(FPST_FPCR)
get_fpstatus_ptr(true) -> fpstatus_ptr(FPST_FPCR_F16)

Backports commit cdfb22bb7326fee607d9553358856cca341dbc9a
2021-02-26 11:46:51 -05:00
Peter Maydell e9240f0f54 target/arm: Delete unused ARM_FEATURE_CRC
In commit 962fcbf2efe57231a9f5df we converted the uses of the
ARM_FEATURE_CRC bit to use the aa32_crc32 isar_feature test
instead. However we forgot to remove the now-unused definition
of the feature name in the enum. Delete it now.

Backports commit cf6303d262e31f4812dfeb654c6c6803e52000af
2021-02-26 11:24:40 -05:00
Peter Maydell e0000d1700 target/arm/translate.c: Delete/amend incorrect comments
In arm_tr_init_disas_context() we have a FIXME comment that suggests
"cpu_M0 can probably be the same as cpu_V0". This isn't in fact
possible: cpu_V0 is used as a temporary inside gen_iwmmxt_shift(),
and that function is called in various places where cpu_M0 contains a
live value (i.e. between gen_op_iwmmxt_movq_M0_wRn() and
gen_op_iwmmxt_movq_wRn_M0() calls). Remove the comment.

We also have a comment on the declarations of cpu_V0/V1/M0 which
claims they're "for efficiency". This isn't true with modern TCG, so
replace this comment with one which notes that they're only used with
the iwmmxt decode

Backports 8b4c9a50dc9531a729ae4b5941d287ad0422db48
2021-02-26 11:23:52 -05:00
Peter Maydell 0759bb8eaf target/arm: Delete unused VFP_DREG macros
As part of the Neon decodetree conversion we removed all
the uses of the VFP_DREG macros, but forgot to remove the
macro definitions. Do so now.

Backports e60527c5d501e5015a119a0388a27abeae4dac09
2021-02-26 11:22:01 -05:00
Peter Maydell 368323b03f target/arm: Remove ARCH macro
The ARCH() macro was used a lot in the legacy decoder, but
there are now just two uses of it left. Since a macro which
expands out to a goto is liable to be confusing when reading
code, replace the last two uses with a simple open-coded
qeuivalent.

Backports ce51c7f522ca488c795c3510413e338021141c96
2021-02-26 11:21:20 -05:00
Peter Maydell 5d9c0addcf target/arm: Convert T32 coprocessor insns to decodetree
Convert the T32 coprocessor instructions to decodetree.
As with the A32 conversion, this corrects an underdecoding
where we did not check that MRRC/MCRR [24:21] were 0b0010
and so treated some kinds of LDC/STC and MRRC/MCRR rather
than UNDEFing them.

Backports commit 4c498dcfd84281f20bd55072630027d1b3c115fd
2021-02-26 11:19:35 -05:00
Peter Maydell bdaaac68f5 target/arm: Do M-profile NOCP checks early and via decodetree
For M-profile CPUs, the architecture specifies that the NOCP
exception when a coprocessor is not present or disabled should cover
the entire wide range of coprocessor-space encodings, and should take
precedence over UNDEF exceptions. (This is the opposite of
A-profile, where checking for a disabled FPU has to happen last.)

Implement this with decodetree patterns that cover the specified
ranges of the encoding space. There are a few instructions (VLLDM,
VLSTM, and in v8.1 also VSCCLRM) which are in copro-space but must
not be NOCP'd: these must be handled also in the new m-nocp.decode so
they take precedence.

This is a minor behaviour change: for unallocated insn patterns in
the VFP area (cp=10,11) we will now NOCP rather than UNDEF when the
FPU is disabled.

As well as giving us the correct architectural behaviour for v8.1M
and the recommended behaviour for v8.0M, this refactoring also
removes the old NOCP handling from the remains of the 'legacy
decoder' in disas_thumb2_insn(), paving the way for cleaning that up.

Since we don't currently have a v8.1M feature bit or any v8.1M CPUs,
the minor changes to this logic that we'll need for v8.1M are marked
up with TODO comments.

Backports commit a3494d4671797c291c88bd414acb0aead15f7239 from qemu
2021-02-26 11:17:23 -05:00
Peter Maydell c675b73b1f target/arm: Tidy up disas_arm_insn()
The only thing left in the "legacy decoder" is the handling
of disas_xscale_insn(), and we can simplify the code.

Backports commit 8198c071bc55bee55ef4f104a5b125f541b51096
2021-02-26 10:59:09 -05:00
Peter Maydell fc4cc9d95f target/arm: Convert A32 coprocessor insns to decodetree
Convert the A32 coprocessor instructions to decodetree.

Note that this corrects an underdecoding: for the 64-bit access case
(MRRC/MCRR) we did not check that bits [24:21] were 0b0010, so we
would incorrectly treat LDC/STC as MRRC/MCRR rather than UNDEFing
them.

The decodetree versions of these insns assume the coprocessor
is in the range 0..7 or 14..15. This is architecturally sensible
(as per the comments) and OK in practice for QEMU because the only
uses of the ARMCPRegInfo infrastructure we have that aren't
for coprocessors 14 or 15 are the pxa2xx use of coprocessor 6.
We add an assertion to the define_one_arm_cp_reg_with_opaque()
function to catch any accidental future attempts to use it to
define coprocessor registers for invalid coprocessors.

Backports commit cd8be50e58f63413c033531d3273c0e44851684f from qemu
2021-02-26 10:57:00 -05:00
Peter Maydell ef0e23f1f9 target/arm: Separate decode from handling of coproc insns
As a prelude to making coproc insns use decodetree, split out the
part of disas_coproc_insn() which does instruction decoding from the
part which does the actual work, and make do_coproc_insn() handle the
UNDEF-on-bad-permissions and similar cases itself rather than
returning 1 to eventually percolate up to a callsite that calls
unallocated_encoding() for it.

Backports 19c23a9baafc91dd3881a7a4e9bf454e42d24e4e
2021-02-26 10:53:52 -05:00
Peter Maydell 2944a75b98 target/arm: Pull handling of XScale insns out of disas_coproc_insn()
At the moment we check for XScale/iwMMXt insns inside
disas_coproc_insn(): for CPUs with ARM_FEATURE_XSCALE all copro insns
with cp 0 or 1 are handled specially. This works, but is an odd
place for this check, because disas_coproc_insn() is called from both
the Arm and Thumb decoders but the XScale case never applies for
Thumb (all the XScale CPUs were ARMv5, which has only Thumb1, not
Thumb2 with the 32-bit coprocessor insn encodings). It also makes it
awkward to convert the real copro access insns to decodetree.

Move the identification of XScale out to its own function
which is only called from disas_arm_insn().

Backports commit 7b4f933db865391a90a3b4518bb2050a83f2a873 from qemu
2021-02-26 10:50:32 -05:00
Peter Maydell 0718459fb3 target/arm: Fix Rt/Rt2 in ESR_ELx for copro traps from AArch32 to 64
When a coprocessor instruction in an AArch32 guest traps to AArch32
Hyp mode, the syndrome register (HSR) includes Rt and Rt2 fields
which are simply copies of the Rt and Rt2 fields from the trapped
instruction. However, if the instruction is trapped from AArch32 to
an AArch64 higher exception level, the Rt and Rt2 fields in the
syndrome register (ESR_ELx) must be the AArch64 view of the register.
This makes a difference if the AArch32 guest was in a mode other than
User or System and it was using r13 or r14, or if it was in FIQ mode
and using r8-r14.

We don't know at translate time which AArch32 CPU mode we are in, so
we leave the values we generate in our prototype syndrome register
value at translate time as the raw Rt/Rt2 from the instruction, and
instead correct them to the AArch64 view when we find we need to take
an exception from AArch32 to AArch64 with one of these syndrome
values.

Fixes: https://bugs.launchpad.net/qemu/+bug/1879587

Backports commit a65dabf71a9f9b949d556b1b57fd72595df92398 from qemu
2021-02-25 23:50:18 -05:00
Peter Collingbourne 7de60dfa51 target/arm: Fix decode of LDRA[AB] instructions
These instructions use zero as the discriminator, not SP.

Backports commit d250bb19ced3b702c7c37731855f6876d0cc7995 from qemu
2021-02-25 23:47:25 -05:00
Kaige Li 3004cc1f97 target/arm: Avoid maybe-uninitialized warning with gcc 4.9
GCC version 4.9.4 isn't clever enough to figure out that all
execution paths in disas_ldst() that use 'fn' will have initialized
it first, and so it warns:

/home/LiKaige/qemu/target/arm/translate-a64.c: In function ‘disas_ldst’:
/home/LiKaige/qemu/target/arm/translate-a64.c:3392:5: error: ‘fn’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
fn(cpu_reg(s, rt), clean_addr, tcg_rs, get_mem_index(s),
^
/home/LiKaige/qemu/target/arm/translate-a64.c:3318:22: note: ‘fn’ was declared here
AtomicThreeOpFn *fn;
^

Make it happy by initializing the variable to NULL.

Backports commit 88a90e3de6ae99cbcfcc04c862c51f241fdf685f from qemu
2021-02-25 23:45:13 -05:00
Richard Henderson ce8282d9cd target/arm: Fix AddPAC error indication
The definition of top_bit used in this function is one higher
than that used in the Arm ARM psuedo-code, which put the error
indication at top_bit - 1 at the wrong place, which meant that
it wasn't visible to Auth.

Fixing the definition of top_bit requires more changes, because
its most common use is for the count of bits in top_bit:bot_bit,
which would then need to be computed as top_bit - bot_bit + 1.

For now, prefer the minimal fix to the error indication alone.

Fixes: 63ff0ca94cb

Backports commit 8796fe40dd30cd9ffd3c958906471715c923b341 from qemu
2021-02-25 23:44:28 -05:00
Lioncash a1e8e0adff target/arm: Fix bad rebase within do_mem_zpz 2021-02-25 23:43:16 -05:00
Richard Henderson 5e1316a92e target/arm: Always pass cacheattr in S1_ptw_translate
When we changed the interface of get_phys_addr_lpae to require
the cacheattr parameter, this spot was missed. The compiler is
unable to detect the use of NULL vs the nonnull attribute here.

Fixes: 7e98e21c098

Backports commit a6d6f37aed4b171d121cd4a9363fbb41e90dcb53 from qemu
2021-02-25 23:40:32 -05:00
Aaron Lindsay e532ce610e target/arm: Don't do raw writes for PMINTENCLR
Raw writes to this register when in KVM mode can cause interrupts to be
raised (even when the PMU is disabled). Because the underlying state is
already aliased to PMINTENSET (which already provides raw write
functions), we can safely disable raw accesses to PMINTENCLR entirely.

Backports commit 887c0f1544991f567543b7c214aa11ab0cea0a29 from qemu
2021-02-25 23:27:47 -05:00
Richard Henderson f403c1f54f target/arm: Fix mtedesc for do_mem_zpz
The mtedesc that was constructed was not actually passed in.
Found by Coverity (CID 1429996).

Backports commit cdecb3fc1eb182d90666348a47afe63c493686e7 from qemu
2021-02-25 23:25:54 -05:00
Richard Henderson 57c66389c2 target/arm: Fix temp double-free in sve ldr/str
The temp that gets assigned to clean_addr has been allocated with
new_tmp_a64, which means that it will be freed at the end of the
instruction. Freeing it earlier leads to assertion failure.

The loop creates a complication, in which we allocate a new local
temp, which does need freeing, and the final code path is shared
between the loop and non-loop.

Fix this complication by adding new_tmp_a64_local so that the new
local temp is freed at the end, and can be treated exactly like
the non-loop path.

Fixes: bba87d0a0f4

Backports commit 4b4dc9750a0aa0b9766bd755bf6512a84744ce8a from qemu
2021-02-25 23:10:37 -05:00
Richard Henderson 54e2107bdf target/arm: Enable MTE
We now implement all of the components of MTE, without actually
supporting any tagged memory. All MTE instructions will work,
trivially, so we can enable support.

Backports commit c7459633baa71d1781fde4a245d6ec9ce2f008cf from qemu
2021-02-25 23:00:27 -05:00
Richard Henderson a34fda25b0 target/arm: Add allocation tag storage for system mode
Look up the physical address for the given virtual address,
convert that to a tag physical address, and finally return
the host address that backs it.

Backports commit e4d5bf4fbd5abfc3727e711eda64a583cab4d637 from qemu
2021-02-25 22:58:56 -05:00
Richard Henderson 9b6c64f8f8 target/arm: Create tagged ram when MTE is enabled
Backports commit 8bce44a2f6beb388a3f157652b46e99929839a96 from qemu
2021-02-25 22:51:23 -05:00
Richard Henderson 2ea0b53c1a target/arm: Cache the Tagged bit for a page in MemTxAttrs
This "bit" is a particular value of the page's MemAttr.

Backports commit 337a03f07ff0f9e6295662f4094e03a045b60bdc from qemu
2021-02-25 22:48:04 -05:00
Richard Henderson 28cd096d67 target/arm: Always pass cacheattr to get_phys_addr
We need to check the memattr of a page in order to determine
whether it is Tagged for MTE. Between Stage1 and Stage2,
this becomes simpler if we always collect this data, instead
of occasionally being presented with NULL.

Use the nonnull attribute to allow the compiler to check that
all pointer arguments are non-null.

Backports commit 7e98e21c09871cddc20946c8f3f3595e93154ecb from qemu
2021-02-25 22:46:00 -05:00
Richard Henderson e2456a83a4 target/arm: Set PSTATE.TCO on exception entry
D1.10 specifies that exception handlers begin with tag checks overridden.

Backports commit 34669338bd9d66255fceaa84c314251ca49ca8d5 from qemu
2021-02-25 22:41:26 -05:00
Richard Henderson 35d0443056 target/arm: Implement data cache set allocation tags
This is DC GVA and DC GZVA, and the tag check for DC ZVA.

Backports commit eb821168db798302bd124a3b000cebc23bd0a395 from qemu
2021-02-25 22:40:08 -05:00
Richard Henderson 33f5bdabb1 target/arm: Complete TBI clearing for user-only for SVE
There are a number of paths by which the TBI is still intact
for user-only in the SVE helpers.

Because we currently always set TBI for user-only, we do not
need to pass down the actual TBI setting from above, and we
can remove the top byte in the inner-most primitives, so that
none are forgotten. Moreover, this keeps the "dirty" pointer
around at the higher levels, where we need it for any MTE checking.

Since the normal case, especially for user-only, goes through
RAM, this clearing merely adds two insns per page lookup, which
will be completely in the noise.

Backports commit c4af8ba19b9d22aac79cab679a20b159af9d6809 from qemu
2021-02-25 22:37:12 -05:00
Richard Henderson 732efce958 target/arm: Add mte helpers for sve scatter/gather memory ops
Because the elements are non-sequential, we cannot eliminate many
tests straight away like we can for sequential operations. But
we often have the PTE details handy, so we can test for Tagged.

Backports commit d28d12f008ee44dc2cc2ee5d8f673be9febc951e from qemu
2021-02-25 22:34:24 -05:00
Richard Henderson 5698b7badb target/arm: Handle TBI for sve scalar + int memory ops
We still need to handle tbi for user-only when mte is inactive.

Backports commit 9473d0ecafcffc8b258892b1f9f18e037bdba958 from qemu
2021-02-25 22:17:46 -05:00
Richard Henderson 586235d02d target/arm: Add mte helpers for sve scalar + int ff/nf loads
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.

Backports commit aa13f7c3c378fa41366b9fcd6c29af1c3d81126a from qemu
2021-02-25 22:09:17 -05:00
Richard Henderson cb31d54b18 target/arm: Add mte helpers for sve scalar + int stores
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.

Backports commit 71b9f3948c75bb97641a3c8c7de96d1cb47cdc07 from qemu
2021-02-25 21:53:55 -05:00
Richard Henderson 670b25c5fa target/arm: Add mte helpers for sve scalar + int loads
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.

Backports commit 206adacfb8d35e671e3619591608c475aa046b63 from qemu
2021-02-25 21:45:32 -05:00
Richard Henderson 6a78133659 target/arm: Remove sve_memopidx
None of the sve helpers use TCGMemOpIdx any longer, so we can
stop passing it.

Backports commit ba080b8682fc6bde7f2d9dedddb519d63cbe138f from qemu
2021-02-25 21:33:44 -05:00
Richard Henderson 1f306230d4 target/arm: Reuse sve_probe_page for gather loads
Backports commit 10a85e2c8ab6e004e7f3f1dcfea8cb0bf58fb9fb from qemu
2021-02-25 21:30:13 -05:00
Richard Henderson 585da952ec target/arm: Reuse sve_probe_page for scatter stores
Backports commit 88a660a48ef513ce9875b595e19b2a820b3f3fca from qemu
2021-02-25 21:27:14 -05:00
Richard Henderson 3eee880c2a target/arm: Reuse sve_probe_page for gather first-fault loads
This avoids the need for a separate set of helpers to implement
no-fault semantics, and will enable MTE in the future.

Backports commit 50de9b78cec06e6d16e92a114a505779359ca532 from qemu
2021-02-25 21:22:16 -05:00
Richard Henderson b1e31f3bf3 target/arm: Use SVEContLdSt for contiguous stores
Follow the model set up for contiguous loads. This handles
watchpoints correctly for contiguous stores, recognizing the
exception before any changes to memory.

Backports commit 0fa476c1bb37a70df7eeff1e5bfb4791feb37e0e from qemu
2021-02-25 21:15:14 -05:00
Richard Henderson 3591c2f548 target/arm: Update contiguous first-fault and no-fault loads
With sve_cont_ldst_pages, the differences between first-fault and no-fault
are minimal, so unify the routines. With cpu_probe_watchpoint, we are able
to make progress through pages with TLB_WATCHPOINT set when the watchpoint
does not actually fire.

Backports commit c647673ce4d72a8789703c62a7f3cbc732cb1ea8 from qemu
2021-02-25 21:06:14 -05:00
Richard Henderson 6c9304448e target/arm: Use SVEContLdSt for multi-register contiguous loads
Backports commit 5c9b8458a0b3008d24d84b67e1c9b6d5f39f4d66 from qemu
2021-02-25 20:50:22 -05:00
Richard Henderson 3979c8f73e target/arm: Handle watchpoints in sve_ld1_r
Handle all of the watchpoints for active elements all at once,
before we've modified the vector register. This removes the
TLB_WATCHPOINT bit from page[].flags, which means that we can
use the normal fast path via RAM.

Backports commit 4bcc3f0ff8e5ae2b17b5aab9aa613ff1b8025896 from qemu
2021-02-25 20:44:13 -05:00
Richard Henderson 0e5aa37c9a target/arm: Use SVEContLdSt in sve_ld1_r
First use of the new helper functions, so we can remove the
unused markup. No longer need a scratch for user-only, as
we completely probe the page set before reading; system mode
still requires a scratch for MMIO.

Backports commit b854fd06a868e0308bcfe05ad0a71210705814c7 from qemu
2021-02-25 20:41:53 -05:00
Richard Henderson d363c3d0ba target/arm: Adjust interface of sve_ld1_host_fn
The current interface includes a loop; change it to load a
single element. We will then be able to use the function
for ld{2,3,4} where individual vector elements are not adjacent.

Replace each call with the simplest possible loop over active
elements.

Backports commit cf4a49b71b1712142d7122025a8ca7ea5b59d73f from qemu
2021-02-25 20:34:18 -05:00
Richard Henderson 94b0876f15 target/arm: Add sve infrastructure for page lookup
For contiguous predicated memory operations, we want to
minimize the number of tlb lookups performed. We have
open-coded this for sve_ld1_r, but for correctness with
MTE we will need this for all of the memory operations.

Create a structure that holds the bounds of active elements,
and metadata for two pages. Add routines to find those
active elements, lookup the pages, and run watchpoints
for those pages.

Temporarily mark the functions unused to avoid Werror.

Backports commit b4cd95d2f4c7197b844f51b29871d888063ea3e7 from qemu
2021-02-25 20:28:23 -05:00
Richard Henderson f430a399d4 target/arm: Drop manual handling of set/clear_helper_retaddr
Since we converted back to cpu_*_data_ra, we do not need to
do this ourselves.

Backports commit f32e2ab65f3a0fc03d58936709e5a565c4b0db50 from qemu
2021-02-25 20:20:29 -05:00
Richard Henderson 2e03f74a53 target/arm: Use cpu_*_data_ra for sve_ldst_tlb_fn
Use the "normal" memory access functions, rather than the
softmmu internal helper functions directly.

Since fb901c9, cpu_mem_index is now a simple extract
from env->hflags and not a large computation.  Which means
that it's now more work to pass around this value than it
is to recompute it.

This only adjusts the primitives, and does not clean up
all of the uses within sve_helper.c.
2021-02-25 20:16:38 -05:00