The temp that gets assigned to clean_addr has been allocated with
new_tmp_a64, which means that it will be freed at the end of the
instruction. Freeing it earlier leads to assertion failure.
The loop creates a complication, in which we allocate a new local
temp, which does need freeing, and the final code path is shared
between the loop and non-loop.
Fix this complication by adding new_tmp_a64_local so that the new
local temp is freed at the end, and can be treated exactly like
the non-loop path.
Fixes: bba87d0a0f4
Backports commit 4b4dc9750a0aa0b9766bd755bf6512a84744ce8a from qemu
We now implement all of the components of MTE, without actually
supporting any tagged memory. All MTE instructions will work,
trivially, so we can enable support.
Backports commit c7459633baa71d1781fde4a245d6ec9ce2f008cf from qemu
Look up the physical address for the given virtual address,
convert that to a tag physical address, and finally return
the host address that backs it.
Backports commit e4d5bf4fbd5abfc3727e711eda64a583cab4d637 from qemu
We need to check the memattr of a page in order to determine
whether it is Tagged for MTE. Between Stage1 and Stage2,
this becomes simpler if we always collect this data, instead
of occasionally being presented with NULL.
Use the nonnull attribute to allow the compiler to check that
all pointer arguments are non-null.
Backports commit 7e98e21c09871cddc20946c8f3f3595e93154ecb from qemu
There are a number of paths by which the TBI is still intact
for user-only in the SVE helpers.
Because we currently always set TBI for user-only, we do not
need to pass down the actual TBI setting from above, and we
can remove the top byte in the inner-most primitives, so that
none are forgotten. Moreover, this keeps the "dirty" pointer
around at the higher levels, where we need it for any MTE checking.
Since the normal case, especially for user-only, goes through
RAM, this clearing merely adds two insns per page lookup, which
will be completely in the noise.
Backports commit c4af8ba19b9d22aac79cab679a20b159af9d6809 from qemu
Because the elements are non-sequential, we cannot eliminate many
tests straight away like we can for sequential operations. But
we often have the PTE details handy, so we can test for Tagged.
Backports commit d28d12f008ee44dc2cc2ee5d8f673be9febc951e from qemu
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.
Backports commit aa13f7c3c378fa41366b9fcd6c29af1c3d81126a from qemu
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.
Backports commit 71b9f3948c75bb97641a3c8c7de96d1cb47cdc07 from qemu
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.
Backports commit 206adacfb8d35e671e3619591608c475aa046b63 from qemu
This avoids the need for a separate set of helpers to implement
no-fault semantics, and will enable MTE in the future.
Backports commit 50de9b78cec06e6d16e92a114a505779359ca532 from qemu
Follow the model set up for contiguous loads. This handles
watchpoints correctly for contiguous stores, recognizing the
exception before any changes to memory.
Backports commit 0fa476c1bb37a70df7eeff1e5bfb4791feb37e0e from qemu
With sve_cont_ldst_pages, the differences between first-fault and no-fault
are minimal, so unify the routines. With cpu_probe_watchpoint, we are able
to make progress through pages with TLB_WATCHPOINT set when the watchpoint
does not actually fire.
Backports commit c647673ce4d72a8789703c62a7f3cbc732cb1ea8 from qemu
Handle all of the watchpoints for active elements all at once,
before we've modified the vector register. This removes the
TLB_WATCHPOINT bit from page[].flags, which means that we can
use the normal fast path via RAM.
Backports commit 4bcc3f0ff8e5ae2b17b5aab9aa613ff1b8025896 from qemu
First use of the new helper functions, so we can remove the
unused markup. No longer need a scratch for user-only, as
we completely probe the page set before reading; system mode
still requires a scratch for MMIO.
Backports commit b854fd06a868e0308bcfe05ad0a71210705814c7 from qemu
The current interface includes a loop; change it to load a
single element. We will then be able to use the function
for ld{2,3,4} where individual vector elements are not adjacent.
Replace each call with the simplest possible loop over active
elements.
Backports commit cf4a49b71b1712142d7122025a8ca7ea5b59d73f from qemu
For contiguous predicated memory operations, we want to
minimize the number of tlb lookups performed. We have
open-coded this for sve_ld1_r, but for correctness with
MTE we will need this for all of the memory operations.
Create a structure that holds the bounds of active elements,
and metadata for two pages. Add routines to find those
active elements, lookup the pages, and run watchpoints
for those pages.
Temporarily mark the functions unused to avoid Werror.
Backports commit b4cd95d2f4c7197b844f51b29871d888063ea3e7 from qemu
Use the "normal" memory access functions, rather than the
softmmu internal helper functions directly.
Since fb901c9, cpu_mem_index is now a simple extract
from env->hflags and not a large computation. Which means
that it's now more work to pass around this value than it
is to recompute it.
This only adjusts the primitives, and does not clean up
all of the uses within sve_helper.c.
Move the variable declarations to the top of the function,
but do not create a new label before sve_access_check.
Backports commit c0ed9166b1aea86a2fbaada1195aacd1049f9e85 from qemu
Replace existing uses of check_data_tbi in translate-a64.c that
perform multiple logical memory access. Leave the helper blank
for now to reduce the patch size.
Backports commit 73ceeb0011b23bac8bd2c09ebe3c18d034aa69ce from qemu
Replace existing uses of check_data_tbi in translate-a64.c that
perform a single logical memory access. Leave the helper blank
for now to reduce the patch size.
Backports commit 0a405be2b8fd9506a009b10d7d2d98c394b36db6 from qemu
Now that we know that the operation is on a single page,
we need not loop over pages while probing.
Backports commit e26d0d226892f67435cadcce86df0ddfb9943174 from qemu
We can simplify our DC_ZVA if we recognize that the largest BS
that we actually use in system mode is 64. Let us just assert
that it fits within TARGET_PAGE_SIZE.
For DC_GVA and STZGM, we want to be able to write whole bytes
of tag memory, so assert that BS is >= 2 * TAG_GRANULE, or 32.
Backports commit a4157b80242bf1c8aa0ee77aae7458ba79012d5d from qemu
Use the same code as system mode, so that we generate the same
exception + syndrome for the unaligned access.
For the moment, if MTE is enabled so that this path is reachable,
this would generate a SIGSEGV in the user-only cpu_loop. Decoding
the syndrome to produce the proper SIGBUS will be done later.
Backports commit 0d1762e931f8a694f261c604daba605bcda70928 from qemu
The current Arm ARM has adjusted the official decode of
"Add/subtract (immediate)" so that the shift field is only bit 22,
and bit 23 is part of the op1 field of the parent category
"Data processing - immediate".
Backports commit 21a8b343eaae63f6984f9a200092b0ea167647f1 from qemu