GCC version 4.9.4 isn't clever enough to figure out that all
execution paths in disas_ldst() that use 'fn' will have initialized
it first, and so it warns:
/home/LiKaige/qemu/target/arm/translate-a64.c: In function ‘disas_ldst’:
/home/LiKaige/qemu/target/arm/translate-a64.c:3392:5: error: ‘fn’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
fn(cpu_reg(s, rt), clean_addr, tcg_rs, get_mem_index(s),
^
/home/LiKaige/qemu/target/arm/translate-a64.c:3318:22: note: ‘fn’ was declared here
AtomicThreeOpFn *fn;
^
Make it happy by initializing the variable to NULL.
Backports commit 88a90e3de6ae99cbcfcc04c862c51f241fdf685f from qemu
The definition of top_bit used in this function is one higher
than that used in the Arm ARM psuedo-code, which put the error
indication at top_bit - 1 at the wrong place, which meant that
it wasn't visible to Auth.
Fixing the definition of top_bit requires more changes, because
its most common use is for the count of bits in top_bit:bot_bit,
which would then need to be computed as top_bit - bot_bit + 1.
For now, prefer the minimal fix to the error indication alone.
Fixes: 63ff0ca94cb
Backports commit 8796fe40dd30cd9ffd3c958906471715c923b341 from qemu
When we changed the interface of get_phys_addr_lpae to require
the cacheattr parameter, this spot was missed. The compiler is
unable to detect the use of NULL vs the nonnull attribute here.
Fixes: 7e98e21c098
Backports commit a6d6f37aed4b171d121cd4a9363fbb41e90dcb53 from qemu
Raw writes to this register when in KVM mode can cause interrupts to be
raised (even when the PMU is disabled). Because the underlying state is
already aliased to PMINTENSET (which already provides raw write
functions), we can safely disable raw accesses to PMINTENCLR entirely.
Backports commit 887c0f1544991f567543b7c214aa11ab0cea0a29 from qemu
The mtedesc that was constructed was not actually passed in.
Found by Coverity (CID 1429996).
Backports commit cdecb3fc1eb182d90666348a47afe63c493686e7 from qemu
The temp that gets assigned to clean_addr has been allocated with
new_tmp_a64, which means that it will be freed at the end of the
instruction. Freeing it earlier leads to assertion failure.
The loop creates a complication, in which we allocate a new local
temp, which does need freeing, and the final code path is shared
between the loop and non-loop.
Fix this complication by adding new_tmp_a64_local so that the new
local temp is freed at the end, and can be treated exactly like
the non-loop path.
Fixes: bba87d0a0f4
Backports commit 4b4dc9750a0aa0b9766bd755bf6512a84744ce8a from qemu
We now implement all of the components of MTE, without actually
supporting any tagged memory. All MTE instructions will work,
trivially, so we can enable support.
Backports commit c7459633baa71d1781fde4a245d6ec9ce2f008cf from qemu
Look up the physical address for the given virtual address,
convert that to a tag physical address, and finally return
the host address that backs it.
Backports commit e4d5bf4fbd5abfc3727e711eda64a583cab4d637 from qemu
We need to check the memattr of a page in order to determine
whether it is Tagged for MTE. Between Stage1 and Stage2,
this becomes simpler if we always collect this data, instead
of occasionally being presented with NULL.
Use the nonnull attribute to allow the compiler to check that
all pointer arguments are non-null.
Backports commit 7e98e21c09871cddc20946c8f3f3595e93154ecb from qemu
There are a number of paths by which the TBI is still intact
for user-only in the SVE helpers.
Because we currently always set TBI for user-only, we do not
need to pass down the actual TBI setting from above, and we
can remove the top byte in the inner-most primitives, so that
none are forgotten. Moreover, this keeps the "dirty" pointer
around at the higher levels, where we need it for any MTE checking.
Since the normal case, especially for user-only, goes through
RAM, this clearing merely adds two insns per page lookup, which
will be completely in the noise.
Backports commit c4af8ba19b9d22aac79cab679a20b159af9d6809 from qemu
Because the elements are non-sequential, we cannot eliminate many
tests straight away like we can for sequential operations. But
we often have the PTE details handy, so we can test for Tagged.
Backports commit d28d12f008ee44dc2cc2ee5d8f673be9febc951e from qemu
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.
Backports commit aa13f7c3c378fa41366b9fcd6c29af1c3d81126a from qemu
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.
Backports commit 71b9f3948c75bb97641a3c8c7de96d1cb47cdc07 from qemu
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.
Backports commit 206adacfb8d35e671e3619591608c475aa046b63 from qemu
This avoids the need for a separate set of helpers to implement
no-fault semantics, and will enable MTE in the future.
Backports commit 50de9b78cec06e6d16e92a114a505779359ca532 from qemu
Follow the model set up for contiguous loads. This handles
watchpoints correctly for contiguous stores, recognizing the
exception before any changes to memory.
Backports commit 0fa476c1bb37a70df7eeff1e5bfb4791feb37e0e from qemu
With sve_cont_ldst_pages, the differences between first-fault and no-fault
are minimal, so unify the routines. With cpu_probe_watchpoint, we are able
to make progress through pages with TLB_WATCHPOINT set when the watchpoint
does not actually fire.
Backports commit c647673ce4d72a8789703c62a7f3cbc732cb1ea8 from qemu
Handle all of the watchpoints for active elements all at once,
before we've modified the vector register. This removes the
TLB_WATCHPOINT bit from page[].flags, which means that we can
use the normal fast path via RAM.
Backports commit 4bcc3f0ff8e5ae2b17b5aab9aa613ff1b8025896 from qemu
First use of the new helper functions, so we can remove the
unused markup. No longer need a scratch for user-only, as
we completely probe the page set before reading; system mode
still requires a scratch for MMIO.
Backports commit b854fd06a868e0308bcfe05ad0a71210705814c7 from qemu
The current interface includes a loop; change it to load a
single element. We will then be able to use the function
for ld{2,3,4} where individual vector elements are not adjacent.
Replace each call with the simplest possible loop over active
elements.
Backports commit cf4a49b71b1712142d7122025a8ca7ea5b59d73f from qemu
For contiguous predicated memory operations, we want to
minimize the number of tlb lookups performed. We have
open-coded this for sve_ld1_r, but for correctness with
MTE we will need this for all of the memory operations.
Create a structure that holds the bounds of active elements,
and metadata for two pages. Add routines to find those
active elements, lookup the pages, and run watchpoints
for those pages.
Temporarily mark the functions unused to avoid Werror.
Backports commit b4cd95d2f4c7197b844f51b29871d888063ea3e7 from qemu
Use the "normal" memory access functions, rather than the
softmmu internal helper functions directly.
Since fb901c9, cpu_mem_index is now a simple extract
from env->hflags and not a large computation. Which means
that it's now more work to pass around this value than it
is to recompute it.
This only adjusts the primitives, and does not clean up
all of the uses within sve_helper.c.
Move the variable declarations to the top of the function,
but do not create a new label before sve_access_check.
Backports commit c0ed9166b1aea86a2fbaada1195aacd1049f9e85 from qemu
Replace existing uses of check_data_tbi in translate-a64.c that
perform multiple logical memory access. Leave the helper blank
for now to reduce the patch size.
Backports commit 73ceeb0011b23bac8bd2c09ebe3c18d034aa69ce from qemu
Replace existing uses of check_data_tbi in translate-a64.c that
perform a single logical memory access. Leave the helper blank
for now to reduce the patch size.
Backports commit 0a405be2b8fd9506a009b10d7d2d98c394b36db6 from qemu
Now that we know that the operation is on a single page,
we need not loop over pages while probing.
Backports commit e26d0d226892f67435cadcce86df0ddfb9943174 from qemu
We can simplify our DC_ZVA if we recognize that the largest BS
that we actually use in system mode is 64. Let us just assert
that it fits within TARGET_PAGE_SIZE.
For DC_GVA and STZGM, we want to be able to write whole bytes
of tag memory, so assert that BS is >= 2 * TAG_GRANULE, or 32.
Backports commit a4157b80242bf1c8aa0ee77aae7458ba79012d5d from qemu
Use the same code as system mode, so that we generate the same
exception + syndrome for the unaligned access.
For the moment, if MTE is enabled so that this path is reachable,
this would generate a SIGSEGV in the user-only cpu_loop. Decoding
the syndrome to produce the proper SIGBUS will be done later.
Backports commit 0d1762e931f8a694f261c604daba605bcda70928 from qemu
The current Arm ARM has adjusted the official decode of
"Add/subtract (immediate)" so that the shift field is only bit 22,
and bit 23 is part of the op1 field of the parent category
"Data processing - immediate".
Backports commit 21a8b343eaae63f6984f9a200092b0ea167647f1 from qemu
Cache the composite ATA setting.
Cache when MTE is fully enabled, i.e. access to tags are enabled
and tag checks affect the PE. Do this for both the normal context
and the UNPRIV context.
Backports commit 81ae05fa2d21ac1a0054935b74342aa38a5ecef7 from qemu
This is TFSRE0_EL1, TFSR_EL1, TFSR_EL2, TFSR_EL3,
RGSR_EL1, GCR_EL1, GMID_EL1, and PSTATE.TCO.
Backports commit 4b779cebb3e5ab30b945181f1ba3932f5f8a1cb5 from qemu
Add an option that writes back the PC, like DISAS_UPDATE_EXIT,
but does not exit back to the main loop.
Backports commit 329833286d7a1b0ef8c7daafe13c6ae32429694e from qemu
target/arm: Add support for MTE to HCR_EL2 and SCR_EL3
This does not attempt to rectify all of the res0 bits, but does
clear the mte bits when not enabled. Since there is no high-part
mapping of SCTLR, aa32 mode cannot write to these bits.
Backports commits f00faf130d5dcf64b04f71a95f14745845ca1014, and
8ddb300bf60a5f3d358dd6fbf81174f6c03c1d9f from qemu.
Protect reads of aa64 id registers with ARM_CP_STATE_AA64.
Use this as a simpler test than arm_el_is_aa64, since EL3
cannot change mode.
Backports commit 252e8c69669599b4bcff802df300726300292f47 from qemu
In commit cfdb2c0c95ae9205b0 ("target/arm: Vectorize SABA/UABA") we
replaced the old handling of SABA/UABA with a vectorized implementation
which returns early rather than falling into the loop-ever-elements
code. We forgot to delete the part of the old looping code that
did the accumulate step, and Coverity correctly warns (CID 1428955)
that this code is now dead. Delete it.
Fixes: cfdb2c0c95ae9205b0
Backports commit ced7e8edb282765685d2ba0206a11f8692d8ec1c from qemu
Since commit ba3e7926691ed3 it has been unnecessary for target code
to call gen_io_end() after an IO instruction in icount mode; it is
sufficient to call gen_io_start() before it and to force the end of
the TB.
Many now-unnecessary calls to gen_io_end() were removed in commit
9e9b10c6491153b, but some were missed or accidentally added later.
Remove unneeded calls from the arm target:
* the call in the handling of exception-return-via-LDM is
unnecessary, and the code is already forcing end-of-TB
* the call in the VFP access check code is more complicated:
we weren't ending the TB, so we need to add the code to
force that by setting DISAS_UPDATE
* the doc comment for ARM_CP_IO doesn't need to mention
gen_io_end() any more
Backports commit 55c812b74289863c348449135812027d188f040a from qemu
The functions neon_element_offset(), neon_load_element(),
neon_load_element64(), neon_store_element() and
neon_store_element64() are used only in the translate-neon.inc.c
file, so move their definitions there.
Since the .inc.c file is #included in translate.c this doesn't make
much difference currently, but it's a more logical place to put the
functions and it might be helpful if we ever decide to try to make
the .inc.c files genuinely separate compilation units.
Backports commit 6fb5787898aab6aa04887fed9cf3220dd4c3f36a from qemu
Convert the Neon VTRN insn to decodetree. This is the last insn in the
Neon data-processing group, so we can remove all the now-unused old
decoder framework.
It's possible that there's a more efficient implementation of
VTRN, but for this conversion we just copy the existing approach.
Backports commit d4366190f84fe89cc5d46da995dac1e7d541b98e from qemu
Convert the Neon VSWP insn to decodetree. Since the new implementation
doesn't have to share a pass-loop with the other 2-reg-misc operations
we can implement the swap with 64-bit accesses rather than 32-bits
(which brings us into line with the pseudocode and is more efficient).
Backports commit 8ab3a227a0f13f0ff85846f36f7c466769aef4fc from qemu
Convert the Neon 2-reg-misc VRINT insns to decodetree.
Giving these insns their own do_vrint() function allows us
to change the rounding mode just once at the start and end
rather than doing it for every element in the vector.
Backports commit 128123ea34e9e6afe4842aefcb9cf84b9642ac22 from qemu
Convert the Neon 2-reg-misc insns which are implemented with
simple calls to functions that take the input, output and
fpstatus pointer.
Backports commit 3e96b205286dfb8bbf363229709e4f8648fce379 from qemu
Convert the Neon VQABS and VQNEG insns to decodetree.
Since these are the only ones which need cpu_env passing to
the helper, we wrap the helper rather than creating a whole
new do_2misc_env() function.
Backports commit 4936f38abe6db0a9d23fd04e4cb0cf4d51cff174 from qemu
Convert the remaining ops in the Neon 2-reg-misc group which
can be implemented simply with our do_2misc() helper.
Backports commit 84eae770af69c37a92496a4c4248875c070d5ee3 from qemu
Make gen_swap_half() take a source and destination TCGv_i32 rather
than modifying the input TCGv_i32; we're going to want to be able to
use it with the more flexible function signature, and this also
brings it into line with other functions like gen_rev16() and
gen_revsh().
Backports commit 8ec3de7018a8198624aae49eef5568256114a829 from qemu
All the other typedefs like these spell "Op" with a lowercase 'p';
remane the NeonGenTwoSingleOPFn and NeonGenTwoDoubleOPFn typedefs to
match.
Backports commit 5de3fd045be11b74cd0fbf36c6d4fb8387d5463b from qemu
The NeonGenOneOpFn typedef breaks with the pattern of the other
NeonGen*Fn typedefs, because it is a TCGv_i64 -> TCGv_i64 operation
but it does not have '64' in its name. Rename it to NeonGenOne64OpFn,
so that the old name is available for a TCGv_i32 -> TCGv_i32 operation
(which we will need in a subsequent commit).
Backports commit 039f4e809ad2772fb33de4511ff68a485d875618 from qemu
Convert to decodetree the insns in the Neon 2-reg-misc grouping which
we implement using gvec.
Backports commit 75153179e9928775d5333243ea4b278f438d75ae from qemu
Convert the Neon insns in the 2-reg-misc group which are
VCVT between f32 and f16 to decodetree.
Backports commit 654a517355e249435505ae5ff14a7520410cf7a4 from qemu
Convert the Neon narrowing moves VMQNV, VQMOVN, VQMOVUN in the 2-reg-misc
group to decodetree.
Backports commit 3882bdacb0ad548864b9f2582a32bb5c785e3165 from qemu
Convert the pairwise ops VPADDL and VPADAL in the 2-reg-misc grouping
to decodetree.
At this point we can get rid of the weird CPU_V001 #define that was
used to avoid having to explicitly list all the arguments being
passed to some TCG gen/helper functions.
Backports commit 6106af3aa2304fccee91a3a90138352b0c2af998 from qemu
Convert the Neon VDUP (scalar) insn to decodetree. (Note that we
can't call this just "VDUP" as we used that already in vfp.decode for
the "VDUP (general purpose register" insn.)
Backports commit 9aaa23c2ae18e6fb9a291b81baf91341db76dfa0 from qemu
Convert the Neon VTBL, VTBX instructions to decodetree. The actual
implementation of the insn is copied across to the new trans function
unchanged except for renaming 'tmp5' to 'tmp4'.
Backports commit 54e96c744b70a5d19f14b212a579dd3be8fcaad9 from qemu
Convert the Neon VEXT insn to decodetree. Rather than keeping the
old implementation which used fixed temporaries cpu_V0 and cpu_V1
and did the extraction with by-hand shift and logic ops, we use
the TCG extract2 insn.
We don't need to special case 0 or 8 immediates any more as the
optimizer is smart enough to throw away the dead code.
Backports commit 0aad761fb0aed40c99039eacac470cbd03d07019 from qemu
Convert the Neon 2-reg-scalar long multiplies to decodetree.
These are the last instructions in the group.
Backports commit 77e576a9281825fc170f3b3af83f47e110549b5c from qemu
Convert the float versions of VMLA, VMLS and VMUL in the Neon
2-reg-scalar group to decodetree.
Backports commit 85ac9aef9a5418de3168df569e21258e853840a2 from qemu
Convert the VMLA, VMLS and VMUL insns in the Neon "2 registers and a
scalar" group to decodetree. These are 32x32->32 operations where
one of the inputs is the scalar, followed by a possible accumulate
operation of the 32-bit result.
The refactoring removes some of the oddities of the old decoder:
* operands to the operation and accumulation were often
reversed (taking advantage of the fact that most of these ops
are commutative); the new code follows the pseudocode order
* the Q bit in the insn was in a local variable 'u'; in the
new code it is decoded into a->q
Backports commit 96fc80f5f186decd1a649f6c04252faceb057ad2 from qemu
In commit 37bfce81b10450071 we accidentally introduced a leak of a TCG
temporary in do_2shift_env_64(); free it.
Backports commit a4f67e180def790ff0bbb33fc93bb6e80382f041 from qemu
Mark the arrays of function pointers in trans_VSHLL_S_2sh() and
trans_VSHLL_U_2sh() as both 'static' and 'const'.
Backports commit 448f0e5f3ecfbd089b934e5e3aa0ccd1f51a6174 from qemu
Convert the Neon 3-reg-diff insn polynomial VMULL. This is the last
insn in this group to be converted.
Backports commit 18fb58d588898550919392277787979ee7d0d84e from qemu
Convert the Neon 3-reg-diff insns VQDMULL, VQDMLAL and VQDMLSL:
these are all saturating doubling long multiplies with a possible
accumulate step.
These are the last insns in the group which use the pass-over-each
elements loop, so we can delete that code.
Backports commit 9546ca5998d3cbd98a81b2d46a2e92a11b0f78a4 from qemu