The miscellaneous control instructions are mutually exclusive
within the t32 decode sub-group.
Backports commit d6084fba47bb9aef79775c1102d4b647eb58c365 from qemu
Convert the insns in the one-register-and-immediate group to decodetree.
In the new decode, our asimd_imm_const() function returns a 64-bit value
rather than a 32-bit one, which means we don't need to treat cmode=14 op=1
as a special case in the decoder (it is the only encoding where the two
halves of the 64-bit value are different).
Backports commit 2c35a39eda0b16c2ed85c94cec204bf5efb97812 from qemu
Convert the VCVT fixed-point conversion operations in the
Neon 2-regs-and-shift group to decodetree.
Backports commit 3da26f11711caeaa18318b6afa14dfb81d7650ab from qemu
Convert the VSHLL and VMOVL insns from the 2-reg-shift group
to decodetree. Since the loop always has two passes, we unroll
it to avoid the awkward reassignment of one TCGv to another.
Backports commit 968bf842742a5ffbb0041cb31089e61a9f7a833d from qemu
Convert the VQSHLU and QVSHL 2-reg-shift insns to decodetree.
These are the last of the simple shift-by-immediate insns.
Backports commit 37bfce81b10450071193c8495a07f182ec652e2a from qemu
Convert the VSHR 2-reg-shift insns to decodetree.
Note that unlike the legacy decoder, we present the right shift
amount to the trans_ function as a positive integer.
Backports commit 66432d6b8294e3508218b360acfdf7c244eea993 from qemu
Convert the VSHL and VSLI insns from the Neon 2-registers-and-a-shift
group to decodetree.
Backports commit d3c8c736f8b4bdd02831076286b1788232f46ced from qemu
Rather than passing an opcode to a helper, fully decode the
operation at translate time. Use clear_tail_16 to zap the
balance of the SVE register with the AdvSIMD write.
Backports commit 43fa36c96c24349145497adc1b451f9caf74e344 from qemu
Rather than passing an opcode to a helper, fully decode the
operation at translate time. Use clear_tail_16 to zap the
balance of the SVE register with the AdvSIMD write.
Backports commit afc8b7d32668547308bdd654a63cf5228936e0ba from qemu
Do not yet convert the helpers to loop over opr_sz, but the
descriptor allows the vector tail to be cleared. Which fixes
an existing bug vs SVE.
Backports commit effa992f153f5e7ab97ab843b565690748c5b402 from qemu
Do not yet convert the helpers to loop over opr_sz, but the
descriptor allows the vector tail to be cleared. Which fixes
an existing bug vs SVE.
Backports commit aaffebd6d3135b8aed7e61932af53b004d261579 from qemu
With this conversion, we will be able to use the same helpers
with sve. This also fixes a bug in which we failed to clear
the high bits of the SVE register after an AdvSIMD operation.
Backports commit 1738860d7e60dec5dbeba17f8b44d31aae3accac from qemu
With this conversion, we will be able to use the same helpers
with sve. In particular, pass 3 vector parameters for the
3-operand operations; for advsimd the destination register
is also an input.
This also fixes a bug in which we failed to clear the high bits
of the SVE register after an AdvSIMD operation.
Backports commit a04b68e1d4c4f0cd5cd7542697b1b230b84532f5 from qemu
Using the MSR instruction to write to CPSR.E is deprecated, but it is
required to work from any mode including unprivileged code. We were
incorrectly forbidding usermode code from writing it because
CPSR_USER did not include the CPSR_E bit.
We use CPSR_USER in only three places:
* as the mask of what to allow userspace MSR to write to CPSR
* when deciding what bits a linux-user signal-return should be
able to write from the sigcontext structure
* in target_user_copy_regs() when we set up the initial
registers for the linux-user process
In the first two cases not being able to update CPSR.E is a bug, and
in the third case it doesn't matter because CPSR.E is always 0 there.
So we can fix both bugs by adding CPSR_E to CPSR_USER.
Because the cpsr_write() in restore_sigcontext() is now changing
a CPSR bit which is cached in hflags, we need to add an
arm_rebuild_hflags() call there; the callsite in
target_user_copy_regs() was already rebuilding hflags for other
reasons.
(The recommended way to change CPSR.E is to use the 'SETEND'
instruction, which we do correctly allow from usermode code.)
Backports commit 268b1b3dfbb92a9348406f728a33f39e3d8dcd8a from qemu
Do not explicitly store zero to the NEON high part
when we can pass !is_q to clear_vec_high.
Backports commit e1f778596ebfa8782276f4dd4651f2b285d734ff from qemu
The 8-byte store for the end a !is_q operation can be
merged with the other stores. Use a no-op vector move
to trigger the expand_clr portion of tcg_gen_gvec_mov.
Backports commit 5c27392dd08bd8534893abf25ef501f1bd8680fe from qemu
Give the previously unnamed enum a typedef name. Use it in the
prototypes of compare functions. Use it to hold the results
of the compare functions.
Backports commit 71bfd65c5fcd72f8af2735905415c7ce4220f6dc from qemu
Give the previously unnamed enum a typedef name. Use the packed
attribute so that we do not affect the layout of the float_status
struct. Use it in the prototypes of relevant functions.
Adjust switch statements as necessary to avoid compiler warnings.
Backports commit 3dede407cc61b64997f0c30f6dbf4df09949abc9 from qemu
The existing f{32,64}_addsub_post test, which checks for zero
inputs, is identical to f{32,64}_mul_fast_test. Which means
we can eliminate the fast_test/fast_op hooks in favor of
reusing the same post hook.
This means we have one fewer test along the fast path for multiply.
Backports commit b240c9c497b9880ac0ba29465907d5ebecd48083 from qemu
Convert the Neon floating point VFMA and VFMS insn to decodetree.
These are the last insns in the 3-reg-same group so we can
remove all the support/loop code from the old decoder.
Backports commit e95485f85657be21135c17a9226e297c21e73360 from qemu
Convert the Neon fp VMAX/VMIN/VMAXNM/VMINNM/VRECPS/VRSQRTS 3-reg-same
insns to decodetree. (These are all the remaining non-accumulation
instructions in this group.)
Backports commit d5fdf9e9e1c6f2bbb0a4bcaafd85d344cce9c298 from qemu
The usual location for the env argument in the argument list of a TCG helper
is immediately after the return-value argument. recps_f32 and rsqrts_f32
differ in that they put it at the end.
Move the env argument to its usual place; this will allow us to
more easily use these helper functions with the gvec APIs.
Backports commit 26c6f695cfd2a3ccddb4d015a25b56f56aa62928 from qemu
Convert the Neon integer 3-reg-same compare insns VCGE, VCGT,
VCEQ, VACGE and VACGT to decodetree.
Backports commit 727ff1d63213e6666e511956903b9e97a339ec7e from qemu
Convert the Neon integer VMUL, VMLA, and VMLS 3-reg-same inssn to
decodetree.
We don't have a gvec helper for multiply-accumulate, so VMLA and VMLS
need a loop function do_3same_fp(). This takes a reads_vd parameter
to do_3same_fp() which tells it to load the old value into vd before
calling the callback function, in the same way that the do_vfp_3op_sp()
and do_vfp_3op_dp() functions in translate-vfp.inc.c work. (The
only uses in this patch pass reads_vd == true, but later commits
will use reads_vd == false.)
This conversion fixes in passing an underdecoding for VMUL
Backports commit 8aa71ead912ca0a9c0d29b74e0976f91952f950a from qemu
Convert the Neon float VPMIN, VPMAX and VPADD 3-reg-same insns to
decodetree. These are the only remaining 'pairwise' operations,
so we can delete the pairwise-specific bits of the old decoder's
for-each-element loop now.
Backports commit ab978335a56e3618212868fdce3a54217c6e71e6 from qemu
Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree.
We already have gvec helpers for addition and subtraction, but must
add one for fabd.
Backports commit a26a352bb498662cd0c205cb433a352f86fac7d2 from qemu
Convert the Neon VQDMULH and VQRDMULH 3-reg-same insns to
decodetree. These are the last integer operations in the
3-reg-same group.
Backports commit 7ecc28bc72b8033cf4e0c6332135ec20d4125dfb from qemu
Convert the Neon integer VPADD 3-reg-same insns to decodetree. These
are 'pairwise' operations. (Note that VQRDMLAH, which shares the
same primary opcode but has U=1, has already been converted.)
Backports commit fa22827d4eb078b6c58cd3d19af0b50ed951e832 from qemu
Convert the Neon integer VPMAX and VPMIN 3-reg-same insns to
decodetree. These are 'pairwise' operations.
Backports commit 059c2398a2b1ae86c6722c45e79fb0d0f4d95b1d from qemu
Convert the VQSHL, VRSHL and VQRSHL insns in the 3-reg-same
group to decodetree. We have already implemented the size==0b11
case of these insns; this commit handles the remaining sizes
Backports commit 6812dfdc6b0286730d6f903ebfbdc4f81b80c29b from qemu
Convert the Neon VRHADD and VHSUB 3-reg-same insns to decodetree.
(These are all the other insns in 3-reg-same which were using
GEN_NEON_INTEGER_OP() and which are not pairwise or
reversed-operands.)
Backports commit 8e44d03f4b5590e19a4f7910ca1c327609933dd7 from qemu
Convert the 64-bit element insns in the 3-reg-same group
to decodetree. This covers VQSHL, VRSHL and VQRSHL where
size==0b11.
Backports commit 35d4352fa9e94b35bf17f58181cb16c184b98d56 from qemu
Convert the Neon VQRDMLAH and VQRDMLSH insns in the 3-reg-same group
to decodetree. These don't use do_3same() because they want to
operate on VFP double registers, whose offsets are different from the
neon_reg_offset() calculations do_3same does.
Backports commit a063569508af8295cf6271e06700e5b956bb402d from qemu
Pass a pointer directly to env->vfp.qc[0], rather than env.
This will allow SVE2, which does not modify QC, to pass a
pointer to dummy storage.
Change the return type of inl_qrdml.h_s16 to match the
sense of the operation: signed.
Backports commit e286bf4a72fe3a60490b8d6e3f28d6335677e08c from qemu
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
Backports commit 146aa66ce58b686b8037d0eb3921c1125942dbde from qemu
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
Backports commit c7715b6b51a6f7a5412c5fcb40a4c8586105e597 from qemu
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
Backports commit 8161b75357095fef54c76b1a6ed1e54d0e8655e0 from qemu
Rather than perform the argument swap during code generation,
perform it during decode. This means it doesn't have to be
special cased later, and we can share code with aarch64 code
generation. Hopefully the decode comment addresses any confusion
that might arise in between.
Backports commit e9eee5316ffec5f37643de806b2e5577c5c189cf from qemu
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
Backports commit 271063206a46062a45fc6bab8dabe45f0b88159d from qemu
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
Macro-ize the 5 nearly identical comparisons.
Backports commit 69d5e2bf8c3cefedbfa1c1670137e636dbd7faa5 from qemu
Now that we've converted all cases to gvec, there is quite a bit
of dead code at the end of the function. Remove it.
Sink the call to gen_gvec_fn2i to the end, loading a function
pointer within the switch statement.
Backports commit 3f08f0bce841e7857ec98ce7909629d0c335005e from qemu
In 1dc8425e551, while converting to gvec, I added an extra range check
against the shift count. This was unnecessary because the encoding of
the shift count produces 0 to the element size - 1.
Backports commit 2f27c5244db300387f15d9ffa5067a204ffd625d from qemu
The functions eliminate duplication of the special cases for
this operation. They match up with the GVecGen2iFn typedef.
Add out-of-line helpers. We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.
Backports commit 893ab0542aa385a287cbe46d5535c8b9e95ce699 from qemu
Create vectorized versions of handle_shri_with_rndacc
for shift+round and shift+round+accumulate. Add out-of-line
helpers in preparation for longer vector lengths from SVE.
Backports commit 6ccd48d4ea244c1c46a24dfa50bfb547f11422dd from qemu
The functions eliminate duplication of the special cases for
this operation. They match up with the GVecGen2iFn typedef.
Add out-of-line helpers. We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.
Backports commit 631e565450c483e0622eec3d8b61d7fa41d16bca from qemu
DUP (indexed) can duplicate 128-bit elements, so using esz
unconditionally can assert in tcg_gen_gvec_dup_imm.
Fixes: 8711e71f9cbb
Backports commit 7e17d50ebd359ee5fa3d65d7fdc0fe0336d60694 from qemu
Now that we can pass 7 parameters, do not encode register
operands within simd_data.
Backports commit 08975da9f0bfcfa654628cae71201a351ba5449a from qemu
Move the common set_feature() and unset_feature() functions
from cpu.c and cpu64.c to cpu.h.
Backports commit 5fda95041d7237ab35733ceb66e0cb89f6107169 from qemu
Since on the aarch64-linux-user build, arm_cpus[] is empty, add
the cpu_count variable and only iterate when it is non-zero.
Backports commit 92b6a659388ab3735e5fbb17ac486923b681f57f from qemu
Calling access_el3_aa32ns() works for AArch32 only cores
but it does not handle 32-bit EL2 on top of 64-bit EL3
for mixed 32/64-bit cores.
Merge access_el3_aa32ns_aa64any() into access_el3_aa32ns()
and only use the latter.
Fixes: 68e9c2fe65 ("target-arm: Add VTCR_EL2")
Backports commit 93dd1e6140e2652347cfe7208591d4cd32762d08 from qemu
We're going to want at least some of the NeonGen* typedefs
for the refactored 32-bit Neon decoder, so move them all
to translate.h since it makes more sense to keep them in
one group.
Backports commit 9aefc6cf9b73f66062d2f914a0136756e7a28211 from qemu
Convert the Neon VMUL, VMLA, VMLS and VSHL insns in the
3-reg-same grouping to decodetree.
Backports commit 0de34fd48ad4e44bf5caa2330657ebefa93cea7d from qemu
Convert the Neon logic ops in the 3-reg-same grouping to decodetree.
Note that for the logic ops the 'size' field forms part of their
decode and the actual operations are always bitwise.
Backports commit 35a548edb6f5043386183b9f6b4139d99d1f130a from qemu
Convert the Neon 3-reg-same VADD and VSUB insns to decodetree.
Note that we don't need the neon_3r_sizes[op] check here because all
size values are OK for VADD and VSUB; we'll add this when we convert
the first insn that has size restrictions.
For this we need one of the GVecGen*Fn typedefs currently in
translate-a64.h; move them all to translate.h as a block so they
are visible to the 32-bit decoder.
Backports commit a4e143ac5b9185f670d2f17ee9cc1a430047cb65 from qemu
Convert the Neon "load/store single structure to one lane" insns to
decodetree.
As this is the last set of insns in the neon load/store group,
we can remove the whole disas_neon_ls_insn() function.
Backports commit 123ce4e3daba26b760b472687e1fb1ad82cf1993 from qemu
Convert the VFM[AS]L (scalar) insns in the 2reg-scalar-ext group
to decodetree. These are the last ones in the group so we can remove
all the legacy decode for the group.
Note that in disas_thumb2_insn() the parts of this encoding space
where the decodetree decoder returns false will correctly be directed
to illegal_op by the "(insn & (1 << 28))" check so they won't fall
into disas_coproc_insn() by mistake.
Backports commit d27e82f7d02f35e5919bd9cbbcb157f3537069a0 from qemu
Convert the VFM[AS]L (vector) insns to decodetree. This is the last
insn in the legacy decoder for the 3same_ext group, so we can
delete the legacy decoder function for the group entirely.
Note that in disas_thumb2_insn() the parts of this encoding space
where the decodetree decoder returns false will correctly be directed
to illegal_op by the "(insn & (1 << 28))" check so they won't fall
into disas_coproc_insn() by mistake.
Backports commit 9a107e7b8a3c87ab63ec830d3d60f319fc577ff7 from qemu
Add the infrastructure for building and invoking a decodetree decoder
for the AArch32 Neon encodings. At the moment the new decoder covers
nothing, so we always fall back to the existing hand-written decode.
We follow the same pattern we did for the VFP decodetree conversion
(commit 78e138bc1f672c145ef6ace74617d and following): code that deals
with Neon will be moving gradually out to translate-neon.vfp.inc,
which we #include into translate.c.
In order to share the decode files between A32 and T32, we
split Neon into 3 parts:
* data-processing
* load-store
* 'shared' encodings
The first two groups of instructions have similar but not identical
A32 and T32 encodings, so we need to manually transform the T32
encoding into the A32 one before calling the decoder; the third group
covers the Neon instructions which are identical in A32 and T32.
Backports commit 625e3dd44a15dfbe9532daa6454df3f86cf04d3e from qemu
We were accidentally permitting decode of Thumb Neon insns even if
the CPU didn't have the FEATURE_NEON bit set, because the feature
check was being done before the call to disas_neon_data_insn() and
disas_neon_ls_insn() in the Arm decoder but was omitted from the
Thumb decoder. Push the feature bit check down into the called
functions so it is done for both Arm and Thumb encodings.
Backports commit d1a6d3b594157425232a1ae5ea7f51b7a1c1aa2e from qemu
Somewhere along theline we accidentally added a duplicate
"using D16-D31 when they don't exist" check to do_vfm_dp()
(probably an artifact of a patchseries rebase). Remove it.
Backports commit 0d787cf1f3c88fa29477e054f8523f6d82d91c98 from qemu
MIDR_EL1 is a 64-bit system register with the top 32-bit being RES0.
Represent it in QEMU's ARMCPU struct with a uint64_t, not a
uint32_t.
This fixes an error when compiling with -Werror=conversion
because we were manipulating the register value using a
local uint64_t variable:
target/arm/cpu64.c: In function ‘aarch64_max_initfn’:
target/arm/cpu64.c:628:21: error: conversion from ‘uint64_t’ {aka ‘long unsigned int’} to ‘uint32_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
628 | cpu->midr = t;
| ^
and future-proofs us against a possible future architecture
change using some of the top 32 bits.
Backports commit e544f80030121040c8932ff1bd4006f390266c0f from qemu
In aarch64_max_initfn() we update both 32-bit and 64-bit ID
registers. The intended pattern is that for 64-bit ID registers we
use FIELD_DP64 and the uint64_t 't' register, while 32-bit ID
registers use FIELD_DP32 and the uint32_t 'u' register. For
ID_AA64DFR0 we accidentally used 'u', meaning that the top 32 bits of
this 64-bit ID register would end up always zero. Luckily at the
moment that's what they should be anyway, so this bug has no visible
effects.
Use the right-sized variable.
Backports commit 5a89dd2385a193aa954a7c9bf4e381f2ba6ae359 from qemu
The ARMv8.2-TTS2UXN feature extends the XN field in stage 2
translation table descriptors from just bit [54] to bits [54:53],
allowing stage 2 to control execution permissions separately for EL0
and EL1. Implement the new semantics of the XN field and enable
the feature for our 'max' CPU.
Backports commit ce3125bed935a12e619a8253c19340ecaa899347 from qemu
For ARMv8.2-TTS2UXN, the stage 2 page table walk wants to know
whether the stage 1 access is for EL0 or not, because whether
exec permission is given can depend on whether this is an EL0
or EL1 access. Add a new argument to get_phys_addr_lpae() so
the call sites can pass this information in.
Since get_phys_addr_lpae() doesn't already have a doc comment,
add one so we have a place to put the documentation of the
semantics of the new s1_is_el0 argument.
Backports commit ff7de2fc2c994030bfb83af9ddc9a3cd70ce3e88 from qemu
The access_type argument to get_phys_addr_lpae() is an MMUAccessType;
use the enum constant MMU_DATA_LOAD rather than a literal 0 when we
call it in S1_ptw_translate().
Backports commit 59dff859cd850876df2cfa561c7bcfc4bdda4599 from qemu
We define ARMMMUIdx_Stage2 as being an MMU index which uses a QEMU
TLB. However we never actually use the TLB -- all stage 2 lookups
are done by direct calls to get_phys_addr_lpae() followed by a
physical address load via address_space_ld*().
Remove Stage2 from the list of ARM MMU indexes which correspond to
real core MMU indexes, and instead put it in the set of "NOTLB" ARM
MMU indexes.
This allows us to drop NB_MMU_MODES to 11. It also means we can
safely add support for the ARMv8.3-TTS2UXN extension, which adds
permission bits to the stage 2 descriptors which define execute
permission separatel for EL0 and EL1; supporting that while keeping
Stage2 in a QEMU TLB would require us to use separate TLBs for
"Stage2 for an EL0 access" and "Stage2 for an EL1 access", which is a
lot of extra complication given we aren't even using the QEMU TLB.
In the process of updating the comment on our MMU index use,
fix a couple of other minor errors:
* NS EL2 EL2&0 was missing from the list in the comment
* some text hadn't been updated from when we bumped NB_MMU_MODES
above 8
Backports commit bf05340cb655637451162c02dadcd6581a05c02c from qemu
According to Arm ARM, VQDMULL is only valid when U=0, while having
U=1 is unallocated.
Backports commit ab553ef74ee52c0889679d0bd0da084aaf938f5c from qemu
We will move this code in the next commit. Clean it up
first to avoid checkpatch.pl errors.
Backports commit 51c510aa5876a681cd0059ed3bacaa17590dc2d5 from qemu
Make cpu_register() (renamed to arm_cpu_register()) available
from internals.h so we can register CPUs also from other files
in the future.
Backports commit 37bcf244454f4efb82e2c0c64bbd7eabcc165a0c from qemu
Under KVM these registers are written by the hardware.
Restrict the writefn handlers to TCG to avoid when building
without TCG:
LINK aarch64-softmmu/qemu-system-aarch64
target/arm/helper.o: In function `do_ats_write':
target/arm/helper.c:3524: undefined reference to `raise_exception'
Backports commit 9fb005b02dbda7f47b789b7f19bf5f73622a4756 from qemu
These instructions are often used in glibc's string routines.
They were the final uses of the 32-bit at a time neon helpers.
Backports commit 6b375d3546b009d1e63e07397ec9c6af256e15e9 from qemu
In commit 41a4bf1feab098da4cd the added code to set the CNP
field in ID_MMFR4 for the AArch64 'max' CPU had a typo
where it used the wrong variable name, resulting in ID_MMFR4
fields AC2, XNX and LSM being wrong. Fix the typo.
Fixes: 41a4bf1feab098da4cd
Backports commit e73c4443473107ddf11ad3a7fea5bef2001ee802 from qemu
An old comment in get_phys_addr_lpae() claims that the code does not
support the different format TCR for VTCR_EL2. This used to be true
but it is not true now (in particular the aa64_va_parameters() and
aa32_va_parameters() functions correctly handle the different
register format by checking whether the mmu_idx is Stage2).
Remove the out of date parts of the comment.
Backports commit 07d1be3b3aac20c21ac4a95c7f3f01a3622a31a3 from qemu
Our implementation of the PSTATE.PAN bit incorrectly cleared all
access permission bits for privileged access to memory which is
user-accessible. It should only affect the privileged read and write
permissions; execute permission is dealt with via XN/PXN instead.
Fixes: 81636b70c226dc27d7ebc8d
Backports commit f4e1dbc578a051db08a40c05276ebf525b98f949 from qemu
The arm_current_el() should be invoked after mode switching. Otherwise, we
get a wrong current EL value, since current EL is also determined by
current mode.
Fixes: 4a2696c0d4 ("target/arm: Set PAN bit as required on exception entry")
Backports commit 88828bf133b64b7a860c166af3423ef1a47c5d3b from qemu
Coverity reports a BAD_SHIFT with ctz32(imm5), with imm5 == 0.
This is an invalid encoding, but we diagnose that just below
by rejecting size > 3. Avoid the warning by sinking the
computation of index below the check.
Backports commit 550a04893c2bd4442211b353680b9a6408d94dba from qemu