No host backend support yet, but the interfaces for rotls
are in place. Only implement left-rotate for now, as the
only known use of vector rotate by scalar is s390x, so any
right-rotate would be unused and untestable.
Backports commit 23850a74afb641102325b4b7f74071d929fc4594 from qemu
No host backend support yet, but the interfaces for rotlv
and rotrv are in place.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v3: Drop the generic expansion from rot to shift; we can do better
for each backend, and then this code becomes unused.
Backports commit 5d0ceda902915e3f0e21c39d142c92c4e97c3ebb from qemu
No host backend support yet, but the interfaces for rotli
are in place. Canonicalize immediate rotate to the left,
based on a survey of architectures, but provide both left
and right shift interfaces to the translators.
Backports commit b0f7e7444c03da17e41bf327c8aea590104a28ab from qemu
Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree.
We already have gvec helpers for addition and subtraction, but must
add one for fabd.
Backports commit a26a352bb498662cd0c205cb433a352f86fac7d2 from qemu
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
Backports commit 146aa66ce58b686b8037d0eb3921c1125942dbde from qemu
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
Backports commit c7715b6b51a6f7a5412c5fcb40a4c8586105e597 from qemu
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
Backports commit 8161b75357095fef54c76b1a6ed1e54d0e8655e0 from qemu
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
Backports commit 271063206a46062a45fc6bab8dabe45f0b88159d from qemu
Provide a functional interface for the vector expansion.
This fits better with the existing set of helpers that
we provide for other operations.
Macro-ize the 5 nearly identical comparisons.
Backports commit 69d5e2bf8c3cefedbfa1c1670137e636dbd7faa5 from qemu
The functions eliminate duplication of the special cases for
this operation. They match up with the GVecGen2iFn typedef.
Add out-of-line helpers. We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.
Backports commit 893ab0542aa385a287cbe46d5535c8b9e95ce699 from qemu
Create vectorized versions of handle_shri_with_rndacc
for shift+round and shift+round+accumulate. Add out-of-line
helpers in preparation for longer vector lengths from SVE.
Backports commit 6ccd48d4ea244c1c46a24dfa50bfb547f11422dd from qemu
The functions eliminate duplication of the special cases for
this operation. They match up with the GVecGen2iFn typedef.
Add out-of-line helpers. We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.
Backports commit 631e565450c483e0622eec3d8b61d7fa41d16bca from qemu
Add a version of tcg_gen_dup_* that takes both immediate and
a vector element size operand. This will replace the set of
tcg_gen_gvec_dup{8,16,32,64}i functions that encode the element
size within the function name.
Backports commit 44c94677febd15488f9190b11eaa4a08e8ac696b from qemu
Make cpu_register() (renamed to arm_cpu_register()) available
from internals.h so we can register CPUs also from other files
in the future.
Backports commit 37bcf244454f4efb82e2c0c64bbd7eabcc165a0c from qemu
These instructions are often used in glibc's string routines.
They were the final uses of the 32-bit at a time neon helpers.
Backports commit 6b375d3546b009d1e63e07397ec9c6af256e15e9 from qemu
We still need two different helpers, since NEON and SVE2 get the
inputs from different locations within the source vector. However,
we can convert both to the same internal form for computation.
The sve2 helper is not used yet, but adding it with this patch
helps illustrate why the neon changes are helpful.
Backports commit e7e96fc5ec8c79dc77fef522d5226ac09f684ba5 from qemu
The gvec form will be needed for implementing SVE2.
Extend the implementation to operate on uint64_t instead of uint32_t.
Use a counted inner loop instead of terminating when op1 goes to zero,
looking toward the required implementation for ARMv8.4-DIT.
Backports commit a21bb78e5817be3f494922e1dadd6455fe5d6318 from qemu
These instructions shift left or right depending on the sign
of the input, and 7 bits are significant to the shift. This
requires several masks and selects in addition to the actual
shifts to form the complete answer.
That said, the operation is still a small improvement even for
two 64-bit elements -- 13 vector operations instead of 2 * 7
integer operations.
Backports commit 87b74e8b6edd287ea2160caa0ebea725fa8f1ca1 from qemu
Use the correct sctlr for EL2&0 regime. Due to header ordering,
and where arm_mmu_idx_el is declared, we need to move the function
out of line. Use the function in many more places in order to
select the correct control.
Backports commit aaec143212bb70ac9549cf73203d13100bd5c7c2 from qemu
Prepare for, but do not yet implement, the EL2&0 regime.
This involves adding the new MMUIdx enumerators and adjusting
some of the MMUIdx related predicates to match.
Backports commit b9f6033c1a5fb7da55ed353794db8ec064f78bb2 from qemu.
Add an option to trigger memory writeback to sync given memory region
with the corresponding backing store, case one is available.
This extends the support for persistent memory, allowing syncing on-demand.
Backports commit 61c490e25e081af39ff40556f6c1229b8b011585 from qemu
The raising of exceptions from check_watchpoint, buried inside
of the I/O subsystem, is fundamentally broken. We do not have
the helper return address with which we can unwind guest state.
Replace PHYS_SECTION_WATCH and io_mem_watch with TLB_WATCHPOINT.
Move the call to cpu_check_watchpoint into the cputlb helpers
where we do have the helper return address.
This allows watchpoints on RAM to bypass the full i/o access path.
Backports commit 50b107c5d617eaf93301cef20221312e7a986701 from qemu
Preparation for collapsing the two byte swaps adjust_endianness and
handle_bswap into the former.
Call memory_region_dispatch_{read|write} with endianness encoded into
the "MemOp op" operand.
This patch does not change any behaviour as
memory_region_dispatch_{read|write} is yet to handle the endianness.
Once it does handle endianness, callers with byte swaps can collapse
them into adjust_endianness.
Backports commit d5d680cacc66ef7e3c02c81dc8f3a34eabce6dfe from qemu
HCR_EL2.TID3 requires that AArch32 reads of MVFR[012] are trapped to
EL2, and HCR_EL2.TID0 does the same for reads of FPSID.
In order to handle this, introduce a new TCG helper function that
checks for these control bits before executing the VMRC instruction.
Tested with a hacked-up version of KVM/arm64 that sets the control
bits for 32bit guests.
Backports commit 9ca1d776cb49c09b09579d9edd0447542970c834 from qemu
Replace x = double_saturate(y) with x = add_saturate(y, y).
There is no need for a separate more specialized helper.
Backports commit 640581a06d14e2d0d3c3ba79b916de6bc43578b0 from qemu
In the next commit we will split the M-profile functions from this
file. Some function will be called out of helper.c. Declare them in
the "internals.h" header.
Backports commit 787a7e76c2e93a48c47b324fea592c9910a70483 from qemu
These routines are TCG specific.
The arm_deliver_fault() function is only used within the new
helper. Make it static.
Backports commit e21b551cb652663f2f2405a64d63ef6b4a1042b7 from qemu
In the next commit we will split the TLB related routines of
this file, and this function will also be called in the new
file. Declare it in the "internals.h" header.
Backports commit ebae861fc6c385a7bcac72dde4716be06e6776f1 from qemu
Perform a per-element conditional move. This combination operation is
easier to implement on some host vector units than plain cmp+bitsel.
Omit the usual gvec interface, as this is intended to be used by
target-specific gvec expansion call-backs.
Backports commit f75da2988eb2457fa23d006d573220c5c680ec4e from qemu
This operation performs d = (b & a) | (c & ~a), and is present
on a majority of host vector units. Include gvec expanders.
Backports commit 38dc12947ec9106237f9cdbd428792c985cd86ae from qemu
We can now use the CPUClass hook instead of a named function.
Create a static tlb_fill function to avoid other changes within
cputlb.c. This also isolates the asserts within. Remove the
named tlb_fill function from all of the targets.
Backports commit c319dc13579a92937bffe02ad2c9f1a550e73973 from qemu
Remove a function of the same name from target/arm/.
Use a branchless implementation of abs gleaned from gcc.
Backports commit ff1f11f7f8710a768f9313f24bd7f509d3db27e5 from qemu
Allow expansion either via shift by scalar or by replicating
the scalar for shift by vector.
Backports commit b4578cd91cda4cef1c413304353ca6dc5b957b60 from qemu
The gvec expanders perform a modulo on the shift count. If the target
requires alternate behaviour, then it cannot use the generic gvec
expanders anyway, and will have to have its own custom code.
Backports commit 5ee5c14cacda27e904cd6b0d9e7ffe1acff42838 from qemu
Allow the backend to expand dup from memory directly, instead of
forcing the value into a temp first. This is especially important
if integer/vector register moves do not exist.
Note that officially tcg_out_dupm_vec is allowed to fail.
If it did, we could fix this up relatively easily:
VECE == 32/64:
Load the value into a vector register, then dup.
Both of these must work.
VECE == 8/16:
If the value happens to be at an offset such that an aligned
load would place the desired value in the least significant
end of the register, go ahead and load w/garbage in high bits.
Load the value w/INDEX_op_ld{8,16}_i32.
Attempt a move directly to vector reg, which may fail.
Store the value into the backing store for OTS.
Load the value into the vector reg w/TCG_TYPE_I32, which must work.
Duplicate from the vector reg into itself, which must work.
All of which is well and good, except that all supported
hosts can support dupm for all vece, so all of the failure
paths would be dead code and untestable.
Backports commit 37ee55a081b7863ffab2151068dd1b2f11376914 from qemu
Replace the single opcode in .opc with a null-terminated
array in .opt_opc. We still require that all opcodes be
used with the same .vece.
Validate the contents of this list with CONFIG_DEBUG_TCG.
All tcg_gen_*_vec functions will check any list active
during .fniv expansion. Swap the active list in and out
as we expand other opcodes, or take control away from the
front-end function.
Convert all existing vector aware front ends.
Backports commit 53229a7703eeb2bbe101a19a33ef22aaf960c65b from qemu
Let's add tcg_gen_gvec_3i(), similar to tcg_gen_gvec_2i(), however
without introducing "gen_helper_gvec_3i *fnoi", as it isn't needed
for now.
Backports commit e1227bb6e59173117f094a6a13b998587b45c928 from qemu
Thereby decoupling the resulting translated code from the current state
of the system.
Backports commit 2399d4e7cec22ecf1c51062d2ebfd45220dbaace from qemu