Because we now store uint64_t in TCGTemp, we can now always
store the full 64-bit duplicate immediate. So remove the
difference between 32- and 64-bit hosts.
Backports 0b4286dd15e2bcaf2aa53dfac0fb3103690f5a34
These will hold a single constant for the duration of the TB.
They are hashed, so that each value has one temp across the TB.
Not used yet, this is all infrastructure.
Backports c0522136adf550c7a0ef7c0755c1f9d1560d2757
This will reduce the differences between 32-bit and 64-bit hosts,
allowing full 64-bit constants to be created with the same interface.
Backports bdb38b95f72ebbef2d24e057828dd18ba9c81f63
In most, but not all, places that we check for TEMP_FIXED,
we are really testing that we do not modify the temporary.
Backports e01fa97dea857a35be5bb8cce0d632a62e72c689
The temp_fixed, temp_global, temp_local bits are all related.
Combine them into a single enumeration.
Backports ee17db83d2dce35792e9bf03366af193e5e0e5c9
While we don't store more than tcg_target_long in TCGTemp,
we shouldn't be limited to that for code generation. We will
be able to use this for INDEX_op_dup2_vec with 2 constants.
Also pass along the minimal vece that may be said to apply
to the constant. This allows some simplification in the
various backends.
Backports 4e18617555955503628a004ed97e1fc2fa7818b9
Enable this on i386 to restrict the set of input registers
for an 8-bit store, as required by the architecture. This
removes the last use of scratch registers for user-only mode.
Backports 07ce0b05300de5bc8f1932a4cfbe38f3323e5ab1
These are easier to set and test when they have their own fields.
Reduce the size of alias_index and sort_index to 4 bits, which is
sufficient for TCG_MAX_OP_ARGS. This leaves only the bits indicating
constants within the ct field.
Move all initialization to allocation time, rather than init
individual fields in process_op_defs.
Backports bc2b17e6ea582ef3ade2bdca750de269c674c915
This wasn't actually used for anything, really. All variable
operands must accept registers, and which are indicated by the
set in TCGArgConstraint.regs.
Backports commit 74a117906b87ff9220e4baae5a7431d6f4eadd45
This uses an existing hole in the TCGArgConstraint structure
and will be convenient for keeping the data in one place.
Backports 66792f90f14fef18b25a168922877a367ecdca05
If the output of the move is dead, then the last use is in
the store. If we propagate the input to the store, then we
can remove the move opcode entirely.
Backports commit 61f15c487fc2aea14f6b0e52c459ae8b7d252a65 from qemu
No host backend support yet, but the interfaces for rotls
are in place. Only implement left-rotate for now, as the
only known use of vector rotate by scalar is s390x, so any
right-rotate would be unused and untestable.
Backports commit 23850a74afb641102325b4b7f74071d929fc4594 from qemu
No host backend support yet, but the interfaces for rotlv
and rotrv are in place.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v3: Drop the generic expansion from rot to shift; we can do better
for each backend, and then this code becomes unused.
Backports commit 5d0ceda902915e3f0e21c39d142c92c4e97c3ebb from qemu
No host backend support yet, but the interfaces for rotli
are in place. Canonicalize immediate rotate to the left,
based on a survey of architectures, but provide both left
and right shift interfaces to the translators.
Backports commit b0f7e7444c03da17e41bf327c8aea590104a28ab from qemu
Preparation for collapsing the two byte swaps, adjust_endianness and
handle_bswap, along the I/O path.
Target dependant attributes are conditionalized upon NEED_CPU_H.
Backports commit 14776ab5a12972ea439c7fb2203a4c15a09094b4 from qemu
Perform a per-element conditional move. This combination operation is
easier to implement on some host vector units than plain cmp+bitsel.
Omit the usual gvec interface, as this is intended to be used by
target-specific gvec expansion call-backs.
Backports commit f75da2988eb2457fa23d006d573220c5c680ec4e from qemu
This operation performs d = (b & a) | (c & ~a), and is present
on a majority of host vector units. Include gvec expanders.
Backports commit 38dc12947ec9106237f9cdbd428792c985cd86ae from qemu
Allow the backend to expand dup from memory directly, instead of
forcing the value into a temp first. This is especially important
if integer/vector register moves do not exist.
Note that officially tcg_out_dupm_vec is allowed to fail.
If it did, we could fix this up relatively easily:
VECE == 32/64:
Load the value into a vector register, then dup.
Both of these must work.
VECE == 8/16:
If the value happens to be at an offset such that an aligned
load would place the desired value in the least significant
end of the register, go ahead and load w/garbage in high bits.
Load the value w/INDEX_op_ld{8,16}_i32.
Attempt a move directly to vector reg, which may fail.
Store the value into the backing store for OTS.
Load the value into the vector reg w/TCG_TYPE_I32, which must work.
Duplicate from the vector reg into itself, which must work.
All of which is well and good, except that all supported
hosts can support dupm for all vece, so all of the failure
paths would be dead code and untestable.
Backports commit 37ee55a081b7863ffab2151068dd1b2f11376914 from qemu
This case is similar to INDEX_op_mov_* in that we need to do
different things depending on the current location of the source.
Backports commit bab1671f0fa928fd678a22f934739f06fd5fd035 from qemu
The i386 backend already has these functions, and the aarch64 backend
could easily split out one. Nothing is done with these functions yet,
but this will aid register allocation of INDEX_op_dup_vec in a later patch.
Adjust the aarch64 tcg_out_dupi_vec signature to match the new interface.
Backports commit e7632cfa8b76cdbbc1c76e8737338ef5844e7d60 from qemu
PowerPC Altivec does not support direct moves between vector registers
and general registers. So when tcg_out_mov fails, we can use the
backing memory for the temporary to perform the move.
Backports commit 240c08d0998f402c325fce489de0d14831048128 from qemu
This patch merely changes the interface, aborting on all failures,
of which there are currently none.
Backports commit 78113e83e0007e869c9f0cb4c0497a77538988e3 from qemu
The only fixed_reg is cpu_env, and it should not be modified
during any TB. Therefore code that tries to special-case moves
into a fixed_reg is dead. Remove it.
Backports commit d63e3b6e694ad6c887be135dddb9cd4893f1a844 from qemu
If the TB generates too much code, such that backend relocations
overflow, try again with a smaller TB. In support of this, move
relocation processing from a random place within tcg_out_op, in
the handling of branch opcodes, to a new function at the end of
tcg_gen_code.
This is not a complete solution, as there are additional relocs
generated for out-of-line ldst handling and constant pools.
Backports commit 7ecd02a06f8f4c0bbf872ecc15e37035b7e1df5f from qemu
If a TB generates too much code, try again with fewer insns.
Fixes: https://bugs.launchpad.net/bugs/1824853
Backports commit 6e6c4efed995d9eca6ae0cfdb2252df830262f50 from qemu
Currently, a jump to a label that is not defined anywhere will
be emitted not be relocated. This results in a jump to a random
jump target. With tcg debugging, print a diagnostic to the -d op
file and abort.
This could help debug or detect errors like
c2d9644e6d ("target/arm: Fix crash on conditional instruction in an IT block")
Backports commit bef16ab4e641636b4e85c3d863b4257ce0be4e6f from qemu
Free the argument register only after we have verified that the
temporary is not already in that register. This case is likely
now that we are back propagating the preferred register.
Backports commit 4250da10923347c9ee907f8d72bd93dfa5ee8742 from qemu
With these preferences, we can arrange for function call arguments to
be computed into the proper registers instead of requiring extra moves.
Backports commit 25f49c5f1508ddf081ce89fa6bbfd87a51eea37b from qemu
No need for a "tcg_" prefix for a static function; we already
have another "la_" prefix for indicating liveness analysis.
Pass in nb_globals and nb_temps, as we will already have them
in registers for other loops within the parent function.
Backports commit 2616c8082143373e794b62444bf81754f50dbf6b from qemu
Try harder to honor the output_pref. When we're forced to allocate
a second register for the input, it does not need to use the input
constraint; that will be honored by the register we allocate for the
output and a move is already required.
Backports commit d62816f2db439b2dd761c674f0256f21d9dd2ed0 from qemu
Allocate storage for, but do not yet fill in, per-opcode
preferences for the output operands. Pass it in to the
register allocation routines for output operands.
Backports commit 69e3706d2b473815e382552e729d12590339e0ac from qemu