Commit graph

171 commits

Author SHA1 Message Date
Richard Henderson 1b811b8546 tcg: Add tcg_reg_alloc_dup2
There are several ways we can expand a vector dup of a 64-bit
element on a 32-bit host.

Backports efe86b21ead9b5d256ce90c378e31681c5e243a5
2021-03-04 13:20:00 -05:00
Richard Henderson 471fc98c49 tcg: Remove movi and dupi opcodes
These are now completely covered by mov from a
TYPE_CONST temporary.

Backports c58f4c97b2ad9247c5ee85d625a934370862fba1
2021-03-04 13:14:30 -05:00
Richard Henderson 6e38e5004f tcg: Use tcg_constant_{i32,i64,vec} with gvec expanders
Backports 88d4005b098427638d7551aa04ebde4fdd06835b
2021-03-04 13:05:25 -05:00
Richard Henderson 6e54b46d28 tcg: Convert tcg_gen_dupi_vec to TCG_CONST
Because we now store uint64_t in TCGTemp, we can now always
store the full 64-bit duplicate immediate. So remove the
difference between 32- and 64-bit hosts.

Backports 0b4286dd15e2bcaf2aa53dfac0fb3103690f5a34
2021-03-04 12:19:48 -05:00
Richard Henderson 8edc9b76dd tcg: Introduce TYPE_CONST temporaries
These will hold a single constant for the duration of the TB.
They are hashed, so that each value has one temp across the TB.

Not used yet, this is all infrastructure.

Backports c0522136adf550c7a0ef7c0755c1f9d1560d2757
2021-03-03 21:29:40 -05:00
Richard Henderson 0f71f52216 tcg: Expand TCGTemp.val to 64-bits
This will reduce the differences between 32-bit and 64-bit hosts,
allowing full 64-bit constants to be created with the same interface.

Backports bdb38b95f72ebbef2d24e057828dd18ba9c81f63
2021-03-03 20:46:32 -05:00
Richard Henderson b49c4639d1 tcg: Add temp_readonly
In most, but not all, places that we check for TEMP_FIXED,
we are really testing that we do not modify the temporary.

Backports e01fa97dea857a35be5bb8cce0d632a62e72c689
2021-03-03 20:45:25 -05:00
Richard Henderson 30739864d2 tcg: Consolidate 3 bits into enum TCGTempKind
The temp_fixed, temp_global, temp_local bits are all related.
Combine them into a single enumeration.

Backports ee17db83d2dce35792e9bf03366af193e5e0e5c9
2021-03-03 20:41:24 -05:00
Richard Henderson 520ec7ca76 tcg: Increase tcg_out_dupi_vec immediate to int64_t
While we don't store more than tcg_target_long in TCGTemp,
we shouldn't be limited to that for code generation. We will
be able to use this for INDEX_op_dup2_vec with 2 constants.

Also pass along the minimal vece that may be said to apply
to the constant. This allows some simplification in the
various backends.

Backports 4e18617555955503628a004ed97e1fc2fa7818b9
2021-03-03 20:27:39 -05:00
Richard Henderson c5c19529c5 tcg: Use tcg_out_dupi_vec from temp_load
Having dupi pass though movi is confusing and arguably wrong.

Backports 0a6a8bc8ebfe5ae2a3f18ef48b92a74bc2df2f96
2021-03-03 20:23:02 -05:00
Richard Henderson 9d453e820a tcg: Introduce INDEX_op_qemu_st8_i32
Enable this on i386 to restrict the set of input registers
for an 8-bit store, as required by the architecture. This
removes the last use of scratch registers for user-only mode.

Backports 07ce0b05300de5bc8f1932a4cfbe38f3323e5ab1
2021-03-03 19:51:20 -05:00
Richard Henderson 7813c57f9e tcg: Move some TCG_CT_* bits to TCGArgConstraint bitfields
These are easier to set and test when they have their own fields.
Reduce the size of alias_index and sort_index to 4 bits, which is
sufficient for TCG_MAX_OP_ARGS. This leaves only the bits indicating
constants within the ct field.

Move all initialization to allocation time, rather than init
individual fields in process_op_defs.

Backports bc2b17e6ea582ef3ade2bdca750de269c674c915
2021-03-01 19:41:34 -05:00
Richard Henderson 71a34d84e5 tcg: Remove TCG_CT_REG
This wasn't actually used for anything, really. All variable
operands must accept registers, and which are indicated by the
set in TCGArgConstraint.regs.

Backports commit 74a117906b87ff9220e4baae5a7431d6f4eadd45
2021-03-01 19:38:00 -05:00
Richard Henderson ae075d324d tcg: Move sorted_args into TCGArgConstraint.sort_index
This uses an existing hole in the TCGArgConstraint structure
and will be convenient for keeping the data in one place.

Backports 66792f90f14fef18b25a168922877a367ecdca05
2021-03-01 19:33:45 -05:00
Richard Henderson e3356f9bad tcg: Drop union from TCGArgConstraint
The union is unused; let "regs" appear in the main structure
without the "u.regs" wrapping.

Backports 9be0d08019465b38e2f1a605960961a491430c21
2021-03-01 19:29:19 -05:00
Richard Henderson 0e68fa345e tcg: Improve move ops in liveness_pass_2
If the output of the move is dead, then the last use is in
the store. If we propagate the input to the store, then we
can remove the move opcode entirely.

Backports commit 61f15c487fc2aea14f6b0e52c459ae8b7d252a65 from qemu
2020-06-14 22:13:04 -04:00
Richard Henderson cc3187b1e4 tcg: Implement gvec support for rotate by scalar
No host backend support yet, but the interfaces for rotls
are in place. Only implement left-rotate for now, as the
only known use of vector rotate by scalar is s390x, so any
right-rotate would be unused and untestable.

Backports commit 23850a74afb641102325b4b7f74071d929fc4594 from qemu
2020-06-14 22:00:50 -04:00
Richard Henderson be78062fd8 tcg: Implement gvec support for rotate by vector
No host backend support yet, but the interfaces for rotlv
and rotrv are in place.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v3: Drop the generic expansion from rot to shift; we can do better
for each backend, and then this code becomes unused.

Backports commit 5d0ceda902915e3f0e21c39d142c92c4e97c3ebb from qemu
2020-06-14 21:43:46 -04:00
Richard Henderson 5cce52a04b tcg: Implement gvec support for rotate by immediate
No host backend support yet, but the interfaces for rotli
are in place. Canonicalize immediate rotate to the left,
based on a survey of architectures, but provide both left
and right shift interfaces to the translators.

Backports commit b0f7e7444c03da17e41bf327c8aea590104a28ab from qemu
2020-06-14 21:26:58 -04:00
Tony Nguyen f75368cd0f
tcg: TCGMemOp is now accelerator independent MemOp
Preparation for collapsing the two byte swaps, adjust_endianness and
handle_bswap, along the I/O path.

Target dependant attributes are conditionalized upon NEED_CPU_H.

Backports commit 14776ab5a12972ea439c7fb2203a4c15a09094b4 from qemu
2019-11-28 03:01:12 -05:00
Richard Henderson 2ea6dfbd63
tcg: Add support for vector compare select
Perform a per-element conditional move. This combination operation is
easier to implement on some host vector units than plain cmp+bitsel.
Omit the usual gvec interface, as this is intended to be used by
target-specific gvec expansion call-backs.

Backports commit f75da2988eb2457fa23d006d573220c5c680ec4e from qemu
2019-05-24 18:21:13 -04:00
Richard Henderson ca58be9cb4
tcg: Add support for vector bitwise select
This operation performs d = (b & a) | (c & ~a), and is present
on a majority of host vector units. Include gvec expanders.

Backports commit 38dc12947ec9106237f9cdbd428792c985cd86ae from qemu
2019-05-24 18:15:10 -04:00
Richard Henderson 6d5e7856ff
tcg: Add support for vector absolute value
Backports commit bcefc90208f8a1d6f619d61c2647281d92277015 from qemu
2019-05-16 16:33:43 -04:00
Richard Henderson 66e6bea084
tcg: Add INDEX_op_dupm_vec
Allow the backend to expand dup from memory directly, instead of
forcing the value into a temp first. This is especially important
if integer/vector register moves do not exist.

Note that officially tcg_out_dupm_vec is allowed to fail.
If it did, we could fix this up relatively easily:

VECE == 32/64:
Load the value into a vector register, then dup.
Both of these must work.

VECE == 8/16:
If the value happens to be at an offset such that an aligned
load would place the desired value in the least significant
end of the register, go ahead and load w/garbage in high bits.

Load the value w/INDEX_op_ld{8,16}_i32.
Attempt a move directly to vector reg, which may fail.
Store the value into the backing store for OTS.
Load the value into the vector reg w/TCG_TYPE_I32, which must work.
Duplicate from the vector reg into itself, which must work.

All of which is well and good, except that all supported
hosts can support dupm for all vece, so all of the failure
paths would be dead code and untestable.

Backports commit 37ee55a081b7863ffab2151068dd1b2f11376914 from qemu
2019-05-16 15:38:02 -04:00
Richard Henderson d4e7c6a8c5
tcg: Add tcg_out_dupm_vec to the backend interface
Currently stubbed out in all backends that support vectors.

Backports commit d6ecb4a978b718dbe108a9fa9ecccc8b7f7cb579 from qemu
2019-05-16 15:24:48 -04:00
Richard Henderson cf238d3544
tcg: Manually expand INDEX_op_dup_vec
This case is similar to INDEX_op_mov_* in that we need to do
different things depending on the current location of the source.

Backports commit bab1671f0fa928fd678a22f934739f06fd5fd035 from qemu
2019-05-16 15:22:29 -04:00
Richard Henderson 3d20e1678c
tcg: Promote tcg_out_{dup,dupi}_vec to backend interface
The i386 backend already has these functions, and the aarch64 backend
could easily split out one. Nothing is done with these functions yet,
but this will aid register allocation of INDEX_op_dup_vec in a later patch.

Adjust the aarch64 tcg_out_dupi_vec signature to match the new interface.

Backports commit e7632cfa8b76cdbbc1c76e8737338ef5844e7d60 from qemu
2019-05-16 15:18:48 -04:00
Richard Henderson d58d9ad16e
tcg: Support cross-class moves without instruction support
PowerPC Altivec does not support direct moves between vector registers
and general registers. So when tcg_out_mov fails, we can use the
backing memory for the temporary to perform the move.

Backports commit 240c08d0998f402c325fce489de0d14831048128 from qemu
2019-05-16 15:16:23 -04:00
Richard Henderson f86bd1c5d6
tcg: Return bool success from tcg_out_mov
This patch merely changes the interface, aborting on all failures,
of which there are currently none.

Backports commit 78113e83e0007e869c9f0cb4c0497a77538988e3 from qemu
2019-05-16 15:14:42 -04:00
Richard Henderson fef5700c9c
tcg: Assert fixed_reg is read-only
The only fixed_reg is cpu_env, and it should not be modified
during any TB. Therefore code that tries to special-case moves
into a fixed_reg is dead. Remove it.

Backports commit d63e3b6e694ad6c887be135dddb9cd4893f1a844 from qemu
2019-05-16 15:09:37 -04:00
Richard Henderson 6145e3fdd7
tcg: Restart TB generation after out-of-line ldst overflow
This is part c of relocation overflow handling.

Backports commit aeee05f53a5d67304a521d2644dc0a607e3c8b28 from qemu
2019-04-30 10:06:53 -04:00
Richard Henderson 196631e0a4
tcg: Restart TB generation after constant pool overflow
This is part b of relocation overflow handling.

Backports commit 1768987b73fa7e23e58b7844abe5882490ff8e42 from qemu
2019-04-30 10:00:52 -04:00
Richard Henderson 45315fd8ef
tcg: Restart TB generation after relocation overflow
If the TB generates too much code, such that backend relocations
overflow, try again with a smaller TB. In support of this, move
relocation processing from a random place within tcg_out_op, in
the handling of branch opcodes, to a new function at the end of
tcg_gen_code.

This is not a complete solution, as there are additional relocs
generated for out-of-line ldst handling and constant pools.

Backports commit 7ecd02a06f8f4c0bbf872ecc15e37035b7e1df5f from qemu
2019-04-30 09:58:45 -04:00
Richard Henderson 434b3ab9ec
tcg: Restart after TB code generation overflow
If a TB generates too much code, try again with fewer insns.

Fixes: https://bugs.launchpad.net/bugs/1824853

Backports commit 6e6c4efed995d9eca6ae0cfdb2252df830262f50 from qemu
2019-04-30 09:52:57 -04:00
Richard Henderson 269fa0daba
tcg: Add INDEX_op_extract2_{i32,i64}
This will let backends implement the double-word shift operation.

Backports commit fce1296f135669eca85dc42154a2a352c818ad76 from qemu
2019-04-30 09:29:05 -04:00
Lioncash f0c271ca2f
tcg: Correct special-cased brcond handling 2019-04-26 10:25:46 -04:00
Lioncash 4a64ebf95e
tcg: Synchronize with qemu 2019-04-26 09:32:20 -04:00
Lioncash 3996153514
tcg: Synchronize with qemu 2019-04-26 08:48:32 -04:00
Lioncash 006a13026a
tcg: Remove inconsistent g_strdup usage
The ts's name was allocated with strdup, but ts2's was being done with
g_strdup. Makes them consistent with upstream Qemu.
2019-04-26 08:48:32 -04:00
Lioncash e579832dcb
tcg: Synchronize with qemu 2019-04-18 04:57:19 -04:00
Richard Henderson f7c5f0ccbe
tcg: Diagnose referenced labels that have not been emitted
Currently, a jump to a label that is not defined anywhere will
be emitted not be relocated. This results in a jump to a random
jump target. With tcg debugging, print a diagnostic to the -d op
file and abort.

This could help debug or detect errors like
c2d9644e6d ("target/arm: Fix crash on conditional instruction in an IT block")

Backports commit bef16ab4e641636b4e85c3d863b4257ce0be4e6f from qemu
2019-02-12 11:35:11 -05:00
Richard Henderson fb684825c8
tcg: Add opcodes for vector minmax arithmetic
Backports commit dd0a0fcdd8848c2a18970c44a62bd8f394c2b495 from qemu
2019-01-29 16:24:52 -05:00
Richard Henderson e0266239ea
tcg: Add opcodes for vector saturated arithmetic
Backports commit 8afaf0506606f8003ef696df849c5a98637a7a83 from qemu
2019-01-29 16:14:34 -05:00
Richard Henderson 6ed82f77b4
tcg: Improve call argument loading
Free the argument register only after we have verified that the
temporary is not already in that register. This case is likely
now that we are back propagating the preferred register.

Backports commit 4250da10923347c9ee907f8d72bd93dfa5ee8742 from qemu
2019-01-05 07:24:08 -05:00
Richard Henderson 64843e8c09
tcg: Record register preferences during liveness
With these preferences, we can arrange for function call arguments to
be computed into the proper registers instead of requiring extra moves.

Backports commit 25f49c5f1508ddf081ce89fa6bbfd87a51eea37b from qemu
2019-01-05 07:22:57 -05:00
Richard Henderson 63cf164724
tcg: Split out more subroutines from liveness_pass_1
Backports commit f65a061c39cc4f9d088201031050e42eb23d5b2a from qemu
2019-01-05 07:07:49 -05:00
Richard Henderson c348ceba56
tcg: Rename and adjust liveness_pass_1 helpers
No need for a "tcg_" prefix for a static function; we already
have another "la_" prefix for indicating liveness analysis.
Pass in nb_globals and nb_temps, as we will already have them
in registers for other loops within the parent function.

Backports commit 2616c8082143373e794b62444bf81754f50dbf6b from qemu
2019-01-05 07:05:58 -05:00
Richard Henderson b356212b33
tcg: Dump register preference info with liveness
Backports commit 1894f69a612b35c2a39b44a824da04d74bfe324a from qemu
2019-01-05 07:00:21 -05:00
Richard Henderson 494d802781
tcg: Improve register allocation for matching constraints
Try harder to honor the output_pref. When we're forced to allocate
a second register for the input, it does not need to use the input
constraint; that will be honored by the register we allocate for the
output and a move is already required.

Backports commit d62816f2db439b2dd761c674f0256f21d9dd2ed0 from qemu
2019-01-05 06:57:56 -05:00
Richard Henderson 83a7de2566
tcg: Add output_pref to TCGOp
Allocate storage for, but do not yet fill in, per-opcode
preferences for the output operands. Pass it in to the
register allocation routines for output operands.

Backports commit 69e3706d2b473815e382552e729d12590339e0ac from qemu
2019-01-05 06:54:40 -05:00