Commit graph

651 commits

Author SHA1 Message Date
Richard Henderson 7c9b3a9021
tcg/aarch64: Support vector absolute value
Backports commit a456394ae540f852cd0d10fd693fe9f33598dc01 from qemu
2019-05-16 16:39:14 -04:00
Richard Henderson fd35490991
tcg/i386: Support vector absolute value
Backports commit 18f9b65f1a4225dd314cb9b0a8dea968c5bc2ef3 from qemu
2019-05-16 16:37:33 -04:00
Richard Henderson 6d5e7856ff
tcg: Add support for vector absolute value
Backports commit bcefc90208f8a1d6f619d61c2647281d92277015 from qemu
2019-05-16 16:33:43 -04:00
Richard Henderson 6d1730048d
tcg: Add support for integer absolute value
Remove a function of the same name from target/arm/.
Use a branchless implementation of abs gleaned from gcc.

Backports commit ff1f11f7f8710a768f9313f24bd7f509d3db27e5 from qemu
2019-05-16 16:25:15 -04:00
Richard Henderson 18b3df6e4e
tcg/i386: Support vector scalar shift opcodes
Backports commit 0a8d7a3bf5a149a82450eef555fd61728703dd84 from qemu
2019-05-16 16:19:44 -04:00
Richard Henderson 79b9dc559e
tcg: Add gvec expanders for vector shift by scalar
Allow expansion either via shift by scalar or by replicating
the scalar for shift by vector.

Backports commit b4578cd91cda4cef1c413304353ca6dc5b957b60 from qemu
2019-05-16 16:17:58 -04:00
Richard Henderson 0217ee7b24
tcg/aarch64: Support vector variable shift opcodes
Backports commit 79525dfd08262d8de10d271f17e5a4096ef96d16 from qemu
2019-05-16 15:58:54 -04:00
Richard Henderson f793ec847d
tcg/i386: Support vector variable shift opcodes
Backports commit a2ce146a06807fe1d1a81e878b8f249ff1e14038 from qemu
2019-05-16 15:53:33 -04:00
Richard Henderson 8c17687934
tcg: Add gvec expanders for variable shift
The gvec expanders perform a modulo on the shift count. If the target
requires alternate behaviour, then it cannot use the generic gvec
expanders anyway, and will have to have its own custom code.

Backports commit 5ee5c14cacda27e904cd6b0d9e7ffe1acff42838 from qemu
2019-05-16 15:51:09 -04:00
Richard Henderson 66e6bea084
tcg: Add INDEX_op_dupm_vec
Allow the backend to expand dup from memory directly, instead of
forcing the value into a temp first. This is especially important
if integer/vector register moves do not exist.

Note that officially tcg_out_dupm_vec is allowed to fail.
If it did, we could fix this up relatively easily:

VECE == 32/64:
Load the value into a vector register, then dup.
Both of these must work.

VECE == 8/16:
If the value happens to be at an offset such that an aligned
load would place the desired value in the least significant
end of the register, go ahead and load w/garbage in high bits.

Load the value w/INDEX_op_ld{8,16}_i32.
Attempt a move directly to vector reg, which may fail.
Store the value into the backing store for OTS.
Load the value into the vector reg w/TCG_TYPE_I32, which must work.
Duplicate from the vector reg into itself, which must work.

All of which is well and good, except that all supported
hosts can support dupm for all vece, so all of the failure
paths would be dead code and untestable.

Backports commit 37ee55a081b7863ffab2151068dd1b2f11376914 from qemu
2019-05-16 15:38:02 -04:00
Richard Henderson fd7a67e4a7
tcg/aarch64: Implement tcg_out_dupm_vec
The LD1R instruction does all the work. Note that the only
useful addressing mode is a base register with no offset.

Backports commit f23e5e15edfd49d5dd72cab2ed2d85ac354b2eeb from qemu
2019-05-16 15:29:04 -04:00
Richard Henderson a6fd4e2345
tcg/i386: Implement tcg_out_dupm_vec
At the same time, improve tcg_out_dupi_vec wrt broadcast
from the constant pool.

Backports commit 1e262b49b5331441f697461e4305fe06719758a7 from qemu
2019-05-16 15:27:15 -04:00
Richard Henderson d4e7c6a8c5
tcg: Add tcg_out_dupm_vec to the backend interface
Currently stubbed out in all backends that support vectors.

Backports commit d6ecb4a978b718dbe108a9fa9ecccc8b7f7cb579 from qemu
2019-05-16 15:24:48 -04:00
Richard Henderson cf238d3544
tcg: Manually expand INDEX_op_dup_vec
This case is similar to INDEX_op_mov_* in that we need to do
different things depending on the current location of the source.

Backports commit bab1671f0fa928fd678a22f934739f06fd5fd035 from qemu
2019-05-16 15:22:29 -04:00
Richard Henderson 3d20e1678c
tcg: Promote tcg_out_{dup,dupi}_vec to backend interface
The i386 backend already has these functions, and the aarch64 backend
could easily split out one. Nothing is done with these functions yet,
but this will aid register allocation of INDEX_op_dup_vec in a later patch.

Adjust the aarch64 tcg_out_dupi_vec signature to match the new interface.

Backports commit e7632cfa8b76cdbbc1c76e8737338ef5844e7d60 from qemu
2019-05-16 15:18:48 -04:00
Richard Henderson d58d9ad16e
tcg: Support cross-class moves without instruction support
PowerPC Altivec does not support direct moves between vector registers
and general registers. So when tcg_out_mov fails, we can use the
backing memory for the temporary to perform the move.

Backports commit 240c08d0998f402c325fce489de0d14831048128 from qemu
2019-05-16 15:16:23 -04:00
Richard Henderson f86bd1c5d6
tcg: Return bool success from tcg_out_mov
This patch merely changes the interface, aborting on all failures,
of which there are currently none.

Backports commit 78113e83e0007e869c9f0cb4c0497a77538988e3 from qemu
2019-05-16 15:14:42 -04:00
Richard Henderson f7d9ee8451
tcg/arm: Use tcg_out_mov_reg in tcg_out_mov
We have a function that takes an additional condition parameter
over the standard backend interface. It already takes care of
eliding no-op moves.

Backports commit c16f52b2c5d91c36e121795bd3b386cea0b7573c from qemu
2019-05-16 15:10:52 -04:00
Richard Henderson fef5700c9c
tcg: Assert fixed_reg is read-only
The only fixed_reg is cpu_env, and it should not be modified
during any TB. Therefore code that tries to special-case moves
into a fixed_reg is dead. Remove it.

Backports commit d63e3b6e694ad6c887be135dddb9cd4893f1a844 from qemu
2019-05-16 15:09:37 -04:00
Richard Henderson c54b2776f6
tcg: Specify optional vector requirements with a list
Replace the single opcode in .opc with a null-terminated
array in .opt_opc. We still require that all opcodes be
used with the same .vece.

Validate the contents of this list with CONFIG_DEBUG_TCG.
All tcg_gen_*_vec functions will check any list active
during .fniv expansion. Swap the active list in and out
as we expand other opcodes, or take control away from the
front-end function.

Convert all existing vector aware front ends.

Backports commit 53229a7703eeb2bbe101a19a33ef22aaf960c65b from qemu
2019-05-16 15:05:02 -04:00
Richard Henderson 37762fd92b
tcg: Allow add_vec, sub_vec, neg_vec, not_vec to be expanded
PowerPC Altivec does not support add and subtract of 64-bit elements.
Prepare for that configuration by not assuming the operation is
universally supported.

Backports commit ce27c5d1a38e93da38653af71fb468c5eded4c7b from qemu
2019-05-16 14:33:18 -04:00
Richard Henderson 9a9b681b38
tcg: Do not recreate INDEX_op_neg_vec unless supported
Use tcg_can_emit_vec_op instead of just TCG_TARGET_HAS_neg_vec,
so that we check the type and vece for the actual operation.

Backports commit ac383dde33405106469d04a78de1d76f1a730cb1 from qemu
2019-05-16 14:28:41 -04:00
David Hildenbrand f3b4a64d27
tcg: Implement tcg_gen_gvec_3i()
Let's add tcg_gen_gvec_3i(), similar to tcg_gen_gvec_2i(), however
without introducing "gen_helper_gvec_3i *fnoi", as it isn't needed
for now.

Backports commit e1227bb6e59173117f094a6a13b998587b45c928 from qemu
2019-05-16 14:26:50 -04:00
Emilio G. Cota c1e26c4e35
tcg: check CF_PARALLEL instead of parallel_cpus
Thereby decoupling the resulting translated code from the current state
of the system.

The tb->cflags field is not passed to tcg generation functions. So
we add a field to TCGContext, storing there a copy of tb->cflags.

Most architectures have <= 32 registers, which results in a 4-byte hole
in TCGContext. Use this hole for the new field.

Backports commit e82d5a2460b0e176128027651ff9b104e4bdf5cc from qemu
2019-05-06 00:52:08 -04:00
Lioncash cc37db76b6
tcg: Synchronize with qemu 2019-05-04 21:40:23 -04:00
Richard Henderson 5847d833b2
tcg/arm: Restrict constant pool displacement to 12 bits
This will not necessarily restrict the size of the TB, since for v7
the majority of constant pool usage is for calls from the out-of-line
ldst code, which is already at the end of the TB. But this does
allow us to save one insn per reference on the off-chance.

Backports commit b4b82d7e9caff7ccca5c621817b5a4b8e95eb9b1 from qemu
2019-04-30 10:10:21 -04:00
Richard Henderson 187e80c9a5
tcg/ppc: Allow the constant pool to overflow at 32k
There is no point in coding for a 2GB offset when the max TB size
is already limited to 64k. If we further restrict to 32k then we
can eliminate the extra ADDIS instruction.

Backports commit a7cdaf710f2aaaf0be855a338dd67463d4bb99e2 from qemu
2019-04-30 10:08:18 -04:00
Richard Henderson 6145e3fdd7
tcg: Restart TB generation after out-of-line ldst overflow
This is part c of relocation overflow handling.

Backports commit aeee05f53a5d67304a521d2644dc0a607e3c8b28 from qemu
2019-04-30 10:06:53 -04:00
Richard Henderson 196631e0a4
tcg: Restart TB generation after constant pool overflow
This is part b of relocation overflow handling.

Backports commit 1768987b73fa7e23e58b7844abe5882490ff8e42 from qemu
2019-04-30 10:00:52 -04:00
Richard Henderson 45315fd8ef
tcg: Restart TB generation after relocation overflow
If the TB generates too much code, such that backend relocations
overflow, try again with a smaller TB. In support of this, move
relocation processing from a random place within tcg_out_op, in
the handling of branch opcodes, to a new function at the end of
tcg_gen_code.

This is not a complete solution, as there are additional relocs
generated for out-of-line ldst handling and constant pools.

Backports commit 7ecd02a06f8f4c0bbf872ecc15e37035b7e1df5f from qemu
2019-04-30 09:58:45 -04:00
Richard Henderson 434b3ab9ec
tcg: Restart after TB code generation overflow
If a TB generates too much code, try again with fewer insns.

Fixes: https://bugs.launchpad.net/bugs/1824853

Backports commit 6e6c4efed995d9eca6ae0cfdb2252df830262f50 from qemu
2019-04-30 09:52:57 -04:00
Richard Henderson 2479bbd3b2
tcg/aarch64: Support INDEX_op_extract2_{i32,i64}
Backports commit 464c2969d5d7a0a5d38d2aa5d930986df876d3fb from qemu
2019-04-30 09:40:40 -04:00
Richard Henderson cbc5f919c2
tcg/arm: Support INDEX_op_extract2_i32
Backports commit 3b832d67a993968868f4087a9720a5c911e23f7a from qemu
2019-04-30 09:39:30 -04:00
Richard Henderson 0f20a26b36
tcg/i386: Support INDEX_op_extract2_{i32,i64}
Backports commit c6fb8c0cf704c4a1a48c3e99e995ad4c58150dab from qemu
2019-04-30 09:37:39 -04:00
Richard Henderson da39922c60
tcg: Use extract2 in tcg_gen_deposit_{i32,i64}
Backports commit b0a6056719b4a409a5699d11bbfdf79301417221 from qemu
2019-04-30 09:35:49 -04:00
Richard Henderson 948635602c
tcg: Use deposit and extract2 in tcg_gen_shifti_i64
Backports commit 02616bad6f0788652deaca9a48d0dfa7716ff87a from qemu
2019-04-30 09:33:44 -04:00
Richard Henderson 269fa0daba
tcg: Add INDEX_op_extract2_{i32,i64}
This will let backends implement the double-word shift operation.

Backports commit fce1296f135669eca85dc42154a2a352c818ad76 from qemu
2019-04-30 09:29:05 -04:00
David Hildenbrand 458942d94e
tcg: Implement tcg_gen_extract2_{i32,i64}
Will be helpful for s390x. Input 128 bit and output 64 bit only,
which is sufficient for now.

Backports commit 2089fcc9e7b4174d1c351eaa7d277c02188a6dd2 from qemu
2019-04-30 09:20:45 -04:00
Lioncash f0c271ca2f
tcg: Correct special-cased brcond handling 2019-04-26 10:25:46 -04:00
Lioncash 4a64ebf95e
tcg: Synchronize with qemu 2019-04-26 09:32:20 -04:00
Lioncash 3996153514
tcg: Synchronize with qemu 2019-04-26 08:48:32 -04:00
Lioncash 006a13026a
tcg: Remove inconsistent g_strdup usage
The ts's name was allocated with strdup, but ts2's was being done with
g_strdup. Makes them consistent with upstream Qemu.
2019-04-26 08:48:32 -04:00
Lioncash 96c52ea053
tcg: Synchronize with qemu 2019-04-22 02:03:01 -04:00
Lioncash b9d1002609
tcg-op: Make sure to free temporaries within gen_uc_tracecode()
After the helper is generated, these are no longer needed and can be
reclaimed.
2019-04-18 05:40:48 -04:00
Lioncash e579832dcb
tcg: Synchronize with qemu 2019-04-18 04:57:19 -04:00
Lioncash b6f752970b
target/riscv: Initial introduction of the RISC-V target
This ports over the RISC-V architecture from Qemu. This is currently a
very barebones transition. No code hooking or any fancy stuff.
Currently, you can feed it instructions and query the CPU state itself.

This also allows choosing whether or not RISC-V 32-bit or RISC-V 64-bit
is desirable through Unicorn's interface as well.

Extremely basic examples of executing a single instruction have been
added to the samples directory to help demonstrate how to use the basic
functionality.
2019-03-08 21:46:10 -05:00
Richard Henderson b9156a0f55
tcg: Remove TODO file
The last update to this file was 9 years ago. In the meantime,
4 of the 6 ideas have actually been completed. The lat two do
not actually make sense anymore.

Backports commit 9e564a1dde5abc7ae4cebc115142f685d98938d7 from qemu
2019-02-22 19:10:26 -05:00
Leon Alrae 6099733fc5
target/mips: reimplement SC instruction emulation and use cmpxchg
Completely rewrite conditional stores handling. Use cmpxchg.

This eliminates need for separate implementations of SC instruction
emulation for user and system emulation.

Backports commit 33a07fa2db66376e6ee780d4a8b064dc5118cf34 from qemu
2019-02-15 17:10:16 -05:00
Mark Cave-Ayland 576df55076
tcg/i386: fix unsigned vector saturating arithmetic
Due to a cut/paste error in the original implementation, the unsigned
vector saturating arithmetic was erroneously being calculated as signed
vector saturating arithmetic.

Fixes: 8ffafbcec2 ("tcg/i386: Implement vector saturating arithmetic")

Backports commit 3115584d39afe8cf2a84a40549029f53792abca5 from qemu
2019-02-12 11:37:12 -05:00
Richard Henderson f7c5f0ccbe
tcg: Diagnose referenced labels that have not been emitted
Currently, a jump to a label that is not defined anywhere will
be emitted not be relocated. This results in a jump to a random
jump target. With tcg debugging, print a diagnostic to the -d op
file and abort.

This could help debug or detect errors like
c2d9644e6d ("target/arm: Fix crash on conditional instruction in an IT block")

Backports commit bef16ab4e641636b4e85c3d863b4257ce0be4e6f from qemu
2019-02-12 11:35:11 -05:00
Thomas Huth 85bc48fecd
tcg: Fix LGPL version number
It's either "GNU *Library* General Public version 2" or "GNU Lesser
General Public version *2.1*", but there was no "version 2.0" of the
"Lesser" library. So assume that version 2.1 is meant here.

Backports commit fb0343d5b4dd4b9b9e96e563d913a3e0c709fe4e from qemu
2019-02-03 17:55:28 -05:00
Lioncash b36f8220b2 tcg: Provide an MSVC compatible version of dup_const
This just simply forwards to dup_const_impl
2019-01-30 17:02:54 -05:00
Lioncash 17ff842261
tcg: Remove the use of void pointer arithmetic within tcgv_i32_temp()
Removes the use of a GNU-specific extension.
2019-01-30 13:04:52 -05:00
Lioncash 6b702a9905
tcg: Convert void* casts to char* in temp_tcgv_i32()
Gets rid of the use of a GNU extension that allows arithmetic on void
pointers. This is equivalent to doing arithmetic on char/unsigned char
pointers.
2019-01-30 13:02:01 -05:00
Richard Henderson 3f0781e39b
tcg/aarch64: Implement vector minmax arithmetic
Backports commit 93f332a50371936ea02392bdb748c8140ef3f06a from qemu
2019-01-29 16:44:09 -05:00
Richard Henderson 8a012c3929
tcg/aarch64: Implement vector saturating arithmetic
Backports commit d32648d445c534cea7e2ad7ed8608208aa8831c1 from qemu
2019-01-29 16:42:50 -05:00
Richard Henderson 63d1aae6b2
tcg/i386: Implement vector minmax arithmetic
The avx instruction set does not directly provide MO_64.
We can still implement 64-bit with comparison and vpblendvb.

Backports commit bc37faf4cb2baa77c44298c01558970b88d32808 from qemu
2019-01-29 16:41:11 -05:00
Richard Henderson 5518b543ed
tcg/i386: Implement vector saturating arithmetic
Only MO_8 and MO_16 are implemented, since that's all the
instruction set provides.

Backports commit 8ffafbcec275e61f6a1a17ac1d0bd918d5b23db3 from qemu
2019-01-29 16:37:55 -05:00
Richard Henderson 24e65f60ed
tcg/i386: Split subroutines out of tcg_expand_vec_op
This routine was becoming too large.

Backports commit 44f1441dbe14e7174a707d7e7ecbc2c8e080bfda from qemu
2019-01-29 16:33:59 -05:00
Richard Henderson fb684825c8
tcg: Add opcodes for vector minmax arithmetic
Backports commit dd0a0fcdd8848c2a18970c44a62bd8f394c2b495 from qemu
2019-01-29 16:24:52 -05:00
Richard Henderson e0266239ea
tcg: Add opcodes for vector saturated arithmetic
Backports commit 8afaf0506606f8003ef696df849c5a98637a7a83 from qemu
2019-01-29 16:14:34 -05:00
Richard Henderson f1f2d78a52
tcg: Add write_aofs to GVecGen4
This allows writing 2 output, 3 input operations.

Backports commit 5d6acdd4a485f15b1081acc523b99c1f1a7c42ab from qemu
2019-01-29 16:00:57 -05:00
Richard Henderson e08d0feee4
tcg: Add gvec expanders for nand, nor, eqv
Backports commit f550805d8309500d642f640af8d9928958465478 from qemu
2019-01-29 15:57:28 -05:00
Richard Henderson 11664d444e
tcg: Add logical simplifications during gvec expand
We handle many of these during integer expansion, and the
rest of them during integer optimization.

Backports commit 9a9eda78e4e56051485efb65e01748084f99ac3c from qemu
2019-01-29 15:49:00 -05:00
Fredrik Noring baf2fe0fc1
target/mips: Introduce 32 R5900 multimedia registers
The 32 R5900 128-bit registers are split into two 64-bit halves:
the lower halves are the GPRs and the upper halves are accessible
by the R5900-specific multimedia instructions.

Backports commit a168a796e1c251787fcdf2d9ca1e9e69cb86ffcd from qemu
2019-01-22 20:14:56 -05:00
Richard Henderson 6ed82f77b4
tcg: Improve call argument loading
Free the argument register only after we have verified that the
temporary is not already in that register. This case is likely
now that we are back propagating the preferred register.

Backports commit 4250da10923347c9ee907f8d72bd93dfa5ee8742 from qemu
2019-01-05 07:24:08 -05:00
Richard Henderson 64843e8c09
tcg: Record register preferences during liveness
With these preferences, we can arrange for function call arguments to
be computed into the proper registers instead of requiring extra moves.

Backports commit 25f49c5f1508ddf081ce89fa6bbfd87a51eea37b from qemu
2019-01-05 07:22:57 -05:00
Richard Henderson c2be1cee79
tcg: Add TCG_OPF_BB_EXIT
Use this to notice the opcodes that exit the TB, which implies
that local temps are really dead and need not be synced.

Previously we so marked the true end of the TB, but that was
immediately overwritten by the la_bb_end invoked by any
TCG_OPF_BB_END opcode, like exit_tb.

Backports commit ae36a246ed1a0e96c6c4f478f03d047dfa3a8898 from qemu
2019-01-05 07:09:38 -05:00
Richard Henderson 63cf164724
tcg: Split out more subroutines from liveness_pass_1
Backports commit f65a061c39cc4f9d088201031050e42eb23d5b2a from qemu
2019-01-05 07:07:49 -05:00
Richard Henderson c348ceba56
tcg: Rename and adjust liveness_pass_1 helpers
No need for a "tcg_" prefix for a static function; we already
have another "la_" prefix for indicating liveness analysis.
Pass in nb_globals and nb_temps, as we will already have them
in registers for other loops within the parent function.

Backports commit 2616c8082143373e794b62444bf81754f50dbf6b from qemu
2019-01-05 07:05:58 -05:00
Richard Henderson b356212b33
tcg: Dump register preference info with liveness
Backports commit 1894f69a612b35c2a39b44a824da04d74bfe324a from qemu
2019-01-05 07:00:21 -05:00
Richard Henderson 494d802781
tcg: Improve register allocation for matching constraints
Try harder to honor the output_pref. When we're forced to allocate
a second register for the input, it does not need to use the input
constraint; that will be honored by the register we allocate for the
output and a move is already required.

Backports commit d62816f2db439b2dd761c674f0256f21d9dd2ed0 from qemu
2019-01-05 06:57:56 -05:00
Richard Henderson 83a7de2566
tcg: Add output_pref to TCGOp
Allocate storage for, but do not yet fill in, per-opcode
preferences for the output operands. Pass it in to the
register allocation routines for output operands.

Backports commit 69e3706d2b473815e382552e729d12590339e0ac from qemu
2019-01-05 06:54:40 -05:00
Richard Henderson 19bde1a9cf
tcg: Add preferred_reg argument to tcg_reg_alloc_do_movi
Pass this through to temp_sync.

Backports commit ba87719cd267e6f07b17f6cda08246bf483146d4 from qemu
2019-01-05 06:51:55 -05:00
Richard Henderson c3aa567b03
tcg: Add preferred_reg argument to temp_sync
Pass this through to tcg_reg_alloc.

Backports commit 98b4e186c1ccb8f1868c61a33a3be8c2b82654f3 from qemu
2019-01-05 06:50:22 -05:00
Richard Henderson 96b6640f3b
tcg: Add preferred_reg argument to temp_load
Pass this through to tcg_reg_alloc.

Backports commit b722452aefb089e003b16946a4d73bad1fd3b79b from qemu
2019-01-05 06:48:19 -05:00
Richard Henderson 5e73b27607
tcg: Add preferred_reg argument to tcg_reg_alloc
This new argument will aid register allocation by indicating how
the temporary will be used in future. If the preference cannot
be satisfied, fall back to the constraints of the current insn.

Short circuit the preference when it cannot be satisfied or if
it does not further constrain the operation.

With an eye toward optimizing function call sequences, optimize
for the preferred_reg set containing a single register.

For the moment, all users pass 0 for preference.

Backports commit b016486e7baddb43cfc1e51909b05cde9cf82e0c from qemu
2019-01-05 06:45:15 -05:00
Richard Henderson 6aea2880d2
tcg: Add reachable_code_pass
Delete trivially dead code that follows unconditional branches and
noreturn helpers. These can occur either via optimization or via
the structure of a target's translator following an exception.

Backports commit b4fc67c7afd2c338d6e7c73a7f428dfe05ae0603 from qemu
2019-01-05 06:41:16 -05:00
Richard Henderson 26ab4d6560
tcg: Reference count labels
Increment when adding branches, and decrement when removing them.

Backports commit d88a117eaa39b1d0eb1a79fe84c81840a39eb233 from qemu
2019-01-05 06:39:20 -05:00
Richard Henderson 80b4bef1cc
tcg: Add TCG_CALL_NO_RETURN
Remember which helpers have been marked noreturn.

Backports commit 15d7409260498505e991e7b9d87118627165e613 from qemu
2019-01-05 06:35:21 -05:00
Richard Henderson 7dbbf58653
tcg: Renumber TCG_CALL_* flags
Previously, the low 4 bits were used for TCG_CALL_TYPE_MASK,
which was removed in 6a18ae2d2947532d5c26439548afa0481c4529f9.

Backports commit 3b50352b05eeafeb95cccd770f7aaba00bbdf6fe from qemu
2019-01-05 06:32:52 -05:00
Lioncash f8435ca3a6
Temporarily disable tcg_debug_assert()
Backporting 6fa2cef205a60b5c5c3b058f53852416b885c455 by Thomas Huth
started invoking assertions on clang. This means Unicorn is doing
something silly. This should be tracked down, but in the meantime,
restore behavior to allow tests to still be run.
2018-12-19 10:50:48 -05:00
Emilio G. Cota 0567c69235
tcg: Drop nargs from tcg_op_insert_{before,after}
It's unused since 75e8b9b7aa0b95a761b9add7e2f09248b101a392.

Backports commit ac1043f6d607aaac206c8aac42bc32f634f59395 from qemu
2018-12-18 06:00:13 -05:00
Alistair Francis 7219548fbd
tcg/mips: Improve the add2/sub2 command to use TCG_TARGET_REG_BITS
Instead of hard coding 31 for the shift right use TCG_TARGET_REG_BITS - 1.

Backports commit 161dec9d1b03552e78e5728186eae9cf1dfbe035 from qemu
2018-12-18 05:58:09 -05:00
Richard Henderson 5c4e852c6e
tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP
For now, defined universally as true, since we previously required
backends to implement swapped memory operations. Future patches
may now remove that support where it is onerous.

Backports commit e1dcf3529d0797b25bb49a20e94b62eb93e7276a from qemu
2018-12-18 05:56:58 -05:00
Richard Henderson fdb3d6488e
tcg/optimize: Optimize bswap
Somehow we forgot these operations, once upon a time.
This will allow immediate stores to have their bswap
optimized away.

Backports commit 6498594c8eda83c5f5915afc34bd03396f8de6df from qemu
2018-12-18 05:49:29 -05:00
Richard Henderson 1bcbdc2f1b
tcg: Clean up generic bswap64
Based on the only current user, Sparc:

New code uses 2 constants that take 2 insns to load from constant pool,
plus 13. Old code used 6 constants that took 1 or 2 insns to create,
plus 21. The result is a new total of 17 vs an old total of 29.

Backports commit 9e821eab0ab708add35fa0446d880086e845ee3e from qemu
2018-12-18 05:48:05 -05:00
Richard Henderson f68b4aa896
tcg: Clean up generic bswap32
Based on the only current user, Sparc:

New code uses 1 constant that takes 2 insns to create, plus 8.
Old code used 2 constants that took 2 insns to create, plus 9.
The result is a new total of 10 vs an old total of 13.

Backports commit a686dc71d89b1d7934becd95c843aa1375cdb7e7 from qemu
2018-12-18 05:46:27 -05:00
Richard Henderson 3b85c29bb9
tcg/i386: Assume 32-bit values are zero-extended
We now have an invariant that all TCG_TYPE_I32 values are
zero-extended, which means that we do not need to extend
them again during qemu_ld/st, either explicitly via a separate
tcg_out_ext32u or implicitly via P_ADDR32.

Backports commit 4810d96f03be4d3820563e3c6bf13dfc0627f205 from qemu
2018-12-18 05:42:52 -05:00
Richard Henderson b7b142ed79
tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests
This preserves the invariant that all TCG_TYPE_I32 values are
zero-extended in the 64-bit host register.

Backports commit 75478279a0c1eafc7b69d5382356da138f58f1bd from qemu
2018-12-18 05:38:55 -05:00
Richard Henderson 4e882a95f3
tcg/i386: Propagate is64 to tcg_out_qemu_ld_slow_path
This helps preserve the invariant that all TCG_TYPE_I32 values
are stored zero-extended in the 64-bit host registers.

Backports commit 3dbc8c61de4e0d0a2afe0897cda7ab28cd37a164 from qemu
2018-12-18 05:36:58 -05:00
Richard Henderson bdd6118105
tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct
This helps preserve the invariant that all TCG_TYPE_I32 values
are stored zero-extended in the 64-bit host registers.

Backports commit 1d21d95b6101786d44d3b4a12400eb80a1ecc647 from qemu
2018-12-18 05:35:34 -05:00
Richard Henderson 7927f3cff5
tcg/s390x: Return false on failure from patch_reloc
This does require an extra two checks within the slow paths
to replace the assert that we're moving. Also add two checks
within existing functions that lacked any kind of assert for
out of range branch.

Backports commit 55dfd8fedceb1311d9cdded1a0f94b2da91a387d from qemu
2018-12-18 05:34:00 -05:00
Richard Henderson 51b802223a
tcg/ppc: Return false on failure from patch_reloc
The reloc_pc{14,24}_val routines retain their asserts.
Use these directly within the slow paths.

Backports commit d5132903518fadad579ef2de9e45fce98eefaa63 from qemu
2018-12-18 05:32:12 -05:00
Richard Henderson 8ecb82062f
tcg/arm: Return false on failure from patch_reloc
This does require an extra two checks within the slow paths
to replace the assert that we're moving.

Backports commit 43fabd30e2f411e8d70ff347902a7c8ed308233e from qemu
2018-12-18 05:30:11 -05:00
Richard Henderson a22387f919
tcg/aarch64: Return false on failure from patch_reloc
This does require an extra two checks within the slow paths
to replace the assert that we're moving.

Backports commit 214bfe83d5a5af70bac2b8d0bd649b018c33c03b from qemu
2018-12-18 05:28:45 -05:00
Richard Henderson fc86fd34ff
tcg/i386: Return false on failure from patch_reloc
Backports commit bec3afd5fc6ab0b6e9d8a01575d58db8d1ad82ce from qemu
2018-12-18 05:27:14 -05:00
Richard Henderson 46189d87b3
tcg: Return success from patch_reloc
This will move the assert for success from within (subroutines of)
patch_reloc into the callers. It will also let new code do something
different when a relocation is out of range.

For the moment, all backends are trivially converted to return true.

Backports commit 6ac1778676f4259c10b0629ccd9df319a5d1baeb from qemu
2018-12-18 05:25:45 -05:00
Richard Henderson 294573899f
tcg/mips: Remove retranslation code
There is no longer a need for preserving branch offset operands,
as we no longer re-translate.

Backports commit 8c1b079279fadaee10dc39ca9a58c4c91c7a1854 from qemu
2018-12-18 05:22:25 -05:00
Richard Henderson ad9aec6f35
tcg/sparc: Remove retranslation code
There is no longer a need for preserving branch offset operands,
as we no longer re-translate.

Backports commit 791645f0227c9d52ce5fe1ad6e1cda55a9bfe633 from qemu
2018-12-18 05:21:50 -05:00
Richard Henderson a124110db4
tcg/s390: Remove retranslation code
There is no longer a need for preserving branch offset operands,
as we no longer re-translate.

Backports commit 3661612fc3e4b65be03482bf6bafd116101881e1 from qemu
2018-12-18 05:21:03 -05:00
Richard Henderson 85485dc20e
tcg/ppc: Fold away noaddr branch routines
There is no longer a need for preserving branch offset operands,
as we no longer re-translate.

Backports commit f9c7246faa279237200a2a53beacaa8100ea1900 from qemu
2018-12-18 05:18:59 -05:00
Richard Henderson b49a353adb
tcg/arm: Fold away noaddr branch routines
There are one use apiece for these. There is no longer a need for
preserving branch offset operands, as we no longer re-translate.

Backports commit 37ee93a974c49ab9edfcd1db0aad3838b0395b14 from qemu
2018-12-18 05:17:22 -05:00
Richard Henderson 1167aa481d
tcg/arm: Remove reloc_pc24_atomic
It is unused since 3fb53fb4d12f2e7833bd1659e6013237b130ef20.

Backports commit 2672ccc7eee742e23928f4bf60a13a77d64f540d from qemu
2018-12-18 05:16:29 -05:00
Richard Henderson 0a8bc142d3
tcg/aarch64: Fold away noaddr branch routines
There are one use apiece for these. There is no longer a need for
preserving branch offset operands, as we no longer re-translate.

Backports commit 733589b3382afcb0ae9f43e72e083a5ddd38abd5 from qemu
2018-12-18 05:15:41 -05:00
Richard Henderson cbe1065e83
tcg/aarch64: Remove reloc_pc26_atomic
It is unused since b68686bd4bfeb70040b4099df993dfa0b4f37b03.

Backports commit 90d6cb781130891f96eb54f8315e29fbd4e99a71 from qemu
2018-12-18 05:14:22 -05:00
Richard Henderson 091b4fa1ff
tcg/i386: Move TCG_REG_CALL_STACK from define to enum
Backports commit 66c0285df4270d184afce5ac8b97ac175c89562f from qemu
2018-12-18 05:13:47 -05:00
Richard Henderson f3a8a4a306
tcg/i386: Always use %ebp for TCG_AREG0
For x86_64, this can remove a REX prefix resulting in smaller code
when manipulating globals of type i32, as we move them between backing
store via cpu_env, aka TCG_AREG0.

Backports commit 5740d9f714835964873325d1210b26811252843f from qemu
2018-12-18 05:13:05 -05:00
Richard Henderson 7ab51fc012
target/sparc: Remove the constant pool
Partially reverts ab20bdc1162. The 14-bit displacement that we
allowed to reach the constant pool is not always sufficient.
Retain the tb-relative addressing, as that is how most return
values from the tb are computed.

Backports commit f6823cbe3787aa47db62deede6683077e3da9a2c from qemu
2018-12-18 05:12:11 -05:00
Thomas Huth 3ba2114043
tcg/tcg.h: Remove GCC check for tcg_debug_assert() macro
Both GCC v4.8 and Clang v3.4 (our minimum versions) support
__builtin_unreachable(), so we can remove the version check here now.

Backports commit 6fa2cef205a60b5c5c3b058f53852416b885c455 from qemu
2018-12-18 03:53:56 -05:00
Peter Maydell 78906db067
tcg/tcg-op.h: Add multiple include guard
The tcg-op.h header was missing the usual guard against multiple
inclusion; add it.

(Spotted by lgtm.com's static analyzer.)

Backports commit a7ce790a029bd94eb320d8c69f38900f5233997e from qemu
2018-11-11 08:51:51 -05:00
Craig Janeczek 58dc377890
target/mips: Introduce MXU registers
Define and initialize the 16 MXU registers - 15 general computational
register, and 1 control register). There is also a zero register, but
it does not have any corresponding variable.

Backports commit eb5559f67dc8dc12335dd996877bb6daaea32eb2 from qemu.
2018-11-11 05:50:52 -05:00
Richard Henderson d74e00a30a
tcg: Split CONFIG_ATOMIC128
GCC7+ will no longer advertise support for 16-byte __atomic operations
if only cmpxchg is supported, as for x86_64. Fortunately, x86_64 still
has support for __sync_compare_and_swap_16 and we can make use of that.
AArch64 does not have, nor ever has had such support, so open-code it.

Backports commit e6cd4bb59b8154fa00da611200beef7eb4e8ec56 from qemu
2018-10-23 15:17:39 -04:00
Emilio G. Cota e5b43d2794
tcg: plug holes in struct TCGProfile
This plugs two 4-byte holes in 64-bit.

Backports commit dd1d7da23b0abef87f46d9ab39ba9b0974eaec04 from qemu
2018-10-23 14:38:16 -04:00
Emilio G. Cota 223975ada0
tcg: fix use of uninitialized variable under CONFIG_PROFILER
We forgot to initialize n in commit 15fa08f845 ("tcg: Dynamically
allocate TCGOps", 2017-12-29).

Backports commit c1f543b739086733024e31d74a52d9e41553f316 from qemu
2018-10-23 14:37:37 -04:00
Richard Henderson e01deeb9ba
tcg: Implement CPU_LOG_TB_NOCHAIN during expansion
Rather than test NOCHAIN before linking, do not emit the
goto_tb opcode at all. We already do this for goto_ptr.

Backports commit d7f425fdea991f052241c6479acd9feae834063b from qemu
2018-10-23 14:35:12 -04:00
Lioncash cc3d618e61
tcg: Remove unnecessary MSVC ifdef
All relevant arrays have at least one member in them now, making this
unnecessary.
2018-10-06 05:08:17 -04:00
Lioncash 766c70f608
arm: Move cpu_M0 to DisasContext 2018-10-06 03:32:39 -04:00
Lioncash 787fd448b1
arm: Move cpu_V1 to DisasContext 2018-10-06 03:28:42 -04:00
Lioncash 1aa20da917
arm: Move cpu_V0 to DisasContext 2018-10-06 03:26:52 -04:00
Lioncash 06c21baaa4
arm: Move cpu_F1d to DisasContext 2018-10-06 03:11:54 -04:00
Lioncash 5f3dd68f9c
arm: Move cpu_F0d to DisasContext 2018-10-06 03:07:42 -04:00
Lioncash e457ce8ccc
arm: Move cpu_F1s to DisasContext 2018-10-06 03:02:06 -04:00
Lioncash 97a5955a2a
tcg: Remove leftover unused variable from TCGContext
This was previously used by the i386 target, however all of the locals
were moved to the DisasContext struct, leaving this unused.
2018-10-06 02:46:27 -04:00
Emilio G. Cota b9bb6cead9
target/i386: move x86_64_hregs to DisasContext
And convert it to a bool to use an existing hole
in the struct.

Backports commit 1dbe15ef57abdf7b6a26c8e638abf6413a4b9d0c from qemu
2018-10-04 04:02:50 -04:00
Emilio G. Cota 04530acab2
target/i386: move cpu_tmp3_i32 to DisasContext
Backports commit 4f82446de695f080ed148a0e47fc141e928665af from qemu
2018-10-04 03:56:05 -04:00
Emilio G. Cota 781e6bde41
target/i386: move cpu_tmp2_i32 to DisasContext
Backports commit 6bd48f6f206b6f32a5bbeebc3ae6886d4f587981 from qemu
2018-10-04 03:53:31 -04:00
Emilio G. Cota c13337d1bc
target/i386: move cpu_ptr1 to DisasContext
Backports commit 6387e8303ffb26cfb40b0f93372f1519229b4d2c from qemu
2018-10-04 03:48:09 -04:00
Emilio G. Cota 3e442d4480
target/i386: move cpu_ptr0 to DisasContext
Backports commit 2ee2646491a293a92d1c85e90e12419a8c199ed0 from qemu
2018-10-04 03:46:53 -04:00
Emilio G. Cota cc872aa711
target/i386: move cpu_tmp4 to DisasContext
Backports commit 5022f28f1e4033eb369b744ad61b96d086beca1b from qemu
2018-10-04 03:45:28 -04:00
Emilio G. Cota d2752ebc42
target/i386: move cpu_tmp0 to DisasContext
Backports commit fbd80f02df3fe272ba0f4825df27b8459dafbc14 from qemu
2018-10-04 03:41:13 -04:00
Emilio G. Cota b704b6c205
target/i386: move cpu_T1 to DisasContext
Backports commit b48597b0eda32d4c7ade2ba3f98f06f62289e3e2 from qemu
2018-10-04 03:35:10 -04:00
Emilio G. Cota 70b327dc82
target/i386: move cpu_T0 to DisasContext
Backports commit c66f97273f677d76afaaeb0e688eb08499701b1b from qemu
2018-10-04 03:29:13 -04:00
Emilio G. Cota c1d70758ea
target/i386: move cpu_A0 to DisasContext
Backports commit 6b672b5d6b14422c131969c5725f738751e12847 from qemu
2018-10-04 01:16:35 -04:00
Emilio G. Cota 30c66bcca3
target/i386: move cpu_cc_srcT to DisasContext
Backports commit 93a3e108eb6a9bb781ab7db6e92d91528e482030 from qemu
2018-10-04 00:59:00 -04:00
Roman Kapl 33e69342e3
tcg/i386: fix vector operations on 32-bit hosts
The TCG backend uses LOWREGMASK to get the low 3 bits of register numbers.
This was defined as no-op for 32-bit x86, with the assumption that we have
eight registers anyway. This assumption is not true once we have xmm regs.

Since LOWREGMASK was a no-op, xmm register indidices were wrong in opcodes
and have overflown into other opcode fields, wreaking havoc.

To trigger these problems, you can try running the "movi d8, #0x0" AArch64
instruction on 32-bit x86. "vpxor %xmm0, %xmm0, %xmm0" should be generated,
but instead TCG generated "vpxor %xmm0, %xmm0, %xmm2".

Fixes: 770c2fc7bb ("Add vector operations")

Backports commit 93bf9a42733321fb632bcb9eafd049ef0e3d9417 from qemu
2018-10-02 04:22:35 -04:00
Richard Henderson 9e8c8a617b
tcg/optimize: Do not skip default processing of dup_vec
If we do not opimize away dup_vec, we must mark its output as changed.

Backports commit 1fb57da72ae0886eba1234a2d98ddd10e88a9efc from qemu
2018-08-09 00:53:07 -04:00
Richard Henderson a4c2dbef3e
tcg/i386: Mark xmm registers call-clobbered
When host vector registers and operations were introduced, I failed
to mark the registers call clobbered as required by the ABI.

Fixes: 770c2fc7bb7

Backports commit 672189cd586ea38a2c1d8ab91eb1f9dcff5ceb05 from qemu
2018-07-23 20:00:26 -04:00
Alex Bennée 11948dd1cc
tcg/aarch64: limit mul_vec size
In AdvSIMD we can only do 32x32 integer multiples although SVE is
capable of larger 64 bit multiples. As a result we can end up
generating invalid opcodes. Fix this by only reprting we can emit
mul vector ops if the size is small enough.

Fixes a crash on:

sve-all-short-v8.3+sve@vq3/insn_mul_z_zi___INC.risu.bin

When running on AArch64 hardware.

Backports commit e65a5f227d77a5dbae7a7123c3ee915ee4bd80cf from qemu
2018-07-21 14:15:59 -04:00
Richard Henderson f1aaf5be62
tcg: Restrict check_size_impl to multiples of the line size
Normally this is automatic in the size restrictions that are placed
on vector sizes coming from the implementation. However, for the
legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final
24 bytes of the vector register. Without this check, do_dup selects
TCG_TYPE_V128 and clears only 16 bytes.

Backports commit 499748d7683198a765d17b4fdf6901ab9dca920c from qemu
2018-07-09 16:41:53 -04:00
John Arbuckle 22c3206738
tcg/i386: Use byte form of xgetbv instruction
The assembler in most versions of Mac OS X is pretty old and does not
support the xgetbv instruction. To go around this problem, the raw
encoding of the instruction is used instead.

Backports commit 1019242af11400252f6735ca71a35f81ac23a66d from qemu
2018-06-28 13:23:32 -05:00
Richard Henderson 10e2b13650
tcg: Pass tb and index to tcg_gen_exit_tb separately
Do the cast to uintptr_t within the helper, so that the compiler
can type check the pointer argument. We can also do some more
sanity checking of the index argument.

Backports commit 07ea28b41830f946de3841b0ac61a3413679feb9 from qemu
2018-06-07 11:56:32 -04:00
Emilio G. Cota 7e8902eccc
tcg: fix s/compliment/complement/ typos
Backports commit 1d349821551c2da4dfefe36c6ac17319f33ebbd5 from qemu
2018-05-22 00:29:51 -04:00
Richard Henderson de1708aadc
tcg: Introduce atomic helpers for integer min/max
Given that this atomic operation will be used by both risc-v
and aarch64, let's not duplicate code across the two targets.

Backports commit 5507c2bf35aa6b4705939349184e71afd5e058b2 from qemu
2018-05-14 08:06:42 -04:00
Richard Henderson eef66443b2
tcg: Introduce helpers for integer min/max
These operations are re-invented by several targets so far.
Several supported hosts have insns for these, so place the
expanders out-of-line for a future introduction of tcg opcodes.

Backports commit b87fb8cd9f9a0ba599ff79e7bf03222da02e5724 from qemu
2018-05-14 07:31:50 -04:00
Richard Henderson f417df19b7
tcg: Limit the number of ops in a TB
In 6001f7729e12 we partially attempt to address the branch
displacement overflow caused by 15fa08f845.

However, gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqtbX.c
is a testcase that contains a TB so large as to overflow anyway.
The limit here of 8000 ops produces a maximum output TB size of
24112 bytes on a ppc64le host with that test case. This is still
much less than the maximum forward branch distance of 32764 bytes.

Backports commit abebf92597186be2bc48d487235da28b1127860f from qemu
2018-05-11 11:25:01 -04:00
Richard Henderson 33f7f6f09a
tcg/i386: Fix dup_vec in non-AVX2 codepath
The VPUNPCKLD* instructions are all "non-destructive source",
indicated by "NDS" in the encoding string in the x86 ISA manual.
This means that they take two source operands, one of which is
encoded in the VEX.vvvv field. We were incorrectly treating them
as if they were destructive-source and passing 0 as the 'v'
argument of tcg_out_vex_modrm(). This meant we were always
using %xmm0 as one of the source operands, causing incorrect
results if the register allocator happened to want to use
something else. For instance the input AArch64 insn:
DUP v26.16b, w21
which becomes TCG IR ops:
dup_vec v128,e8,tmp2,x21
st_vec v128,e8,tmp2,env,$0xa40
was assembled to:
0x607c568c: c4 c1 7a 7e 86 e8 00 00 vmovq 0xe8(%r14), %xmm0
0x607c5694: 00
0x607c5695: c5 f9 60 c8 vpunpcklbw %xmm0, %xmm0, %xmm1
0x607c5699: c5 f9 61 c9 vpunpcklwd %xmm1, %xmm0, %xmm1
0x607c569d: c5 f9 70 c9 00 vpshufd $0, %xmm1, %xmm1
0x607c56a2: c4 c1 7a 7f 8e 40 0a 00 vmovdqu %xmm1, 0xa40(%r14)
0x607c56aa: 00

when the vpunpcklwd insn should be "%xmm1, %xmm1, %xmm1".
This resulted in our incorrectly setting the output vector to
q26=0000320000003200:0000320000003200
when given an input of x21 == 0000000002803200
rather than the expected all-zeroes.

Pass the correct source register number to tcg_out_vex_modrm()
for these insns.

Backports commit 7eb30ef0ba2eb59e7430d4848ae8d4bf4e50f768 from qemu
2018-05-11 11:22:38 -04:00
Laurent Vivier ec12091943
tcg: workaround branch instruction overflow in tcg_out_qemu_ld/st
ppc64 uses a BC instruction to call the tcg_out_qemu_ld/st
slow path. BC instruction uses a relative address encoded
on 14 bits.

The slow path functions are added at the end of the generated
instructions buffer, in the reverse order of the callers.
So more we have slow path functions more the distance between
the caller (BC) and the function increases.

This patch changes the behavior to generate the functions in
the same order of the callers.

Backports commit 6001f7729e12dd1d810291e4cbf83cee8e07441d from qemu
2018-05-03 15:09:07 -04:00
Richard Henderson 2150745db4
tcg: Improve TCGv_ptr support
Drop TCGV_PTR_TO_NAT and TCGV_NAT_TO_PTR internal macros.

Add tcg_temp_local_new_ptr, tcg_gen_brcondi_ptr, tcg_gen_ext_i32_ptr,
tcg_gen_trunc_i64_ptr, tcg_gen_extu_ptr_i64, tcg_gen_trunc_ptr_i32.

Use inlines instead of macros where possible.

Backports commit 5bfa803448638a45542441fd6b7cc1241403ea72 from qemu
2018-05-03 15:05:43 -04:00
Richard Henderson 4fa9ea2ae1
tcg: Allow wider vectors for cmp and mul
In db432672, we allow wide inputs for operations such as add.
However, in 212be173 and 3774030a we didn't do the same for
compare and multiply.

Backports commit 9a938d86b04025ac605db0ea9819e5896bf576ec from qemu
2018-05-03 14:42:57 -04:00