The allows immediates to be used for ORR and BIC,
as well as the trivial inversions, ORC and AND.
Backports commit 9e27f58b9902834dffc0d66d9eb62f78d9c2a632 from qemu
Use MOVI+ORR or MVNI+BIC in order to build some vector constants,
as opposed to dropping them to the constant pool. This includes
all 16-bit constants and a similar set of 32-bit constants.
Backports commit 02f3a5b4744885258758d07ebe09cf965de78bcf from qemu
The compliment of a subset of immediates can be computed
with a single instruction.
Backports commit 7e308e003e5b6ddd3130e09711e1d33693230696 from qemu
There are several sub-classes of vector immediate, and only MOVI
can use them all. This will enable usage of MVNI and ORRI, which
use progressively fewer sub-classes.
This patch adds no new functionality, merely splits the function
and moves part of the logic into tcg_out_dupi_vec.
Backports commit 984fdcee342473dfe797897758929dad654693c8 from qemu
The instruction set has 3 insns that perform the same operation,
only varying in which operand must overlap the destination. We
can represent the operation without overlap and choose based on
the operands seen.
Backports commit a9e434a5dc16f71ee156428619fc3c3765b68f26 from qemu
Allow the backend to expand dup from memory directly, instead of
forcing the value into a temp first. This is especially important
if integer/vector register moves do not exist.
Note that officially tcg_out_dupm_vec is allowed to fail.
If it did, we could fix this up relatively easily:
VECE == 32/64:
Load the value into a vector register, then dup.
Both of these must work.
VECE == 8/16:
If the value happens to be at an offset such that an aligned
load would place the desired value in the least significant
end of the register, go ahead and load w/garbage in high bits.
Load the value w/INDEX_op_ld{8,16}_i32.
Attempt a move directly to vector reg, which may fail.
Store the value into the backing store for OTS.
Load the value into the vector reg w/TCG_TYPE_I32, which must work.
Duplicate from the vector reg into itself, which must work.
All of which is well and good, except that all supported
hosts can support dupm for all vece, so all of the failure
paths would be dead code and untestable.
Backports commit 37ee55a081b7863ffab2151068dd1b2f11376914 from qemu
The LD1R instruction does all the work. Note that the only
useful addressing mode is a base register with no offset.
Backports commit f23e5e15edfd49d5dd72cab2ed2d85ac354b2eeb from qemu
This case is similar to INDEX_op_mov_* in that we need to do
different things depending on the current location of the source.
Backports commit bab1671f0fa928fd678a22f934739f06fd5fd035 from qemu
The i386 backend already has these functions, and the aarch64 backend
could easily split out one. Nothing is done with these functions yet,
but this will aid register allocation of INDEX_op_dup_vec in a later patch.
Adjust the aarch64 tcg_out_dupi_vec signature to match the new interface.
Backports commit e7632cfa8b76cdbbc1c76e8737338ef5844e7d60 from qemu
This patch merely changes the interface, aborting on all failures,
of which there are currently none.
Backports commit 78113e83e0007e869c9f0cb4c0497a77538988e3 from qemu
This does require an extra two checks within the slow paths
to replace the assert that we're moving.
Backports commit 214bfe83d5a5af70bac2b8d0bd649b018c33c03b from qemu
This will move the assert for success from within (subroutines of)
patch_reloc into the callers. It will also let new code do something
different when a relocation is out of range.
For the moment, all backends are trivially converted to return true.
Backports commit 6ac1778676f4259c10b0629ccd9df319a5d1baeb from qemu
There are one use apiece for these. There is no longer a need for
preserving branch offset operands, as we no longer re-translate.
Backports commit 733589b3382afcb0ae9f43e72e083a5ddd38abd5 from qemu
In AdvSIMD we can only do 32x32 integer multiples although SVE is
capable of larger 64 bit multiples. As a result we can end up
generating invalid opcodes. Fix this by only reprting we can emit
mul vector ops if the size is small enough.
Fixes a crash on:
sve-all-short-v8.3+sve@vq3/insn_mul_z_zi___INC.risu.bin
When running on AArch64 hardware.
Backports commit e65a5f227d77a5dbae7a7123c3ee915ee4bd80cf from qemu
It's not even clear what the interface REG and VAL32 were supposed to mean.
All uses had REG = 0 and VAL32 was the bitset assigned to the destination.
Backports commit f46934df662182097dce07d57ec00f37e4d2abf1 from qemu
Dispense with TCGBackendData, as it has never been used for more than
holding a single pointer. Use a define in the cpu/tcg-target.h to
signal requirement for TCGLabelQemuLdst, so that we can drop the no-op
tcg-be-null.h stubs. Rename tcg-be-ldst.h to tcg-ldst.inc.c.
Backports commit 659ef5cbb893872d25e9d95191cc23b16546c8a1 from qemu
Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump
boolean test. Replace the tb_set_jmp_target1 ifdef with an unconditional
function tb_target_set_jmp_target.
While we're touching all backends, add a parameter for tb->tc_ptr;
we're going to need it shortly for some backends.
Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c.
Backports commit a85833933628384d74ec412024d55cf012640287 from qemu
This patch enables the indirect jump path using an LDR (literal)
instruction. It will be interesting to test and see which performs
better among the two paths.
Backports commit 2acee8b2b5e6bba2935bb6ce5be92d0f0f9799cb from qemu
We use ADRP+ADD to compute the target address for goto_tb. This patch
introduces the NOP instruction which is used to align the above
instruction pair so that we can use one atomic instruction to patch
the destination offsets.
Backports commit b68686bd4bfeb70040b4099df993dfa0b4f37b03 from qemu
We can use a branch to register instruction for exit_tb for offsets
greater than 128MB.
Backports commit 23b7aa1d2af04ba57cc94f74d9f0ab25dce72fa0 from qemu
The new placement of the TB means that we can use one insn
to load the return value for exit_tb returning the TB pointer.
Backports commit cc74d332ff9a78684374847375ef63fc4bd10436 from qemu
To fix the following warnings:
In file included from /users/pranith/qemu/tcg/tcg.c:255:
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:879:24: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType')
[-Wenum-conversion]
tcg_out_cmp(s, ext, a, b, b_const);
~~~~~~~~~~~ ^~~
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:893:36: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType')
[-Wenum-conversion]
tcg_out_insn(s, 3201, CBZ, ext, a, offset);
~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:389:65: note: expanded from macro 'tcg_out_insn'
glue(tcg_out_insn_,FMT)(S, glue(glue(glue(I,FMT),_),OP), ## __VA_ARGS__)
^
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:895:37: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType')
[-Wenum-conversion]
tcg_out_insn(s, 3201, CBNZ, ext, a, offset);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:389:65: note: expanded from macro 'tcg_out_insn'
glue(tcg_out_insn_,FMT)(S, glue(glue(glue(I,FMT),_),OP), ## __VA_ARGS__)
^
/users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:1610:27: warning: implicit conversion from enumeration type 'TCGType' (aka 'enum TCGType') to different enumeration type 'TCGMemOp' (aka 'enum TCGMemOp')
[-Wenum-conversion]
tcg_out_brcond(s, ext, a2, a0, a1, const_args[1], arg_label(args[3]));
~~~~~~~~~~~~~~ ^~~
backports commit dc1eccd661ada3b746ca4438e444993c36a0f04f from qemu
This will let us choose how to interpret a given constraint
depending on whether the opcode is 32- or 64-bit. Which will
let us share more constraint combinations between opcodes.
At the same time, change the interface to return the advanced
pointer instead of passing it in/out by reference.
Backports commit 069ea736b50b75fdec99c9b8cc603b97bd98419e from qemu
This will allow the target to tailor the constraints to the
auto-detected ISA extensions.
Backports commit f69d277ece43c42c7ab0144c2ff05ba740f6706b from qemu
There were some patterns, like 0x0000_ffff_ffff_00ff, for which we
would select to begin a multi-insn sequence with MOVN, but would
fail to set the 0x0000 lane back from 0xffff.
Backports commit 50b468d42107a2c646b1c566ed17d9ec362c51c4 from qemu
When al == xzr, we cannot use addi/subi because that encodes xsp.
Force a zero into the temp register for that (rare) case.
Backports commit 028fbea47713f909d6ea761a457779a82b276247 from qemu
Previously we allowed fully unaligned operations, but not operations
that are aligned but with less alignment than the operation size.
In addition, arm32, ia64, mips, and sparc had been omitted from the
previous overalignment patch, which would have led to that alignment
being enforced.
Backports commit 85aa80813dd9f5c1f581c743e45678a3bee220f8 from qemu
Some architectures (e.g. ARMv8) need the address which is aligned
to a size more than the size of the memory access.
To support such check it's enough the current costless alignment
check implementation in QEMU, but we need to support
an alignment size specifying.
Backports commit 1f00b27f17518a1bcb4cedca49eaec96a4d560bd from qemu
While we can store constants via constrants on INDEX_op_st_i32 et al,
we weren't able to spill constants to backing store.
Add a new backend interface, tcg_out_sti, which may store the constant
(and is allowed to fail). Rearrange the temp_* helpers so that we only
attempt to directly store a constant when the temp is becoming dead/free.
Backports commit 59d7c14eeff8d2ad7f61aed86ce5a176113bc153 from qemu
Briefly describe in a comment how direct block chaining is done. It
should help in understanding of the following data fields.
Rename some fields in TranslationBlock and TCGContext structures to
better reflect their purpose (dropping excessive 'tb_' prefix in
TranslationBlock but keeping it in TCGContext):
tb_next_offset => jmp_reset_offset
tb_jmp_offset => jmp_insn_offset
tb_next => jmp_target_addr
jmp_next => jmp_list_next
jmp_first => jmp_list_first
Avoid using a magic constant as an invalid offset which is used to
indicate that there's no n-th jump generated.
Backports commit f309101c26b59641fc1aa8fb2a98a5441cdaea03 from qemu
Ensure direct jump patching in AArch64 is atomic by using
atomic_read()/atomic_set() for code patching.
Backports commit 9e269112953be4d670cb0d25042bd6546fcf3e45 from qemu
The TCG code is quite performance sensitive, but at the same time can
also be quite tricky. That is why asserts that can be enabled with the
--enable-debug-tcg configure option.
This used to work the following way:
| #include "config.h"
|
| ...
|
| #if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG)
| /* define it to suppress various consistency checks (faster) */
| #define NDEBUG
| #endif
|
| ...
|
| #include <assert.h>
Since commit 757e725b (tcg: Clean up includes) "config.h" as been
replaced by "qemu/osdep.h" which itself includes <assert.h>. As a
consequence the assertions are always enabled, even when using
--disable-debug-tcg, causing a performance regression, especially on
targets with many registers. For instance on qemu-system-ppc the
speed difference is about 15%.
tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and already
uses in some places. This patch replaces all the calls to assert into
calss to tcg_debug_assert.
Backports commit eabb7b91b36b202b4dac2df2d59d698e3aff197a from qemu