In arm_tr_init_disas_context() we have a FIXME comment that suggests
"cpu_M0 can probably be the same as cpu_V0". This isn't in fact
possible: cpu_V0 is used as a temporary inside gen_iwmmxt_shift(),
and that function is called in various places where cpu_M0 contains a
live value (i.e. between gen_op_iwmmxt_movq_M0_wRn() and
gen_op_iwmmxt_movq_wRn_M0() calls). Remove the comment.
We also have a comment on the declarations of cpu_V0/V1/M0 which
claims they're "for efficiency". This isn't true with modern TCG, so
replace this comment with one which notes that they're only used with
the iwmmxt decode
Backports 8b4c9a50dc9531a729ae4b5941d287ad0422db48
The 32 vector registers will be viewed as a continuous memory block.
It avoids the convension between element index and (regno, offset).
Thus elements can be directly accessed by offset from the first vector
base address.
Backports ad9e5aa2ae8032f19a8293b6b8f4661c06167bf0 from qemu
No host backend support yet, but the interfaces for rotls
are in place. Only implement left-rotate for now, as the
only known use of vector rotate by scalar is s390x, so any
right-rotate would be unused and untestable.
Backports commit 23850a74afb641102325b4b7f74071d929fc4594 from qemu
No host backend support yet, but the interfaces for rotlv
and rotrv are in place.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v3: Drop the generic expansion from rot to shift; we can do better
for each backend, and then this code becomes unused.
Backports commit 5d0ceda902915e3f0e21c39d142c92c4e97c3ebb from qemu
No host backend support yet, but the interfaces for rotli
are in place. Canonicalize immediate rotate to the left,
based on a survey of architectures, but provide both left
and right shift interfaces to the translators.
Backports commit b0f7e7444c03da17e41bf327c8aea590104a28ab from qemu
Preparation for collapsing the two byte swaps, adjust_endianness and
handle_bswap, along the I/O path.
Target dependant attributes are conditionalized upon NEED_CPU_H.
Backports commit 14776ab5a12972ea439c7fb2203a4c15a09094b4 from qemu
This patch moves the define of target access alignment earlier from
target/foo/cpu.h to configure.
Suggested in Richard Henderson's reply to "[PATCH 1/4] tcg: TCGMemOp is now
accelerator independent MemOp"
Backports commit 52bf9771fdfce98e98cea36a17a18915be6f6b7f from qemu
Perform a per-element conditional move. This combination operation is
easier to implement on some host vector units than plain cmp+bitsel.
Omit the usual gvec interface, as this is intended to be used by
target-specific gvec expansion call-backs.
Backports commit f75da2988eb2457fa23d006d573220c5c680ec4e from qemu
This operation performs d = (b & a) | (c & ~a), and is present
on a majority of host vector units. Include gvec expanders.
Backports commit 38dc12947ec9106237f9cdbd428792c985cd86ae from qemu
Allow the backend to expand dup from memory directly, instead of
forcing the value into a temp first. This is especially important
if integer/vector register moves do not exist.
Note that officially tcg_out_dupm_vec is allowed to fail.
If it did, we could fix this up relatively easily:
VECE == 32/64:
Load the value into a vector register, then dup.
Both of these must work.
VECE == 8/16:
If the value happens to be at an offset such that an aligned
load would place the desired value in the least significant
end of the register, go ahead and load w/garbage in high bits.
Load the value w/INDEX_op_ld{8,16}_i32.
Attempt a move directly to vector reg, which may fail.
Store the value into the backing store for OTS.
Load the value into the vector reg w/TCG_TYPE_I32, which must work.
Duplicate from the vector reg into itself, which must work.
All of which is well and good, except that all supported
hosts can support dupm for all vece, so all of the failure
paths would be dead code and untestable.
Backports commit 37ee55a081b7863ffab2151068dd1b2f11376914 from qemu
Replace the single opcode in .opc with a null-terminated
array in .opt_opc. We still require that all opcodes be
used with the same .vece.
Validate the contents of this list with CONFIG_DEBUG_TCG.
All tcg_gen_*_vec functions will check any list active
during .fniv expansion. Swap the active list in and out
as we expand other opcodes, or take control away from the
front-end function.
Convert all existing vector aware front ends.
Backports commit 53229a7703eeb2bbe101a19a33ef22aaf960c65b from qemu
Thereby decoupling the resulting translated code from the current state
of the system.
The tb->cflags field is not passed to tcg generation functions. So
we add a field to TCGContext, storing there a copy of tb->cflags.
Most architectures have <= 32 registers, which results in a 4-byte hole
in TCGContext. Use this hole for the new field.
Backports commit e82d5a2460b0e176128027651ff9b104e4bdf5cc from qemu
If the TB generates too much code, such that backend relocations
overflow, try again with a smaller TB. In support of this, move
relocation processing from a random place within tcg_out_op, in
the handling of branch opcodes, to a new function at the end of
tcg_gen_code.
This is not a complete solution, as there are additional relocs
generated for out-of-line ldst handling and constant pools.
Backports commit 7ecd02a06f8f4c0bbf872ecc15e37035b7e1df5f from qemu
This ports over the RISC-V architecture from Qemu. This is currently a
very barebones transition. No code hooking or any fancy stuff.
Currently, you can feed it instructions and query the CPU state itself.
This also allows choosing whether or not RISC-V 32-bit or RISC-V 64-bit
is desirable through Unicorn's interface as well.
Extremely basic examples of executing a single instruction have been
added to the samples directory to help demonstrate how to use the basic
functionality.
Completely rewrite conditional stores handling. Use cmpxchg.
This eliminates need for separate implementations of SC instruction
emulation for user and system emulation.
Backports commit 33a07fa2db66376e6ee780d4a8b064dc5118cf34 from qemu
Currently, a jump to a label that is not defined anywhere will
be emitted not be relocated. This results in a jump to a random
jump target. With tcg debugging, print a diagnostic to the -d op
file and abort.
This could help debug or detect errors like
c2d9644e6d ("target/arm: Fix crash on conditional instruction in an IT block")
Backports commit bef16ab4e641636b4e85c3d863b4257ce0be4e6f from qemu
The 32 R5900 128-bit registers are split into two 64-bit halves:
the lower halves are the GPRs and the upper halves are accessible
by the R5900-specific multimedia instructions.
Backports commit a168a796e1c251787fcdf2d9ca1e9e69cb86ffcd from qemu
Use this to notice the opcodes that exit the TB, which implies
that local temps are really dead and need not be synced.
Previously we so marked the true end of the TB, but that was
immediately overwritten by the la_bb_end invoked by any
TCG_OPF_BB_END opcode, like exit_tb.
Backports commit ae36a246ed1a0e96c6c4f478f03d047dfa3a8898 from qemu
Allocate storage for, but do not yet fill in, per-opcode
preferences for the output operands. Pass it in to the
register allocation routines for output operands.
Backports commit 69e3706d2b473815e382552e729d12590339e0ac from qemu
Previously, the low 4 bits were used for TCG_CALL_TYPE_MASK,
which was removed in 6a18ae2d2947532d5c26439548afa0481c4529f9.
Backports commit 3b50352b05eeafeb95cccd770f7aaba00bbdf6fe from qemu
Backporting 6fa2cef205a60b5c5c3b058f53852416b885c455 by Thomas Huth
started invoking assertions on clang. This means Unicorn is doing
something silly. This should be tracked down, but in the meantime,
restore behavior to allow tests to still be run.
Both GCC v4.8 and Clang v3.4 (our minimum versions) support
__builtin_unreachable(), so we can remove the version check here now.
Backports commit 6fa2cef205a60b5c5c3b058f53852416b885c455 from qemu
Define and initialize the 16 MXU registers - 15 general computational
register, and 1 control register). There is also a zero register, but
it does not have any corresponding variable.
Backports commit eb5559f67dc8dc12335dd996877bb6daaea32eb2 from qemu.
GCC7+ will no longer advertise support for 16-byte __atomic operations
if only cmpxchg is supported, as for x86_64. Fortunately, x86_64 still
has support for __sync_compare_and_swap_16 and we can make use of that.
AArch64 does not have, nor ever has had such support, so open-code it.
Backports commit e6cd4bb59b8154fa00da611200beef7eb4e8ec56 from qemu