Commit graph

104 commits

Author SHA1 Message Date
Eduardo Habkost bc86f4377c cpu: Move debug_excp_handler to tcg_ops
Backports e9ce43e97a19090ae8975ef168b95ba3d29be991
2021-03-04 17:05:57 -05:00
Eduardo Habkost 76a10fa8e0 cpu: Move tlb_fill to tcg_ops
Backports e124536f37377cff5d68925d4976ad604d0ebf3a
2021-03-04 17:01:55 -05:00
Eduardo Habkost 03cc62e39c cpu: Move cpu_exec_* to tcg_ops
Backports 48c1a3e303b5a2cca48679645ad3fbb914db741a
2021-03-04 16:56:55 -05:00
Eduardo Habkost eb38ac1809 cpu: Move synchronize_from_tb() to tcg_ops
Backports ec62595bab1873c48a34849de70011093177e769
2021-03-04 16:48:27 -05:00
Richard Henderson 5fb8ab10eb tcg: Restart code generation when we run out of temps
Some large translation blocks can generate so many unique
constants that we run out of temps to hold them. In this
case, longjmp back to the start of code generation and
restart with a smaller translation block.

Backports ae30e86661b0f48562cd95918d37cbeec5d0226
2021-03-04 15:37:05 -05:00
Richard Henderson 541ef541ae tcg/optimize: Use tcg_constant_internal with constant folding
Backport 8fe35e0444be88de4e3ab80a2a0e210a1f6d663d
2021-03-04 12:15:58 -05:00
Richard Henderson 4ccadaf6cf tcg: Use memset for large vector byte replication
In f47db80cc07, we handled odd-sized tail clearing for
the case of hosts that have vector operations, but did
not handle the case of hosts that do not have vector ops.

This was ok until e2e7168a214b, which changed the encoding
of simd_desc such that the odd sizes are impossible.

Add memset as a tcg helper, and use that for all out-of-line
byte stores to vectors. This includes, but is not limited to,
the tail clearing operation in question.

Backports 6d3ef04893bdea3e7aa08be3cce5141902836a31
2021-03-03 19:28:15 -05:00
Philippe Mathieu-Daudé 4a0f9846b2 accel/tcg: Remove special case for GCC < 4.6
Since commit efc6c070aca ("configure: Add a test for the
minimum compiler version") the minimum compiler version
required for GCC is 4.8.

We can safely remove the special case for GCC 4.6 introduced
in commit 0448f5f8b81 ("cpu-exec: Fix compiler warning
(-Werror=clobbered)").
No change for Clang as we don't know.

Backports 19a84318c674c157f1b04c5c99595379f8ac8bb3
2021-03-03 19:15:07 -05:00
Richard Henderson 94b0876f15 target/arm: Add sve infrastructure for page lookup
For contiguous predicated memory operations, we want to
minimize the number of tlb lookups performed. We have
open-coded this for sve_ld1_r, but for correctness with
MTE we will need this for all of the memory operations.

Create a structure that holds the bounds of active elements,
and metadata for two pages. Add routines to find those
active elements, lookup the pages, and run watchpoints
for those pages.

Temporarily mark the functions unused to avoid Werror.

Backports commit b4cd95d2f4c7197b844f51b29871d888063ea3e7 from qemu
2021-02-25 20:28:23 -05:00
Richard Henderson 2e03f74a53 target/arm: Use cpu_*_data_ra for sve_ldst_tlb_fn
Use the "normal" memory access functions, rather than the
softmmu internal helper functions directly.

Since fb901c9, cpu_mem_index is now a simple extract
from env->hflags and not a large computation.  Which means
that it's now more work to pass around this value than it
is to recompute it.

This only adjusts the primitives, and does not clean up
all of the uses within sve_helper.c.
2021-02-25 20:16:38 -05:00
Richard Henderson 5b3ddcf2e2 target/arm: Simplify DC_ZVA
Now that we know that the operation is on a single page,
we need not loop over pages while probing.

Backports commit e26d0d226892f67435cadcce86df0ddfb9943174 from qemu
2021-02-25 15:55:46 -05:00
Sunho Kim d56c79776e unicorn: fix uc_emu_start until if end instruction is in another tlb 2020-08-05 03:18:51 +09:00
Richard Henderson be78062fd8 tcg: Implement gvec support for rotate by vector
No host backend support yet, but the interfaces for rotlv
and rotrv are in place.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v3: Drop the generic expansion from rot to shift; we can do better
for each backend, and then this code becomes unused.

Backports commit 5d0ceda902915e3f0e21c39d142c92c4e97c3ebb from qemu
2020-06-14 21:43:46 -04:00
Richard Henderson 5cce52a04b tcg: Implement gvec support for rotate by immediate
No host backend support yet, but the interfaces for rotli
are in place. Canonicalize immediate rotate to the left,
based on a survey of architectures, but provide both left
and right shift interfaces to the translators.

Backports commit b0f7e7444c03da17e41bf327c8aea590104a28ab from qemu
2020-06-14 21:26:58 -04:00
Richard Henderson 6530d6342f softfloat: Use post test for floatN_mul
The existing f{32,64}_addsub_post test, which checks for zero
inputs, is identical to f{32,64}_mul_fast_test. Which means
we can eliminate the fast_test/fast_op hooks in favor of
reusing the same post hook.

This means we have one fewer test along the fast path for multiply.

Backports commit b240c9c497b9880ac0ba29465907d5ebecd48083 from qemu
2020-05-21 17:24:00 -04:00
Richard Henderson c6f644ea0c tcg: Remove softmmu code_gen_buffer fixed address
The commentary talks about "in concert with the addresses
assigned in the relevant linker script", except there is no
linker script for softmmu, nor has there been for some time.

(Do not confuse the user-only linker script editing that was
removed in the previous patch, because user-only does not
use this code_gen_buffer allocation method.)

Backports commit 64547a3bb6c92781372994e4e9b25a89f6c88074 from qemu
2020-04-30 07:03:06 -04:00
Richard Henderson 12d7126443 tcg: Remove tcg-runtime-gvec.c DO_CMP0
Partial cleanup from the CONFIG_VECTOR16 removal.
Replace DO_CMP0 with its scalar expansion, a simple negation.

Backports commit 0270bd503e3699b7202200a2d693ad1feb57473f from qemu
2020-04-30 06:43:07 -04:00
Richard Henderson 6ab9bca262 tcg: Tidy tcg-runtime-gvec.c DUP*
Partial cleanup from the CONFIG_VECTOR16 removal.
Replace the DUP* expansions with the scalar argument.

Backports commit 0a83e43a9ee624b44da61514db9b77d86e74e8c2 from qemu
2020-04-30 06:41:48 -04:00
Richard Henderson 5b14fc103a tcg: Tidy tcg-runtime-gvec.c types
Partial cleanup from the CONFIG_VECTOR16 removal.
Replace the vec* types with their scalar expansions.

Backports commit 6c7ab3015ac498181444deff55dcc8fd43ad468c from qemu
2020-04-30 06:34:37 -04:00
Richard Henderson 72d8af5282 tcg: Remove CONFIG_VECTOR16
The comment in tcg-runtime-gvec.c about CONFIG_VECTOR16 says that
tcg-op-gvec.c has eliminated size 8 vectors, and only passes on
multiples of 16. This may have been true of the first few operations,
but is not true of all operations.

In particular, multiply, shift by scalar, and compare of 8- and 16-bit
elements are not expanded inline if host vector operations are not
supported.

For an x86_64 host that does not support AVX, this means that we will
fall back to the helper, which will attempt to use SSE instructions,
which will SEGV on an invalid 8-byte aligned memory operation.

This patch simply removes the CONFIG_VECTOR16 code and configuration
without further simplification.

Buglink: https://bugs.launchpad.net/bugs/1863508

Backports commit 43d1ccd2a02fadf3c36b46a8c31125a28864d141 from qemu
2020-04-30 06:28:46 -04:00
Charles Ferguson 784d580f01 Ensure that PC is not fixed up when code tracing or timing. (#1179)
Under some circumstances, the PC is not fixed up properly when
returning from the execution of a block in cpu_tb_exec. This appears
to be caused by the resetting of the PC from the tb.

This change removes the additional fixup in the cases where there
is code tracing or timing active. Either of these cases would result
in the wrong PC being reported.

Closes unicorn-engine#1105.

Backports commit b59632fb645d456338472e3d757c065c0ed74ad5 from unicorn
2020-01-14 09:52:25 -05:00
w1tcher b1f5794ab4 Fix the error in the hook_code of the arm
Calling emu_stop and causing the pc value to be incorrect after the end of the run. (#1157)

Backports commit 83887b8193dfeca3e5e8da851b41f874bcd0514e from unicorn.
2020-01-14 09:29:37 -05:00
Chen Huitao 644ea0c88c fix a mem-leak (#1147)
* fix a mem-leak.

* check the uc and l1_map before using them.

* fix multi-level free bug.

* Add pointer check.

Backports commit 79d89e5d3b83c6ee5d523738bc488d1e44b06f6a from unicorn.
2020-01-14 09:24:44 -05:00
Azertinv a22641c4be Added an invalid instruction hook (#1132)
* first draft for an invalid instruction hook

* Fixed documentation on return value of invalid insn hook

Backports commit 07f94ad1fc62293cac330df9714d739be6354926 from unicorn
2020-01-14 09:15:54 -05:00
David Hildenbrand de513617c8 accel/tcg: allow to invalidate a write TLB entry immediately
Background: s390x implements Low-Address Protection (LAP). If LAP is
enabled, writing to effective addresses (before any translation)
0-511 and 4096-4607 triggers a protection exception.

So we have subpage protection on the first two pages of every address
space (where the lowcore - the CPU private data resides).

By immediately invalidating the write entry but allowing the caller to
continue, we force every write access onto these first two pages into
the slow path. we will get a tlb fault with the specific accessed
addresses and can then evaluate if protection applies or not.

We have to make sure to ignore the invalid bit if tlb_fill() succeeds.

Backports commit f52bfb12143e29d7c8bd827bdb751aee47a9694e from qemu
2020-01-14 07:14:10 -05:00
David Hildenbrand d9d91c1db6 tcg: Factor out probe_write() logic into probe_access()
Let's also allow to probe other access types.

Backports commit c25c283df0f08582df29f1d5d7be1516b851532d from qemu
2020-01-14 07:07:54 -05:00
David Hildenbrand 53c3c47efa tcg: Make probe_write() return a pointer to the host page
... similar to tlb_vaddr_to_host(); however, allow access to the host
page except when TLB_NOTDIRTY or TLB_MMIO is set.

Backports commit fef39ccd567032d3ad520ed80f3576068e6eb2e3 from qemu
2020-01-14 07:04:17 -05:00
David Hildenbrand 2bc3843fe3 tcg: Enforce single page access in probe_write()
Let's enforce the interface restriction.

Backports commit ca86cf328ce216bb304bbf09a43614613f945d86 from qemu
2020-01-14 07:02:15 -05:00
David Hildenbrand b732ad9eba tcg: Check for watchpoints in probe_write()
Let size > 0 indicate a promise to write to those bytes.
Check for write watchpoints in the probed range.

Backports commit 03a981893c99faba84bb373976796ad7dce0aecc from qemu
2020-01-14 07:01:05 -05:00
Richard Henderson 07f30382c0 cputlb: Handle watchpoints via TLB_WATCHPOINT
The raising of exceptions from check_watchpoint, buried inside
of the I/O subsystem, is fundamentally broken. We do not have
the helper return address with which we can unwind guest state.

Replace PHYS_SECTION_WATCH and io_mem_watch with TLB_WATCHPOINT.
Move the call to cpu_check_watchpoint into the cputlb helpers
where we do have the helper return address.

This allows watchpoints on RAM to bypass the full i/o access path.

Backports commit 50b107c5d617eaf93301cef20221312e7a986701 from qemu
2020-01-14 06:58:33 -05:00
Richard Henderson 6c4a3fd06f cputlb: Fold TLB_RECHECK into TLB_INVALID_MASK
We had two different mechanisms to force a recheck of the tlb.

Before TLB_RECHECK was introduced, we had a PAGE_WRITE_INV bit
that would immediate set TLB_INVALID_MASK, which automatically
means that a second check of the tlb entry fails.

We can use the same mechanism to handle small pages.
Conserve TLB_* bits by removing TLB_RECHECK.

Backports commit 30d7e098d5c38644359820317fcf72e3e129ec53 from qemu
2020-01-14 06:20:33 -05:00
Richard Henderson bb313206e5 cputlb: Remove double-alignment in store_helper
We have already aligned page2 to the start of the next page.
There is no reason to do that a second time.

Backports commit 5787585d0406cfd54dda0c71ea1a603347ce6e71 from qemu
2020-01-12 10:25:13 -05:00
Richard Henderson 6990b212e3 cputlb: Fix size operand for tlb_fill on unaligned store
We are currently passing the size of the full write to
the tlb_fill for the second page. Instead pass the real
size of the write to that page.

This argument is unused within all tlb_fill, except to be
logged via tracing, so in practice this makes no difference.

But in a moment we'll need the value of size2 for watchpoints,
and if we've computed the value we might as well use it.

Backports commit 8f7cd2ad4acd01242d00807e231097b3de9f0930 from qemu
2020-01-12 06:17:09 -05:00
Tony Nguyen a95927de1d cputlb: Byte swap memory transaction attribute
Notice new attribute, byte swap, and force the transaction through the
memory slow path.

Required by architectures that can invert endianness of memory
transaction, e.g. SPARC64 has the Invert Endian TTE bit.

Backports commit a26fc6f5152b47f1d7ed928f9c9d462d01ff1624 from qemu
2020-01-07 19:15:33 -05:00
Tony Nguyen 103d6f51c8 memory: Single byte swap along the I/O path
Now that MemOp has been pushed down into the memory API, and
callers are encoding endianness, we can collapse byte swaps
along the I/O path into the accelerator and target independent
adjust_endianness.

Collapsing byte swaps along the I/O path enables additional endian
inversion logic, e.g. SPARC64 Invert Endian TTE bit, with redundant
byte swaps cancelling out.

Backports commit 9bf825bf3df4ebae3af51566c8088e3f1249a910 from qemu
2020-01-07 19:12:04 -05:00
Tony Nguyen ad8957a4c3 cputlb: Replace size and endian operands for MemOp
Preparation for collapsing the two byte swaps adjust_endianness and
handle_bswap into the former.

Backports commit be5c4787e9a6eed12fd765d9e890f7cc6cd63220 from qemu
2020-01-07 19:03:51 -05:00
Tony Nguyen da98d0da4e memory: Access MemoryRegion with endianness
Preparation for collapsing the two byte swaps adjust_endianness and
handle_bswap into the former.

Call memory_region_dispatch_{read|write} with endianness encoded into
the "MemOp op" operand.

This patch does not change any behaviour as
memory_region_dispatch_{read|write} is yet to handle the endianness.

Once it does handle endianness, callers with byte swaps can collapse
them into adjust_endianness.

Backports commit d5d680cacc66ef7e3c02c81dc8f3a34eabce6dfe from qemu
2020-01-07 18:54:11 -05:00
Tony Nguyen 3b777a2332 cputlb: Access MemoryRegion with MemOp
The memory_region_dispatch_{read|write} operand "unsigned size" is
being converted into a "MemOp op".

Convert interfaces by using no-op size_memop.

After all interfaces are converted, size_memop will be implemented
and the memory_region_dispatch_{read|write} operand "unsigned size"
will be converted into a "MemOp op".

As size_memop is a no-op, this patch does not change any behaviour.

Backports commit 4cbb198eefef41bbca703605c78875fd4fec6ef6 from qemu
2020-01-07 18:26:29 -05:00
Tony Nguyen f75368cd0f
tcg: TCGMemOp is now accelerator independent MemOp
Preparation for collapsing the two byte swaps, adjust_endianness and
handle_bswap, along the I/O path.

Target dependant attributes are conditionalized upon NEED_CPU_H.

Backports commit 14776ab5a12972ea439c7fb2203a4c15a09094b4 from qemu
2019-11-28 03:01:12 -05:00
Emilio G. Cota f4be234ab8
atomic_template: fix indentation in GEN_ATOMIC_HELPER
Backports commit 358f6348df5ad785c7c18be659d4ff9a2174635f from qemu
2019-11-28 02:38:07 -05:00
Lioncash 802c626145
Revert "cputlb: Filter flushes on already clean tlbs"
This reverts commit 5ab9723787.
2019-06-30 19:21:20 -04:00
Richard Henderson a1396b12f6
tcg: Fix typos in helper_gvec_sar{8,32,64}v
The loop is written with scalars, not vectors.
Use the correct type when incrementing.

Fixes: 5ee5c14cacd

Backports commit 899f08ad1d1231dbbfa67298413f05ed2679fb02 from qemu
2019-06-13 16:09:16 -04:00
Alex Bennée 938f8465a0
cputlb: cast size_t to target_ulong before using for address masks
While size_t is defined to happily access the biggest host object this
isn't the case when generating masks for 64 bit guests on 32 bit
hosts. Otherwise we end up truncating the address when we fall back to
our unaligned helper.

Fixes: https://bugs.launchpad.net/qemu/+bug/1831545

Backports commit ab7a2009df66241a3742cbdfe8f9a1f66c6af21f from qemu
2019-06-13 16:07:01 -04:00
Alex Bennée 9aef73f5fb
cputlb: use uint64_t for interim values for unaligned load
When running on 32 bit TCG backends a wide unaligned load ends up
truncating data before returning to the guest. We specifically have
the return type as uint64_t to avoid any premature truncation so we
should use the same for the interim types.

Fixes: https://bugs.launchpad.net/qemu/+bug/1830872
Fixes: eed5664238e

Backports commit 8c79b288513587e960b6b7257a9d955d5592f209 from qemu
2019-06-13 16:06:22 -04:00
Richard Henderson d7ea41c3a3
cpu: Move icount_decr to CPUNegativeOffsetState
Amusingly, we had already ignored the comment to keep this value
at the end of CPUState. This restores the minimum negative offset
from TCG_AREG0 for code generation.

For the couple of uses within qom/cpu.c, without NEED_CPU_H, add
a pointer from the CPUState object to the IcountDecr object within
CPUNegativeOffsetState.

Backports commit 5e1401969b25f676fee6b1c564441759cf967a43 from qemu
2019-06-13 15:34:28 -04:00
Richard Henderson fbf91a6535
cpu: Replace ENV_GET_CPU with env_cpu
Now that we have both ArchCPU and CPUArchState, we can define
this generically instead of via macro in each target's cpu.h.

Backports commit 29a0af618ddd21f55df5753c3e16b0625f534b3c from qemu
2019-06-12 11:16:16 -04:00
Lioncash 5ab9723787
cputlb: Filter flushes on already clean tlbs
Especially for guests with large numbers of tlbs, like ARM or PPC,
we may well not use all of them in between flush operations.
Remember which tlbs have been used since the last flush, and
avoid any useless flushing.

Backports much of 3d1523ced6060cdfe9e768a814d064067ccabfe5 from qemu
along with a bunch of updating changes.
2019-06-10 20:42:15 -04:00
Richard Henderson ca58be9cb4
tcg: Add support for vector bitwise select
This operation performs d = (b & a) | (c & ~a), and is present
on a majority of host vector units. Include gvec expanders.

Backports commit 38dc12947ec9106237f9cdbd428792c985cd86ae from qemu
2019-05-24 18:15:10 -04:00
Richard Henderson 2a4a7b9391
tcg: Use tlb_fill probe from tlb_vaddr_to_host
Most of the existing users would continue around a loop which
would fault the tlb entry in via a normal load/store.

But for AArch64 SVE we have an existing emulation bug wherein we
would mark the first element of a no-fault vector load as faulted
(within the FFR, not via exception) just because we did not have
its address in the TLB. Now we can properly only mark it as faulted
if there really is no valid, readable translation, while still not
raising an exception. (Note that beyond the first element of the
vector, the hardware may report a fault for any reason whatsoever;
with at least one element loaded, forward progress is guaranteed.)

Backports commit 4811e9095c0491bc6f5450e5012c9c4796b9e59d from qemu
2019-05-16 18:27:03 -04:00
Richard Henderson dab0061a0d
tcg: Use CPUClass::tlb_fill in cputlb.c
We can now use the CPUClass hook instead of a named function.

Create a static tlb_fill function to avoid other changes within
cputlb.c. This also isolates the asserts within. Remove the
named tlb_fill function from all of the targets.

Backports commit c319dc13579a92937bffe02ad2c9f1a550e73973 from qemu
2019-05-16 17:35:37 -04:00