Commit graph

87 commits

Author SHA1 Message Date
Richard Henderson 6a78133659 target/arm: Remove sve_memopidx
None of the sve helpers use TCGMemOpIdx any longer, so we can
stop passing it.

Backports commit ba080b8682fc6bde7f2d9dedddb519d63cbe138f from qemu
2021-02-25 21:33:44 -05:00
Richard Henderson 1f306230d4 target/arm: Reuse sve_probe_page for gather loads
Backports commit 10a85e2c8ab6e004e7f3f1dcfea8cb0bf58fb9fb from qemu
2021-02-25 21:30:13 -05:00
Richard Henderson 585da952ec target/arm: Reuse sve_probe_page for scatter stores
Backports commit 88a660a48ef513ce9875b595e19b2a820b3f3fca from qemu
2021-02-25 21:27:14 -05:00
Richard Henderson 3eee880c2a target/arm: Reuse sve_probe_page for gather first-fault loads
This avoids the need for a separate set of helpers to implement
no-fault semantics, and will enable MTE in the future.

Backports commit 50de9b78cec06e6d16e92a114a505779359ca532 from qemu
2021-02-25 21:22:16 -05:00
Richard Henderson b1e31f3bf3 target/arm: Use SVEContLdSt for contiguous stores
Follow the model set up for contiguous loads. This handles
watchpoints correctly for contiguous stores, recognizing the
exception before any changes to memory.

Backports commit 0fa476c1bb37a70df7eeff1e5bfb4791feb37e0e from qemu
2021-02-25 21:15:14 -05:00
Richard Henderson 3591c2f548 target/arm: Update contiguous first-fault and no-fault loads
With sve_cont_ldst_pages, the differences between first-fault and no-fault
are minimal, so unify the routines. With cpu_probe_watchpoint, we are able
to make progress through pages with TLB_WATCHPOINT set when the watchpoint
does not actually fire.

Backports commit c647673ce4d72a8789703c62a7f3cbc732cb1ea8 from qemu
2021-02-25 21:06:14 -05:00
Richard Henderson 6c9304448e target/arm: Use SVEContLdSt for multi-register contiguous loads
Backports commit 5c9b8458a0b3008d24d84b67e1c9b6d5f39f4d66 from qemu
2021-02-25 20:50:22 -05:00
Richard Henderson 3979c8f73e target/arm: Handle watchpoints in sve_ld1_r
Handle all of the watchpoints for active elements all at once,
before we've modified the vector register. This removes the
TLB_WATCHPOINT bit from page[].flags, which means that we can
use the normal fast path via RAM.

Backports commit 4bcc3f0ff8e5ae2b17b5aab9aa613ff1b8025896 from qemu
2021-02-25 20:44:13 -05:00
Richard Henderson 0e5aa37c9a target/arm: Use SVEContLdSt in sve_ld1_r
First use of the new helper functions, so we can remove the
unused markup. No longer need a scratch for user-only, as
we completely probe the page set before reading; system mode
still requires a scratch for MMIO.

Backports commit b854fd06a868e0308bcfe05ad0a71210705814c7 from qemu
2021-02-25 20:41:53 -05:00
Richard Henderson d363c3d0ba target/arm: Adjust interface of sve_ld1_host_fn
The current interface includes a loop; change it to load a
single element. We will then be able to use the function
for ld{2,3,4} where individual vector elements are not adjacent.

Replace each call with the simplest possible loop over active
elements.

Backports commit cf4a49b71b1712142d7122025a8ca7ea5b59d73f from qemu
2021-02-25 20:34:18 -05:00
Richard Henderson 94b0876f15 target/arm: Add sve infrastructure for page lookup
For contiguous predicated memory operations, we want to
minimize the number of tlb lookups performed. We have
open-coded this for sve_ld1_r, but for correctness with
MTE we will need this for all of the memory operations.

Create a structure that holds the bounds of active elements,
and metadata for two pages. Add routines to find those
active elements, lookup the pages, and run watchpoints
for those pages.

Temporarily mark the functions unused to avoid Werror.

Backports commit b4cd95d2f4c7197b844f51b29871d888063ea3e7 from qemu
2021-02-25 20:28:23 -05:00
Richard Henderson f430a399d4 target/arm: Drop manual handling of set/clear_helper_retaddr
Since we converted back to cpu_*_data_ra, we do not need to
do this ourselves.

Backports commit f32e2ab65f3a0fc03d58936709e5a565c4b0db50 from qemu
2021-02-25 20:20:29 -05:00
Richard Henderson 2e03f74a53 target/arm: Use cpu_*_data_ra for sve_ldst_tlb_fn
Use the "normal" memory access functions, rather than the
softmmu internal helper functions directly.

Since fb901c9, cpu_mem_index is now a simple extract
from env->hflags and not a large computation.  Which means
that it's now more work to pass around this value than it
is to recompute it.

This only adjusts the primitives, and does not clean up
all of the uses within sve_helper.c.
2021-02-25 20:16:38 -05:00
Richard Henderson a417227674 softfloat: Replace flag with bool
We have had this on the to-do list for quite some time.

Backports commit c120391c0090d9c40425c92cdb00f38ea8588ff6 from qemu
2020-05-21 17:48:12 -04:00
Richard Henderson f93deb0786 target/arm: Use tcg_gen_gvec_5_ptr for sve FMLA/FCMLA
Now that we can pass 7 parameters, do not encode register
operands within simd_data.

Backports commit 08975da9f0bfcfa654628cae71201a351ba5449a from qemu
2020-05-11 17:17:17 -04:00
Richard Henderson 2a4a7b9391
tcg: Use tlb_fill probe from tlb_vaddr_to_host
Most of the existing users would continue around a loop which
would fault the tlb entry in via a normal load/store.

But for AArch64 SVE we have an existing emulation bug wherein we
would mark the first element of a no-fault vector load as faulted
(within the FFR, not via exception) just because we did not have
its address in the TLB. Now we can properly only mark it as faulted
if there really is no valid, readable translation, while still not
raising an exception. (Note that beyond the first element of the
vector, the hardware may report a fault for any reason whatsoever;
with at least one element loaded, forward progress is guaranteed.)

Backports commit 4811e9095c0491bc6f5450e5012c9c4796b9e59d from qemu
2019-05-16 18:27:03 -04:00
Lioncash 3521e72580
target/arm: Sychronize with qemu
Synchronizes with bits and pieces that were missed due to merging
incorrectly (sorry :<)
2019-04-18 04:49:11 -04:00
Lioncash 5f12065284
sve_helper: Use the QEMU_FLATTEN macro instead of the compiler attribute directly
Keeps the code compiler-independent.
2018-10-23 13:05:02 -04:00
Richard Henderson 66ffb372e7
target/arm: Pass TCGMemOpIdx to sve memory helpers
There is quite a lot of code required to compute cpu_mem_index,
or even put together the full TCGMemOpIdx. This can easily be
done at translation time.

Backports commit 500d04843ba953dc4560e44f04001efec38c14a6 from qemu
2018-10-08 14:15:15 -04:00
Richard Henderson 606e8cdb8c
target/arm: Rewrite vector gather first-fault loads
This implements the feature for softmmu, and moves the
main loop out of a macro and into a function.

Backports commit 116347ce20bb7b5cac17bf2b0e6f607530b50862 from qemu
2018-10-08 14:15:15 -04:00
Richard Henderson 1cd3c2a408
target/arm: Split contiguous stores for endianness
We can choose the endianness at translation time, rather than
re-computing it at execution time.

Backports commit 28d57f2dc59c287e1c40239509b0a325fd00e32f from qemu
2018-10-08 14:15:15 -04:00
Richard Henderson c9569b3fe0
target/arm: Split contiguous loads for endianness
We can choose the endianness at translation time, rather than
re-computing it at execution time.

Backports commit 7d0a57a2e1cea188b9023261a404d7a211117230 from qemu
2018-10-08 14:15:15 -04:00
Richard Henderson 2542ad17d0
target/arm: Rewrite vector gather stores
This fixes the endianness problem for softmmu, and moves
the main loop out of a macro and into an inlined function.

Backports commit 78cf1b886aa1b95c97fc5114641515c2892bb240 from qemu
2018-10-08 14:15:15 -04:00
Richard Henderson ff63807164
target/arm: Rewrite vector gather loads
This fixes the endianness problem for softmmu, and moves
the main loop out of a macro and into an inlined function.

Backports commit d4f75f25b43041e7a46d12352b3c70ae457d8cea from qemu
2018-10-08 14:15:15 -04:00
Richard Henderson 966ea163a3
target/arm: Rewrite helper_sve_st[1234]*_r
This fixes the endianness problem for softmmu, and moves the
main loop out of a macro and into an inlined function

Backports commit 9fd46c8362e0a45d04ccceae7051d06dd65c1d57 from qemu
2018-10-08 14:15:15 -04:00
Richard Henderson 4978d77039
target/arm: Rewrite helper_sve_ld[234]*_r
Use the same *_tlb primitives as we use for ld1.

For linux-user, this hoists the set of helper_retaddr. For softmmu,
hoists the computation of the current mmu_idx outside the loop,
fixes the endianness problem, and moves the main loop out of a
macro and into an inlined function.

Backports commit f27d4dc2af0de9b7b45c955882b8420905c6efe8 from qemu
2018-10-08 14:15:15 -04:00
Richard Henderson 5b88176e1d
target/arm: Rewrite helper_sve_ld1*_r using pages
Uses tlb_vaddr_to_host for correct operation with softmmu.
Optimize for accesses within a single page or pair of pages.

Backports commit 9123aeb6fcb14e0955ebe4e2a613802cfa0503ea from qemu
2018-10-08 14:15:15 -04:00
Richard Henderson 118495f4b1
target/arm: Use fp_status_fp16 for do_fmpa_zpzzz_h
This makes float16_muladd correctly use FZ16 not FZ.

Fixes: 6ceabaad110

Backports commit 52a339b11d1719a6589de40606859939875fda9a from qemu
2018-08-17 14:04:20 -04:00
Richard Henderson 1ca7c30fbb
target/arm: Fix typo in helper_sve_ld1hss_r
Backports commit 573ec0fe40b9a412085ac7dfb41975a0fc2b28dd from qemu
2018-08-17 13:47:38 -04:00
Richard Henderson e2e7bb0e21
target/arm: Fix typo in helper_sve_movz_d
Backports commit 054e7adf4e64e4acb3b033348ebf7cc871baa34f from qemu
2018-08-16 07:12:18 -04:00
Richard Henderson f26356b930
target/arm: Reorganize SVE WHILE
The pseudocode for this operation is an increment + compare loop,
so comparing <= the maximum integer produces an all-true predicate.

Rather than bound in both the inline code and the helper, pass the
helper the number of predicate bits to set instead of the number
of predicate elements to set.

Backports commit bbd0968c458d48e34a08b8694fa3309a9fe1c9e7 from qemu
2018-08-16 07:09:33 -04:00
Richard Henderson 46fd2c485a
target/arm: Fix sign of sve_cmpeq_ppzw/sve_cmpne_ppzw
The normal vector element is sign-extended before
comparing with the wide vector element.

Backports commit df4e001093988544d09887122ae824f18ba55c68 from qemu
2018-08-16 07:04:52 -04:00
Richard Henderson bd9975a8ab
target/arm: Fix LD1W and LDFF1W (scalar plus vector)
'I' was being double-incremented; correctly within the inner loop
and incorrectly within the outer loop.

Backports commit 628fc75f3a3bb115de3b445c1a18547c44613cfe from qemu
2018-07-17 12:33:00 -04:00
Richard Henderson f64e48dbee
target/arm: Fix SVE signed division vs x86 overflow exception
We already check for the same condition within the normal integer
sdiv and sdiv64 helpers. Use a slightly different formation that
does not require deducing the expression type.

Backports commit 7e8fafbfd0537937ba8fb366a90ea6548cc31576 from qemu
2018-07-03 05:06:06 -04:00
Richard Henderson 63431f0c21
target/arm: Implement SVE fp complex multiply add
Backports commit 05f48bab3080fb876fbad8d8f14e6ba545432d67 from qemu
2018-07-03 04:21:41 -04:00
Richard Henderson 79220741df
target/arm: Implement SVE floating-point complex add
Backports commit 76a9d9cdc481ed79f1c2ec16eeed185f13e6a8ae from qemu
2018-07-03 04:17:41 -04:00
Richard Henderson 00d3671412
target/arm: Implement SVE floating-point unary operations
Backports commit ec5b375bb5a0e35c0c21dc9dd1d82894269ce215 from qemu
2018-07-03 04:11:12 -04:00
Richard Henderson 2c2c940ef8
target/arm: Implement SVE floating-point round to integral value
Backports commit cda3c75322c6fae1cc5b367ee6d7acf2cbdbcf2b from qemu
2018-07-03 04:07:21 -04:00
Richard Henderson a9160a0e08
target/arm: Implement SVE floating-point convert to integer
Backports commit df4de1affc440d6f2cdaeea329b90c0b88ece5a1 from qemu
2018-07-03 04:02:58 -04:00
Richard Henderson 524630de27
target/arm: Implement SVE floating-point convert precision
Backports commit 46d33d1e3c9c5d56d57056db55010de52c173902 from qemu
2018-07-03 03:58:07 -04:00
Richard Henderson b14793a933
target/arm: Implement SVE floating-point trig multiply-add coefficient
Backports commit 67fcd9ad35d2b38630ee34e8ced8878d334c74fb from qemu
2018-07-03 03:52:36 -04:00
Richard Henderson f7e0c9e079
target/arm: Implement SVE FP Compare with Zero Group
Backports commit 4d2e2a03384a43c641e0cbca7ac79d7d0c50f666 from qemu
2018-07-03 03:49:52 -04:00
Richard Henderson f9f228efec
target/arm: Implement SVE FP Fast Reduction Group
Backports commit 23fbe79faa38cb4acc59f956a63feba3c2cc73ac from qemu
2018-07-03 03:41:00 -04:00
Richard Henderson c718ef4243
target/arm: Implement SVE floating-point arithmetic with immediate
Backports commit cc48affe83fff4b2886c064265d7103dee5e4a14 from qemu
2018-07-03 03:24:46 -04:00
Richard Henderson db1d39ab4a
target/arm: Implement SVE floating-point compare vectors
Backports commit abfdefd5bd444b629d16dcefc2b60ac8da37e87d from qemu
2018-07-03 03:14:24 -04:00
Richard Henderson 892a9a66eb
target/arm: Implement SVE first-fault gather loads
Backports commit ed67eb7fa2a63b6709ec94397d833bc3686f7833 from qemu
2018-07-03 03:06:07 -04:00
Richard Henderson 1782f20dde
target/arm: Implement SVE gather loads
Backports commit 673e9fa6c29e030f4ab6ceae5d0f50bd36fe0ee0 from qemu
2018-07-03 02:58:18 -04:00
Richard Henderson b78e283513
target/arm: Implement SVE scatter stores
Backports commit f6dbf62a7e3d00e9a1dcc7fe3e53b32c3ed93e24 from qemu
2018-07-03 02:40:17 -04:00
Richard Henderson c497dc0a83
target/arm: Implement SVE load and broadcast element
Backports commit 684598640dc3b28f86ccc28cc9af50ba257f4cc8 from qemu
2018-07-03 02:27:00 -04:00
Richard Henderson 2caa99929e
target/arm: Implement SVE Floating Point Accumulating Reduction Group
Backports commit 7f9ddf64d5fe5bfaa91ae0ec52217d86f4d86452 from qemu
2018-07-03 02:21:41 -04:00