Commit graph

1652 commits

Author SHA1 Message Date
Paul Lai 1e48962847
i386: Introduce SnowRidge CPU model
SnowRidge CPU supports Accelerator Infrastrcture Architecture (MOVDIRI,
MOVDIR64B), CLDEMOTE and SPLIT_LOCK_DISABLE.

MOVDIRI, MOVDIR64B, and CLDEMOTE are found via CPUID.
The availability of SPLIT_LOCK_DISABLE is check via msr access

References can be found in either:
https://software.intel.com/en-us/articles/intel-sdm
https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-and-future-features-programming-reference

Backports commit 0b18874bd216f3237740d5cbd64f39cf1e02addf from qemu
2019-08-08 18:26:09 -04:00
Like Xu 2c424e691c
target/i386: Add CPUID.1F generation support for multi-dies PCMachine
The CPUID.1F as Intel V2 Extended Topology Enumeration Leaf would be
exposed if guests want to emulate multiple software-visible die within
each package. Per Intel's SDM, the 0x1f is a superset of 0xb, thus they
can be generated by almost same code as 0xb except die_offset setting.

If the number of dies per package is greater than 1, the cpuid_min_level
would be adjusted to 0x1f regardless of whether the host supports CPUID.1F.
Likewise, the CPUID.1F wouldn't be exposed if env->nr_dies < 2.

Backports commit a94e1428991f741e2c6636e7c8df7f8d1905d983 from qemu
2019-08-08 18:24:46 -04:00
Like Xu d2410074d8
i386: Update new x86_apicid parsing rules with die_offset support
In new sockets/dies/cores/threads model, the apicid of logical cpu could
imply die level info of guest cpu topology thus x86_apicid_from_cpu_idx()
need to be refactored with #dies value, so does apicid_*_offset().

To keep semantic compatibility, the legacy pkg_offset which helps to
generate CPUIDs such as 0x3 for L3 cache should be mapping to die_offset.

Backports commit d65af288a84d8bf8c27e55d45545f52f016c08a7 from qemu
2019-08-08 18:22:03 -04:00
Lioncash a82e4efa24
i386/cpu: Consolidate die-id validity in smp context
The field die_id (default as 0) and has_die_id are introduced to X86CPU.
Following the legacy smp check rules, the die_id validity is added to
the same contexts as leagcy smp variables such as hmp_hotpluggable_cpus(),
machine_set_cpu_numa_node(), cpu_slot_to_string() and pc_cpu_pre_plug().

Backports relevant bits from 176d2cda0dee9f4f78f604ad72d6a111e8e38f3b
from qemu
2019-08-08 18:14:27 -04:00
Like Xu efd887b992
i386: Add die-level cpu topology to x86CPU on PCMachine
The die-level as the first PC-specific cpu topology is added to the leagcy
cpu topology model, which has one die per package implicitly and only the
numbers of sockets/cores/threads are configurable.

In the new model with die-level support, the total number of logical
processors (including offline) on board will be calculated as:

\#cpus = #sockets * #dies * #cores * #threads

and considering compatibility, the default value for #dies would be
initialized to one in x86_cpu_initfn() and pc_machine_initfn().

Backports commit c26ae610811e8d52f4fc73e3ae0a8bc4a24d6763 from qemu
2019-08-08 18:10:42 -04:00
Peter Maydell 1f4c3d6bcc
target/arm: Correct VMOV_imm_dp handling of short vectors
Coverity points out (CID 1402195) that the loop in trans_VMOV_imm_dp()
that iterates over the destination registers in a short-vector VMOV
accidentally throws away the returned updated register number
from vfp_advance_dreg(). Add the missing assignment. (We got this
correct in trans_VMOV_imm_sp().)

Backports commit 89a11ff756410aecb87d2c774df6e45dbf4105c1 from qemu
2019-08-08 18:08:55 -04:00
Peter Maydell 0d89bce217
target/arm: Execute Thumb instructions when their condbits are 0xf
Thumb instructions in an IT block are set up to be conditionally
executed depending on a set of condition bits encoded into the IT
bits of the CPSR/XPSR. The architecture specifies that if the
condition bits are 0b1111 this means "always execute" (like 0b1110),
not "never execute"; we were treating it as "never execute". (See
the ConditionHolds() pseudocode in both the A-profile and M-profile
Arm ARM.)

This is a bit of an obscure corner case, because the only legal
way to get to an 0b1111 set of condbits is to do an exception
return which sets the XPSR/CPSR up that way. An IT instruction
which encodes a condition sequence that would include an 0b1111 is
UNPREDICTABLE, and for v8A the CONSTRAINED UNPREDICTABLE choices
for such an IT insn are to NOP, UNDEF, or treat 0b1111 like 0b1110.
Add a comment noting that we take the latter option.

Backports commit 5529de1e5512c05276825fa8b922147663fd6eac from qemu
2019-08-08 18:07:57 -04:00
Peter Maydell 9d01d50db8
target/arm: Use _ra versions of cpu_stl_data() in v7M helpers
In the various helper functions for v7M/v8M instructions, use
the _ra versions of cpu_stl_data() and friends. Otherwise we
may get wrong behaviour or an assert() due to not being able
to locate the TB if there is an exception on the memory access
or if it performs an IO operation when in icount mode

Backports commit 2884fbb60412049ec92389039ae716b32057382e from qemu
2019-08-08 18:06:23 -04:00
Philippe Mathieu-Daudé bde186433d
target/arm/helper: Move M profile routines to m_helper.c
In preparation for supporting TCG disablement on ARM, we move most
of TCG related v7m/v8m helpers and APIs into their own file.

Note: It is easier to review this commit using the 'histogram'
diff algorithm:

$ git diff --diff-algorithm=histogram ...
or
$ git diff --histogram ...

Backports commit 7aab5a8c8bb525ea390b4ebc17ab82c0835cfdb6 from qemu
2019-08-08 18:04:08 -04:00
Philippe Mathieu-Daudé 199e2f8a7d
target/arm: Restrict semi-hosting to TCG
Semihosting hooks either SVC or HLT instructions, and inside KVM
both of those go to EL1, ie to the guest, and can't be trapped to
KVM.

Let check_for_semihosting() return False when not running on TCG.

backports commit 91f78c58da9ba78c8ed00f5d822b701765be8499 from qemu
2019-08-08 17:48:34 -04:00
Philippe Mathieu-Daudé 6295fd7156
target/arm: Move debug routines to debug_helper.c
These routines are TCG specific.

Backports commit 9dd5cca42448770a940fa2145f1ff18cdc7b01a9 from qemu
2019-08-08 17:46:56 -04:00
Joel Sing 14c6ed2cca
RISC-V: Clear load reservations on context switch and SC
This prevents a load reservation from being placed in one context/process,
then being used in another, resulting in an SC succeeding incorrectly and
breaking atomics.

Backports commit c13b169f1a3dd158d6c75727cdc388f95988db39 from qemu
2019-08-08 17:15:45 -04:00
Palmer Dabbelt 4a3d8417ca
RISC-V: Add support for the Zicsr extension
The various CSR instructions have been split out of the base ISA as part
of the ratification process. This patch adds a Zicsr argument, which
disables all the CSR instructions.

Backports commit 591bddea8d874e1500921de0353818e5586618f5 from qemu
2019-08-08 17:10:34 -04:00
Palmer Dabbelt 5b59f956b3
RISC-V: Add support for the Zifencei extension
fence.i has been split out of the base ISA as part of the ratification
process. This patch adds a Zifencei argument, which disables the
fence.i instruction.

Backports commit 50fba816cd226001bec3e495c39879deb2fa5432 from qemu
2019-08-08 17:09:09 -04:00
Alistair Francis e006204543
target/riscv: Set privledge spec 1.11.0 as default
Set the priv spec version 1.11.0 as the default and allow selecting it
via the command line.

Backports commit e3147506b02edcdd7c14ebb41a10fcc3027dcc5c from qemu
2019-08-08 17:05:40 -04:00
Alistair Francis 2ed6459e98
target/riscv: Add the mcountinhibit CSR
1.11 defines mcountinhibit, which has the same numeric CSR value as
mucounteren from 1.09.1 but has different semantics. This patch enables
the CSR for 1.11-based targets, which is trivial to implement because
the counters in QEMU never tick (legal according to the spec).

Backports commit 747a43e818dc36bd50ef98c2b11a7c31ceb810fa from qemu
2019-08-08 17:04:52 -04:00
Alistair Francis 4934db3de6
target/riscv: Add the privledge spec version 1.11.0
Add support for the ratified RISC-V privledge spec.

Backports commit 6729dbbd420696fcf69cf2c86bdfc66e072058ce from qemu
2019-08-08 17:03:33 -04:00
Alistair Francis 9241c2c842
target/riscv: Restructure deprecatd CPUs
Restructure the deprecated CPUs to make it clear in the code that these
are depreated. They are already marked as deprecated in
qemu-deprecated.texi. There are no functional changes.

Backports commit c1fb65e63cfca4506a14b084afd0eca2dc464fe8 from qemu
2019-08-08 17:01:22 -04:00
Hesham Almatary 39bfe52bc6
RISC-V: Fix a PMP check with the correct access size
The PMP check should be of the memory access size rather
than TARGET_PAGE_SIZE.

Backports commit db21e6f72721996ddf1948c35a8ee35238089da4 from qemu
2019-08-08 16:57:03 -04:00
Hesham Almatary e648455989
RISC-V: Fix a PMP bug where it succeeds even if PMP entry is off
The current implementation returns 1 (PMP check success) if the address is in
range even if the PMP entry is off. This is a bug.

For example, if there is a PMP check in S-Mode which is in range, but its PMP
entry is off, this will succeed, which it should not.

The patch fixes this bug by only checking the PMP permissions if the address is
in range and its corresponding PMP entry it not off. Otherwise, it will keep
the ret = -1 which will be checked and handled correctly at the end of the
function.

Backports commit f8162068f18f2f264a0355938784f54089234211 from qemu
2019-08-08 16:55:52 -04:00
Hesham Almatary e0592a9a58
RISC-V: Check PMP during Page Table Walks
The PMP should be checked when doing a page table walk, and report access
fault exception if the to-be-read PTE failed the PMP check.

Backports commit 1f447aec787bfbbd078afccae44fc4c92acb4fed from qemu
2019-08-08 16:54:45 -04:00
Hesham Almatary 614c2954b0
RISC-V: Check for the effective memory privilege mode during PMP checks
The current PMP check function checks for env->priv which is not the effective
memory privilege mode.

For example, mstatus.MPRV could be set while executing in M-Mode, and in that
case the privilege mode for the PMP check should be S-Mode rather than M-Mode
(in env->priv) if mstatus.MPP == PRV_S.

This patch passes the effective memory privilege mode to the PMP check.
Functions that call the PMP check should pass the correct memory privilege mode
after reading mstatus' MPRV/MPP or hstatus.SPRV (if Hypervisor mode exists).

Backports commit cc0fdb298517ce56c770803447f8b02a90271152 from qemu
2019-08-08 16:52:57 -04:00
Hesham Almatary e8edd4d109
RISC-V: Raise access fault exceptions on PMP violations
Section 3.6 in RISC-V v1.10 privilege specification states that PMP violations
report "access exceptions." The current PMP implementation has
a bug which wrongly reports "page exceptions" on PMP violations.

This patch fixes this bug by reporting the correct PMP access exceptions
trap values.

Backports commit 635b0b0ea39a13d1a3df932452e5728aebbb3f6e from qemu
2019-08-08 16:50:57 -04:00
Hesham Almatary f597727171
RISC-V: Only Check PMP if MMU translation succeeds
The current implementation unnecessarily checks for PMP even if MMU translation
failed. This may trigger a wrong PMP access exception instead of
a page exception.

For example, the very first instruction fetched after the first satp write in
S-Mode will trigger a PMP access fault instead of an instruction fetch page
fault.

This patch prioritises MMU exceptions over PMP exceptions and only checks for
PMP if MMU translation succeeds. This patch is required for future commits
that properly report PMP exception violations if PTW succeeds.

Backports commit e0f8fa72deba7ac7a7ae06ba25e6498aaad93ace from qemu
2019-08-08 16:49:06 -04:00
Michael Clark 6ab36ed89e
target/riscv: Implement riscv_cpu_unassigned_access
This patch adds support for the riscv_cpu_unassigned_access call
and will raise a load or store access fault.

Backports commit cbf5827693addaff4e4d2102afedbf078a204eb2 from qemu
2019-08-08 16:48:02 -04:00
Dayeol Lee 6528c78fd5
target/riscv: Fix PMP range boundary address bug
A wrong address is passed to `pmp_is_in_range` while checking if a
memory access is within a PMP range.
Since the ending address of the pmp range (i.e., pmp_state.addr[i].ea)
is set to the last address in the range (i.e., pmp base + pmp size - 1),
memory accesses containg the last address in the range will always fail.

For example, assume that a PMP range is 4KB from 0x87654000 such that
the last address within the range is 0x87654fff.
1-byte access to 0x87654fff should be considered to be fully inside the
PMP range.
However the access now fails and complains partial inclusion because
pmp_is_in_range(env, i, addr + size) returns 0 whereas
pmp_is_in_range(env, i, addr) returns 1.

Backports commit 49db9fa1fd7c252596b53cf80876e06f407d09ed from qemu
2019-08-08 16:42:24 -04:00
Aleksandar Markovic 1e148beb0e
target/mips: Correct helper for MSA FCLASS.<W|D> instructions
Correct helper for MSA FCLASS.<W|D> instructions.

Backports commit 698c5752c4e618dc17b4c78dfa566896c7bce5ef from qemu
2019-08-08 16:30:15 -04:00
Aleksandar Markovic bf194f980c
target/mips: Unroll loops for MSA float max/min instructions
Slight preformance improvement for MSA float max/min instructions.

Backports commit 807e6773a5eba62054843b13d96ff778b90aba09 from qemu
2019-08-08 16:28:49 -04:00
Aleksandar Markovic dd747162e5
target/mips: Correct comments in msa_helper.c
Fix some errors in comments for MSA helpers.

Backports commit 44da090ba02e60515aa0dc72b30ecc6f56aea771 from qemu
2019-08-08 16:24:47 -04:00
Aleksandar Markovic 87edd2a82a
target/mips: Correct comments in translate.c
Fix some checkpatch comment-related warnings.

Backports commit 7480515fcc1b0f7119953d89e93b8507127aeb62 from qemu
2019-08-08 16:23:02 -04:00
Philippe Mathieu-Daudé fa2a772c7b
target/arm: Declare some M-profile functions publicly
In the next commit we will split the M-profile functions from this
file. Some function will be called out of helper.c. Declare them in
the "internals.h" header.

Backports commit 787a7e76c2e93a48c47b324fea592c9910a70483 from qemu
2019-08-08 15:37:01 -04:00
Philippe Mathieu-Daudé 4eacf0ce98
target/arm: Restrict PSCI to TCG
Under KVM, the kernel gets the HVC call and handle the PSCI requests.

Backports commit 21fbea8c8af72817d8e21570fa4edbfae417341b from qemu
2019-08-08 15:32:19 -04:00
Philippe Mathieu-Daudé 5aaf004bb2
target/arm/vfp_helper: Restrict the SoftFloat use to TCG
This code is specific to the SoftFloat floating-point
implementation, which is only used by TCG.

Backports commit 4a15527c9feecfd2fa2807d5e698abbc19feb35f from qemu
2019-08-08 15:30:47 -04:00
Philippe Mathieu-Daudé 4e214e5d64
target/arm/vfp_helper: Extract vfp_set_fpscr_from_host()
The vfp_set_fpscr() helper contains code specific to the host
floating point implementation (here the SoftFloat library).
Extract this code to vfp_set_fpscr_from_host().

Backports commit 0c6ad94809b37a1f0f1f75d3cd0e4a24fb77e65c from qemu
2019-08-08 15:28:38 -04:00
Philippe Mathieu-Daudé 2b5d731f56
target/arm/vfp_helper: Extract vfp_set_fpscr_to_host()
The vfp_set_fpscr() helper contains code specific to the host
floating point implementation (here the SoftFloat library).
Extract this code to vfp_set_fpscr_to_host().

Backports commit e9d652824b05845f143ef4797d707fae47d4b3ed from qemu
2019-08-08 15:27:27 -04:00
Philippe Mathieu-Daudé 76bae63014
target/arm/vfp_helper: Move code around
To ease the review of the next commit,
move the vfp_exceptbits_to_host() function directly after
vfp_exceptbits_from_host(). Amusingly the diff shows we
are moving vfp_get_fpscr().

Backports commit 20e62dd8c831c9065ed4a8e64813c93ad61c50d7 from qemu.
2019-08-08 15:26:03 -04:00
Philippe Mathieu-Daudé 91e264823e
target/arm: Move TLB related routines to tlb_helper.c
These routines are TCG specific.
The arm_deliver_fault() function is only used within the new
helper. Make it static.

Backports commit e21b551cb652663f2f2405a64d63ef6b4a1042b7 from qemu
2019-08-08 15:24:26 -04:00
Philippe Mathieu-Daudé 1af5deaf52
target/arm: Declare get_phys_addr() function publicly
In the next commit we will split the TLB related routines of
this file, and this function will also be called in the new
file. Declare it in the "internals.h" header.

Backports commit ebae861fc6c385a7bcac72dde4716be06e6776f1 from qemu
2019-08-08 15:14:45 -04:00
Samuel Ortiz 9296465289
target/arm: Move the DC ZVA helper into op_helper
Those helpers are a software implementation of the ARM v8 memory zeroing
op code. They should be moved to the op helper file, which is going to
eventually be built only when TCG is enabled.

Backports commit 6cdca173ef81a9dbcee9e142f1a5a34ad9c44b75 from qemu
2019-08-08 15:10:46 -04:00
Philippe Mathieu-Daudé f77b60d7e9
target/arm: Fix coding style issues
Since we'll move this code around, fix its style first.

Backports commit 9798ac7162c8a720c5d28f4d1fc9e03c7ab4f015 from qemu
2019-08-08 15:05:57 -04:00
Philippe Mathieu-Daudé 4f71266524
target/arm: Fix multiline comment syntax
Since commit 8c06fbdf36b checkpatch.pl enforce a new multiline
comment syntax. Since we'll move this code around, fix its style
first.

Backports commit 9a223097e44d5320f5e0546710263f22d11f12fc from qemu
2019-08-08 15:05:37 -04:00
Philippe Mathieu-Daudé 0a5152caf8
target/arm/helper: Remove unused include
Backports commit 2c8ec397f8438eea5e52be4898dfcf12a1f88267 from qemu
2019-08-08 14:43:57 -04:00
Philippe Mathieu-Daudé 8a0bae93ff
target/arm: Add copyright boilerplate
Backports commit ed3baad15b0b0edc480373f9c1d805d6b8e7e78c from qemu
2019-08-08 14:43:32 -04:00
Philippe Mathieu-Daudé b028a37fdb
target/arm: Makefile cleanup (softmmu)
Group SOFTMMU objects together.
Since PSCI is TCG specific, keep it separate.

Backports commit b601e0cd78c2c57548a468c6fdb566d514c05b89 from qemu
2019-08-08 14:42:00 -04:00
Philippe Mathieu-Daudé 348b813c03
target/arm: Makefile cleanup (ARM)
Group ARM objects together, TCG related ones at the bottom.
This will help when restricting TCG-only objects.

Backports commit 07774d584267488c8c2f104ae5b552791961908a from qemu
2019-08-08 14:39:52 -04:00
Lioncash 76d33b34e1
target/arm: Fix bad patch merge in arm_tr_init_disas_context 2019-08-08 14:37:38 -04:00
Philippe Mathieu-Daudé 5c0cff59f2
target/arm: Makefile cleanup (Aarch64)
Group Aarch64 rules together, TCG related ones at the bottom.
This will help when restricting TCG-only objects.

Backports commit 87f4f183484dba7a460c59e99dac0dbb9f42ed87 from qemu
2019-08-08 14:31:24 -04:00
Lucien Murray-Pitts ac6ee425d3
m68k comments break patch submission due to being incorrectly formatted
Altering all comments in target/m68k to match Qemu coding styles so that future
patches wont fail due to style breaches.

Backports commit 808d77bc5f878a666035d478480b8ed229bd49fe from qemu
2019-08-08 14:26:45 -04:00
Aleksandar Markovic 977e53b921
target/mips: Fix emulation of ILVR.<B|H|W> on big endian host
Fix emulation of ILVR.<B|H|W> on big endian host by applying
mapping of data element indexes from one endian to another.

Backports commit 14f5d874bcd533054648bb7cc767c7169eaf2f1c from qemu
2019-06-30 20:03:23 -04:00
Aleksandar Markovic f7fc20c9ee
target/mips: Fix emulation of ILVL.<B|H|W> on big endian host
Fix emulation of ILVL.<B|H|W> on big endian host by applying
mapping of data element indexes from one endian to another.

Backports commit 8e74bceb00120b23f0931e4e4478d1b10e0970d4 from qemu
2019-06-30 20:02:12 -04:00
Aleksandar Markovic 929bd90968
target/mips: Fix emulation of ILVOD.<B|H|W> on big endian host
Fix emulation of ILVOD.<B|H|W> on big endian host by applying
mapping of data element indexes from one endian to another.

Backports commit b000169e4ed44a3925b6fd22fa0dd6e22bb02b81 from qemu
2019-06-30 19:54:35 -04:00
Aleksandar Markovic 17dd65b771
target/mips: Fix emulation of ILVEV.<B|H|W> on big endian host
Fix emulation of ILVEV.<B|H|W> on big endian host by applying
mapping of data element indexes from one endian to another.

Backports commit 98880cb5a669a35b5bc75432027f2b9fff566aea from qemu
2019-06-30 19:52:14 -04:00
Aleksandar Markovic e9dc22c280
target/mips: Fix if-else-switch-case arms checkpatch errors in translate.c
Remove if-else-switch-case-arms-related checkpatch errors.

Backports commit 1f8929d241c5461f3e98d52f54bcdadd35554448 from qemu
2019-06-30 19:42:16 -04:00
Aleksandar Markovic 1e52cb8fa1
target/mips: Fix some space checkpatch errors in translate.c
Remove some space-related checkpatch warning.

Backports commit 235785e8347558f36be21aa99efa1ba517ecc827 from qemu
2019-06-30 19:34:41 -04:00
Xiaoyao Li 576be63f06
target/i386: define a new MSR based feature word - FEAT_CORE_CAPABILITY
MSR IA32_CORE_CAPABILITY is a feature-enumerating MSR, which only
enumerates the feature split lock detection (via bit 5) by now.

The existence of MSR IA32_CORE_CAPABILITY is enumerated by CPUID.7_0:EDX[30].

The latest kernel patches about them can be found here:
https://lkml.org/lkml/2019/4/24/1909

Backports commit 597360c0d8ebda9ca6f239db724a25bddec62b2f from qemu
2019-06-25 18:59:52 -05:00
Peter Maydell fa19f96e8c
target/arm: Check for dp support for dp VFM, not sp
In commit 1120827fa182f0e7622 we accidentally put the
"UNDEF unless FPU has double-precision support" check in
the single-precision VFM function. Put it in the dp
function where it belongs.

Backports commit 34bea4edb9bbe8edf4b8606276482acdff5ca58b from qemu
2019-06-25 18:56:34 -05:00
Peter Maydell dc1f2247ec
target/arm: Only implement doubles if the FPU supports them
The architecture permits FPUs which have only single-precision
support, not double-precision; Cortex-M4 and Cortex-M33 are
both like that. Add the necessary checks on the MVFR0 FPDP
field so that we UNDEF any double-precision instructions on
CPUs like this.

Note that even if FPDP==0 the insns like VMOV-to/from-gpreg,
VLDM/VSTM, VLDR/VSTR which take double precision registers
still exist.

Backports commit 1120827fa182f0e76226df7ffe7a86598d1df54f from qemu
2019-06-25 18:55:25 -05:00
Peter Maydell cfac686c95
target/arm: Fix typos in trans function prototypes
In several places cut and paste errors meant we were using the wrong
type for the 'arg' struct in trans_ functions called by the
decodetree decoder, because we were using the _sp version of the
struct in the _dp function. These were harmless, because the two
structs were identical and so decodetree made them typedefs of the
same underlying structure (and we'd have had a compile error if they
were not harmless), but we should clean them up anyway.

Backports commit 83655223ac6143a563e981906ce13fd6f2cfbefd from qemu
2019-06-25 18:48:34 -05:00
Peter Maydell 318a1ddf39
target/arm: Remove unused cpu_F0s, cpu_F0d, cpu_F1s, cpu_F1d
Remove the now unused TCG globals cpu_F0s, cpu_F0d, cpu_F1s, cpu_F1d.

cpu_M0 is still used by the iwmmxt code, and cpu_V0 and
cpu_V1 are used by both iwmmxt and Neon.

Backports commit d9eea52c67c04c58ecceba6ffe5a93d1d02051fa from qemu
2019-06-25 18:45:53 -05:00
Peter Maydell 74168c20f2
target/arm: Stop using deprecated functions in NEON_2RM_VCVT_F32_F16
Remove some old constructns from NEON_2RM_VCVT_F16_F32 code:
* don't use CPU_F0s
* don't use tcg_gen_st_f32

Backports commit b66f6b9981004bbf120b8d17c20f92785179bdf2 from qemu
2019-06-25 18:43:40 -05:00
Peter Maydell 8ae25f6e4c
target/arm: stop using deprecated functions in NEON_2RM_VCVT_F16_F32
Remove some old constructs from NEON_2RM_VCVT_F16_F32 code:
* don't use cpu_F0s
* don't use tcg_gen_ld_f32

Backports commit 58f2682eee738e8890f9cfe858e0f4f68b00d45d from qemu
2019-06-25 18:39:43 -05:00
Peter Maydell d419fbc270
target/arm: Stop using cpu_F0s in Neon VCVT fixed-point ops
Stop using cpu_F0s in the Neon VCVT fixed-point operations.

Backports commit c253dd7832bc6b4e140a0da56410a9336cce05bc from qemu
2019-06-25 18:35:33 -05:00
Peter Maydell 46216ae382
target/arm: Stop using cpu_F0s for Neon f32/s32 VCVT
Stop using cpu_F0s for the Neon f32/s32 VCVT operations.
Since this is the last user of cpu_F0s in the Neon 2rm-op
loop, we can remove the handling code for it too.

Backports commit 60737ed5785b9c1c6f1c85575dfdd1e9eec91878 from qemu
2019-06-25 18:32:32 -05:00
Peter Maydell 2fbe9c1d1d
target/arm: Stop using cpu_F0s for NEON_2RM_VRECPE_F and NEON_2RM_VRSQRTE_F
Stop using cpu_F0s for NEON_2RM_VRECPE_F and NEON_2RM_VRSQRTE_F.

Backports commit 9a011fece7201f8e268c982df8c7836f3335bbe6 from qemu
2019-06-25 18:29:22 -05:00
Peter Maydell f82ea34369
target/arm: Stop using cpu_F0s for NEON_2RM_VCVT[ANPM][US]
Stop using cpu_F0s for the NEON_2RM_VCVT[ANPM][US] ops.

Backports commit 30bf0a018f6c706913c8c0ea57b386907f4229be from qemu
2019-06-25 18:28:03 -05:00
Peter Maydell 0d4535bf16
target/arm: Stop using cpu_F0s for NEON_2RM_VRINT*
Switch NEON_2RM_VRINT* away from using cpu_F0s.

Backports commit 3b52ad1fae804acdc2fdc41b418a65249beae430 from qemu
2019-06-25 18:26:24 -05:00
Peter Maydell a62cbc7ac5
target/arm: Stop using cpu_F0s for NEON_2RM_VNEG_F
Switch NEON_2RM_VABS_F away from using cpu_F0s.

Backports commit cedcc96fc7c8e520a190a010ac97dbb53e57d7d2 from qemu
2019-06-25 18:24:01 -05:00
Peter Maydell 63d7f92eba
target/arm: Stop using cpu_F0s for NEON_2RM_VABS_F
Where Neon instructions are floating point operations, we
mostly use the old VFP utility functions like gen_vfp_abs()
which work on the TCG globals cpu_F0s and cpu_F1s. The
Neon for-each-element loop conditionally loads the inputs
into either a plain old TCG temporary for most operations
or into cpu_F0s for float operations, and similarly stores
back either cpu_F0s or the temporary.

Switch NEON_2RM_VABS_F away from using cpu_F0s, and
update neon_2rm_is_float_op() accordingly.

Backports commit fd8a68cdcf81d70eebf866a132e9780d4108da9c from qemu
2019-06-25 18:22:05 -05:00
Peter Maydell ba0ddd3459
target/arm: Use vfp_expand_imm() for AArch32 VFP VMOV_imm
The AArch32 VMOV (immediate) instruction uses the same VFP encoded
immediate format we already handle in vfp_expand_imm(). Use that
function rather than hand-decoding it.

Backports commit 9bee50b498410ed6466018b26464d7384c7879e9 from qemu
2019-06-25 18:20:19 -05:00
Peter Maydell b2dc290454
target/arm: Move vfp_expand_imm() to translate.[ch]
We want to use vfp_expand_imm() in the AArch32 VFP decode;
move it from the a64-only header/source file to the
AArch32 one (which is always compiled even for AArch64).

Backports commit d6a092d479333b5f20a647a912a31b0102d37335 from qemu
2019-06-25 18:17:49 -05:00
Peter Maydell 021da28bfd
target/arm: Fix short-vector increment behaviour
For VFP short vectors, the VFP registers are divided into a
series of banks: for single-precision these are s0-s7, s8-s15,
s16-s23 and s24-s31; for double-precision they are d0-d3,
d4-d7, ... d28-d31. Some banks are "scalar" meaning that
use of a register within them triggers a pure-scalar or
mixed vector-scalar operation rather than a full vector
operation. The scalar banks are s0-s7, d0-d3 and d16-d19.
When using a bank as part of a vector operation, we
iterate through it, increasing the register number by
the specified stride each time, and wrapping around to
the beginning of the bank.

Unfortunately our calculation of the "increment" part of this
was incorrect:
vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask)
will only do the intended thing if bank_mask has exactly
one set high bit. For instance for doubles (bank_mask = 0xc),
if we start with vd = 6 and delta_d = 2 then vd is updated
to 12 rather than the intended 4.

This only causes problems in the unlikely case that the
starting register is not the first in its bank: if the
register number doesn't have to wrap around then the
expression happens to give the right answer.

Fix this bug by abstracting out the "check whether register
is in a scalar bank" and "advance register within bank"
operations to utility functions which use the right
bit masking operations

Backports commit 18cf951af9a27ae573a6fa17f9d0c103f7b7679b from qemu
2019-06-13 19:44:27 -04:00
Peter Maydell 1a0d31c05e
target/arm: Convert float-to-integer VCVT insns to decodetree
Convert the float-to-integer VCVT instructions to decodetree.
Since these are the last unconverted instructions, we can
delete the old decoder structure entirely now.

Backports commit 3111bfc2da6ba0c8396dc97ca479942d711c6146 from qemu
2019-06-13 19:40:02 -04:00
Peter Maydell f6c67559d4
target/arm: Convert VCVT fp/fixed-point conversion insns to decodetree
Convert the VCVT (between floating-point and fixed-point) instructions
to decodetree.

Backports commit e3d6f4290c788e850c64815f0b3e331600a4bcc0 from qemu
2019-06-13 19:35:51 -04:00
Peter Maydell c66d477359
target/arm: Convert VJCVT to decodetree
Convert the VJCVT instruction to decodetree.

Backports commit 92073e947487e2109f3dfebfeaa48d6323cbd981 from qemu
2019-06-13 19:31:35 -04:00
Peter Maydell 7be9e6f9b4
target/arm: Convert integer-to-float insns to decodetree
Convert the VCVT integer-to-float instructions to decodetree.

Backports commit 8fc9d8918cde342c71923e361b9f2193e36ed18b from qemu
2019-06-13 19:20:41 -04:00
Peter Maydell e0e4f99103
target/arm: Convert double-single precision conversion insns to decodetree
Convert the VCVT double/single precision conversion insns to decodetree.

Backports commit 6ed7e49c3693ed8411773c4880f42b2932beb12d from qemu
2019-06-13 19:18:01 -04:00
Peter Maydell ab9d0235ed
target/arm: Convert VFP round insns to decodetree
Convert the VFP round-to-integer instructions VRINTR, VRINTZ and
VRINTX to decodetree.

These instructions were only introduced as part of the "VFP misc"
additions in v8A, so we check this. The old decoder's implementation
was incorrectly providing them even for v7A CPUs.

Backports commit e25155f55dc4abb427a88dfe58bbbc550fe7d643 from qemu
2019-06-13 19:15:05 -04:00
Peter Maydell 9e842a0f2a
target/arm: Convert the VCVT-to-f16 insns to decodetree
Convert the VCVTT and VCVTB instructions which convert from
f32 and f64 to f16 to decodetree.

Since we're no longer constrained to the old decoder's style
using cpu_F0s and cpu_F0d we can perform a direct 16 bit
store of the right half of the input single-precision register
rather than doing a load/modify/store sequence on the full
32 bits.

Backports commit cdfd14e86ab0b1ca29a702d13a8e4af2e902a9bf from qemu
2019-06-13 19:03:59 -04:00
Peter Maydell 7d927b2d0e
target/arm: Convert the VCVT-from-f16 insns to decodetree
Convert the VCVTT, VCVTB instructions that deal with conversion
from half-precision floats to f32 or 64 to decodetree.

Since we're no longer constrained to the old decoder's style
using cpu_F0s and cpu_F0d we can perform a direct 16 bit
load of the right half of the input single-precision register
rather than loading the full 32 bits and then doing a
separate shift or sign-extension.

Backports commit b623d803dda805f07aadcbf098961fde27315c19 from qemu
2019-06-13 19:00:23 -04:00
Peter Maydell e6cc2616d2
target/arm: Convert VFP comparison insns to decodetree
Convert the VFP comparison instructions to decodetree.

Note that comparison instructions should not honour the VFP
short-vector length and stride information: they are scalar-only
operations. This applies to all the 2-operand instructions except
for VMOV, VABS, VNEG and VSQRT. (In the old decoder this is
implemented via the "if (op == 15 && rn > 3) { veclen = 0; }" check.)

Backports commit 386bba2368842fc74388a3c1651c6c0c0c70adbd from qemu
2019-06-13 18:55:53 -04:00
Peter Maydell a75a3e321f
target/arm: Convert VMOV (register) to decodetree
Backports commit 17552b979ebb9848a534c25ebed18a1072710058 from qemu
2019-06-13 18:49:49 -04:00
Peter Maydell ee30962891
target/arm: Convert VSQRT to decodetree
Convert the VSQRT instruction to decodetree.

Backports commit b8474540cbce4e2fa45010416375d1bcbe86dc15 from qemu
2019-06-13 18:47:32 -04:00
Peter Maydell 7aea3da6b7
target/arm: Convert VNEG to decodetree
Convert the VNEG instruction to decodetree.

Backports commit 1882651afdb0ca44f0631192fbe65a71c660d809 from qemu
2019-06-13 18:43:50 -04:00
Peter Maydell 1032d86ad3
target/arm: Convert VABS to decodetree
Convert the VFP VABS instruction to decodetree.

Unlike the 3-op versions, we don't pass fpst to the VFPGen2OpSPFn or
VFPGen2OpDPFn because none of the operations which use this format
and support short vectors will need it.

Backports commit 90287e22c987e9840704345ed33d237cbe759dd9 from qemu
2019-06-13 18:41:43 -04:00
Peter Maydell 7a16bc6876
target/arm: Convert VMOV (imm) to decodetree
Convert the VFP VMOV (immediate) instruction to decodetree.

Backports commit b518c753f0b94e14e01e97b4ec42c100dafc0cc2 from qemu
2019-06-13 18:37:58 -04:00
Peter Maydell 0ebb6b8b90
target/arm: Convert VFP fused multiply-add insns to decodetree
Convert the VFP fused multiply-add instructions (VFNMA, VFNMS,
VFMA, VFMS) to decodetree.

Note that in the old decode structure we were implementing
these to honour the VFP vector stride/length. These instructions
were introduced in VFPv4, and in the v7A architecture they
are UNPREDICTABLE if the vector stride or length are non-zero.
In v8A they must UNDEF if stride or length are non-zero, like
all VFP instructions; we choose to UNDEF always.

Backports commit d4893b01d23060845ee3855bc96626e16aad9ab5 from qemu
2019-06-13 18:24:36 -04:00
Peter Maydell 321bcc822b
target/arm: Convert VDIV to decodetree
Convert the VDIV instruction to decodetree.

Backports commit 519ee7ae31e050eb0ff9ad35c213f0bd7ab1c03e from qemu
2019-06-13 18:19:47 -04:00
Peter Maydell 76c74bc657
target/arm: Convert VSUB to decodetree
Convert the VSUB instruction to decodetree.

Backports commit 8fec9a119264b7936503abce3c106fad7e3ccb76 from qemu.
2019-06-13 18:18:00 -04:00
Peter Maydell f56f0342ad
target/arm: Convert VADD to decodetree
Convert the VADD instruction to decodetree.

Backports commit ce28b303716e7eca3f3765bf6776d722ebbe1122 from qemu
2019-06-13 18:15:52 -04:00
Peter Maydell 06584edf61
target/arm: Convert VNMUL to decodetree
Convert the VNMUL instruction to decodetree.

Backports commit 43c4be1236c105090d134540da1036073d157cd4 from qemu
2019-06-13 18:14:16 -04:00
Peter Maydell 2c5e102017
target/arm: Convert VMUL to decodetree
Convert the VMUL instruction to decodetree.

Backports commit 88c5188ced60e9f2b8cc3af3b9bc4a8031c8c996 from qemu
2019-06-13 18:12:03 -04:00
Peter Maydell b26b6a12a2
target/arm: Convert VFP VNMLA to decodetree
Convert the VFP VNMLA instruction to decodetree.

Backports commit 8a483533adc1bdc2decb8f456dbe930a2d245a8b from qemu
2019-06-13 18:09:57 -04:00
Peter Maydell 638b90de31
target/arm: Convert VFP VNMLS to decodetree
Convert the VFP VNMLS instruction to decodetree.

Backports commit c54a416cc6d60efbc79dd37aaf0c8918c05b5815 from qemu
2019-06-13 18:06:59 -04:00
Peter Maydell 67ad40ffa4
target/arm: Convert VFP VMLS to decodetree
Convert the VFP VMLS instruction to decodetree.

Backports commit e7258280d46af4ab6a0cc93ccfe8f6614defb4b7 from qemu
2019-06-13 18:02:37 -04:00
Peter Maydell edf81eb214
target/arm: Convert VFP VMLA to decodetree
Convert the VFP VMLA instruction to decodetree.

This is the first of the VFP 3-operand data processing instructions,
so we include in this patch the code which loops over the elements
for an old-style VFP vector operation. The existing code to do this
looping uses the deprecated cpu_F0s/F0d/F1s/F1d TCG globals; since
we are going to be converting instructions one at a time anyway
we can take the opportunity to make the new loop use TCG temporaries,
which means we can do that conversion one operation at a time
rather than needing to do it all in one go.

We include an UNDEF check which was missing in the old code:
short-vector operations (with stride or length non-zero) were
deprecated in v7A and must UNDEF in v8A, so if the MVFR0 FPShVec
field does not indicate that support for short vectors is present
we UNDEF the operations that would use them. (This is a change
of behaviour for Cortex-A7, Cortex-A15 and the v8 CPUs, which
previously were all incorrectly allowing short-vector operations.)

Note that the conversion fixes a bug in the old code for the
case of VFP short-vector "mixed scalar/vector operations". These
happen where the destination register is in a vector bank but
but the second operand is in a scalar bank. For example
vmla.f64 d10, d1, d16 with length 2 stride 2
is equivalent to the pair of scalar operations
vmla.f64 d10, d1, d16
vmla.f64 d8, d3, d16
where the destination and first input register cycle through
their vector but the second input is scalar (d16). In the
old decoder the gen_vfp_F1_mul() operation uses cpu_F1{s,d}
as a temporary output for the multiply, which trashes the
second input operand. For the fully-scalar case (where we
never do a second iteration) and the fully-vector case
(where the loop loads the new second input operand) this
doesn't matter, but for the mixed scalar/vector case we
will end up using the wrong value for later loop iterations.
In the new code we use TCG temporaries and so avoid the bug.
This bug is present for all the multiply-accumulate insns
that operate on short vectors: VMLA, VMLS, VNMLA, VNMLS.

Note 2: the expression used to calculate the next register
number in the vector bank is not in fact correct; we leave
this behaviour unchanged from the old decoder and will
fix this bug later in the series.

Backports commit 266bd25c485597c94209bfdb3891c1d0c573c164 from qemu
2019-06-13 17:59:16 -04:00
Peter Maydell 93fe4cbe9e
target/arm: Remove VLDR/VSTR/VLDM/VSTM use of cpu_F0s and cpu_F0d
Expand out the sequences in the new decoder VLDR/VSTR/VLDM/VSTM trans
functions which perform the memory accesses by going via the TCG
globals cpu_F0s and cpu_F0d, to use local TCG temps instead.

Backports commit 3993d0407dff7233e42f2251db971e126a0497e9 from qemu
2019-06-13 17:31:28 -04:00
Peter Maydell ff7042567e
target/arm: Convert the VFP load/store multiple insns to decodetree
Convert the VFP load/store multiple insns to decodetree.
This includes tightening up the UNDEF checking for pre-VFPv3
CPUs which only have D0-D15 : they now UNDEF for any access
to D16-D31, not merely when the smallest register in the
transfer list is in D16-D31.

This conversion does not try to share code between the single
precision and the double precision versions; this looks a bit
duplicative of code, but it leaves the door open for a future
refactoring which gets rid of the use of the "F0" registers
by inlining the various functions like gen_vfp_ld() and
gen_mov_F0_reg() which are hiding "if (dp) { ... } else { ... }"
conditionalisation.

Backports commit fa288de272c5c8a66d5eb683b123706a52bc7ad6 from qemu
2019-06-13 17:26:52 -04:00
Peter Maydell 6f0633ce80
target/arm: Convert VFP VLDR and VSTR to decodetree
Convert the VFP single load/store insns VLDR and VSTR to decodetree.

Backports commit 79b02a3b5231c5b8cd31e50cd549968dd0a05c49 from qemu
2019-06-13 17:22:48 -04:00
Peter Maydell fe98885ff2
target/arm: Convert VFP two-register transfer insns to decodetree
Convert the VFP two-register transfer instructions to decodetree
(in the v8 Arm ARM these are the "Advanced SIMD and floating-point
64-bit move" encoding group).

Again, we expand out the sequences involving gen_vfp_msr() and
gen_msr_vfp().

Backports commit 81f681106eabe21c55118a5a41999fb7387fb714 from qemu
2019-06-13 17:20:00 -04:00
Peter Maydell 3fb3403b82
target/arm: Convert single-precision register moves to decodetree
Convert the "single-precision" register moves to decodetree:
* VMSR
* VMRS
* VMOV between general purpose register and single precision

Note that the VMSR/VMRS conversions make our handling of
the "should this UNDEF?" checks consistent between the two
instructions:
* VMSR to MVFR0, MVFR1, MVFR2 now UNDEF from EL0
  (previously was a nop)
* VMSR to FPSID now UNDEFs from EL0 or if VFPv3 or better
  (previously was a nop)
* VMSR to FPINST and FPINST2 now UNDEF if VFPv3 or better
  (previously would write to the register, which had no
  guest-visible effect because we always UNDEF reads)

We also tighten up the decode: we were previously underdecoding
some SBZ or SBO bits.

The conversion of VMOV_single includes the expansion out of the
gen_mov_F0_vreg()/gen_vfp_mrs() and gen_mov_vreg_F0()/gen_vfp_msr()
sequences into the simpler direct load/store of the TCG temp via
neon_{load,store}_reg32(): we know in the new function that we're
always single-precision, we don't need to use the old-and-deprecated
cpu_F0* TCG globals, and we don't happen to have the declaration of
gen_vfp_msr() and gen_vfp_mrs() at the point in the file where the
new function is.

Backports commit a9ab50011aeda2dd012da99069e078379315ea18 from qemu
2019-06-13 17:16:38 -04:00