The pseudocode for this operation is an increment + compare loop,
so comparing <= the maximum integer produces an all-true predicate.
Rather than bound in both the inline code and the helper, pass the
helper the number of predicate bits to set instead of the number
of predicate elements to set.
Backports commit bbd0968c458d48e34a08b8694fa3309a9fe1c9e7 from qemu
The normal vector element is sign-extended before
comparing with the wide vector element.
Backports commit df4e001093988544d09887122ae824f18ba55c68 from qemu
Tailchaining is an optimization in handling of exception return
for M-profile cores: if we are about to pop the exception stack
for an exception return, but there is a pending exception which
is higher priority than the priority we are returning to, then
instead of unstacking and then immediately taking the exception
and stacking registers again, we can chain to the pending
exception without unstacking and stacking.
For v6M and v7M it is IMPDEF whether tailchaining happens for pending
exceptions; for v8M this is architecturally required. Implement it
in QEMU for all M-profile cores, since in practice v6M and v7M
hardware implementations generally do have it.
(We were already doing tailchaining for derived exceptions which
happened during exception return, like the validity checks and
stack access failures; these have always been required to be
tailchained for all versions of the architecture.)
Backports commit 5f62d3b9e67bfc3deb970e3c7fb7df7e57d46fc3 from qemu
On exception return for M-profile, we must restore the CONTROL.SPSEL
bit from the EXCRET value before we do any kind of tailchaining,
including for the derived exceptions on integrity check failures.
Otherwise we will give the guest an incorrect EXCRET.SPSEL value on
exception entry for the tailchained exception.
Backports commit 89b1fec193b81b6ad0bd2975f2fa179980cc722e from qemu
In do_v7m_exception_exit(), we use the exc_secure variable to track
whether the exception we're returning from is secure or non-secure.
Unfortunately the statement initializing this was accidentally
inside an "if (env->v7m.exception != ARMV7M_EXCP_NMI)" conditional,
which meant that we were using the wrong value for NMI handlers.
Move the initialization out to the right place.
Backports commit b8109608bc6f3337298d44ac4369bf0bc8c3a1e4 from qemu
One of the required effects of setting HCR_EL2.TGE is that when
SCR_EL3.NS is 1 then SCTLR_EL1.M must behave as if it is zero for
all purposes except direct reads. That is, it effectively disables
the MMU for the NS EL0/EL1 translation regime.
Backports commit 3d0e3080d8b7abcddc038d18e8401861c369c4c1 from qemu
The IMO, FMO and AMO bits in HCR_EL2 are defined to "behave as
1 for all purposes other than direct reads" if HCR_EL2.TGE
is set and HCR_EL2.E2H is 0, and to "behave as 0 for all
purposes other than direct reads" if HCR_EL2.TGE is set
and HRC_EL2.E2H is 1.
To avoid having to check E2H and TGE everywhere where we test IMO and
FMO, provide accessors arm_hcr_el2_imo(), arm_hcr_el2_fmo()and
arm_hcr_el2_amo(). We don't implement ARMv8.1-VHE yet, so the E2H
case will never be true, but we include the logic to save effort when
we eventually do get to that.
(Note that in several of these callsites the change doesn't
actually make a difference as either the callsite is handling
TGE specially anyway, or the CPU can't get into that situation
with TGE set; we change everywhere for consistency.)
Backports commit ac656b166b57332ee397e9781810c956f4f5fde5 from qemu
Whene we raise a synchronous exception, if HCR_EL2.TGE is set then
exceptions targeting NS EL1 must be redirected to EL2. Implement
this in raise_exception() -- all synchronous exceptions go through
this function.
(Asynchronous exceptions go via arm_cpu_exec_interrupt(), which
already honours HCR_EL2.TGE when it determines the target EL
in arm_phys_excp_target_el().)
Backports commit 7556edfb4d7bf0583c852c8cfc49ef494c41dd8a from qemu
Some debug registers can be trapped via MDCR_EL2 bits TDRA, TDOSA,
and TDA, which we implement in the functions access_tdra(),
access_tdosa() and access_tda(). If MDCR_EL2.TDE or HCR_EL2.TGE
are 1, the TDRA, TDOSA and TDA bits should behave as if they were 1.
Implement this by having the access functions check MDCR_EL2.TDE
and HCR_EL2.TGE.
Backports commit 30ac6339dca3fe0d05a611f12eedd5af20af585a from qemu
If the "trap general exceptions" bit HCR_EL2.TGE is set, we
must mask all virtual interrupts (as per DDI0487C.a D1.14.3).
Implement this in arm_excp_unmasked().
Backports commit 2ccf0fef632f3d54b2cc9ea08f1e6904ff1f8df4 from qemu
Forbid stack alignment change. (CCR)
Reserve FAULTMASK, BASEPRI registers.
Report any fault as a HardFault. Disable MemManage, BusFault and
UsageFault, so they always escalated to HardFault. (SHCSR)
Backports commit 22ab3460017cfcfb6b50f05838ad142e08becce5 from qemu
MSR_SMI_COUNT started being migrated in QEMU 2.12. Do not migrate it
on older machine types, or the subsection causes a load failure for
guests that use SMM.
Backports part of commit 990e0be2603511560168e1ad61f68294d951c39e from
qemu
Rename DCACHE to DATA_CACHE and ICACHE to INSTRUCTION_CACHE.
This avoids conflict with Linux asm/cachectl.h macros and fixes
build failure on mips hosts.
Backports commit 5f00335aecafc9ad56592d943619d3252f8941f1 from qemu
To correctly handle small (less than TARGET_PAGE_SIZE) MPU regions,
we must correctly handle the case where the address being looked
up hits in an MPU region that is not small but the address is
in the same page as a small region. For instance if MPU region
1 covers an entire page from 0x2000 to 0x2400 and MPU region
2 is small and covers only 0x2200 to 0x2280, then for an access
to 0x2000 we must not return a result covering the full page
even though we hit the page-sized region 1. Otherwise we will
then cache that result in the TLB and accesses that should
hit region 2 will incorrectly find the region 1 information.
Check for the case where we miss an MPU region but it is still
within the same page, and in that case narrow the size we will
pass to tlb_set_page_with_attrs() for whatever the final
outcome is of the MPU lookup.
Backports commit 9d2b5a58f85be2d8e129c4b53d6708ecf8796e54 from qemu
'I' was being double-incremented; correctly within the inner loop
and incorrectly within the outer loop.
Backports commit 628fc75f3a3bb115de3b445c1a18547c44613cfe from qemu
For M-profile exception returns, the mmu index to use for exception
return unstacking is supposed to be that of wherever we are returning to:
* if returning to handler mode, privileged
* if returning to thread mode, privileged or unprivileged depending on
CONTROL.nPRIV for the destination security state
We were passing the wrong thing as the 'priv' argument to
arm_v7m_mmu_idx_for_secstate_and_priv(). The effect was that guests
which programmed the MPU to behave differently for privileged and
unprivileged code could get spurious MemManage Unstack exceptions.
Backports commit 2b83714d4ea659899069a4b94aa2dfadc847a013 from qemu
Use MAKE_64BIT_MASK instead of open-coding. Remove an odd
vector size check that is unlikely to be more profitable
than 3 64-bit integer stores. Correct the iteration for WORD
to avoid writing too much data.
Fixes RISU tests of PTRUE for VL 256.
Backports commit 973558a3f869e591d2406dd8226ec0c4e32a3c3e from qemu
These instructions must perform the sve_access_check, but
since they are implemented as NOPs there is no generated
code to elide when the access check fails.
Backports commit 2f95a3b09aebdcb5c9152a7ac434a5d57441fe82 from qemu
This implements NPT suport for SVM by hooking into
x86_cpu_handle_mmu_fault where it reads the stage-1 page table. Whether
we need to perform this 2nd stage translation, and how, is decided
during vmrun and stored in hflags2, along with nested_cr3 and
nested_pg_mode.
As get_hphys performs a direct cpu_vmexit in case of NPT faults, we need
retaddr in that function. To avoid changing the signature of
cpu_handle_mmu_fault, this passes the value from tlb_fill to get_hphys
via the CPU state.
This was tested successfully via the Jailhouse hypervisor.
Backports commit fe441054bb3f0c75ff23335790342c0408e11c3a from qemu
There is no need to re-set these 3 features already
implied by the call to aarch64_a15_initfn.
Backports commit 0b33968e7f4cf998f678b2d1a5be3d6f3f3513d8 from qemu
There is no need to re-set these 9 features already
implied by the call to aarch64_a57_initfn.
Backports commit 156a7065365578deb3d63c2b5b69a4b5999a8fcc from qemu
Leave ARM_CP_SVE, removing ARM_CP_FPU; the sve_access_check
produced by the flag already includes fp_access_check. If
we also check ARM_CP_FPU the double fp_access_check asserts.
Backports commit 11d7870b1b4d038d7beb827f3afa72e284701351 from qemu
We already check for the same condition within the normal integer
sdiv and sdiv64 helpers. Use a slightly different formation that
does not require deducing the expression type.
Backports commit 7e8fafbfd0537937ba8fb366a90ea6548cc31576 from qemu
Since kernel commit a86bd139f2 (arm64: arch_timer: Enable CNTVCT_EL0
trap..), released in kernel version v4.12, user-space has been able
to read these system registers. As we can't use QEMUTimer's in
linux-user mode we just directly call cpu_get_clock().
Backports commit 26c4a83bd4707797868174332a540f7d61288d15 from qemu
We've already added the helpers with an SVE patch, all that remains
is to wire up the aa64 and aa32 translators. Enable the feature
within -cpu max for CONFIG_USER_ONLY.
Backports commit 26c470a7bb4233454137de1062341ad48947f252 from qemu
Enhance the existing helpers to support SVE, which takes the
index from each 128-bit segment. The change has no effect
for AdvSIMD, since there is only one such segment.
Backports commit 18fc24057815bf3d956cfab892a2bc2344bd1dcb from qemu
For aa64 advsimd, we had been passing the pre-indexed vector.
However, sve applies the index to each 128-bit segment, so we
need to pass in the index separately.
For aa32 advsimd, the fp32 operation always has index 0, but
we failed to interpret the fp16 index correctly.
Backports commit 2cc99919a81a62589a4a6b0f365eabfead1db1a7 from qemu
It calls cpu_loop_exit in system emulation mode (and should never be
called in user emulation mode).
Backports commit 50b3de6e5cd464dcc20e3a48f5a09e0299a184ac from qemu
We need to terminate the translation block after STGI so that pending
interrupts can be injected.
This fixes pending NMI injection for Jailhouse which uses "stgi; clgi"
to open a brief injection window.
Backports commit df2518aa587a0157bbfbc635fe47295629d9914a from qemu
Check for SVM interception prior to injecting an NMI. Tested via the
Jailhouse hypervisor.
Backports commit 02f7fd25a446a220905c2e5cb0fc3655d7f63b29 from qemu
The implementation of these two instructions was swapped.
At the same time, unify the setup of eflags for the insn group.
Backports commit 13672386a93fef64cfd33bd72fbf3d80f2c00e94 from qemu
Offset can be larger than 16 bit from nanoMIPS,
and immediate field can be larger than 16 bits as well.
Backports commit 72e1f16f18fe62504f8f25d7a3f6813b24b221be from qemu
Fix to raise a Reserved Instruction exception when given fs is not
available from CTC1.
Backports commit f48a2cb21824217a61ec7be797860a0702e5325c from qemu
Allow ARMv8M to handle small MPU and SAU region sizes, by making
get_phys_add_pmsav8() set the page size to the 1 if the MPU or
SAU region covers less than a TARGET_PAGE_SIZE.
We choose to use a size of 1 because it makes no difference to
the core code, and avoids having to track both the base and
limit for SAU and MPU and then convert into an artificially
restricted "page size" that the core code will then ignore.
Since the core TCG code can't handle execution from small
MPU regions, we strip the exec permission from them so that
any execution attempts will cause an MPU exception, rather
than allowing it to end up with a cpu_abort() in
get_page_addr_code().
(The previous code's intention was to make any small page be
treated as having no permissions, but unfortunately errors
in the implementation meant that it didn't behave that way.
It's possible that some binaries using small regions were
accidentally working with our old behaviour and won't now.)
We also retain an existing bug, where we ignored the possibility
that the SAU region might not cover the entire page, in the
case of executable regions. This is necessary because some
currently-working guest code images rely on being able to
execute from addresses which are covered by a page-sized
MPU region but a smaller SAU region. We can remove this
workaround if we ever support execution from small regions.
Backports commit 720424359917887c926a33d248131fbff84c9c28 from qemu
We want to handle small MPU region sizes for ARMv7M. To do this,
make get_phys_addr_pmsav7() set the page size to the region
size if it is less that TARGET_PAGE_SIZE, rather than working
only in TARGET_PAGE_SIZE chunks.
Since the core TCG code con't handle execution from small
MPU regions, we strip the exec permission from them so that
any execution attempts will cause an MPU exception, rather
than allowing it to end up with a cpu_abort() in
get_page_addr_code().
(The previous code's intention was to make any small page be
treated as having no permissions, but unfortunately errors
in the implementation meant that it didn't behave that way.
It's possible that some binaries using small regions were
accidentally working with our old behaviour and won't now.)
Backports commit e5e40999b5e03567ef654546e3d448431643f8f3 from qemu
Enable TOPOEXT feature on EPYC CPU. This is required to support
hyperthreading on VM guests. Also extend xlevel to 0x8000001E.
Disable topoext on PC_COMPAT_2_12 and keep xlevel 0x8000000a.
Backports commit e00516475c270dcb6705753da96063f95699abf2 from qemu
This is part of topoext support. To keep the compatibility, it is better
we support all the combination of nr_cores and nr_threads currently
supported. By allowing more nr_cores and nr_threads, we might end up with
more nodes than we can actually support with the real hardware. We need to
fix up the node id to make this work. We can achieve this by shifting the
socket_id bits left to address more nodes.
Backports commit 631be32155dbafa1fe886f2488127956c9120ba6 from qemu
AMD future CPUs expose a mechanism to tell the guest that the
Speculative Store Bypass Disable is not needed and that the
CPU is all good.
This is exposed via the CPUID 8000_0008.EBX[26] bit.
See 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199889
Backports commit 254790a909a2f153d689bfa7d8e8f0386cda870d from qemu
AMD future CPUs expose _two_ ways to utilize the Intel equivalant
of the Speculative Store Bypass Disable. The first is via
the virtualized VIRT_SPEC CTRL MSR (0xC001_011f) and the second
is via the SPEC_CTRL MSR (0x48). The document titled:
124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
gives priority of SPEC CTRL MSR over the VIRT SPEC CTRL MSR.
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199889
Anyhow, this means that on future AMD CPUs there will be _two_ ways to
deal with SSBD.
Backports commit a764f3f7197f4d7ad8fe8424269933de912224cb from qemu
OSPKE is not a static feature flag: it changes dynamically at
runtime depending on CR4, and it was never configurable: KVM
never returned OSPKE on GET_SUPPORTED_CPUID, and on TCG enables
it automatically if CR4_PKE_MASK is set.
Remove OSPKE from the feature name array so users don't try to
configure it manually.
Backports commit 9ccb9784b57804f5c74434ad6ccb66650a015ffc from qemu
OSXAVE is not a static feature flag: it changes dynamically at
runtime depending on CR4, and it was never configurable: KVM
never returned OSXSAVE on GET_SUPPORTED_CPUID, and it is not
included in TCG_EXT_FEATURES.
Remove OSXSAVE from the feature name array so users don't try to
configure it manually.
Backports commit f1a23522b03a569f13aad49294bb4c4b1a9500c7 from qemu
Add support for cpuid leaf CPUID_8000_001E. Build the config that closely
match the underlying hardware. Please refer to the Processor Programming
Reference (PPR) for AMD Family 17h Model for more details.
Backports commit ed78467a214595a63af7800a073a03ffe37cd7db from qemu
Unlike ARMv7-M, ARMv6-M and ARMv8-M Baseline only supports naturally
aligned memory accesses for load/store instructions.
Backports commit 2aeba0d007d33efa12a6339bb140aa634e0d52eb from qemu
This feature is intended to distinguish ARMv8-M variants: Baseline and
Mainline. ARMv7-M compatibility requires the Main Extension. ARMv6-M
compatibility is provided by all ARMv8-M implementations.
Backports commit cc2ae7c9de14efd72c6205825eb7cd980ac09c11 from qemu
The arrays were made static, "if" was simplified because V7M and V8M
define V6 feature.
Backports commit 8297cb13e407db8a96cc7ed6b6a6c318a150759a from qemu
ARMv6-M supports 6 Thumb2 instructions. This patch checks for these
instructions and allows their execution.
Like Thumb2 cores, ARMv6-M always interprets BL instruction as 32-bit.
This patch is required for future Cortex-M0 support.
Backports commit 14120108f87b3f9e1beacdf0a6096e464e62bb65 from qemu
Rearrange the arithmetic so that we are agnostic about the total size
of the vector and the size of the element. This will allow us to index
up to the 32nd byte and with 16-byte elements.
Backports commit 66f2dbd783d0b6172043e3679171421b2d0bac11 from qemu
Add information for cpuid 0x8000001D leaf. Populate cache topology information
for different cache types (Data Cache, Instruction Cache, L2 and L3) supported
by 0x8000001D leaf. Please refer to the Processor Programming Reference (PPR)
for AMD Family 17h Model for more details.
Backports commit 8f4202fb1080f86958782b1fca0bf0279f67d136 from qemu
Always initialize CPUCaches structs with cache information, even
if legacy_cache=true. Use different CPUCaches struct for
CPUID[2], CPUID[4], and the AMD CPUID leaves.
This will simplify a lot the logic inside cpu_x86_cpuid()
Backports commit a9f27ea9adc8c695197bd08f2e938ef7b4183f07 from qemu
Rather than limit total TB size to PAGE-32 bytes, end the TB when
near the end of a page. This should provide proper semantics of
SIGSEGV when executing near the end of a page.
Backports commit 4c7a0f6f34869b3dfe7091d28ff27a8dfbdd8b70 from qemu
Removed ctx->insn_pc in favour of ctx->base.pc_next.
Yes, it is annoying, but didn't want to waste its 4 bytes.
Backports commit a575cbe01caecf22ab322a9baa5930a6d9e39ca6 from qemu
The name gen_lookup_tb is at odds with tcg_gen_lookup_and_goto_tb.
For these cases, we do indeed want to exit back to the main loop.
Similarly, DISAS_UPDATE performs no actual update, whereas DISAS_EXIT
does what it says.
Backports commit 4106f26e95c83b8759c3fe61a4d3a1fa740db0a9 from qemu
These are all indirect or out-of-page direct jumps.
We can indirectly chain to the next TB without going
back to the main loop.
Backports commit 8aaf7da9c3b1f282b5a123de3e87a2e6ca87f3b9 from qemu
We have exited the TB after using goto_tb; there is no
distinction from DISAS_NORETURN.
Backports commit 825340f5659647deb62743c3cb479ec8d78f1862 from qemu
The raise_exception helper does not return. Do not generate
any code following that.
Backports commit cb4add334a5a8db263c20c33c5365be3868f8967 from qemu
Do the cast to uintptr_t within the helper, so that the compiler
can type check the pointer argument. We can also do some more
sanity checking of the index argument.
Backports commit 07ea28b41830f946de3841b0ac61a3413679feb9 from qemu
Depending on the host abi, float16, aka uint16_t, values are
passed and returned either zero-extended in the host register
or with garbage at the top of the host register.
The tcg code generator has so far been assuming garbage, as that
matches the x86 abi, but this is incorrect for other host abis.
Further, target/arm has so far been assuming zero-extended results,
so that it may store the 16-bit value into a 32-bit slot with the
high 16-bits already clear.
Rectify both problems by mapping "f16" in the helper definition
to uint32_t instead of (a typedef for) uint16_t. This forces
the host compiler to assume garbage in the upper 16 bits on input
and to zero-extend the result on output.
Backports commit 6c2be133a7478e443c99757b833d0f265c48e0a6 from qemu
The FRECPX instructions should (like most other floating point operations)
honour the FPCR.FZ bit which specifies whether input denormals should
be flushed to zero (or FZ16 for the half-precision version).
We forgot to implement this, which doesn't affect the results (since
the calculation doesn't actually care about the mantissa bits) but did
mean we were failing to set the FPSR.IDC bit.
Backports commit 2cfbf36ec07f7cac1aabb3b86f1c95c8a55424ba from qemu
AMD Zen expose the Intel equivalant to Speculative Store Bypass Disable
via the 0x80000008_EBX[25] CPUID feature bit.
This needs to be exposed to guest OS to allow them to protect
against CVE-2018-3639.
Backports commit 403503b162ffc33fb64cfefdf7b880acf41772cd from qemu
"Some AMD processors only support a non-architectural means of enabling
speculative store bypass disable (SSBD). To allow a simplified view of
this to a guest, an architectural definition has been created through a new
CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f. With this, a
hypervisor can virtualize the existence of this definition and provide an
architectural method for using SSBD to a guest.
Add the new CPUID feature, the new MSR and update the existing SSBD
support to use this MSR when present." (from x86/speculation: Add virtualized
speculative store bypass disable support in Linux).
Backports commit cfeea0c021db6234c154dbc723730e81553924ff from qemu
New microcode introduces the "Speculative Store Bypass Disable"
CPUID feature bit. This needs to be exposed to guest OS to allow
them to protect against CVE-2018-3639.
Backports commit d19d1f965904a533998739698020ff4ee8a103da from qemu
Excepting MOVPRFX, which isn't a reduction. Presumably it is
placed within the group because of its encoding.
Backports commit 047cec971d2791b206677b954227ea92ff7ee3db from qemu