Besides being more correct, arbitrarily long instruction allow the
generation of a translation block that spans three pages. This
confuses the generator and even allows ring 3 code to poison the
translation block cache and inject code into other processes that are
in guest ring 3.
This is an improved (and more invasive) fix for commit 30663fd ("tcg/i386:
Check the size of instruction being translated", 2017-03-24). In addition
to being more precise (and generating the right exception, which is #GP
rather than #UD), it distinguishes better between page faults and too long
instructions, as shown by this test case:
int main()
{
char *x = mmap(NULL, 8192, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANON, -1, 0);
memset(x, 0x66, 4096);
x[4096] = 0x90;
x[4097] = 0xc3;
char *i = x + 4096 - 15;
mprotect(x + 4096, 4096, PROT_READ|PROT_WRITE);
((void(*)(void)) i) ();
}
... which produces a #GP without the mprotect, and a #PF with it.
Backports commit b066c5375737ad0d630196dab2a2b329515a1d00 from qemu
These take care of advancing s->pc, and will provide a unified point
where to check for the 15-byte instruction length limit.
Backports commit e3af7c788b73a6495eb9d94992ef11f6ad6f3c56 from qemu
The common situation of the SG instruction is that it is
executed from S&NSC memory by a CPU in NS state. That case
is handled by v7m_handle_execute_nsc(). However the instruction
also has defined behaviour in a couple of other cases:
* SG instruction in NS memory (behaves as a NOP)
* SG in S memory but CPU already secure (clears IT bits and
does nothing else)
* SG instruction in v8M without Security Extension (NOP)
These can be implemented in translate.c.
Backports commit 76eff04d166b8fe747adbe82de8b7e060e668ff9 from qemu
A few Thumb instructions are always unconditional even inside an
IT block (as opposed to being UNPREDICTABLE if used inside an
IT block): BKPT, the v8M SG instruction, and the A profile
HLT (debug halt) instruction.
This means we need to suppress the jump-over-instruction-on-condfail
code generation (though the IT state still advances as usual and
subsequent insns in the IT block may be conditional).
Backports commit dcf14dfb704519846f396a376339ebdb93eaf049 from qemu
Recent changes have left insn_crosses_page() more complicated
than it needed to be:
* it's only called from thumb_tr_translate_insn() so we know
for certain that we're looking at a Thumb insn
* the caller's check for dc->pc >= dc->next_page_start - 3
means that dc->pc can't possibly be 4 aligned, so there's
no need to check that (the check was partly there to ensure
that we didn't treat an ARM insn as Thumb, I think)
* we now have thumb_insn_is_16bit() which lets us do a precise
check of the length of the next insn, rather than opencoding
an inaccurate check
Simplify it down to just loading the first half of the insn
and calling thumb_insn_is_16bit() on it.
Backports commit 5b8d7289e9e92a0d7bcecb93cd189e245fef10cd from qemu
Refactor the Thumb decode to do the loads of the instruction words at
the top level rather than only loading the second half of a 32-bit
Thumb insn in the middle of the decode.
This is simple apart from the awkward case of Thumb1, where the
BL/BLX prefix and suffix instructions live in what in Thumb2 is the
32-bit insn space. To handle these we decode enough to identify
whether we're looking at a prefix/suffix that we handle as a 16 bit
insn, or a prefix that we're going to merge with the following suffix
to consider as a 32 bit insn. The translation of the 16 bit cases
then moves from disas_thumb2_insn() to disas_thumb_insn().
The refactoring has the benefit that we don't need to pass the
CPUARMState* down into the decoder code any more, but the major
reason for doing this is that some Thumb instructions must be always
unconditional regardless of the IT state bits, so we need to know the
whole insn before we emit the "skip this insn if the IT bits and cond
state tell us to" code. (The always unconditional insns are BKPT,
HLT and SG; the last of these is 32 bits.)
Backports commit 296e5a0a6c393553079a641c50521ae33ff89324 from qemu
The code which implements the Thumb1 split BL/BLX instructions
is guarded by a check on "not M or THUMB2". All we really need
to check here is "not THUMB2" (and we assume that elsewhere too,
eg in the ARCH(6T2) test that UNDEFs the Thumb2 insns).
This doesn't change behaviour because all M profile cores
have Thumb2 and so ARM_FEATURE_M implies ARM_FEATURE_THUMB2.
(v6M implements a very restricted subset of Thumb2, but we
can cross that bridge when we get to it with appropriate
feature bits.)
Backports commit 6b8acf256df09c8a8dd7dcaa79b06eaff4ad63f7 from qemu
Secure function return happens when a non-secure function has been
called using BLXNS and so has a particular magic LR value (either
0xfefffffe or 0xfeffffff). The function return via BX behaves
specially when the new PC value is this magic value, in the same
way that exception returns are handled.
Adjust our BX excret guards so that they recognize the function
return magic number as well, and perform the function-return
unstacking in do_v7m_exception_exit().
Backports commit d02a8698d7ae2bfed3b11fe5b064cb0aa406863b from qemu
Implement the SG instruction, which we emulate 'by hand' in the
exception handling code path.
Backports commit 333e10c51ef5876ced26f77b61b69ce0f83161a9 from qemu
Add the M profile secure MMU index values to the switch in
get_a32_user_mem_index() so that LDRT/STRT work correctly
rather than asserting at translate time.
Backports commit b9f587d62cebed427206539750ebf59bde4df422 from qemu
It is unlikely that we will ever want to call this helper passing
an argument other than the current PC. So just remove the argument,
and use the pc we already get from cpu_get_tb_cpu_state.
This change paves the way to having a common "tb_lookup" function.
Backports commit 7f11636dbee89b0e4d03e9e2b96e14649a7db778 from qemu
It looks like there was a transcription error when writing this code
initially. The code previously only decoded src or dst of rax. This
resolves
https://bugs.launchpad.net/qemu/+bug/1719984.
Backports commit e0dd5fd41a1a38766009f442967fab700d2d0550 from qemu
For the SG instruction and secure function return we are going
to want to do memory accesses using the MMU index of the CPU
in secure state, even though the CPU is currently in non-secure
state. Write arm_v7m_mmu_idx_for_secstate() to do this job,
and use it in cpu_mmu_index().
Backports commit b81ac0eb6315e602b18439961e0538538e4aed4f from qemu
In cpu_mmu_index() we try to do this:
if (env->v7m.secure) {
mmu_idx += ARMMMUIdx_MSUser;
}
but it will give the wrong answer, because ARMMMUIdx_MSUser
includes the 0x40 ARM_MMU_IDX_M field, and so does the
mmu_idx we're adding to, and we'll end up with 0x8n rather
than 0x4n. This error is then nullified by the call to
arm_to_core_mmu_idx() which masks out the high part, but
we're about to factor out the code that calculates the
ARMMMUIdx values so it can be used without passing it through
arm_to_core_mmu_idx(), so fix this bug first.
Backports commit fe768788d29597ee56fc11ba2279d502c2617457 from qemu
Implement the security attribute lookups for memory accesses
in the get_phys_addr() functions, causing these to generate
various kinds of SecureFault for bad accesses.
The major subtlety in this code relates to handling of the
case when the security attributes the SAU assigns to the
address don't match the current security state of the CPU.
In the ARM ARM pseudocode for validating instruction
accesses, the security attributes of the address determine
whether the Secure or NonSecure MPU state is used. At face
value, handling this would require us to encode the relevant
bits of state into mmu_idx for both S and NS at once, which
would result in our needing 16 mmu indexes. Fortunately we
don't actually need to do this because a mismatch between
address attributes and CPU state means either:
* some kind of fault (usually a SecureFault, but in theory
perhaps a UserFault for unaligned access to Device memory)
* execution of the SG instruction in NS state from a
Secure & NonSecure code region
The purpose of SG is simply to flip the CPU into Secure
state, so we can handle it by emulating execution of that
instruction directly in arm_v7m_cpu_do_interrupt(), which
means we can treat all the mismatch cases as "throw an
exception" and we don't need to encode the state of the
other MPU bank into our mmu_idx values.
This commit doesn't include the actual emulation of SG;
it also doesn't include implementation of the IDAU, which
is a per-board way to specify hard-coded memory attributes
for addresses, which override the CPU-internal SAU if they
specify a more secure setting than the SAU is programmed to.
Backports commit 35337cc391245f251bfb9134f181c33e6375d6c1 from qemu
Implement the register interface for the SAU: SAU_CTRL,
SAU_TYPE, SAU_RNR, SAU_RBAR and SAU_RLAR. None of the
actual behaviour is implemented here; registers just
read back as written.
When the CPU definition for Cortex-M33 is eventually
added, its initfn will set cpu->sau_sregion, in the same
way that we currently set cpu->pmsav7_dregion for the
M3 and M4.
Number of SAU regions is typically a configurable
CPU parameter, but this patch doesn't provide a
QEMU CPU property for it. We can easily add one when
we have a board that requires it.
Backports commit 9901c576f6c02d43206e5faaf6e362ab7ea83246 from qemu
Add support for v8M and in particular the security extension
to the exception entry code. This requires changes to:
* calculation of the exception-return magic LR value
* push the callee-saves registers in certain cases
* clear registers when taking non-secure exceptions to avoid
leaking information from the interrupted secure code
* switch to the correct security state on entry
* use the vector table for the security state we're targeting
Backports commit d3392718e1fcf0859fb7c0774a8e946bacb8419c from qemu
For v8M, exceptions from Secure to Non-Secure state will save
callee-saved registers to the exception frame as well as the
caller-saved registers. Add support for unstacking these
registers in exception exit when necessary.
Backports commit 907bedb3f3ce134c149599bd9cb61856d811b8ca from qemu
In v8M, more bits are defined in the exception-return magic
values; update the code that checks these so we accept
the v8M values when the CPU permits them.
Backports commit bfb2eb52788b9605ef2fc9bc72683d4299117fde from qemu
Add the new M profile Secure Fault Status Register
and Secure Fault Address Register.
Backports commit bed079da04dd9e0e249b9bc22bca8dce58b67f40 from qemu
In the v8M architecture, return from an exception to a PC which
has bit 0 set is not UNPREDICTABLE; it is defined that bit 0
is discarded [R_HRJH]. Restrict our complaint about this to v7M.
Backports commit 4e4259d3c574a8e89c3af27bcb84bc19a442efb1 from qemu
Attempting to do an exception return with an exception frame that
is not 8-aligned is UNPREDICTABLE in v8M; warn about this.
(It is not UNPREDICTABLE in v7M, and our implementation can
handle the merely-4-aligned case fine, so we don't need to
do anything except warn.)
Backports commit cb484f9a6e790205e69d9a444c3e353a3a1cfd84 from qemu
ARM v8M specifies that the INVPC usage fault for mismatched
xPSR exception field and handler mode bit should be checked
before updating the PSR and SP, so that the fault is taken
with the existing stack frame rather than by pushing a new one.
Perform this check in the right place for v8M.
Since v7M specifies in its pseudocode that this usage fault
check should happen later, we have to retain the original
code for that check rather than being able to merge the two.
(The distinction is architecturally visible but only in
very obscure corner cases like attempting an invalid exception
return with an exception frame in read only memory.)
Backports commit 224e0c300a0098fb577a03bd29d774d0769f632a from qemu
On exception return for v8M, the SPSEL bit in the EXC_RETURN magic
value should be restored to the SPSEL bit in the CONTROL register
banked specified by the EXC_RETURN.ES bit.
Add write_v7m_control_spsel_for_secstate() which behaves like
write_v7m_control_spsel() but allows the caller to specify which
CONTROL bank to use, reimplement write_v7m_control_spsel() in
terms of it, and use it in exception return.
Backports commit 3f0cddeee1f266d43c956581f3050058360a810d from qemu
Now that we can handle the CONTROL.SPSEL bit not necessarily being
in sync with the current stack pointer, we can restore the correct
security state on exception return. This happens before we start
to read registers off the stack frame, but after we have taken
possible usage faults for bad exception return magic values and
updated CONTROL.SPSEL.
Backports commit 3919e60b6efd9a86a0e6ba637aa584222855ac3a from qemu
In the v7M architecture, there is an invariant that if the CPU is
in Handler mode then the CONTROL.SPSEL bit cannot be nonzero.
This in turn means that the current stack pointer is always
indicated by CONTROL.SPSEL, even though Handler mode always uses
the Main stack pointer.
In v8M, this invariant is removed, and CONTROL.SPSEL may now
be nonzero in Handler mode (though Handler mode still always
uses the Main stack pointer). In preparation for this change,
change how we handle this bit: rename switch_v7m_sp() to
the now more accurate write_v7m_control_spsel(), and make it
check both the handler mode state and the SPSEL bit.
Note that this implicitly changes the point at which we switch
active SP on exception exit from before we pop the exception
frame to after it.
Backports commit de2db7ec894f11931932ca78cd14a8d2b1389d5b from qemu
Currently our M profile exception return code switches to the
target stack pointer relatively early in the process, before
it tries to pop the exception frame off the stack. This is
awkward for v8M for two reasons:
* in v8M the process vs main stack pointer is not selected
purely by the value of CONTROL.SPSEL, so updating SPSEL
and relying on that to switch to the right stack pointer
won't work
* the stack we should be reading the stack frame from and
the stack we will eventually switch to might not be the
same if the guest is doing strange things
Change our exception return code to use a 'frame pointer'
to read the exception frame rather than assuming that we
can switch the live stack pointer this early.
Backports commit 5b5223997c04b769bb362767cecb5f7ec382c5f0 from qemu
This properly forwards SMC events to EL2 when PSCI is provided by QEMU
itself and, thus, ARM_FEATURE_EL3 is off.
Found and tested with the Jailhouse hypervisor. Solution based on
suggestions by Peter Maydell.
Backports commit 77077a83006c3c9bdca496727f1735a3c5c5355d from qemu
In the A64 decoder, we have a lot of references to section numbers
from version A.a of the v8A ARM ARM (DDI0487). This version of the
document is now long obsolete (we are currently on revision B.a),
and various intervening versions renumbered all the sections.
The most recent B.a version of the document doesn't assign
section numbers at all to the individual instruction classes
in the way that the various A.x versions did. The simplest thing
to do is just to delete all the out of date C.x.x references.
Backports commit 4ce31af4aeb8471f6a913de7c59d3bde1fc4f03d from qemu
Now that we have a banked FAULTMASK register and banked exceptions,
we can implement the correct check in cpu_mmu_index() for whether
the MPU_CTRL.HFNMIENA bit's effect should apply. This bit causes
handlers which have requested a negative execution priority to run
with the MPU disabled. In v8M the test has to check this for the
current security state and so takes account of banking.
Backports relevant part of commit 5d4791991d4de12e83d44738417c9e964167b6e8 from qemu
In v8M the MSR and MRS instructions have extra register value
encodings to allow secure code to access the non-secure banked
version of various special registers.
(We don't implement the MSPLIM_NS or PSPLIM_NS aliases, because
we don't currently implement the stack limit registers at all.)
Backports commit 50f11062d4c896408731d6a286bcd116d1e08465 from qemu
Although none of the existing macro call-sites were broken,
it's always better to write macros that properly parenthesize
arguments that can be complex expressions, so that the intended
order of operations is not broken.
Backports commit 2a2be359c4335607c7f746cf27c412c08ab89aff from qemu
now cpu_mips_init() reimplements subset of cpu_generic_init()
tasks, so just drop it and use cpu_generic_init() directly.
Backports commit c4c8146cfd0fc3f95418fbc82a2eded594675022 from qemu
Register separate QOM types for each mips cpu model,
so it would be possible to reuse generic CPU creation
routines.
Backports commit 41da212c9ce9482fcfd490170c2611470254f8dc from qemu
This changes the order between cpu_mips_realize_env() and
cpu_exec_initfn(), but cpu_exec_initfn() don't have anything that
depends on cpu_mips_realize_env() being called first.
Backports commit df4dc10284e1d871db8adb512816a561473ffe3e from qemu
no logical change, only code movement (and fix a comment typo).
Backports commit 26aa3d9aecbb6fe9bce808a1d127191bdf3cc3d2 from qemu
Also backports commit 5502b66fc7d0bebd08b9b7017cb7e8b5261c3a2d
Starting with Windows Server 2012 and Windows 8, if
CPUID.40000005.EAX contains a value of -1, Windows assumes specific
limit to the number of VPs. In this case, Windows Server 2012
guest VMs may use more than 64 VPs, up to the maximum supported
number of processors applicable to the specific Windows
version being used.
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
For compatibility, Let's introduce a new property for X86CPU,
named "x-hv-max-vps" as Eduardo's suggestion, and set it
to 0x40 before machine 2.10.
(The "x-" prefix indicates that the property is not supposed to
be a stable user interface.)
Backports relevant parts of commit 6c69dfb67e84747cf071958594d939e845dfcc0c from qemu
The SSE4.1 phminposuw instruction finds the minimum 16-bit element in
the source vector, putting the value of that element in the low 16
bits of the destination vector, the index of that element in the next
three bits and zeroing the rest of the destination. The helper for
this operation fills the destination from high to low, meaning that
when the source and destination are the same register, the minimum
source element can be overwritten before it is copied to the
destination. This patch fixes it to fill the destination from low to
high instead, so the minimum source element is always copied first.
This fixes one gcc test failure in my GCC 6-based testing (and so
concludes the present sequence of patches, as I don't have any further
gcc test failures left in that testing that I attribute to QEMU bugs).
Backports commit aa406feadfc5b095ca147ec56d6187c64be015a7 from qemu
One of the cases of the SSE4.2 pcmpestri / pcmpestrm / pcmpistri /
pcmpistrm instructions does a substring search. The implementation of
this case in the pcmpxstrx helper is incorrect. The operation in this
case is a search for a string (argument d to the helper) in another
string (argument s to the helper); if a copy of d at a particular
position would run off the end of s, the resulting output bit should
be 0 whether or not the strings match in the region where they
overlap, but the QEMU implementation was wrongly comparing only up to
the point where s ends and counting it as a match if an initial
segment of d matched a terminal segment of s. Here, "run off the end
of s" means that some byte of d would overlap some byte outside of s;
thus, if d has zero length, it is considered to match everywhere,
including after the end of s. This patch fixes the implementation to
correspond with the proper instruction semantics. This fixes four gcc
test failures in my GCC 6-based testing.
Backports commit ae35eea7e4a9f21dd147406dfbcd0c4c6aaf2a60 from qemu
The SSE4.1 packusdw instruction combines source and destination
vectors of signed 32-bit integers into a single vector of unsigned
16-bit integers, with unsigned saturation. When the source and
destination are the same register, this means each 32-bit element of
that register is used twice as an input, to produce two of the 16-bit
output elements, and so if the operation is carried out
element-by-element in-place, no matter what the order in which it is
applied to the elements, the first element's operation will overwrite
some future input. The helper for packssdw avoids this issue by
computing the result in a local temporary and copying it to the
destination at the end; this patch fixes the packusdw helper to do
likewise. This fixes three gcc test failures in my GCC 6-based
testing.
Backports commit 80e19606215d4df370dfe8fe21c558a129f00f0b from qemu
It turns out that my recent fix to set rip_offset when emulating some
SSE4.1 instructions needs generalizing to cover a wider class of
instructions. Specifically, every instruction in the sse_op_table7
table, coming from various instruction set extensions, has an 8-bit
immediate operand that comes after any memory operand, and so needs
rip_offset set for correctness if there is a memory operand that is
rip-relative, and my patch only set it for a subset of those
instructions. This patch moves the rip_offset setting to cover the
wider class of instructions, so fixing 9 further gcc testsuite
failures in my GCC 6-based testing. (I do not know whether there
might be still further classes of instructions missing this setting.)
Backports commit c6a8242915328cda0df0fbc0803da3448137e614 from qemu
The SSE4.1 pmovsx* and pmovzx* instructions take packed 1-byte, 2-byte
or 4-byte inputs and sign-extend or zero-extend them to a wider vector
output. The associated helpers for these instructions do the
extension on each element in turn, starting with the lowest. If the
input and output are the same register, this means that all the input
elements after the first have been overwritten before they are read.
This patch makes the helpers extend starting with the highest element,
not the lowest, to avoid such overwriting. This fixes many GCC test
failures (161 in the gcc testsuite in my GCC 6-based testing) when
testing with a default CPU setting enabling those instructions.
Backports commit c6a56c8e990b213a1638af2d34352771d5fa4d9c from qemu
Instead of copying addr to a local temp, reuse the value (which we
have just compared as equal) already saved in cpu_exclusive_addr.
Backports commit 37e29a64254bf82a1901784fcca17c25f8164c2f from qemu
Previously when single stepping through ERET instruction via GDB
would result in debugger entering the "next" PC after ERET instruction.
When debugging in kernel mode, this will also cause unintended behavior,
because debugger will try to access memory from EL0 point of view.
Backports commit dddbba9943ef6a81c8702e4a50cb0a8b1a4201fe from qemu
In the v7M and v8M ARM ARM, the magic exception return values are
referred to as EXC_RETURN values, and in QEMU we use V7M_EXCRET_*
constants to define bits within them. Rename the 'type' variable
which holds the exception return value in do_v7m_exception_exit()
to excret, making it clearer that it does hold an EXC_RETURN value.
Backports commit 351e527a613147aa2a2e6910f92923deef27ee48 from qemu
The exception-return magic values get some new bits in v8M, which
makes some bit definitions for them worthwhile.
We don't use the bit definitions for the switch on the low bits
which checks the return type for v7M, because this is defined
in the v7M ARM ARM as a set of valid values rather than via
per-bit checks.
Backports commit 4d1e7a4745c050f7ccac49a1c01437526b5130b5 from qemu
In do_v7m_exception_exit(), there's no need to force the high 4
bits of 'type' to 1 when calling v7m_exception_taken(), because
we know that they're always 1 or we could not have got to this
"handle return to magic exception return address" code. Remove
the unnecessary ORs.
Backports commit 7115cdf5782922611bcc44c89eec5990db7f6466 from qemu