Commit graph

91 commits

Author SHA1 Message Date
Emilio G. Cota ec14a00925
target-arm: emulate LL/SC using cmpxchg helpers
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.

The appended emulates LL/SC pairs in ARM with cmpxchg helpers.
This works in both user and system mode. In usermode, it avoids
pausing all other CPUs to perform the LL/SC pair. The subsequent
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.

Hi-res plots: http://imgur.com/a/aNQpB

atomic_add-bench: 1000000 ops/thread, [0,1] range

9 ++---------+----------+----------+----------+----------+----------+---++
+cmpxchg +-E--+ + + + + + |
8 +Emaster +-H--+ ++
| | |
7 ++E ++
| | |
6 ++++ ++
| | |
5 ++ | ++
4 ++ | ++
| | |
3 ++ | ++
| | |
2 ++ | ++
|H++E+--- +++ ---+E+------+E+------+E|
1 +++ +E+-----+E+------+E+------+E+------+E+-- +++ +++ ++
++H+ + +++ + +++ ++++ + + + |
0 ++--H----H-+-----H----+----------+----------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads

atomic_add-bench: 1000000 ops/thread, [0,2] range

16 ++---------+----------+---------+----------+----------+----------+---++
+cmpxchg +-E--+ + + + + + |
14 ++master +-H--+ ++
| | |
12 ++| ++
| E |
10 ++| ++
| | |
8 ++++ ++
|E+| |
| | |
6 ++ | ++
| | |
4 ++ | ++
| +E+--- +++ +++ +++ ---+E+------+E|
2 +H+ +E+------E-------+E+-----+E+------+E+------+E+-- +++
+ | + +++ + ++++ + + + |
0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads

atomic_add-bench: 1000000 ops/thread, [0,128] range

70 ++---------+----------+---------+----------+----------+----------+---++
+cmpxchg +-E--+ + + + ++++ + |
60 ++master +-H--+ ----E------+E+-------++
| -+E+--- +++ +++ +E|
| +++ ---- +++ ++|
50 ++ +++ ---+E+- ++
| -E--- |
40 ++ ---+++ ++
| +++--- |
| -+E+ |
30 ++ +++---- ++
| +E+ |
20 ++ +++-- ++
| +E+ |
|+E+ |
10 +E+ ++
+ + + + + + + |
0 +HH-H----H-+-----H----+---------+----------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads

atomic_add-bench: 1000000 ops/thread, [0,1024] range

120 ++---------+---------+----------+---------+----------+----------+---++
+cmpxchg +-E--+ + + + + + |
| master +-H--+ ++|
100 ++ ----E+
| +++ ---+E+--- ++|
| --E--- +++ |
80 ++ ---- +++ ++
| ---+E+- |
60 ++ -+E+-- ++
| +++ ---- +++ |
| -+E+- |
40 ++ +++---- ++
| +++ ---+E+ |
| -+E+--- |
20 ++ +E+ ++
|+E+++ |
+E+ + + + + + + |
0 +HH-H---H--+-----H---+----------+---------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads

Backports commit 354161b37c6465a32073eac5f16fa35939af2bb4 from qemu
2018-02-28 00:07:44 -05:00
Richard Henderson fd9933fbd5
target-arm: Rearrange aa32 load and store functions
Stop specializing on TARGET_LONG_BITS == 32; unconditionally allocate
a temp and expand with tcg_gen_extu_i32_tl. Split out gen_aa32_addr,
gen_aa32_frob64, gen_aa32_ld_i32 and gen_aa32_st_i32 as separate interfaces.

Backports commit 7f5616f53896a4e08ad37de3ac50d3a4cc8eff7a from qemu
2018-02-27 23:59:16 -05:00
Peter Maydell 1a850bcb19
target-arm: Implement new HLT trap for semihosting
Version 2.0 of the semihosting specification introduces new trap
instructions for AArch32: HLT 0xF000 for A32 and HLT 0x3C for T32.
Implement these (in the same way we implement the existing HLT
semihosting trap for A64).

The old traps via SVC and BKPT are unaffected.

Backports commit 19a6e31c9d2701ef648b70ddcfc3bf64cec8c37e from qemu
2018-02-26 15:28:45 -05:00
Peter Maydell f2dcb81b27
Fix masking of PC lower bits when doing exception returns
In commit 9b6a3ea7a699594 store_reg() was changed to mask
both bits 0 and 1 of the new PC value when in ARM mode.
Unfortunately this broke the exception return code paths
when doing a return from ARM mode to Thumb mode: in some
of these we write a new CPSR including new Thumb mode
bit via gen_helper_cpsr_write_eret(), and then use store_reg()
to write the new PC. In this case if the new CPSR specified
Thumb mode then masking bit 1 of the PC is incorrect
(these code paths correspond to the v8 ARM ARM pseudocode
function AArch32.ExceptionReturn(), which always aligns the
new PC appropriately for the new instruction set state).

Instead of using store_reg() in exception-return code paths,
call a new store_pc_exc_ret() which stores the raw new PC
value to env->regs[15], and then mask it appropriately in
the subsequent helper_cpsr_write_eret() where the new
env->thumb state is available.

This fixes a bug introduced by 9b6a3ea7a699594 which caused
crashes/hangs or otherwise bad behaviour for Linux when
userspace was using Thumb.

Backports commit fb0e8e79a9d77ee240dbca036fa8698ce654e5d1 from qemu
2018-02-26 08:09:28 -05:00
Peter Maydell f48d1fe391
target-arm: Correctly handle 'sub pc, pc, 1' for ARMv6
In the ARM v6 architecture, 'sub pc, pc, 1' is not an interworking
branch, so the computed new value is written to r15 as a normal
value. The architecture says that in this case, bits [1:0] of
the value written must be ignored if we are in ARM mode (or
bit [0] ignored if in Thumb mode); this is a change from the
ARMv4/v5 specification that behaviour is UNPREDICTABLE.
Use the correct mask on the PC value when doing a non-interworking
store to PC.

A popular library used on RaspberryPi uses this instruction
as part of a trick to determine whether it is running on
ARMv6 or ARMv7, and we were mishandling the sequence.

Fixes bug: https://bugs.launchpad.net/bugs/1625295

Backports commit 9b6a3ea7a699594162ed3d11e4e04b98568dc5c0 from qemu
2018-02-26 05:02:32 -05:00
Pranith Kumar 7849f8d72a
target-arm: Generate fences in ARMv7 frontend
Backports commit 61e4c432ab26526bab0f3ef746c1861415b6da29 from qemu
2018-02-26 03:22:53 -05:00
Richard Henderson 1547048a22
tcg: Reorg TCGOp chaining
Instead of using -1 as end of chain, use 0, and link through the 0
entry as a fully circular double-linked list.

Backports commit dcb8e75870e2de199db853697f8839cb603beefe from qemu
2018-02-25 21:44:50 -05:00
Lluís Vilanova 2297527755
exec: [tcg] Track which vCPU is performing translation and execution
Information is tracked inside the TCGContext structure, and later used
by tracing events with the 'tcg' and 'vcpu' properties.

The 'cpu' field is used to check tracing of translation-time
events ("*_trans"). The 'tcg_env' field is used to pass it to
execution-time events ("*_exec").

Backports commit 7c2550432abe62f53e6df878ceba6ceaf71f0e7e from qemu
2018-02-24 19:21:39 -05:00
Peter Maydell 9bdf310d49
target-arm: Don't permit ARMv8-only Neon insns on ARMv7
The Neon instructions VCVTA, VCVTM, VCVTN, VCVTP, VRINTA, VRINTM,
VRINTN, VRINTP, VRINTX, and VRINTZ were only introduced with ARMv8,
so they need a guard to make them UNDEF if the CPU only supports ARMv7.
(We got this right for all the other new-in-v8 insns, but forgot
it for these Neon 2-reg-misc ops.)

Backports commit fe8fcf3d642b4de1369841bf6acac13e0ec8770d from qemu
2018-02-24 18:20:00 -05:00
Edgar E. Iglesias 8aee797956
target-arm: A64: Create Instruction Syndromes for Data Aborts
Add support for generating the ISS (Instruction Specific Syndrome) for
Data Abort exceptions taken from AArch64.
These syndromes are used by hypervisors for example to trap and emulate
memory accesses.

We save the decoded data out-of-band with the TBs at translation time.
When exceptions hit, the extra data attached to the TB is used to
recreate the state needed to encode instruction syndromes.
This avoids the need to emit moves with every load/store.

Based on a suggestion from Peter Maydell.

Backports commit aaa1f954d4cab243e3d5337a72bc6d104e1c4808 from qemu
2018-02-24 16:46:44 -05:00
Paolo Bonzini 9485b7c2e1
cpu: move exec-all.h inclusion out of cpu.h
exec-all.h contains TCG-specific definitions. It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.

One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.

Backports commit 63c915526d6a54a95919ebece83fa9ca631b2508 from qemu
2018-02-24 02:39:08 -05:00
Sergey Fedorov ffdc9d6323
tcg: Allow goto_tb to any target PC in user mode
In user mode, there's only a static address translation, TBs are always
invalidated properly and direct jumps are reset when mapping change.
Thus the destination address is always valid for direct jumps and
there's no need to restrict it to the pages the TB resides in.

Backports commit 90aa39a1cc4837360889f0e033ca25cc82100308 from qemu
2018-02-23 23:12:14 -05:00
Sergey Fedorov 73c59faad5
tcg: Clean up direct block chaining safety checks
We don't take care of direct jumps when address mapping changes. Thus we
must be sure to generate direct jumps so that they always keep valid
even if address mapping changes. Luckily, we can only allow to execute a
TB if it was generated from the pages which match with current mapping.

Document tcg_gen_goto_tb() declaration and note the reason for
destination PC limitations.

Some targets with variable length instructions allow TB to straddle a
page boundary. However, we make sure that both of TB pages match the
current address mapping when looking up TBs. So it is safe to do direct
jumps into the both pages. Correct the checks for some of those targets.

Given that, we can safely patch a TB which spans two pages. Remove the
unnecessary check in cpu_exec() and allow such TBs to be patched.

Backports commit 5b053a4a28278bca606eeff7d1c0730df1b047e9 from qemu
2018-02-23 22:26:00 -05:00
Peter Maydell 8309945dcc
target-arm: Implement MRS (banked) and MSR (banked) instructions
Starting with the ARMv7 Virtualization Extensions, the A32 and T32
instruction sets provide instructions "MSR (banked)" and "MRS
(banked)" which can be used to access registers for a mode other
than the current one:
* R<m>_<mode>
* ELR_hyp
* SPSR_<mode>

Implement the missing instructions.

Backports commit 8bfd0550be821cf27d71444e2af350de3c3d2ee3 from qemu
2018-02-21 21:50:42 -05:00
Ralf-Philipp Weinmann 893b9f7f96
target-arm: Only trap SRS from S-EL1 if specified mode is MON
Commit cbc0326b6fb9 caused SRS instructions executed from Secure
EL1 to trap to EL3 even if the specified mode was not monitor mode.

According to the ARMv8 Architecture reference manual [F6.1.203], ALL
of the following conditions need to be met for SRS to trap to EL3:
* It is executed at Secure PL1.
* The specified mode is monitor mode.
* EL3 is using AArch64.

Correct the condition governing the trap to EL3 to check the
specified mode.

Backports commit ba63cf47a93041137a94e86b7d0cd87fc896949b from qemu
2018-02-21 02:49:28 -05:00
Paolo Bonzini 7f23f7004d
target-arm: implement BE32 mode in system emulation
System emulation only has a little-endian target; BE32 mode
is implemented by adjusting the low bits of the address
for every byte and halfword load and store. 64-bit accesses
flip the low and high words.

Backports commit e334bd3190f6c4ca12f1d40d316dc471c70009ab from qemu
2018-02-21 02:47:22 -05:00
Paolo Bonzini aa5be4d6ca
target-arm: implement setend
Since this is not a high-performance path, just use a helper to
flip the E bit and force a lookup in the hash table since the
flags have changed.

Backports commit 9886ecdf31165de2d4b8bccc1a220bd6ac8bc192 from qemu
2018-02-21 02:39:13 -05:00
Peter Crosthwaite 902170741a
target-arm: introduce tbflag for endianness
Introduce a tbflags for endianness, set based upon the CPUs current
endianness. This in turn propagates through to the disas endianness
flag.

Backports commit 91cca2cda9823b1e7a049cb308a05104b5076cba from qemu
2018-02-21 02:35:34 -05:00
Paolo Bonzini 9ab3d105fd
target-arm: introduce disas flag for endianness
Introduce a disas flag for setting the CPU data endianness. This allows
control of the endianness from the CPU state rather than hard-coding it
to TARGET_WORDS_BIGENDIAN.

Backports commit dacf0a2ff7d39ab12bd90f2f5496a3889facd54a from qemu
2018-02-21 02:20:50 -05:00
Paolo Bonzini ec15ee10d0
target-arm: implement SCTLR.B, drop bswap_code
bswap_code is a CPU property of sorts ("is the iside endianness the
opposite way round to TARGET_WORDS_BIGENDIAN?") but it is not the
actual CPU state involved here which is SCTLR.B (set for BE32
binaries, clear for BE8).

Replace bswap_code with SCTLR.B, and pass that to arm_ld*_code.
The next patches will make data fetches honor both SCTLR.B and
CPSR.E appropriately.

Backports commit f9fd40ebe4f55e0048e002925b8d65e66d56e7a7 from qemu
2018-02-21 02:08:05 -05:00
Peter Maydell 6ae2357be6
target-arm: Give CPSR setting on 32-bit exception return its own helper
The rules for setting the CPSR on a 32-bit exception return are
subtly different from those for setting the CPSR via an instruction
like MSR or CPS. (In particular, in Hyp mode changing the mode bits
is not valid via MSR or CPS.) Split the exception-return case into
its own helper for setting CPSR, so we can eventually handle them
differently in the helper function.

Backports commit 235ea1f5c89abf30e452539b973b0dbe43d3fe2b from qemu
2018-02-20 22:08:35 -05:00
Peter Maydell 57a9474cc7
target-arm: UNDEF in the UNPREDICTABLE SRS-from-System case
Make get_r13_banked() raise an exception at runtime for the
corner case of SRS from System mode, so that we can UNDEF it;
this brings us in to line with the ARM ARM's set of permitted
CONSTRAINED UNPREDICTABLE choices.

Backports commit f01377f591fe15c652f947646c4a69a7d4a71ad9 from qemu
2018-02-20 15:12:25 -05:00
Peter Maydell a6aac0dbb4
target-arm: Clean up trap/undef handling of SRS
The SRS instruction is:
* UNDEFINED in Hyp mode
* UNPREDICTABLE in User or System mode
* UNPREDICTABLE if the specified mode isn't accessible
* trapped to EL3 if EL3 is AArch64 and we are at Secure EL1

Clean up the code to handle all these cases cleanly, including
picking UNDEF as our choice of UNPREDICTABLE behaviour rather
blindly trusting the mode field passed in the instruction.
As part of this, move the check for IS_USER into gen_srs()
itself rather than having it done by the caller.

The exception is that we don't UNDEF for calls from System
mode, which need a runtime check. This will be dealt with in
the following commits.

Backports commit cbc0326b6fb905f80b7cef85b24571f7ebb62077 from qemu
2018-02-20 15:02:45 -05:00
Peter Maydell 3d5b54cf4b
target-arm: Fix IL bit reported for Thumb VFP and Neon traps
All Thumb Neon and VFP instructions are 32 bits, so the IL
bit in the syndrome register should be set. Pass false to the
syn_* function's is_16bit argument rather than s->thumb
so we report the correct IL bit.

Backports commit 7d197d2db5e99e4c8b20f6771ddc7303acaa1c89 from qemu
2018-02-20 11:39:39 -05:00
Peter Maydell 5b8ad0e2fc
target-arm: Fix IL bit reported for Thumb coprocessor traps
All Thumb coprocessor instructions are 32 bits, so the IL
bit in the syndrome register should be set. Pass false to the
syn_* function's is_16bit argument rather than s->thumb
so we report the correct IL bit.

Backports commit 4df322593037d2700f72dfdfb967300b7ad2e696 from qemu
2018-02-20 11:38:27 -05:00
Peter Maydell 6dbc781ce3
target-arm: Add isread parameter to CPAccessFns
System registers might have access requirements which need to
be described via a CPAccessFn and which differ for reads and
writes. For this to be possible we need to pass the access
function a parameter to tell it whether the access being checked
is a read or a write.

Backports commit 3f208fd76bcc91a8506681bb8472f2398fe6f487 from qemu
2018-02-20 11:24:17 -05:00
Richard Henderson c507f16702
tcg: Remove lingering references to gen_opc_buf
Three in comments and one in code in the stub tcg_liveness_analysis.

Backports commit 201577059331b8b3aef221ee2ed594deb99d6631 from qemu
2018-02-19 01:42:55 -05:00
Peter Maydell cd5c4037ac
target-arm: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Backports commit 74c21bd07491739c6e56bcb1f962e4df730e77f3 from qemu
2018-02-17 21:09:32 -05:00
Sergey Fedorov 587c9f0570
target-arm: Fix and improve AA32 singlestep translation completion code 2018-02-17 19:34:52 -05:00
Andrew Baumann e1701b069f
target-arm: raise exception on misaligned LDREX operands
Qemu does not generally perform alignment checks. However, the ARM ARM
requires implementation of alignment exceptions for a number of cases
including LDREX, and Windows-on-ARM relies on this.

This change adds plumbing to enable alignment checks on loads using
MO_ALIGN, a do_unaligned_access hook to raise the exception (data
abort), and uses the new aligned loads in LDREX (for all but
single-byte loads).

Backports commit 30901475b91ef1f46304404ab4bfe89097f61b96 from qemu
2018-02-17 19:29:29 -05:00
Sergey Fedorov 5df03a909c
target-arm: Update condexec before arch BP check in AA32 translation
Architectural breakpoint check could raise an exceptions, thus condexec
bits should be updated before calling gen_helper_check_breakpoints().

Backports commit ce8a1b5449cd8c4c2831abb581d3208c3a3745a0 from qemu
2018-02-17 18:35:50 -05:00
Sergey Fedorov b42bfc59f1
target-arm: Update condexec before CP access check in AA32 translation
Coprocessor access instructions are allowed inside IT block.
gen_helper_access_check_cp_reg() can raise an exceptions thus condexec
bits should be updated before.

Backports commit 43bfa4a100687af8d293fef0a197839b51400fca from qemu
2018-02-17 18:33:58 -05:00
Sergey Fedorov 23ece1622c
target-arm: Clean up DISAS_UPDATE usage in AArch32 translation code
AArch32 translation code does not distinguish between DISAS_UPDATE and
DISAS_JUMP. Thus, we cannot use any of them without first updating PC in
CPU state. Furthermore, it is too complicated to update PC in CPU state
before PC gets updated in disas context. So it is hardly possible to
correctly end TB early if is is not likely to be executed before calling
disas_*_insn(), e.g. just after calling breakpoint check helper.

Modify DISAS_UPDATE and DISAS_JUMP usage in AArch32 translation and
apply to them the same semantic as AArch64 translation does:
- DISAS_UPDATE: update PC in CPU state when finishing translation
- DISAS_JUMP: preserve current PC value in CPU state when finishing
translation

This patch fixes a bug in AArch32 breakpoint handling: when
check_breakpoints helper does not generate an exception, ending the TB
early with DISAS_UPDATE couldn't update PC in CPU state and execution
hangs.

Backports commit 577bf808958d06497928c639efaa473bf8c5e099 from qemu
2018-02-17 17:43:21 -05:00
Peter Maydell 0ed5787f89
target-arm: Report S/NS status in the CPU debug logs
If this CPU supports EL3, enhance the printing of the current
CPU mode in debug logging to distinguish S from NS modes as
appropriate.

Backports commit 06e5cf7acd1f94ab7c1cd6945974a1f039672940 from qemu
2018-02-17 15:24:14 -05:00
Richard Henderson c01a6dab0a
target-*: Advance pc after recognizing a breakpoint
Some targets already had this within their logic, but make sure
it's present for all targets.

Backports commit 522a0d4e3c0d397ffb45ec400d8cbd426dad9d17 from qemu
2018-02-17 15:24:11 -05:00
Richard Henderson 3ec0adcc07
target-*: Introduce and use cpu_breakpoint_test
Reduce the boilerplate required for each target. At the same time,
move the test for breakpoint after calling tcg_gen_insn_start.

Note that arm and aarch64 do not use cpu_breakpoint_test, but still
move the inline test down after tcg_gen_insn_start.

Backports commit b933066ae03d924a92b2616b4a24e7d91cd5b841 from qemu
2018-02-17 15:24:10 -05:00
Peter Maydell 93386e2dd4
target-arm/translate.c: Handle non-executable page-straddling Thumb insns
When the memory we're trying to translate code from is not executable we have
to turn this into a guest fault. In order to report the correct PC for this
fault, and to make sure it is not reported until after any other possible
faults for instructions earlier in execution, we must terminate TBs at
the end of a page, in case the next instruction is in a non-executable page.
This is simple for T16, A32 and A64 instructions, which are always aligned
to their size. However T32 instructions may be 32-bits but only 16-aligned,
so they can straddle a page boundary.

Correct the condition that checks whether the next instruction will touch
the following page, to ensure that if we're 2 bytes before the boundary
and this insn is T32 then we end the TB.

Backports commit 541ebcd401ee47f3c1a3ce503ef5466b75e9d20a from qemu
2018-02-17 15:24:07 -05:00
Sergey Fedorov e4e0c75f0f
target-arm: Fix CPU breakpoint handling
A QEMU breakpoint match is not definitely an architectural breakpoint
match. If an exception is generated unconditionally during translation,
it is hardly possible to ignore it in the debug exception handler.

Generate a call to a helper to check CPU breakpoints and raise an
exception only if any breakpoint matches architecturally.

Backports commit 5d98bf8f38c17a348ab6e8af196088cd4953acd0 from qemu
2018-02-17 15:24:02 -05:00
Sergey Sorokin 04992f0fb3
target-arm: Break the TB after ISB to execute self-modified code correctly
If any store instruction writes the code inside the same TB
after this store insn, the execution of the TB must be stopped
to execute new code correctly.
As described in ARMv8 manual D3.4.6 self-modifying code must do an
IC invalidation to be valid, and an ISB after it. So it's enough to end
the TB after ISB instruction on the code translation.
Also this TB break is necessary to take any pending interrupts immediately
after an ISB (as required by ARMv8 ARM D1.14.4).

Backports commit 6df99dec9e81838423d723996e96236693fa31fe from qemu
2018-02-17 15:24:01 -05:00
Richard Henderson a5ac288135
tcg: Remove gen_intermediate_code_pc
It is no longer used, so tidy up everything reached by it.
This includes the gen_opc_* arrays, the search_pc parameter
and the inline gen_intermediate_code_internal functions.

Backports commit 4e5e1215156662b2b153255c49d4640d82c5568b from qemu
2018-02-17 15:23:59 -05:00
Richard Henderson 1cbd175736
tcg: Pass data argument to restore_state_to_opc
The gen_opc_* arrays are already redundant with the data stored in
the insn_start arguments. Transition restore_state_to_opc to use
data from the latter.

Backports commit bad729e272387de7dbfa3ec4319036552fc6c107 from qemu
2018-02-17 15:23:58 -05:00
Lioncash b115c5509d
tcg: Add TCG_MAX_INSNS
Adjust all translators to respect it.

Backports commit 190ce7fbc79fd0883a6170d7f30da59d366e6830 from qemu
2018-02-17 15:23:58 -05:00
Sergey Sorokin a883d349fe
target-arm: Fix default_exception_el() function for the case when EL3 is not supported
If EL3 is not supported in current configuration,
we should not try to get EL3 bitness.

Backports commit cef9ee706792b1e205fe472b67053a0e82cd058e from qemu
2018-02-17 15:23:36 -05:00
Peter Maydell 5c7389680e
target-arm: Implement YIELD insn to yield in ARM and Thumb translators
Implement the YIELD instruction in the ARM and Thumb translators to
actually yield control back to the top level loop rather than being
a simple no-op. (We already do this for A64.)

Backports commit c87e5a61c2b3024116f52f7e68273f864ff7ab82 from qemu
2018-02-17 15:23:14 -05:00
Peter Maydell 13db196792
target-arm: Correct "preferred return address" for cpreg access exceptions
The architecture defines that when taking an exception trying to
access a coprocessor register, the "preferred return address" for
the exception is the address of the instruction that caused the
exception. Correct an off-by-4 error which meant we were returning
the address after the instruction for traps which happened because
of a failure of a runtime access-check function on an AArch32
register. (Traps caused by translate-time checkable permissions
failures had the correct address, as did traps on AArch64 registers.)

This fixes https://bugs.launchpad.net/qemu/+bug/1463338

Backports commit 3977ee5d7a9f2e3664dd8b233f3224694e23b62b from qemu
2018-02-17 15:22:42 -05:00
Peter Maydell 6d7370457f
target-arm: Don't halt on WFI unless we don't have any work
Just NOP the WFI instruction if we have work to do.
This doesn't make much difference currently (though it does avoid
jumping out to the top level loop and immediately restarting),
but the distinction between "halt" and "don't halt" will become
more important when the decision to halt requires us to trap
to a higher exception level instead.

Backport commit 84549b6dcf9147559ec08b066de673587be6b763 from qemu
2018-02-12 23:10:45 -05:00
Greg Bellows 3c87e50745
target-arm: Extend FP checks to use an EL
Extend the ARM disassemble context to take a target exception EL instead of a
boolean enable. This change reverses the polarity of the check making a value
of 0 indicate floating point enabled (no exception).

Backports commit 9dbbc748d671c70599101836cd1c2719d92f3017 from qemu
2018-02-12 23:04:19 -05:00
Greg Bellows edd8066082
target-arm: Add exception target el infrastructure
Add a CPU state exception target EL field that will be used for communicating
the EL to which an exception should be routed.

Add a disassembly context field for tracking the EL3 architecture needed for
determining the target exception EL.

Add a target EL argument to the generic exception helper for callers to specify
the EL to which the exception should be routed. Extended the helper to set
the newly added CPU state exception target el.

Added a function for setting the target exception EL and updated calls to helpers
to call it.

Backports commit 737103619869600668cc7e8700e4f6eab3943896 from qemu
2018-02-12 22:17:02 -05:00
Peter Maydell db9901f2ad
target-arm: Avoid buffer overrun on UNPREDICTABLE ldrd/strd
A LDRD or STRD where rd is not an even number is UNPREDICTABLE.
We were letting this fall through, which is OK unless rd is 15,
in which case we would attempt to do a load_reg or store_reg
to a nonexistent r16 for the second half of the double-word.
Catch the odd-numbered-rd cases and UNDEF them instead.

To do this we rearrange the structure of the code a little
so we can put the UNDEF catches at the top before we've
allocated TCG temporaries.

Backports commit a4bb522ee51087af61998f290d12ba2e14c7910e from qemu
2018-02-12 17:17:23 -05:00
Peter Maydell 3497be0faa
target-arm: Fix handling of STM (user) with r15 in register list
The A32 encoding of LDM distinguishes LDM (user) from LDM (exception
return) based on whether r15 is in the register list. However for
STM (user) there is no equivalent distinction. We were incorrectly
treating "r15 in list" as indicating exception return for both LDM
and STM, with the result that an STM (user) involving r15 went into
an infinite loop. Fix this; note that the value stored for r15
in this case is the current PC regardless of our current mode.

Backports commit da3e53ddcb0ca924da97ca5a35605fc554aa3e05 from qemu
2018-02-12 16:16:05 -05:00