Move cpu_sync_bndcs_hflags() function from mpx_helper.c
to helper.c because mpx_helper.c need be disabled when
tcg is disabled.
Backports commit ab0a19d4f08d924e052eb369420d264240872f8a from qemu
Add CONFIG_TCG around TLB-related functions and structure declarations.
Some of these functions are defined in ./accel/tcg/cputlb.c, which will
not be linked in if TCG is disabled, and have no stubs; therefore, their
callers will also be compiled out for --disable-tcg.
Backports commit b11ec7f2e44b285a3967d629b55d1a6970b06787 from qemu
This lets you build without TCG (hardware accelerationor qtest only). When
this flag is passed to configure, it will automatically filter out the target
list to only those that support KVM or Xen or HAX.
Backports commit b3f6ea7e55e8228d6f84d5cee7cb11cae917ba95 from qemu
translate-all.c will be disabled if tcg is disabled in the build,
so page_size_init() function and related variables will be moved
to exec.c file.
Backports commit a0be0c585f5dcc4d50a37f6a20d3d625c5ef3a2c from qemu
Commit 1f5c00cfdb8114c ("qom/cpu: move tlb_flush to cpu_common_reset")
moved the call to tlb_flush() from the target-specific reset handlers
into the common code qom/cpu.c file, and protected the call with
"#ifdef CONFIG_SOFTMMU" to avoid that it is called for linux-user
only targets. But since qom/cpu.c is common code, CONFIG_SOFTMMU is
*never* defined here, so the tlb_flush() was simply never executed
anymore. Fix it by introducing a wrapper for tlb_flush() in a file
that is re-compiled for each target, i.e. in translate-all.c.
Backports commit 2cd53943115be5118b5b2d4b80ee0a39c94c4f73 from qemu
Move the handling of conforming code segments before the handling
of stack switch.
Because dpl == cpl after the new "if", it's now unnecessary to check
the C bit when testing dpl < cpl. Furthermore, dpl > cpl is checked
slightly above the modified code, so the final "else" is unreachable
and we can remove it.
Backports commit 1110bfe6f5600017258fa6578f9c17ec25b32277 from qemu
In do_interrupt64(), when interrupt stack table(ist) is enabled
and the the target code segment is conforming(e2 & DESC_C_MASK), the
old implementation always set new CPL to 0, and SS.RPL to 0.
This is incorrect for when CPL3 code access a CPL0 conforming code
segment, the CPL should remain unchanged. Otherwise higher privileged
code can be compromised.
The patch fix this for always set dpl = cpl when the target code segment
is conforming, and modify the last parameter `flags`, which contains
correct new CPL, in cpu_x86_load_seg_cache().
Backports commit e95e9b88ba5f4a6c17f4d0c3a3a6bf3f648bb328 from qemu
Some code paths can lead to atomic accesses racing with memset()
on cpu->tb_jmp_cache, which can result in torn reads/writes
and is undefined behaviour in C11.
These torn accesses are unlikely to show up as bugs, but from code
inspection they seem possible. For example, tb_phys_invalidate does:
/* remove the TB from the hash list */
h = tb_jmp_cache_hash_func(tb->pc);
CPU_FOREACH(cpu) {
if (atomic_read(&cpu->tb_jmp_cache[h]) == tb) {
atomic_set(&cpu->tb_jmp_cache[h], NULL);
}
}
Here atomic_set might race with a concurrent memset (such as the
ones scheduled via "unsafe" async work, e.g. tlb_flush_page) and
therefore we might end up with a torn pointer (or who knows what,
because we are under undefined behaviour).
This patch converts parallel accesses to cpu->tb_jmp_cache to use
atomic primitives, thereby bringing these accesses back to defined
behaviour. The price to pay is to potentially execute more instructions
when clearing cpu->tb_jmp_cache, but given how infrequently they happen
and the small size of the cache, the performance impact I have measured
is within noise range when booting debian-arm.
Note that under "safe async" work (e.g. do_tb_flush) we could use memset
because no other vcpus are running. However I'm keeping these accesses
atomic as well to keep things simple and to avoid confusing analysis
tools such as ThreadSanitizer.
Backports commit f3ced3c59287dabc253f83f0c70aa4934470c15e from qemu
We are relying on cpu_env being defined as a global, yet most
targets (i.e. all but arm/a64) have it defined as a local variable.
Luckily all of them use the same "cpu_env" name, but really
compilation shouldn't break if the name of that local variable
changed.
Fix it by using tcg_ctx.tcg_env, which all targets set in their
translate_init function. This change also helps paving the way
for the upcoming "translation loop common to all targets" work.
Backports commit 53f6672bcf57d82b794a2cc3a3469be7d35c8653 from qemu
Add fsabs, fdabs, fsneg, fdneg, fsmove and fdmove.
The value is converted using the new floatx80_round() function.
Backports commit 77bdb2292492fafc4bc0fbb4d8c44fdd0ef1fa8e from qemu
Add a function to round a floatx80 to the defined precision
(floatx80_rounding_precision)
Backports commit 0f72129281765ed64d26353284059f2bdcde7a23 from qemu
fmovecr moves a floating point constant from the
FPU ROM to a floating point register.
Backports commit 9d403660d91229922c2786e81c23cc9dd8e644f1 from qemu
This may be used for deprecated object properties that are kept for
backwards compatibility.
Backports commit a733371214b68881d84725a3c71f60e2faf3b8e2 from qemu
This replaces env1 and page_index variables by env and index
so we can use VICTIM_TLB_HIT macro later.
Backports commit 3416343255cbe01fbe12e5e36cd4bb5042425b27 from qemu
Coldfire uses float64, but 680x0 use floatx80.
This patch introduces the use of floatx80 internally
and enables 680x0 80bits FPU.
Backports commit f83311e4764f1f25a8abdec2b32c64483be1759b from qemu
Switch to use QNum/uint where appropriate to remove i64 limitation.
The input visitor will cast i64 input to u64 for compatibility
reasons (existing json QMP client already use negative i64 for large
u64, and expect an implicit cast in qemu).
Note: before the patch, uint64_t values above INT64_MAX are sent over
json QMP as negative values, e.g. UINT64_MAX is sent as -1. After the
patch, they are sent unmodified. Clearly a bug fix, but we have to
consider compatibility issues anyway. libvirt should cope fine,
because its parsing of unsigned integers accepts negative values
modulo 2^64. There's hope that other clients will, too.
Backports commit 5923f85fb82df7c8c60a89458a5ae856045e5ab1 from qemu
In order to store integer values between INT64_MAX and UINT64_MAX, add
a uint64_t internal representation.
Backports commit 61a8f418b26a2d974e38e4ae55020aca8d402d88 from qemu
Before the previous commit, parameter promote_int = true made
visit_start_alternate() with an input visitor avoid QTYPE_QINT
variants and create QTYPE_QFLOAT variants instead. This was used
where QTYPE_QINT variants were invalid.
The previous commit fused QTYPE_QINT with QTYPE_QFLOAT, rendering
promote_int useless and unused.
Backports commit 60390d2dc85ffade8981ca41e02335cb07353a6d from qemu
We would like to use a same QObject type to represent numbers, whether
they are int, uint, or floats. Getters will allow some compatibility
between the various types if the number fits other representations.
Add a few more tests while at it.
Backports commit 01b2ffcedd94ad7b42bc870e4c6936c87ad03429 from qemu
QAPI_CLONE() returns a newly allocated QAPI object. Inconvenient when
we want to clone into an existing object. QAPI_CLONE_MEMBERS() does
exactly that.
Backports commit 4626a19c86c30d96cedbac2bd44ef8103303cb37 from qemu
Rather than making lots of callers wrap a scalar in a QInt, QString,
or QBool, provide helper macros that do the wrapping automatically.
Update the Coccinelle script to make mass conversions easy, although
the conversion itself will be done as a separate patches to ease
review and backport efforts.
Backports commit a92c21591b5bb9543996538f14854ca6b528318b from qemu
Visiting a list when input is the empty string should result in an
empty list, not an error. Noticed when commit 3d089ce belatedly added
tests, but simply accepted as weird then. It's actually a regression:
broken in commit 74f24cb, v2.7.0. Fix it, and throw in another test
case for empty string.
Backports commit d2788227c6185c72d88ef3127e9fed41686f8e39 from qemu
We can call tb_htable_lookup even when the tb_jmp_cache is completely
empty. Therefore, un-nest most of the code dependent on tb != NULL
from the read from the cache.
This improves the hit rate of lookup_tb_ptr; for instance, when booting
and immediately shutting down debian-arm, the hit rate improves from
93.2% to 99.4%.
Backports commit b97a879de980e99452063851597edb98e7e8039c from qemu
The new placement of the TB means that we can use one insn
to load the goto_tb destination directly from the TB.
Backports commit 308714e6bc945389c64faf1b9213e2c0d3f03391 from qemu
Since we're no longer using a direct branch, we have no
limit on the branch distance.
Backports commit acb0b292b6d0f49972dc98f742e79ed53973e438 from qemu
The new placement of the TB means that we can use one insn
to load the return value for exit_tb returning the TB pointer.
Backports commit cc74d332ff9a78684374847375ef63fc4bd10436 from qemu
We are partially initializing tb in tb_alloc. Instead, fully
initialize it in tb_gen_code, which is tb_alloc's only caller.
This saves an unnecessary write to tb->cflags.
Backports commit 2b48e10f888059a98043b4816769fa2a326a1d2c from qemu
Allocating an arbitrarily-sized array of tbs results in either
(a) a lot of memory wasted or (b) unnecessary flushes of the code
cache when we run out of TB structs in the array.
An obvious solution would be to just malloc a TB struct when needed,
and keep the TB array as an array of pointers (recall that tb_find_pc()
needs the TB array to run in O(log n)).
Perhaps a better solution, which is implemented in this patch, is to
allocate TB's right before the translated code they describe. This
results in some memory waste due to padding to have code and TBs in
separate cache lines--for instance, I measured 4.7% of padding in the
used portion of code_gen_buffer when booting aarch64 Linux on a
host with 64-byte cache lines. However, it can allow for optimizations
in some host architectures, since TCG backends could safely assume that
the TB and the corresponding translated code are very close to each
other in memory. See this message by rth for a detailed explanation:
https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html
Subject: Re: GSoC 2017 Proposal: TCG performance enhancements
Backports commit 6e3b2bfd6af488a896f7936e99ef160f8f37e6f2 from qemu
Add helpers to gather cache info from the host at init-time.
For now, only export the host's I/D cache line sizes, which we
will use to improve cache locality to avoid false sharing.
Backports commit b255b2c8a5484742606e8760870ba3e14d0c9605 from qemu
V flag for subtraction is:
v = (res ^ src1) & (src1 ^ src2)
(see COMPUTE_CCR() in target/m68k/helper.c)
But gen_flush_flags() uses:
v = (res ^ src2) & (src1 ^ src2)
The problem has been found with the following program:
.global _start
_start:
move.l #-2147483648,%d0
subq.l #1,%d0
jvc 1f
move.l #1,%d1
move.l #1,%d0
trap #0
1:
move.l #0,%d1
move.l #1,%d0
trap #0
It works fine (exit(1)) on real hardware, and with "-singlestep".
"-singlestep" uses gen_helper_flush_flags(), whereas
without "-singlestep", V flag is computed directly in
gen_flush_flags().
This patch updates gen_flush_flags() to have the same result
as with gen_helper_flush_flags().
Backports commit 043b936ef6fe53396b3c6b8f5562ea3e238a071d from qemu
Running Windows with icount causes a crash in instruction of write cr.
This patch fixes it.
Reading and writing cr cause an icount read because there are called
cpu_get_apic_tpr and cpu_set_apic_tpr functions. So, there is need
gen_io_start()/gen_io_end() calls.
Backports commit 5b003a40bb1ab14d0398e91f03393d3c6b9577cd from qemu
This speeds up SMM switches. Later on it may remove the need to take
the BQL, and it may also allow to reuse code between TCG and KVM.
Backports commit f8c45c6550b9ff1e1f0b92709ff3213a79870879 from qemu
It really only plays with the dispatchers, so the parameter list does
not need that complexity. This helps for readability at least.
Backports commit 003a0cf2cd1828a1141a874428571267b117f765 from qemu
In theory this would re-enable usage of QEMU on an armv4 host.
Whether this is worthwhile is debatable -- we've been unconditionally
issuing the armv5t BX instruction in the prologue since 2011 without
complaint. Possibly we should simply require an armv6 host.
Backports commit 702a947484eb3e615183dafc93de590ab0679f60 from qemu
Instead of unconditionally exiting to the exec loop, use the
gen_jr helper to jump to the target if it is valid.
Perf impact: see next commit's log.
Backports commit fe62089563ffc6a42f16ff28a6b6be34d2697766 from qemu
Instead of unconditionally exiting to the exec loop, use the
lookup_and_goto_ptr helper to jump to the target if it is valid.
Perf impact: see next commit's log.
Backports commit 7ad55b4ffd982c80f26f7f3658138d94cdc678e8 from qemu
Instead of exporting goto_ptr directly to TCG frontends, export
tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer
returned by the lookup_tb_ptr() helper. This is the only use case
we have for goto_ptr and lookup_tb_ptr, so having this function is
very convenient. Furthermore, it trivially allows us to avoid calling
the lookup helper if goto_ptr is not implemented by the backend.
Backports commit cedbcb01529cb6cf9a2289cdbebbc63f6149fc18 from qemu
We need to coordinate with the TCG_OVERSIZED_GUEST test in cputlb.c,
and allow 64-bit atomics even though sizeof(void *) == 4.
Backports commit 374aae653499f4d405caf32b7fff0c8639113fe4 from qemu
The cp15, CRn=15, opc1=0, CRm=5, opc2=0 instruction invalidates all the
data cache on the cortex-r5. Implementing it as a NOP.
Backports commit 95e9a242e2a393c7d4e5cc04340e39c3a9420f03 from qemu
M profile doesn't implement ARM, and the architecturally required
behaviour for attempts to execute with the Thumb bit clear is to
generate a UsageFault with the CFSR INVSTATE bit set. We were
incorrectly implementing this as generating an UNDEFINSTR UsageFault;
fix this.
Backports commit e13886e3a790b52f0b2e93cb5e84fdc2ada5471a from qemu
Implement the exception return consistency checks
described in the v7M pseudocode ExceptionReturn().
Inspired by a patch from Michael Davidsaver's series, but
this is a reimplementation from scratch based on the
ARM ARM pseudocode.
Backports commit aa488fe3bb5460c6675800ccd80f6dccbbd70159 from qemu
Extract the code from the tail end of arm_v7m_do_interrupt() which
enters the exception handler into a pair of utility functions
v7m_exception_taken() and v7m_push_stack(), which correspond roughly
to the pseudocode PushStack() and ExceptionTaken().
This also requires us to move the arm_v7m_load_vector() utility
routine up so we can call it.
Handling illegal exception returns has some cases where we want to
take a UsageFault either on an existing stack frame or with a new
stack frame but with a specific LR value, so we want to be able to
call these without having to go via arm_v7m_cpu_do_interrupt().
Backports commit 39ae2474e337247e5930e8be783b689adc9f6215 from qemu
All the places in armv7m_cpu_do_interrupt() which pend an
exception in the NVIC are doing so for synchronous
exceptions. We know that we will always take some
exception in this case, so we can just acknowledge it
immediately, rather than returning and then immediately
being called again because the NVIC has raised its outbound
IRQ line.
Backports commit a25dc805e2e63a55029e787a52335e12dabf07dc from qemu
The M profile condition for when we can take a pending exception or
interrupt is not the same as that for A/R profile. The code
originally copied from the A/R profile version of the
cpu_exec_interrupt function only worked by chance for the
very simple case of exceptions being masked by PRIMASK.
Replace it with a call to a function in the NVIC code that
correctly compares the priority of the pending exception
against the current execution priority of the CPU.
Backports commit 7ecdaa4a9635f1ded0dfa9218c25273b6d4dcd44 from qemu
Having armv7m_nvic_acknowledge_irq() return the new value of
env->v7m.exception and its one caller assign the return value
back to env->v7m.exception is pointless. Just make the return
type void instead.
Backports commit a5d8235545e98c1ce02560d5f4f57552d937efe9 from qemu
Implement HFNMIENA support for the M profile MPU. This bit controls
whether the MPU is treated as enabled when executing at execution
priorities of less than zero (in NMI, HardFault or with the FAULTMASK
bit set).
Doing this requires us to use a different MMU index for "running
at execution priority < 0", because we will have different
access permissions for that case versus the normal case.
Backports commit 3bef7012560a7f0ea27b265105de5090ba117514 from qemu
The M series MPU is almost the same as the already implemented R
profile MPU (v7 PMSA). So all we need to implement here is the MPU
register interface in the system register space.
This implementation has the same restriction as the R profile MPU
that it doesn't permit regions to be sized down smaller than 1K.
We also do not yet implement support for MPU_CTRL.HFNMIENA; this
bit should if zero disable use of the MPU when running HardFault,
NMI or with FAULTMASK set to 1 (ie at an execution priority of
less than zero) -- if the MPU is enabled we don't treat these
cases any differently.
Backports commit 29c483a506070e8f554c77d22686f405e30b9114 from qemu
General logic is that operations stopped by the MPU are MemManage,
and those which go through the MPU and are caught by the unassigned
handle are BusFault. Distinguish these by looking at the
exception.fsr values, and set the CFSR bits and (if appropriate)
fill in the BFAR or MMFAR with the exception address.
Backports commit 5dd0641d234e355597be62e5279d8a519c831625 from qemu
All M profile CPUs are PMSA, so set the feature bit.
(We haven't actually implemented the M profile MPU register
interface yet, but setting this feature bit gives us closer
to correct behaviour for the MPU-disabled case.)
Backports commit 790a11503cfb5e1dcd031ea2212bbebae4ca3cec from qemu
Add support for the M profile default memory map which is used
if the MPU is not present or disabled.
The main differences in behaviour from implementing this
correctly are that we set the PAGE_EXEC attribute on
the right regions of memory, such that device regions
are not executable.
Backports commit 3a00d560bcfca7ad04327062c1986a016c104b1f from qemu
Improve the "-d mmu" tracing for the PMSAv7 MPU translation
process as an aid in debugging guest MPU configurations:
* fix a missing newline for a guest-error log
* report the region number with guest-error or unimp
logs of bad region register values
* add a log message for the overall result of the lookup
* print "0x" prefix for hex values
Backports commit c9f9f1246d630960bce45881e9c0d27b55be71e2 from qemu
Now that we enforce both:
* pmsav7_dregion == 0 implies has_mpu == false
* PMSA with has_mpu == false means SCTLR.M cannot be set
we can remove a check on pmsav7_dregion from get_phys_addr_pmsav7(),
because we can only reach this code path if the MPU is enabled
(and so region_translation_disabled() returned false).
Backports commit e9235c6983b261e04e897e8ff900b2b7a391e644 from qemu
If the CPU is a PMSA config with no MPU implemented, then the
SCTLR.M bit should be RAZ/WI, so that the guest can never
turn on the non-existent MPU.
Backports commit 06312febfb2d35367006ef23608ddd6a131214d4 from qemu
Fix the handling of QOM properties for PMSA CPUs with no MPU:
Allow no-MPU to be specified by either:
* has-mpu = false
* pmsav7_dregion = 0
and make setting one imply the other. Don't clear the PMSA
feature bit in this situation.
Backports commit f50cd31413d8bc9d1eef8edd1f878324543bf65d from qemu
ARM CPUs come in two flavours:
* proper MMU ("VMSA")
* only an MPU ("PMSA")
For PMSA, the MPU may be implemented, or not (in which case there
is default "always acts the same" behaviour, but it isn't guest
programmable).
QEMU is a bit confused about how we indicate this: we have an
ARM_FEATURE_MPU, but it's not clear whether this indicates
"PMSA, not VMSA" or "PMSA and MPU present" , and sometimes we
use it for one purpose and sometimes the other.
Currently trying to implement a PMSA-without-MPU core won't
work correctly because we turn off the ARM_FEATURE_MPU bit
and then a lot of things which should still exist get
turned off too.
As the first step in cleaning this up, rename the feature
bit to ARM_FEATURE_PMSA, which indicates a PMSA CPU (with
or without MPU).
Backports commit 452a095526a0537f16c271516a2200877a272ea8 from qemu
Make M profile use completely separate ARMMMUIdx values from
those that A profile CPUs use. This is a prelude to adding
support for the MPU and for v8M, which together will require
6 MMU indexes which don't map cleanly onto the A profile
uses:
non secure User
non secure Privileged
non secure Privileged, execution priority < 0
secure User
secure Privileged
secure Privileged, execution priority < 0
Backports commit e7b921c2d9efc249f99b9feb0e7dca82c96aa5c4 from qemu
The v7M exception architecture requires that if a synchronous
exception cannot be taken immediately (because it is disabled
or at too low a priority) then it should be escalated to
HardFault (and the HardFault exception is then taken).
Implement this escalation logic.
Backports commit a73c98e159d18155445d29b6044be6ad49fd802f from qemu
The M profile CPU's MPU has an awkward corner case which we
would like to implement with a different MMU index.
We can avoid having to bump the number of MMU modes ARM
uses, because some of our existing MMU indexes are only
used by non-M-profile CPUs, so we can borrow one.
To avoid that getting too confusing, clean up the code
to try to keep the two meanings of the index separate.
Instead of ARMMMUIdx enum values being identical to core QEMU
MMU index values, they are now the core index values with some
high bits set. Any particular CPU always uses the same high
bits (so eventually A profile cores and M profile cores will
use different bits). New functions arm_to_core_mmu_idx()
and core_to_arm_mmu_idx() convert between the two.
In general core index values are stored in 'int' types, and
ARM values are stored in ARMMMUIdx types.
Backports commit 8bd5c82030b2cb09d3eef6b444f1620911cc9fc5 from qemu
The PMUv3 driver of linux kernel (in arch/arm64/kernel/perf_event.c)
relies on the PMUVER field of id_aa64dfr0_el1 to decide if PMU support
is present or not. This patch clears the PMUVER field under TCG mode
when vPMU=off. Without it, PMUv3 will init insider guest VMs even
with vPMU=off. This patch also removes a redundant line inside the
if-statement.
Backports commit 2b3ffa929249b15a75d8bde3e8e57a744f52aff0 from qemu
When identifying the DFSR format for an alignment fault, use
the mmu index that we are passed, rather than calling cpu_mmu_index()
to get the mmu index for the current CPU state. This doesn't actually
make any difference since the only cases where the current MMU index
differs from the index used for the load are the "unprivileged
load/store" instructions, and in that case the mmu index may
differ but the translation regime is the same (apart from the
"use from Hyp mode" case which is UNPREDICTABLE).
However it's the more logical thing to do.
Backports commit e517d95b63427fae9f03958dbc005c36b4ebf2cf from qemu