Ensure direct jump patching in s390 is atomic by:
* naturally aligning a location of direct jump address;
* using atomic_read()/atomic_set() for code patching.
Backports commit ed3d51ecd7fe248d3959e469d53890ac9ffe0cd2 from qemu
Ensure direct jump patching in i386 is atomic by:
* naturally aligning a location of direct jump address;
* using atomic_read()/atomic_set() for code patching.
Backports commit 0d07abf05e98903c7faf204a9a90f7d45b7554dc from qemu
We are inconsistent with the type of tb->flags: usage varies loosely
between int and uint64_t. Settle to uint32_t everywhere, which is
superior to both: at least one target (aarch64) uses the most significant
bit in the u32, and uint64_t is wasteful.
Compile-tested for all targets.
Backports commit 89fee74a0f066dfd73830a7b5fa137e87888c870 from qemu
Use tcg_set_insn_param() instead of directly accessing internal
tcg data structures to update an insn param.
Backports commit 25caa94c4a26daaab1e65c6d887e2972aeb5749e from qemu
Move the architecture agnostic function prototypes for exec.c out of
cputlb.h to exec-all.h. This allows hiding of the arch specific
cputlb.h from exec.c which should be getting close to having no
architecture specifics. Prepares support for multi-arch, which will have
a minimal cpu.h that services exec.c but not cputlb.h.
Backports commit dfccc7602374c9fd3b083208b552d62daa244811 from qemu
To prepare for multi-arch, cputlb.c should only have awareness of one
single architecture. This means it should not have access to the full
CPU lists which may be heterogeneous. Instead, push the CPU_LOOP() up
to the one and only caller in exec.c.
Backports commit 9a13565d52bfd321934fb44ee004bbaf5f5913a8 from qemu
The last two arguments to these functions are the last and first bit to
check relative to the base. The code was using incorrectly the first
bit and the number of bits. Fix this in cpu_physical_memory_get_dirty
and cpu_physical_memory_all_dirty. This requires a few changes in the
iteration; change the code in cpu_physical_memory_set_dirty_range to
match.
Backports commit 88c73d16ad1b6c22a2ab082064d0d521f756296a from qemu
Although accesses to ram_list.dirty_memory[] use atomics so multiple
threads can safely dirty the bitmap, the data structure is not fully
thread-safe yet.
This patch handles the RAM hotplug case where ram_list.dirty_memory[] is
grown. ram_list.dirty_memory[] is change from a regular bitmap to an
RCU array of pointers to fixed-size bitmap blocks. Threads can continue
accessing bitmap blocks while the array is being extended. See the
comments in the code for an in-depth explanation of struct
DirtyMemoryBlocks.
I have tested that live migration with virtio-blk dataplane works.
Backports commit 5b82b703b69acc67b78b98a5efc897a3912719eb from qemu
qemu-log: dfilter-ise exec, out_asm, op and opt_op
This ensures the code generation debug code will honour -dfilter if set.
For the "exec" tracing I've added a new inline macro for efficiency's
sake.
Backports commit d977e1c2dbc9e63454b2000f91954d02543bf43b from qemu
Improve the TB execution logging so that it is easier to identify
what is happening from trace logs:
* move the "Trace" logging of executed TBs into cpu_tb_exec()
so that it is emitted if and only if we actually execute a TB,
and for consistency for the CPU state logging
* log when we link two TBs together via tb_add_jump()
* log when cpu_tb_exec() returns early from a chain of TBs
The new style logging looks like this:
Trace 0x7fb7cc822ca0 [ffffffc0000dce00]
Linking TBs 0x7fb7cc822ca0 [ffffffc0000dce00] index 0 -> 0x7fb7cc823110 [ffffffc0000dce10]
Trace 0x7fb7cc823110 [ffffffc0000dce10]
Trace 0x7fb7cc823420 [ffffffc000302688]
Trace 0x7fb7cc8234a0 [ffffffc000302698]
Trace 0x7fb7cc823520 [ffffffc0003026a4]
Trace 0x7fb7cc823560 [ffffffc0000dce44]
Linking TBs 0x7fb7cc823560 [ffffffc0000dce44] index 1 -> 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc8235d0 [ffffffc0000dce70]
Stopped execution of TB chain before 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc822fd0 [ffffffc0000dd52c]
Backports commit 1a830635229e14c403600167823ea6b3b79d3097 from qemu
Just specifying ops = NULL in some cases can be more convenient than having
two functions.
Backports commit 6d6d2abf2c2e52c0f404d0a31a963e945b0cc7ad from qemu
All references to mr->ram_addr are replaced by
memory_region_get_ram_addr(mr) (except for a few assertions that are
replaced with mr->ram_block).
Backports commit 8e41fb63c5bf29ecabe0cee1239bf6230f19978a from qemu
Previously we return RAMBlock.offset; now return the pointer to the
whole structure.
ram_block_add returns void now, error is completely passed with errp.
Backports commit 528f46af6ecd1e300db18684969104d4067b867b from qemu
these two functions consume too much cpu overhead to
find the RAMBlock by ram address.
After this patch, we can pass the RAMBlock pointer
to them so that they don't need to find the RAMBlock
anymore most of the time. We can get better performance
in address translation processing.
Backports commit 3655cb9c7375a595a8051ec677c515b24d5c1fe6 from qemu
Each RAM memory region has a unique corresponding RAMBlock.
In the current realization, the memory region only stored
the ram_addr which means the offset of RAM address space,
We need to qurey the global ram.list to find the ram block
by ram_addr if we want to get the ram block, which is very
expensive.
Now, we store the RAMBlock pointer into memory region
structure. So, if we know the mr, we can easily get the
RAMBlock.
Backports commit 58eaa2174e99d9a05172d03fd2799ab8fd9e6f60 from qemu
This condition is true in the common case, so we can cut out the body of
the function. In addition, this makes it easier for the compiler to do
at least partial inlining, even if it decides that fully inlining the
function is unreasonable.
Backports commit 8bafcb21643a39a5b29109f8bd5ee5a6f0f6850b from qemu
check the return value of the function it calls and error if it's non-0
Fixup qemu_rdma_init_one_block that is the only current caller,
and rdma_add_block the only function it calls using it.
Pass the name of the ramblock to the function; helps in debugging.
Backports commit e3807054e20fb3b94d18cb751c437ee2f43b6fac from qemu
This will either create a new AS or return a pointer to an
already existing equivalent one, if we have already created
an AS for the specified root memory region.
The motivation is to reuse address spaces as much as possible.
It's going to be quite common that bus masters out in device land
have pointers to the same memory region for their mastering yet
each will need to create its own address space. Let the memory
API implement sharing for them.
Aside from the perf optimisations, this should reduce the amount
of redundant output on info mtree as well.
Thee returned value will be malloced, but the malloc will be
automatically freed when the AS runs out of refs.
Backports commit f0c02d15b57da6f5463e3768aa0cfeedccf4b8f4 from qemu
Add a function to return the AddressSpace for a CPU based on
its numerical index. (Callers outside exec.c don't have access
to the CPUAddressSpace struct so can't just fish it out of the
CPUState struct directly.)
Backports commit 651a5bc03705102de519ebf079a40ecc1da991db from qemu
Pass the MemTxAttrs for the memory access to iotlb_to_region(); this
allows it to determine the correct AddressSpace to use for the lookup.
Backports commit a54c87b68a0410d0cf6f8b84e42074a5cf463732 from qemu
When looking up the MemoryRegionSection for the new TLB entry in
tlb_set_page_with_attrs(), use cpu_asidx_from_attrs() to determine
the correct address space index for the lookup, and pass it into
address_space_translate_for_iotlb().
Backports commit d7898cda81b6efa6b2d7a749882695cdcf280eaa from qemu
Allow multiple calls to cpu_address_space_init(); each
call adds an entry to the cpu->ases array at the specified
index. It is up to the target-specific CPU code to actually use
these extra address spaces.
Since this multiple AddressSpace support won't work with
KVM, add an assertion to avoid confusing failures.
Backports commit 12ebc9a76dd7702aef0a3618717a826c19c34ef4 from qemu
Rather than setting cpu->as unconditionally in cpu_exec_init
(and then having target-i386 override this later), don't set
it until the first call to cpu_address_space_init.
This requires us to initialise the address space for
both TCG and KVM (KVM doesn't need the AS listener but
it does require cpu->as to be set).
For target CPUs which don't set up any address spaces (currently
everything except i386), add the default address_space_memory
in qemu_init_vcpu().
Backports commit 56943e8cc14b7eeeab67d1942fa5d8bcafe3e53f from qemu
memcpy can take a large amount of time for small reads and writes.
Handle the common case of reading s/g descriptors from memory (there
is no corresponding "write" case that is as common, because writes
often use address_space_st* functions) by inlining the relevant
parts of address_space_read into the caller.
Backports commit 3cc8f884996584630734a90c9b3c535af81e3c92 from qemu
We want to inline the case where there is only one iteration, because
then the compiler can also inline the memcpy. As a start, extract
everything after the first address_space_translate call.
Backports commit a203ac702e0720135fac8b1f2061d119814c1798 from qemu
For the common case of DMA into non-hotplugged RAM, it is unnecessary
but expensive to do object_ref/unref. Add back an owner field to
MemoryRegion, so that these memory regions can skip the reference
counting.
Backports commit 612263cf33062f7441a5d0e3b37c65991fdc3210 from qemu
Order fields so that all fields accessed during a RAM read/write fit in
the same cache line.
Backports commit a676854f3447019c7c4b005ab6aece905fccfddd from qemu
Replace qemu_ram_free_from_ptr() with qemu_ram_free().
The only difference between qemu_ram_free_from_ptr() and
qemu_ram_free() is that g_free_rcu() is used instead of
call_rcu(reclaim_ramblock). We can safely replace it because:
* RAM blocks allocated by qemu_ram_alloc_from_ptr() always have
RAM_PREALLOC set;
* reclaim_ramblock(block) will do nothing except g_free(block)
if RAM_PREALLOC is set at block->flags.
Backports commit a29ac16632aec6065c72985b9f7eeb1ca6fbef4a from qemu
Add a function to find a RAMBlock by name; use it in two
of the places that already open code that loop; we've
got another use later in postcopy.
Backports commit e3dd74934f2d2c8c67083995928ff68e8c1d0030 from qemu
Postcopy sends RAMBlock names and offsets over the wire (since it can't
rely on the order of ramaddr being the same), and it starts out with
HVA fault addresses from the kernel.
qemu_ram_block_from_host translates a HVA into a RAMBlock, an offset
in the RAMBlock and the global ram_addr_t value.
Rewrite qemu_ram_addr_from_host to use qemu_ram_block_from_host.
Provide qemu_ram_get_idstr since its the actual name text sent on the
wire.
Backports commit 422148d3e56c3c9a07c0cf36c1e0a0b76f09c357 from qemu
This makes ROM blocks resizeable. This infrastructure is required for other
functionality we have queued.
Backports commit aaf03019175949eda5087329448b8a0033b89479 from qemu
At present, the "average" guestimate of TB size is way too small, leading
to many unused entries in the pre-allocated TB array. For a guest with 1GB
ram, we're currently allocating 256MB for the array.
Survey arm, alpha, aarch64, ppc, sparc, i686, x86_64 guests running on
x86_64 and ppc64 hosts and select a new average. The size of the array
drops to 81MB with no more flushing than before.
Backports commit 126d89e8cdfa3be15d51f76906eaccbcd0023f98 from qemu
We currently pre-compute an worst case code size for any TB, which
works out to be 122kB. Since the average TB size is near 1kB, this
wastes quite a lot of storage.
Instead, check for overflow in between generating code for each opcode.
The overhead of the check isn't measurable and wastage is minimized.
Backports commit b125f9dc7bd68cd4c57189db4da83b0620b28a72 from qemu
It is no longer used, so tidy up everything reached by it.
This includes the gen_opc_* arrays, the search_pc parameter
and the inline gen_intermediate_code_internal functions.
Backports commit 4e5e1215156662b2b153255c49d4640d82c5568b from qemu
In this case, QEMU might longjmp out of cpu-exec.c and miss the final
cleanup in cpu_exec_nocache. Do this manually through a new compile
flag.
Backports commit d8a499f17ee5f05407874f29f69f0e3e3198a853 from qemu
The gen_opc_* arrays are already redundant with the data stored in
the insn_start arguments. Transition restore_state_to_opc to use
data from the latter.
Backports commit bad729e272387de7dbfa3ec4319036552fc6c107 from qemu
Change tlb_set_dirty() to accept a CPU instead of an env pointer. This
allows for removal of another CPUArchState usage from prototypes that
need to be QOMified.
Backports commit bcae01e468d961ad9afaf4148329147e4be209ab from qemu
There is some iffy lock hierarchy going on in translate-all.c. To
fix it, we need to take the mmap_lock in cpu-exec.c. Make the
functions globally available.
Backports commit 8fd19e6cfd5b6cdf028c6ac2ff4157ed831ea3a6 from qemu