Commit graph

174 commits

Author SHA1 Message Date
Sergey Fedorov ba9a237586
tcg: Rework tb_invalidated_flag
'tb_invalidated_flag' was meant to catch two events:
* some TB has been invalidated by tb_phys_invalidate();
* the whole translation buffer has been flushed by tb_flush().

Then it was checked:
* in cpu_exec() to ensure that the last executed TB can be safely
linked to directly call the next one;
* in cpu_exec_nocache() to decide if the original TB should be provided
for further possible invalidation along with the temporarily
generated TB.

It is always safe to patch an invalidated TB since it is not going to be
used anyway. It is also safe to call tb_phys_invalidate() for an already
invalidated TB. Thus, setting this flag in tb_phys_invalidate() is
simply unnecessary. Moreover, it can prevent from pretty proper linking
of TBs, if any arbitrary TB has been invalidated. So just don't touch it
in tb_phys_invalidate().

If this flag is only used to catch whether tb_flush() has been called
then rename it to 'tb_flushed'. Declare it as 'bool' and stick to using
only 'true' and 'false' to set its value. Also, instead of setting it in
tb_gen_code(), just after tb_flush() has been called, do it right inside
of tb_flush().

In cpu_exec(), this flag is used to track if tb_flush() has been called
and have made 'next_tb' (a reference to the last executed TB) invalid
for linking it to directly call the next TB. tb_flush() can be called
during the CPU execution loop from tb_gen_code(), during TB execution or
by another thread while 'tb_lock' is released. Catch for translation
buffer flush reliably by resetting this flag once before first TB lookup
and each time we find it set before trying to add a direct jump. Don't
touch in in tb_find_physical().

Each vCPU has its own execution loop in multithreaded mode and thus
should have its own copy of the flag to be able to reset it with its own
'next_tb' and don't affect any other vCPU execution thread. So make this
flag per-vCPU and move it to CPUState.

In cpu_exec_nocache(), we only need to check if tb_flush() has been
called from tb_gen_code() called by cpu_exec_nocache() itself. To do
this reliably, preserve the old value of the flag, reset it before
calling tb_gen_code(), check afterwards, and combine the saved value
back to the flag.

This patch is based on the patch "tcg: move tb_invalidated_flag to
CPUState" from Paolo Bonzini <pbonzini@redhat.com>.

Backports commit 6f789be56d3f38e9214dafcfab3bf9be7191f370 from qemu
2018-02-23 23:34:51 -05:00
Sergey Fedorov d60af028c5
tcg: Clarify thread safety check in tb_add_jump()
The check is to make sure that another thread hasn't already done the
same while we were outside of tb_lock. Mention this in a comment.

Backports commit 9962c478b153a18fe88a6509fe58cd178aff8abc from qemu
2018-02-23 21:32:47 -05:00
Sergey Fedorov fbc0a1105f
tcg: Use uintptr_t type for jmp_list_{next|first} fields of TB
These fields do not contain pure pointers to a TranslationBlock
structure. So uintptr_t is the most appropriate type for them.
Also put some asserts to assure that the two least significant bits of
the pointer are always zero before assigning it to jmp_list_first.

Backports commit c37e6d7e3589ecb96914faa21025ad7ba6654aea from qemu
2018-02-23 21:28:19 -05:00
Sergey Fedorov e60c24cecf
tcg: Clean up direct block chaining data fields
Briefly describe in a comment how direct block chaining is done. It
should help in understanding of the following data fields.

Rename some fields in TranslationBlock and TCGContext structures to
better reflect their purpose (dropping excessive 'tb_' prefix in
TranslationBlock but keeping it in TCGContext):
tb_next_offset => jmp_reset_offset
tb_jmp_offset => jmp_insn_offset
tb_next => jmp_target_addr
jmp_next => jmp_list_next
jmp_first => jmp_list_first

Avoid using a magic constant as an invalid offset which is used to
indicate that there's no n-th jump generated.

Backports commit f309101c26b59641fc1aa8fb2a98a5441cdaea03 from qemu
2018-02-23 21:28:19 -05:00
Sergey Fedorov c5b234ed1f
tcg: Note requirement on atomic direct jump patching
Backports commit 10b4f4855537dd421e193a7d0416513116370558 from qemu
2018-02-23 21:28:18 -05:00
Sergey Fedorov 52e2972300
tcg/arm: Make direct jump patching thread-safe
Ensure direct jump patching in ARM is atomic by using
atomic_read()/atomic_set() for code patching.

Backports commit 7d14e0e2d661479985197203589c38840e1066df from qemu
2018-02-23 21:28:18 -05:00
Sergey Fedorov 57359fbe6c
tcg/s390: Make direct jump patching thread-safe
Ensure direct jump patching in s390 is atomic by:
* naturally aligning a location of direct jump address;
* using atomic_read()/atomic_set() for code patching.

Backports commit ed3d51ecd7fe248d3959e469d53890ac9ffe0cd2 from qemu
2018-02-23 21:28:18 -05:00
Sergey Fedorov 5eb2d6618f
tcg/i386: Make direct jump patching thread-safe
Ensure direct jump patching in i386 is atomic by:
* naturally aligning a location of direct jump address;
* using atomic_read()/atomic_set() for code patching.

Backports commit 0d07abf05e98903c7faf204a9a90f7d45b7554dc from qemu
2018-02-23 21:28:17 -05:00
Emilio G. Cota 170f6e0b3b
tb: consistently use uint32_t for tb->flags
We are inconsistent with the type of tb->flags: usage varies loosely
between int and uint64_t. Settle to uint32_t everywhere, which is
superior to both: at least one target (aarch64) uses the most significant
bit in the u32, and uint64_t is wasteful.

Compile-tested for all targets.

Backports commit 89fee74a0f066dfd73830a7b5fa137e87888c870 from qemu
2018-02-23 21:28:11 -05:00
Edgar E. Iglesias bfc74c4da2
gen-icount: Use tcg_set_insn_param
Use tcg_set_insn_param() instead of directly accessing internal
tcg data structures to update an insn param.

Backports commit 25caa94c4a26daaab1e65c6d887e2972aeb5749e from qemu
2018-02-23 20:01:17 -05:00
Lioncash 87130fc884
exec-all: Remove externs
These are unused
2018-02-23 12:43:03 -05:00
Peter Crosthwaite 576f1752a6
include/exec: Move cputlb exec.c defs out
Move the architecture agnostic function prototypes for exec.c out of
cputlb.h to exec-all.h. This allows hiding of the arch specific
cputlb.h from exec.c which should be getting close to having no
architecture specifics. Prepares support for multi-arch, which will have
a minimal cpu.h that services exec.c but not cputlb.h.

Backports commit dfccc7602374c9fd3b083208b552d62daa244811 from qemu
2018-02-23 10:52:25 -05:00
Peter Crosthwaite 97c9423ee8
cputlb: move CPU_LOOP() for tlb_reset() to exec.c
To prepare for multi-arch, cputlb.c should only have awareness of one
single architecture. This means it should not have access to the full
CPU lists which may be heterogeneous. Instead, push the CPU_LOOP() up
to the one and only caller in exec.c.

Backports commit 9a13565d52bfd321934fb44ee004bbaf5f5913a8 from qemu
2018-02-23 10:46:31 -05:00
Paolo Bonzini 9479199c6b
memory: fix usage of find_next_bit and find_next_zero_bit
The last two arguments to these functions are the last and first bit to
check relative to the base. The code was using incorrectly the first
bit and the number of bits. Fix this in cpu_physical_memory_get_dirty
and cpu_physical_memory_all_dirty. This requires a few changes in the
iteration; change the code in cpu_physical_memory_set_dirty_range to
match.

Backports commit 88c73d16ad1b6c22a2ab082064d0d521f756296a from qemu
2018-02-22 19:51:43 -05:00
Stefan Hajnoczi e79e0881cd
memory: RCU ram_list.dirty_memory[] for safe RAM hotplug
Although accesses to ram_list.dirty_memory[] use atomics so multiple
threads can safely dirty the bitmap, the data structure is not fully
thread-safe yet.

This patch handles the RAM hotplug case where ram_list.dirty_memory[] is
grown.  ram_list.dirty_memory[] is change from a regular bitmap to an
RCU array of pointers to fixed-size bitmap blocks.  Threads can continue
accessing bitmap blocks while the array is being extended.  See the
comments in the code for an in-depth explanation of struct
DirtyMemoryBlocks.

I have tested that live migration with virtio-blk dataplane works.

Backports commit 5b82b703b69acc67b78b98a5efc897a3912719eb from qemu
2018-02-22 15:38:03 -05:00
Alex Bennée 3da7d9d9ae
qemu-log: dfilter-ise exec, out_asm, op and opt_op
qemu-log: dfilter-ise exec, out_asm, op and opt_op

This ensures the code generation debug code will honour -dfilter if set.
For the "exec" tracing I've added a new inline macro for efficiency's
sake.

Backports commit d977e1c2dbc9e63454b2000f91954d02543bf43b from qemu
2018-02-22 10:06:19 -05:00
Peter Maydell 3f5e36e15f
qemu-log: Improve the exec TB execution logging
Improve the TB execution logging so that it is easier to identify
what is happening from trace logs:
* move the "Trace" logging of executed TBs into cpu_tb_exec()
so that it is emitted if and only if we actually execute a TB,
and for consistency for the CPU state logging
* log when we link two TBs together via tb_add_jump()
* log when cpu_tb_exec() returns early from a chain of TBs

The new style logging looks like this:

Trace 0x7fb7cc822ca0 [ffffffc0000dce00]
Linking TBs 0x7fb7cc822ca0 [ffffffc0000dce00] index 0 -> 0x7fb7cc823110 [ffffffc0000dce10]
Trace 0x7fb7cc823110 [ffffffc0000dce10]
Trace 0x7fb7cc823420 [ffffffc000302688]
Trace 0x7fb7cc8234a0 [ffffffc000302698]
Trace 0x7fb7cc823520 [ffffffc0003026a4]
Trace 0x7fb7cc823560 [ffffffc0000dce44]
Linking TBs 0x7fb7cc823560 [ffffffc0000dce44] index 1 -> 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc8235d0 [ffffffc0000dce70]
Stopped execution of TB chain before 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc822fd0 [ffffffc0000dd52c]

Backports commit 1a830635229e14c403600167823ea6b3b79d3097 from qemu
2018-02-22 09:40:11 -05:00
Pavel Fedin 0201c71145
Merge memory_region_init_reservation() into memory_region_init_io()
Just specifying ops = NULL in some cases can be more convenient than having
two functions.

Backports commit 6d6d2abf2c2e52c0f404d0a31a963e945b0cc7ad from qemu
2018-02-21 11:23:00 -05:00
Fam Zheng fa7d3e6cdb
memory: Drop MemoryRegion.ram_addr
All references to mr->ram_addr are replaced by
memory_region_get_ram_addr(mr) (except for a few assertions that are
replaced with mr->ram_block).

Backports commit 8e41fb63c5bf29ecabe0cee1239bf6230f19978a from qemu
2018-02-21 08:53:08 -05:00
Fam Zheng 2c1a72635d
memory: Implement memory_region_get_ram_addr with mr->ram_block
Backports commit 7ebb2745acbb8d910eab07dc5f0aa01a4457703c from qemu
2018-02-21 08:53:08 -05:00
Gonglei aa80edbef0
exec: Return RAMBlock pointer from allocating functions
Previously we return RAMBlock.offset; now return the pointer to the
whole structure.

ram_block_add returns void now, error is completely passed with errp.

Backports commit 528f46af6ecd1e300db18684969104d4067b867b from qemu
2018-02-21 08:52:57 -05:00
Gonglei 26951bf754
memory: Remove unreachable return statement
Backports commit d61524486c6e503e502241a2ea834f930f98a6a1 from qemu
2018-02-20 20:54:24 -05:00
Gonglei d25285bc78
memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length
these two functions consume too much cpu overhead to
find the RAMBlock by ram address.

After this patch, we can pass the RAMBlock pointer
to them so that they don't need to find the RAMBlock
anymore most of the time. We can get better performance
in address translation processing.

Backports commit 3655cb9c7375a595a8051ec677c515b24d5c1fe6 from qemu
2018-02-20 20:53:31 -05:00
Gonglei 39e4d63e68
exec: store RAMBlock pointer into memory region
Each RAM memory region has a unique corresponding RAMBlock.
In the current realization, the memory region only stored
the ram_addr which means the offset of RAM address space,
We need to qurey the global ram.list to find the ram block
by ram_addr if we want to get the ram block, which is very
expensive.

Now, we store the RAMBlock pointer into memory region
structure. So, if we know the mr, we can easily get the
RAMBlock.

Backports commit 58eaa2174e99d9a05172d03fd2799ab8fd9e6f60 from qemu
2018-02-20 20:43:32 -05:00
Lioncash c658126845
include: Move RAMList to ramlist.h
Moves the struct back into qemu's headers
2018-02-20 08:47:51 -05:00
Lioncash cdd4003ce9
Move RAMBlock to ram_addr.h
Moves it back into qemu's includes.
2018-02-20 08:35:44 -05:00
Paolo Bonzini cbc56b3ceb
memory: add early bail out from cpu_physical_memory_set_dirty_range
This condition is true in the common case, so we can cut out the body of
the function. In addition, this makes it easier for the compiler to do
at least partial inlining, even if it decides that fully inlining the
function is unreasonable.

Backports commit 8bafcb21643a39a5b29109f8bd5ee5a6f0f6850b from qemu
2018-02-20 08:32:10 -05:00
Lioncash a268815478
include: Add stubbed xen function
Will allow us to not comment out code all the time for xen checks (ideally)
2018-02-20 08:29:58 -05:00
Lioncash 6d5f465449
uc: Handle freeing of multiple address spaces 2018-02-18 21:36:50 -05:00
Dr. David Alan Gilbert 75701d03ee
qemu_ram_foreach_block: pass up error value, and down the ramblock name
check the return value of the function it calls and error if it's non-0
Fixup qemu_rdma_init_one_block that is the only current caller,
  and rdma_add_block the only function it calls using it.

Pass the name of the ramblock to the function; helps in debugging.

Backports commit e3807054e20fb3b94d18cb751c437ee2f43b6fac from qemu
2018-02-18 19:17:18 -05:00
Peter Crosthwaite b82e711a65
memory: Add address_space_init_shareable()
This will either create a new AS or return a pointer to an
already existing equivalent one, if we have already created
an AS for the specified root memory region.

The motivation is to reuse address spaces as much as possible.
It's going to be quite common that bus masters out in device land
have pointers to the same memory region for their mastering yet
each will need to create its own address space. Let the memory
API implement sharing for them.

Aside from the perf optimisations, this should reduce the amount
of redundant output on info mtree as well.

Thee returned value will be malloced, but the malloc will be
automatically freed when the AS runs out of refs.

Backports commit f0c02d15b57da6f5463e3768aa0cfeedccf4b8f4 from qemu
2018-02-18 00:18:21 -05:00
Peter Maydell 1dfba71bef
exec.c: Add cpu_get_address_space()
Add a function to return the AddressSpace for a CPU based on
its numerical index. (Callers outside exec.c don't have access
to the CPUAddressSpace struct so can't just fish it out of the
CPUState struct directly.)

Backports commit 651a5bc03705102de519ebf079a40ecc1da991db from qemu
2018-02-17 23:22:23 -05:00
Peter Maydell 2fe995a0da
exec.c: Pass MemTxAttrs to iotlb_to_region so it uses the right AS
Pass the MemTxAttrs for the memory access to iotlb_to_region(); this
allows it to determine the correct AddressSpace to use for the lookup.

Backports commit a54c87b68a0410d0cf6f8b84e42074a5cf463732 from qemu
2018-02-17 23:19:00 -05:00
Peter Maydell 8edd6ffdfd
cputlb.c: Use correct address space when looking up MemoryRegionSection
When looking up the MemoryRegionSection for the new TLB entry in
tlb_set_page_with_attrs(), use cpu_asidx_from_attrs() to determine
the correct address space index for the lookup, and pass it into
address_space_translate_for_iotlb().

Backports commit d7898cda81b6efa6b2d7a749882695cdcf280eaa from qemu
2018-02-17 23:15:22 -05:00
Peter Maydell 90c7c1bdb5
exec-all.h: Document tlb_set_page_with_attrs, tlb_set_page
Add documentation comments for tlb_set_page_with_attrs()
and tlb_set_page().

Backports commit 1787cc8ee55143b6071c87e59f08d56e7c22c1eb from qemu
2018-02-17 22:37:58 -05:00
Peter Maydell 51369b67cd
exec.c: Allow target CPUs to define multiple AddressSpaces
Allow multiple calls to cpu_address_space_init(); each
call adds an entry to the cpu->ases array at the specified
index. It is up to the target-specific CPU code to actually use
these extra address spaces.

Since this multiple AddressSpace support won't work with
KVM, add an assertion to avoid confusing failures.

Backports commit 12ebc9a76dd7702aef0a3618717a826c19c34ef4 from qemu
2018-02-17 22:35:13 -05:00
Peter Maydell f1b237236c
exec.c: Don't set cpu->as until cpu_address_space_init
Rather than setting cpu->as unconditionally in cpu_exec_init
(and then having target-i386 override this later), don't set
it until the first call to cpu_address_space_init.

This requires us to initialise the address space for
both TCG and KVM (KVM doesn't need the AS listener but
it does require cpu->as to be set).

For target CPUs which don't set up any address spaces (currently
everything except i386), add the default address_space_memory
in qemu_init_vcpu().

Backports commit 56943e8cc14b7eeeab67d1942fa5d8bcafe3e53f from qemu
2018-02-17 22:24:36 -05:00
Paolo Bonzini 1650af8c8b
memory: try to inline constant-length reads
memcpy can take a large amount of time for small reads and writes.
Handle the common case of reading s/g descriptors from memory (there
is no corresponding "write" case that is as common, because writes
often use address_space_st* functions) by inlining the relevant
parts of address_space_read into the caller.

Backports commit 3cc8f884996584630734a90c9b3c535af81e3c92 from qemu
2018-02-17 20:44:39 -05:00
Paolo Bonzini 712c300639
memory: inline a few small accessors
These are used in the address_space_* fast paths.

Backports commit 1619d1fe737d2af068aefe134386a69b76164794 from qemu
2018-02-17 20:35:28 -05:00
Paolo Bonzini 9a78c61145
memory: extract first iteration of address_space_read and address_space_write
We want to inline the case where there is only one iteration, because
then the compiler can also inline the memcpy. As a start, extract
everything after the first address_space_translate call.

Backports commit a203ac702e0720135fac8b1f2061d119814c1798 from qemu
2018-02-17 20:31:21 -05:00
Paolo Bonzini 077ffc3bd5
memory: avoid unnecessary object_ref/unref
For the common case of DMA into non-hotplugged RAM, it is unnecessary
but expensive to do object_ref/unref. Add back an owner field to
MemoryRegion, so that these memory regions can skip the reference
counting.

Backports commit 612263cf33062f7441a5d0e3b37c65991fdc3210 from qemu
2018-02-17 20:10:25 -05:00
Paolo Bonzini e6b25279f8
memory: reorder MemoryRegion fields
Order fields so that all fields accessed during a RAM read/write fit in
the same cache line.

Backports commit a676854f3447019c7c4b005ab6aece905fccfddd from qemu
2018-02-17 19:48:52 -05:00
Eduardo Habkost 26791ea61b
exec: Eliminate qemu_ram_free_from_ptr()
Replace qemu_ram_free_from_ptr() with qemu_ram_free().

The only difference between qemu_ram_free_from_ptr() and
qemu_ram_free() is that g_free_rcu() is used instead of
call_rcu(reclaim_ramblock). We can safely replace it because:

* RAM blocks allocated by qemu_ram_alloc_from_ptr() always have
RAM_PREALLOC set;
* reclaim_ramblock(block) will do nothing except g_free(block)
if RAM_PREALLOC is set at block->flags.

Backports commit a29ac16632aec6065c72985b9f7eeb1ca6fbef4a from qemu
2018-02-17 19:37:45 -05:00
Dr. David Alan Gilbert 60975685ce
qemu_ram_block_by_name
Add a function to find a RAMBlock by name; use it in two
of the places that already open code that loop; we've
got another use later in postcopy.

Backports commit e3dd74934f2d2c8c67083995928ff68e8c1d0030 from qemu
2018-02-17 18:01:16 -05:00
Dr. David Alan Gilbert cc088f84b5
qemu_ram_block_from_host
Postcopy sends RAMBlock names and offsets over the wire (since it can't
rely on the order of ramaddr being the same), and it starts out with
HVA fault addresses from the kernel.

qemu_ram_block_from_host translates a HVA into a RAMBlock, an offset
in the RAMBlock and the global ram_addr_t value.

Rewrite qemu_ram_addr_from_host to use qemu_ram_block_from_host.

Provide qemu_ram_get_idstr since its the actual name text sent on the
wire.

Backports commit 422148d3e56c3c9a07c0cf36c1e0a0b76f09c357 from qemu
2018-02-17 17:54:03 -05:00
Peter Maydell e1a4e4208f
pc: resizeable ROM blocks
This makes ROM blocks resizeable. This infrastructure is required for other
functionality we have queued.

Backports commit aaf03019175949eda5087329448b8a0033b89479 from qemu
2018-02-17 17:18:38 -05:00
Michael S. Tsirkin dce38dd8eb
memory: add memory_region_set_size
Add API to change MR size.
Will be used internally for RAM resize.

Backports commit e7af4c67300b3f9382e96f7a6741a5992116b2d2 from qemu
2018-02-17 16:02:26 -05:00
Richard Henderson a276496ebc
tcg: Adjust CODE_GEN_AVG_BLOCK_SIZE
At present, the "average" guestimate of TB size is way too small, leading
to many unused entries in the pre-allocated TB array. For a guest with 1GB
ram, we're currently allocating 256MB for the array.

Survey arm, alpha, aarch64, ppc, sparc, i686, x86_64 guests running on
x86_64 and ppc64 hosts and select a new average. The size of the array
drops to 81MB with no more flushing than before.

Backports commit 126d89e8cdfa3be15d51f76906eaccbcd0023f98 from qemu
2018-02-17 15:24:01 -05:00
Richard Henderson bdf667fd4e
tcg: Check for overflow via highwater mark
We currently pre-compute an worst case code size for any TB, which
works out to be 122kB. Since the average TB size is near 1kB, this
wastes quite a lot of storage.

Instead, check for overflow in between generating code for each opcode.
The overhead of the check isn't measurable and wastage is minimized.

Backports commit b125f9dc7bd68cd4c57189db4da83b0620b28a72 from qemu
2018-02-17 15:24:00 -05:00
Richard Henderson a5ac288135
tcg: Remove gen_intermediate_code_pc
It is no longer used, so tidy up everything reached by it.
This includes the gen_opc_* arrays, the search_pc parameter
and the inline gen_intermediate_code_internal functions.

Backports commit 4e5e1215156662b2b153255c49d4640d82c5568b from qemu
2018-02-17 15:23:59 -05:00
Richard Henderson 66de6cc37c
tcg: Save insn data and use it in cpu_restore_state_from_tb
We can now restore state without retranslation.

Backports commit fca8a500d519a56abeaedf8073167a61d3c6b9c4 from qemu
2018-02-17 15:23:59 -05:00
Paolo Bonzini cab4c979f0
cpu-exec: add a new CF_USE_ICOUNT cflag
Backports commit 0266359e57987d6be53fbcb885f2dd39c1dae940 from qemu
2018-02-17 15:23:58 -05:00
Pavel Dovgalyuk ac46898b3c
cpu-exec: invalidate nocache translation if they are interrupted
In this case, QEMU might longjmp out of cpu-exec.c and miss the final
cleanup in cpu_exec_nocache.  Do this manually through a new compile
flag.

Backports commit d8a499f17ee5f05407874f29f69f0e3e3198a853 from qemu
2018-02-17 15:23:58 -05:00
Richard Henderson 1cbd175736
tcg: Pass data argument to restore_state_to_opc
The gen_opc_* arrays are already redundant with the data stored in
the insn_start arguments. Transition restore_state_to_opc to use
data from the latter.

Backports commit bad729e272387de7dbfa3ec4319036552fc6c107 from qemu
2018-02-17 15:23:58 -05:00
Peter Crosthwaite afb48e9fc5
cputlb: Change tlb_set_dirty() arg to cpu
Change tlb_set_dirty() to accept a CPU instead of an env pointer. This
allows for removal of another CPUArchState usage from prototypes that
need to be QOMified.

Backports commit bcae01e468d961ad9afaf4148329147e4be209ab from qemu
2018-02-17 15:23:52 -05:00
Paolo Bonzini 195a86283f
exec: make mmap_lock/mmap_unlock globally available
There is some iffy lock hierarchy going on in translate-all.c. To
fix it, we need to take the mmap_lock in cpu-exec.c. Make the
functions globally available.

Backports commit 8fd19e6cfd5b6cdf028c6ac2ff4157ed831ea3a6 from qemu
2018-02-17 15:23:49 -05:00
Pavel Dovgalyuk 4a05c9ee28
cpu-exec: introduce loop exit with restore function
This patch introduces loop exit function, which also
restores guest CPU state according to the value of host
program counter.

Backports commit 1c3c8af1fb40a481c07749e0448644d9b7700415 from qemu
2018-02-17 15:23:38 -05:00
Pavel Dovgalyuk 28f154129b
softmmu: remove now unused functions
Now that the cpu_ld/st_* function directly call helper_ret_ld/st, we can
drop the old helper_ld/st functions.

Backports commit b8611499b940b1b4db67aa985e3a844437bcbf00 from qemu
2018-02-17 15:23:38 -05:00
Pavel Dovgalyuk 6cdaaf9b1b
softmmu: add helper function to pass through retaddr
This patch introduces several helpers to pass return address
which points to the TB. Correct return address allows correct
restoring of the guest PC and icount. These functions should be used when
helpers embedded into TB invoke memory operations.

Backports commit 282dffc8a4bfe8724548cabb8a26698bde0a6e18 from qemu
2018-02-17 15:23:38 -05:00
Benjamin Herrenschmidt 1722be3e73
tlb: Add ifetch argument to cpu_mmu_index()
This is set to true when the index is for an instruction fetch
translation.

The core get_page_addr_code() sets it, as do the SOFTMMU_CODE_ACCESS
acessors.

All targets ignore it for now, and all other callers pass "false".

This will allow targets who wish to split the mmu index between
instruction and data accesses to do so. A subsequent patch will
do just that for PowerPC.

Backports commit 97ed5ccdee95f0b98bedc601ff979e368583472c from qemu
2018-02-17 15:23:37 -05:00
Lioncash f81894dddb
exec: Add semihosting stubs 2018-02-17 15:23:33 -05:00
Peter Maydell 6e94bda144
cputlb: Add functions for flushing TLB for a single MMU index
Guest CPU TLB maintenance operations may be sufficiently
specialized to only need to flush TLB entries corresponding
to a particular MMU index. Implement cputlb functions for
this, to avoid the inefficiency of flushing TLB entries
which we don't need to.

Backports commit d7a74a9d4a68e27b3a8ceda17bb95cb0a23d8e4d from qemu
2018-02-17 15:23:31 -05:00
Peter Crosthwaite 590c3dbb76
cpu_defs: Simplify CPUTLB padding logic
There was a complicated subtractive arithmetic for determining the
padding on the CPUTLBEntry structure. Simplify this with a union.

Backports commit b4a4b8d0e0767c85946fd8fc404643bf5766351a from qemu
2018-02-17 15:23:27 -05:00
Peter Crosthwaite 9e23308b66
cpu: Change cpu_exec_init() arg to cpu, not env
The callers (most of them in target-foo/cpu.c) to this function all
have the cpu pointer handy. Just pass it to avoid an ENV_GET_CPU() from
core code (in exec.c).

Backports commit 4bad9e392e788a218967167a38ce2ae7a32a6231 from qemu
2018-02-17 15:23:18 -05:00
Peter Crosthwaite 8200453545
translate-all: Change tb_flush() env argument to cpu
All of the core-code usages of this API have the cpu pointer handy so
pass it in. There are only 3 architecture specific usages (2 of which
are commented out) which can just use ENV_GET_CPU() locally to get the
cpu pointer. The reduces core code usage of the CPU env, which brings
us closer to common-obj'ing these core files.

Backports commit bbd77c180d7ff1b04a7661bb878939b2e1d23798 from qemu
2018-02-17 15:23:18 -05:00
Peter Crosthwaite 13b919f5c8
cpu-all: complete real host page size API
Currently the "host" page size alignment API is really aligning to both
host and target page sizes. There is the qemu_real_page_size which can
be used for the actual host page size but it's missing a mask and ALIGN
macro as provided for qemu_page_size. Complete the API. This allows
system level code that cares about the host page size to use a
consistent alignment interface without having to un-needingly align to
the target page size. This also reduces system level code dependency
on the cpu specific TARGET_PAGE_SIZE.

Backports commit 4e51361d79289aee2985dfed472f8d87bd53a8df from qemu
2018-02-17 15:23:16 -05:00
Peter Maydell 2f3f2ae092
Stop including qemu-common.h in memory.h
Including qemu-common.h from other header files is generally a bad
idea, because it means it's very easy to end up with a circular
dependency. For instance, if we wanted to include memory.h from
qom/cpu.h we'd end up with this loop:
memory.h -> qemu-common.h -> cpu.h -> cpu-qom.h -> qom/cpu.h -> memory.h

Remove the include from memory.h. This requires us to fix up a few
other files which were inadvertently getting declarations indirectly
through memory.h.

The biggest change is splitting the fprintf_function typedef out
into its own header so other headers can get at it without having
to include qemu-common.h.

Backports commit fba0a593b2809ecdda68650952cf3d3332ac1990 from qemu
2018-02-17 15:23:16 -05:00
Jan Kiszka b93c24ba31
memory: Add global-locking property to memory regions
This introduces the memory region property "global_locking". It is true
by default. By setting it to false, a device model can request BQL-free
dispatching of region accesses to its r/w handlers. The actual BQL
break-up will be provided in a separate patch.

Backports commit 196ea13104f802c508e57180b2a0d2b3418989a3 from qemu
2018-02-17 15:23:16 -05:00
Peter Crosthwaite 82a22d8f3a
cpu-defs: Move out TB_JMP defines
These are not Architecture specific in any way so move them out of
cpu-defs.h. tb-hash.h is an appropriate place as a leading user and
their strong relationship to TB hashing and caching.

Backports commit 41da4bd6420afd1209c408974920f63ff9c658e1 from qemu
2018-02-17 15:23:15 -05:00
Peter Crosthwaite 09d23c6604
include/exec: Move tb hash functions out
This is one of very few things in exec-all with a genuine CPU
architecture dependency. Move these hashing helpers to a new
header to trim exec-all.h down to a near architecture-agnostic
header.

The defs are only used by cpu-exec and translate-all which are both
arch-obj's so the new tb-hash.h has no core code usage.

Backports commit e1b89321bafea9fb33d87852fc91fee579d17dfe from qemu
2018-02-17 15:23:15 -05:00
Peter Crosthwaite 860e4184df
include/exec: Move standard exceptions to cpu-all.h
These exception indicies are generic and don't have any reliance on the
per-arch cpu.h defs. Move them to cpu-all.h so they can be used by core
code that does not have access to cpu-defs.h.

Backports commit 9e0dc48c9f05505b53cb28f860456a0648e56ddf from qemu
2018-02-17 15:23:15 -05:00
Peter Crosthwaite a591219ad6
cpu-defs: Move CPU_TEMP_BUF_NLONGS to tcg
The usages of this define are pure TCG and there is no architecture
specific variation of the value. Localise it to the TCG engine to
remove another architecture agnostic piece from cpu-defs.h.

This follows on from a28177820a868eafda8fab007561cc19f41941f4 where
temp_buf was moved out of the CPU_COMMON obsoleting the need for
the super early definition.

Backports commit 6e0b07306d1793e8402dd218d2e38a7377b5fc27 from qemu
2018-02-17 15:23:15 -05:00
Aurelien Jarno 93df793d4d
softmmu: provide tlb_vaddr_to_host function for user mode
To avoid to many #ifdef in target code, provide a tlb_vaddr_to_host for
both user and softmmu modes. In the first case the function always
succeed and just call the g2h function.

Backports commit 2e83c496261c799b0fe6b8e18ac80cdc0a5c97ce from qemu
2018-02-17 15:22:43 -05:00
Paolo Bonzini dc80b0893f
target-i386: introduce cpu_get_mem_attrs
Backports commit f794aa4a2fd772a3ec413c4e478cc23857cfee98 from qemu
2018-02-13 11:33:39 -05:00
Stefan Hajnoczi fc7b95d06a
memory: replace cpu_physical_memory_reset_dirty() with test-and-clear
The cpu_physical_memory_reset_dirty() function is sometimes used
together with cpu_physical_memory_get_dirty(). This is not atomic since
two separate accesses to the dirty memory bitmap are made.

Turn cpu_physical_memory_reset_dirty() and
cpu_physical_memory_clear_dirty_range_type() into the atomic
cpu_physical_memory_test_and_clear_dirty().

Backports commit 03eebc9e3246b9b3f5925aa41f7dfd7c1e467875 from qemu
2018-02-13 11:25:45 -05:00
Stefan Hajnoczi 18ccd4b5be
memory: use atomic ops for setting dirty memory bits
Use set_bit_atomic() and bitmap_set_atomic() so that multiple threads
can dirty memory without race conditions.

Backports commit d114875b9a1c21162f69a12d72f69a22e7bab376 from qemu
2018-02-13 11:07:48 -05:00
Paolo Bonzini 6d509f7333
exec: only check relevant bitmaps for cleanliness
Most of the time, not all bitmaps have to be marked as dirty;
do not do anything if the interesting ones are already dirty.
Previously, any clean bitmap would have cause all the bitmaps to be
marked dirty.

In fact, unless running TCG most of the time bitmap operations need
not be done at all, because memory_region_is_logging returns zero.
In this case, skip the call to cpu_physical_memory_range_includes_clean
altogether as well.

With this patch, cpu_physical_memory_set_dirty_range is called
unconditionally, so there need not be anymore a separate call to
xen_modified_memory.

Backports commit e87f7778b64d4a6a78e16c288c7fdc6c15317d5f from qemu
2018-02-13 11:03:26 -05:00
Paolo Bonzini 6bbfcf65e8
memory: do not touch code dirty bitmap unless TCG is enabled
cpu_physical_memory_set_dirty_lebitmap unconditionally syncs the
DIRTY_MEMORY_CODE bitmap. This however is unused unless TCG is
enabled.

Backports commit 9460dee4b2258e3990906fb34099481c8334c267 from qemu
2018-02-13 10:48:14 -05:00
Paolo Bonzini 1b1f82cef7
exec: invert return value of cpu_physical_memory_get_clean, rename
While it is obvious that cpu_physical_memory_get_dirty returns true even if
a single page is dirty, the same is not true for cpu_physical_memory_get_clean;
one would expect that it returns true only if all the pages are clean, but
it actually looks for even one clean page. (By contrast, the caller of that
function, cpu_physical_memory_range_includes_clean, has a good name).

To clarify, rename the function to cpu_physical_memory_all_dirty and return
true if _all_ the pages are dirty. This is the opposite of the previous
meaning, because "all are 1" is the same as "not (any is 0)", so we have to
modify cpu_physical_memory_range_includes_clean as well

Backports commit 72b47e79cef36ed6ffc718f10e21001d7ec2a66f from qemu
2018-02-13 09:54:12 -05:00
Paolo Bonzini f578c89e8b
cputlb: remove useless arguments to tlb_unprotect_code_phys, rename
These days modification of the TLB is done in notdirty_mem_write,
so the virtual address and env pointer as unnecessary.

The new name of the function, tlb_unprotect_code, is consistent with
tlb_protect_code.

Backports commit 9564f52da7eb061326956ed9a468935e3352512d from qemu
2018-02-13 09:07:41 -05:00
Lioncash 72c8e4d264
exec: move functions to translate-all.h
Remove them from the sundry exec-all.h header, since they are only used by
the TCG runtime in exec.c and user-exec.c.

Backports commit 1652b974766401743879d78f796f44b8929b0787 from qemu
2018-02-13 09:01:45 -05:00
Paolo Bonzini c82ea2b20b
memory: track DIRTY_MEMORY_CODE in mr->dirty_log_mask
DIRTY_MEMORY_CODE is only needed for TCG. By adding it directly to
mr->dirty_log_mask, we avoid testing for TCG everywhere a region is
checked for the enabled/disabled state of dirty logging.

Backports commit 677e7805cf95f3b2bca8baf0888d1ebed7f0c606 from qemu
2018-02-13 08:55:42 -05:00
Paolo Bonzini e3d1cef8fb
memory: prepare for multiple bits in the dirty log mask
When the dirty log mask will also cover other bits than DIRTY_MEMORY_VGA,
some listeners may be interested in the overall zero/non-zero value of
the dirty log mask; others may be interested in the value of single bits.

For this reason, always call log_start/log_stop if bits have respectively
appeared or disappeared, and pass the old and new values of the dirty log
mask so that listeners can distinguish the kinds of change.

For example, KVM checks if dirty logging used to be completely disabled
(in log_start) or is now completely disabled (in log_stop). On the
other hand, Xen has to check manually if DIRTY_MEMORY_VGA changed,
since that is the only bit it cares about.

Backports commit b2dfd71c4843a762f2befe702adb249cf55baf66 from qemu
2018-02-13 08:52:23 -05:00
Paolo Bonzini 1551573acc
memory: differentiate memory_region_is_logging and memory_region_get_dirty_log_mask
For now memory regions only track DIRTY_MEMORY_VGA individually, but
this will change soon. To support this, split memory_region_is_logging
in two functions: one that returns a given bit from dirty_log_mask,
and one that returns the entire mask. memory_region_is_logging gets an
extra parameter so that the compiler flags misuse.

While VGA-specific users (including the Xen listener!) will want to keep
checking that bit, KVM and vhost check for "any bit except migration"
(because migration is handled via the global start/stop listener
callbacks).

Backports commit 2d1a35bef0ed96b3f23535e459c552414ccdbafd from qemu
2018-02-13 08:41:44 -05:00
Paolo Bonzini 96e7e32972
softmmu: support up to 12 MMU modes
At 8k per TLB (for 64-bit host or target), 8 or more modes
make the TLBs bigger than 64k, and some RISC TCG backends do
not like that. On the affected hosts, cut the TLB size in
half---there is still a measurable speedup on PPC with the
next patch.

Backports commit 1de29aef17a7d70dbc04a7fe51e18942e3ebe313 from qemu
2018-02-13 08:34:52 -05:00
Peter Maydell e1a7c13fb4
target-arm: Add user-mode transaction attribute
Add a transaction attribute indicating that a memory access is being
done from user-mode (unprivileged). This corresponds to an equivalent
signal in ARM AMBA buses.

Backports commit 0995bf8cd91b81ec9c1078e37b808794080dc5c0 from qemu
2018-02-12 20:41:58 -05:00
Peter Maydell 6c8b7e0fed
target-arm: Honour NS bits in page tables
Honour the NS bit in ARM page tables:
* when adding entries to the TLB, include the Secure/NonSecure
transaction attribute
* set the NS bit in the PAR when doing ATS operations

Note that we don't yet correctly use the NSTable bit to
cause the page table walk itself to use the right attributes.

Backports commit 8bf5b6a9c1911d2c8473385fc0cebfaaeef42dbc from qem
2018-02-12 20:36:35 -05:00
Peter Maydell df0fac6b6a
exec.c: Add new address_space_ld*/st* functions
Add new address_space_ld*/st* functions which allow transaction
attributes and error reporting for basic load and stores. These
are named to be in line with the address_space_read/write/rw
buffer operations.

The existing ld/st*_phys functions are now wrappers around
the new functions.

Backports commit 500131154d677930fce35ec3a6f0b5a26bcd2973 from qemu
2018-02-12 19:22:47 -05:00
Peter Maydell b94c89e559
exec.c: Make address_space_rw take transaction attributes
Make address_space_rw take transaction attributes, rather
than always using the 'unspecified' attributes.

Backports commit 5c9eb0286c819c1836220a32f2e1a7b5004ac79a from qemu
2018-02-12 19:04:09 -05:00
Peter Maydell 933e3bd8d1
Add MemTxAttrs to the IOTLB
Add a MemTxAttrs field to the IOTLB, and allow target-specific
code to set it via a new tlb_set_page_with_attrs() function;
pass the attributes through to the device when making IO accesses.

Backports commit fadc1cbe85c6b032d5842ec0d19d209f50fcb375 from qemu
2018-02-12 18:38:38 -05:00
Peter Maydell 2aecce835b
Make CPU iotlb a structure rather than a plain hwaddr
Make the CPU iotlb a structure rather than a plain hwaddr;
this will allow us to add transaction attributes to it.

Backports commit e469b22ffda40188954fafaf6e3308f58d50f8f8 from qemu
2018-02-12 18:34:05 -05:00
Peter Maydell 825e74410f
memory: Replace io_mem_read/write with memory_region_dispatch_read/write
Rather than retaining io_mem_read/write as simple wrappers around
the memory_region_dispatch_read/write functions, make the latter
public and change all the callers to use them, since we need to
touch all the callsites anyway to add MemTxAttrs and MemTxResult
support. Delete io_mem_read and io_mem_write entirely.

(All the callers currently pass MEMTXATTRS_UNSPECIFIED
and convert the return value back to bool or ignore it.)

Backports commit 3b6434953934e6d4a776ed426d8c6d6badee176f from qemu
2018-02-12 17:26:52 -05:00
Peter Maydell b2962f4613
memory: Define API for MemoryRegionOps to take attrs and return status
Define an API so that devices can register MemoryRegionOps whose read
and write callback functions are passed an arbitrary pointer to some
transaction attributes and can return a success-or-failure status code.
This will allow us to model devices which:
* behave differently for ARM Secure/NonSecure memory accesses
* behave differently for privileged/unprivileged accesses
* may return a transaction failure (causing a guest exception)
for erroneous accesses

This patch defines the new API and plumbs the attributes parameter through
to the memory.c public level functions io_mem_read() and io_mem_write(),
where it is currently dummied out.

The success/failure response indication is also propagated out to
io_mem_read() and io_mem_write(), which retain the old-style
boolean true-for-error return.

Backports commit cc05c43ad942165ecc6ffd39e41991bee43af044 from qemu
2018-02-12 17:17:27 -05:00
Paolo Bonzini a46accd252
exec: make iotlb RCU-friendly
After the previous patch, TLBs will be flushed on every change to
the memory mapping. This patch augments that with synchronization
of the MemoryRegionSections referred to in the iotlb array.

With this change, it is guaranteed that iotlb_to_region will access
the correct memory map, even once the TLB will be accessed outside
the BQL.

Backports commit 9d82b5a792236db31a75b9db5c93af69ac07c7c5 from qemu
2018-02-12 15:20:39 -05:00
Paolo Bonzini 3fbda890df
exec: introduce cpu_reload_memory_map
This for now is a simple TLB flush. This can change later for two
reasons:

1) an AddressSpaceDispatch will be cached in the CPUState object

2) it will not be possible to do tlb_flush once the TCG-generated code
runs outside the BQL.

Backports commit 76e5c76f2e2e0d20bab2cd5c7a87452f711654fb from qemu
2018-02-12 15:09:49 -05:00
Peter Maydell 9d02c52b8a
cpu_ldst.h: Allow NB_MMU_MODES to be 7
Support guest CPUs which need 7 MMU index values.
Add a comment about what would be required to raise the limit
further (trivial for 8, TCG backend rework for 9 or more).

Backports commit 8f3ae2ae2d02727f6d56610c09d7535e43650dd4 from qemu
2018-02-12 11:21:19 -05:00
Richard Henderson 232632e76c
tcg: Change translator-side labels to a pointer
This is improved type checking for the translators -- it's no longer
possible to accidentally swap arguments to the branch functions.

Note that the code generating backends still manipulate labels as int.

With notable exceptions, the scope of the change is just a few lines
for each target, so it's not worth building extra machinery to do this
change in per-target increments.

Backports commit 42a268c241183877192c376d03bd9b6d527407c7 from qemu
2018-02-09 14:17:56 -05:00
Lioncash 0273e6ae18
tcg: Put opcodes in a linked list
The previous setup required ops and args to be completely sequential,
and was error prone when it came to both iteration and optimization.
2018-02-09 12:54:05 -05:00
Richard Henderson 78378289e3
tcg: Move emit of INDEX_op_end into gen_tb_end
Backports commit 0a7df5da986bd7ee0789f2d7b8611f2e8eee5046 from qemu
2018-02-09 08:51:01 -05:00
Nguyen Anh Quynh 6ea39f7d5a merge msvc with master 2017-02-24 10:39:36 +08:00