This patch converts the old "is_write" bool into IOMMUAccessFlags. The
difference is that "is_write" can only express either read/write, but
sometimes what we really want is "none" here (neither read nor write).
Replay is an good example - during replay, we should not check any RW
permission bits since thats not an actual IO at all.
Backports commit bf55b7afce53718ef96f4e6616da62c0ccac37dd from qemu
MemoryRegionCache did not know about virtio support for IOMMUs (because the
two features were developed at the same time). Revert MemoryRegionCache
to "normal" address_space_* operations for 2.9, as it is simpler than
undoing the virtio patches.
Backports commit 90c4fe5fc517a045e7a7cf2f23472e114042ca29 from qemu
The 'name' parameter to memory_region_init_* had been marked as debug
only, however vmstate_region_ram uses it as a parameter to
qemu_ram_set_idstr to set RAMBlock names and these form part of the
migration stream.
Backports commit e8f5fe2de125a0bfbefbaa6a69af81f4817cb7a0 from qemu
This patch introduces a helper to query the iotlb entry for a
possible iova. This will be used by later device IOTLB API to enable
the capability for a dataplane (e.g vhost) to query the IOTLB.
Backports commit 052c8fa9983f553fdfa0d61034774070dd639c2b from qemu
Device models often have to perform multiple access to a single
memory region that is known in advance, but would to use "DMA-style"
functions instead of address_space_map/unmap. This can happen
for example when the data has to undergo endianness conversion.
Introduce a new data structure to cache the result of
address_space_translate without forcing usage of a host address
like address_space_map does.
Backports commit 1f4e496e1fc2eb6c8bf377a0f9695930c380bfd3 from qemu
Templatize the address_space_* and *_phys functions, so that we can add
similar functions in the next patch that work with a lightweight,
cache-like version of address_space_map/unmap.
Backports commit 0ce265ffef87f19f4dd1ff0663e09a63d66ae408 from qemu
This speeds up MEMORY_LISTENER_CALL noticeably. Right now,
with many PCI devices you have N regions added to M AddressSpaces
(M = # PCI devices with bus-master enabled) and each call looks
up the whole listener list, with at least M listeners in it.
Because most of the regions in N are BARs, which are also roughly
proportional to M, the whole thing is O(M^3). This changes it
to O(M^2), which is the best we can do without rewriting the
whole thing.
Backports commit 9a54635dcb51a3fcf7507af630168f514a8cd4e7 from qemu
With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors. If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap. Finally, if we have any quirks for the device (ie. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion. When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.
This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU. If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap. Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems. I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces. It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.
To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.
With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access. The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities. If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access. A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access weidth, unaware of the underlying
device. Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).
Backports commit 1b16ded6a512809f99c133a97f19026fe612b2de from qemu
Setting skip_dump on a MemoryRegion allows us to modify one specific
code path, but the restriction we're trying to address encompasses
more than that. If we have a RAM MemoryRegion backed by a physical
device, it not only restricts our ability to dump that region, but
also affects how we should manipulate it. Here we recognize that
MemoryRegions do not change to sometimes allow dumps and other times
not, so we replace setting the skip_dump flag with a new initializer
so that we know exactly the type of region to which we're applying
this behavior.
Backports commit ca83f87a66d19fdaabf23d4f5ebb49396fe232c1 from qemu
It doesn't make sense to pass a NULL ops argument to
memory_region_init_rom_device(), because the effect will
be that if the guest tries to write to the memory region
then QEMU will segfault. Catch the bug earlier by sanity
checking the arguments to this function, and remove the
misleading documentation that suggests that passing NULL
might be sensible.
Backports commit 39e0b03dec518254fabd2acff29548d3f1d2b754 from qemu
Provide a new helper function memory_region_init_rom() for memory
regions which are read-only (and unlike those created by
memory_region_init_rom_device() don't have special behaviour
for writes). This has the same behaviour as calling
memory_region_init_ram() and then memory_region_set_readonly()
(which is what we do today in boards with pure ROMs) but is a
more easily discoverable API for the purpose.
Backports commit a1777f7f6462c66e1ee6e98f0d5c431bfe988aa5 from qemu
The IOMMU driver may change behavior depending on whether a notifier
client is present. In the case of POWER, this represents a change in
the visibility of the IOTLB, for other drivers such as intel-iommu and
future AMD-Vi emulation, notifier support is not yet enabled and this
provides the opportunity to flag that incompatibility.
Backports commit d22d8956b185c002b50a4d0883aff61f857347ef from qemu
Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate
uses when translating, however this information is not available outside
the translate context for various checks.
This adds a get_min_page_size callback to MemoryRegionIOMMUOps and
a wrapper for it so IOMMU users (such as VFIO) can know the minimum
actual page size supported by an IOMMU.
As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE
as fallback.
This removes vfio_container_granularity() and uses new helper in
memory_region_iommu_replay() when replaying IOMMU mappings on added
IOMMU memory region.
Backports the relevant parts of commit f682e9c244af7166225f4a50cc18ff296bb9d43e from qemu
Let users of qemu_get_ram_ptr and qemu_ram_ptr_length pass in an
address that is relative to the MemoryRegion. This basically means
what address_space_translate returns.
Because the semantics of the second parameter change, rename the
function to qemu_map_ram_ptr.
Backports commit 0878d0e11ba8013dd759c6921cbf05ba6a41bd71 from qemu
Move the old qemu_ram_addr_from_host to memory_region_from_host and
make it return an offset within the region. For qemu_ram_addr_from_host
return the ram_addr_t directly, similar to what it was before
commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host",
2013-07-04).
Backports commit 07bdaa4196b51bc7ffa7c3f74e9e4a9dc8a7966a from qemu
The collision check does nothing and hasn't been used. Remove the
variable together with related code.
Backports commit b61359781958759317ee6fd1a45b59be0b7dbbe1 from qemu
Disentangle cpu-common.h and memory.h from NEED_CPU_H. Prototypes are
not defined for !NEED_CPU_H, so remove them from poison.h too. Only
macros need poisoning.
Backports commit a7d6039cb35592683ecc56d2b37817da2d2f8b00 from qemu
Just specifying ops = NULL in some cases can be more convenient than having
two functions.
Backports commit 6d6d2abf2c2e52c0f404d0a31a963e945b0cc7ad from qemu
All references to mr->ram_addr are replaced by
memory_region_get_ram_addr(mr) (except for a few assertions that are
replaced with mr->ram_block).
Backports commit 8e41fb63c5bf29ecabe0cee1239bf6230f19978a from qemu
these two functions consume too much cpu overhead to
find the RAMBlock by ram address.
After this patch, we can pass the RAMBlock pointer
to them so that they don't need to find the RAMBlock
anymore most of the time. We can get better performance
in address translation processing.
Backports commit 3655cb9c7375a595a8051ec677c515b24d5c1fe6 from qemu
Each RAM memory region has a unique corresponding RAMBlock.
In the current realization, the memory region only stored
the ram_addr which means the offset of RAM address space,
We need to qurey the global ram.list to find the ram block
by ram_addr if we want to get the ram block, which is very
expensive.
Now, we store the RAMBlock pointer into memory region
structure. So, if we know the mr, we can easily get the
RAMBlock.
Backports commit 58eaa2174e99d9a05172d03fd2799ab8fd9e6f60 from qemu
This will either create a new AS or return a pointer to an
already existing equivalent one, if we have already created
an AS for the specified root memory region.
The motivation is to reuse address spaces as much as possible.
It's going to be quite common that bus masters out in device land
have pointers to the same memory region for their mastering yet
each will need to create its own address space. Let the memory
API implement sharing for them.
Aside from the perf optimisations, this should reduce the amount
of redundant output on info mtree as well.
Thee returned value will be malloced, but the malloc will be
automatically freed when the AS runs out of refs.
Backports commit f0c02d15b57da6f5463e3768aa0cfeedccf4b8f4 from qemu
memcpy can take a large amount of time for small reads and writes.
Handle the common case of reading s/g descriptors from memory (there
is no corresponding "write" case that is as common, because writes
often use address_space_st* functions) by inlining the relevant
parts of address_space_read into the caller.
Backports commit 3cc8f884996584630734a90c9b3c535af81e3c92 from qemu
We want to inline the case where there is only one iteration, because
then the compiler can also inline the memcpy. As a start, extract
everything after the first address_space_translate call.
Backports commit a203ac702e0720135fac8b1f2061d119814c1798 from qemu
For the common case of DMA into non-hotplugged RAM, it is unnecessary
but expensive to do object_ref/unref. Add back an owner field to
MemoryRegion, so that these memory regions can skip the reference
counting.
Backports commit 612263cf33062f7441a5d0e3b37c65991fdc3210 from qemu
Order fields so that all fields accessed during a RAM read/write fit in
the same cache line.
Backports commit a676854f3447019c7c4b005ab6aece905fccfddd from qemu
This makes ROM blocks resizeable. This infrastructure is required for other
functionality we have queued.
Backports commit aaf03019175949eda5087329448b8a0033b89479 from qemu
Including qemu-common.h from other header files is generally a bad
idea, because it means it's very easy to end up with a circular
dependency. For instance, if we wanted to include memory.h from
qom/cpu.h we'd end up with this loop:
memory.h -> qemu-common.h -> cpu.h -> cpu-qom.h -> qom/cpu.h -> memory.h
Remove the include from memory.h. This requires us to fix up a few
other files which were inadvertently getting declarations indirectly
through memory.h.
The biggest change is splitting the fprintf_function typedef out
into its own header so other headers can get at it without having
to include qemu-common.h.
Backports commit fba0a593b2809ecdda68650952cf3d3332ac1990 from qemu
This introduces the memory region property "global_locking". It is true
by default. By setting it to false, a device model can request BQL-free
dispatching of region accesses to its r/w handlers. The actual BQL
break-up will be provided in a separate patch.
Backports commit 196ea13104f802c508e57180b2a0d2b3418989a3 from qemu
The cpu_physical_memory_reset_dirty() function is sometimes used
together with cpu_physical_memory_get_dirty(). This is not atomic since
two separate accesses to the dirty memory bitmap are made.
Turn cpu_physical_memory_reset_dirty() and
cpu_physical_memory_clear_dirty_range_type() into the atomic
cpu_physical_memory_test_and_clear_dirty().
Backports commit 03eebc9e3246b9b3f5925aa41f7dfd7c1e467875 from qemu
DIRTY_MEMORY_CODE is only needed for TCG. By adding it directly to
mr->dirty_log_mask, we avoid testing for TCG everywhere a region is
checked for the enabled/disabled state of dirty logging.
Backports commit 677e7805cf95f3b2bca8baf0888d1ebed7f0c606 from qemu
When the dirty log mask will also cover other bits than DIRTY_MEMORY_VGA,
some listeners may be interested in the overall zero/non-zero value of
the dirty log mask; others may be interested in the value of single bits.
For this reason, always call log_start/log_stop if bits have respectively
appeared or disappeared, and pass the old and new values of the dirty log
mask so that listeners can distinguish the kinds of change.
For example, KVM checks if dirty logging used to be completely disabled
(in log_start) or is now completely disabled (in log_stop). On the
other hand, Xen has to check manually if DIRTY_MEMORY_VGA changed,
since that is the only bit it cares about.
Backports commit b2dfd71c4843a762f2befe702adb249cf55baf66 from qemu
For now memory regions only track DIRTY_MEMORY_VGA individually, but
this will change soon. To support this, split memory_region_is_logging
in two functions: one that returns a given bit from dirty_log_mask,
and one that returns the entire mask. memory_region_is_logging gets an
extra parameter so that the compiler flags misuse.
While VGA-specific users (including the Xen listener!) will want to keep
checking that bit, KVM and vhost check for "any bit except migration"
(because migration is handled via the global start/stop listener
callbacks).
Backports commit 2d1a35bef0ed96b3f23535e459c552414ccdbafd from qemu
Add new address_space_ld*/st* functions which allow transaction
attributes and error reporting for basic load and stores. These
are named to be in line with the address_space_read/write/rw
buffer operations.
The existing ld/st*_phys functions are now wrappers around
the new functions.
Backports commit 500131154d677930fce35ec3a6f0b5a26bcd2973 from qemu
Make address_space_rw take transaction attributes, rather
than always using the 'unspecified' attributes.
Backports commit 5c9eb0286c819c1836220a32f2e1a7b5004ac79a from qemu
Rather than retaining io_mem_read/write as simple wrappers around
the memory_region_dispatch_read/write functions, make the latter
public and change all the callers to use them, since we need to
touch all the callsites anyway to add MemTxAttrs and MemTxResult
support. Delete io_mem_read and io_mem_write entirely.
(All the callers currently pass MEMTXATTRS_UNSPECIFIED
and convert the return value back to bool or ignore it.)
Backports commit 3b6434953934e6d4a776ed426d8c6d6badee176f from qemu
Define an API so that devices can register MemoryRegionOps whose read
and write callback functions are passed an arbitrary pointer to some
transaction attributes and can return a success-or-failure status code.
This will allow us to model devices which:
* behave differently for ARM Secure/NonSecure memory accesses
* behave differently for privileged/unprivileged accesses
* may return a transaction failure (causing a guest exception)
for erroneous accesses
This patch defines the new API and plumbs the attributes parameter through
to the memory.c public level functions io_mem_read() and io_mem_write(),
where it is currently dummied out.
The success/failure response indication is also propagated out to
io_mem_read() and io_mem_write(), which retain the old-style
boolean true-for-error return.
Backports commit cc05c43ad942165ecc6ffd39e41991bee43af044 from qemu