unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2025-11-18 05:34:56 +00:00

Author	SHA1	Message	Date
Alex Bennée	454932263c	cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap While the vargs approach was flexible the original MTTCG ended up having munge the bits to a bitmap so the data could be used in deferred work helpers. Instead of hiding that in cputlb we push the change to the API to make it take a bitmap of MMU indexes instead. For ARM some the resulting flushes end up being quite long so to aid readability I've tended to move the index shifting to a new line so all the bits being or-ed together line up nicely, for example: tlb_flush_page_by_mmuidx(other_cs, pageaddr, (1 << ARMMMUIdx_S1SE1) \| (1 << ARMMMUIdx_S1SE0)); Backports commit 0336cbf8532935d8e23c2aabf3e2ce2c0697b6ac from qemu	2018-03-02 10:12:40 -05:00
Alex Bennée	e3e57ca08e	cputlb: drop flush_global flag from tlb_flush We have never has the concept of global TLB entries which would avoid the flush so we never actually use this flag. Drop it and make clear that tlb_flush is the sledge-hammer it has always been. Backports commit d10eb08f5d8389c814b554d01aa2882ac58221bf from qemu	2018-03-01 19:36:04 -05:00
Jason Wang	29932d0719	memory: handle alias in memory_region_is_iommu() Backports commit 12d37882f0c0def5dee1c21be5d8fea9c21baada from qemu	2018-03-01 13:06:18 -05:00
Jason Wang	fdca6292a1	exec: introduce address_space_get_iotlb_entry() This patch introduces a helper to query the iotlb entry for a possible iova. This will be used by later device IOTLB API to enable the capability for a dataplane (e.g vhost) to query the IOTLB. Backports commit 052c8fa9983f553fdfa0d61034774070dd639c2b from qemu	2018-03-01 13:05:08 -05:00
Paolo Bonzini	81ad780e5e	exec: introduce MemoryRegionCache Device models often have to perform multiple access to a single memory region that is known in advance, but would to use "DMA-style" functions instead of address_space_map/unmap. This can happen for example when the data has to undergo endianness conversion. Introduce a new data structure to cache the result of address_space_translate without forcing usage of a host address like address_space_map does. Backports commit 1f4e496e1fc2eb6c8bf377a0f9695930c380bfd3 from qemu	2018-03-01 10:50:30 -05:00
Paolo Bonzini	88ad0f4f6e	exec: introduce memory_ldst.inc.c Templatize the address_space_* and *_phys functions, so that we can add similar functions in the next patch that work with a lightweight, cache-like version of address_space_map/unmap. Backports commit 0ce265ffef87f19f4dd1ff0663e09a63d66ae408 from qemu	2018-03-01 09:59:34 -05:00
Paolo Bonzini	9404dbf74e	cpu-exec: fix icount out-of-bounds access When icount is active, tb_add_jump is surprisingly called with an out of bounds basic block index. I have no idea how that can work, but it does not seem like a good idea. Clear *last_tb for all TB_EXIT_ICOUNT_EXPIRED cases, even when all you have to do is refill icount_extra. Backports commit d8dea6fbcbed177ca5d23ab77b3834a9437f0e88 from qemu	2018-03-01 09:17:26 -05:00
Bobby Bingham	d46e52d9d0	cpu_ldst.h: use correct guest address parameter In the user emulation code path, tlb_vaddr_to_host erronesously passed vaddr as the guest address to be translated, instead of addr, the parameter which actually contained the guest address. This resulted in incorrect addresses being used when emulating block copy (mvc/mvpg) and block clear (xc) instructions for the s390x target. Backports commit c2a85316902e67530da9d6548139fcce73c0cac6 from qemu	2018-03-01 08:56:37 -05:00
Paolo Bonzini	9d64a89acf	tcg: comment on which functions have to be called with tb_lock held softmmu requires more functions to be thread-safe, because translation blocks can be invalidated from e.g. notdirty callbacks. Probably the same holds for user-mode emulation, it's just that no one has ever tried to produce a coherent locking there. This patch will guide the introduction of more tb_lock and tb_unlock calls for system emulation. Note that after this patch some (most) of the mentioned functions are still called outside tb_lock/tb_unlock. The next one will rectify this. Backports commit 7d7500d99895f888f97397ef32bb536bb0df3b74 from qemu	2018-02-28 10:26:28 -05:00
Alex Bennée	7aab0bd9a6	translate-all: add DEBUG_LOCKING asserts This adds asserts to check the locking on the various translation engines structures. There are two sets of structures that are protected by locks. The first the l1map and PageDesc structures used to track which translation blocks are associated with which physical addresses. In user-mode this is covered by the mmap_lock. The second case are TB context related structures which are protected by tb_lock which is also user-mode only. Currently the asserts do nothing in SoftMMU mode but this will change for MTTCG. Backports commit 301e40ed8005306c009978be295ed9a4b725178b from qemu	2018-02-28 08:56:15 -05:00
Yongbok Kim	79e4c001a9	softmmu: Add probe_write() Probe for whether the specified guest write access is permitted. If it is not permitted then an exception will be taken in the same way as if this were a real write access (and we will not return). Otherwise the function will return, and there will be a valid entry in the TLB for this access. Backports commit 3b4afc9e75ab1a95f33e41f462921093f8a109c4 from qemu	2018-02-27 12:20:50 -05:00
Richard Henderson	e35aacd5ae	tcg: Add EXCP_ATOMIC When we cannot emulate an atomic operation within a parallel context, this exception allows us to stop the world and try again in a serial context. Backports commit fdbc2b5722f6092e47181a947c90fd4bdcc1c121 from qemu Also backports parts of commit 02d57ea115b7669f588371c86484a2e8ebc369be	2018-02-27 11:57:58 -05:00
Peter Maydell	db8b0a82b1	cpu: Support a target CPU having a variable page size Support target CPUs having a page size which isn't knownn at compile time. To use this, the CPU implementation should: * define TARGET_PAGE_BITS_VARY * not define TARGET_PAGE_BITS * define TARGET_PAGE_BITS_MIN to the smallest value it might possibly want for TARGET_PAGE_BITS * call set_preferred_target_page_bits() in its realize function to indicate the actual preferred target page size for the CPU (and report any error from it) In CONFIG_USER_ONLY, the CPU implementation should continue to define TARGET_PAGE_BITS appropriately for the guest OS page size. Machines which want to take advantage of having the page size something larger than TARGET_PAGE_BITS_MIN must set the MachineClass minimum_page_bits field to a value which they guarantee will be no greater than the preferred page size for any CPU they create. Note that changing the target page size by setting minimum_page_bits is a migration compatibility break for that machine. For debugging purposes, attempts to use TARGET_PAGE_SIZE before it has been finally confirmed will assert. Backports commit 20bccb82ff3ea09bcb7c4ee226d3160cab15f7da from qemu	2018-02-26 12:29:08 -05:00
Paolo Bonzini	eb75004013	memory: add a per-AddressSpace list of listeners This speeds up MEMORY_LISTENER_CALL noticeably. Right now, with many PCI devices you have N regions added to M AddressSpaces (M = # PCI devices with bus-master enabled) and each call looks up the whole listener list, with at least M listeners in it. Because most of the regions in N are BARs, which are also roughly proportional to M, the whole thing is O(M^3). This changes it to O(M^2), which is the best we can do without rewriting the whole thing. Backports commit 9a54635dcb51a3fcf7507af630168f514a8cd4e7 from qemu	2018-02-26 10:46:50 -05:00
Paolo Bonzini	4b06e8bbb7	memory: eliminate global MemoryListeners There is none, so just drop the code. Backports commit d45fa784cd0c111131696808d1168259d66b7519 from qemu	2018-02-26 10:19:28 -05:00
Richard Henderson	66d79ac959	tcg: Merge GETPC and GETRA The return address argument to the softmmu template helpers was confused. In the legacy case, we wanted to indicate that there is no return address, and so passed in NULL. However, we then immediately subtracted GETPC_ADJ from NULL, resulting in a non-zero value, indicating the presence of an (invalid) return address. Push the GETPC_ADJ subtraction down to the only point it's required: immediately before use within cpu_restore_state_from_tb, after all NULL pointer checks have been completed. This makes GETPC and GETRA identical. Remove GETRA as the lesser used macro, replacing all uses with GETPC. Backports commit 01ecaf438b1eb46abe23392c8ce5b7628b0c8cf5 from qemu	2018-02-26 02:54:44 -05:00
Paolo Bonzini	30845ae475	tcg: Prepare TB invalidation for lockless TB lookup When invalidating a translation block, set an invalid flag into the TranslationBlock structure first. It is also necessary to check whether the target TB is still valid after acquiring 'tb_lock' but before calling tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in future. Note that we don't have to check 'last_tb'; an already invalidated TB will not be executed anyway and it is thus safe to patch it. Backports commit 6d21e4208f382dd8ca1f7995a6dd9ea7ca281163 from qemu	2018-02-26 01:48:13 -05:00
Alex Williamson	fe66c2e088	memory: Don't use memcpy for ram_device regions With a vfio assigned device we lay down a base MemoryRegion registered as an IO region, giving us read & write accessors. If the region supports mmap, we lay down a higher priority sub-region MemoryRegion on top of the base layer initialized as a RAM device pointer to the mmap. Finally, if we have any quirks for the device (ie. address ranges that need additional virtualization support), we put another IO sub-region on top of the mmap MemoryRegion. When this is flattened, we now potentially have sub-page mmap MemoryRegions exposed which cannot be directly mapped through KVM. This is as expected, but a subtle detail of this is that we end up with two different access mechanisms through QEMU. If we disable the mmap MemoryRegion, we make use of the IO MemoryRegion and service accesses using pread and pwrite to the vfio device file descriptor. If the mmap MemoryRegion is enabled and results in one of these sub-page gaps, QEMU handles the access as RAM, using memcpy to the mmap. Using either pread/pwrite or the mmap directly should be correct, but using memcpy causes us problems. I expect that not only does memcpy not necessarily honor the original width and alignment in performing a copy, but it potentially also uses processor instructions not intended for MMIO spaces. It turns out that this has been a problem for Realtek NIC assignment, which has such a quirk that creates a sub-page mmap MemoryRegion access. To resolve this, we disable memory_access_is_direct() for ram_device regions since QEMU assumes that it can use memcpy for those regions. Instead we access through MemoryRegionOps, which replaces the memcpy with simple de-references of standard sizes to the host memory. With this patch we attempt to provide unrestricted access to the RAM device, allowing byte through qword access as well as unaligned access. The assumption here is that accesses initiated by the VM are driven by a device specific driver, which knows the device capabilities. If unaligned accesses are not supported by the device, we don't want them to work in a VM by performing multiple aligned accesses to compose the unaligned access. A down-side of this philosophy is that the xp command from the monitor attempts to use the largest available access weidth, unaware of the underlying device. Using memcpy had this same restriction, but at least now an operator can dump individual registers, even if blocks of device memory may result in access widths beyond the capabilities of a given device (RTL NICs only support up to dword). Backports commit 1b16ded6a512809f99c133a97f19026fe612b2de from qemu	2018-02-25 23:06:36 -05:00
Alex Williamson	5db45219c9	memory: Replace skip_dump flag with ram_device Setting skip_dump on a MemoryRegion allows us to modify one specific code path, but the restriction we're trying to address encompasses more than that. If we have a RAM MemoryRegion backed by a physical device, it not only restricts our ability to dump that region, but also affects how we should manipulate it. Here we recognize that MemoryRegions do not change to sometimes allow dumps and other times not, so we replace setting the skip_dump flag with a new initializer so that we know exactly the type of region to which we're applying this behavior. Backports commit ca83f87a66d19fdaabf23d4f5ebb49396fe232c1 from qemu	2018-02-25 23:00:45 -05:00
Richard Henderson	1547048a22	tcg: Reorg TCGOp chaining Instead of using -1 as end of chain, use 0, and link through the 0 entry as a fully circular double-linked list. Backports commit dcb8e75870e2de199db853697f8839cb603beefe from qemu	2018-02-25 21:44:50 -05:00
Igor Mammedov	62c89b9cd4	exec: Reduce CONFIG_USER_ONLY ifdeffenery Backports commit 1bc7e522d9cf1b58f2de9c8f1737be0bb5129c35 from qemu	2018-02-25 20:57:48 -05:00
Markus Armbruster	c2ffbc575d	Clean up decorations and whitespace around header guards Cleaned up with scripts/clean-header-guards.pl. Backports commit 175de52487ce0b0c78daa4cdf41a5a465a168a25 from qemu	2018-02-25 04:26:02 -05:00
Markus Armbruster	1275b9b459	Clean up ill-advised or unusual header guards Cleaned up with scripts/clean-header-guards.pl. Backports commit 2a6a4076e117113ebec97b1821071afccfdfbc96 from qemu	2018-02-25 04:22:46 -05:00
Markus Armbruster	9ae2fc4d9e	Clean up header guards that don't match their file name Header guard symbols should match their file name to make guard collisions less likely. Offenders found with scripts/clean-header-guards.pl -vn. Cleaned up with scripts/clean-header-guards.pl, followed by some renaming of new guard symbols picked by the script to better ones. Backports commit 121d07125bb6d7079c7ebafdd3efe8c3a01cc440 from qemu	2018-02-25 04:18:42 -05:00
Markus Armbruster	60e8836b74	Use #include "..." for our own headers, <...> for others Tracked down with an ugly, brittle and probably buggy Perl script. Also move includes converted to <...> up so they get included before ours where that's obviously okay. Backports commit a9c94277f07d19d3eb14f199c3e93491aa3eae0e from qemu	2018-02-25 04:10:33 -05:00
Sergey Sorokin	d1e4ac0451	Fix confusing argument names in some common functions There are functions tlb_fill(), cpu_unaligned_access() and do_unaligned_access() that are called with access type and mmu index arguments. But these arguments are named 'is_write' and 'is_user' in their declarations. The patches fix the arguments to avoid a confusion. Backports commit b35399bb4e9968296a12303b00f9f2066470e987 from qemu	2018-02-25 03:58:27 -05:00
Sergey Sorokin	e4d123caa9	tcg: Improve the alignment check infrastructure Some architectures (e.g. ARMv8) need the address which is aligned to a size more than the size of the memory access. To support such check it's enough the current costless alignment check implementation in QEMU, but we need to support an alignment size specifying. Backports commit 1f00b27f17518a1bcb4cedca49eaec96a4d560bd from qemu	2018-02-25 02:23:28 -05:00
Peter Maydell	efc6cc2b83	memory: Assert that memory_region_init_rom_device() ops aren't NULL It doesn't make sense to pass a NULL ops argument to memory_region_init_rom_device(), because the effect will be that if the guest tries to write to the memory region then QEMU will segfault. Catch the bug earlier by sanity checking the arguments to this function, and remove the misleading documentation that suggests that passing NULL might be sensible. Backports commit 39e0b03dec518254fabd2acff29548d3f1d2b754 from qemu	2018-02-25 00:29:52 -05:00
Peter Maydell	334e951ec1	memory: Provide memory_region_init_rom() Provide a new helper function memory_region_init_rom() for memory regions which are read-only (and unlike those created by memory_region_init_rom_device() don't have special behaviour for writes). This has the same behaviour as calling memory_region_init_ram() and then memory_region_set_readonly() (which is what we do today in boards with pure ROMs) but is a more easily discoverable API for the purpose. Backports commit a1777f7f6462c66e1ee6e98f0d5c431bfe988aa5 from qemu	2018-02-25 00:28:17 -05:00
Alexey Kardashevskiy	7187d77cfa	memory: Add MemoryRegionIOMMUOps.notify_started/stopped callbacks The IOMMU driver may change behavior depending on whether a notifier client is present. In the case of POWER, this represents a change in the visibility of the IOTLB, for other drivers such as intel-iommu and future AMD-Vi emulation, notifier support is not yet enabled and this provides the opportunity to flag that incompatibility. Backports commit d22d8956b185c002b50a4d0883aff61f857347ef from qemu	2018-02-25 00:23:00 -05:00
Alexey Kardashevskiy	096ca207af	memory: Add reporting of supported page sizes Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate uses when translating, however this information is not available outside the translate context for various checks. This adds a get_min_page_size callback to MemoryRegionIOMMUOps and a wrapper for it so IOMMU users (such as VFIO) can know the minimum actual page size supported by an IOMMU. As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE as fallback. This removes vfio_container_granularity() and uses new helper in memory_region_iommu_replay() when replaying IOMMU mappings on added IOMMU memory region. Backports the relevant parts of commit f682e9c244af7166225f4a50cc18ff296bb9d43e from qemu	2018-02-24 19:23:28 -05:00
Emilio G. Cota	ae3e22a689	tb hash: hash phys_pc, pc, and flags with xxhash For some workloads such as arm bootup, tb_phys_hash is performance-critical. The is due to the high frequency of accesses to the hash table, originated by (frequent) TLB flushes that wipe out the cpu-private tb_jmp_cache's. More info: https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg05098.html To dig further into this I modified an arm image booting debian jessie to immediately shut down after boot. Analysis revealed that quite a bit of time is unnecessarily spent in tb_phys_hash: the cause is poor hashing that results in very uneven loading of chains in the hash table's buckets; the longest observed chain had ~550 elements. The appended addresses this with two changes: 1) Use xxhash as the hash table's hash function. xxhash is a fast, high-quality hashing function. 2) Feed the hashing function with not just tb_phys, but also pc and flags. This improves performance over using just tb_phys for hashing, since that resulted in some hash buckets having many TB's, while others getting very few; with these changes, the longest observed chain on a single hash bucket is brought down from ~550 to ~40. Tests show that the other element checked for in tb_find_physical, cs_base, is always a match when tb_phys+pc+flags are a match, so hashing cs_base is wasteful. It could be that this is an ARM-only thing, though. UPDATE: On Tue, Apr 05, 2016 at 08:41:43 -0700, Richard Henderson wrote: > The cs_base field is only used by i386 (in 16-bit modes), and sparc (for a TB > consisting of only a delay slot). > It may well still turn out to be reasonable to ignore cs_base for hashing. BTW, after this change the hash table should not be called "tb_hash_phys" anymore; this is addressed later in this series. This change gives consistent bootup time improvements. I tested two host machines: - Intel Xeon E5-2690: 11.6% less time - Intel i7-4790K: 19.2% less time Increasing the number of hash buckets yields further improvements. However, using a larger, fixed number of buckets can degrade performance for other workloads that do not translate as many blocks (600K+ for debian-jessie arm bootup). This is dealt with later in this series. Backports commit 42bd32287f3a18d823f2258b813824a39ed7c6d9 from qemu	2018-02-24 18:00:14 -05:00
Emilio G. Cota	9ef9de9cf8	exec: add tb_hash_func5, derived from xxhash This will be used by upcoming changes for hashing the tb hash. Add this into a separate file to include the copyright notice from xxhash. Backports commit dc8b295d05ec35a8c032f9abca421772347ba5d4 from qemu	2018-02-24 17:36:35 -05:00
Peter Maydell	d7dccff836	cpu-exec: Rename cpu_resume_from_signal() to cpu_loop_exit_noexc() The function cpu_resume_from_signal() is now always called with a NULL puc argument, and is rather misnamed since it is never called from a signal handler. It is essentially forcing an exit to the top level cpu loop but without raising any exception, so rename it to cpu_loop_exit_noexc() and drop the useless unused argument. Backports commit 6886b98036a8f8f5bce8b10756ce080084cef11b from qemu	2018-02-24 17:25:28 -05:00
Peter Maydell	8d0faac1dc	qemu-common.h: Drop WORDS_ALIGNED define The WORDS_ALIGNED #define is not used anywhere, and hasn't been since 2013 when commit 612d590ebc6cef rewrote the various ld<type>_<endian>_p functions to not use it. Remove the #define and the comment describing it. Also remove the line in the comment about TARGET_WORDS_ALIGNED, since it has never actually existed. Backports commit 0d5c21f2b3bf1e0b562a2c74e353d2e03f2f50ef from qemu	2018-02-24 17:01:55 -05:00
Paolo Bonzini	8df5ad80b1	exec: hide mr->ram_addr from qemu_get_ram_ptr users Let users of qemu_get_ram_ptr and qemu_ram_ptr_length pass in an address that is relative to the MemoryRegion. This basically means what address_space_translate returns. Because the semantics of the second parameter change, rename the function to qemu_map_ram_ptr. Backports commit 0878d0e11ba8013dd759c6921cbf05ba6a41bd71 from qemu	2018-02-24 16:17:49 -05:00
Paolo Bonzini	b2e1b34bcc	memory: split memory_region_from_host from qemu_ram_addr_from_host Move the old qemu_ram_addr_from_host to memory_region_from_host and make it return an offset within the region. For qemu_ram_addr_from_host return the ram_addr_t directly, similar to what it was before commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host", 2013-07-04). Backports commit 07bdaa4196b51bc7ffa7c3f74e9e4a9dc8a7966a from qemu	2018-02-24 16:06:49 -05:00
Paolo Bonzini	918c626847	exec: remove ram_addr argument from qemu_ram_block_from_host Of the two callers, one does not use it, and the other can compute it itself based on the other output argument (offset) and the RAMBlock. Backports commit f615f39616c4fd1a3a3b078af8d75bb4be6390de from qemu	2018-02-24 03:37:40 -05:00
Paolo Bonzini	f26f1f123c	memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr Remove direct uses of ram_addr_t and optimize memory_region_{get,set}_fd now that a MemoryRegion knows its RAMBlock directly. Backports commit 4ff87573df3606856a92c14eef3393a63d736d11 from qemu	2018-02-24 03:34:44 -05:00
Fam Zheng	fb8135cd0d	memory: Remove code for mr->may_overlap The collision check does nothing and hasn't been used. Remove the variable together with related code. Backports commit b61359781958759317ee6fd1a45b59be0b7dbbe1 from qemu	2018-02-24 02:55:25 -05:00
Gonglei	feff56cc11	memory: drop find_ram_block() On the one hand, we have already qemu_get_ram_block() whose function is similar. On the other hand, we can directly use mr->ram_block but searching RAMblock by ram_addr which is a kind of waste. Backports commit fa53a0e53efdc7002497ea4a76aacf6cceb170ef from qemu	2018-02-24 02:52:20 -05:00
Paolo Bonzini	d0d3712417	hw: remove pio_addr_t pio_addr_t is almost unused, because these days I/O ports are simply accessed through the address space. cpu_{in,out}[bwl] themselves are almost unused; monitor.c and xen-hvm.c could use address_space_read/write directly, since they have an integer size at hand. This leaves qtest as the only user of those functions. On the other hand even portio_* functions use this type; the only interesting use of pio_addr_t thus is include/hw/sysbus.h. I guess I could move it there, but I don't see much benefit in that either. Using uint32_t is enough and avoids the need to include ioport.h everywhere. Backports commit 89a80e7400f7225d9401b35ef32454b4ab29dc67 from qemu	2018-02-24 02:43:16 -05:00
Paolo Bonzini	9485b7c2e1	cpu: move exec-all.h inclusion out of cpu.h exec-all.h contains TCG-specific definitions. It is not needed outside TCG-specific files such as translate.c, exec.c or *helper.c. One generic function had snuck into include/exec/exec-all.h; move it to include/qom/cpu.h. Backports commit 63c915526d6a54a95919ebece83fa9ca631b2508 from qemu	2018-02-24 02:39:08 -05:00
Paolo Bonzini	58693409ea	exec: extract exec/tb-context.h TCG backends do not need most of exec-all.h; extract what they actually need to a separate file or move it directly to tcg.h. The next patch will stop including exec-all.h from everywhere. Backports commit 00f6da6a1a5d1ce085334eccbb50ec899ceed513 from qemu	2018-02-24 02:09:58 -05:00
Paolo Bonzini	37f26922dd	qemu-common: push cpu.h inclusion out of qemu-common.h Backports commit 33c11879fd422b759483ed25fef133ea900ea8d7 from qemu	2018-02-24 01:50:56 -05:00
Paolo Bonzini	78fd1aab94	cpu: move endian-dependent load/store functions to cpu-all.h Disentangle cpu-common.h and memory.h from NEED_CPU_H. Prototypes are not defined for !NEED_CPU_H, so remove them from poison.h too. Only macros need poisoning. Backports commit a7d6039cb35592683ecc56d2b37817da2d2f8b00 from qemu	2018-02-24 01:04:26 -05:00
Sergey Fedorov	ba9a237586	tcg: Rework tb_invalidated_flag 'tb_invalidated_flag' was meant to catch two events: * some TB has been invalidated by tb_phys_invalidate(); * the whole translation buffer has been flushed by tb_flush(). Then it was checked: * in cpu_exec() to ensure that the last executed TB can be safely linked to directly call the next one; * in cpu_exec_nocache() to decide if the original TB should be provided for further possible invalidation along with the temporarily generated TB. It is always safe to patch an invalidated TB since it is not going to be used anyway. It is also safe to call tb_phys_invalidate() for an already invalidated TB. Thus, setting this flag in tb_phys_invalidate() is simply unnecessary. Moreover, it can prevent from pretty proper linking of TBs, if any arbitrary TB has been invalidated. So just don't touch it in tb_phys_invalidate(). If this flag is only used to catch whether tb_flush() has been called then rename it to 'tb_flushed'. Declare it as 'bool' and stick to using only 'true' and 'false' to set its value. Also, instead of setting it in tb_gen_code(), just after tb_flush() has been called, do it right inside of tb_flush(). In cpu_exec(), this flag is used to track if tb_flush() has been called and have made 'next_tb' (a reference to the last executed TB) invalid for linking it to directly call the next TB. tb_flush() can be called during the CPU execution loop from tb_gen_code(), during TB execution or by another thread while 'tb_lock' is released. Catch for translation buffer flush reliably by resetting this flag once before first TB lookup and each time we find it set before trying to add a direct jump. Don't touch in in tb_find_physical(). Each vCPU has its own execution loop in multithreaded mode and thus should have its own copy of the flag to be able to reset it with its own 'next_tb' and don't affect any other vCPU execution thread. So make this flag per-vCPU and move it to CPUState. In cpu_exec_nocache(), we only need to check if tb_flush() has been called from tb_gen_code() called by cpu_exec_nocache() itself. To do this reliably, preserve the old value of the flag, reset it before calling tb_gen_code(), check afterwards, and combine the saved value back to the flag. This patch is based on the patch "tcg: move tb_invalidated_flag to CPUState" from Paolo Bonzini <pbonzini@redhat.com>. Backports commit 6f789be56d3f38e9214dafcfab3bf9be7191f370 from qemu	2018-02-23 23:34:51 -05:00
Sergey Fedorov	d60af028c5	tcg: Clarify thread safety check in tb_add_jump() The check is to make sure that another thread hasn't already done the same while we were outside of tb_lock. Mention this in a comment. Backports commit 9962c478b153a18fe88a6509fe58cd178aff8abc from qemu	2018-02-23 21:32:47 -05:00
Sergey Fedorov	fbc0a1105f	tcg: Use uintptr_t type for jmp_list_{next\|first} fields of TB These fields do not contain pure pointers to a TranslationBlock structure. So uintptr_t is the most appropriate type for them. Also put some asserts to assure that the two least significant bits of the pointer are always zero before assigning it to jmp_list_first. Backports commit c37e6d7e3589ecb96914faa21025ad7ba6654aea from qemu	2018-02-23 21:28:19 -05:00
Sergey Fedorov	e60c24cecf	tcg: Clean up direct block chaining data fields Briefly describe in a comment how direct block chaining is done. It should help in understanding of the following data fields. Rename some fields in TranslationBlock and TCGContext structures to better reflect their purpose (dropping excessive 'tb_' prefix in TranslationBlock but keeping it in TCGContext): tb_next_offset => jmp_reset_offset tb_jmp_offset => jmp_insn_offset tb_next => jmp_target_addr jmp_next => jmp_list_next jmp_first => jmp_list_first Avoid using a magic constant as an invalid offset which is used to indicate that there's no n-th jump generated. Backports commit f309101c26b59641fc1aa8fb2a98a5441cdaea03 from qemu	2018-02-23 21:28:19 -05:00

1 2 3 4

170 commits