unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2024-12-26 08:35:39 +00:00

Author	SHA1	Message	Date
David Hildenbrand	de513617c8	accel/tcg: allow to invalidate a write TLB entry immediately Background: s390x implements Low-Address Protection (LAP). If LAP is enabled, writing to effective addresses (before any translation) 0-511 and 4096-4607 triggers a protection exception. So we have subpage protection on the first two pages of every address space (where the lowcore - the CPU private data resides). By immediately invalidating the write entry but allowing the caller to continue, we force every write access onto these first two pages into the slow path. We will get a tlb fault with the specific accessed addresses and can then evaluate if protection applies or not. We have to make sure to ignore the invalid bit if tlb_fill() succeeds. Backports commit f52bfb12143e29d7c8bd827bdb751aee47a9694e from qemu	2020-01-14 07:14:10 -05:00
David Hildenbrand	d9d91c1db6	tcg: Factor out probe_write() logic into probe_access() Let's also allow to probe other access types. Backports commit c25c283df0f08582df29f1d5d7be1516b851532d from qemu	2020-01-14 07:07:54 -05:00
David Hildenbrand	53c3c47efa	tcg: Make probe_write() return a pointer to the host page ... similar to tlb_vaddr_to_host(); however, allow access to the host page except when TLB_NOTDIRTY or TLB_MMIO is set. Backports commit fef39ccd567032d3ad520ed80f3576068e6eb2e3 from qemu	2020-01-14 07:04:17 -05:00
David Hildenbrand	2bc3843fe3	tcg: Enforce single page access in probe_write() Let's enforce the interface restriction. Backports commit ca86cf328ce216bb304bbf09a43614613f945d86 from qemu	2020-01-14 07:02:15 -05:00
David Hildenbrand	b732ad9eba	tcg: Check for watchpoints in probe_write() Let size > 0 indicate a promise to write to those bytes. Check for write watchpoints in the probed range. Backports commit 03a981893c99faba84bb373976796ad7dce0aecc from qemu	2020-01-14 07:01:05 -05:00
Richard Henderson	07f30382c0	cputlb: Handle watchpoints via TLB_WATCHPOINT The raising of exceptions from check_watchpoint, buried inside of the I/O subsystem, is fundamentally broken. We do not have the helper return address with which we can unwind guest state. Replace PHYS_SECTION_WATCH and io_mem_watch with TLB_WATCHPOINT. Move the call to cpu_check_watchpoint into the cputlb helpers where we do have the helper return address. This allows watchpoints on RAM to bypass the full i/o access path. Backports commit 50b107c5d617eaf93301cef20221312e7a986701 from qemu	2020-01-14 06:58:33 -05:00
Richard Henderson	6c4a3fd06f	cputlb: Fold TLB_RECHECK into TLB_INVALID_MASK We had two different mechanisms to force a recheck of the tlb. Before TLB_RECHECK was introduced, we had a PAGE_WRITE_INV bit that would immediately set TLB_INVALID_MASK, which automatically means that a second check of the tlb entry fails. We can use the same mechanism to handle small pages. Conserve TLB_* bits by removing TLB_RECHECK. Backports commit 30d7e098d5c38644359820317fcf72e3e129ec53 from qemu	2020-01-14 06:20:33 -05:00
Richard Henderson	bb313206e5	cputlb: Remove double-alignment in store_helper We have already aligned page2 to the start of the next page. There is no reason to do that a second time. Backports commit 5787585d0406cfd54dda0c71ea1a603347ce6e71 from qemu	2020-01-12 10:25:13 -05:00
Richard Henderson	6990b212e3	cputlb: Fix size operand for tlb_fill on unaligned store We are currently passing the size of the full write to the tlb_fill for the second page. Instead pass the real size of the write to that page. This argument is unused within all tlb_fill, except to be logged via tracing, so in practice this makes no difference. But in a moment we'll need the value of size2 for watchpoints, and if we've computed the value we might as well use it. Backports commit 8f7cd2ad4acd01242d00807e231097b3de9f0930 from qemu	2020-01-12 06:17:09 -05:00
Tony Nguyen	a95927de1d	cputlb: Byte swap memory transaction attribute Notice new attribute, byte swap, and force the transaction through the memory slow path. Required by architectures that can invert endianness of memory transaction, e.g. SPARC64 has the Invert Endian TTE bit. Backports commit a26fc6f5152b47f1d7ed928f9c9d462d01ff1624 from qemu	2020-01-07 19:15:33 -05:00
Tony Nguyen	103d6f51c8	memory: Single byte swap along the I/O path Now that MemOp has been pushed down into the memory API, and callers are encoding endianness, we can collapse byte swaps along the I/O path into the accelerator and target independent adjust_endianness. Collapsing byte swaps along the I/O path enables additional endian inversion logic, e.g. SPARC64 Invert Endian TTE bit, with redundant byte swaps cancelling out. Backports commit 9bf825bf3df4ebae3af51566c8088e3f1249a910 from qemu	2020-01-07 19:12:04 -05:00
Tony Nguyen	ad8957a4c3	cputlb: Replace size and endian operands for MemOp Preparation for collapsing the two byte swaps adjust_endianness and handle_bswap into the former. Backports commit be5c4787e9a6eed12fd765d9e890f7cc6cd63220 from qemu	2020-01-07 19:03:51 -05:00
Tony Nguyen	da98d0da4e	memory: Access MemoryRegion with endianness Preparation for collapsing the two byte swaps adjust_endianness and handle_bswap into the former. Call memory_region_dispatch_{read|write} with endianness encoded into the "MemOp op" operand. This patch does not change any behaviour as memory_region_dispatch_{read|write} is yet to handle the endianness. Once it does handle endianness, callers with byte swaps can collapse them into adjust_endianness. Backports commit d5d680cacc66ef7e3c02c81dc8f3a34eabce6dfe from qemu	2020-01-07 18:54:11 -05:00
Tony Nguyen	3b777a2332	cputlb: Access MemoryRegion with MemOp The memory_region_dispatch_{read|write} operand "unsigned size" is being converted into a "MemOp op". Convert interfaces by using no-op size_memop. After all interfaces are converted, size_memop will be implemented and the memory_region_dispatch_{read|write} operand "unsigned size" will be converted into a "MemOp op". As size_memop is a no-op, this patch does not change any behaviour. Backports commit 4cbb198eefef41bbca703605c78875fd4fec6ef6 from qemu	2020-01-07 18:26:29 -05:00
Tony Nguyen	f75368cd0f	tcg: TCGMemOp is now accelerator independent MemOp Preparation for collapsing the two byte swaps, adjust_endianness and handle_bswap, along the I/O path. Target dependent attributes are conditionalized upon NEED_CPU_H. Backports commit 14776ab5a12972ea439c7fb2203a4c15a09094b4 from qemu	2019-11-28 03:01:12 -05:00
Lioncash	802c626145	Revert "cputlb: Filter flushes on already clean tlbs" This reverts commit `5ab9723787`.	2019-06-30 19:21:20 -04:00
Alex Bennée	938f8465a0	cputlb: cast size_t to target_ulong before using for address masks While size_t is defined to happily access the biggest host object this isn't the case when generating masks for 64 bit guests on 32 bit hosts. Otherwise we end up truncating the address when we fall back to our unaligned helper. Fixes: https://bugs.launchpad.net/qemu/+bug/1831545 Backports commit ab7a2009df66241a3742cbdfe8f9a1f66c6af21f from qemu	2019-06-13 16:07:01 -04:00
Alex Bennée	9aef73f5fb	cputlb: use uint64_t for interim values for unaligned load When running on 32 bit TCG backends a wide unaligned load ends up truncating data before returning to the guest. We specifically have the return type as uint64_t to avoid any premature truncation so we should use the same for the interim types. Fixes: https://bugs.launchpad.net/qemu/+bug/1830872 Fixes: eed5664238e Backports commit 8c79b288513587e960b6b7257a9d955d5592f209 from qemu	2019-06-13 16:06:22 -04:00
Richard Henderson	fbf91a6535	cpu: Replace ENV_GET_CPU with env_cpu Now that we have both ArchCPU and CPUArchState, we can define this generically instead of via macro in each target's cpu.h. Backports commit 29a0af618ddd21f55df5753c3e16b0625f534b3c from qemu	2019-06-12 11:16:16 -04:00
Lioncash	5ab9723787	cputlb: Filter flushes on already clean tlbs Especially for guests with large numbers of tlbs, like ARM or PPC, we may well not use all of them in between flush operations. Remember which tlbs have been used since the last flush, and avoid any useless flushing. Backports much of 3d1523ced6060cdfe9e768a814d064067ccabfe5 from qemu along with a bunch of updating changes.	2019-06-10 20:42:15 -04:00
Richard Henderson	2a4a7b9391	tcg: Use tlb_fill probe from tlb_vaddr_to_host Most of the existing users would continue around a loop which would fault the tlb entry in via a normal load/store. But for AArch64 SVE we have an existing emulation bug wherein we would mark the first element of a no-fault vector load as faulted (within the FFR, not via exception) just because we did not have its address in the TLB. Now we can properly only mark it as faulted if there really is no valid, readable translation, while still not raising an exception. (Note that beyond the first element of the vector, the hardware may report a fault for any reason whatsoever; with at least one element loaded, forward progress is guaranteed.) Backports commit 4811e9095c0491bc6f5450e5012c9c4796b9e59d from qemu	2019-05-16 18:27:03 -04:00
Richard Henderson	dab0061a0d	tcg: Use CPUClass::tlb_fill in cputlb.c We can now use the CPUClass hook instead of a named function. Create a static tlb_fill function to avoid other changes within cputlb.c. This also isolates the asserts within. Remove the named tlb_fill function from all of the targets. Backports commit c319dc13579a92937bffe02ad2c9f1a550e73973 from qemu	2019-05-16 17:35:37 -04:00
Richard Henderson	9a02741c13	cputlb: Do unaligned store recursion to outermost function This is less tricky than for loads, because we always fall back to single byte stores to implement unaligned stores. Backports commit 4601f8d10d7628bcaf2a8179af36e04b42879e91 from qemu	2019-05-14 07:45:15 -04:00
Richard Henderson	bcab6f1719	cputlb: Do unaligned load recursion to outermost function If we attempt to recurse from load_helper back to load_helper, even via intermediary, we do not get all of the constants expanded away as desired. But if we recurse back to the original helper (or a shim that has a consistent function signature), the operands are folded away as desired. Backports commit 2dd926067867c2dd19e66d31a7990e8eea7258f6 from qemu	2019-05-14 07:43:31 -04:00
Richard Henderson	f12f36aebd	cputlb: Drop attribute flatten Going to approach this problem via __attribute__((always_inline)) instead, but full conversion will take several steps. Backports commit fc1bc777910dc14a3db4e2ad66f3e536effc297d from qemu	2019-05-14 07:33:39 -04:00
Richard Henderson	7991cd601f	cputlb: Move TLB_RECHECK handling into load/store_helper Having this in io_readx/io_writex meant that we forgot to re-compute index after tlb_fill. It also means we can use the normal aligned memory load path. It also fixes a bug in that we had cached a use of index across a tlb_fill. Backports commit f1be36969de2fb9b6b64397db1098f115210fcd9 from qemu	2019-05-14 07:28:15 -04:00
Alex Bennée	ccee796272	accel/tcg: demacro cputlb Instead of expanding a series of macros to generate the load/store helpers we move stuff into common functions and rely on the compiler to eliminate the dead code for each variant. Backports commit eed5664238ea5317689cf32426d9318686b2b75c from qemu	2019-05-14 07:28:11 -04:00
Shahab Vahedi	7f59d62f4a	cputlb: Fix io_readx() to respect the access_type This change adapts io_readx() to its input access_type. Currently io_readx() treats any memory access as a read, although it has an input argument "MMUAccessType access_type". This results in: 1) Calling the tlb_fill() only with MMU_DATA_LOAD 2) Considering only entry->addr_read as the tlb_addr Buglink: https://bugs.launchpad.net/qemu/+bug/1825359 Backports commit ef5dae6805cce7b59d129d801bdc5db71bcbd60d from qemu	2019-04-30 10:11:11 -04:00
Lioncash	5daabe55a4	cputlb: Synchronize with qemu Synchronizes the code with Qemu to reduce a few differences.	2019-04-26 15:48:45 -04:00
Emilio G. Cota	f31764dd5b	cputlb: update TLB entry/index after tlb_fill We are failing to take into account that tlb_fill() can cause a TLB resize, which renders prior TLB entry pointers/indices stale. Fix it by re-doing the TLB entry lookups immediately after tlb_fill. Fixes: 86e1eff8bc ("tcg: introduce dynamic TLB sizing", 2019-01-28) Backports commit 6d967cb86d5b4a60ba15b497126b621ce9ca6609 from qemu	2019-02-12 11:48:48 -05:00
Thomas Huth	85bc48fecd	tcg: Fix LGPL version number It's either "GNU Library General Public version 2" or "GNU Lesser General Public version 2.1", but there was no "version 2.0" of the "Lesser" library. So assume that version 2.1 is meant here. Backports commit fb0343d5b4dd4b9b9e96e563d913a3e0c709fe4e from qemu	2019-02-03 17:55:28 -05:00
Peter Maydell	1301becdab	tcg: Support MMU protection regions smaller than TARGET_PAGE_SIZE Add support for MMU protection regions that are smaller than TARGET_PAGE_SIZE. We do this by marking the TLB entry for those pages with a flag TLB_RECHECK. This flag causes us to always take the slow-path for accesses. In the slow path we can then special case them to always call tlb_fill() again, so we have the correct information for the exact address being accessed. This change allows us to handle reading and writing from small regions; we cannot deal with execution from the small region. Backports commit 55df6fcf5476b44bc1b95554e686ab3e91d725c5 from qemu	2018-11-16 21:35:54 -05:00
Lioncash	3a0ab1a64a	Partial backport of: exec.c: Handle IOMMUs in address_space_translate_for_iotlb() We just want the parameter changes here. Partial backport of commit 1f871c5e6b0f30644a60a81a6a7aadb3afb030ac from qemu	2018-11-16 21:24:55 -05:00
Emilio G. Cota	1677898a09	cputlb: read CPUTLBEntry.addr_write atomically Updates can come from other threads, so readers that do not take tlb_lock must use atomic_read to avoid undefined behaviour (UB). This completes the conversion to tlb_lock. This conversion results on average in no performance loss, as the following experiments (run on an Intel i7-6700K CPU @ 4.00GHz) show. 1. aarch64 bootup+shutdown test: - Before: Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs): 7487.087786 task-clock (msec) # 0.998 CPUs utilized ( +- 0.12% ) 31,574,905,303 cycles # 4.217 GHz ( +- 0.12% ) 57,097,908,812 instructions # 1.81 insns per cycle ( +- 0.08% ) 10,255,415,367 branches # 1369.747 M/sec ( +- 0.08% ) 173,278,962 branch-misses # 1.69% of all branches ( +- 0.18% ) 7.504481349 seconds time elapsed ( +- 0.14% ) - After: Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs): 7462.441328 task-clock (msec) # 0.998 CPUs utilized ( +- 0.07% ) 31,478,476,520 cycles # 4.218 GHz ( +- 0.07% ) 57,017,330,084 instructions # 1.81 insns per cycle ( +- 0.05% ) 10,251,929,667 branches # 1373.804 M/sec ( +- 0.05% ) 173,023,787 branch-misses # 1.69% of all branches ( +- 0.11% ) 7.474970463 seconds time elapsed ( +- 0.07% ) 2. SPEC06int: [ASCII bar chart: SPEC06int (test set), Y axis: speedup over master, comparing tlb-lock-v2 (mutex) and tlb-lock-v3 (spinlock) per benchmark and as geomean] png: https://imgur.com/a/BHzpPTW Notes: - tlb-lock-v2 corresponds to an implementation with a mutex. - tlb-lock-v3 corresponds to the current implementation, i.e. a spinlock and a single lock acquisition in tlb_set_page_with_attrs. Backports commit 403f290c0603f35f2d09c982bf5549b6d0803ec1 from qemu	2018-10-23 15:37:43 -04:00
Richard Henderson	d74e00a30a	tcg: Split CONFIG_ATOMIC128 GCC7+ will no longer advertise support for 16-byte __atomic operations if only cmpxchg is supported, as for x86_64. Fortunately, x86_64 still has support for __sync_compare_and_swap_16 and we can make use of that. AArch64 does not have, nor ever has had such support, so open-code it. Backports commit e6cd4bb59b8154fa00da611200beef7eb4e8ec56 from qemu	2018-10-23 15:17:39 -04:00
Richard Henderson	c911ea7128	tcg: Add tlb_index and tlb_entry helpers Isolate the computation of an index from an address into a helper before we change that function. Backports commit 383beda9cf32f795616c3b93f7d6154d70372d4b from qemu	2018-10-23 15:04:27 -04:00
Emilio G. Cota	dfb3954571	exec: introduce tlb_init Paves the way for the addition of a per-TLB lock. Backports commit 5005e2537d090bee87aca3b924dcd17920fd146a from qemu	2018-10-23 14:41:29 -04:00
Peter Maydell	0c6311f8cc	accel/tcg: Correct "is this a TLB miss" check in get_page_addr_code() In commit 71b9a45330fe220d1 we changed the condition we use to determine whether we need to refill the TLB in get_page_addr_code() to if (unlikely(env->tlb_table[mmu_idx][index].addr_code != (addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK)))) { This isn't the right check (it will falsely fail if the input addr happens to have the low bit corresponding to TLB_INVALID_MASK set, for instance). Replace it with a use of the new tlb_hit() function, which is the correct test. Backports commit e4c967a7201400d7f76e5847d5b4c4ac9e2566e0 from qemu	2018-07-03 19:23:25 -04:00
Peter Maydell	6543f9ea26	tcg: Define and use new tlb_hit() and tlb_hit_page() functions The condition to check whether an address has hit against a particular TLB entry is not completely trivial. We do this in various places, and in fact in one place (get_page_addr_code()) we have got the condition wrong. Abstract it out into new tlb_hit() and tlb_hit_page() inline functions (one for a known-page-aligned address and one for an arbitrary address), and use them in all the places where we had the condition correct. This is a no-behaviour-change patch; we leave fixing the buggy code in get_page_addr_code() to a subsequent patch. Backports commit 334692bce7f0653a93b8d84ecde8c847b08dec38 from qemu	2018-07-03 19:21:36 -04:00
Peter Maydell	61a7ac6948	cpu-defs.h: Document CPUIOTLBEntry 'addr' field The 'addr' field in the CPUIOTLBEntry struct has a rather non-obvious use; add a comment documenting it (reverse-engineered from what the code that sets it is doing). Backports commit ace4109011b4912b24e76f152e2cf010e78819c5 from qemu	2018-06-15 12:07:39 -04:00
Peter Maydell	7a6ae26346	cputlb: Pass cpu_transaction_failed() the correct physaddr The API for cpu_transaction_failed() says that it takes the physical address for the failed transaction. However we were actually passing it the offset within the target MemoryRegion. We don't currently have any target CPU implementations of this hook that require the physical address; fix this bug so we don't get confused if we ever do add one. Backports commit 2d54f19401bc54b3b56d1cc44c96e4087b604b97 from qemu	2018-06-15 12:03:23 -04:00
Lioncash	035f1afa7d	tcg: move tcg backend files into accel/tcg/ move tcg-runtime.c, translate-all.(ch) and translate-common.c into accel/tcg/ subdirectory and updated related trace-events file. Backports commit 244f144134d0dd182f1af8654e7f9a79fe770368 and applies relevant changes made in db432672dc50ed86dda17ac821b7eb07411a90af and d9bb58e51068dfc48746c6af0179926c8dc05bce from qemu	2018-03-13 11:48:15 -04:00
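
The tlb_hit() / tlb_hit_page() commits in the log above (6543f9ea26 and 0c6311f8cc) describe a hit test that masks the TLB entry rather than the input address. The following is only a rough sketch of that idea, not the code from the repository: the `SKETCH_*` constants, the 4 KiB page size, the position of the invalid bit, and all `sketch_*` function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: 4 KiB pages, with one low bit of the stored
 * TLB address reused as an "entry is invalid" flag. */
#define SKETCH_PAGE_BITS    12
#define SKETCH_PAGE_MASK    (~(((uint64_t)1 << SKETCH_PAGE_BITS) - 1))
#define SKETCH_INVALID_MASK ((uint64_t)1 << (SKETCH_PAGE_BITS - 1))

/* Hit test for an address that is already page-aligned: mask the TLB
 * entry, so a set invalid bit in the entry can never compare equal. */
static inline bool sketch_tlb_hit_page(uint64_t tlb_addr, uint64_t page_addr)
{
    return page_addr == (tlb_addr & (SKETCH_PAGE_MASK | SKETCH_INVALID_MASK));
}

/* Hit test for an arbitrary address: drop the page offset first. */
static inline bool sketch_tlb_hit(uint64_t tlb_addr, uint64_t addr)
{
    return sketch_tlb_hit_page(tlb_addr, addr & SKETCH_PAGE_MASK);
}

/* The shape of the check the commit message calls wrong: the mask is
 * applied to the input address, so an address whose offset happens to
 * contain the invalid bit is reported as a miss. */
static inline bool sketch_buggy_miss(uint64_t tlb_addr, uint64_t addr)
{
    return tlb_addr != (addr & (SKETCH_PAGE_MASK | SKETCH_INVALID_MASK));
}
```

With these assumed constants, an access to 0x1800 against a valid entry for page 0x1000 is a hit under `sketch_tlb_hit`, while `sketch_buggy_miss` reports a miss because bit 11 of the address survives the mask.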

42 commits
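
Commit 938f8465a0 in the log above notes that an address mask built from a `size_t` truncates 64-bit guest addresses on 32-bit hosts. A small sketch of that effect follows; it is not the patched code. `uint32_t` stands in for a 32-bit host `size_t`, `uint64_t` stands in for a 64-bit `target_ulong`, the `sketch_*` names are hypothetical, and the comments assume a platform where `int` is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Mask computed in 32 bits: ~(size - 1) is a 32-bit value that is
 * zero-extended when combined with the 64-bit address, so the upper
 * half of the address is cleared. */
static inline uint64_t sketch_align_truncating(uint64_t addr, uint32_t size)
{
    return addr & ~(size - 1);
}

/* Mask computed after widening the size to the guest address type:
 * the upper address bits are preserved. */
static inline uint64_t sketch_align_widened(uint64_t addr, uint32_t size)
{
    return addr & ~((uint64_t)size - 1);
}
```

For an address above 4 GiB such as 0x100001003 and a size of 8, the first helper yields 0x1000 while the second yields 0x100001000, which is the difference the cast in the commit title is meant to address.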