Commit graph

32 commits

Richard Henderson fb684825c8
tcg: Add opcodes for vector minmax arithmetic
Backports commit dd0a0fcdd8848c2a18970c44a62bd8f394c2b495 from qemu
2019-01-29 16:24:52 -05:00
Richard Henderson e08d0feee4
tcg: Add gvec expanders for nand, nor, eqv
Backports commit f550805d8309500d642f640af8d9928958465478 from qemu
2019-01-29 15:57:28 -05:00
Lioncash 977ad292b3
accel/translate-all: Get rid of variable shadowing 2019-01-28 09:17:37 -05:00
Lioncash ce8697f978
accel/translate-all: Convert a void* cast into an unsigned char* cast
Strictly speaking, as far as the standard is concerned, performing
pointer arithmetic on a void* type is ill-formed; it is a GNU extension
that allows it. Instead, just use unsigned char*, which preserves the
same behavior.
2019-01-28 09:14:33 -05:00
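
For illustration, a minimal standalone example of the issue (a
hypothetical snippet, not taken from the commit itself):

    #include <stdio.h>

    int main(void)
    {
        char data[8] = "abcdefg";
        void *p = data;

        /* 'p + 3' on a void* compiles only as a GNU extension
         * (GCC/Clang treat sizeof(void) as 1); strictly conforming
         * code must not do arithmetic on void*. */
        unsigned char *q = (unsigned char *)p + 3;  /* portable form */
        printf("%c\n", *q);  /* prints 'd' */
        return 0;
    }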
Peter Maydell 1301becdab
tcg: Support MMU protection regions smaller than TARGET_PAGE_SIZE
Add support for MMU protection regions that are smaller than
TARGET_PAGE_SIZE. We do this by marking the TLB entry for those
pages with a flag TLB_RECHECK. This flag causes us to always
take the slow-path for accesses. In the slow path we can then
special case them to always call tlb_fill() again, so we have
the correct information for the exact address being accessed.

This change allows us to handle reading from and writing to small
regions; we cannot deal with execution from a small region.

Backports commit 55df6fcf5476b44bc1b95554e686ab3e91d725c5 from qemu
2018-11-16 21:35:54 -05:00
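
Roughly, the mechanism looks like this; a simplified sketch of the
slow path, not the exact backported code:

    /* Sketch: TLB_RECHECK (a flag bit assumed to live in the TLB
     * address word) marks entries covering sub-page regions, forcing
     * every access down the slow path. */
    if (unlikely(tlb_addr & TLB_RECHECK)) {
        /* Re-run tlb_fill() so permissions are evaluated for the
         * exact address being accessed, then complete the access
         * via the slow path. */
        tlb_fill(cpu, addr, size, access_type, mmu_idx, retaddr);
        /* ... */
    }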
Lioncash 3a0ab1a64a
Partial backport of: exec.c: Handle IOMMUs in address_space_translate_for_iotlb()
We just want the parameter changes here.

Partial backport of commit 1f871c5e6b0f30644a60a81a6a7aadb3afb030ac from
qemu
2018-11-16 21:24:55 -05:00
Emilio G. Cota 1677898a09
cputlb: read CPUTLBEntry.addr_write atomically
Updates can come from other threads, so readers that do not
take tlb_lock must use atomic_read to avoid undefined
behaviour (UB).

This completes the conversion to tlb_lock. This conversion results
on average in no performance loss, as the following experiments
(run on an Intel i7-6700K CPU @ 4.00GHz) show.

1. aarch64 bootup+shutdown test:

- Before:
Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs):

7487.087786 task-clock (msec) # 0.998 CPUs utilized ( +- 0.12% )
31,574,905,303 cycles # 4.217 GHz ( +- 0.12% )
57,097,908,812 instructions # 1.81 insns per cycle ( +- 0.08% )
10,255,415,367 branches # 1369.747 M/sec ( +- 0.08% )
173,278,962 branch-misses # 1.69% of all branches ( +- 0.18% )

7.504481349 seconds time elapsed ( +- 0.14% )

- After:
Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs):

7462.441328 task-clock (msec) # 0.998 CPUs utilized ( +- 0.07% )
31,478,476,520 cycles # 4.218 GHz ( +- 0.07% )
57,017,330,084 instructions # 1.81 insns per cycle ( +- 0.05% )
10,251,929,667 branches # 1373.804 M/sec ( +- 0.05% )
173,023,787 branch-misses # 1.69% of all branches ( +- 0.11% )

7.474970463 seconds time elapsed ( +- 0.07% )

2. SPEC06int:
[ASCII gnuplot omitted: "SPEC06int (test set)", Y axis: speedup over
master. The plot compares tlb-lock-v2 (mutex) and tlb-lock-v3
(spinlock) across 400.perlbench, 401.bzip2, 403.gcc, 429.mcf,
445.gobmk, 456.hmmer, 458.sjeng, 462.libquantum, 464.h264ref,
471.omnetpp, 473.astar, 483.xalancbmk and the geomean; both variants
stay close to 1.0.]

png: https://imgur.com/a/BHzpPTW

Notes:
- tlb-lock-v2 corresponds to an implementation with a mutex.
- tlb-lock-v3 corresponds to the current implementation, i.e.
a spinlock and a single lock acquisition in tlb_set_page_with_attrs.

Backports commit 403f290c0603f35f2d09c982bf5549b6d0803ec1 from qemu
2018-10-23 15:37:43 -04:00
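
The reader side boils down to a helper along these lines (a sketch
based on the upstream commit; TCG_OVERSIZED_GUEST covers hosts whose
pointers are narrower than guest addresses, where the atomic read is
not possible):

    static inline target_ulong tlb_addr_write(const CPUTLBEntry *entry)
    {
    #if TCG_OVERSIZED_GUEST
        return entry->addr_write;
    #else
        /* Writers update this field under tlb_lock; lock-free
         * readers must use an atomic load to avoid UB. */
        return atomic_read(&entry->addr_write);
    #endif
    }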
Richard Henderson d74e00a30a
tcg: Split CONFIG_ATOMIC128
GCC7+ will no longer advertise support for 16-byte __atomic operations
if only cmpxchg is supported, as for x86_64. Fortunately, x86_64 still
has support for __sync_compare_and_swap_16 and we can make use of that.
AArch64 does not have, nor has it ever had, such support, so open-code it.

Backports commit e6cd4bb59b8154fa00da611200beef7eb4e8ec56 from qemu
2018-10-23 15:17:39 -04:00
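
On x86_64 the fallback amounts to something like the following (a
sketch; the real code wraps qemu's Int128 type rather than raw
__int128):

    /* Use the legacy __sync builtin, which GCC 7+ still provides
     * for 16-byte compare-and-swap even though the __atomic
     * variants are no longer advertised. */
    static inline __int128 atomic16_cmpxchg(__int128 *ptr,
                                            __int128 cmp, __int128 new)
    {
        return __sync_val_compare_and_swap_16(ptr, cmp, new);
    }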
Richard Henderson c911ea7128
tcg: Add tlb_index and tlb_entry helpers
Isolate the computation of an index from an address into a
helper before we change that function.

Backports commit 383beda9cf32f795616c3b93f7d6154d70372d4b from qemu
2018-10-23 15:04:27 -04:00
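
The helpers themselves are small; roughly, following the upstream
commit:

    static inline uintptr_t tlb_index(CPUArchState *env, uintptr_t mmu_idx,
                                      target_ulong addr)
    {
        return (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
    }

    static inline CPUTLBEntry *tlb_entry(CPUArchState *env, uintptr_t mmu_idx,
                                         target_ulong addr)
    {
        return &env->tlb_table[mmu_idx][tlb_index(env, mmu_idx, addr)];
    }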
Emilio G. Cota dfb3954571
exec: introduce tlb_init
Paves the way for the addition of a per-TLB lock.

Backports commit 5005e2537d090bee87aca3b924dcd17920fd146a from qemu
2018-10-23 14:41:29 -04:00
Richard Henderson e01deeb9ba
tcg: Implement CPU_LOG_TB_NOCHAIN during expansion
Rather than test NOCHAIN before linking, do not emit the
goto_tb opcode at all. We already do this for goto_ptr.

Backports commit d7f425fdea991f052241c6479acd9feae834063b from qemu
2018-10-23 14:35:12 -04:00
Pavel Dovgalyuk 0242d19e79
translator: fix breakpoint processing
QEMU cannot pass through breakpoints when the 'si' command is used
in remote gdb. This patch disables inserting breakpoints when we are
already single-stepping through the gdb remote protocol.
This patch also fixes the icount calculation for blocks that include
breakpoints: an instruction with a breakpoint is not executed and
shouldn't be counted in the icount calculation.

Backports commit f9f1f56e4da088b993ce28775c271d5bcdcf49ae from qemu
2018-10-04 04:04:57 -04:00
Peter Maydell 0c6311f8cc
accel/tcg: Correct "is this a TLB miss" check in get_page_addr_code()
In commit 71b9a45330fe220d1 we changed the condition we use
to determine whether we need to refill the TLB in
get_page_addr_code() to
    if (unlikely(env->tlb_table[mmu_idx][index].addr_code !=
                 (addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK)))) {

This isn't the right check (it will falsely fail if the
input addr happens to have the low bit corresponding to
TLB_INVALID_MASK set, for instance). Replace it with a
use of the new tlb_hit() function, which is the correct test.

Backports commit e4c967a7201400d7f76e5847d5b4c4ac9e2566e0 from qemu
2018-07-03 19:23:25 -04:00
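
With the helper, the corrected condition reads roughly:

    /* Sketch of the fixed check: tlb_hit() masks the input address
     * to its page before comparing, so stray low bits in addr (such
     * as TLB_INVALID_MASK) can no longer cause a false mismatch. */
    if (unlikely(!tlb_hit(env->tlb_table[mmu_idx][index].addr_code, addr))) {
        /* TLB miss: refill */
    }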
Peter Maydell 6543f9ea26
tcg: Define and use new tlb_hit() and tlb_hit_page() functions
The condition to check whether an address has hit against a particular
TLB entry is not completely trivial. We do this in various places, and
in fact in one place (get_page_addr_code()) we have got the condition
wrong. Abstract it out into new tlb_hit() and tlb_hit_page() inline
functions (one for a known-page-aligned address and one for an
arbitrary address), and use them in all the places where we had the
condition correct.

This is a no-behaviour-change patch; we leave fixing the buggy
code in get_page_addr_code() to a subsequent patch.

Backports commit 334692bce7f0653a93b8d84ecde8c847b08dec38 from qemu
2018-07-03 19:21:36 -04:00
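
The new functions are, to a close approximation of the upstream
commit:

    static inline bool tlb_hit_page(target_ulong tlb_addr, target_ulong page)
    {
        /* 'page' must be page-aligned; the entry matches if the page
         * tags agree and the entry is not marked invalid. */
        return page == (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK));
    }

    static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
    {
        return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
    }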
Peter Maydell 61a7ac6948
cpu-defs.h: Document CPUIOTLBEntry 'addr' field
The 'addr' field in the CPUIOTLBEntry struct has a rather non-obvious
use; add a comment documenting it (reverse-engineered from what
the code that sets it is doing).

Backports commit ace4109011b4912b24e76f152e2cf010e78819c5 from qemu
2018-06-15 12:07:39 -04:00
Peter Maydell 7a6ae26346
cputlb: Pass cpu_transaction_failed() the correct physaddr
The API for cpu_transaction_failed() says that it takes the physical
address for the failed transaction. However we were actually passing
it the offset within the target MemoryRegion. We don't currently
have any target CPU implementations of this hook that require the
physical address; fix this bug so we don't get confused if we ever
do add one.

Backports commit 2d54f19401bc54b3b56d1cc44c96e4087b604b97 from qemu
2018-06-15 12:03:23 -04:00
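
The fix amounts to reconstructing the full physical address before
the call; a sketch, with field names taken from qemu's
MemoryRegionSection:

    /* mr_offset is the offset within the MemoryRegion; translate it
     * back into a full physical address. */
    hwaddr physaddr = mr_offset
                    + section->offset_within_address_space
                    - section->offset_within_region;

    cpu_transaction_failed(cpu, physaddr, addr, size, access_type,
                           mmu_idx, attrs, response, retaddr);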
Richard Henderson b42217fbaf
tcg: Use GEN_ATOMIC_HELPER_FN for opposite endian atomic add
Backports commit 58edf9eef9d0e99dc051367c5a446a62223ec6e4 from qemu
2018-05-14 08:07:49 -04:00
Richard Henderson de1708aadc
tcg: Introduce atomic helpers for integer min/max
Given that this atomic operation will be used by both risc-v
and aarch64, let's not duplicate code across the two targets.

Backports commit 5507c2bf35aa6b4705939349184e71afd5e058b2 from qemu
2018-05-14 08:06:42 -04:00
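
The generated helpers follow the standard compare-and-swap loop; a
standalone C11 sketch of the idea (the qemu versions are stamped out
per type and endianness by macros):

    #include <stdatomic.h>
    #include <stdint.h>

    /* Atomically store min(*addr, val); return the previous value. */
    static uint32_t atomic_fetch_umin(_Atomic uint32_t *addr, uint32_t val)
    {
        uint32_t old = atomic_load(addr);
        /* On CAS failure, 'old' is refreshed with the current value,
         * so the comparison is re-evaluated each round. */
        while (old > val &&
               !atomic_compare_exchange_weak(addr, &old, val)) {
        }
        return old;
    }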
Emilio G. Cota d26bf1d446
translator: merge max_insns into DisasContextBase
While at it, use int for both num_insns and max_insns to make
sure we have same-type comparisons.

Backports commit b542683d77b4f56cef0221b267c341616d87bce9 from qemu
2018-05-11 13:59:17 -04:00
Pavel Dovgalyuk b4bf3c776b
icount: fix cpu_restore_state_from_tb for non-tb-exit cases
In icount mode, instructions that access io memory spaces in the middle
of the translation block invoke TB recompilation. After recompilation,
such instructions become last in the TB and are allowed to access io
memory spaces.

When the code includes an instruction like the i386 'xchg eax, 0xffffd080',
which accesses the APIC, QEMU goes into an infinite recompilation loop.

This instruction includes two memory accesses - one read and one write.
After the first access, APIC calls cpu_report_tpr_access, which restores
the CPU state to get the current eip. But cpu_restore_state_from_tb
resets the cpu->can_do_io flag which makes the second memory access invalid.
Therefore the second memory access causes a recompilation of the block.
Then these operations repeat again and again.

This patch moves resetting cpu->can_do_io flag from
cpu_restore_state_from_tb to cpu_loop_exit* functions.

It also adds a parameter for cpu_restore_state which controls restoring
icount. There is no need to restore icount when we only query CPU state
without breaking the TB. Restoring it in such cases leads to the
incorrect flow of the virtual time.

In most cases the new parameter is true (icount should be recalculated).
But there are two cases in i386 and openrisc when the CPU state is only
queried without the need to break the TB. This patch fixes both of
these cases.

Backports commit afd46fcad2dceffda35c0586f5723c127b6e09d8 from qemu
2018-04-11 20:05:40 -04:00
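
The resulting interface looks roughly like this (parameter name
assumed from the upstream commit):

    /* will_exit: true if the TB is being broken out of and icount
     * must be recalculated; false for state queries that keep
     * executing the current TB. */
    bool cpu_restore_state(CPUState *cpu, uintptr_t host_pc, bool will_exit);

    /* e.g. i386 cpu_report_tpr_access only queries eip: */
    cpu_restore_state(cs, retaddr, false);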
Alex Bennée 4074587775
accel/tcg/translate-all: expand cpu_restore_state addr check
We are still seeing signals during translation time when we walk over
a page protection boundary. This expands the check to ensure the host
PC is inside the code generation buffer. The original suggestion was
to check versus tcg_ctx.code_gen_ptr, but as we now segment the
translation buffer, we have to settle for just a general check for
being inside it.

I've also fixed up the declaration to make it clear it can deal with
invalid addresses. A later patch will fix up the call sites.

Backports commit d25f2a72272b9ffe0d06710d6217d1169bc2cc7d from qemu
2018-04-11 19:53:57 -04:00
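
The expanded check can be written as a single unsigned comparison; a
sketch, with field names assumed from upstream qemu:

    /* If host_pc lies below code_gen_buffer, the subtraction wraps
     * and the offset compares above the buffer size, so one test
     * covers both ends of the range. */
    uintptr_t check_offset = host_pc - (uintptr_t)tcg_init_ctx.code_gen_buffer;

    if (check_offset < tcg_init_ctx.code_gen_buffer_size) {
        /* host_pc points into generated code: safe to restore state */
    }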
Richard Henderson e0903adacf
tcg: Fix out-of-line generic vector compares
The wrong type was passed to sizeof; this happens to work when the
out-of-line fallback itself uses host vectors, but fails when using
only the base types.

Backports commit 6cb1d3b8517572031a22675280ec642972cdb395 from qemu
2018-04-07 23:05:19 -04:00
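
The bug class is easy to reproduce; a hypothetical illustration, not
the actual qemu lines:

    #include <stdint.h>
    #include <stdlib.h>

    uint64_t *alloc_table(size_t n)
    {
        /* Wrong: sizeof(p) is the size of a pointer; it merely
         * happens to equal sizeof(uint64_t) on 64-bit hosts, and
         * silently under-allocates elsewhere. */
        /* uint64_t *p = malloc(n * sizeof(p)); */

        /* Right: size of the pointed-to element. */
        uint64_t *p = malloc(n * sizeof(*p));
        return p;
    }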
Lioncash a0c39b4996
translate-all: Fix missing #elif condition in alloc_code_gen_buffer 2018-03-21 12:46:03 -04:00
Pavel Dovgalyuk 36d902cd0e
cpu-exec: fix exception_index handling
Function cpu_handle_interrupt calls cc->cpu_exec_interrupt to process
pending hardware interrupts. Under the hood cpu_exec_interrupt uses
cpu->exception_index to pass information to the internal function which
is usually common for exception and interrupt processing.
But this value is not reset after the call returns and may be processed
again by cpu_handle_exception. Normally this does not happen, because
exception_index is overwritten at the end of cpu_handle_interrupt; but
that same code path may also overwrite a valid exception_index in some
cases.
Therefore this patch:
1. resets exception_index just after the call to cpu_exec_interrupt
2. prevents overwriting the meaningful value of exception_index

Backports commit 5f3bdfd4fa33255542a4b6249913d9ffb11b44f9 from qemu
2018-03-17 19:33:05 -04:00
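
A sketch of both changes inside cpu_handle_interrupt (simplified from
the upstream commit):

    if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
        /* (1) the hook may have left a stale value behind */
        cpu->exception_index = -1;
        *last_tb = NULL;
    }
    /* ... */
    if (unlikely(atomic_read(&cpu->exit_request))) {
        atomic_set(&cpu->exit_request, 0);
        /* (2) don't clobber a meaningful exception_index */
        if (cpu->exception_index == -1) {
            cpu->exception_index = EXCP_INTERRUPT;
        }
        return true;
    }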
Lioncash 103af93402
translate-all: Prevent null-pointer dereference possibility in tb_clean_internal() 2018-03-17 18:31:39 -04:00
Philippe Mathieu-Daudé 4eeb4f7faf
accel/tcg: move atomic_template.h to accel/tcg/ 2018-03-13 12:28:50 -04:00
Thomas Huth 975924bb2e
accel/tcg: move softmmu_template.h to accel/tcg/
The header is only used by accel/tcg/cputlb.c so we can
move it to the accel/tcg/ folder, too.

Backports commit da1849c1eba50aa372f87c7945d7b230eb2b2fb2 from qemu
2018-03-13 12:27:04 -04:00
Lioncash 035f1afa7d
tcg: move tcg backend files into accel/tcg/
Move tcg-runtime.c, translate-all.(ch) and translate-common.c into the
accel/tcg/ subdirectory and update the related trace-events file.

Backports commit 244f144134d0dd182f1af8654e7f9a79fe770368 and applies
relevant changes made in db432672dc50ed86dda17ac821b7eb07411a90af and
d9bb58e51068dfc48746c6af0179926c8dc05bce from qemu
2018-03-13 11:48:15 -04:00
Richard Henderson eb488f5bd6
tcg: Merge opcode arguments into TCGOp
Rather than have a separate buffer of 10*max_ops entries,
give each opcode 10 entries. The result is actually a bit
smaller and should have slightly more cache locality.

Backports commit 75e8b9b7aa0b95a761b9add7e2f09248b101a392 from qemu
2018-03-05 04:45:20 -05:00
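
Structurally, the change amounts to embedding the argument slots in
the op; a sketch, with the surrounding fields elided:

    typedef struct TCGOp {
        TCGOpcode opc;
        /* ... liveness and op-list linking fields ... */

        /* Arguments for this opcode: 10 fixed slots per op,
         * replacing the separate 10*max_ops argument buffer. */
        TCGArg args[MAX_OPC_PARAM];
    } TCGOp;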
Lluís Vilanova 74d437827b
target/arm: [tcg] Port to generic translation framework
Backports commit 2316922420da6fd0d1ffb5557d0cdcc5958bcf44 from qemu
2018-03-04 20:28:06 -05:00
Lluís Vilanova c40f5eb73e
target/i386: [tcg] Port to generic translation framework
Backports commit d2e6eedf5078d0f2ac17fc1a0d24f6be79c071d7 from qemu
2018-03-04 17:42:42 -05:00
Lluís Vilanova ed7225e685
tcg: Add generic translation framework
Backports commit bb2e0039dc07177f928f9fe24758967da02d60a2 from qemu
2018-03-04 14:31:16 -05:00