unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2024-12-23 14:15:39 +00:00

Author	SHA1	Message	Date
Peter Maydell	3bd5694a0a	memory: Rename memory_region_init_rom() and _rom_device() to _nomigrate() Rename memory_region_init_rom() to memory_region_init_rom_nomigrate() and memory_region_init_rom_device() to memory_region_init_rom_device_nomigrate(). Backports commit b59821a95bd1d7cb4697fd7748725c910582e0e7 from qemu	2018-03-03 22:29:01 -05:00
Peter Maydell	7b0027a828	memory: Rename memory_region_init_ram() to memory_region_init_ram_nomigrate() Rename memory_region_init_ram() to memory_region_init_ram_nomigrate(). This leaves the way clear for us to provide a memory_region_init_ram() which does handle migration. Backports commit 1cfe48c1ce219b60a9096312f7a61806fae64ab3 from qemu	2018-03-03 22:25:39 -05:00
Thomas Huth	cf5d583ef0	cpu: Introduce a wrapper for tlb_flush() that can be used in common code Commit 1f5c00cfdb8114c ("qom/cpu: move tlb_flush to cpu_common_reset") moved the call to tlb_flush() from the target-specific reset handlers into the common code qom/cpu.c file, and protected the call with "#ifdef CONFIG_SOFTMMU" to avoid that it is called for linux-user only targets. But since qom/cpu.c is common code, CONFIG_SOFTMMU is never defined here, so the tlb_flush() was simply never executed anymore. Fix it by introducing a wrapper for tlb_flush() in a file that is re-compiled for each target, i.e. in translate-all.c. Backports commit 2cd53943115be5118b5b2d4b80ee0a39c94c4f73 from qemu	2018-03-03 21:24:55 -05:00
Lioncash	0ef338aa71	Fix building for multi-arch targets	2018-03-03 21:14:08 -05:00
Emilio G. Cota	d3ada2feb5	tcg: allocate TB structs before the corresponding translated code Allocating an arbitrarily-sized array of tbs results in either (a) a lot of memory wasted or (b) unnecessary flushes of the code cache when we run out of TB structs in the array. An obvious solution would be to just malloc a TB struct when needed, and keep the TB array as an array of pointers (recall that tb_find_pc() needs the TB array to run in O(log n)). Perhaps a better solution, which is implemented in this patch, is to allocate TB's right before the translated code they describe. This results in some memory waste due to padding to have code and TBs in separate cache lines--for instance, I measured 4.7% of padding in the used portion of code_gen_buffer when booting aarch64 Linux on a host with 64-byte cache lines. However, it can allow for optimizations in some host architectures, since TCG backends could safely assume that the TB and the corresponding translated code are very close to each other in memory. See this message by rth for a detailed explanation: https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html Subject: Re: GSoC 2017 Proposal: TCG performance enhancements Backports commit 6e3b2bfd6af488a896f7936e99ef160f8f37e6f2 from qemu	2018-03-03 17:05:49 -05:00
Emilio G. Cota	8f4f15e5f5	tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr Instead of exporting goto_ptr directly to TCG frontends, export tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer returned by the lookup_tb_ptr() helper. This is the only use case we have for goto_ptr and lookup_tb_ptr, so having this function is very convenient. Furthermore, it trivially allows us to avoid calling the lookup helper if goto_ptr is not implemented by the backend. Backports commit cedbcb01529cb6cf9a2289cdbebbc63f6149fc18 from qemu	2018-03-02 21:05:18 -05:00
Dr. David Alan Gilbert	55d79cf4c0	RAMBlocks: qemu_ram_is_shared Provide a helper to say whether a RAMBlock was created as a shared mapping. Backports commit 463a4ac23bcf0f0b65c850fa66f5ae6e43edd243 from qemu	2018-03-02 13:05:35 -05:00
Lioncash	18a229a69f	Resolve symbol errors with softfloat	2018-03-02 09:25:05 -05:00
KONRAD Frederic	c5730ff194	tcg: add options for enabling MTTCG We know there will be cases where MTTCG won't work until additional work is done in the front/back ends to support. It will however be useful to be able to turn it on. As a result MTTCG will default to off unless the combination is supported. However the user can turn it on for the sake of testing. Backports commit 8d4e9146b3568022ea5730d92841345d41275d66 from qemu	2018-03-02 09:25:01 -05:00
Paul Burton	411ddd16cf	target-mips: Provide function to test if a CPU supports an ISA Provide a new cpu_supports_isa function which allows callers to determine whether a CPU supports one of the ISA_ flags, by testing whether the associated struct mips_def_t sets the ISA flags in its insn_flags field. An example use of this is to allow boards which generate bootloader code to determine the properties of the CPU that will be used, for example whether the CPU is 64 bit or which architecture revision it implements. Backports commit bed9e5ceb158c886d548fe59675a6eba18baeaeb from qemu	2018-03-02 08:20:19 -05:00
Julian Brown	cc217b0c90	arm: Correctly handle watchpoints for BE32 CPUs In BE32 mode, sub-word size watchpoints can fail to trigger because the address of the access is adjusted in the opcode helpers before being compared with the watchpoint registers. This patch reverses the address adjustment before performing the comparison with the help of a new CPUClass hook. This version of the patch augments and tidies up comments a little. Backports commit 40612000599e52e792d23c998377a0fa429c4036 from qemu	2018-03-02 00:24:33 -05:00
Jean-Christophe DUBOIS	0aa0b849c2	ARM: Factor out ARM on/off PSCI control functions Split ARM on/off function from PSCI support code. This will allow these functions to be reused in other code. Backports commit 825482adde1f971cbddf27e15fb4453ab3fae994 from qemu	2018-03-01 23:31:47 -05:00
Artyom Tarasenko	0a124b2199	target-sparc: implement UA2005 GL register Backports commit cbc3a6a4cc675516328a2b0d3602355d68b6302d from qemu	2018-03-01 21:24:09 -05:00
Richard Henderson	4bec129626	tcg/i386: Handle ctpop opcode Backports commit 993508e43e6d180e9ba9b747a9657eac69aec5bb from qemu	2018-03-01 18:49:43 -05:00
Richard Henderson	5f6e7bbdbd	tcg: Add opcode for ctpop The number of actual invocations of ctpop itself does not warrant an opcode, but it is very helpful for POWER7 to use in generating an expansion for ctz. Backports commit a768e4e99247911f00c5c0267c12d4e207d5f6cc from qemu	2018-03-01 18:26:41 -05:00
Richard Henderson	01b3c6273a	target-arm: Use clrsb helper Backports commit bc21dbcc1203ae6bb536f832c46a3b5e22a73451 from qemu	2018-03-01 18:16:56 -05:00
Richard Henderson	fff7ca4617	tcg: Add helpers for clrsb The number of actual invocations does not warrant an opcode, and the backends generating it. But at least we can eliminate redundant helpers. Backports commit 086920c2c8008f125fd38781072fa25c3ad158ea from qemu	2018-03-01 18:14:11 -05:00
Richard Henderson	9cde8bfc44	target-arm: Use clz opcode Backports commit 7539a012f614b724426ac9360238f3281d928a3f from qemu	2018-03-01 16:13:26 -05:00
Richard Henderson	9b2752b0a9	target-mips: Use clz opcode Backports commit 1a0196c5c7f197fad7b079074d587b3204bcfb0f from qemu	2018-03-01 16:08:19 -05:00
Richard Henderson	2cf34e1b55	tcg: Add clz and ctz opcodes Backports commit 0e28d0063bbd9e59a981ea2d20f82f30c5d956a8 from qemu	2018-03-01 16:04:11 -05:00
Richard Henderson	9f2fcaaf27	tcg: Add deposit_z expander While we don't require a new opcode, it is handy to have an expander that knows the first source is zero. Backports commit 07cc68d52852bf47dea7c402b46ddd28248d4212 from qemu	2018-03-01 13:29:24 -05:00
Richard Henderson	8e0585dcb1	tcg: Add field extraction primitives Adds tcg_gen_extract_* and tcg_gen_sextract_* for extraction of fixed position bitfields, much like we already have for deposit. Backports commit 7ec8bab3deae643b1ce579c2d65a244f30708330 from qemu	2018-03-01 13:21:30 -05:00
Jason Wang	fdca6292a1	exec: introduce address_space_get_iotlb_entry() This patch introduces a helper to query the iotlb entry for a possible iova. This will be used by later device IOTLB API to enable the capability for a dataplane (e.g vhost) to query the IOTLB. Backports commit 052c8fa9983f553fdfa0d61034774070dd639c2b from qemu	2018-03-01 13:05:08 -05:00
Paolo Bonzini	81ad780e5e	exec: introduce MemoryRegionCache Device models often have to perform multiple accesses to a single memory region that is known in advance, but would like to use "DMA-style" functions instead of address_space_map/unmap. This can happen for example when the data has to undergo endianness conversion. Introduce a new data structure to cache the result of address_space_translate without forcing usage of a host address like address_space_map does. Backports commit 1f4e496e1fc2eb6c8bf377a0f9695930c380bfd3 from qemu	2018-03-01 10:50:30 -05:00
Richard Henderson	f5a35908da	tcg: Add tcg_gen_mulsu2_{i32,i64,tl} This multiply has one signed input and one unsigned input, producing the full double-width result. Backports commit 5087abfb7dfd1d368ae6939420057036b4d8e509 from qemu	2018-03-01 08:39:37 -05:00
Richard Henderson	eec264526e	target-sparc: Implement ldqf and stqf inline At the same time, fix a problem with stqf_asi, when a write might access two pages. Backports commit f939ffe5a022a8798824e2720ed5a14186fca6b6 from qemu	2018-03-01 08:20:36 -05:00
Richard Henderson	3c48eb4aaf	target-sparc: Implement cas_asi/casx_asi inline Backports commit 7268adebfda6548b8ae6865dc8337f116a5d266d from qemu	2018-02-28 12:47:26 -05:00
Richard Henderson	9e60a8e432	target-sparc: Introduce cpu_raise_exception_ra Several helpers call helper_raise_exception directly, which requires in turn that their callers have performed save_state. The new function allows a TCG return address to be passed in so that we can restore PC + NPC + flags data from that. This fixes a bug in the usage of helper_check_align, whose callers had not been calling save_state. It fixes another bug in which the divide helpers used GETPC at a level other than the direct callee from TCG. This allows the translator to avoid save_state prior to SAVE, RESTORE, and FLUSHW instructions. Backports commit 2f9d35fc4006122bad33f9ae3e2e51d2263e98ee from qemu	2018-02-28 12:15:06 -05:00
Emilio G. Cota	cb92eea81a	target-arm: emulate aarch64's LL/SC using cmpxchg helpers Emulating LL/SC with cmpxchg is not correct, since it can suffer from the ABA problem. Portable parallel code, however, is written assuming only cmpxchg--and not LL/SC--is available. This means that in practice emulating LL/SC with cmpxchg is a viable alternative. The appended emulates LL/SC pairs in aarch64 with cmpxchg helpers. This works in both user and system mode. In usermode, it avoids pausing all other CPUs to perform the LL/SC pair. The subsequent performance and scalability improvement is significant, as the plots below show. They plot the throughput of atomic_add-bench compiled for ARM and executed on a 64-core x86 machine. Hi-res plots: http://imgur.com/a/JVc8Y [ASCII plots omitted: atomic_add-bench throughput, 1000000 ops/thread, versus number of threads (0 to 60), cmpxchg compared with master, for the ranges [0,1], [0,2], [0,128], and [0,1024]] Backports commit 1dd089d0eec060dcd8478735114d98421d414805 from qemu	2018-02-28 00:21:27 -05:00
Richard Henderson	064543a415	tcg: Add CONFIG_ATOMIC64 Allow qemu to build on 32-bit hosts without 64-bit atomic ops. Even if we only allow 32-bit hosts to multi-thread emulate 32-bit guests, we still need some way to handle the 32-bit guest using a 64-bit atomic operation. Do so by dropping back to single-step. Backports commit df79b996a7b21c6ea7847f7927a2e1a294b86c72 from qemu	2018-02-27 22:25:36 -05:00
Richard Henderson	da01e53757	tcg: Add atomic128 helpers Force the use of cmpxchg16b on x86_64. Wikipedia suggests that only very old AMD64 (circa 2004) did not have this instruction. Further, it's required by Windows 8 so no new cpus will ever omit it. If we truly care about these, then we could check this at startup time and then avoid executing paths that use it. Backports commit 7ebee43ee3e2fcd7b5063058b7ef74bc43216733 from qemu	2018-02-27 21:43:48 -05:00
Richard Henderson	5c0ce1b99c	tcg: Add atomic helpers Add all of cmpxchg, op_fetch, fetch_op, and xchg. Handle both endian-ness, and sizes up to 8. Handle expanding non-atomically, when emulating in serial. Backports commit c482cb117cc418115ca9c6d21a7a2315414c0a40 from qemu	2018-02-27 15:57:47 -05:00
Yongbok Kim	79e4c001a9	softmmu: Add probe_write() Probe for whether the specified guest write access is permitted. If it is not permitted then an exception will be taken in the same way as if this were a real write access (and we will not return). Otherwise the function will return, and there will be a valid entry in the TLB for this access. Backports commit 3b4afc9e75ab1a95f33e41f462921093f8a109c4 from qemu	2018-02-27 12:20:50 -05:00
Richard Henderson	e35aacd5ae	tcg: Add EXCP_ATOMIC When we cannot emulate an atomic operation within a parallel context, this exception allows us to stop the world and try again in a serial context. Backports commit fdbc2b5722f6092e47181a947c90fd4bdcc1c121 from qemu Also backports parts of commit 02d57ea115b7669f588371c86484a2e8ebc369be	2018-02-27 11:57:58 -05:00
Daniel P. Berrange	83a5bf2d25	qapi: rename QmpOutputVisitor to QObjectOutputVisitor The QmpOutputVisitor has no direct dependency on QMP. It is valid to use it anywhere that one wants a QObject. Rename it to better reflect its functionality as a generic QAPI to QObject converter. The commit before previous renamed the files, this one renames C identifiers. Backports commit 7d5e199ade76c53ec316ab6779800581bb47c50a from qemu	2018-02-27 08:05:33 -05:00
Daniel P. Berrange	2949a90977	qapi: rename QmpInputVisitor to QObjectInputVisitor The QmpInputVisitor has no direct dependency on QMP. It is valid to use it anywhere that one has a QObject. Rename it to better reflect its functionality as a generic QObject to QAPI converter. The previous commit renamed the files, this one renames C identifiers. Backports commit 09e68369a88d7de0f988972bf28eec1b80cc47f9 from qemu	2018-02-26 15:54:15 -05:00
Peter Maydell	db8b0a82b1	cpu: Support a target CPU having a variable page size Support target CPUs having a page size which isn't known at compile time. To use this, the CPU implementation should: * define TARGET_PAGE_BITS_VARY * not define TARGET_PAGE_BITS * define TARGET_PAGE_BITS_MIN to the smallest value it might possibly want for TARGET_PAGE_BITS * call set_preferred_target_page_bits() in its realize function to indicate the actual preferred target page size for the CPU (and report any error from it) In CONFIG_USER_ONLY, the CPU implementation should continue to define TARGET_PAGE_BITS appropriately for the guest OS page size. Machines which want to take advantage of having the page size something larger than TARGET_PAGE_BITS_MIN must set the MachineClass minimum_page_bits field to a value which they guarantee will be no greater than the preferred page size for any CPU they create. Note that changing the target page size by setting minimum_page_bits is a migration compatibility break for that machine. For debugging purposes, attempts to use TARGET_PAGE_SIZE before it has been finally confirmed will assert. Backports commit 20bccb82ff3ea09bcb7c4ee226d3160cab15f7da from qemu	2018-02-26 12:29:08 -05:00
Vijaya Kumar K	a7229cc08a	translate-all.c: Compute L1 page table properties at runtime Stop computing the L1 page mapping table properties statically with macros that depend on TARGET_PAGE_BITS. Drop the V_L1_SIZE, V_L1_SHIFT, and V_L1_BITS macros and replace them with variables which are computed at an early stage of VM boot. Removing this dependency helps to make TARGET_PAGE_BITS dynamic. Backports commit 66ec9f49399f0a9fa13ee77c472caba0de2773fc from qemu	2018-02-26 11:46:58 -05:00
Thomas Hanson	2af4ca54e9	target-arm: Infrastructure changes to enable handling of tagged address loading into PC When capturing the current CPU state for the TB, extract the TBI0 and TBI1 values from the correct TCR for the current EL and then add them to the TB flags field. Then, at the start of code generation for the block, copy the TBI fields into the DisasContext structure. Backports commit 86fb3fa4ed5873b021a362ea26a021f4aeab1bb4 from qemu	2018-02-26 07:58:17 -05:00
Pranith Kumar	5e44ce9be8	Introduce TCGOpcode for memory barrier This commit introduces the TCGOpcode for memory barrier instruction. This opcode takes an argument which is the type of memory barrier which should be generated. Backports commit f65e19bc2c9e8358e634d309606144ac2a3c2936 from qemu	2018-02-26 03:02:41 -05:00
Alex Williamson	5db45219c9	memory: Replace skip_dump flag with ram_device Setting skip_dump on a MemoryRegion allows us to modify one specific code path, but the restriction we're trying to address encompasses more than that. If we have a RAM MemoryRegion backed by a physical device, it not only restricts our ability to dump that region, but also affects how we should manipulate it. Here we recognize that MemoryRegions do not change to sometimes allow dumps and other times not, so we replace setting the skip_dump flag with a new initializer so that we know exactly the type of region to which we're applying this behavior. Backports commit ca83f87a66d19fdaabf23d4f5ebb49396fe232c1 from qemu	2018-02-25 23:00:45 -05:00
Richard Henderson	ede1cae3dc	tcg: Lower indirect registers in a separate pass Rather than rely on recursion during the middle of register allocation, lower indirect registers to loads and stores off the indirect base into plain temps. For an x86_64 host, with sufficient registers, this results in identical code, modulo the actual register assignments. For an i686 host, with insufficient registers, this means that temps can be (temporarily) spilled to the stack in order to satisfy an allocation. This as opposed to the possibility of not being able to spill, to allocate a register for the indirect base, in order to perform a spill. Backports commit 5a18407f55ade924aa6397c9a043a9ffd59645fe from qemu	2018-02-25 22:32:28 -05:00
Lioncash	17c54e2702	header_gen: alphabetize general symbols	2018-02-25 19:07:20 -05:00
Lioncash	4b8cae3f61	header_gen: alphabetize ARM symbols	2018-02-25 19:00:31 -05:00
Lioncash	fa10382007	header_gen: alphabetize aarch64 symbols	2018-02-25 19:00:01 -05:00
Lioncash	3f8802fcf5	header_gen: alphabetize MIPS symbols	2018-02-25 18:59:49 -05:00
Richard Henderson	12eecc4939	target-sparc: Use explicit writes to cpu_fsr By arranging for explicit writes to cpu_fsr after floating point operations, we are able to mark the helpers as not writing to tcg globals, which means that we don't need to invalidate the integer register set across said calls. Backports commit 7385aed20db5d83979f683b9d0048674411e963c from qemu	2018-02-25 18:55:07 -05:00
Leon Alrae	c0b3938b88	target-mips: add exception base to MIPS CPU Replace hardcoded 0xbfc00000 with exception_base which is initialized with this default address so there is no functional change here. However, it is now exposed and consequently it will be possible to modify it from outside of the CPU. Backports commit 89777fd10fc3dd573c3b4d1b2efdd10af823c001 from qemu	2018-02-25 03:22:10 -05:00
Peter Maydell	334e951ec1	memory: Provide memory_region_init_rom() Provide a new helper function memory_region_init_rom() for memory regions which are read-only (and unlike those created by memory_region_init_rom_device() don't have special behaviour for writes). This has the same behaviour as calling memory_region_init_ram() and then memory_region_set_readonly() (which is what we do today in boards with pure ROMs) but is a more easily discoverable API for the purpose. Backports commit a1777f7f6462c66e1ee6e98f0d5c431bfe988aa5 from qemu	2018-02-25 00:28:17 -05:00
Aleksandar Markovic	84b516d9db	target-mips: Add nan2008 flavor of <CEIL|CVT|FLOOR|ROUND|TRUNC>.<L|W>.<S|D> New set of helpers for handling nan2008-style versions of instructions <CEIL|CVT|FLOOR|ROUND|TRUNC>.<L|W>.<S|D>, for Mips R6. All involved instructions have float operand and integer result. Their core functionality is implemented via invocations of appropriate SoftFloat functions. The problematic cases are when the operand is a NaN, and also when the operand (float) is out of the range of the result. Here one can distinguish three cases: CASE MIPS-A: (FCR31.NAN2008 == 1) 1. Operand is a NaN, result should be 0; 2. Operand is larger than INT_MAX, result should be INT_MAX; 3. Operand is smaller than INT_MIN, result should be INT_MIN. CASE MIPS-B: (FCR31.NAN2008 == 0) 1. Operand is a NaN, result should be INT_MAX; 2. Operand is larger than INT_MAX, result should be INT_MAX; 3. Operand is smaller than INT_MIN, result should be INT_MAX. CASE SoftFloat: 1. Operand is a NaN, result is INT_MAX; 2. Operand is larger than INT_MAX, result is INT_MAX; 3. Operand is smaller than INT_MIN, result is INT_MIN. Current implementation of <CEIL|CVT|FLOOR|ROUND|TRUNC>.<L|W>.<S|D> implements case MIPS-B. This patch relates to case MIPS-A. For case MIPS-A, only return value for NaN-operands should be corrected after appropriate SoftFloat library function is called. Related MSA instructions FTRUNC_S and FTINT_S already handle well all cases, in the fashion similar to the code from this patch. Backports commit 87552089b62fa229d2ff86906e4e779177fb5835 from qemu	2018-02-24 21:14:04 -05:00

164 commits