unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2024-12-23 13:25:31 +00:00

Author	SHA1	Message	Date
Laurent Vivier	0b62df7f30	target/m68k: add fmovecr fmovecr moves a floating point constant from the FPU ROM to a floating point register. Backports commit 9d403660d91229922c2786e81c23cc9dd8e644f1 from qemu	2018-03-03 20:51:21 -05:00
Laurent Vivier	ed3e8ab460	target/m68k: add fscc. use DisasCompare with FPU conditions in fscc and fbcc. Backports commit dd337bf86214e2436833d9442c995df95b136190 from qemu	2018-03-03 20:43:08 -05:00
Greg Kurz	a125b35f1f	qapi: add explicit null to string input and output visitors This may be used for deprecated object properties that are kept for backwards compatibility. Backports commit a733371214b68881d84725a3c71f60e2faf3b8e2 from qemu	2018-03-03 20:32:50 -05:00
KONRAD Frederic	18020c2c79	cputlb: cleanup get_page_addr_code to use VICTIM_TLB_HIT This replaces env1 and page_index variables by env and index so we can use VICTIM_TLB_HIT macro later. Backports commit 3416343255cbe01fbe12e5e36cd4bb5042425b27 from qemu	2018-03-03 19:54:13 -05:00
Laurent Vivier	f7ef6b49a8	target-m68k: add FPCR and FPSR Backports commit ba62494483ab51ee31c70952b6ce5171a31860b1 from qemu	2018-03-03 19:51:31 -05:00
Laurent Vivier	1c6b1e2b9f	target-m68k: use floatx80 internally Coldfire uses float64, but 680x0 use floatx80. This patch introduces the use of floatx80 internally and enables 680x0 80bits FPU. Backports commit f83311e4764f1f25a8abdec2b32c64483be1759b from qemu	2018-03-03 19:35:17 -05:00
Laurent Vivier	92555a1134	target-m68k: initialize FPU registers on reset, set FP registers to NaN and control registers to 0 Backports commit f4a6ce5155aab2a7ed7b9032a72187b37b3bfffe from qemu	2018-03-03 18:51:37 -05:00
Laurent Vivier	d92621522a	target-m68k: move fmove CR to a function Move code of fmove to/from control register to a function Backports commit 860b9ac779615fe9315cd58165652052ac165a92 from qemu	2018-03-03 18:49:49 -05:00
Marc-André Lureau	ca25248ecd	object: add uint property setter/getter Backports commit 3152779cd63ba41331ef41659406f65b03e7911a from qemu	2018-03-03 18:43:17 -05:00
Marc-André Lureau	fef464c4cb	qapi: update the qobject visitor to use QNUM_U64 Switch to use QNum/uint where appropriate to remove i64 limitation. The input visitor will cast i64 input to u64 for compatibility reasons (existing json QMP client already use negative i64 for large u64, and expect an implicit cast in qemu). Note: before the patch, uint64_t values above INT64_MAX are sent over json QMP as negative values, e.g. UINT64_MAX is sent as -1. After the patch, they are sent unmodified. Clearly a bug fix, but we have to consider compatibility issues anyway. libvirt should cope fine, because its parsing of unsigned integers accepts negative values modulo 2^64. There's hope that other clients will, too. Backports commit 5923f85fb82df7c8c60a89458a5ae856045e5ab1 from qemu	2018-03-03 18:40:51 -05:00
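The compatibility cast described in the entry above relies on C's modulo-2^64 conversion from signed to unsigned integers. A minimal sketch of that arithmetic follows; `compat_i64_to_u64` is a hypothetical name for illustration, not a QEMU function.

```c
#include <stdint.h>

/* Hypothetical helper (not part of QEMU): reinterpret a signed 64-bit
 * value received from a legacy client as the unsigned value it stands for.
 * In C, converting to uint64_t is defined modulo 2^64, so -1 maps to
 * UINT64_MAX and non-negative inputs are unchanged. */
uint64_t compat_i64_to_u64(int64_t wire_value)
{
    return (uint64_t)wire_value;
}
```

Under this sketch a client that still sends -1 for UINT64_MAX gets the value it intended, which matches the behaviour the message attributes to libvirt's parser.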
Marc-André Lureau	6ca6050206	qnum: add uint type In order to store integer values between INT64_MAX and UINT64_MAX, add a uint64_t internal representation. Backports commit 61a8f418b26a2d974e38e4ae55020aca8d402d88 from qemu	2018-03-03 18:37:56 -05:00
Marc-André Lureau	a57d8a5b50	qapi: Remove visit_start_alternate() parameter promote_int Before the previous commit, parameter promote_int = true made visit_start_alternate() with an input visitor avoid QTYPE_QINT variants and create QTYPE_QFLOAT variants instead. This was used where QTYPE_QINT variants were invalid. The previous commit fused QTYPE_QINT with QTYPE_QFLOAT, rendering promote_int useless and unused. Backports commit 60390d2dc85ffade8981ca41e02335cb07353a6d from qemu	2018-03-03 18:34:35 -05:00
Lioncash	a6623ce754	qapi: Update scripts to commit 01b2ffcedd94ad7b42bc870e4c6936c87ad03429	2018-03-03 18:32:12 -05:00
Marc-André Lureau	dd77730d49	qapi: merge QInt and QFloat in QNum We would like to use a same QObject type to represent numbers, whether they are int, uint, or floats. Getters will allow some compatibility between the various types if the number fits other representations. Add a few more tests while at it. Backports commit 01b2ffcedd94ad7b42bc870e4c6936c87ad03429 from qemu	2018-03-03 18:16:28 -05:00
Marc-André Lureau	f1dbfe6be6	qapi: Clean up qobject_input_type_number() control flow Use the more common pattern to error out. Backports commit 58634047b7deeab36e4b07c4744e44d698975561 from qemu	2018-03-03 17:40:45 -05:00
Markus Armbruster	d70f3bfc6b	qobject-input-visitor: Document full_name_nth() Backports commit 6c02258e143700314ebf268dae47eb23db17d1cf from qemu	2018-03-03 17:39:09 -05:00
Markus Armbruster	0d433af617	qobject-input-visitor: Catch misuse of end_struct vs. end_list Backports commit 8b2e41d733850ec6a67a85743138e023cbb8921b from qemu	2018-03-03 17:38:16 -05:00
Markus Armbruster	e9174563be	qapi: Document intended use of @name within alternate visits Backports commit ed0ba0f47e8cb6d924db0a54090bbb7b095fe9ea from qemu	2018-03-03 17:37:12 -05:00
Markus Armbruster	5ab0d5af81	qapi: New QAPI_CLONE_MEMBERS() QAPI_CLONE() returns a newly allocated QAPI object. Inconvenient when we want to clone into an existing object. QAPI_CLONE_MEMBERS() does exactly that. Backports commit 4626a19c86c30d96cedbac2bd44ef8103303cb37 from qemu	2018-03-03 17:36:02 -05:00
Eric Blake	734778da93	qobject: Add helper macros for common scalar insertions Rather than making lots of callers wrap a scalar in a QInt, QString, or QBool, provide helper macros that do the wrapping automatically. Update the Coccinelle script to make mass conversions easy, although the conversion itself will be done as separate patches to ease review and backport efforts. Backports commit a92c21591b5bb9543996538f14854ca6b528318b from qemu	2018-03-03 17:33:30 -05:00
Markus Armbruster	09efe97bfd	qapi: Fix string input visitor regression for empty lists Visiting a list when input is the empty string should result in an empty list, not an error. Noticed when commit 3d089ce belatedly added tests, but simply accepted as weird then. It's actually a regression: broken in commit 74f24cb, v2.7.0. Fix it, and throw in another test case for empty string. Backports commit d2788227c6185c72d88ef3127e9fed41686f8e39 from qemu	2018-03-03 17:30:42 -05:00
Markus Armbruster	247a511c4a	qapi: Factor out common part of qobject input visitor creation Backports commit abe81bc21a6996c62e66ed2d051373c0df24f870 from qemu	2018-03-03 17:26:27 -05:00
Marc-André Lureau	c4e0911f95	object: fix potential leak in getters If the property is not of the requested type, the getters will leak a QObject. Backports commit 560f19f162529d691619ac69ed032321c7f5f1fb from qemu	2018-03-03 17:22:32 -05:00
Richard Henderson	42bb73fa96	target/arm: Exit after clearing aarch64 interrupt mask Exit to cpu loop so we reevaluate cpu_arm_hw_interrupts. Backports commit 8da54b2507c1cabf60c2de904cf0383b23239231 from qemu	2018-03-03 17:19:40 -05:00
Richard Henderson	dd1473f582	tcg: Increase hit rate of lookup_tb_ptr We can call tb_htable_lookup even when the tb_jmp_cache is completely empty. Therefore, un-nest most of the code dependent on tb != NULL from the read from the cache. This improves the hit rate of lookup_tb_ptr; for instance, when booting and immediately shutting down debian-arm, the hit rate improves from 93.2% to 99.4%. Backports commit b97a879de980e99452063851597edb98e7e8039c from qemu	2018-03-03 17:16:23 -05:00
Richard Henderson	9ec975448b	tcg/arm: Use ldr (literal) for goto_tb The new placement of the TB means that we can use one insn to load the goto_tb destination directly from the TB. Backports commit 308714e6bc945389c64faf1b9213e2c0d3f03391 from qemu	2018-03-03 17:14:27 -05:00
Richard Henderson	c99edca63b	tcg/arm: Try pc-relative addresses for movi Backports commit 9c39b94f1448770e7e573e9516d2483816785d1b from qemu	2018-03-03 17:13:31 -05:00
Richard Henderson	a5133ccaa1	tcg/arm: Remove limit on code buffer size Since we're no longer using a direct branch, we have no limit on the branch distance. Backports commit acb0b292b6d0f49972dc98f742e79ed53973e438 from qemu	2018-03-03 17:11:47 -05:00
Richard Henderson	68275ba6f3	tcg/arm: Use indirect branch for goto_tb Backports commit 3fb53fb4d12f2e7833bd1659e6013237b130ef20 from qemu	2018-03-03 17:11:18 -05:00
Richard Henderson	9a85cb0a26	tcg/aarch64: Use ADR in tcg_out_movi The new placement of the TB means that we can use one insn to load the return value for exit_tb returning the TB pointer. Backports commit cc74d332ff9a78684374847375ef63fc4bd10436 from qemu	2018-03-03 17:09:42 -05:00
Emilio G. Cota	f50e6cfa11	translate-all: consolidate tb init in tb_gen_code We are partially initializing tb in tb_alloc. Instead, fully initialize it in tb_gen_code, which is tb_alloc's only caller. This saves an unnecessary write to tb->cflags. Backports commit 2b48e10f888059a98043b4816769fa2a326a1d2c from qemu	2018-03-03 17:08:21 -05:00
Emilio G. Cota	d3ada2feb5	tcg: allocate TB structs before the corresponding translated code Allocating an arbitrarily-sized array of tbs results in either (a) a lot of memory wasted or (b) unnecessary flushes of the code cache when we run out of TB structs in the array. An obvious solution would be to just malloc a TB struct when needed, and keep the TB array as an array of pointers (recall that tb_find_pc() needs the TB array to run in O(log n)). Perhaps a better solution, which is implemented in this patch, is to allocate TB's right before the translated code they describe. This results in some memory waste due to padding to have code and TBs in separate cache lines--for instance, I measured 4.7% of padding in the used portion of code_gen_buffer when booting aarch64 Linux on a host with 64-byte cache lines. However, it can allow for optimizations in some host architectures, since TCG backends could safely assume that the TB and the corresponding translated code are very close to each other in memory. See this message by rth for a detailed explanation: https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html Subject: Re: GSoC 2017 Proposal: TCG performance enhancements Backports commit 6e3b2bfd6af488a896f7936e99ef160f8f37e6f2 from qemu	2018-03-03 17:05:49 -05:00
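The layout described in the entry above (a TB descriptor placed right before its translated code, each padded to its own cache line) can be sketched as a bump allocator. Everything here is an assumption for illustration: `SketchTB`, `SketchBuffer` and `sketch_tb_alloc` are hypothetical names, the descriptor is far smaller than QEMU's real `TranslationBlock`, and the buffer base is assumed to be aligned to the cache line size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a TB descriptor; not QEMU's TranslationBlock. */
typedef struct {
    uint64_t pc;        /* guest address this block translates */
    uint8_t *code;      /* start of the translated host code */
    size_t code_size;   /* bytes reserved for that code */
} SketchTB;

/* Hypothetical code buffer with a bump pointer. */
typedef struct {
    uint8_t *base;      /* buffer start, assumed aligned to `line` */
    size_t size;        /* total bytes in the buffer */
    size_t used;        /* bytes handed out so far */
    size_t line;        /* host cache line size, assumed a power of two */
} SketchBuffer;

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Reserve a descriptor and, on the next cache line boundary after it,
 * code_size bytes for the code it describes. Returns NULL when the
 * buffer cannot hold both, which is where a real allocator would flush. */
SketchTB *sketch_tb_alloc(SketchBuffer *buf, uint64_t pc, size_t code_size)
{
    size_t tb_off = round_up(buf->used, buf->line);
    size_t code_off = round_up(tb_off + sizeof(SketchTB), buf->line);

    if (code_off > buf->size || code_size > buf->size - code_off) {
        return NULL;
    }

    SketchTB *tb = (SketchTB *)(buf->base + tb_off);
    tb->pc = pc;
    tb->code = buf->base + code_off;
    tb->code_size = code_size;
    buf->used = code_off + code_size;
    return tb;
}
```

With 64-byte lines each descriptor sits exactly one line before its code, so a backend could reach the descriptor from the code with a short pc-relative offset. The bytes skipped by `round_up` are the padding cost the message measures.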
Emilio G. Cota	8e58c67968	util: add cacheinfo Add helpers to gather cache info from the host at init-time. For now, only export the host's I/D cache line sizes, which we will use to improve cache locality to avoid false sharing. Backports commit b255b2c8a5484742606e8760870ba3e14d0c9605 from qemu	2018-03-03 16:58:28 -05:00
Laurent Vivier	da4d407317	target-m68k: define ext_opsize Backports commit 69e698220f68a17ce9584b068f68ed09e527a6ad from qemu	2018-03-03 15:05:55 -05:00
Laurent Vivier	409369a7ce	target-m68k: move FPU helpers to fpu_helper.c Backports commit c88f8107b14456d514b00571b0675cb532e82cad from qemu	2018-03-03 15:04:05 -05:00
Laurent Vivier	199c62ea01	softfloat: define 680x0 specific values Backports commit e5b0cbe8e8744b57faf0c62d023525cd466f5ab8 from qemu	2018-03-03 15:01:16 -05:00
Laurent Vivier	68c9ab9b77	target/m68k: fix V flag for CC_OP_SUBx V flag for subtraction is: v = (res ^ src1) & (src1 ^ src2) (see COMPUTE_CCR() in target/m68k/helper.c) But gen_flush_flags() uses: v = (res ^ src2) & (src1 ^ src2) The problem has been found with the following program: .global _start _start: move.l #-2147483648,%d0 subq.l #1,%d0 jvc 1f move.l #1,%d1 move.l #1,%d0 trap #0 1: move.l #0,%d1 move.l #1,%d0 trap #0 It works fine (exit(1)) on real hardware, and with "-singlestep". "-singlestep" uses gen_helper_flush_flags(), whereas without "-singlestep", V flag is computed directly in gen_flush_flags(). This patch updates gen_flush_flags() to have the same result as with gen_helper_flush_flags(). Backports commit 043b936ef6fe53396b3c6b8f5562ea3e238a071d from qemu	2018-03-03 14:59:20 -05:00
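The two V-flag formulas quoted in the entry above can be compared directly on the failing case from the test program (0x80000000 minus 1). This is a sketch under stated assumptions: `src1` is taken to be the minuend and `src2` the subtrahend, the operation is 32 bits wide, and V is read from bit 31 of the expression. The function names are hypothetical.

```c
#include <stdint.h>

/* V flag for res = src1 - src2 using the formula the commit message
 * gives as correct: v = (res ^ src1) & (src1 ^ src2). */
int sub_overflow_fixed(uint32_t src1, uint32_t src2)
{
    uint32_t res = src1 - src2;
    return (int)((((res ^ src1) & (src1 ^ src2)) >> 31) & 1u);
}

/* Formula the message attributes to the old gen_flush_flags():
 * v = (res ^ src2) & (src1 ^ src2). Kept only for comparison. */
int sub_overflow_old(uint32_t src1, uint32_t src2)
{
    uint32_t res = src1 - src2;
    return (int)((((res ^ src2) & (src1 ^ src2)) >> 31) & 1u);
}
```

For `src1 = 0x80000000` and `src2 = 1` the first function reports overflow and the second does not, which is consistent with `jvc` being taken only in the buggy path.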
Mihail Abakumov	e1c2fac129	i386: fix read/write cr with icount option Running Windows with icount causes a crash on the instruction that writes cr. This patch fixes it. Reading and writing cr cause an icount read because they call the cpu_get_apic_tpr and cpu_set_apic_tpr functions, so gen_io_start()/gen_io_end() calls are needed. Backports commit 5b003a40bb1ab14d0398e91f03393d3c6b9577cd from qemu	2018-03-03 14:56:18 -05:00
Paolo Bonzini	741ff79e23	target/i386: use multiple CPU AddressSpaces This speeds up SMM switches. Later on it may remove the need to take the BQL, and it may also allow to reuse code between TCG and KVM. Backports commit f8c45c6550b9ff1e1f0b92709ff3213a79870879 from qemu	2018-03-03 14:53:47 -05:00
Paolo Bonzini	710f393c13	target/i386: enable A20 automatically in system management mode Ignore env->a20_mask when running in system management mode. Backports commit c8bc83a4dd29a9a33f5be81686bfe6e2e628097b from qemu	2018-03-03 14:33:09 -05:00
Peter Xu	fb8d3e2f6a	exec: simplify phys_page_find() params It really only plays with the dispatchers, so the parameter list does not need that complexity. This helps for readability at least. Backports commit 003a0cf2cd1828a1141a874428571267b117f765 from qemu	2018-03-03 14:28:25 -05:00
Laurent Vivier	ce25609ed3	target/m68k: implement rtd Add "Return and Deallocate" (rtd) instruction. RTD #d (SP) -> PC SP + 4 + d -> SP Backports commit 18059c9e1648bf4fc5c7c1bae6f54690742b05ba from qemu	2018-03-03 14:27:01 -05:00
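The register transfer in the entry above, (SP) -> PC followed by SP + 4 + d -> SP, can be modelled on a toy CPU state. This is a sketch only: `SketchCPU`, `load_be32` and `sketch_rtd` are hypothetical names, memory is a flat byte array with no bounds checking, and the displacement is treated as a sign-extended 16-bit value.

```c
#include <stdint.h>

/* Hypothetical minimal CPU state; not QEMU's CPUM68KState. */
typedef struct {
    uint32_t pc;
    uint32_t sp;
} SketchCPU;

/* Read a big-endian 32-bit value from a flat memory image. */
static uint32_t load_be32(const uint8_t *mem, uint32_t addr)
{
    return ((uint32_t)mem[addr] << 24) | ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] << 8) | (uint32_t)mem[addr + 3];
}

/* RTD #d: pop the return address into PC, then release d extra bytes
 * of stack (typically the caller's arguments). */
void sketch_rtd(SketchCPU *cpu, const uint8_t *mem, int16_t disp)
{
    cpu->pc = load_be32(mem, cpu->sp);
    cpu->sp = cpu->sp + 4u + (uint32_t)(int32_t)disp;
}
```

Compared with a plain return, the only difference in this model is the extra `disp` added to the stack pointer after the pop.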
Aurelien Jarno	2c49a6b2f6	target/mips: optimize indirect branches Backports commit e350d8ca3ac7e31c6af71a4ab74d2442dfefc697 from qemu	2018-03-03 14:23:58 -05:00
Aurelien Jarno	8ce8d4fe20	target/mips: optimize cross-page direct jumps in softmmu Backports commit d9a9acde64b862107933f9e9a01435e51bf8f91b from qemu	2018-03-03 14:23:25 -05:00
Emilio G. Cota	baa0983ae3	target/aarch64: optimize indirect branches Measurements: [Baseline performance is that before applying this and the previous commit] - NBench, aarch64-softmmu. Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup per NBench test (ASSIGNMENT, BITFIELD, FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT, STRING SORT, hmean); y-axis 0.9x to 1.7x; series: cross, cross+jr] png: http://imgur.com/qO9ubtk NB. cross here represents the previous commit. - SPECint06 (test set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, hmean); y-axis 0.9x to 1.5x; series: jr] png: http://imgur.com/3Dp4vvq - SPECint06 (train set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, hmean); y-axis 1x to 1.7x; series: jr] png: http://imgur.com/vRrdc9j Backports commit e75449a346bf558296966a44277bfd93412c6da6 from qemu	2018-03-03 14:22:12 -05:00
Emilio G. Cota	83ea5b72f2	target/aarch64: optimize cross-page direct jumps in softmmu Perf numbers in next commit's log. Backports commit e78722368c721f3c5b8109ed525adac1653ae97b from qemu	2018-03-03 14:20:55 -05:00
Aurelien Jarno	0e9d3d1943	tcg/mips: implement goto_ptr Backports commit 5786e0683c4f8170dd05a550814b8809d8ae6d86 from qemu	2018-03-03 14:19:46 -05:00
Richard Henderson	1d6c4f1a42	tcg/arm: Implement goto_ptr Backports commit 085c648bef7301eabe7d4a3301c8d012ae4423b8 from qemu	2018-03-03 14:18:41 -05:00
Richard Henderson	3b02642372	tcg/arm: Clarify tcg_out_bx for arm4 host In theory this would re-enable usage of QEMU on an armv4 host. Whether this is worthwhile is debatable -- we've been unconditionally issuing the armv5t BX instruction in the prologue since 2011 without complaint. Possibly we should simply require an armv6 host. Backports commit 702a947484eb3e615183dafc93de590ab0679f60 from qemu	2018-03-03 14:17:13 -05:00
Richard Henderson	d496bb6150	tcg/s390: Implement goto_ptr Backports commit 46644483cae978c734460131bb1d9071f813b287 from qemu	2018-03-03 14:16:03 -05:00
Richard Henderson	f0420c3427	tcg/sparc: Implement goto_ptr Backports commit 38f81dc5938fb7025531c5ed602afd41fef799a7 from qemu	2018-03-03 14:14:32 -05:00
Richard Henderson	81f1aae572	tcg/aarch64: Implement goto_ptr Measurements: SPECint06 (test set), x86_64-linux-user. Host: APM 64-bit ARMv8 (Atlas/A57) @ 2.4 GHz [ASCII bar chart omitted: speedup per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, hmean); y-axis 1x to 1.45x; series: +goto-ptr] png: http://imgur.com/en9HE8L Backports commit b19f0c2e7d344d4d62daf554951acdb6c94a34b0 from qemu	2018-03-03 14:13:09 -05:00
Emilio G. Cota	7d0440dec4	tb-hash: improve tb_jmp_cache hash function in user mode Optimizations to cross-page chaining and indirect branches make performance more sensitive to the hit rate of tb_jmp_cache. The constraint of reserving some bits for the page number lowers the achievable quality of the hashing function. However, user-mode does not have this requirement. Thus, with this change we use for user-mode a hashing function that is both faster and of better quality than the previous one. Measurements: Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0. - SPECint06 (test set), x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz [ASCII bar chart omitted: speedup per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, hmean); y-axis 0.8x to 2.2x; series: jr, jr+multhash, jr+hash] png: http://imgur.com/4UXTrEc Here I also tried the hash function suggested by Paolo ("multhash"): return ((uint64_t) (pc * 2654435761) >> 32) & (TB_JMP_CACHE_SIZE - 1); As you can see it is just as good as the other new function ("hash"), which is what I ended up going with. - SPECint06 (train set), x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz [ASCII bar chart omitted: speedup per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, hmean); y-axis 1x to 2.6x; series: jr, jr+hash] png: http://imgur.com/ArCbHqo - NBench, x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz [ASCII bar chart omitted: speedup per NBench test (ASSIGNMENT, BITFIELD, FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT, STRING SORT, hmean); y-axis 0.96x to 1.12x; series: jr, jr+hash] png: http://imgur.com/ZXFX0hJ - NBench, arm-linux-user. Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup per NBench test (ASSIGNMENT, BITFIELD, FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT, STRING SORT, hmean); y-axis 0.9x to 1.3x; series: jr, jr+hash] png: http://imgur.com/FfD27ey Backports commit 6f1653180f5701c6a8f1b35b89a80b1e3260928e from qemu	2018-03-03 14:11:29 -05:00
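The "multhash" variant mentioned in the entry above is a multiplicative hash: multiply the pc by the constant 2654435761, take the upper half of the 64-bit product, and mask it to the cache size. A minimal sketch follows. The 12-bit cache size and the `uint64_t` pc type are assumptions made here, and the `SKETCH_` names are hypothetical stand-ins for QEMU's own definitions.

```c
#include <stdint.h>

/* Assumed cache geometry for illustration; the real TB_JMP_CACHE_SIZE
 * is defined by QEMU and may differ. */
#define SKETCH_JMP_CACHE_BITS 12
#define SKETCH_JMP_CACHE_SIZE (1u << SKETCH_JMP_CACHE_BITS)

/* Multiplicative hash of a guest pc into a jump-cache slot index.
 * The product wraps modulo 2^64; bits 32 and up are then masked
 * down to the cache size. */
unsigned sketch_multhash(uint64_t pc)
{
    uint64_t product = pc * 2654435761u;
    return (unsigned)((product >> 32) & (SKETCH_JMP_CACHE_SIZE - 1));
}
```

Because the index is taken from the upper half of the product, small pc values such as 0 and 1 map to the same slot in this sketch. The message reports that this variant performed about as well as the function that was finally chosen.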
Emilio G. Cota	2d16da435e	target/i386: optimize indirect branches Speed up indirect branches by jumping to the target if it is valid. Softmmu measurements (see later commit for user-mode numbers): Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0. - SPECint06 (test set), x86_64-softmmu (Ubuntu 16.04 guest). Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, hmean); y-axis 0.8x to 2.4x; series: cross, cross+jr] png: http://imgur.com/DU36YFU NB. 'cross' represents the previous commit. Backports commit b4aa297781ceddef79deb0e99da7817551fa89f8 from qemu	2018-03-03 14:10:14 -05:00
Emilio G. Cota	3895eea3b4	target/i386: optimize cross-page direct jumps in softmmu Instead of unconditionally exiting to the exec loop, use the gen_jr helper to jump to the target if it is valid. Perf impact: see next commit's log. Backports commit fe62089563ffc6a42f16ff28a6b6be34d2697766 from qemu	2018-03-03 14:08:27 -05:00
Emilio G. Cota	baa017d29b	target/i386: introduce gen_jr helper to generate lookup_and_goto_ptr This helper will be used by subsequent changes. Backports commit 1ebb1af1b8068fca36f48f738eb7146ecdf03625 from qemu	2018-03-03 14:06:05 -05:00
Emilio G. Cota	9aaad9ed27	target/arm: optimize indirect branches Speed up indirect branches by jumping to the target if it is valid. Softmmu measurements (see later commit for user-mode results): Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0. - Impact on boot time (ARM debian jessie boot+shutdown time, stddev in parentheses): v2.9.0: 8.84 (0.07); +cross: 8.85 (0.03); +jr: 8.83 (0.06). - NBench, arm-softmmu (debian jessie guest). Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup over baseline, y-axis 0.95x to 1.3x, series 'cross' and 'cross+jr'; the x-axis labels partly overlap in the source and read as ASSIGNMENT, BITFIELD, FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT, STRING SORT and hmean] png: http://imgur.com/eOLmZNR NB. 'cross' represents the previous commit. Backports commit 8a6b28c7b5104263344508df0f4bce97f22cfcaf from qemu	2018-03-02 21:18:15 -05:00
Emilio G. Cota	5a42602b92	target/arm: optimize cross-page direct jumps in softmmu Instead of unconditionally exiting to the exec loop, use the lookup_and_goto_ptr helper to jump to the target if it is valid. Perf impact: see next commit's log. Backports commit 7ad55b4ffd982c80f26f7f3658138d94cdc678e8 from qemu	2018-03-02 21:09:44 -05:00
Emilio G. Cota	e4dfb7f807	tcg/i386: implement goto_ptr Backports commit 5cb4ef80f65252dd85b86fa7f3c985015423d670 from qemu	2018-03-02 21:08:38 -05:00
Emilio G. Cota	8f4f15e5f5	tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr Instead of exporting goto_ptr directly to TCG frontends, export tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer returned by the lookup_tb_ptr() helper. This is the only use case we have for goto_ptr and lookup_tb_ptr, so having this function is very convenient. Furthermore, it trivially allows us to avoid calling the lookup helper if goto_ptr is not implemented by the backend. Backports commit cedbcb01529cb6cf9a2289cdbebbc63f6149fc18 from qemu	2018-03-02 21:05:18 -05:00
Richard Henderson	23d8f5fba2	qemu/atomic: Loosen restrictions for 64-bit ILP32 hosts We need to coordinate with the TCG_OVERSIZED_GUEST test in cputlb.c, and allow 64-bit atomics even though sizeof(void *) == 4. Backports commit 374aae653499f4d405caf32b7fff0c8639113fe4 from qemu	2018-03-02 20:06:39 -05:00
Luc MICHEL	393019de26	target/arm: add data cache invalidation cp15 instruction to cortex-r5 The cp15, CRn=15, opc1=0, CRm=5, opc2=0 instruction invalidates all the data cache on the cortex-r5. Implementing it as a NOP. Backports commit 95e9a242e2a393c7d4e5cc04340e39c3a9420f03 from qemu	2018-03-02 20:04:20 -05:00
Peter Maydell	565626ca63	armv7m: Raise correct kind of UsageFault for attempts to execute ARM code M profile doesn't implement ARM, and the architecturally required behaviour for attempts to execute with the Thumb bit clear is to generate a UsageFault with the CFSR INVSTATE bit set. We were incorrectly implementing this as generating an UNDEFINSTR UsageFault; fix this. Backports commit e13886e3a790b52f0b2e93cb5e84fdc2ada5471a from qemu	2018-03-02 20:00:58 -05:00
Peter Maydell	fbfeca93b3	armv7m: Check exception return consistency Implement the exception return consistency checks described in the v7M pseudocode ExceptionReturn(). Inspired by a patch from Michael Davidsaver's series, but this is a reimplementation from scratch based on the ARM ARM pseudocode. Backports commit aa488fe3bb5460c6675800ccd80f6dccbbd70159 from qemu	2018-03-02 19:59:18 -05:00
Peter Maydell	0736054d6d	armv7m: Extract "exception taken" code into functions Extract the code from the tail end of arm_v7m_do_interrupt() which enters the exception handler into a pair of utility functions v7m_exception_taken() and v7m_push_stack(), which correspond roughly to the pseudocode PushStack() and ExceptionTaken(). This also requires us to move the arm_v7m_load_vector() utility routine up so we can call it. Handling illegal exception returns has some cases where we want to take a UsageFault either on an existing stack frame or with a new stack frame but with a specific LR value, so we want to be able to call these without having to go via arm_v7m_cpu_do_interrupt(). Backports commit 39ae2474e337247e5930e8be783b689adc9f6215 from qemu	2018-03-02 19:54:46 -05:00
Michael Davidsaver	5b9f53bd27	armv7m: Simpler and faster exception start All the places in armv7m_cpu_do_interrupt() which pend an exception in the NVIC are doing so for synchronous exceptions. We know that we will always take some exception in this case, so we can just acknowledge it immediately, rather than returning and then immediately being called again because the NVIC has raised its outbound IRQ line. Backports commit a25dc805e2e63a55029e787a52335e12dabf07dc from qemu	2018-03-02 19:52:01 -05:00
Peter Maydell	43ba76cb28	armv7m: Fix condition check for taking exceptions The M profile condition for when we can take a pending exception or interrupt is not the same as that for A/R profile. The code originally copied from the A/R profile version of the cpu_exec_interrupt function only worked by chance for the very simple case of exceptions being masked by PRIMASK. Replace it with a call to a function in the NVIC code that correctly compares the priority of the pending exception against the current execution priority of the CPU. Backports commit 7ecdaa4a9635f1ded0dfa9218c25273b6d4dcd44 from qemu	2018-03-02 19:50:05 -05:00
Peter Maydell	5470bd1763	armv7m: Remove unused armv7m_nvic_acknowledge_irq() return value Having armv7m_nvic_acknowledge_irq() return the new value of env->v7m.exception and its one caller assign the return value back to env->v7m.exception is pointless. Just make the return type void instead. Backports commit a5d8235545e98c1ce02560d5f4f57552d937efe9 from qemu	2018-03-02 19:36:07 -05:00
Peter Maydell	50c956db7e	arm: Implement HFNMIENA support for M profile MPU Implement HFNMIENA support for the M profile MPU. This bit controls whether the MPU is treated as enabled when executing at execution priorities of less than zero (in NMI, HardFault or with the FAULTMASK bit set). Doing this requires us to use a different MMU index for "running at execution priority < 0", because we will have different access permissions for that case versus the normal case. Backports commit 3bef7012560a7f0ea27b265105de5090ba117514 from qemu	2018-03-02 19:33:24 -05:00
Michael Davidsaver	611a711f7b	arm: add MPU support to M profile CPUs The M series MPU is almost the same as the already implemented R profile MPU (v7 PMSA). So all we need to implement here is the MPU register interface in the system register space. This implementation has the same restriction as the R profile MPU that it doesn't permit regions to be sized down smaller than 1K. We also do not yet implement support for MPU_CTRL.HFNMIENA; this bit should if zero disable use of the MPU when running HardFault, NMI or with FAULTMASK set to 1 (ie at an execution priority of less than zero) -- if the MPU is enabled we don't treat these cases any differently. Backports commit 29c483a506070e8f554c77d22686f405e30b9114 from qemu	2018-03-02 19:30:20 -05:00
Michael Davidsaver	09d69209a0	armv7m: Classify faults as MemManage or BusFault General logic is that operations stopped by the MPU are MemManage, and those which go through the MPU and are caught by the unassigned handle are BusFault. Distinguish these by looking at the exception.fsr values, and set the CFSR bits and (if appropriate) fill in the BFAR or MMFAR with the exception address. Backports commit 5dd0641d234e355597be62e5279d8a519c831625 from qemu	2018-03-02 19:28:21 -05:00
Peter Maydell	9bc3050c51	arm: All M profile cores are PMSA All M profile CPUs are PMSA, so set the feature bit. (We haven't actually implemented the M profile MPU register interface yet, but setting this feature bit gives us closer to correct behaviour for the MPU-disabled case.) Backports commit 790a11503cfb5e1dcd031ea2212bbebae4ca3cec from qemu	2018-03-02 19:26:41 -05:00
Michael Davidsaver	4d8ae4a2b2	armv7m: Implement M profile default memory map Add support for the M profile default memory map which is used if the MPU is not present or disabled. The main differences in behaviour from implementing this correctly are that we set the PAGE_EXEC attribute on the right regions of memory, such that device regions are not executable. Backports commit 3a00d560bcfca7ad04327062c1986a016c104b1f from qemu	2018-03-02 19:25:02 -05:00
Michael Davidsaver	7c845dabe8	armv7m: Improve "-d mmu" tracing for PMSAv7 MPU Improve the "-d mmu" tracing for the PMSAv7 MPU translation process as an aid in debugging guest MPU configurations: * fix a missing newline for a guest-error log * report the region number with guest-error or unimp logs of bad region register values * add a log message for the overall result of the lookup * print "0x" prefix for hex values Backports commit c9f9f1246d630960bce45881e9c0d27b55be71e2 from qemu	2018-03-02 19:17:05 -05:00
Peter Maydell	bfe99e9a0b	arm: Remove unnecessary check on cpu->pmsav7_dregion Now that we enforce both: * pmsav7_dregion == 0 implies has_mpu == false * PMSA with has_mpu == false means SCTLR.M cannot be set we can remove a check on pmsav7_dregion from get_phys_addr_pmsav7(), because we can only reach this code path if the MPU is enabled (and so region_translation_disabled() returned false). Backports commit e9235c6983b261e04e897e8ff900b2b7a391e644 from qemu	2018-03-02 19:14:50 -05:00
Peter Maydell	349227bb05	arm: Don't let no-MPU PMSA cores write to SCTLR.M If the CPU is a PMSA config with no MPU implemented, then the SCTLR.M bit should be RAZ/WI, so that the guest can never turn on the non-existent MPU. Backports commit 06312febfb2d35367006ef23608ddd6a131214d4 from qemu	2018-03-02 19:13:37 -05:00
Peter Maydell	e564ed6311	arm: Don't clear ARM_FEATURE_PMSA for no-mpu configs Fix the handling of QOM properties for PMSA CPUs with no MPU: Allow no-MPU to be specified by either: * has-mpu = false * pmsav7_dregion = 0 and make setting one imply the other. Don't clear the PMSA feature bit in this situation. Backports commit f50cd31413d8bc9d1eef8edd1f878324543bf65d from qemu	2018-03-02 19:12:20 -05:00
Peter Maydell	6614ba9615	arm: Clean up handling of no-MPU PMSA CPUs ARM CPUs come in two flavours: * proper MMU ("VMSA") * only an MPU ("PMSA") For PMSA, the MPU may be implemented, or not (in which case there is default "always acts the same" behaviour, but it isn't guest programmable). QEMU is a bit confused about how we indicate this: we have an ARM_FEATURE_MPU, but it's not clear whether this indicates "PMSA, not VMSA" or "PMSA and MPU present" , and sometimes we use it for one purpose and sometimes the other. Currently trying to implement a PMSA-without-MPU core won't work correctly because we turn off the ARM_FEATURE_MPU bit and then a lot of things which should still exist get turned off too. As the first step in cleaning this up, rename the feature bit to ARM_FEATURE_PMSA, which indicates a PMSA CPU (with or without MPU). Backports commit 452a095526a0537f16c271516a2200877a272ea8 from qemu	2018-03-02 19:05:31 -05:00
Peter Maydell	b50d2da03c	arm: Use different ARMMMUIdx values for M profile Make M profile use completely separate ARMMMUIdx values from those that A profile CPUs use. This is a prelude to adding support for the MPU and for v8M, which together will require 6 MMU indexes which don't map cleanly onto the A profile uses: non secure User non secure Privileged non secure Privileged, execution priority < 0 secure User secure Privileged secure Privileged, execution priority < 0 Backports commit e7b921c2d9efc249f99b9feb0e7dca82c96aa5c4 from qemu	2018-03-02 19:01:42 -05:00
Michael Davidsaver	f532e80749	armv7m: Escalate exceptions to HardFault if necessary The v7M exception architecture requires that if a synchronous exception cannot be taken immediately (because it is disabled or at too low a priority) then it should be escalated to HardFault (and the HardFault exception is then taken). Implement this escalation logic. Backports commit a73c98e159d18155445d29b6044be6ad49fd802f from qemu	2018-03-02 18:59:13 -05:00
Peter Maydell	b7bf752d3c	arm: Add support for M profile CPUs having different MMU index semantics The M profile CPU's MPU has an awkward corner case which we would like to implement with a different MMU index. We can avoid having to bump the number of MMU modes ARM uses, because some of our existing MMU indexes are only used by non-M-profile CPUs, so we can borrow one. To avoid that getting too confusing, clean up the code to try to keep the two meanings of the index separate. Instead of ARMMMUIdx enum values being identical to core QEMU MMU index values, they are now the core index values with some high bits set. Any particular CPU always uses the same high bits (so eventually A profile cores and M profile cores will use different bits). New functions arm_to_core_mmu_idx() and core_to_arm_mmu_idx() convert between the two. In general core index values are stored in 'int' types, and ARM values are stored in ARMMMUIdx types. Backports commit 8bd5c82030b2cb09d3eef6b444f1620911cc9fc5 from qemu	2018-03-02 18:59:13 -05:00
Wei Huang	19335c32c9	target/arm: clear PMUVER field of AA64DFR0 when vPMU=off The PMUv3 driver of the Linux kernel (in arch/arm64/kernel/perf_event.c) relies on the PMUVER field of id_aa64dfr0_el1 to decide if PMU support is present or not. This patch clears the PMUVER field under TCG mode when vPMU=off. Without it, PMUv3 will init inside guest VMs even with vPMU=off. This patch also removes a redundant line inside the if-statement. Backports commit 2b3ffa929249b15a75d8bde3e8e57a744f52aff0 from qemu	2018-03-02 18:59:12 -05:00
Peter Maydell	4789e49c4d	arm: Use the mmu_idx we're passed in arm_cpu_do_unaligned_access() When identifying the DFSR format for an alignment fault, use the mmu index that we are passed, rather than calling cpu_mmu_index() to get the mmu index for the current CPU state. This doesn't actually make any difference since the only cases where the current MMU index differs from the index used for the load are the "unprivileged load/store" instructions, and in that case the mmu index may differ but the translation regime is the same (apart from the "use from Hyp mode" case which is UNPREDICTABLE). However it's the more logical thing to do. Backports commit e517d95b63427fae9f03958dbc005c36b4ebf2cf from qemu	2018-03-02 18:59:12 -05:00
Peter Xu	fce1b469e5	memory: tune last param of iommu_ops.translate() This patch converts the old "is_write" bool into IOMMUAccessFlags. The difference is that "is_write" can only express either read/write, but sometimes what we really want is "none" here (neither read nor write). Replay is a good example - during replay, we should not check any RW permission bits since that's not an actual IO at all. Backports commit bf55b7afce53718ef96f4e6616da62c0ccac37dd from qemu	2018-03-02 18:59:12 -05:00
Peter Xu	5621c7e09f	exec: abstract address_space_do_translate() This function is an abstraction helper for address_space_translate() and address_space_get_iotlb_entry(). It does the lookup of address into memory region section, then does proper IOMMU translation if necessary. Refactor the two existing functions to use it. This fixes vhost when IOMMU is disabled by guest. Backports commit a764040cc831cfe5b8bf1c80e8341b9bf2de3ce8 from qemu	2018-03-02 18:59:12 -05:00
Nikunj A Dadhania	d907423bac	cputlb: handle first atomic write to the page In case where the conditional write is the first write to the page, TLB_NOTDIRTY will be set and stop_the_world is triggered. Handle this as a special case and set the dirty bit. After that fall through to the actual atomic instruction below. Backports commit 7f9af1abdcc69fd1d3d8d2be68464329600616d6 from qemu	2018-03-02 18:59:12 -05:00
Aurelien Jarno	00ebbae128	tcg/mips: fix field extraction opcode The "msb" argument should correspond to (len - 1). Backports commit 2f5a5f5774d95baacf86c03aa8a77a2d0390f2b2 from qemu	2018-03-02 18:59:12 -05:00
Richard Henderson	69116abafc	tcg: Initialize return value after exit_atomic Users of tcg_gen_atomic_cmpxchg and do_atomic_op rightfully utilize the output. Even though this code is dead, it gets translated, and without the initialization we encounter a tcg_error. Backports commit 79b1af906245558c30e0a5faf26cb52b63f83cce from qemu	2018-03-02 18:59:11 -05:00
Gerd Hoffmann	108354cc4a	bitmap: add bitmap_copy_and_clear_atomic Backports commit d6eb1413920affb7be3df9982682dd183a805dd7 from qemu	2018-03-02 18:59:11 -05:00
Peter Maydell	b8b70dfcd2	Drop QEMU_GNUC_PREREQ() checks for gcc older than 4.1 We already require gcc 4.1 or newer (for the atomic support), so the fallback codepaths for older gcc versions than that are now dead code and we can just delete them. NB: clang reports itself as gcc 4.2 (regardless of clang version), so clang won't be using the fallbacks either. Backports commit fa54abb8c298f892639ffc4bc2f61448ac3be4a1 from qemu	2018-03-02 18:59:05 -05:00
Peter Maydell	2935a9af7a	arm: Remove workarounds for old M-profile exception return implementation Now that we've rewritten M-profile exception return so that the magic PC values are not visible to other parts of QEMU, we can delete the special casing of them elsewhere. Backports commit f4e8e4edda875cab9df91dc4ae9767f7cb1f50aa from qemu	2018-03-02 15:02:14 -05:00
Peter Maydell	44bf8985e5	arm: Implement M profile exception return properly On M profile, return from an exception happens when code in Handler mode executes one of the following function call return instructions: * POP or LDM which loads the PC * LDR to PC * BX register and the new PC value is 0xFFxxxxxx. QEMU tries to implement this by not treating the instruction specially but then catching the attempt to execute from the magic address value. This is not ideal, because: * there are guest visible differences from the architecturally specified behaviour (for instance jumping to 0xFFxxxxxx via a different instruction should not cause an exception return but it will in the QEMU implementation) * we have to account for it in various places (like refusing to take an interrupt if the PC is at a magic value, and making sure that the MPU doesn't deny execution at the magic value addresses) Drop these hacks, and instead implement exception return the way the architecture specifies -- by having the relevant instructions check for the magic value and raise the 'do an exception return' QEMU internal exception immediately. 
The effect on the generated code is minor:
bx lr, old code (and new code for Thread mode):
  TCG:
    mov_i32 tmp5,r14
    movi_i32 tmp6,$0xfffffffffffffffe
    and_i32 pc,tmp5,tmp6
    movi_i32 tmp6,$0x1
    and_i32 tmp5,tmp5,tmp6
    st_i32 tmp5,env,$0x218
    exit_tb $0x0
    set_label $L0
    exit_tb $0x7f2aabd61993
  x86_64 generated code:
    0x7f2aabe87019: mov %ebx,%ebp
    0x7f2aabe8701b: and $0xfffffffffffffffe,%ebp
    0x7f2aabe8701e: mov %ebp,0x3c(%r14)
    0x7f2aabe87022: and $0x1,%ebx
    0x7f2aabe87025: mov %ebx,0x218(%r14)
    0x7f2aabe8702c: xor %eax,%eax
    0x7f2aabe8702e: jmpq 0x7f2aabe7c016
bx lr, new code when in Handler mode:
  TCG:
    mov_i32 tmp5,r14
    movi_i32 tmp6,$0xfffffffffffffffe
    and_i32 pc,tmp5,tmp6
    movi_i32 tmp6,$0x1
    and_i32 tmp5,tmp5,tmp6
    st_i32 tmp5,env,$0x218
    movi_i32 tmp5,$0xffffffffff000000
    brcond_i32 pc,tmp5,geu,$L1
    exit_tb $0x0
    set_label $L1
    movi_i32 tmp5,$0x8
    call exception_internal,$0x0,$0,env,tmp5
  x86_64 generated code:
    0x7fe8fa1264e3: mov %ebp,%ebx
    0x7fe8fa1264e5: and $0xfffffffffffffffe,%ebx
    0x7fe8fa1264e8: mov %ebx,0x3c(%r14)
    0x7fe8fa1264ec: and $0x1,%ebp
    0x7fe8fa1264ef: mov %ebp,0x218(%r14)
    0x7fe8fa1264f6: cmp $0xff000000,%ebx
    0x7fe8fa1264fc: jae 0x7fe8fa126509
    0x7fe8fa126502: xor %eax,%eax
    0x7fe8fa126504: jmpq 0x7fe8fa122016
    0x7fe8fa126509: mov %r14,%rdi
    0x7fe8fa12650c: mov $0x8,%esi
    0x7fe8fa126511: mov $0x56095dbeccf5,%r10
    0x7fe8fa12651b: callq *%r10
which is a difference of one cmp/branch-not-taken. This will be lost in the noise of having to exit generated code and look up the next TB anyway. Backports commit 3bb8a96f5348913ee130169504f3642f501b113e from qemu	2018-03-02 14:58:14 -05:00
Peter Maydell	cfc1611d6f	arm: Track M profile handler mode state in TB flags For M profile exception-return handling we'd like to generate different code for some instructions depending on whether we are in Handler mode or Thread mode. This isn't the same as "are we privileged or user", so we need an extra bit in the TB flags to distinguish. Backports commit 064c379c99b835bdcc478d21a3849507ea07d53a from qemu	2018-03-02 14:54:16 -05:00
Peter Maydell	8233756382	arm: Move condition-failed codepath generation out of if() Move the code to generate the "condition failed" instruction codepath out of the if (singlestepping) {} else {}. This will allow adding support for handling a new is_jmp type which can't be neatly split into "singlestepping case" versus "not singlestepping case". Backports commit f021b2c4627890d82fbcc300db3bd782b37b7f8a from qemu arm: Abstract out "are we singlestepping" test to utility function We now test for "are we singlestepping" in several places and it's not a trivial check because we need to care about both architectural singlestep and QEMU gdbstub singlestep. We're also about to add another place that needs to make this check, so pull the condition out into a function. Backports commit b636649f5a2e108413dd171edaf320f781f57942 from qemu	2018-03-02 14:52:30 -05:00
Peter Maydell	43d6e73fea	arm: Move gen_set_condexec() and gen_set_pc_im() up in the file Move the utility routines gen_set_condexec() and gen_set_pc_im() up in the file, as we will want to use them from a function placed earlier in the file than their current location. Backports commit 4d5e8c969a74c86124fc2284ea603cc6dd3c5dfa from qemu	2018-03-02 14:48:36 -05:00
Peter Maydell	23141d7620	arm: Factor out "generate right kind of step exception" We currently have two places that do: if (dc->ss_active) { gen_step_complete_exception(dc); } else { gen_exception_internal(EXCP_DEBUG); } Factor this out into its own function, as we're about to add a third place that needs the same logic. Backports commit 5425415ebba5fa20558e1ef25e1997a6f5ea4c7c from qemu	2018-03-02 14:45:30 -05:00
Peter Maydell	ddfe550411	arm: Thumb shift operations should not permit interworking branches In Thumb mode, the only instructions which can cause an interworking branch by writing the PC are BLX, BX, BXJ, LDR, POP and LDM. Unlike ARM mode, data processing instructions which target the PC do not cause interworking branches. When we added support for doing interworking branches on writes to PC from data processing instructions in commit 21aeb3430ce7ba, we accidentally changed a Thumb instruction to have interworking branch behaviour for writes to PC. (MOV, MOVS register-shifted register, encoding T2; this is the standard encoding for LSL/LSR/ASR/ROR (register).) For this encoding, behaviour with Rd == R15 is specified as UNPREDICTABLE, so allowing an interworking branch is within spec, but it's confusing and differs from our handling of this class of UNPREDICTABLE for other Thumb ALU operations. Make it perform a simple (non-interworking) branch like the others. Backports commit bedb8a6b09c1754c3b9f155750c62dc087706698 from qemu	2018-03-02 14:42:40 -05:00
Peter Maydell	9f938da9e1	arm: Don't implement BXJ on M-profile CPUs For M-profile CPUs, the BXJ instruction does not exist at all, and the encoding should always UNDEF. We were accidentally implementing it to behave like A-profile BXJ; correct the error. Backports commit 9d7c59c84d4530d05e8702b1c3a31e6da00a397e from qemu	2018-03-02 14:42:04 -05:00
Peter Maydell	e9d507a193	target/arm: Add assertion about FSC format for syndrome registers In tlb_fill() we construct a syndrome register value from a fault status register value which is filled in by arm_tlb_fill(). arm_tlb_fill() returns FSR values which might be in the format used with short-format page descriptors, or the format used with long-format (LPAE) descriptors. The syndrome register always uses LPAE-format FSR status codes. It isn't actually possible to end up delivering a syndrome register value to the guest for a fault which is reported with a short-format FSR (that kind of stage 1 fault will only happen for an AArch32 translation regime which doesn't have a syndrome register, and can never be redirected to an AArch64 or Hyp exception level). Add an assertion which checks this, and adjust the code so that we construct a syndrome with an invalid status code, rather than allowing set bits in the FSR input to randomly corrupt other fields in the syndrome. Backports commit 65ed2ed90d9d81fd4b639029be850ea5651f919f from qemu	2018-03-02 14:41:07 -05:00
Peter Maydell	1cf80d7536	arm: Move excnames[] array into arm_log_exception() The excnames[] array is defined in internals.h because we used to use it from two different source files for handling logging of AArch32 and AArch64 exception entry. Refactoring means that it's now used only in arm_log_exception() in helper.c, so move the array into that function. Backports commit 2c4a7cc5afb1bfc1728a39abd951ddd7714c476e from qemu	2018-03-02 14:39:37 -05:00
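
Note on b7bf752d3c (MMU index semantics): the commit describes architecture-level MMU index values as "the core index values with some high bits set", with helpers converting between the two. The sketch below illustrates that encoding only. The mask, the tag values, the enum members and the function names to_core_idx() and to_arch_idx() are all hypothetical and chosen for illustration; they are not taken from the QEMU or unicorn sources.

```c
#include <assert.h>

/* Hypothetical layout: the commit only says "some high bits" are set. */
#define COREIDX_MASK 0x7   /* low bits hold the core MMU index */
#define TAG_A        0x10  /* assumed tag for A profile values */
#define TAG_M        0x40  /* assumed tag for M profile values */

/* Architecture-level values: core index in the low bits, profile tag above. */
typedef enum {
    IDX_A_USER = 0 | TAG_A,
    IDX_A_PRIV = 1 | TAG_A,
    IDX_M_USER = 0 | TAG_M,
    IDX_M_PRIV = 1 | TAG_M
} ArchMMUIdx;

/* Strip the profile tag to recover the core index (kept in a plain int). */
static int to_core_idx(ArchMMUIdx idx)
{
    return (int)idx & COREIDX_MASK;
}

/* Re-attach the tag that a given CPU always uses to a core index. */
static ArchMMUIdx to_arch_idx(int profile_tag, int core_idx)
{
    assert(core_idx >= 0 && core_idx <= COREIDX_MASK);
    return (ArchMMUIdx)(core_idx | profile_tag);
}
```

With this layout, IDX_A_USER and IDX_M_USER share core index 0 but remain distinct values, which is how one core index can be borrowed for a second meaning without the two being confused.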


2503 commits
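
Note on 44bf8985e5 (M profile exception return): the TCG listing in that commit shows `bx lr` masking off bit 0 to form the new PC and, in Handler mode only, comparing the PC against 0xff000000 before leaving the translation block. A minimal C sketch of that decision follows. The names bx_outcome(), BranchOutcome and EXC_RETURN_MIN are hypothetical, and treating the separately stored bit 0 as the Thumb state bit is an assumption of this sketch rather than something the commit text states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* From the commit text: the magic new PC values are 0xFFxxxxxx. */
#define EXC_RETURN_MIN 0xff000000u

typedef enum { NEXT_TB, EXCEPTION_RETURN } BranchOutcome;

/*
 * Model of "bx <reg>": bit 0 of the target is split off (assumed to be
 * the Thumb bit), the rest becomes the PC. Only Handler mode performs
 * the extra unsigned compare against the magic range.
 */
static BranchOutcome bx_outcome(uint32_t target, bool handler_mode,
                                uint32_t *pc, bool *thumb)
{
    assert(pc && thumb);
    *pc = target & ~UINT32_C(1);
    *thumb = (target & 1u) != 0;
    if (handler_mode && *pc >= EXC_RETURN_MIN) {
        return EXCEPTION_RETURN;   /* raise the internal exception now */
    }
    return NEXT_TB;                /* ordinary exit to the next TB */
}
```

In Thread mode the compare is skipped entirely, which matches the commit's remark that the old code and the new Thread mode code are identical.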