unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2025-12-16 02:41:39 +00:00

Author	SHA1	Message	Date
Aurelien Jarno	2c49a6b2f6	target/mips: optimize indirect branches Backports commit e350d8ca3ac7e31c6af71a4ab74d2442dfefc697 from qemu	2018-03-03 14:23:58 -05:00
Aurelien Jarno	8ce8d4fe20	target/mips: optimize cross-page direct jumps in softmmu Backports commit d9a9acde64b862107933f9e9a01435e51bf8f91b from qemu	2018-03-03 14:23:25 -05:00
Emilio G. Cota	baa0983ae3	target/aarch64: optimize indirect branches Measurements: [Baseline performance is that before applying this and the previous commit] - NBench, aarch64-softmmu. Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup (y-axis, 0.9x to 1.7x) per NBench test (ASSIGNMENT, BITFIELD, FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT, STRING SORT) and hmean; series: cross, cross+jr] png: http://imgur.com/qO9ubtk NB. cross here represents the previous commit. - SPECint06 (test set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup (y-axis, 0.9x to 1.5x) per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk) and hmean; series: jr] png: http://imgur.com/3Dp4vvq - SPECint06 (train set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup (y-axis, 1x to 1.7x) per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk) and hmean; series: jr] png: http://imgur.com/vRrdc9j Backports commit e75449a346bf558296966a44277bfd93412c6da6 from qemu	2018-03-03 14:22:12 -05:00
Emilio G. Cota	83ea5b72f2	target/aarch64: optimize cross-page direct jumps in softmmu Perf numbers in next commit's log. Backports commit e78722368c721f3c5b8109ed525adac1653ae97b from qemu	2018-03-03 14:20:55 -05:00
Aurelien Jarno	0e9d3d1943	tcg/mips: implement goto_ptr Backports commit 5786e0683c4f8170dd05a550814b8809d8ae6d86 from qemu	2018-03-03 14:19:46 -05:00
Richard Henderson	1d6c4f1a42	tcg/arm: Implement goto_ptr Backports commit 085c648bef7301eabe7d4a3301c8d012ae4423b8 from qemu	2018-03-03 14:18:41 -05:00
Richard Henderson	3b02642372	tcg/arm: Clarify tcg_out_bx for arm4 host In theory this would re-enable usage of QEMU on an armv4 host. Whether this is worthwhile is debatable -- we've been unconditionally issuing the armv5t BX instruction in the prologue since 2011 without complaint. Possibly we should simply require an armv6 host. Backports commit 702a947484eb3e615183dafc93de590ab0679f60 from qemu	2018-03-03 14:17:13 -05:00
Richard Henderson	d496bb6150	tcg/s390: Implement goto_ptr Backports commit 46644483cae978c734460131bb1d9071f813b287 from qemu	2018-03-03 14:16:03 -05:00
Richard Henderson	f0420c3427	tcg/sparc: Implement goto_ptr Backports commit 38f81dc5938fb7025531c5ed602afd41fef799a7 from qemu	2018-03-03 14:14:32 -05:00
Richard Henderson	81f1aae572	tcg/aarch64: Implement goto_ptr Measurements: SPECint06 (test set), x86_64-linux-user. Host: APM 64-bit ARMv8 (Atlas/A57) @ 2.4 GHz [ASCII bar chart omitted: speedup (y-axis, 1x to 1.45x) per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk) and hmean; series: +goto-ptr] png: http://imgur.com/en9HE8L Backports commit b19f0c2e7d344d4d62daf554951acdb6c94a34b0 from qemu	2018-03-03 14:13:09 -05:00
Emilio G. Cota	7d0440dec4	tb-hash: improve tb_jmp_cache hash function in user mode Optimizations to cross-page chaining and indirect branches make performance more sensitive to the hit rate of tb_jmp_cache. The constraint of reserving some bits for the page number lowers the achievable quality of the hashing function. However, user-mode does not have this requirement. Thus, with this change we use for user-mode a hashing function that is both faster and of better quality than the previous one. Measurements: Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0. - SPECint06 (test set), x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz [ASCII bar chart omitted: speedup (y-axis, 0.8x to 2.2x) per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk) and hmean; series: jr, jr+multhash, jr+hash] png: http://imgur.com/4UXTrEc Here I also tried the hash function suggested by Paolo ("multhash"): return ((uint64_t) (pc * 2654435761) >> 32) & (TB_JMP_CACHE_SIZE - 1); As you can see it is just as good as the other new function ("hash"), which is what I ended up going with. - SPECint06 (train set), x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz [ASCII bar chart omitted: speedup (y-axis, 1x to 2.6x) per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk) and hmean; series: jr, jr+hash] png: http://imgur.com/ArCbHqo - NBench, x86_64-linux-user. Host: Intel i7-6700K @ 4.00GHz [ASCII bar chart omitted: speedup (y-axis, 0.96x to 1.12x) per NBench test (ASSIGNMENT, BITFIELD, FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT, STRING SORT) and hmean; series: jr, jr+hash] png: http://imgur.com/ZXFX0hJ - NBench, arm-linux-user. Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup (y-axis, 0.9x to 1.3x) per NBench test (ASSIGNMENT, BITFIELD, FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT, STRING SORT) and hmean; series: jr, jr+hash] png: http://imgur.com/FfD27ey Backports commit 6f1653180f5701c6a8f1b35b89a80b1e3260928e from qemu	2018-03-03 14:11:29 -05:00
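The "multhash" variant quoted in the tb-hash commit message above can be tried in isolation. The following is a minimal sketch, not the function that was merged (the message says the other candidate, "hash", was chosen). The `sketch_` names and the 12-bit table size are assumptions made for illustration and are not taken from the source tree.

```c
#include <stdint.h>

/* Hypothetical table size for illustration only; must be a power of two. */
#define SKETCH_JMP_CACHE_BITS 12
#define SKETCH_JMP_CACHE_SIZE (1u << SKETCH_JMP_CACHE_BITS)

/*
 * Multiplicative hash in the shape quoted in the commit message:
 * multiply the guest pc by a 32-bit constant, keep the upper half of
 * the 64-bit product, and mask it down to a table index.
 */
static inline unsigned int sketch_multhash(uint64_t pc)
{
    return (unsigned int)(((uint64_t)(pc * 2654435761u) >> 32)
                          & (SKETCH_JMP_CACHE_SIZE - 1));
}
```

Because the index comes from a mask, every result falls inside the table regardless of the pc value, with no page-number bits reserved.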
Emilio G. Cota	2d16da435e	target/i386: optimize indirect branches Speed up indirect branches by jumping to the target if it is valid. Softmmu measurements (see later commit for user-mode numbers): Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0. - SPECint06 (test set), x86_64-softmmu (Ubuntu 16.04 guest). Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup (y-axis, 0.8x to 2.4x) per benchmark (astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk) and hmean; series: cross, cross+jr] png: http://imgur.com/DU36YFU NB. 'cross' represents the previous commit. Backports commit b4aa297781ceddef79deb0e99da7817551fa89f8 from qemu	2018-03-03 14:10:14 -05:00
Emilio G. Cota	3895eea3b4	target/i386: optimize cross-page direct jumps in softmmu Instead of unconditionally exiting to the exec loop, use the gen_jr helper to jump to the target if it is valid. Perf impact: see next commit's log. Backports commit fe62089563ffc6a42f16ff28a6b6be34d2697766 from qemu	2018-03-03 14:08:27 -05:00
Emilio G. Cota	baa017d29b	target/i386: introduce gen_jr helper to generate lookup_and_goto_ptr This helper will be used by subsequent changes. Backports commit 1ebb1af1b8068fca36f48f738eb7146ecdf03625 from qemu	2018-03-03 14:06:05 -05:00
Emilio G. Cota	9aaad9ed27	target/arm: optimize indirect branches Speed up indirect branches by jumping to the target if it is valid. Softmmu measurements (see later commit for user-mode results): Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0. - Impact on Boot time: ARM debian jessie boot+shutdown time (stddev) per setup: v2.9.0: 8.84 (0.07); +cross: 8.85 (0.03); +jr: 8.83 (0.06) - NBench, arm-softmmu (debian jessie guest). Host: Intel i7-4790K @ 4.00GHz [ASCII bar chart omitted: speedup (y-axis, 0.95x to 1.3x) per NBench test (ASSIGNMENT, BITFIELD, FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT, STRING SORT) and hmean; series: cross, cross+jr] png: http://imgur.com/eOLmZNR NB. 'cross' represents the previous commit. Backports commit 8a6b28c7b5104263344508df0f4bce97f22cfcaf from qemu	2018-03-02 21:18:15 -05:00
Emilio G. Cota	5a42602b92	target/arm: optimize cross-page direct jumps in softmmu Instead of unconditionally exiting to the exec loop, use the lookup_and_goto_ptr helper to jump to the target if it is valid. Perf impact: see next commit's log. Backports commit 7ad55b4ffd982c80f26f7f3658138d94cdc678e8 from qemu	2018-03-02 21:09:44 -05:00
Emilio G. Cota	e4dfb7f807	tcg/i386: implement goto_ptr Backports commit 5cb4ef80f65252dd85b86fa7f3c985015423d670 from qemu	2018-03-02 21:08:38 -05:00
Emilio G. Cota	8f4f15e5f5	tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr Instead of exporting goto_ptr directly to TCG frontends, export tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer returned by the lookup_tb_ptr() helper. This is the only use case we have for goto_ptr and lookup_tb_ptr, so having this function is very convenient. Furthermore, it trivially allows us to avoid calling the lookup helper if goto_ptr is not implemented by the backend. Backports commit cedbcb01529cb6cf9a2289cdbebbc63f6149fc18 from qemu	2018-03-02 21:05:18 -05:00
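The control flow described in the goto_ptr commit above (look up the target, jump to it if found, otherwise return to the exec loop, and skip the lookup entirely when the backend has no goto_ptr) can be modelled as a toy. This is a sketch under stated assumptions: every type and function name below is hypothetical, the cache is a plain direct-mapped array, and no real TCG API is used.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_SIZE 16u  /* hypothetical; power of two */

typedef struct SketchBlock {
    uint64_t pc;        /* guest address the block was translated from */
    const void *code;   /* stand-in for the host code pointer */
} SketchBlock;

typedef struct SketchCpu {
    const SketchBlock *jmp_cache[SKETCH_CACHE_SIZE];
    const void *exit_stub;  /* stand-in for "return to the exec loop" */
} SketchCpu;

/* Lookup helper: host code pointer on a hit, the exit stub on a miss. */
static const void *sketch_lookup_ptr(const SketchCpu *cpu, uint64_t pc)
{
    const SketchBlock *b = cpu->jmp_cache[pc & (SKETCH_CACHE_SIZE - 1)];
    if (b != NULL && b->pc == pc) {
        return b->code;
    }
    return cpu->exit_stub;
}

/* Wrapper: avoid the lookup when the backend cannot jump to a pointer. */
static const void *sketch_lookup_and_goto_ptr(const SketchCpu *cpu,
                                              uint64_t pc,
                                              int backend_has_goto_ptr)
{
    if (!backend_has_goto_ptr) {
        return cpu->exit_stub;
    }
    return sketch_lookup_ptr(cpu, pc);
}
```

Returning the exit stub on a miss means the caller can always jump to the result without a separate validity check.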
Richard Henderson	23d8f5fba2	qemu/atomic: Loosen restrictions for 64-bit ILP32 hosts We need to coordinate with the TCG_OVERSIZED_GUEST test in cputlb.c, and allow 64-bit atomics even though sizeof(void *) == 4. Backports commit 374aae653499f4d405caf32b7fff0c8639113fe4 from qemu	2018-03-02 20:06:39 -05:00
Luc MICHEL	393019de26	target/arm: add data cache invalidation cp15 instruction to cortex-r5 The cp15, CRn=15, opc1=0, CRm=5, opc2=0 instruction invalidates all the data cache on the cortex-r5. Implementing it as a NOP. Backports commit 95e9a242e2a393c7d4e5cc04340e39c3a9420f03 from qemu	2018-03-02 20:04:20 -05:00
Peter Maydell	565626ca63	armv7m: Raise correct kind of UsageFault for attempts to execute ARM code M profile doesn't implement ARM, and the architecturally required behaviour for attempts to execute with the Thumb bit clear is to generate a UsageFault with the CFSR INVSTATE bit set. We were incorrectly implementing this as generating an UNDEFINSTR UsageFault; fix this. Backports commit e13886e3a790b52f0b2e93cb5e84fdc2ada5471a from qemu	2018-03-02 20:00:58 -05:00
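The distinction drawn in the UsageFault commit above can be sketched with the two CFSR bits involved. The bit positions (UNDEFINSTR at bit 16, INVSTATE at bit 17) follow the ARMv7-M architecture; the function and macro names are hypothetical, and the sketch assumes a UsageFault is already known to be the right exception.

```c
#include <stdbool.h>
#include <stdint.h>

/* UsageFault status bits within the CFSR (ARMv7-M layout). */
#define SKETCH_CFSR_UNDEFINSTR (1u << 16)
#define SKETCH_CFSR_INVSTATE   (1u << 17)

/*
 * Record the cause of a UsageFault: executing with the Thumb bit clear
 * is an invalid-state fault, not an undefined-instruction fault.
 */
static uint32_t sketch_record_usagefault(uint32_t cfsr, bool thumb_bit_set)
{
    return cfsr | (thumb_bit_set ? SKETCH_CFSR_UNDEFINSTR
                                 : SKETCH_CFSR_INVSTATE);
}
```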
Peter Maydell	fbfeca93b3	armv7m: Check exception return consistency Implement the exception return consistency checks described in the v7M pseudocode ExceptionReturn(). Inspired by a patch from Michael Davidsaver's series, but this is a reimplementation from scratch based on the ARM ARM pseudocode. Backports commit aa488fe3bb5460c6675800ccd80f6dccbbd70159 from qemu	2018-03-02 19:59:18 -05:00
Peter Maydell	0736054d6d	armv7m: Extract "exception taken" code into functions Extract the code from the tail end of arm_v7m_do_interrupt() which enters the exception handler into a pair of utility functions v7m_exception_taken() and v7m_push_stack(), which correspond roughly to the pseudocode PushStack() and ExceptionTaken(). This also requires us to move the arm_v7m_load_vector() utility routine up so we can call it. Handling illegal exception returns has some cases where we want to take a UsageFault either on an existing stack frame or with a new stack frame but with a specific LR value, so we want to be able to call these without having to go via arm_v7m_cpu_do_interrupt(). Backports commit 39ae2474e337247e5930e8be783b689adc9f6215 from qemu	2018-03-02 19:54:46 -05:00
Michael Davidsaver	5b9f53bd27	armv7m: Simpler and faster exception start All the places in armv7m_cpu_do_interrupt() which pend an exception in the NVIC are doing so for synchronous exceptions. We know that we will always take some exception in this case, so we can just acknowledge it immediately, rather than returning and then immediately being called again because the NVIC has raised its outbound IRQ line. Backports commit a25dc805e2e63a55029e787a52335e12dabf07dc from qemu	2018-03-02 19:52:01 -05:00
Peter Maydell	43ba76cb28	armv7m: Fix condition check for taking exceptions The M profile condition for when we can take a pending exception or interrupt is not the same as that for A/R profile. The code originally copied from the A/R profile version of the cpu_exec_interrupt function only worked by chance for the very simple case of exceptions being masked by PRIMASK. Replace it with a call to a function in the NVIC code that correctly compares the priority of the pending exception against the current execution priority of the CPU. Backports commit 7ecdaa4a9635f1ded0dfa9218c25273b6d4dcd44 from qemu	2018-03-02 19:50:05 -05:00
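The priority comparison described in the commit above can be illustrated as follows. This is a simplified sketch, not the NVIC code: it ignores BASEPRI and priority grouping, and all names are hypothetical. It relies only on the architectural conventions that a lower number means a higher priority, that PRIMASK raises the execution priority to 0, and that FAULTMASK raises it to -1.

```c
#include <stdbool.h>

/* Execution priority when no exception is active (lower than any real one). */
#define SKETCH_PRIO_THREAD 256

/* Current execution priority from the active exception and mask bits. */
static int sketch_exec_priority(int active_prio, bool primask, bool faultmask)
{
    int prio = active_prio;
    if (primask && prio > 0) {
        prio = 0;
    }
    if (faultmask && prio > -1) {
        prio = -1;
    }
    return prio;
}

/* A pending exception is taken only if it outranks the current priority. */
static bool sketch_can_take_pending(int pending_prio, int exec_prio)
{
    return pending_prio < exec_prio;
}
```

With PRIMASK set, a configurable exception at priority 16 stays pending, while HardFault (-1) and NMI (-2) can still be taken, which a plain "is PRIMASK set" test would get wrong.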
Peter Maydell	5470bd1763	armv7m: Remove unused armv7m_nvic_acknowledge_irq() return value Having armv7m_nvic_acknowledge_irq() return the new value of env->v7m.exception and its one caller assign the return value back to env->v7m.exception is pointless. Just make the return type void instead. Backports commit a5d8235545e98c1ce02560d5f4f57552d937efe9 from qemu	2018-03-02 19:36:07 -05:00
Peter Maydell	50c956db7e	arm: Implement HFNMIENA support for M profile MPU Implement HFNMIENA support for the M profile MPU. This bit controls whether the MPU is treated as enabled when executing at execution priorities of less than zero (in NMI, HardFault or with the FAULTMASK bit set). Doing this requires us to use a different MMU index for "running at execution priority < 0", because we will have different access permissions for that case versus the normal case. Backports commit 3bef7012560a7f0ea27b265105de5090ba117514 from qemu	2018-03-02 19:33:24 -05:00
Michael Davidsaver	611a711f7b	arm: add MPU support to M profile CPUs The M series MPU is almost the same as the already implemented R profile MPU (v7 PMSA). So all we need to implement here is the MPU register interface in the system register space. This implementation has the same restriction as the R profile MPU that it doesn't permit regions to be sized down smaller than 1K. We also do not yet implement support for MPU_CTRL.HFNMIENA; this bit should if zero disable use of the MPU when running HardFault, NMI or with FAULTMASK set to 1 (ie at an execution priority of less than zero) -- if the MPU is enabled we don't treat these cases any differently. Backports commit 29c483a506070e8f554c77d22686f405e30b9114 from qemu	2018-03-02 19:30:20 -05:00
Michael Davidsaver	09d69209a0	armv7m: Classify faults as MemManage or BusFault General logic is that operations stopped by the MPU are MemManage, and those which go through the MPU and are caught by the unassigned handle are BusFault. Distinguish these by looking at the exception.fsr values, and set the CFSR bits and (if appropriate) fill in the BFAR or MMFAR with the exception address. Backports commit 5dd0641d234e355597be62e5279d8a519c831625 from qemu	2018-03-02 19:28:21 -05:00
Peter Maydell	9bc3050c51	arm: All M profile cores are PMSA All M profile CPUs are PMSA, so set the feature bit. (We haven't actually implemented the M profile MPU register interface yet, but setting this feature bit gives us closer to correct behaviour for the MPU-disabled case.) Backports commit 790a11503cfb5e1dcd031ea2212bbebae4ca3cec from qemu	2018-03-02 19:26:41 -05:00
Michael Davidsaver	4d8ae4a2b2	armv7m: Implement M profile default memory map Add support for the M profile default memory map which is used if the MPU is not present or disabled. The main differences in behaviour from implementing this correctly are that we set the PAGE_EXEC attribute on the right regions of memory, such that device regions are not executable. Backports commit 3a00d560bcfca7ad04327062c1986a016c104b1f from qemu	2018-03-02 19:25:02 -05:00
Michael Davidsaver	7c845dabe8	armv7m: Improve "-d mmu" tracing for PMSAv7 MPU Improve the "-d mmu" tracing for the PMSAv7 MPU translation process as an aid in debugging guest MPU configurations: * fix a missing newline for a guest-error log * report the region number with guest-error or unimp logs of bad region register values * add a log message for the overall result of the lookup * print "0x" prefix for hex values Backports commit c9f9f1246d630960bce45881e9c0d27b55be71e2 from qemu	2018-03-02 19:17:05 -05:00
Peter Maydell	bfe99e9a0b	arm: Remove unnecessary check on cpu->pmsav7_dregion Now that we enforce both: * pmsav7_dregion == 0 implies has_mpu == false * PMSA with has_mpu == false means SCTLR.M cannot be set we can remove a check on pmsav7_dregion from get_phys_addr_pmsav7(), because we can only reach this code path if the MPU is enabled (and so region_translation_disabled() returned false). Backports commit e9235c6983b261e04e897e8ff900b2b7a391e644 from qemu	2018-03-02 19:14:50 -05:00
Peter Maydell	349227bb05	arm: Don't let no-MPU PMSA cores write to SCTLR.M If the CPU is a PMSA config with no MPU implemented, then the SCTLR.M bit should be RAZ/WI, so that the guest can never turn on the non-existent MPU. Backports commit 06312febfb2d35367006ef23608ddd6a131214d4 from qemu	2018-03-02 19:13:37 -05:00
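The RAZ/WI behaviour described in the commit above amounts to masking the M bit out of any value written to SCTLR when no MPU exists. A minimal sketch, assuming only that SCTLR.M is bit 0; the function name and the `has_mpu` parameter are hypothetical stand-ins for whatever the real write hook uses.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SCTLR_M (1u << 0)  /* MPU/MMU enable bit */

/* Value actually stored for a guest write to SCTLR. */
static uint32_t sketch_sctlr_write(uint32_t value, bool has_mpu)
{
    if (!has_mpu) {
        /* Write-ignored: the guest can never turn on a missing MPU. */
        value &= ~SKETCH_SCTLR_M;
    }
    return value;
}
```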
Peter Maydell	e564ed6311	arm: Don't clear ARM_FEATURE_PMSA for no-mpu configs Fix the handling of QOM properties for PMSA CPUs with no MPU: Allow no-MPU to be specified by either: * has-mpu = false * pmsav7_dregion = 0 and make setting one imply the other. Don't clear the PMSA feature bit in this situation. Backports commit f50cd31413d8bc9d1eef8edd1f878324543bf65d from qemu	2018-03-02 19:12:20 -05:00
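The "setting one implies the other" rule from the commit above can be sketched as a small normalisation step. The struct and function below are hypothetical illustrations, not the QOM property code; they only show that either way of requesting no MPU ends in the same state.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct SketchMpuConfig {
    bool has_mpu;      /* models the has-mpu property */
    uint32_t dregion;  /* models pmsav7_dregion (number of regions) */
} SketchMpuConfig;

/* Make has_mpu == false and dregion == 0 imply each other. */
static void sketch_normalize_mpu(SketchMpuConfig *cfg)
{
    if (!cfg->has_mpu) {
        cfg->dregion = 0;
    }
    if (cfg->dregion == 0) {
        cfg->has_mpu = false;
    }
}
```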
Peter Maydell	6614ba9615	arm: Clean up handling of no-MPU PMSA CPUs ARM CPUs come in two flavours: * proper MMU ("VMSA") * only an MPU ("PMSA") For PMSA, the MPU may be implemented, or not (in which case there is default "always acts the same" behaviour, but it isn't guest programmable). QEMU is a bit confused about how we indicate this: we have an ARM_FEATURE_MPU, but it's not clear whether this indicates "PMSA, not VMSA" or "PMSA and MPU present", and sometimes we use it for one purpose and sometimes the other. Currently trying to implement a PMSA-without-MPU core won't work correctly because we turn off the ARM_FEATURE_MPU bit and then a lot of things which should still exist get turned off too. As the first step in cleaning this up, rename the feature bit to ARM_FEATURE_PMSA, which indicates a PMSA CPU (with or without MPU). Backports commit 452a095526a0537f16c271516a2200877a272ea8 from qemu	2018-03-02 19:05:31 -05:00
Peter Maydell	b50d2da03c	arm: Use different ARMMMUIdx values for M profile Make M profile use completely separate ARMMMUIdx values from those that A profile CPUs use. This is a prelude to adding support for the MPU and for v8M, which together will require 6 MMU indexes which don't map cleanly onto the A profile uses: non secure User; non secure Privileged; non secure Privileged, execution priority < 0; secure User; secure Privileged; secure Privileged, execution priority < 0. Backports commit e7b921c2d9efc249f99b9feb0e7dca82c96aa5c4 from qemu	2018-03-02 19:01:42 -05:00
Michael Davidsaver	f532e80749	armv7m: Escalate exceptions to HardFault if necessary The v7M exception architecture requires that if a synchronous exception cannot be taken immediately (because it is disabled or at too low a priority) then it should be escalated to HardFault (and the HardFault exception is then taken). Implement this escalation logic. Backports commit a73c98e159d18155445d29b6044be6ad49fd802f from qemu	2018-03-02 18:59:13 -05:00
Peter Maydell	b7bf752d3c	arm: Add support for M profile CPUs having different MMU index semantics The M profile CPU's MPU has an awkward corner case which we would like to implement with a different MMU index. We can avoid having to bump the number of MMU modes ARM uses, because some of our existing MMU indexes are only used by non-M-profile CPUs, so we can borrow one. To avoid that getting too confusing, clean up the code to try to keep the two meanings of the index separate. Instead of ARMMMUIdx enum values being identical to core QEMU MMU index values, they are now the core index values with some high bits set. Any particular CPU always uses the same high bits (so eventually A profile cores and M profile cores will use different bits). New functions arm_to_core_mmu_idx() and core_to_arm_mmu_idx() convert between the two. In general core index values are stored in 'int' types, and ARM values are stored in ARMMMUIdx types. Backports commit 8bd5c82030b2cb09d3eef6b444f1620911cc9fc5 from qemu	2018-03-02 18:59:13 -05:00
Wei Huang	19335c32c9	target/arm: clear PMUVER field of AA64DFR0 when vPMU=off The PMUv3 driver of the Linux kernel (in arch/arm64/kernel/perf_event.c) relies on the PMUVER field of id_aa64dfr0_el1 to decide if PMU support is present or not. This patch clears the PMUVER field under TCG mode when vPMU=off. Without it, PMUv3 will init inside guest VMs even with vPMU=off. This patch also removes a redundant line inside the if-statement. Backports commit 2b3ffa929249b15a75d8bde3e8e57a744f52aff0 from qemu	2018-03-02 18:59:12 -05:00
Peter Maydell	4789e49c4d	arm: Use the mmu_idx we're passed in arm_cpu_do_unaligned_access() When identifying the DFSR format for an alignment fault, use the mmu index that we are passed, rather than calling cpu_mmu_index() to get the mmu index for the current CPU state. This doesn't actually make any difference since the only cases where the current MMU index differs from the index used for the load are the "unprivileged load/store" instructions, and in that case the mmu index may differ but the translation regime is the same (apart from the "use from Hyp mode" case which is UNPREDICTABLE). However it's the more logical thing to do. Backports commit e517d95b63427fae9f03958dbc005c36b4ebf2cf from qemu	2018-03-02 18:59:12 -05:00
Peter Xu	fce1b469e5	memory: tune last param of iommu_ops.translate() This patch converts the old "is_write" bool into IOMMUAccessFlags. The difference is that "is_write" can only express either read or write, but sometimes what we really want is "none" here (neither read nor write). Replay is a good example: during replay, we should not check any RW permission bits, since that's not an actual IO at all. Backports commit bf55b7afce53718ef96f4e6616da62c0ccac37dd from qemu	2018-03-02 18:59:12 -05:00
Peter Xu	5621c7e09f	exec: abstract address_space_do_translate() This function is an abstraction helper for address_space_translate() and address_space_get_iotlb_entry(). It does the lookup of address into memory region section, then does proper IOMMU translation if necessary. Refactor the two existing functions to use it. This fixes vhost when IOMMU is disabled by guest. Backports commit a764040cc831cfe5b8bf1c80e8341b9bf2de3ce8 from qemu	2018-03-02 18:59:12 -05:00
Nikunj A Dadhania	d907423bac	cputlb: handle first atomic write to the page In the case where the conditional write is the first write to the page, TLB_NOTDIRTY will be set and stop_the_world is triggered. Handle this as a special case and set the dirty bit. After that, fall through to the actual atomic instruction below. Backports commit 7f9af1abdcc69fd1d3d8d2be68464329600616d6 from qemu	2018-03-02 18:59:12 -05:00
Aurelien Jarno	00ebbae128	tcg/mips: fix field extraction opcode The "msb" argument should correspond to (len - 1). Backports commit 2f5a5f5774d95baacf86c03aa8a77a2d0390f2b2 from qemu	2018-03-02 18:59:12 -05:00
Richard Henderson	69116abafc	tcg: Initialize return value after exit_atomic Users of tcg_gen_atomic_cmpxchg and do_atomic_op rightfully utilize the output. Even though this code is dead, it gets translated, and without the initialization we encounter a tcg_error. Backports commit 79b1af906245558c30e0a5faf26cb52b63f83cce from qemu	2018-03-02 18:59:11 -05:00
Gerd Hoffmann	108354cc4a	bitmap: add bitmap_copy_and_clear_atomic Backports commit d6eb1413920affb7be3df9982682dd183a805dd7 from qemu	2018-03-02 18:59:11 -05:00
Peter Maydell	b8b70dfcd2	Drop QEMU_GNUC_PREREQ() checks for gcc older than 4.1 We already require gcc 4.1 or newer (for the atomic support), so the fallback codepaths for older gcc versions than that are now dead code and we can just delete them. NB: clang reports itself as gcc 4.2 (regardless of clang version), so clang won't be using the fallbacks either. Backports commit fa54abb8c298f892639ffc4bc2f61448ac3be4a1 from qemu	2018-03-02 18:59:05 -05:00
Peter Maydell	2935a9af7a	arm: Remove workarounds for old M-profile exception return implementation Now that we've rewritten M-profile exception return so that the magic PC values are not visible to other parts of QEMU, we can delete the special casing of them elsewhere. Backports commit f4e8e4edda875cab9df91dc4ae9767f7cb1f50aa from qemu	2018-03-02 15:02:14 -05:00
Peter Maydell	44bf8985e5	arm: Implement M profile exception return properly On M profile, return from exceptions happens when code in Handler mode executes one of the following function call return instructions:
 * POP or LDM which loads the PC
 * LDR to PC
 * BX register
and the new PC value is 0xFFxxxxxx.

QEMU tries to implement this by not treating the instruction specially but then catching the attempt to execute from the magic address value. This is not ideal, because:
 * there are guest visible differences from the architecturally specified behaviour (for instance jumping to 0xFFxxxxxx via a different instruction should not cause an exception return but it will in the QEMU implementation)
 * we have to account for it in various places (like refusing to take an interrupt if the PC is at a magic value, and making sure that the MPU doesn't deny execution at the magic value addresses)

Drop these hacks, and instead implement exception return the way the architecture specifies -- by having the relevant instructions check for the magic value and raise the 'do an exception return' QEMU internal exception immediately. 
The effect on the generated code is minor:

bx lr, old code (and new code for Thread mode):

TCG:
    mov_i32 tmp5,r14
    movi_i32 tmp6,$0xfffffffffffffffe
    and_i32 pc,tmp5,tmp6
    movi_i32 tmp6,$0x1
    and_i32 tmp5,tmp5,tmp6
    st_i32 tmp5,env,$0x218
    exit_tb $0x0
    set_label $L0
    exit_tb $0x7f2aabd61993

x86_64 generated code:
    0x7f2aabe87019: mov %ebx,%ebp
    0x7f2aabe8701b: and $0xfffffffffffffffe,%ebp
    0x7f2aabe8701e: mov %ebp,0x3c(%r14)
    0x7f2aabe87022: and $0x1,%ebx
    0x7f2aabe87025: mov %ebx,0x218(%r14)
    0x7f2aabe8702c: xor %eax,%eax
    0x7f2aabe8702e: jmpq 0x7f2aabe7c016

bx lr, new code when in Handler mode:

TCG:
    mov_i32 tmp5,r14
    movi_i32 tmp6,$0xfffffffffffffffe
    and_i32 pc,tmp5,tmp6
    movi_i32 tmp6,$0x1
    and_i32 tmp5,tmp5,tmp6
    st_i32 tmp5,env,$0x218
    movi_i32 tmp5,$0xffffffffff000000
    brcond_i32 pc,tmp5,geu,$L1
    exit_tb $0x0
    set_label $L1
    movi_i32 tmp5,$0x8
    call exception_internal,$0x0,$0,env,tmp5

x86_64 generated code:
    0x7fe8fa1264e3: mov %ebp,%ebx
    0x7fe8fa1264e5: and $0xfffffffffffffffe,%ebx
    0x7fe8fa1264e8: mov %ebx,0x3c(%r14)
    0x7fe8fa1264ec: and $0x1,%ebp
    0x7fe8fa1264ef: mov %ebp,0x218(%r14)
    0x7fe8fa1264f6: cmp $0xff000000,%ebx
    0x7fe8fa1264fc: jae 0x7fe8fa126509
    0x7fe8fa126502: xor %eax,%eax
    0x7fe8fa126504: jmpq 0x7fe8fa122016
    0x7fe8fa126509: mov %r14,%rdi
    0x7fe8fa12650c: mov $0x8,%esi
    0x7fe8fa126511: mov $0x56095dbeccf5,%r10
    0x7fe8fa12651b: callq *%r10

which is a difference of one cmp/branch-not-taken. This will be lost in the noise of having to exit generated code and look up the next TB anyway. Backports commit 3bb8a96f5348913ee130169504f3642f501b113e from qemu	2018-03-02 14:58:14 -05:00
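
For readers who prefer C to TCG dumps, the Handler-mode check described in the last commit message can be sketched roughly as follows. This is an illustrative model only, not QEMU or unicorn code: the struct, enum, and function names (SketchCpu, BxOutcome, sketch_bx) and the MAGIC_EXC_RETURN_BASE constant name are assumptions made up for this sketch. Only the 0xff000000 threshold, the masking of bit 0, and the Handler-mode condition come from the message itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the value mirrors the 0xff000000 compare in the dump. */
#define MAGIC_EXC_RETURN_BASE 0xff000000u

/* Hypothetical outcome type: what the translated code does after BX. */
typedef enum {
    NEXT_TB_NORMAL,           /* exit and look up the next TB as usual */
    NEXT_TB_EXCEPTION_RETURN  /* raise the internal 'exception return' event */
} BxOutcome;

/* Hypothetical, minimal CPU state for illustration only. */
typedef struct {
    uint32_t pc;
    uint32_t thumb;        /* bit 0 of the branch target */
    bool handler_mode;     /* true when executing in Handler mode */
} SketchCpu;

/* Model of "bx <reg>": split the target into PC and Thumb bit, then,
 * only in Handler mode, check the new PC against the magic range. */
static BxOutcome sketch_bx(SketchCpu *cpu, uint32_t target)
{
    cpu->pc = target & ~1u;     /* and_i32 pc,tmp5,0xfffffffe */
    cpu->thumb = target & 1u;   /* and_i32 tmp5,tmp5,0x1 */

    if (cpu->handler_mode && cpu->pc >= MAGIC_EXC_RETURN_BASE) {
        return NEXT_TB_EXCEPTION_RETURN;  /* brcond_i32 pc,...,geu taken */
    }
    return NEXT_TB_NORMAL;
}
```

In Thread mode the comparison is never performed, which corresponds to the message's note that the old code and the new Thread-mode code are identical.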

3765 commits