Commit graph

177 commits

Author SHA1 Message Date
Emilio G. Cota d3ada2feb5
tcg: allocate TB structs before the corresponding translated code
Allocating an arbitrarily-sized array of tbs results in either
(a) a lot of memory wasted or (b) unnecessary flushes of the code
cache when we run out of TB structs in the array.

An obvious solution would be to just malloc a TB struct when needed,
and keep the TB array as an array of pointers (recall that tb_find_pc()
needs the TB array to run in O(log n)).

Perhaps a better solution, which is implemented in this patch, is to
allocate TB's right before the translated code they describe. This
results in some memory waste due to padding to have code and TBs in
separate cache lines--for instance, I measured 4.7% of padding in the
used portion of code_gen_buffer when booting aarch64 Linux on a
host with 64-byte cache lines. However, it can allow for optimizations
in some host architectures, since TCG backends could safely assume that
the TB and the corresponding translated code are very close to each
other in memory. See this message by rth for a detailed explanation:

https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html
Subject: Re: GSoC 2017 Proposal: TCG performance enhancements

Backports commit 6e3b2bfd6af488a896f7936e99ef160f8f37e6f2 from qemu
2018-03-03 17:05:49 -05:00
Emilio G. Cota 8f4f15e5f5
tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr
Instead of exporting goto_ptr directly to TCG frontends, export
tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer
returned by the lookup_tb_ptr() helper. This is the only use case
we have for goto_ptr and lookup_tb_ptr, so having this function is
very convenient. Furthermore, it trivially allows us to avoid calling
the lookup helper if goto_ptr is not implemented by the backend.

Backports commit cedbcb01529cb6cf9a2289cdbebbc63f6149fc18 from qemu
2018-03-02 21:05:18 -05:00
Richard Henderson b4b173615c
tcg: Allow an operand to be matching or a constant
This allows an output operand to match an input operand
only when the input operand needs a register.

Backports commit 17280ff4a5f264e01e55ae514ee6d3586f9577b2 from qemu
2018-03-01 15:49:05 -05:00
Richard Henderson 3f38611159
tcg: Pass the opcode width to target_parse_constraint
This will let us choose how to interpret a given constraint
depending on whether the opcode is 32- or 64-bit. Which will
let us share more constraint combinations between opcodes.

At the same time, change the interface to return the advanced
pointer instead of passing it in/out by reference.

Backports commit 069ea736b50b75fdec99c9b8cc603b97bd98419e from qemu
2018-03-01 15:45:40 -05:00
Richard Henderson b8c93597b4
tcg: Transition flat op_defs array to a target callback
This will allow the target to tailor the constraints to the
auto-detected ISA extensions.

Backports commit f69d277ece43c42c7ab0144c2ff05ba740f6706b from qemu
2018-03-01 15:40:11 -05:00
Richard Henderson 551ef0a9f7
tcg: Add markup for output requires new register
This is the same concept as, and same markup as, the
early clobber markup in gcc.

Backports commit 82790a870992bd87d5fd9e607f40859dcf4f82ac from qemu
2018-03-01 15:24:58 -05:00
Paolo Bonzini 8734e13a73
tcg: try sti when moving a constant into a dead memory temp
This comes from free from unifying tcg_reg_alloc_mov and
tcg_reg_alloc_movi's handling of TEMP_VAL_CONST. It triggers
often on moves to cc_dst, such as the following translation
of "sub $0x3c,%esp":

before: after:
subl $0x3c,%ebp subl $0x3c,%ebp
movl %ebp,0x10(%r14) movl %ebp,0x10(%r14)
movl $0x3c,%ebx movl $0x3c,0x2c(%r14)
movl %ebx,0x2c(%r14)

Backports commit 0fe4fca4e1a5e06a270127dd80bb753d4dda61c6 from qemu
2018-02-26 10:08:47 -05:00
Thomas Huth b581d4033f
tcg: Remove duplicate header includes
host-utils.h and timer.h are included twice in tcg.c.
One time should be enough.

Backports commit 347519eb9d68303a6c23a7663c0fa6c20a225191 from qemu
2018-02-26 02:29:38 -05:00
Richard Henderson ede1cae3dc
tcg: Lower indirect registers in a separate pass
Rather than rely on recursion during the middle of register allocation,
lower indirect registers to loads and stores off the indirect base into
plain temps.

For an x86_64 host, with sufficient registers, this results in identical
code, modulo the actual register assignments.

For an i686 host, with insufficient registers, this means that temps can
be (temporarily) spilled to the stack in order to satisfy an allocation.
This as opposed to the possibility of not being able to spill, to allocate
a register for the indirect base, in order to perform a spill.

Backports commit 5a18407f55ade924aa6397c9a043a9ffd59645fe from qemu
2018-02-25 22:32:28 -05:00
Richard Henderson 8a012ff6d3
tcg: Require liveness analysis
Backports commit c0ef05b5e62ab0c291a94022f14104e61e306f03 from qemu
2018-02-25 22:20:42 -05:00
Richard Henderson 2aa46dd9a1
tcg: Include liveness info in the dumps
Backports commit bdfb460ef77500f7b186759b585f06ff2120929d from qemu
2018-02-25 22:13:08 -05:00
Richard Henderson e973e89a57
tcg: Compress dead_temps and mem_temps into a single array
We only need two bits per temporary. Fold the two bytes into one,
and reduce the memory and cachelines required during compilation.

Backports commit c70fbf0a9938baf3b4f843355a77c17a7e945b98 from qemu
2018-02-25 22:07:08 -05:00
Richard Henderson 690985a582
tcg: Fold life data into TCGOp
Reduce the size of other bitfields to make room.
This reduces the cache footprint of compilation.

Backports commit bee158cb4dde35c41632a3a129c869f14a32f8f0 from qemu
2018-02-25 21:49:42 -05:00
Richard Henderson 1547048a22
tcg: Reorg TCGOp chaining
Instead of using -1 as end of chain, use 0, and link through the 0
entry as a fully circular double-linked list.

Backports commit dcb8e75870e2de199db853697f8839cb603beefe from qemu
2018-02-25 21:44:50 -05:00
Richard Henderson b2e6e351c2
tcg: Compress liveness data to 16 bits
This reduces both memory usage and per-insn cacheline usage
during code generation.

Backports commit a1b3c48d2b23d6eaeb4529d3e1183d2648731bf8 from qemu
2018-02-25 21:27:24 -05:00
Sergey Sorokin e4d123caa9
tcg: Improve the alignment check infrastructure
Some architectures (e.g. ARMv8) need the address which is aligned
to a size more than the size of the memory access.
To support such check it's enough the current costless alignment
check implementation in QEMU, but we need to support
an alignment size specifying.

Backports commit 1f00b27f17518a1bcb4cedca49eaec96a4d560bd from qemu
2018-02-25 02:23:28 -05:00
Richard Henderson 23586e2674
tcg: Optimize spills of constants
While we can store constants via constrants on INDEX_op_st_i32 et al,
we weren't able to spill constants to backing store.

Add a new backend interface, tcg_out_sti, which may store the constant
(and is allowed to fail). Rearrange the temp_* helpers so that we only
attempt to directly store a constant when the temp is becoming dead/free.

Backports commit 59d7c14eeff8d2ad7f61aed86ce5a176113bc153 from qemu
2018-02-25 01:45:29 -05:00
Richard Henderson 64fda683b1
tcg: Fix name for high-half register 2018-02-25 01:36:35 -05:00
Emilio G. Cota 8518f55df7
compiler.h: add QEMU_ALIGNED() to enforce struct alignment
Backports commit 911a4d2215b05267b16925503218f49d607c6b29 from qemu
2018-02-24 17:32:43 -05:00
Paolo Bonzini 9485b7c2e1
cpu: move exec-all.h inclusion out of cpu.h
exec-all.h contains TCG-specific definitions. It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.

One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.

Backports commit 63c915526d6a54a95919ebece83fa9ca631b2508 from qemu
2018-02-24 02:39:08 -05:00
Aurelien Jarno 6060ab6596
tcg: check for CONFIG_DEBUG_TCG instead of NDEBUG
Check for CONFIG_DEBUG_TCG instead of NDEBUG, drop now useless code.

Backports commit 8d8fdbae010aa75a23f0307172e81034125aba6e from qemu
2018-02-23 13:55:21 -05:00
Aurelien Jarno 355ed7cd08
tcg: use tcg_debug_assert instead of assert (fix performance regression)
The TCG code is quite performance sensitive, but at the same time can
also be quite tricky. That is why asserts that can be enabled with the
--enable-debug-tcg configure option.

This used to work the following way:

| #include "config.h"
|
| ...
|
| #if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG)
| /* define it to suppress various consistency checks (faster) */
| #define NDEBUG
| #endif
|
| ...
|
| #include <assert.h>

Since commit 757e725b (tcg: Clean up includes) "config.h" as been
replaced by "qemu/osdep.h" which itself includes <assert.h>. As a
consequence the assertions are always enabled, even when using
--disable-debug-tcg, causing a performance regression, especially on
targets with many registers. For instance on qemu-system-ppc the
speed difference is about 15%.

tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and already
uses in some places. This patch replaces all the calls to assert into
calss to tcg_debug_assert.

Backports commit eabb7b91b36b202b4dac2df2d59d698e3aff197a from qemu
2018-02-23 13:52:13 -05:00
Alex Bennée 3da7d9d9ae
qemu-log: dfilter-ise exec, out_asm, op and opt_op
qemu-log: dfilter-ise exec, out_asm, op and opt_op

This ensures the code generation debug code will honour -dfilter if set.
For the "exec" tracing I've added a new inline macro for efficiency's
sake.

Backports commit d977e1c2dbc9e63454b2000f91954d02543bf43b from qemu
2018-02-22 10:06:19 -05:00
Alex Bennée bc5d7c5e1d
tcg: pass down TranslationBlock to tcg_code_gen
My later debugging patches need access to the origin PC which is held in
the TranslationBlock structure. Pass down the whole structure as it also
holds the information about the code start point.

Backports commit 5bd2ec3d7b47b2252745882795d79aef36380fb7 from qemu
2018-02-22 09:28:06 -05:00
Veronia Bahaa bafc81b1d3
util: move declarations out of qemu-common.h
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)

Backports commit f348b6d1a53e5271cf1c9f9acc4646b4b98c1771 from qemu
2018-02-22 09:25:48 -05:00
Peter Maydell 7784a25470
tcg: Rename tcg-target.c to tcg-target.inc.c
Rename the per-architecture tcg-target.c files to tcg-target.inc.c.
This makes it clearer that they are not intended to be standalone
C files, but are instead #included into another source file.

Backports commit ce151109813e2770fd3cee2f37bfa2cdd01a12b9 from qemu
2018-02-20 20:39:57 -05:00
Richard Henderson 3653771265
tcg: Allocate indirect_base temporaries in a different order
Since we've not got liveness analysis for indirect bases,
placing them at the end of the call-saved registers makes
it more likely that it'll stay live.

Backports commit 91478cefaaf2fa678e56df8635b34957f4d5d565 from qemu
2018-02-20 19:46:59 -05:00
Richard Henderson bf385eba3c
tcg: Implement indirect memory registers
That is, global_mem registers whose base is another global_mem
register, rather than a fixed register.

Backports commit b3915dbbdcdb2e04753f3d34a1b0865eea005069 from qemu
2018-02-20 19:20:01 -05:00
Richard Henderson 9299329349
tcg: Work around clang bug wrt enum ranges, part 2
A previous patch patch changed the type of REG from int
to enum TCGReg, which provokes the following bug in clang:

https://llvm.org/bugs/show_bug.cgi?id=16154

Backports commit 869938ae2a284fe730cb6f807ea0f9e324e0f87c from qemu
2018-02-20 19:12:49 -05:00
Richard Henderson 292c67109a
tcg: Introduce temp_load
Unify all of the places that realize a temporary into a register.

Backports commit 40ae5c62ebaaf7d9d3b93b88c2d32bf6342f7889 from qemu
2018-02-19 11:44:01 -05:00
Richard Henderson c821ffd989
tcg: Change temp_save argument to TCGTemp
Backports commit b13eb728d33deaa53efc0dcef557da998e6ec40e from qemu
2018-02-19 11:39:04 -05:00
Richard Henderson 2c3ad57215
tcg: Change temp_sync argument to TCGTemp
Backports commit 12b9b11a2743002232098afb41810f1c0cb211a0 from qemu
2018-02-19 11:37:12 -05:00
Richard Henderson 82a4e93629
tcg: Change temp_dead argument to TCGTemp
Backports commit f8bf00f1028a00a7978e9175da53944de95b9fcb from qemu
2018-02-19 11:34:17 -05:00
Richard Henderson daf837956c
tcg: Change reg_to_temp to TCGTemp pointer
Backports commit f8b2f202344b362b1e676688f838d6b7c08f1975 from qemu
2018-02-19 11:30:26 -05:00
Richard Henderson cf59e51811
tcg: Work around clang bug wrt enum ranges
A subsequent patch patch will change the type of REG from int
to enum TCGReg, which provokes the following bug in clang:

https://llvm.org/bugs/show_bug.cgi?id=16154

Backports commit c8074023204e8e8a213399961ab56e2814aa6116 from qemu
2018-02-19 11:23:19 -05:00
Richard Henderson 7cb5f2fed8
tcg: Tidy temporary allocation
In particular, make sure the memory is memset before use.
Continues the increased use of TCGTemp pointers instead of
integer indices where appropriate.

Backports commit 7ca4b752feaab647b0c1a147bd3815fcdb479a59 from qemu
2018-02-19 11:17:45 -05:00
Richard Henderson 45f9ddf970
tcg: Remove tcg_get_arg_str_i32/64
Backports commit e4ce0d4eb774eb2a8b6a27cd8a6f1d75e05c21ae from qemu
2018-02-19 02:07:04 -05:00
Richard Henderson 12577dfcc0
tcg: More use of TCGReg where appropriate
Backports commit b66386623176e0b0f3bd270640bdb8ac8431c732 from qemu
2018-02-19 02:06:08 -05:00
Richard Henderson c507f16702
tcg: Remove lingering references to gen_opc_buf
Three in comments and one in code in the stub tcg_liveness_analysis.

Backports commit 201577059331b8b3aef221ee2ed594deb99d6631 from qemu
2018-02-19 01:42:55 -05:00
Richard Henderson 8dbf46ca82
tcg: Respect highwater in tcg_out_tb_finalize
Undo the workaround at b17a6d3390f87620735f7efb03bb1c96682ff449.

If there are lots of memory operations in a TB, the slow path code
can exceed the highwater reservation. Add a check within the loop.

Backports commit 23dceda62a3643f734b7aa474fa6052593ae1a70 from qemu
2018-02-19 01:40:20 -05:00
Peter Maydell 4ca19f2cd6
tcg: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Backports commit 757e725b58c57d3ebb66a31fd2210df977a12154 from qemu
2018-02-19 01:04:30 -05:00
John Clarke 5c57445f08
tcg: Fix highwater check
A simple typo in the variable to use when comparing vs the highwater mark.
Reports are that qemu can in fact segfault occasionally due to this mistake.

Backports commit 644da9b39e477caa80bab69d2847dfcb468f0d33 from qemu
2018-02-17 18:53:18 -05:00
Richard Henderson bdf667fd4e
tcg: Check for overflow via highwater mark
We currently pre-compute an worst case code size for any TB, which
works out to be 122kB. Since the average TB size is near 1kB, this
wastes quite a lot of storage.

Instead, check for overflow in between generating code for each opcode.
The overhead of the check isn't measurable and wastage is minimized.

Backports commit b125f9dc7bd68cd4c57189db4da83b0620b28a72 from qemu
2018-02-17 15:24:00 -05:00
Richard Henderson 19a3c7e03f
tcg: Emit prologue to the beginning of code_gen_buffer
By putting the prologue at the end, we risk overwriting the
prologue should our estimate of maximum TB size. Given the
two different placements of the call to tcg_prologue_init,
move the high water mark computation into tcg_prologue_init.

Backports commit 8163b74938d8b7d12e70597c4553dd0dc49443d5 from qemu
2018-02-17 15:24:00 -05:00
Richard Henderson 532877a366
tcg: Remove tcg_gen_code_search_pc
It's no longer used, so tidy up everything reached by it.

Backports commit 04fe64000162c45d8974da9ca4d266f8d0e67eb7 from qemu
2018-02-17 15:24:00 -05:00
Richard Henderson 66de6cc37c
tcg: Save insn data and use it in cpu_restore_state_from_tb
We can now restore state without retranslation.

Backports commit fca8a500d519a56abeaedf8073167a61d3c6b9c4 from qemu
2018-02-17 15:23:59 -05:00
Richard Henderson 1cbd175736
tcg: Pass data argument to restore_state_to_opc
The gen_opc_* arrays are already redundant with the data stored in
the insn_start arguments. Transition restore_state_to_opc to use
data from the latter.

Backports commit bad729e272387de7dbfa3ec4319036552fc6c107 from qemu
2018-02-17 15:23:58 -05:00
Peter Crosthwaite 2b15db6e12
tcg: split tcg_op_defs to -common
tcg_op_defs (and the _max) are both needed by the TCI disassembler. For
multi-arch, tcg.c will be multiple-compiled (arch-obj) with its symbols
hidden from common code. So split the definition off to new file,
tcg-common.c which will remain a regular obj-y for use by both the TCI
disas as well as the multiple tcg.c's.

Backports commit 7d8f787d9d261d6880b69e35ed682241e3f9242f from qemu
2018-02-17 15:23:51 -05:00
Richard Henderson 3f9502dc8b
tcg: Allow extra data to be attached to insn_start
With an eye toward having this data replace the gen_opc_* arrays
that each target collects in order to enable restore_state_from_tb.

Backports commit 9aef40ed1f6e2bd794bbb3ba8c8b773e506334c9 from qemu
2018-02-11 13:03:51 -05:00
Lioncash b3f9ff667b
tcg: Rename debug_insn_start to insn_start
With an eye toward making it mandatory.

Backports commit 765b842adec4c5a359e69ca08785553599f71496 from qemu
2018-02-11 12:34:01 -05:00
Richard Henderson 17c1f027c1
tcg: Handle MO_AMASK in tcg_dump_ops
Backports commit 59c4b7e8dfab0cdc41434fedbf2686222f541e57 from qemu
2018-02-10 20:32:52 -05:00
Richard Henderson 6234d07489
tcg: Merge memop and mmu_idx parameters to qemu_ld/st
At the tcg opcode level, not at the tcg-op.h generator level.
This requires minor changes through all of the tcg backends,
but none of the cpu translators.

Backports commit 59227d5d45bb3c31dc2118011691c35b3c00879c from qemu
2018-02-10 19:01:49 -05:00
Richard Henderson 6bd102ba86
tcg: Use tcg_malloc to allocate TCGLabel
Pre-allocating 512 of them per TB is a waste.

Backports commit 51e3972c41598adc91fe3f4767057f5198dcc15c from qemu
2018-02-09 14:48:20 -05:00
Richard Henderson 00b0a50f47
tcg: Change generator-side labels to a pointer
This is less about improved type checking than enabling a
subsequent change to the representation of labels.

Backports commit bec1631100323fac0900aea71043d5c4e22fc2fa from qemu
2018-02-09 14:40:59 -05:00
Richard Henderson 232632e76c
tcg: Change translator-side labels to a pointer
This is improved type checking for the translators -- it's no longer
possible to accidentally swap arguments to the branch functions.

Note that the code generating backends still manipulate labels as int.

With notable exceptions, the scope of the change is just a few lines
for each target, so it's not worth building extra machinery to do this
change in per-target increments.

Backports commit 42a268c241183877192c376d03bd9b6d527407c7 from qemu
2018-02-09 14:17:56 -05:00
Richard Henderson 255a160c66
tcg: Remove unused opcodes
We no longer need INDEX_op_end to terminate the list, nor do we
need 5 forms of nop, since we just remove the TCGOp instead.

Backports commit 15fc7daa770764cc795158cbb525569f156f3659 from qemu
2018-02-09 13:20:41 -05:00
Richard Henderson 4fcaabf38c
tcg: Remove opcodes instead of noping them out
With the linked list scheme we need not leave nops in the stream
that we need to process later.

Backports commit 0c627cdca20155753a536c51385abb73941a59a0 from qemu
2018-02-09 13:03:58 -05:00
Lioncash 0273e6ae18
tcg: Put opcodes in a linked list
The previous setup required ops and args to be completely sequential,
and was error prone when it came to both iteration and optimization.
2018-02-09 12:54:05 -05:00
Richard Henderson 500c546444
tcg: Move some opcode generation functions out of line
Some of these functions are really quite large.  We have a number of
things that ought to be circularly dependent, but we duplicated code
to break that chain for the inlines.

This saved 25% of the code size of one of the translators I examined.
2018-02-09 08:10:00 -05:00
Richard Henderson cb7b19ad26
tcg: Change ts->mem_reg to ts->mem_base
Chain the temporaries together via pointers intstead of indices.
The mem_reg value is now mem_base->reg.  This will be important later.

This does require that the frame pointer have a global temporary
allocated for it.  This is simple bar the existing reserved_regs check.

Backports commit b3a62939561e07bc34493444fa926b6137cba4e8 from qemu
2018-02-08 13:04:48 -05:00
Richard Henderson 6b4b493dae
tcg: Change tcg_global_mem_new_* to take a TCGv_ptr
Thus, use cpu_env as the parameter, not TCG_AREG0 directly.
Update all uses in the translators.

Backports commit e1ccc05444676b92c63708096e36582be27fbee1 from qemu
2018-02-08 12:33:33 -05:00
Nguyen Anh Quynh c01dcf0a14 fix merge conflicts 2017-03-10 21:04:33 +08:00
Nguyen Anh Quynh c3808179e1 another attempt to fix #766 2017-02-26 15:22:24 +08:00
Nguyen Anh Quynh e65fef70dc add missing TCG context arg to few functions in tcg.c. see #766 2017-02-26 09:47:40 +08:00
xorstream 770c5616e2 Automated leading tab to spaces conversion. 2017-01-21 12:28:22 +11:00
xorstream 002151874a Unicorn interface working with test app in 32bit and 64bit builds. 2017-01-20 17:27:22 +11:00
xorstream 1aeaf5c40d This code should now build the x86_x64-softmmu part 2. 2017-01-19 22:50:28 +11:00
Chris Eagle fccbcfd4c2 revert to use of g_free to make future qemu integrations easier (#695)
* revert to use of g_free to make future qemu integrations easier

* bracing
2016-12-21 22:28:36 +08:00
Chris Eagle e46545f722 remove glib dependency by provide compatible replacements 2016-12-18 14:56:58 -08:00
Nguyen Anh Quynh bfeb08d1ba fix some compilation warning 2016-01-06 14:11:21 +08:00
Nguyen Anh Quynh e0cb02569e remove unused tcg_register_jit() and related code 2016-01-05 16:02:34 +07:00
Nguyen Anh Quynh b5feddbf1e indentation 2015-12-28 13:04:59 +08:00
JC Yang 8ef018a2cb Fix possible wrong conditional branch in generated host code by fixing
the tcg_liveness_analysis().
Refer to https://github.com/unicorn-engine/unicorn/issues/287 for further info.
2015-12-21 18:01:01 +08:00
farmdve 65a649dec0 Fix issue #269
Patch from here
http://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg03848.html

Also fix another potential issue with constants from
bbeb82395e (diff-9e0011b4d4a5890b309421630e6d86c3)
2015-11-17 18:34:38 +02:00
Nguyen Anh Quynh 2b0b4169bc mips: advance PC for SYSCALL instruction. this fixes issue #157 2015-09-28 10:58:43 +08:00
danghvu 0c67f41ed9 Fix issue #118 2015-09-21 20:30:05 -05:00
Nguyen Anh Quynh 344d016104 import 2015-08-21 15:04:50 +08:00