Correct these issues with the handling of CP0.Status for MIPSr6:
* only ignore the bit pattern of 0b11 on writes to CP0.Status.KSU, that
is for processors that do implement Supervisor Mode, let the bit
pattern be written to CP0.Status.UM:R0 freely (of course the value
written to read-only CP0.Status.R0 will be discarded anyway); this is
in accordance to the relevant architecture specification[1],
* check the newly written pattern rather than the current contents of
CP0.Status for the KSU bits being 0b11,
* use meaningful macro names to refer to CP0.Status bits rather than
magic numbers.
References:
[1] "MIPS Architecture For Programmers, Volume III: MIPS64 / microMIPS64
Privileged Resource Architecture", MIPS Technologies, Inc., Document
Number: MD00091, Revision 6.00, March 31, 2014, Table 9.45 "Status
Register Field Descriptions", pp. 210-211.
Backports commit f88f79ec9df06d26d84e1d2e0c02d2634b4d8583 from qemu
Correct MIPS16/microMIPS branch size calculation in PC adjustment
needed:
- to set the value of CP0.ErrorEPC at the entry to the reset exception,
- for the purpose of branch reexecution in the context of device I/O.
Follow the approach taken in `exception_resume_pc' for ordinary, Debug
and NMI exceptions.
MIPS16 and microMIPS branches can be 2 or 4 bytes in size and that has
to be reflected in calculation. Original MIPS ISA branches, which is
where this code originates from, are always 4 bytes long, just as all
original MIPS ISA instructions.
Backports commit c3577479815f5bcf9d38993967bca2115af245d8 from qemu
Restore the order of helpers that used to be: unary operations (generic,
then MIPS-specific), binary operations (generic, then MIPS-specific),
compare operations. At one point FMA operations were inserted at a
random place in the file, disregarding the preexisting order, and later
on even more operations sprinkled across the file. Revert the mess by
moving FMA operations to a new ternary class inserted after the binary
class and move the misplaced unary and binary operations to where they
belong.
Backports commit 8fc605b8aa257feb3e69d44794a765bd492b573b from qemu
Remove the `FLOAT_OP' macro, unused since commit
b6d96beda3a6cbf20a2d04a609eff78adebd8859 [Use temporary registers for
the MIPS FPU emulation.].
Backports commit 51fdea945ae7adae8d7e4a1624e35bb7f714b58f from qemu
Move the call to `update_fcr31' in `helper_float_cvtw_s' after the
exception flag check, for consistency with the remaining helpers that do
it last too.
Backports commit 2b09f94cdbf5c54e2278d7f3aed2eceff3494790 from qemu
Backports commits d75de74967f631a7d0b538d4b88f96f9c426bfe2, 6225a4a0e39cb24e7b9e1d4d2c1a3e6eaee18e85, and d2bfa6e6222baa0218bd0658499d38bac56ac34c from qemu
Add the M14K and M14Kc processors from MIPS Technologies that are the
original implementation of the microMIPS ISA. They are dual instruction
set processors, implementing both the microMIPS and the standard MIPSr32
ISA.
These processors correspond to the M4K and 4KEc CPUs respectively,
except with support for the microMIPS instruction set added, support for
the MCU ASE added and two extra interrupt lines, making a total of 8
hardware interrupts plus 2 software interrupts. The remaining parts of
the microarchitecture, in particular the pipeline, stayed unchanged.
The presence of the microMIPS ASE is is reflected in the configuration
added. We currently have no support for the MCU ASE, including in
particular the ACLR, ASET and IRET instructions in either encoding, and
we have no support for the extra interrupt lines, including bits in
CP0.Status and CP0.Cause registers, so these features are not marked,
making our support diverge from real hardware.
Backports commit 11f5ea105c06bec72e9bc9a700fa65d60afb5ec3 from qemu
Make the data type used for the CP0.Config4 and CP0.Config5 registers
and their mask signed, for consistency with the remaining 32-bit CP0
registers, like CP0.Config0, etc.
Backports commit 8280b12c0e4b515d707509dde4ddde05d9bda4ef from qemu
Add the 5KEc and 5KEf processors from MIPS Technologies that are the
original implementation of the MIPS64r2 ISA.
Silicon for these processors has never been taped out and no soft cores
were released even. They do exist though, a CP0.PRId value has been
assigned and experimental RTLs produced at the time the MIPS64r2 ISA has
been finalized. The settings introduced here faithfully reproduce that
hardware.
As far the implementation goes these processors are the same as the 5Kc
and the 5Kf CPUs respectively, except implementing the MIPS64r2 rather
than the original MIPS64 instruction set. There must have been some
updates to the CP0 architecture as mandated by the ISA, such as the
addition of the EBase register, although I am not sure about the exact
details, no documentation has ever been produced for these processors.
The remaining parts of the microarchitecture, in particular the
pipeline, stayed unchanged. Or to put it another way, the difference
between a 5K and a 5KE CPU corresponds to one between a 4K and a 4KE
CPU, except for the 64-bit rather than 32-bit ISA.
Backports commit 36b86e0dc2be93fc538fe7e11e0fda1a198f0135 from qemu
With an eye toward having this data replace the gen_opc_* arrays
that each target collects in order to enable restore_state_from_tb.
Backports commit 9aef40ed1f6e2bd794bbb3ba8c8b773e506334c9 from qemu
While we're at it, emit the opcode adjacent to where we currently
record data for search_pc. This puts gen_io_start et al on the
"correct" side of the marker.
Backports commit 667b8e29c5b1d8c5b4e6ad5f780ca60914eb6e96 from qemu
Usually, eliminate an operation from the translator by combining
a shift with an extract.
In the case of gen_set_NZ64, we don't need a boolean value for cpu_ZF,
merely a non-zero value. Given that we can extract both halves of a
64-bit input in one call, this simplifies the code.
Backports commit 7cb36e18b2f1c1f971ebdc2121de22a8c2e94fd6 from qemu
For !SF, this initial ext32u can't be optimized away by the
current TCG code generator. (It would require backward bit
liveness propagation.)
Backports commit d3a77b42decd0cbfa62a5526e67d1d6d380c83a9 from qemu
This can allow much of a ccmp to be elided when particular
flags are subsequently dead.
Backports commit 7dd03d773e0dafae9271318fc8d6b2b14de74403 from qemu
Handling this with TCG_COND_ALWAYS will allow these unlikely
cases to be handled without special cases in the rest of the
translator. The TCG optimizer ought to be able to reduce
these ALWAYS conditions completely.
Backports commit 9305eac09e61d857c9cc11e20db754dfc25a82db from qemu
Split arm_gen_test_cc into 3 functions, so that it can be reused
for non-branch TCG comparisons.
Backports commit 6c2c63d3a02c79e9035ca0370cc549d0f938a4dd from qemu
In ffc6372851d8631a9f9fa56ec613b3244dc635b9, we swapped the guest
base to the address base register from the address index register.
Except that 31 in the base slot is SP not XZR, so we need to be
more intelligent about which reg gets placed in which slot.
Backports commit 352bcb0a2b816ff9ab9d75d0f2384650d9e9ab19 from qemu
Rather than allow arbitrary shift+trunc, only concern ourselves
with low and high parts. This is all that was being used anyway.
Backports commit 609ad70562793937257c89d07bf7c1370b9fc9aa from qemu
They behave the same as ext32s_i64 and ext32u_i64 from the constant
folding and zero propagation point of view, except that they can't
be replaced by a mov, so we don't compute the affected value.
Backports commit 8bcb5c8f34f9215d4f88f388c7ff14c9bd5cecd3 from qemu
Implement real ext_i32_i64 and extu_i32_i64 ops. They ensure that a
32-bit value is always converted to a 64-bit value and not propagated
through the register allocator or the optimizer.
Backports commit 4f2331e5b67af8172419eb1c8db510b497b30a7b from qemu
The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32,
and the name in the README doesn't match the name offered to the
frontends.
Always use the long name to make it clear it is a size changing op.
Backports commit 0632e555fc4d281d69cb08d98d500d96185b041f from qemu
Instead of using an enum which could be either a copy or a const, track
them separately. This will be used in the next patch.
Constants are tracked through a bool. Copies are tracked by initializing
temp's next_copy and prev_copy to itself, allowing to simplify the code
a bit.
Backports commit b41059dd9deec367a4ccd296659f0bc5de2dc705 from qemu
Add two accessor functions temp_is_const and temp_is_copy, to make the
code more readable and make code change easier.
Backports commit d9c769c60948815ee03b2684b1c1c68ee4375149 from qemu
The tcg_temp_info structure uses 24 bytes per temp. Now that we emulate
vector registers on most guests, it's not uncommon to have more than 100
used temps. This means we have initialize more than 2kB at least twice
per TB, often more when there is a few goto_tb.
Instead used a TCGTempSet bit array to track which temps are in used in
the current basic block. This means there are only around 16 bytes to
initialize.
This improves the boot time of a MIPS guest on an x86-64 host by around
7% and moves out tcg_optimize from the the top of the profiler list.
Backports commit 1208d7dd5fddc1fbd98de800d17429b4e5578848 from qemu
By convention, on a 64-bit host TCG internally stores 32-bit constants
as sign-extended. This is not the case in the optimizer when a 32-bit
constant is folded.
This doesn't seem to have more consequences than suboptimal code
generation. For instance the x86 backend assumes sign-extended constants,
and in some rare cases uses a 32-bit unsigned immediate 0xffffffff
instead of a 8-bit signed immediate 0xff for the constant -1. This is
with a ppc guest:
before
------
---- 0x9f29cc
movi_i32 tmp1,$0xffffffff
movi_i32 tmp2,$0x0
add2_i32 tmp0,CA,CA,tmp2,r6,tmp2
add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2
mov_i32 r10,tmp0
0x7fd8c7dfe90c: xor %ebp,%ebp
0x7fd8c7dfe90e: mov %ebp,%r11d
0x7fd8c7dfe911: mov 0x18(%r14),%r9d
0x7fd8c7dfe915: add %r9d,%r10d
0x7fd8c7dfe918: adc %ebp,%r11d
0x7fd8c7dfe91b: add $0xffffffff,%r10d
0x7fd8c7dfe922: adc %ebp,%r11d
0x7fd8c7dfe925: mov %r11d,0x134(%r14)
0x7fd8c7dfe92c: mov %r10d,0x28(%r14)
after
-----
---- 0x9f29cc
movi_i32 tmp1,$0xffffffffffffffff
movi_i32 tmp2,$0x0
add2_i32 tmp0,CA,CA,tmp2,r6,tmp2
add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2
mov_i32 r10,tmp0
0x7f37010d490c: xor %ebp,%ebp
0x7f37010d490e: mov %ebp,%r11d
0x7f37010d4911: mov 0x18(%r14),%r9d
0x7f37010d4915: add %r9d,%r10d
0x7f37010d4918: adc %ebp,%r11d
0x7f37010d491b: add $0xffffffffffffffff,%r10d
0x7f37010d491f: adc %ebp,%r11d
0x7f37010d4922: mov %r11d,0x134(%r14)
0x7f37010d4929: mov %r10d,0x28(%r14)
Backports commit 29f3ff8d6cbc28f79933aeaa25805408d0984a8f from qemu
Due to a copy&paste, the new op value is tested against mov_i32 instead
of movi_i32. The test is therefore always false. Fix that.
Backports commit 961521261a3d600b0695b2e6d2b0f490076f7e90 from qemu
The tcg_constant_folding folding ends up doing all the optimizations
(which is a good thing to avoid looping on all ops multiple time), so
make it clear and just rename it tcg_optimize.
Backports commit 36e60ef6ac5d8a262d0fbeedfdb2b588514cb1ea from qemu
Most of the calls to tcg_opt_gen_mov are preceeded by a test to check if
the source temp is a constant. Fold that into the tcg_opt_gen_mov
function.
Backports commit 97a79eb70dd35a24fda87d86196afba5e6f21c5d from qemu
Each call to tcg_opt_gen_mov is preceeded by a test to check if the
source and destination temps are copies. Fold that into the
tcg_opt_gen_mov function.
Backports commit 5365718a9afeeabde3784d82a542f8ad909b18cf from qemu
We can get the opcode using the TCGOp pointer. It needs to be
dereferenced, but it's anyway done a few lines below to write
the new value.
Backports commit 8d6a91602ea824ef4435ea38fd475387eecc098c from qemu