Commit graph

1011 commits

Author SHA1 Message Date
Richard Henderson e4ca88f9d6
target/arm: Convert MOVW, MOVT
Backports commit 8f4451274b7010c1f50e0baa5bb608f19f02b90f from qemu
2019-11-28 02:46:04 -05:00
Richard Henderson b35749e239
target/arm: Convert Signed multiply, signed and unsigned divide
Backports commit 2c7c4e090409189488149869797da4acf895bad0 from qemu
2019-11-28 02:45:33 -05:00
Richard Henderson 987641cf10
target/arm: Convert packing, unpacking, saturation, and reversal
Backports commit 46497f6af73bb33c1064d43a28a48cbb4d233a23 from qemu
2019-11-28 02:44:55 -05:00
Richard Henderson 83cced6170
target/arm: Convert Parallel addition and subtraction
Backports commit adf1a5662a47d5b5b96f4f1e440e34c26b14a154 from qemu
2019-11-28 02:44:20 -05:00
Richard Henderson 21df423e47
target/arm: Convert USAD8, USADA8, SBFX, UBFX, BFC, BFI, UDF
In op_bfx, note that tcg_gen_{,s}extract_i32 already checks
for width == 32, so we don't need to special case that here.

Backports commit 86d21e4b509a2835ed79f234f476a4c5191d435b from qemu
2019-11-28 02:44:20 -05:00
Richard Henderson dbcc67ab20
target/arm: Diagnose UNPREDICTABLE ldrex/strex cases
Backports commit af2882289951e58363d714afd16f80050685fa29 from qemu
2019-11-28 02:44:20 -05:00
Richard Henderson 3ac019eb98
target/arm: Convert Synchronization primitives
Backports commit 1efdd407a25f617129e2e0d5c009c07cbe847990 from qemu
2019-11-28 02:44:18 -05:00
Richard Henderson c794962c42
target/arm: Convert load/store (register, immediate, literal)
Backports commit 5e291fe16846d216d5a69569b1c59f497dff96e4 from qemu
2019-11-28 02:42:01 -05:00
Richard Henderson d5d98450f3
target/arm: Convert T32 ADDW/SUBW
Backports commit 145952e87fb86aaa9434d768c31eedbd323f7157 from qemu
2019-11-28 02:42:01 -05:00
Richard Henderson 7b9025910d
target/arm: Convert the rest of A32 Miscelaneous instructions
Backports commit 2cde9ea57dbc4cdee3677a1a335574537810fe2e from qemu
2019-11-28 02:42:01 -05:00
Richard Henderson be2a259d3c
target/arm: Convert ERET
Pass the T5 encoding of SUBS PC, LR, #IMM through the normal SUBS path
to make it clear exactly what's happening -- we hit ALUExceptionReturn
along that path.

Backports commit ef11bc3c461e2c650e8bef552146a4b08f81884e from qemu
2019-11-28 02:42:00 -05:00
Richard Henderson 74040da34c
target/arm: Convert CLZ
Document our choice about the T32 CONSTRAINED UNPREDICTABLE behaviour.
This matches the undocumented choice made by the legacy decoder.

Backports commit 4c97f5b2f0fa9b37f9ff497f15411d809e6fd098 from qemu
2019-11-28 02:42:00 -05:00
Richard Henderson 94968602b8
target/arm: Convert BX, BXJ, BLX (register)
Backports commit 4ed95abd700e43dee8e032f754b53bec2b047f75 from qemu
2019-11-28 02:42:00 -05:00
Richard Henderson 831e17d970
target/arm: Convert Cyclic Redundancy Check
Backports commit 6c35d53f1bde7fe327c074473c3048d6e6f15e95 from qemu
2019-11-28 02:42:00 -05:00
Richard Henderson fdd135c7d2
target/arm: Convert MRS/MSR (banked, register)
The m-profile and a-profile decodings overlap. Only return false
for the case of wrong profile; handle UNDEFINED for permission failure
directly. This ensures that we don't accidentally pass an insn that
applies to the wrong profile.

Backports commit d0b26644502103ca97093ef67749812dc1df7eea from qemu
2019-11-28 02:42:00 -05:00
Richard Henderson 571d879c49
target/arm: Convert MSR (immediate) and hints
Backports commit 6313059623dc512308681ba160ed862ac387e2fb from qemu
2019-11-28 02:41:59 -05:00
Richard Henderson a011318794
target/arm: Simplify op_smlawx for SMLAW*
By shifting the 16-bit input left by 16, we can align the desired
portion of the 48-bit product and use tcg_gen_muls2_i32.

Backports commit 485b607d4f393e0de92c922806a68aef22340c98 from qemu
2019-11-28 02:40:01 -05:00
Richard Henderson 201be7b8b1
target/arm: Simplify op_smlaxxx for SMLAL*
Since all of the inputs and outputs are i32, dispense with
the intermediate promotion to i64 and use tcg_gen_add2_i32.

Backports commit ea96b374641bc429269096d88d4e91ee544273e9 from qemu
2019-11-28 02:40:00 -05:00
Richard Henderson 543b598d45
target/arm: Convert Halfword multiply and multiply accumulate
Backports commit 26c6923de7131fa1cf223ab67131d1992dc17001 from qemu
2019-11-28 02:40:00 -05:00
Richard Henderson 44416a6794
target/arm: Convert Saturating addition and subtraction
Backports commit 6d0730a82417e3a4a1911eb8e0246f3ba996f932 from qemu
2019-11-28 02:40:00 -05:00
Richard Henderson 45566b2780
target/arm: Simplify UMAAL
Since all of the inputs and outputs are i32, dispense with
the intermediate promotion to i64 and use tcg_gen_mulu2_i32
and tcg_gen_add2_i32.

Backports commit 2409d56454f0d028619fb1002eda86bf240906dd from qemu
2019-11-28 02:40:00 -05:00
Richard Henderson 5e5ae4c0d0
target/arm: Convert multiply and multiply accumulate
Backports commit bd92fe353bda4412ffc46c0f7415207a684b45f2 from qemu
2019-11-28 02:40:00 -05:00
Richard Henderson 677cf191d2
target/arm: Convert Data Processing (immediate)
Convert the modified immediate form of the data processing insns.
For A32, we can finally remove any code that was intertwined with
the register and register-shifted-register forms.

Backports commit 581c6ebd17c8f56ad52772216e6c6d8cc8997e8b from qemu
2019-11-28 02:39:16 -05:00
Richard Henderson 1b21ced6a1
target/arm: Convert Data Processing (reg-shifted-reg)
Convert the register shifted by register form of the data
processing insns. For A32, we cannot yet remove any code
because the legacy decoder intertwines the immediate form.

Backports commit 5be2c12337f4cbdbda4efe6ab485350f730faaad from qemu
2019-11-28 02:39:16 -05:00
Richard Henderson e151696a65
target/arm: Convert Data Processing (register)
Convert the register shifted by immediate form of the data
processing insns. For A32, we cannot yet remove any code
because the legacy decoder intertwines the reg-shifted-reg
and immediate forms.

Backports commit 25ae32c558182c07fc6ad01b936e9151cbf00c44 from qemu
2019-11-28 02:38:58 -05:00
Richard Henderson 9fc793b566
target/arm: Add stubs for aa32 decodetree
Add the infrastructure that will become the new decoder.
No instructions adjusted so far.

Backports commit 51409b9e8cfe997b1ac3365df7400e0c6e844437 from qemu
2019-11-28 02:38:49 -05:00
Richard Henderson 6ec6c71d50
target/arm: Use store_reg_from_load in thumb2 code
This function already includes the test for an interworking write
to PC from a load. Change the T32 LDM implementation to match the
A32 LDM implementation.

For LDM, the reordering of the tests does not change valid
behaviour because the only case that differs is has rn == 15,
which is UNPREDICTABLE.

Backports commit 69be3e13764111737e1a7a13bb0c231e4d5be756 from qemu
2019-11-28 02:38:42 -05:00
Richard Henderson 46a8dfff59
target/arm: Fix SMMLS argument order
The previous simplification got the order of operands to the
subtraction wrong. Since the 64-bit product is the subtrahend,
we must use a 64-bit subtract to properly compute the borrow
from the low-part of the product.

Fixes: 5f8cd06ebcf5 ("target/arm: Simplify SMMLA, SMMLAR, SMMLS, SMMLSR")

Backports commit e0a0c8322b8ebcdad674f443a3e86db8708d6738 from qemu
2019-11-20 17:24:44 -05:00
Peter Maydell 9fb54a7f72
target/arm: Take exceptions on ATS instructions when needed
The translation table walk for an ATS instruction can result in
various faults. In general these are just reported back via the
PAR_EL1 fault status fields, but in some cases the architecture
requires that the fault is turned into an exception:
* synchronous stage 2 faults of any kind during AT S1E0* and
AT S1E1* instructions executed from NS EL1 fault to EL2 or EL3
* synchronous external aborts are taken as Data Abort exceptions

(This is documented in the v8A Arm ARM DDI0487A.e D5.2.11 and
G5.13.4.)

Backports commit 0710b2fa84a4aeb925422e1e88edac49ed407c79 from qemu
2019-11-20 17:24:44 -05:00
Peter Maydell 56b54f361e
target/arm: Allow ARMCPRegInfo read/write functions to throw exceptions
Currently the only part of an ARMCPRegInfo which is allowed to cause
a CPU exception is the access function, which returns a value indicating
that some flavour of UNDEF should be generated.

For the ATS system instructions, we would like to conditionally
generate exceptions as part of the writefn, because some faults
during the page table walk (like external aborts) should cause
an exception to be raised rather than returning a value.

There are several ways we could do this:
* plumb the GETPC() value from the top level set_cp_reg/get_cp_reg
helper functions through into the readfn and writefn hooks
* add extra readfn_with_ra/writefn_with_ra hooks that take the GETPC()
value
* require the ATS instructions to provide a dummy accessfn,
which serves no purpose except to cause the code generation
to emit TCG ops to sync the CPU state
* add an ARM_CP_ flag to mark the ARMCPRegInfo as possibly
throwing an exception in its read/write hooks, and make the
codegen sync the CPU state before calling the hooks if the
flag is set

This patch opts for the last of these, as it is fairly simple
to implement and doesn't require invasive changes like updating
the readfn/writefn hook function prototype signature.

Backports commit 37ff584c15bc3e1dd2c26b1998f00ff87189538c from qemu
2019-11-20 17:24:37 -05:00
Richard Henderson 87c06b7fae
target/arm: Factor out unallocated_encoding for aarch32
Make this a static function private to translate.c.
Thus we can use the same idiom between aarch64 and aarch32
without actually sharing function implementations.

Backports commit 1ce21ba1eaf08b22da5925f3e37fc0b4322da858 from qemu
2019-11-18 23:51:45 -05:00
Richard Henderson 1f59a43544
Revert "target/arm: Use unallocated_encoding for aarch32"
Despite the fact that the text for the call to gen_exception_insn
is identical for aarch64 and aarch32, the implementation inside
gen_exception_insn is totally different.

This fixes exceptions raised from aarch64.

This reverts commit fb2d3c9a9a.
2019-11-18 23:49:47 -05:00
Richard Henderson 9d2a3064af
target/arm: Use tcg_gen_extrh_i64_i32 to extract the high word
Separate shift + extract low will result in one extra insn
for hosts like RISC-V, MIPS, and Sparc.

Backports commit 664b7e3b97d6376f3329986c465b3782458b0f8b from qemu
2019-11-18 20:36:19 -05:00
Richard Henderson 93c016a3e7
target/arm: Simplify SMMLA, SMMLAR, SMMLS, SMMLSR
All of the inputs to these instructions are 32-bits. Rather than
extend each input to 64-bits and then extract the high 32-bits of
the output, use tcg_gen_muls2_i32 and other 32-bit generator functions.

Backports commit 5f8cd06ebcf57420be8fea4574de2e074de46709 from qemu
2019-11-18 20:31:12 -05:00
Richard Henderson 4a1cc16eef
target/arm: Use tcg_gen_rotri_i32 for gen_swap_half
Rotate is the more compact and obvious way to swap 16-bit
elements of a 32-bit word.

Backports commit adefba76e8bf10dfb342094d2f5debfeedb1a74d from qemu
2019-11-18 20:27:12 -05:00
Richard Henderson 751ab7b24b
target/arm: Use ror32 instead of open-coding the operation
The helper function is more documentary, and also already
handles the case of rotate by zero.

Backports commit dd861b3f29be97a9e3cdb9769dcbc0c7d7825185 from qemu
2019-11-18 20:25:51 -05:00
Richard Henderson df4c773ed2
target/arm: Remove redundant shift tests
The immediate shift generator functions already test for,
and eliminate, the case of a shift by zero.

Backports commit 464eaa9571fae5867d9aea7d7209c091c8a50223 from qemu
2019-11-18 20:24:39 -05:00
Richard Henderson 4dd30ebfbd
target/arm: Use tcg_gen_deposit_i32 for PKHBT, PKHTB
Use deposit as the composit operation to merge the
bits from the two inputs.

Backports commit d1f8755fc93911f5b27246b1da794542d222fa1b from qemu
2019-11-18 20:22:00 -05:00
Richard Henderson 25ccd28e78
target/arm: Use tcg_gen_extract_i32 for shifter_out_im
Extract is a compact combination of shift + and.

Backports commit 191f4bfe8d6cf0c7d5cd7f84cd7076e32e3745dd from qemu
2019-11-18 20:19:40 -05:00
Andrew Jones ad63ee7509
target/arm/cpu: Use div-round-up to determine predicate register array size
Unless we're guaranteed to always increase ARM_MAX_VQ by a multiple of
four, then we should use DIV_ROUND_UP to ensure we get an appropriate
array size.

Backports commit 46417784d21c89446763f2047228977bdc267895 from qemu
2019-11-18 20:16:46 -05:00
Andrew Jones bb8b3bc42b
target/arm/helper: zcr: Add build bug next to value range assumption
The current implementation of ZCR_ELx matches the architecture, only
implementing the lower four bits, with the rest RAZ/WI. This puts
a strict limit on ARM_MAX_VQ of 16. Make sure we don't let ARM_MAX_VQ
grow without a corresponding update here.

Backports commit 7b351d98709d3f77d6bb18562e1bf228862b0d57 from qemu
2019-11-18 20:14:42 -05:00
Richard Henderson 3d3d56056b
target/arm: Remove helper_double_saturate
Replace x = double_saturate(y) with x = add_saturate(y, y).
There is no need for a separate more specialized helper.

Backports commit 640581a06d14e2d0d3c3ba79b916de6bc43578b0 from qemu
2019-11-18 20:13:21 -05:00
Richard Henderson fb2d3c9a9a
target/arm: Use unallocated_encoding for aarch32
Promote this function from aarch64 to fully general use.
Use it to unify the code sequences for generating illegal
opcode exceptions.

Backports commit 3cb36637157088892e9e33ddb1034bffd1251d3b from qemu
2019-11-18 20:10:50 -05:00
Richard Henderson d562bea784
target/arm: Remove offset argument to gen_exception_bkpt_insn
Unlike the other more generic gen_exception{,_internal}_insn
interfaces, breakpoints always refer to the current instruction.

Backports commit 06bcbda3f64d464b6ecac789bce4bd69f199cd68 from qemu
2019-11-18 20:05:45 -05:00
Richard Henderson f19b4df20d
target/arm: Replace offset with pc in gen_exception_internal_insn
The offset is variable depending on the instruction set.
Passing in the actual value is clearer in intent.

Backpors commit aee828e7541a5895669ade3a4b6978382b6b094a from qemu
2019-11-18 20:05:23 -05:00
Richard Henderson 00fbadf637
target/arm: Replace s->pc with s->base.pc_next
We must update s->base.pc_next when we return from the translate_insn
hook to the main translator loop. By incrementing s->base.pc_next
immediately after reading the insn word, "pc_next" contains the address
of the next instruction throughout translation.

All remaining uses of s->pc are referencing the address of the next insn,
so this is now a simple global replacement. Remove the "s->pc" field.

Backports commit a04159166b880b505ccadc16f2fe84169806883d from qemu
2019-11-18 17:32:53 -05:00
Richard Henderson 7d1fcef722
target/arm: Remove redundant s->pc & ~1
The thumb bit has already been removed from s->pc, and is always even.

Backports commit 4818c3743b0e0095fdcecd24457da9b3443730ab from qemu
2019-11-18 17:32:53 -05:00
Richard Henderson a2e60445de
target/arm: Introduce add_reg_for_lit
Provide a common routine for the places that require ALIGN(PC, 4)
as the base address as opposed to plain PC. The two are always
the same for A32, but the difference is meaningful for thumb mode.

Backports commit 16e0d8234ef9291747332d2c431e46808a060472 from qemu
2019-11-18 17:32:49 -05:00
Richard Henderson 1c0914e58c
target/arm: Introduce read_pc
We currently have 3 different ways of computing the architectural
value of "PC" as seen in the ARM ARM.

The value of s->pc has been incremented past the current insn,
but that is all. Thus for a32, PC = s->pc + 4; for t32, PC = s->pc;
for t16, PC = s->pc + 2. These differing computations make it
impossible at present to unify the various code paths.

With the newly introduced s->pc_curr, we can compute the correct
value for all cases, using the formula given in the ARM ARM.

This changes the behaviour for load_reg() and load_reg_var()
when called with reg==15 from a 32-bit Thumb instruction:
previously they would have returned the incorrect value
of pc_curr + 6, and now they will return the architecturally
correct value of PC, which is pc_curr + 4. This will not
affect well-behaved guest software, because all of the places
we call these functions from T32 code are instructions where
using r15 is UNPREDICTABLE. Using the architectural PC value
here is more consistent with the T16 and A32 behaviour.

Backports commit fdbcf6329d0c2984c55d7019419a72bf8e583c36 from qemu
2019-11-18 17:04:50 -05:00
Richard Henderson 0048f3e887
target/arm: Introduce pc_curr
Add a new field to retain the address of the instruction currently
being translated. The 32-bit uses are all within subroutines used
by a32 and t32. This will become less obvious when t16 support is
merged with a32+t32, and having a clear definition will help.

Convert aarch64 as well for consistency. Note that there is one
instance of a pre-assert fprintf that used the wrong value for the
address of the current instruction.

Backports commit 43722a6d4f0c92f7e7e1e291580039b0f9789df1 from qemu
2019-11-18 16:58:40 -05:00