The current VFP code has two different idioms for
loading and storing from the VFP register file:
1 using the gen_mov_F0_vreg() and similar functions,
which load and store to a fixed set of TCG globals
cpu_F0s, CPU_F0d, etc
2 by direct calls to tcg_gen_ld_f64() and friends
We want to phase out idiom 1 (because the use of the
fixed globals is a relic of a much older version of TCG),
but idiom 2 is quite longwinded:
tcg_gen_ld_f64(tmp, cpu_env, vfp_reg_offset(true, reg))
requires us to specify the 64-bitness twice, once in
the function name and once by passing 'true' to
vfp_reg_offset(). There's no guard against accidentally
passing the wrong flag.
Instead, let's move to a convention of accessing 64-bit
registers via the existing neon_load_reg64() and
neon_store_reg64(), and provide new neon_load_reg32()
and neon_store_reg32() for the 32-bit equivalents.
Implement the new functions and use them in the code in
translate-vfp.inc.c. We will convert the rest of the VFP
code as we do the decodetree conversion in subsequent
commits.
Backports commit 160f3b64c5cc4c8a09a1859edc764882ce6ad6bf from qemu
Move the trans_*() functions we've just created from translate.c
to translate-vfp.inc.c. This is pure code motion with no textual
changes (this can be checked with 'git show --color-moved').
Backports commit f7bbb8f31f0761edbf0c64b7ab3c3f49c13612ea from qemu
Convert the VSEL instructions to decodetree.
We leave trans_VSEL() in translate.c for now as this allows
the patch to show just the changes from the old handle_vsel().
In the old code the check for "do D16-D31 exist" was hidden in
the VFP_DREG macro, and assumed that VFPv3 always implied that
D16-D31 exist. In the new code we do the correct ID register test.
This gives identical behaviour for most of our CPUs, and fixes
previously incorrect handling for Cortex-R5F, Cortex-M4 and
Cortex-M33, which all implement VFPv3 or better with only 16
double-precision registers.
Backports commit b3ff4b87b4ae08120a51fe12592725e1dca8a085 from qemu
Factor out the VFP access checking code so that we can use it in the
leaf functions of the decodetree decoder.
We call the function full_vfp_access_check() so we can keep
the more natural vfp_access_check() for a version which doesn't
have the 'ignore_vfp_enabled' flag -- that way almost all VFP
insns will be able to use vfp_access_check(s) and only the
special-register access function will have to use
full_vfp_access_check(s, ignore_vfp_enabled).
Backports commit 06db8196bba34776829020192ed623a0b22e6557 from qemu
Add the infrastructure for building and invoking a decodetree decoder
for the AArch32 VFP encodings. At the moment the new decoder covers
nothing, so we always fall back to the existing hand-written decode.
We need to have one decoder for the unconditional insns and one for
the conditional insns, as otherwise the patterns for conditional
insns would incorrectly match against the unconditional ones too.
Since translate.c is over 14,000 lines long and we're going to be
touching pretty much every line of the VFP code as part of the
decodetree conversion, we create a new translate-vfp.inc.c to hold
the code which deals with VFP in the new scheme. It should be
possible to convert this into a standalone translation unit
eventually, but the conversion process will be much simpler if we
simply #include it midway through translate.c to start with.
Backports commit 78e138bc1f672c145ef6ace74617db00eebaa2ba from qemu