When we changed the interface of get_phys_addr_lpae to require
the cacheattr parameter, this spot was missed. The compiler is
unable to detect the use of NULL vs the nonnull attribute here.
Fixes: 7e98e21c098
Backports commit a6d6f37aed4b171d121cd4a9363fbb41e90dcb53 from qemu
Quoting ISO C99 6.7.8p4, "All the expressions in an initializer for an
object that has static storage duration shall be constant expressions or
string literals".
The compound literal produced by the make_floatx80() macro is not such a
constant expression, per 6.6p7-9. (An implementation may accept it,
according to 6.6p10, but is not required to.)
Therefore using "floatx80_zero" and make_floatx80() for initializing
"f2xm1_table" and "fpatan_table" is not portable. And gcc-4.8 in RHEL-7.6
actually chokes on them:
> target/i386/fpu_helper.c:871:5: error: initializer element is not constant
> { make_floatx80(0xbfff, 0x8000000000000000ULL),
> ^
We've had the make_floatx80_init() macro for this purpose since commit
3bf7e40ab914 ("softfloat: fix for C99", 2012-03-17), so let's use that
macro again.
Fixes: eca30647fc0 ("target/i386: reimplement f2xm1 using floatx80 operations")
Fixes: ff57bb7b632 ("target/i386: reimplement fpatan using floatx80 operations")
Backports commit 163b3d1af2552845a60967979aca8d78a6b1b088 from qemu
We forgot to update cc_op before these branch insns,
which lead to losing track of the current eflags.
Buglink: https://bugs.launchpad.net/qemu/+bug/1888165
Backports commit 3cb3a7720b01830abd5fbb81819dbb9271bf7821 from qemu
Forgetting this asserts when tcg_gen_cmp_vec is called from
within tcg_gen_cmpsel_vec.
Fixes: 72b4c792c7a
Backports commit 69c918d2ef319ac63cd759c527debc2a2bdf3a0c from qemu
For CPUs support fast short REP MOV[CPUID.(EAX=7,ECX=0):EDX(bit4)], e.g
Icelake and Tigerlake, expose it to the guest VM.
Backports commit 5cb287d2bd578dfe4897458793b4fce35bc4f744 from qemu
Raw writes to this register when in KVM mode can cause interrupts to be
raised (even when the PMU is disabled). Because the underlying state is
already aliased to PMINTENSET (which already provides raw write
functions), we can safely disable raw accesses to PMINTENCLR entirely.
Backports commit 887c0f1544991f567543b7c214aa11ab0cea0a29 from qemu
The mtedesc that was constructed was not actually passed in.
Found by Coverity (CID 1429996).
Backports commit cdecb3fc1eb182d90666348a47afe63c493686e7 from qemu
In 32-bit mode, the higher 16 bits of the destination
register are undefined. In practice CR0[31:0] is stored,
just like in 64-bit mode, so just remove the "if" that
currently differentiates the behavior.
Backports commit c0c8445255b2b5b440c355431c8b01b7b7b7c8cf from qemu
The SSE instruction implementations all fail to raise the expected
IEEE floating-point exceptions because they do nothing to convert the
exception state from the softfloat machinery into the exception flags
in MXCSR.
Fix this by adding such conversions. Unlike for x87, emulated SSE
floating-point operations might be optimized using hardware floating
point on the host, and so a different approach is taken that is
compatible with such optimizations. The required invariant is that
all exceptions set in env->sse_status (other than "denormal operand",
for which the SSE semantics are different from those in the softfloat
code) are ones that are set in the MXCSR; the emulated MXCSR is
updated lazily when code reads MXCSR, while when code sets MXCSR, the
exceptions in env->sse_status are set accordingly.
A few instructions do not raise all the exceptions that would be
raised by the softfloat code, and those instructions are made to save
and restore the softfloat exception state accordingly.
Nothing is done about "denormal operand"; setting that (only for the
case when input denormals are *not* flushed to zero, the opposite of
the logic in the softfloat code for such an exception) will require
custom code for relevant instructions, or else architecture-specific
conditionals in the softfloat code for when to set such an exception
together with custom code for various SSE conversion and rounding
instructions that do not set that exception.
Nothing is done about trapping exceptions (for which there is minimal
and largely broken support in QEMU's emulation in the x87 case and no
support at all in the SSE case).
Backports commit 418b0f93d12a1589d5031405de857844f32e9ccc from qemu
The code to set floating-point state when MXCSR changes calls
set_flush_to_zero on &env->fp_status, so affecting the x87
floating-point state rather than the SSE state. Fix to call it for
&env->sse_status instead.
Backports commit 3ddc0eca2229846bfecc3485648a6cb85a466dc7 from qemu
According to the comment, this definition of invalid encoding is given
by intel developer's manual, and doesn't comply with 680x0 FPU.
With m68k, the explicit integer bit can be zero in the case of:
- zeros (exp == 0, mantissa == 0)
- denormalized numbers (exp == 0, mantissa != 0)
- unnormalized numbers (exp != 0, exp < 0x7FFF)
- infinities (exp == 0x7FFF, mantissa == 0)
- not-a-numbers (exp == 0x7FFF, mantissa != 0)
For infinities and NaNs, the explicit integer bit can be either one or
zero.
The IEEE 754 standard does not define a zero integer bit. Such a number
is an unnormalized number. Hardware does not directly support
denormalized and unnormalized numbers, but implicitly supports them by
trapping them as unimplemented data types, allowing efficient conversion
in software.
See "M68000 FAMILY PROGRAMMER’S REFERENCE MANUAL",
"1.6 FLOATING-POINT DATA TYPES"
We will implement in the m68k TCG emulator the FP_UNIMP exception to
trap into the kernel to normalize the number. In case of linux-user,
the number will be normalized by QEMU.
Backports commit d159dd058c7dc48a9291fde92eaae52a9f26a4d1 from qemu
Since all callers to get_physical_address() now apply the same page offset to
the translation result, move the logic into get_physical_address() itself to
avoid duplication.
Backports commit 852002b5664bf079da05c5201dbf2345b870e5ed from qemu
The result of the get_physical_address() function should be combined with the
offset of the original page access before being returned. Otherwise the
m68k_cpu_get_phys_page_debug() function can round to the wrong page causing
incorrect lookups in gdbstub and various "Disassembler disagrees with
translator over instruction decoding" warnings to appear at translation time.
Fixes: 88b2fef6c3 ("target/m68k: add MC68040 MMU")
The smin/smax/umin/umax operations require the operands to be
properly sign extended. Do not drop the MO_SIGN bit from the
load, and additionally extend the val input.
Backports commit 852f933e482518797f7785a2e017a215b88df815 from qemu
The temp that gets assigned to clean_addr has been allocated with
new_tmp_a64, which means that it will be freed at the end of the
instruction. Freeing it earlier leads to assertion failure.
The loop creates a complication, in which we allocate a new local
temp, which does need freeing, and the final code path is shared
between the loop and non-loop.
Fix this complication by adding new_tmp_a64_local so that the new
local temp is freed at the end, and can be treated exactly like
the non-loop path.
Fixes: bba87d0a0f4
Backports commit 4b4dc9750a0aa0b9766bd755bf6512a84744ce8a from qemu
We now implement all of the components of MTE, without actually
supporting any tagged memory. All MTE instructions will work,
trivially, so we can enable support.
Backports commit c7459633baa71d1781fde4a245d6ec9ce2f008cf from qemu
Look up the physical address for the given virtual address,
convert that to a tag physical address, and finally return
the host address that backs it.
Backports commit e4d5bf4fbd5abfc3727e711eda64a583cab4d637 from qemu
We need to check the memattr of a page in order to determine
whether it is Tagged for MTE. Between Stage1 and Stage2,
this becomes simpler if we always collect this data, instead
of occasionally being presented with NULL.
Use the nonnull attribute to allow the compiler to check that
all pointer arguments are non-null.
Backports commit 7e98e21c09871cddc20946c8f3f3595e93154ecb from qemu
There are a number of paths by which the TBI is still intact
for user-only in the SVE helpers.
Because we currently always set TBI for user-only, we do not
need to pass down the actual TBI setting from above, and we
can remove the top byte in the inner-most primitives, so that
none are forgotten. Moreover, this keeps the "dirty" pointer
around at the higher levels, where we need it for any MTE checking.
Since the normal case, especially for user-only, goes through
RAM, this clearing merely adds two insns per page lookup, which
will be completely in the noise.
Backports commit c4af8ba19b9d22aac79cab679a20b159af9d6809 from qemu
Because the elements are non-sequential, we cannot eliminate many
tests straight away like we can for sequential operations. But
we often have the PTE details handy, so we can test for Tagged.
Backports commit d28d12f008ee44dc2cc2ee5d8f673be9febc951e from qemu
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.
Backports commit aa13f7c3c378fa41366b9fcd6c29af1c3d81126a from qemu
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.
Backports commit 71b9f3948c75bb97641a3c8c7de96d1cb47cdc07 from qemu
Because the elements are sequential, we can eliminate many tests all
at once when the tag hits TCMA, or if the page(s) are not Tagged.
Backports commit 206adacfb8d35e671e3619591608c475aa046b63 from qemu
This avoids the need for a separate set of helpers to implement
no-fault semantics, and will enable MTE in the future.
Backports commit 50de9b78cec06e6d16e92a114a505779359ca532 from qemu
Follow the model set up for contiguous loads. This handles
watchpoints correctly for contiguous stores, recognizing the
exception before any changes to memory.
Backports commit 0fa476c1bb37a70df7eeff1e5bfb4791feb37e0e from qemu
With sve_cont_ldst_pages, the differences between first-fault and no-fault
are minimal, so unify the routines. With cpu_probe_watchpoint, we are able
to make progress through pages with TLB_WATCHPOINT set when the watchpoint
does not actually fire.
Backports commit c647673ce4d72a8789703c62a7f3cbc732cb1ea8 from qemu
Handle all of the watchpoints for active elements all at once,
before we've modified the vector register. This removes the
TLB_WATCHPOINT bit from page[].flags, which means that we can
use the normal fast path via RAM.
Backports commit 4bcc3f0ff8e5ae2b17b5aab9aa613ff1b8025896 from qemu
First use of the new helper functions, so we can remove the
unused markup. No longer need a scratch for user-only, as
we completely probe the page set before reading; system mode
still requires a scratch for MMIO.
Backports commit b854fd06a868e0308bcfe05ad0a71210705814c7 from qemu
The current interface includes a loop; change it to load a
single element. We will then be able to use the function
for ld{2,3,4} where individual vector elements are not adjacent.
Replace each call with the simplest possible loop over active
elements.
Backports commit cf4a49b71b1712142d7122025a8ca7ea5b59d73f from qemu
For contiguous predicated memory operations, we want to
minimize the number of tlb lookups performed. We have
open-coded this for sve_ld1_r, but for correctness with
MTE we will need this for all of the memory operations.
Create a structure that holds the bounds of active elements,
and metadata for two pages. Add routines to find those
active elements, lookup the pages, and run watchpoints
for those pages.
Temporarily mark the functions unused to avoid Werror.
Backports commit b4cd95d2f4c7197b844f51b29871d888063ea3e7 from qemu
Use the "normal" memory access functions, rather than the
softmmu internal helper functions directly.
Since fb901c9, cpu_mem_index is now a simple extract
from env->hflags and not a large computation. Which means
that it's now more work to pass around this value than it
is to recompute it.
This only adjusts the primitives, and does not clean up
all of the uses within sve_helper.c.
Move the variable declarations to the top of the function,
but do not create a new label before sve_access_check.
Backports commit c0ed9166b1aea86a2fbaada1195aacd1049f9e85 from qemu
Replace existing uses of check_data_tbi in translate-a64.c that
perform multiple logical memory access. Leave the helper blank
for now to reduce the patch size.
Backports commit 73ceeb0011b23bac8bd2c09ebe3c18d034aa69ce from qemu
Replace existing uses of check_data_tbi in translate-a64.c that
perform a single logical memory access. Leave the helper blank
for now to reduce the patch size.
Backports commit 0a405be2b8fd9506a009b10d7d2d98c394b36db6 from qemu
Now that we know that the operation is on a single page,
we need not loop over pages while probing.
Backports commit e26d0d226892f67435cadcce86df0ddfb9943174 from qemu
We can simplify our DC_ZVA if we recognize that the largest BS
that we actually use in system mode is 64. Let us just assert
that it fits within TARGET_PAGE_SIZE.
For DC_GVA and STZGM, we want to be able to write whole bytes
of tag memory, so assert that BS is >= 2 * TAG_GRANULE, or 32.
Backports commit a4157b80242bf1c8aa0ee77aae7458ba79012d5d from qemu
Use the same code as system mode, so that we generate the same
exception + syndrome for the unaligned access.
For the moment, if MTE is enabled so that this path is reachable,
this would generate a SIGSEGV in the user-only cpu_loop. Decoding
the syndrome to produce the proper SIGBUS will be done later.
Backports commit 0d1762e931f8a694f261c604daba605bcda70928 from qemu
The current Arm ARM has adjusted the official decode of
"Add/subtract (immediate)" so that the shift field is only bit 22,
and bit 23 is part of the op1 field of the parent category
"Data processing - immediate".
Backports commit 21a8b343eaae63f6984f9a200092b0ea167647f1 from qemu
Cache the composite ATA setting.
Cache when MTE is fully enabled, i.e. access to tags are enabled
and tag checks affect the PE. Do this for both the normal context
and the UNPRIV context.
Backports commit 81ae05fa2d21ac1a0054935b74342aa38a5ecef7 from qemu
This is TFSRE0_EL1, TFSR_EL1, TFSR_EL2, TFSR_EL3,
RGSR_EL1, GCR_EL1, GMID_EL1, and PSTATE.TCO.
Backports commit 4b779cebb3e5ab30b945181f1ba3932f5f8a1cb5 from qemu
Add an option that writes back the PC, like DISAS_UPDATE_EXIT,
but does not exit back to the main loop.
Backports commit 329833286d7a1b0ef8c7daafe13c6ae32429694e from qemu
target/arm: Add support for MTE to HCR_EL2 and SCR_EL3
This does not attempt to rectify all of the res0 bits, but does
clear the mte bits when not enabled. Since there is no high-part
mapping of SCTLR, aa32 mode cannot write to these bits.
Backports commits f00faf130d5dcf64b04f71a95f14745845ca1014, and
8ddb300bf60a5f3d358dd6fbf81174f6c03c1d9f from qemu.
Protect reads of aa64 id registers with ARM_CP_STATE_AA64.
Use this as a simpler test than arm_el_is_aa64, since EL3
cannot change mode.
Backports commit 252e8c69669599b4bcff802df300726300292f47 from qemu
The x87 fpatan emulation is currently based around conversion to
double. This is inherently unsuitable for a good emulation of any
floatx80 operation. Reimplement using the soft-float operations, as
for other such instructions.
Backports commit ff57bb7b63267dabd60f88354c8c29ea5e1eb3ec from qemu
The x87 fyl2x emulation is currently based around conversion to
double. This is inherently unsuitable for a good emulation of any
floatx80 operation. Reimplement using the soft-float operations,
building on top of the reimplementation of fyl2xp1 and factoring out
code to be shared between the two instructions.
The included test assumes that the result in round-to-nearest mode
should always be one of the two closest floating-point numbers to the
mathematically exact result (including that it should be exact, in the
exact cases which cover more cases than for fyl2xp1).
Backports commit 1f18a1e6ab8368a4eab2d22894d3b2ae75250cd3 from qemu
The x87 fyl2xp1 emulation is currently based around conversion to
double. This is inherently unsuitable for a good emulation of any
floatx80 operation, even before considering that it is a particularly
naive implementation using double (adding 1 then using log rather than
attempting a better emulation using log1p).
Reimplement using the soft-float operations, as was done for f2xm1; as
in that case, m68k has related operations but not exactly this one and
it seemed safest to implement directly rather than reusing the m68k
code to avoid accumulation of errors.
A test is included with many randomly generated inputs. The
assumption of the test is that the result in round-to-nearest mode
should always be one of the two closest floating-point numbers to the
mathematical value of y * log2(x + 1); the implementation aims to do
somewhat better than that (about 70 correct bits before rounding). I
haven't investigated how accurate hardware is.
Intel manuals describe a narrower range of valid arguments to this
instruction than AMD manuals. The implementation accepts the wider
range (it's needed anyway for the core code to be reusable in a
subsequent patch reimplementing fyl2x), but the test only has inputs
in the narrower range so that it's valid on hardware that may reject
or produce poor results for inputs outside that range.
Code in the previous implementation that sets C2 for some out-of-range
arguments is not carried forward to the new implementation; C2 is
undefined for this instruction and I suspect that code was just
cut-and-pasted from the trigonometric instructions (fcos, fptan, fsin,
fsincos) where C2 *is* defined to be set for out-of-range arguments.
Backports commit 5eebc49d2d0aa5fc7e90eeac97533051bb7b72fa from qemu
The x87 fprem and fprem1 emulation is currently based around
conversion to double, which is inherently unsuitable for a good
emulation of any floatx80 operation. Reimplement using the soft-float
floatx80 remainder operations.
Backports commit 5ef396e2ba865f34a4766dbd60c739fb4bcb4fcc from qemu
Both x87 and m68k need the low parts of the quotient for their
remainder operations. Arrange for floatx80_modrem to track those bits
and return them via a pointer.
The architectures using float32_rem and float64_rem do not appear to
need this information, so the *_rem interface is left unchanged and
the information returned only from floatx80_modrem. The logic used to
determine the low 7 bits of the quotient for m68k
(target/m68k/fpu_helper.c:make_quotient) appears completely bogus (it
looks at the result of converting the remainder to integer, the
quotient having been discarded by that point); this patch does not
change that, but the m68k maintainers may wish to do so.
Backports commit 445810ec915687d37b8ae0ef8d7340ab4a153efa from qemu
The floatx80 remainder implementation unnecessarily sets the high bit
of bSig explicitly. By that point in the function, arguments that are
invalid, zero, infinity or NaN have already been handled and
subnormals have been through normalizeFloatx80Subnormal, so the high
bit will already be set. Remove the unnecessary code.
Backports commit 566601f1f9d972e44214696d3cb320e6c18880aa from qemu
The floatx80 remainder implementation sometimes returns the numerator
unchanged when the denominator is sufficiently larger than the
numerator. But if the value to be returned unchanged is a
pseudo-denormal, that is incorrect. Fix it to normalize the numerator
in that case.
Backports commit b662495dca0a2a36008cf8def91e2566519ed3f2 from qemu
The floatx80 remainder implementation ignores the high bit of the
significand when checking whether an operand (numerator) with zero
exponent is zero. This means it mishandles a pseudo-denormal
representation of 0x1p-16382L by treating it as zero. Fix this by
checking the whole significand instead.
Backports commit 499a2f7b554a295cfc10f8cd026d9b20a38fe664 from qemu
The m68k-specific softfloat code includes a function floatx80_mod that
is extremely similar to floatx80_rem, but computing the remainder
based on truncating the quotient toward zero rather than rounding it
to nearest integer. This is also useful for emulating the x87 fprem
and fprem1 instructions. Change the floatx80_rem implementation into
floatx80_modrem that can perform either operation, with both
floatx80_rem and floatx80_mod as thin wrappers available for all
targets.
There does not appear to be any use for the _mod operation for other
floating-point formats in QEMU (the only other architectures using
_rem at all are linux-user/arm/nwfpe, for FPA emulation, and openrisc,
for instructions that have been removed in the latest version of the
architecture), so no change is made to the code for other formats.
Backports commit 6b8b0136ab3018e4b552b485f808bf66bcf19ead from qemu
The x87 f2xm1 emulation is currently based around conversion to
double. This is inherently unsuitable for a good emulation of any
floatx80 operation, even before considering that it is a particularly
naive implementation using double (computing with pow and then
subtracting 1 rather than attempting a better emulation using expm1).
Reimplement using the soft-float operations, including additions and
multiplications with higher precision where appropriate to limit
accumulation of errors. I considered reusing some of the m68k code
for transcendental operations, but the instructions don't generally
correspond exactly to x87 operations (for example, m68k has 2^x and
e^x - 1, but not 2^x - 1); to avoid possible accumulation of errors
from applying multiple such operations each rounding to floatx80
precision, I wrote a direct implementation of 2^x - 1 instead. It
would be possible in principle to make the implementation more
efficient by doing the intermediate operations directly with
significands, signs and exponents and not packing / unpacking floatx80
format for each operation, but that would make it significantly more
complicated and it's not clear that's worthwhile; the m68k emulation
doesn't try to do that.
A test is included with many randomly generated inputs. The
assumption of the test is that the result in round-to-nearest mode
should always be one of the two closest floating-point numbers to the
mathematical value of 2^x - 1; the implementation aims to do somewhat
better than that (about 70 correct bits before rounding). I haven't
investigated how accurate hardware is.
Backports commit eca30647fc078f4d9ed1b455bd67960f99dbeb7a from qemu
In commit cfdb2c0c95ae9205b0 ("target/arm: Vectorize SABA/UABA") we
replaced the old handling of SABA/UABA with a vectorized implementation
which returns early rather than falling into the loop-ever-elements
code. We forgot to delete the part of the old looping code that
did the accumulate step, and Coverity correctly warns (CID 1428955)
that this code is now dead. Delete it.
Fixes: cfdb2c0c95ae9205b0
Backports commit ced7e8edb282765685d2ba0206a11f8692d8ec1c from qemu
Since commit ba3e7926691ed3 it has been unnecessary for target code
to call gen_io_end() after an IO instruction in icount mode; it is
sufficient to call gen_io_start() before it and to force the end of
the TB.
Many now-unnecessary calls to gen_io_end() were removed in commit
9e9b10c6491153b, but some were missed or accidentally added later.
Remove unneeded calls from the arm target:
* the call in the handling of exception-return-via-LDM is
unnecessary, and the code is already forcing end-of-TB
* the call in the VFP access check code is more complicated:
we weren't ending the TB, so we need to add the code to
force that by setting DISAS_UPDATE
* the doc comment for ARM_CP_IO doesn't need to mention
gen_io_end() any more
Backports commit 55c812b74289863c348449135812027d188f040a from qemu
The functions neon_element_offset(), neon_load_element(),
neon_load_element64(), neon_store_element() and
neon_store_element64() are used only in the translate-neon.inc.c
file, so move their definitions there.
Since the .inc.c file is #included in translate.c this doesn't make
much difference currently, but it's a more logical place to put the
functions and it might be helpful if we ever decide to try to make
the .inc.c files genuinely separate compilation units.
Backports commit 6fb5787898aab6aa04887fed9cf3220dd4c3f36a from qemu
Convert the Neon VTRN insn to decodetree. This is the last insn in the
Neon data-processing group, so we can remove all the now-unused old
decoder framework.
It's possible that there's a more efficient implementation of
VTRN, but for this conversion we just copy the existing approach.
Backports commit d4366190f84fe89cc5d46da995dac1e7d541b98e from qemu
Convert the Neon VSWP insn to decodetree. Since the new implementation
doesn't have to share a pass-loop with the other 2-reg-misc operations
we can implement the swap with 64-bit accesses rather than 32-bits
(which brings us into line with the pseudocode and is more efficient).
Backports commit 8ab3a227a0f13f0ff85846f36f7c466769aef4fc from qemu
Convert the Neon 2-reg-misc VRINT insns to decodetree.
Giving these insns their own do_vrint() function allows us
to change the rounding mode just once at the start and end
rather than doing it for every element in the vector.
Backports commit 128123ea34e9e6afe4842aefcb9cf84b9642ac22 from qemu
Convert the Neon 2-reg-misc insns which are implemented with
simple calls to functions that take the input, output and
fpstatus pointer.
Backports commit 3e96b205286dfb8bbf363229709e4f8648fce379 from qemu
Convert the Neon VQABS and VQNEG insns to decodetree.
Since these are the only ones which need cpu_env passing to
the helper, we wrap the helper rather than creating a whole
new do_2misc_env() function.
Backports commit 4936f38abe6db0a9d23fd04e4cb0cf4d51cff174 from qemu
Convert the remaining ops in the Neon 2-reg-misc group which
can be implemented simply with our do_2misc() helper.
Backports commit 84eae770af69c37a92496a4c4248875c070d5ee3 from qemu
Make gen_swap_half() take a source and destination TCGv_i32 rather
than modifying the input TCGv_i32; we're going to want to be able to
use it with the more flexible function signature, and this also
brings it into line with other functions like gen_rev16() and
gen_revsh().
Backports commit 8ec3de7018a8198624aae49eef5568256114a829 from qemu
All the other typedefs like these spell "Op" with a lowercase 'p';
remane the NeonGenTwoSingleOPFn and NeonGenTwoDoubleOPFn typedefs to
match.
Backports commit 5de3fd045be11b74cd0fbf36c6d4fb8387d5463b from qemu
The NeonGenOneOpFn typedef breaks with the pattern of the other
NeonGen*Fn typedefs, because it is a TCGv_i64 -> TCGv_i64 operation
but it does not have '64' in its name. Rename it to NeonGenOne64OpFn,
so that the old name is available for a TCGv_i32 -> TCGv_i32 operation
(which we will need in a subsequent commit).
Backports commit 039f4e809ad2772fb33de4511ff68a485d875618 from qemu