Much like recpe the ARM ARM has simplified the pseudo code for the
calculation which is done on a fixed point 9 bit integer maths. So
while adding f16 we can also clean this up to be a little less heavy
on the floating point and just return the fractional part and leave
the calle's to do the final packing of the result.
Backports commit d719cbc7641991d16b891ffbbfc3a16a04e37b9a from qemu
Also removes a load of symbols that seem unnecessary from the header_gen script
It looks like the ARM ARM has simplified the pseudo code for the
calculation which is done on a fixed point 9 bit integer maths. So
while adding f16 we can also clean this up to be a little less heavy
on the floating point and just return the fractional part and leave
the calle's to do the final packing of the result.
Backports commit 5eb70735af1c0b607bf2671a53aff3710cc1672f from qemu
This is a little bit of a departure from softfloat's original approach
as we skip the estimate step in favour of a straight iteration. There
is a minor optimisation to avoid calculating more bits of precision
than we need however this still brings a performance drop, especially
for float64 operations.
Backports commit c13bb2da9eedfbc5886c8048df1bc1114b285fb0 from qemu
The compare function was already expanded from a macro. I keep the
macro expansion but move most of the logic into a compare_decomposed.
Backports commit 0c4c90929143a530730e2879204a55a30bf63758 from qemu
Let's do the same re-factor treatment for minmax functions. I still
use the MACRO trick to expand but now all the checking code is common.
Backports commit 89360067071b1844bf745682e18db7dde74cdb8d from qemu
This is one of the simpler manipulations you could make to a floating
point number.
Backports commit 0bfc9f195209593e91a98cf2233753f56a2e5c02 from qemu
These are considerably simpler as the lower order integers can just
use the higher order conversion function. As the decomposed fractional
part is a full 64 bit rounding and inexact handling comes from the
pack functions.
Backports commit c02e1fb80b553d47420f7492de4bc590c2461a86 from qemu
We share the common int64/uint64_pack_decomposed function across all
the helpers and simply limit the final result depending on the final
size.
Backports commit ab52f973a504f8de0c5df64631ba4caea70a7d9e from qemu
We can now add float16_round_to_int and use the common round_decomposed and
canonicalize functions to have a single implementation for
float16/32/64 round_to_int functions.
Backports commit dbe4d53a590f5689772b683984588b3cf6df163e from qemu
We can now add float16_muladd and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 muladd functions.
Backports commit d446830a3aac33e7221e361dad3ab1e1892646cb from qemu
We can now add float16_div and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 versions.
Backports commit cf07323d494f4bc225e405688c2e455c3423cc40 from qemu
We can now add float16_mul and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 versions.
Backports commit 74d707e2cc1e406068acad8e5559cd2584b1073a from qemu
We can now add float16_add/sub and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 add and sub functions.
Backports commit 6fff216769cf7eaa3961c85dee7a72838696d365 from qemu
This will be required when expanding the MINMAX() macro for 16
bit/half-precision operations.
Backports commit 210cbd4910ae9e41e0a1785b96890ea2c291b381 from qemu
compare_litqobj_to_qobj() lacks a qlit_ prefix. Moreover, "compare"
suggests -1, 0, +1 for less than, equal and greater than. The
function actually returns non-zero for equal, zero for unequal.
Rename to qlit_equal_qobject().
Its return type will be cleaned up in the next patch.
Backports commit 60cc2eb7afd40b9cbaa35a5e0b54f365ac6e49f1 from qemu
This implements emulation of the new SM4 instructions that have
been added as an optional extension to the ARMv8 Crypto Extensions
in ARM v8.2.
Backports commit b6577bcd251ca0d57ae1de149e3c706b38f21587 from qemu
This implements emulation of the new SM3 instructions that have
been added as an optional extension to the ARMv8 Crypto Extensions
in ARM v8.2.
Backports commit 80d6f4c6bbb718f343a832df8dee15329cc7686c from qemu
This implements emulation of the new SHA-512 instructions that have
been added as an optional extensions to the ARMv8 Crypto Extensions
in ARM v8.2.
Backports commit 90b827d131812d7f0a8abb13dba1942a2bcee821 from qemu
The x86 vector instruction set is extremely irregular. With newer
editions, Intel has filled in some of the blanks. However, we don't
get many 64-bit operations until SSE4.2, introduced in 2009.
The subsequent edition was for AVX1, introduced in 2011, which added
three-operand addressing, and adjusts how all instructions should be
encoded.
Given the relatively narrow 2 year window between possible to support
and desirable to support, and to vastly simplify code maintainence,
I am only planning to support AVX1 and later cpus.
Backports commit 770c2fc7bb70804ae9869995fd02dadd6d7656ac from qemu
Use dup to convert a non-constant scalar to a third vector.
Add addition, multiplication, and logical operations with an immediate.
Add addition, subtraction, multiplication, and logical operations with
a non-constant scalar. Allow for the front-end to build operations in
which the scalar operand comes first.
Backports commit 22fc3527034678489ec554e82fd52f8a7f05418e from qemu
No vector ops as yet. SSE only has direct support for 8- and 16-bit
saturation; handling 32- and 64-bit saturation is much more expensive.
Backports commit f49b12c6e6a75a5bd109bcbbda072b24e5fb8dfd from qemu
Opcodes are added for scalar and vector shifts, but considering the
varied semantics of these do not expose them to the front ends. Do
go ahead and provide them in case they are needed for backend expansion.
Backports commit d0ec97967f940bbc11dced83422b39c224127f1e from qemu
With no fixed array allocation, we can't overflow a buffer.
This will be important as optimizations related to host vectors
may expand the number of ops used.
Use QTAILQ to link the ops together.
Backports commit 15fa08f8451babc88d733bd411d4c94976f9d0f8 from qemu
This was never used since its introduction in commit
196ea13104f8 ("memory: Add global-locking property to memory
regions").
Backports commit e2fbe20851ceec5ccd7b539a89db0420393fb85d from qemu
Implement the TT instruction which queries the security
state and access permissions of a memory location.
Backports commit 5158de241b0fb344a6c948dfcbc4e611ab5fafbe from qemu
In the v7M architecture, there is an invariant that if the CPU is
in Handler mode then the CONTROL.SPSEL bit cannot be nonzero.
This in turn means that the current stack pointer is always
indicated by CONTROL.SPSEL, even though Handler mode always uses
the Main stack pointer.
In v8M, this invariant is removed, and CONTROL.SPSEL may now
be nonzero in Handler mode (though Handler mode still always
uses the Main stack pointer). In preparation for this change,
change how we handle this bit: rename switch_v7m_sp() to
the now more accurate write_v7m_control_spsel(), and make it
check both the handler mode state and the SPSEL bit.
Note that this implicitly changes the point at which we switch
active SP on exception exit from before we pop the exception
frame to after it.
Backports commit de2db7ec894f11931932ca78cd14a8d2b1389d5b from qemu
We have object_get_objects_root() to keep user created objects, however
no place for objects that will be used internally. Create such a
container for internal objects.
Backports commit 7c47c4ead75d0b733ee8f2f51fd1de0644cc1308 from qemu
Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump
boolean test. Replace the tb_set_jmp_target1 ifdef with an unconditional
function tb_target_set_jmp_target.
While we're touching all backends, add a parameter for tb->tc_ptr;
we're going to need it shortly for some backends.
Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c.
Backports commit a85833933628384d74ec412024d55cf012640287 from qemu
Define a new MachineClass field ignore_memory_transaction_failures.
If this is flag is true then the CPU will ignore memory transaction
failures which should cause the CPU to take an exception due to an
access to an unassigned physical address; the transaction will
instead return zero (for a read) or be ignored (for a write). This
should be set only by legacy board models which rely on the old
RAZ/WI behaviour for handling devices that QEMU does not yet model.
New board models should instead use "unimplemented-device" for all
memory ranges where the guest will attempt to probe for a device that
QEMU doesn't implement and a stub device is required.
We need this for ARM boards, where we're about to implement support for
generating external aborts on memory transaction failures. Too many
of our legacy board models rely on the RAZ/WI behaviour and we
would break currently working guests when their "probe for device"
code provoked an external abort rather than a RAZ.
Backports commit ed860129acd3fcd0b1e47884e810212aaca4d21b from qemu
Implement the BXNS v8M instruction, which is like BX but will do a
jump-and-switch-to-NonSecure if the branch target address has bit 0
clear.
This is the first piece of code which implements "switch to the
other security state", so the commit also includes the code to
switch the stack pointers around, which is the only complicated
part of switching security state.
BLXNS is more complicated than just "BXNS but set the link register",
so we leave it for a separate commit.
Backports commit fb602cb726b3ebdd01ef3b1732d74baf9fee7ec9 from qemu
Rename memory_region_init_rom() to memory_region_init_rom_nomigrate()
and memory_region_init_rom_device() to
memory_region_init_rom_device_nomigrate().
Backports commit b59821a95bd1d7cb4697fd7748725c910582e0e7 from qemu
Rename memory_region_init_ram() to memory_region_init_ram_nomigrate().
This leaves the way clear for us to provide a memory_region_init_ram()
which does handle migration.
Backports commit 1cfe48c1ce219b60a9096312f7a61806fae64ab3 from qemu
Commit 1f5c00cfdb8114c ("qom/cpu: move tlb_flush to cpu_common_reset")
moved the call to tlb_flush() from the target-specific reset handlers
into the common code qom/cpu.c file, and protected the call with
"#ifdef CONFIG_SOFTMMU" to avoid that it is called for linux-user
only targets. But since qom/cpu.c is common code, CONFIG_SOFTMMU is
*never* defined here, so the tlb_flush() was simply never executed
anymore. Fix it by introducing a wrapper for tlb_flush() in a file
that is re-compiled for each target, i.e. in translate-all.c.
Backports commit 2cd53943115be5118b5b2d4b80ee0a39c94c4f73 from qemu
Allocating an arbitrarily-sized array of tbs results in either
(a) a lot of memory wasted or (b) unnecessary flushes of the code
cache when we run out of TB structs in the array.
An obvious solution would be to just malloc a TB struct when needed,
and keep the TB array as an array of pointers (recall that tb_find_pc()
needs the TB array to run in O(log n)).
Perhaps a better solution, which is implemented in this patch, is to
allocate TB's right before the translated code they describe. This
results in some memory waste due to padding to have code and TBs in
separate cache lines--for instance, I measured 4.7% of padding in the
used portion of code_gen_buffer when booting aarch64 Linux on a
host with 64-byte cache lines. However, it can allow for optimizations
in some host architectures, since TCG backends could safely assume that
the TB and the corresponding translated code are very close to each
other in memory. See this message by rth for a detailed explanation:
https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html
Subject: Re: GSoC 2017 Proposal: TCG performance enhancements
Backports commit 6e3b2bfd6af488a896f7936e99ef160f8f37e6f2 from qemu