unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2025-01-10 06:15:35 +00:00

Author	SHA1	Message	Date
Richard Henderson	cd538f0b7e	tcg: Initialize cpu_env generically This is identical for each target. So, move the initialization to common code. Move the variable itself out of tcg_ctx and name it cpu_env to minimize changes within targets. This also means we can remove tcg_global_reg_new_{ptr,i32,i64}, since there are no longer global-register temps created by targets. Backports commit 1c2adb958fc07e5b3e81ed21b801c04a15f41f4f from qemu	2018-03-15 15:49:19 -04:00
Lioncash	72c18027a6	cpu: Unicorn-ify the qemu_tcg_mttcg_enabled() macro Gets rid of reliance on a non-existent variable	2018-03-14 12:10:29 -04:00
Emilio G. Cota	3fe9866ffe	osdep: introduce qemu_mprotect_rwx/none Backports commit 5fa64b3130af9a45e7e2a904bde1f8cfb72be5c9 from qemu	2018-03-14 12:10:28 -04:00
Emilio G. Cota	5ad6116f20	tcg: allocate optimizer temps with tcg_malloc Groundwork for supporting multiple TCG contexts. While at it, also allocate temps_used directly as a bitmap of the required size, instead of using a bitmap of TCG_MAX_TEMPS via TCGTempSet. Performance-wise we lose about 1.12% in a translation-heavy workload such as booting+shutting down debian-arm:
Performance counter stats for 'taskset -c 0 arm-softmmu/qemu-system-arm \
    -machine type=virt -nographic -smp 1 -m 4096 \
    -netdev user,id=unet,hostfwd=tcp::2222-:22 \
    -device virtio-net-device,netdev=unet \
    -drive file=die-on-boot.qcow2,id=myblock,index=0,if=none \
    -device virtio-blk-device,drive=myblock \
    -kernel kernel.img -append console=ttyAMA0 root=/dev/vda1 \
    -name arm,debug-threads=on -smp 1' (10 runs):
              exec time (s)   Relative slowdown wrt original (%)
---------------------------------------------------------------
original      20.213321616    0.
tcg_malloc    20.441130078    1.1270214
TCGContext    20.477846517    1.3086662
g_malloc      20.780527895    2.8061013
The other two alternatives shown in the table are:
- TCGContext: embed temps[TCG_MAX_TEMPS] and TCGTempSet used_temps in TCGContext. This is simple enough but it isn't faster than using tcg_malloc; moreover, it wastes memory.
- g_malloc: allocate/deallocate both temps and used_temps every time tcg_optimize is executed.
Backports commit 34184b071817b4f9edbfd1aa2225c196f05a0947 from qemu	2018-03-14 12:10:28 -04:00
Emilio G. Cota	66fa401871	exec-all: rename tb_free to tb_remove We don't really free anything in this function anymore; we just remove the TB from the binary search tree. Backports commit be1e01171b556807198c84feac7cf4bca0d904c2 from qemu	2018-03-13 16:20:41 -04:00
Emilio G. Cota	f7c984d21f	translate-all: use a binary search tree to track TBs in TBContext This is a prerequisite for supporting multiple TCG contexts, since we will have threads generating code in separate regions of code_gen_buffer. For this we need a new field (.size) in struct tb_tc to keep track of the size of the translated code. This field uses a size_t to avoid adding a hole to the struct, although really an unsigned int would have been enough. The comparison function we use is optimized for the common case: insertions. Profiling shows that upon booting debian-arm, 98% of comparisons are between existing tb's (i.e. a->size and b->size are both !0), which happens during insertions (and removals, but those are rare). The remaining cases are lookups. From reading the glib sources we see that the first key is always the lookup key. However, the code does not assume this to always be the case because this behaviour is not guaranteed in the glib docs. However, we embed this knowledge in the code as a branch hint for the compiler. Note that tb_free does not free space in the code_gen_buffer anymore, since we cannot easily know whether the tb is the last one inserted in code_gen_buffer. The next patch in this series renames tb_free to tb_remove to reflect this. Performance-wise, lookups in tb_find_pc are the same as before: O(log n). 
However, insertions are O(log n) instead of O(1), which results in a small slowdown when booting debian-arm:
Performance counter stats for 'build/arm-softmmu/qemu-system-arm \
    -machine type=virt -nographic -smp 1 -m 4096 \
    -netdev user,id=unet,hostfwd=tcp::2222-:22 \
    -device virtio-net-device,netdev=unet \
    -drive file=img/arm/jessie-arm32.qcow2,id=myblock,index=0,if=none \
    -device virtio-blk-device,drive=myblock \
    -kernel img/arm/aarch32-current-linux-kernel-only.img \
    -append console=ttyAMA0 root=/dev/vda1 \
    -name arm,debug-threads=on -smp 1' (10 runs):
- Before:
    8048.598422 task-clock (msec) # 0.931 CPUs utilized ( +- 0.28% )
    16,974 context-switches # 0.002 M/sec ( +- 0.12% )
    0 cpu-migrations # 0.000 K/sec
    10,125 page-faults # 0.001 M/sec ( +- 1.23% )
    35,144,901,879 cycles # 4.367 GHz ( +- 0.14% )
    <not supported> stalled-cycles-frontend
    <not supported> stalled-cycles-backend
    65,758,252,643 instructions # 1.87 insns per cycle ( +- 0.33% )
    10,871,298,668 branches # 1350.707 M/sec ( +- 0.41% )
    192,322,212 branch-misses # 1.77% of all branches ( +- 0.32% )
    8.640869419 seconds time elapsed ( +- 0.57% )
- After:
    8146.242027 task-clock (msec) # 0.923 CPUs utilized ( +- 1.23% )
    17,016 context-switches # 0.002 M/sec ( +- 0.40% )
    0 cpu-migrations # 0.000 K/sec
    18,769 page-faults # 0.002 M/sec ( +- 0.45% )
    35,660,956,120 cycles # 4.378 GHz ( +- 1.22% )
    <not supported> stalled-cycles-frontend
    <not supported> stalled-cycles-backend
    65,095,366,607 instructions # 1.83 insns per cycle ( +- 1.73% )
    10,803,480,261 branches # 1326.192 M/sec ( +- 1.95% )
    195,601,289 branch-misses # 1.81% of all branches ( +- 0.39% )
    8.828660235 seconds time elapsed ( +- 0.38% )
Backports commit 2ac01d6dafabd4a726254eea98824c798d416ee4 from qemu	2018-03-13 16:18:29 -04:00
Richard Henderson	35e551dc45	tcg: Remove CF_IGNORE_ICOUNT Now that we have curr_cflags, we can include CF_USE_ICOUNT early and then remove it as necessary. Backports commit 416986d3f97329655e30da7271a2d11c6d707b06 from qemu	2018-03-13 15:28:47 -04:00
Richard Henderson	f04beeea78	tcg: Add CF_LAST_IO + CF_USE_ICOUNT to CF_HASH_MASK These flags are used by target/*/translate.c, and affect code generation. Backports commit 0cf8a44c2f56ba884c2f6db47d27fbb24975daa3 from qemu	2018-03-13 15:25:09 -04:00
Emilio G. Cota	c384da2f47	tcg: convert tb->cflags reads to tb_cflags(tb) Convert all existing readers of tb->cflags to tb_cflags, so that we use atomic_read and therefore avoid undefined behaviour in C11. Note that the remaining setters/getters of the field are protected by tb_lock, and therefore do not need conversion. Luckily all readers access the field via 'tb->cflags' (so no foo.cflags, bar->cflags in the code base), which makes the conversion easily scriptable:
FILES=$(git grep 'tb->cflags' target include/exec/gen-icount.h \
    accel/tcg/translator.c | cut -f1 -d':' | sort | uniq)
perl -pi -e 's/([^.>])tb->cflags/$1tb_cflags(tb)/g' $FILES
perl -pi -e 's/([a-z->.]*)(->|\.)tb->cflags/tb_cflags($1$2tb)/g' $FILES
Then manually fixed the few errors that checkpatch reported. Compile-tested for all targets. Backports commit c5a49c63fa26e8825ad101dfe86339ae4c216539 from qemu	2018-03-13 14:57:51 -04:00
Richard Henderson	d6ca4d59dc	tcg: Include CF_COUNT_MASK in CF_HASH_MASK Backports commit cdfef1715c779eb528d633e8b76cbc8a10e71ac8 from qemu	2018-03-13 14:42:42 -04:00
Richard Henderson	5d360366e9	tcg: Add CPUState cflags_next_tb We were generating code during tb_invalidate_phys_page_range, check_watchpoint, cpu_io_recompile, and (seemingly) discarding the TB, assuming that it would magically be picked up during the next iteration through the cpu_exec loop. Instead, record the desired cflags in CPUState so that we request the proper TB so that there is no more magic. Backports commit 9b990ee5a3cc6aa38f81266fb0c6ef37a36c45b9 from qemu	2018-03-13 14:39:43 -04:00
Emilio G. Cota	b5961a139b	tcg: define CF_PARALLEL and use it for TB hashing along with CF_COUNT_MASK This will enable us to decouple code translation from the value of parallel_cpus at any given time. It will also help us minimize TB flushes when generating code via EXCP_ATOMIC. Note that the declaration of parallel_cpus is brought to exec-all.h to be able to define there the "curr_cflags" inline. Backports commit 4e2ca83e71b51577b06b1468e836556912bd5b6e from qemu	2018-03-13 14:32:43 -04:00
Emilio G. Cota	6bc05eeee4	tb hash: track translated blocks with qht Having a fixed-size hash table for keeping track of all translation blocks is suboptimal: some workloads are just too big or too small to get maximum performance from the hash table. The MRU promotion policy helps improve performance when the hash table is a little undersized, but it cannot make up for severely undersized hash tables. Furthermore, frequent MRU promotions result in writes that are a scalability bottleneck. For scalability, lookups should only perform reads, not writes. This is not a big deal for now, but it will become one once MTTCG matures. The appended fixes these issues by using qht as the implementation of the TB hash table. This solution is superior to other alternatives considered, namely: - master: implementation in QEMU before this patchset - xxhash: before this patch, i.e. fixed buckets + xxhash hashing + MRU. - xxhash-rcu: fixed buckets + xxhash + RCU list + MRU. MRU is implemented here by adding an intermediate struct that contains the u32 hash and a pointer to the TB; this allows us, on an MRU promotion, to copy said struct (that is not at the head), and put this new copy at the head. After a grace period, the original non-head struct can be eliminated, and after another grace period, freed. - qht-fixed-nomru: fixed buckets + xxhash + qht without auto-resize + no MRU for lookups; MRU for inserts. The appended solution is the following: - qht-dyn-nomru: dynamic number of buckets + xxhash + qht w/ auto-resize + no MRU for lookups; MRU for inserts. The plots below compare the considered solutions. The Y axis shows the boot time (in seconds) of a debian jessie image with arm-softmmu; the X axis sweeps the number of buckets (or initial number of buckets for qht-autoresize). The plots in PNG format (and with errorbars) can be seen here: http://imgur.com/a/Awgnq Each test runs 5 times, and the entire QEMU process is pinned to a single core for repeatability of results. 
[Plot, host Intel Xeon E5-2690: boot time in seconds (Y axis, 20 to 28) against log2 number of buckets (X axis, 14 to 24) for master, xxhash, xxhash-rcu, qht-fixed-nomru, qht-dyn-mru and qht-dyn-nomru.]
[Plot, host Intel i7-4790K: boot time in seconds (Y axis, 9.5 to 14.5) against log2 number of buckets (X axis, 14 to 24) for the same six series.]
Note that the original point before this patch series is X=15 for "master"; the little sensitivity to the increased number of buckets is due to the poor hashing function in master. xxhash-rcu has significant overhead due to the constant churn of allocating and deallocating intermediate structs for implementing MRU. An alternative would be to consider failed lookups as "maybe not there", and then acquire the external lock (tb_lock in this case) to really confirm that there was indeed a failed lookup. 
This, however, would not be enough to implement dynamic resizing--this is more complex: see "Resizable, Scalable, Concurrent Hash Tables via Relativistic Programming" by Triplett, McKenney and Walpole. This solution was discarded due to the very coarse RCU read critical sections that we have in MTTCG; resizing requires waiting for readers after every pointer update, and resizes require many pointer updates, so this would quickly become prohibitive. qht-fixed-nomru shows that MRU promotion is advisable for undersized hash tables. However, qht-dyn-mru shows that MRU promotion is not important if the hash table is properly sized: there is virtually no difference in performance between qht-dyn-nomru and qht-dyn-mru. Before this patch, we're at X=15 on "xxhash"; after this patch, we're at X=15 @ qht-dyn-nomru. This patch thus matches the best performance that we can achieve with optimum sizing of the hash table, while keeping the hash table scalable for readers. The improvement we get before and after this patch for booting debian jessie with arm-softmmu is: - Intel Xeon E5-2690: 10.5% less time - Intel i7-4790K: 5.2% less time We could get this same improvement _for this particular workload_ by statically increasing the size of the hash table. But this would hurt workloads that do not need a large hash table. The dynamic (upward) resizing allows us to start small and enlarge the hash table as needed. A quick note on downsizing: the table is resized back to 2^15 buckets on every tb_flush; this makes sense because it is not guaranteed that the table will reach the same number of TBs later on (e.g. most bootup code is thrown away after boot); it makes sense to grow the hash table as more code blocks are translated. This also avoids the complication of having to build downsizing hysteresis logic into qht. Backports commit 909eaac9bbc2ed4f3a82ce38e905b87d478a3e00 from qemu	2018-03-13 14:16:26 -04:00
Lioncash	e45c294405	Backport qht hashtable	2018-03-13 13:55:30 -04:00
Kevin Wolf	025e354370	qdict: Introduce qdict_rename_keys() A few block drivers will need to rename .bdrv_create options for their QAPIfication, so let's have a helper function for that. Backports commit bcebf102ccc3c6db327f341adc379fdf0673ca6b from qemu	2018-03-12 10:11:48 -04:00
Lioncash	a81439c7ca	exec: Drop unnecessary code for unicorn The dirty memory code isn't strictly necessary	2018-03-12 10:11:46 -04:00
Alexey Kardashevskiy	b90333a531	memory: Share special empty FlatView This shares a cached empty FlatView among address spaces. The empty FV is used every time when a root MR renders into a FV without memory sections which happens when MR or its children are not enabled or zero-sized. The empty_view is not NULL to keep the rest of memory API intact; it also has a dispatch tree for the same reason. On POWER8 with 255 CPUs, 255 virtio-net, 40 PCI bridges guest this halves the amount of FlatView's in use (557 -> 260) and dispatch tables (~800000 -> ~370000). In an unrelated experiment with 112 non-virtio devices on x86 ("-M pc"), only 4 FlatViews are alive, and about ~2000 are created at startup. Backports commit 092aa2fc65b7a35121616aad8f39d47b8f921618 from qemu	2018-03-11 22:34:28 -04:00
Alexey Kardashevskiy	1fd8b64072	memory: Get rid of address_space_init_shareable Since FlatViews are shared now and ASes not, this gets rid of address_space_init_shareable(). This should cause no behavioural change. Backports commit b516572f31c0ea0937cd9d11d9bd72dd83809886 from qemu	2018-03-11 22:12:38 -04:00
Alexey Kardashevskiy	f2c72dc278	memory: Share FlatView's and dispatch trees between address spaces This allows sharing flat views between address spaces (AS) when the same root memory region is used when creating a new address space. This is done by walking through all ASes and caching one FlatView per physical root MR (i.e. not aliased). This removes search for duplicates from address_space_init_shareable() as FlatViews are shared elsewhere and keeping as::ref_count correct seems an unnecessary and useless complication. This should cause no change in memory use or boot time yet. Backports commit 967dc9b1194a9281124b2e1ce67b6c3359a2138f from qemu	2018-03-11 22:05:44 -04:00
Alexey Kardashevskiy	d9bc1bcc8c	memory: Rename mem_begin/mem_commit/mem_add helpers This renames some helpers to reflect better what they do. This should cause no behavioural change. Backports commit 8629d3fcb77e9775e44d9051bad0fb5187925eae from qemu	2018-03-11 21:36:50 -04:00
Alexey Kardashevskiy	aa2b76b4e8	memory: Switch memory from using AddressSpace to FlatView FlatView's will be shared between AddressSpace's and subpage_t and MemoryRegionSection cannot store AS anymore, hence this change. In particular, for:
typedef struct subpage_t {
    MemoryRegion iomem;
-   AddressSpace *as;
+   FlatView *fv;
    hwaddr base;
    uint16_t sub_section[];
} subpage_t;
struct MemoryRegionSection {
    MemoryRegion *mr;
-   AddressSpace *address_space;
+   FlatView *fv;
    hwaddr offset_within_region;
    Int128 size;
    hwaddr offset_within_address_space;
    bool readonly;
};
This should cause no behavioural change. Backports commit 166206845f7fd75e720e6feea0bb01957c8da07f from qemu	2018-03-11 21:21:37 -04:00
Lioncash	1591f208c0	memory: Move AddressSpaceDispatch from AddressSpace to FlatView As we are going to share FlatView's between AddressSpace's, and AddressSpaceDispatch is a structure to perform quick lookup in FlatView, this moves ASD to FlatView. After previously open coded ASD rendering, we can also remove as->next_dispatch as the new FlatView pointer is stored on a stack and set to an AS atomically. flatview_destroy() is executed under RCU instead of address_space_dispatch_free() now. This makes mem_begin/mem_commit to work with ASD and mem_add with FV as later on mem_add will be taking FV as an argument anyway. This should cause no behavioural change. Backports commit 66a6df1dc6d5b28cc3e65db0d71683fbdddc6b62 from qemu	2018-03-11 20:40:24 -04:00
Marc-André Lureau	aee9f7327f	machine: use class base init generated name machine_class_base_init() member name is allocated by machine_class_base_init(), but not freed by machine_class_finalize(). Simply freeing there doesn't work, because DEFINE_PC_MACHINE() overwrites it with a literal string. Fix DEFINE_PC_MACHINE() not to overwrite it, and add the missing free to machine_class_finalize(). Backports commit 8ea753718b2d1a42e9ce7b8db9f5e4e1f330e827 from qemu	2018-03-11 16:54:40 -04:00
Lioncash	8648b1df4f	include/elf: Update elf.h to commit f71a8eaffba3271cf7cdad95572f6996f7523a5b	2018-03-11 15:34:35 -04:00
Eduardo Habkost	7c7bb4c6d1	machine: Eliminate QEMUMachine and qemu_register_machine() The struct is not used anymore and can be eliminated. Backports commit 3b53e45f43825caaaf4fad6a5b85ce6a9949ff02 from qemu	2018-03-11 15:22:25 -04:00
Andreas Färber	048aaf05ca	Revert use of DEFINE_MACHINE() for registrations of multiple machines The script used for converting from QEMUMachine had used one DEFINE_MACHINE() per machine registered. In cases where multiple machines are registered from one source file, avoid the excessive generation of module init functions by reverting this unrolling. Backports commit 8a661aea0e7f6e776c6ebc9abe339a85b34fea1d from qemu	2018-03-11 15:17:17 -04:00
Eduardo Habkost	a7f59d7771	Use DEFINE_MACHINE() to register all machines Convert all machines to use DEFINE_MACHINE() instead of QEMUMachine automatically using a script. Backports commit e264d29de28c5b0be3d063307ce9fb613b427cc3 from qemu	2018-03-11 15:12:46 -04:00
Eduardo Habkost	426b961644	machine: DEFINE_MACHINE() macro The macro will allow easy registration of a TYPE_MACHINE subclass, using only the machine name and a MachineClass initialization function as parameter. Backports commit ed0b6de343448d1014b53bcf541041373322fa1c from qemu	2018-03-11 14:42:12 -04:00
Eduardo Habkost	46e1c5482b	machine: Set MachineClass::name automatically Now all TYPE_MACHINE subclasses use MACHINE_TYPE_NAME to generate the class name. So instead of requiring each subclass to set MachineClass::name manually, we can now set it automatically at the TYPE_MACHINE class_base_init() function. Backports commit 98cec76a7076c4a38e16f1a9de170a7942b3be54 from qemu	2018-03-11 14:38:58 -04:00
Eduardo Habkost	0261df973b	machine: Ensure all TYPE_MACHINE subclasses have the right suffix Now that all non-abstract TYPE_MACHINE subclasses have the -machine suffix, add an assert to ensure this will be always true. Backports commit dcb3d601115eed77aef543fe3a920adc17544e06 from qemu	2018-03-11 14:30:38 -04:00
Eduardo Habkost	df4cfe6804	machine: MACHINE_TYPE_NAME macro The macro will be useful to ensure the machine class names follow the right format to make machine class lookup by class name work correctly. Backports commit c84a8f01b2a5d8bf98c447796d4a747333a5b1fd from qemu	2018-03-11 13:44:26 -04:00
Eduardo Habkost	940d2371ea	machine: Remove unused fields from QEMUMachine This removes the following fields from QEMUMachine: family, alias, reset, hot_add_cpu, units_per_default_bus, no_serial, no_parallel, use_virtcon, use_sclp, no_floppy, no_cdrom, default_display, compat_props, and hw_version. The only users of those fields were already converted to use QOM and MachineClass directly, so they are not needed anymore. Backports commit d48f4fa69eb3efb03a2efe2e4606a97a17cf222f from qemu	2018-03-09 14:26:23 -05:00
Eduardo Habkost	12acb995fa	pc: Don't use QEMUMachine anymore Now that we have a DEFINE_PC_MACHINE helper macro that just requires an initialization function, it is trivial to convert them to register a QOM machine class directly, instead of using QEMUMachine. Backports commit 865906f7fdadd2732441ab158787f81f6a212bfe from qemu	2018-03-09 14:22:43 -05:00
Eduardo Habkost	b65a3ece3b	machine: Remove unused fields from QEMUMachine This removes the following fields from QEMUMachine: family, alias, reset, hot_add_cpu, units_per_default_bus, no_serial, no_parallel, use_virtcon, use_sclp, no_floppy, no_cdrom, default_display, compat_props, and hw_version. The only users of those fields were already converted to use QOM and MachineClass directly, so they are not needed anymore. Backports commit d48f4fa69eb3efb03a2efe2e4606a97a17cf222f from qemu	2018-03-09 13:41:30 -05:00
Marc-André Lureau	9ec040b74d	bus: simplify name handling Simplify a bit the code by using g_strdup_printf() and store it in a non-const value so casting is no longer needed, and ownership is clearer. Backports commit f73480c36f49562556b80bb5bf8acc45e20dcca1 from qemu	2018-03-09 13:02:15 -05:00
Thomas Huth	af3cd62c4b	Introduce DEVICE_CATEGORY_CPU for CPU devices Now that CPUs show up in the help text of "-device ?", we should group them into an appropriate category. Backports commit ba31cc7226ebcee639f18faa90c1542bd364fba3 from qemu	2018-03-09 13:00:32 -05:00
Peter Maydell	4149e877c4	configure: Drop ancient Solaris 9 and earlier support Solaris 9 was released in 2002, its successor Solaris 10 was released in 2005, and Solaris 9 was end-of-lifed in 2014. Nobody has stepped forward to express interest in supporting Solaris of any flavour, so removing support for the ancient versions seems uncontroversial. In particular, this allows us to remove a use of 'uname' in configure that won't work if you're cross-compiling. Backports commit 91939262ffcd3c85ea6a4793d3029326eea1d649 from qemu	2018-03-09 12:14:21 -05:00
Richard Henderson	7e327aaf84	util: Introduce include/qemu/cpuid.h Clang 3.9 passes the CONFIG_AVX2_OPT configure test. However, the supplied <cpuid.h> does not contain the bit_AVX2 define that we use when detecting whether the routine can be enabled. Introduce a qemu-specific header that uses the compiler's definition of __cpuid et al, but supplies any missing bit_* definitions needed. This avoids introducing any extra ifdefs to util/bufferiszero.c, and allows quite a few to be removed from tcg/i386/tcg-target.inc.c. Backports commit 5dd8990841a9e331d9d4838a116291698208cbb6 from qemu	2018-03-09 12:12:00 -05:00
Peter Maydell	6d0e83d218	Drop remaining bits of ia64 host support We dropped support for ia64 host CPUs in the 2.11 release (removing the TCG backend for it, and advertising the support as being completely removed in the changelog). However there are a few bits and pieces of code still floating about. Remove those, too. We can drop the check in configure for "ia64 or hppa host?" entirely, because we don't support hppa hosts either any more. Backports commit b1cef6d02f84bd842fb94a6109ad4e2ad873e8e5 from qemu	2018-03-09 11:54:57 -05:00
Markus Armbruster	3277400723	qapi: Move qapi-schema.json to qapi/, rename generated files Move qapi-schema.json to qapi/, so it's next to its modules, and all files get generated to qapi/, not just the ones generated for modules. Consistently name the generated files qapi-MODULE.EXT: qmp-commands.[ch] become qapi-commands.[ch], qapi-event.[ch] become qapi-events.[ch], and qmp-introspect.[ch] become qapi-introspect.[ch]. This gets rid of the temporary hacks in scripts/qapi/commands.py, scripts/qapi/events.py, and scripts/qapi/common.py. Backports commit eb815e248f50cde9ab86eddd57eca5019b71ca78 from qemu	2018-03-09 11:35:11 -05:00
Markus Armbruster	5500a5e912	Include less of the generated modular QAPI headers In my "build everything" tree, a change to the types in qapi-schema.json triggers a recompile of about 4800 out of 5100 objects. The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h, qapi-types.h. Each of these headers still includes all its shards. Reduce compile time by including just the shards we actually need. To illustrate the benefits: adding a type to qapi/migration.json now recompiles some 2300 instead of 4800 objects. The next commit will improve it further. Backports commit 9af2398977a78d37bf184d6ff6bd04c72bfbf006 from qemu	2018-03-09 10:06:19 -05:00
Laurent Vivier	5fa3a97549	softfloat: use floatx80_infinity in softfloat Since f3218a8 ("softfloat: add floatx80 constants") floatx80_infinity is defined but never used. This patch updates floatx80 functions to use this definition. This allows to define a different default Infinity value on m68k: the m68k FPU defines infinity with all bits set to zero in the mantissa. Backports commit 0f605c889ca3fe9744166ad4149d0dff6dacb696 from qemu	2018-03-09 01:34:45 -05:00
Laurent Vivier	b42fcb5496	softfloat: export some functions Move fpu/softfloat-macros.h to include/fpu/ Export floatx80 functions to be used by target floatx80 specific implementations. Exports: propagateFloatx80NaN(), extractFloatx80Frac(), extractFloatx80Exp(), extractFloatx80Sign(), normalizeFloatx80Subnormal(), packFloatx80(), roundAndPackFloatx80(), normalizeRoundAndPackFloatx80() Also exports packFloat32() that will be used to implement m68k fsinh, fcos, fsin, ftan operations. Backports commit 88857aca93f6ec8f372fb9c8201394b0e5582034 from qemu	2018-03-09 01:22:00 -05:00
Alex Bennée	4b2577537b	arm/translate-a64: add FP16 FR[ECP/SQRT]S to simd_three_reg_same_fp16 As some of the constants here will also be needed elsewhere (specifically for the upcoming SVE support) we move them out to softfloat.h. Backports commit 026e2d6ef74000afb9049f46add4b94f594c8fb3 from qemu	2018-03-08 15:47:34 -05:00
Alex Bennée	a02b9b81a9	arm/translate-a64: add FP16 FMULA/X/S to simd_three_reg_same_fp16 Backports commit 2deb992b767d28035fac3b374c7730494ff0b43d from qemu Also backports the fp16 changes introduced in commit f566c0474a9b9bbd9ed248607e4007e24d3358c0	2018-03-08 15:42:48 -05:00
Alex Bennée	e56ed38819	include/exec/helper-head.h: support f16 in helper calls This allows us to explicitly pass float16 to helpers rather than assuming uint32_t and dealing with the result. Of course they will be passed in i32 sized registers by default. Backports commit 35737497008aeabce5dc381a41d3827bec486192 from qemu	2018-03-08 12:28:05 -05:00
Alex Bennée	283abedc68	fpu/softfloat: re-factor sqrt This is a little bit of a departure from softfloat's original approach as we skip the estimate step in favour of a straight iteration. There is a minor optimisation to avoid calculating more bits of precision than we need however this still brings a performance drop, especially for float64 operations. Backports commit c13bb2da9eedfbc5886c8048df1bc1114b285fb0 from qemu	2018-03-08 12:23:54 -05:00
Alex Bennée	e2fb4b40c3	fpu/softfloat: re-factor compare The compare function was already expanded from a macro. I keep the macro expansion but move most of the logic into a compare_decomposed. Backports commit 0c4c90929143a530730e2879204a55a30bf63758 from qemu	2018-03-08 12:21:20 -05:00
Alex Bennée	c38b64f8a9	fpu/softfloat: re-factor minmax Let's do the same re-factor treatment for minmax functions. I still use the MACRO trick to expand but now all the checking code is common. Backports commit 89360067071b1844bf745682e18db7dde74cdb8d from qemu	2018-03-08 12:18:35 -05:00
Alex Bennée	9b296329f6	fpu/softfloat: re-factor scalbn This is one of the simpler manipulations you could make to a floating point number. Backports commit 0bfc9f195209593e91a98cf2233753f56a2e5c02 from qemu	2018-03-08 12:16:19 -05:00
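
The comparison function described in the f7c984d21f entry (binary search tree for TBs) can be illustrated with a short sketch. This is not the code from that commit: the struct layout, the use of plain integer addresses, and the names `addr_cmp` and `tb_tc_cmp` are assumptions made for illustration. It only shows the idea stated in the message, that two real TBs (both sizes non-zero) are ordered by start address, while a lookup key (size 0) matches the TB whose code range contains it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the translated-code descriptor. Only the two
 * pieces of state the commit message mentions are modelled. */
struct tb_tc {
    uintptr_t ptr;  /* start address of the translated code */
    size_t size;    /* size of the translated code; 0 marks a lookup key */
};

/* Three-way comparison of two addresses: returns -1, 0 or 1. */
static int addr_cmp(uintptr_t a, uintptr_t b)
{
    return (a > b) - (a < b);
}

/* Comparator usable with a balanced tree. The first branch is the common
 * case from the message: both entries are existing TBs (insert/remove). */
int tb_tc_cmp(const struct tb_tc *a, const struct tb_tc *b)
{
    assert(a != NULL && b != NULL);

    /* Both real TBs, or both bare keys: order by start address. */
    if ((a->size != 0) == (b->size != 0)) {
        return addr_cmp(a->ptr, b->ptr);
    }

    /* Exactly one side is a lookup key. Do not assume which one. */
    const struct tb_tc *key = (a->size == 0) ? a : b;
    const struct tb_tc *tb = (a->size == 0) ? b : a;
    int key_is_first = (a->size == 0) ? 1 : -1;

    if (key->ptr < tb->ptr) {
        return -key_is_first;   /* key lies below the TB's range */
    }
    if (key->ptr - tb->ptr >= tb->size) {
        return key_is_first;    /* key lies at or past the end of the range */
    }
    return 0;                   /* key falls inside [ptr, ptr + size) */
}
```

A lookup would pass a key with `size == 0` and the host program counter as `ptr`; the sketch handles the key on either side, in line with the message's remark that glib does not document the argument order.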


599 commits
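
The machine naming entries (df4cfe6804, 0261df973b, 46e1c5482b) describe a class-name convention: every TYPE_MACHINE subclass name ends in a `-machine` suffix, and the short machine name is derived from it. A minimal sketch of that convention follows. The macro names match those in the commit messages, but the macro bodies and the helpers `has_machine_suffix` and `machine_name_len` are assumptions for illustration, not the backported code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed definitions: the class name is the machine name with a fixed
 * suffix appended via string-literal concatenation. */
#define TYPE_MACHINE_SUFFIX "-machine"
#define MACHINE_TYPE_NAME(machinename) (machinename TYPE_MACHINE_SUFFIX)

/* Hypothetical helper: true if type_name ends with the machine suffix. */
bool has_machine_suffix(const char *type_name)
{
    size_t n = strlen(type_name);
    size_t s = strlen(TYPE_MACHINE_SUFFIX);

    return n >= s && strcmp(type_name + n - s, TYPE_MACHINE_SUFFIX) == 0;
}

/* Hypothetical helper: length of the short machine name, obtained by
 * stripping the suffix. The assert mirrors the check that all
 * non-abstract machine classes carry the right suffix. */
size_t machine_name_len(const char *type_name)
{
    assert(has_machine_suffix(type_name));
    return strlen(type_name) - strlen(TYPE_MACHINE_SUFFIX);
}
```

With these definitions, `MACHINE_TYPE_NAME("pc")` expands to the literal `"pc-machine"`, and stripping the suffix recovers the two-character name `pc`.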