unicorn

mirror of https://github.com/yuzu-emu/unicorn.git synced 2025-11-05 06:34:57 +00:00

Author	SHA1	Message	Date
Paolo Bonzini	a47c68164d	compiler: never omit assertions if using a static analysis tool Assertions help both Coverity and the clang static analyzer avoid false positives, but on the other hand both are confused when the condition is compiled as (void)(x != FOO). Always expand assertion macros when using Coverity or clang, through a new QEMU_STATIC_ANALYSIS preprocessor symbol. This fixes a couple false positives in TCG. Backports commit 8bff06a0bbf257a2083223534c1607bf87d913e6 from qemu	2018-02-25 19:19:28 -05:00
Markus Armbruster	c2ffbc575d	Clean up decorations and whitespace around header guards Cleaned up with scripts/clean-header-guards.pl. Backports commit 175de52487ce0b0c78daa4cdf41a5a465a168a25 from qemu	2018-02-25 04:26:02 -05:00
Markus Armbruster	1275b9b459	Clean up ill-advised or unusual header guards Cleaned up with scripts/clean-header-guards.pl. Backports commit 2a6a4076e117113ebec97b1821071afccfdfbc96 from qemu	2018-02-25 04:22:46 -05:00
Markus Armbruster	9ae2fc4d9e	Clean up header guards that don't match their file name Header guard symbols should match their file name to make guard collisions less likely. Offenders found with scripts/clean-header-guards.pl -vn. Cleaned up with scripts/clean-header-guards.pl, followed by some renaming of new guard symbols picked by the script to better ones. Backports commit 121d07125bb6d7079c7ebafdd3efe8c3a01cc440 from qemu	2018-02-25 04:18:42 -05:00
Markus Armbruster	60e8836b74	Use #include "..." for our own headers, <...> for others Tracked down with an ugly, brittle and probably buggy Perl script. Also move includes converted to <...> up so they get included before ours where that's obviously okay. Backports commit a9c94277f07d19d3eb14f199c3e93491aa3eae0e from qemu	2018-02-25 04:10:33 -05:00
Peter Maydell	f6f843b4d4	bswap.h: Document cpu_to_* and *_to_cpu conversion functions Add a documentation comment describing the functions for converting between the cpu and little or bigendian formats. Backports commit 7d820b766a2049f33ca7e078aa51018f2335f8c5 from qemu	2018-02-25 04:06:28 -05:00
Peter Maydell	1d7f813942	bswap.h: Remove unused cpu_to_*w() and *_to_cpup() Now that all uses of cpu_to_*w() and *_to_cpup() have been replaced with either ld_p()/st_p() or by doing direct dereferences and using the cpu_to_*()/*_to_cpu() byteswap functions, we can remove the unused implementations. Backports commit f76bde702916d0230bf359d478bcac8d7f3b30ae from qemu	2018-02-25 04:04:46 -05:00
Sergey Sorokin	d1e4ac0451	Fix confusing argument names in some common functions The functions tlb_fill(), cpu_unaligned_access() and do_unaligned_access() are called with access type and mmu index arguments, but these arguments are named 'is_write' and 'is_user' in their declarations. This patch renames the arguments to avoid confusion. Backports commit b35399bb4e9968296a12303b00f9f2066470e987 from qemu	2018-02-25 03:58:27 -05:00
Sergey Sorokin	e4d123caa9	tcg: Improve the alignment check infrastructure Some architectures (e.g. ARMv8) need the address to be aligned to a size larger than the size of the memory access. The current costless alignment check implementation in QEMU is sufficient to support such a check, but it needs to allow an alignment size to be specified. Backports commit 1f00b27f17518a1bcb4cedca49eaec96a4d560bd from qemu	2018-02-25 02:23:28 -05:00
Lioncash	532f840dc3	qapi: Add new clone visitor We have a couple places in the code base that want to deep-clone one QAPI object into another, and they were resorting to serializing the struct out to QObject then reparsing it. A much more efficient version can be done by adding a new clone visitor. Since cloning is still relatively uncommon, expose the use of the new visitor via a QAPI_CLONE() macro that takes care of type-punning the underlying function pointer, rather than generating lots of unused functions for types that won't be cloned. And yes, we're relying on the compiler treating all pointers equally, even though a strict C program cannot portably do so - but we're not the first one in the qemu code base to expect it to work (hello, glib!). The choice of adding a fourth visitor type deserves some explanation. On the surface, the clone visitor is mostly an input visitor (it takes arbitrary input - in this case, another QAPI object - and creates a new QAPI object during the course of the visit). But ever since commit da72ab0 consolidated enum visits based on the visitor type, using VISITOR_INPUT would cause us to run visit_type_str(), even though for cloning there is nothing to do (we just copy the enum value across, without regards to its mapping to strings). Also, since our input happens to be a QAPI object, we can also satisfy the internal checks for VISITOR_OUTPUT. So in the end, I settled with a new VISITOR_CLONE, and chose its value such that many internal checks can use 'v->type & mask', sticking to 'v->type == value' where the difference matters. Note that we can only clone objects (including alternates) and lists, not built-ins or enums. The visitor core hides integer width from the actual visitor (since commit 04e070d), and as long as that's the case, we can't clone top-level integers. Then again, those can always be cloned by direct copy, since they are not objects with deep pointers, so it's no real loss. 
And restricting cloning to just objects and lists is cleaner than restricting it to non-integers. As such, I documented that the clone visitor is for direct use only by code internal to QAPI, and should not be used on incomplete objects (other than a hack to work around the fact that we allow NULL in place of "" in visit_type_str() in other output visitors). Note that as written, the clone visitor will never fail on a complete object. Scalars (including enums) not at the root of the clone copy just fine with no additional effort while visiting the scalar, by virtue of a g_memdup() each time we push another struct onto the stack. Cloning a string requires deduplication of a pointer, which means it can also provide the guarantee of an input visitor of never producing NULL even when still accepting NULL in place of "" the way the QMP output visitor does. Cloning an 'any' type could be possible by incrementing the QObject refcnt, but it's not obvious whether that is better than implementing a QObject deep clone. So for now, we document it as unsupported, and intentionally omit the .type_any() callback to let a developer know their usage needs implementation. Add testsuite coverage for several different clone situations, to ensure that the code is working. I also tested that valgrind was happy with the test. Backports commit a15fcc3cf69ee3d408f60d6cc316488d2b0249b4 from qemu	2018-02-25 01:34:12 -05:00
Eric Blake	85af4b2030	qapi: Add new visit_complete() function Making each output visitor provide its own output collection function was the only remaining reason for exposing visitor sub-types to the rest of the code base. Add a polymorphic visit_complete() function which is a no-op for input visitors, and which populates an opaque pointer for output visitors. For maximum type-safety, also add a parameter to the output visitor constructors with a type-correct version of the output pointer, and assert that the two uses match. This approach was considered superior to either passing the output parameter only during construction (action at a distance during visit_free() feels awkward) or only during visit_complete() (defeating type safety makes it easier to use incorrectly). Most callers were function-local, and therefore a mechanical conversion; the testsuite was a bit trickier, but the previous cleanup patch minimized the churn here. The visit_complete() function may be called at most once; doing so lets us use transfer semantics rather than duplication or ref-count semantics to get the just-built output back to the caller, even though it means our behavior is not idempotent. 
Generated code is simplified as follows for events: \|@@ -26,7 +26,7 @@ void qapi_event_send_acpi_device_ost(ACP \| QDict qmp; \| Error err = NULL; \| QMPEventFuncEmit emit; \|- QmpOutputVisitor qov; \|+ QObject obj; \| Visitor v; \| q_obj_ACPI_DEVICE_OST_arg param = { \| info \|@@ -39,8 +39,7 @@ void qapi_event_send_acpi_device_ost(ACP \| \| qmp = qmp_event_build_dict("ACPI_DEVICE_OST"); \| \|- qov = qmp_output_visitor_new(); \|- v = qmp_output_get_visitor(qov); \|+ v = qmp_output_visitor_new(&obj); \| \| visit_start_struct(v, "ACPI_DEVICE_OST", NULL, 0, &err); \| if (err) { \|@@ -55,7 +54,8 @@ void qapi_event_send_acpi_device_ost(ACP \| goto out; \| } \| \|- qdict_put_obj(qmp, "data", qmp_output_get_qobject(qov)); \|+ visit_complete(v, &obj); \|+ qdict_put_obj(qmp, "data", obj); \| emit(QAPI_EVENT_ACPI_DEVICE_OST, qmp, &err); and for commands: \| { \| Error err = NULL; \|- QmpOutputVisitor qov = qmp_output_visitor_new(); \| Visitor v; \| \|- v = qmp_output_get_visitor(qov); \|+ v = qmp_output_visitor_new(ret_out); \| visit_type_AddfdInfo(v, "unused", &ret_in, &err); \|- if (err) { \|- goto out; \|+ if (!err) { \|+ visit_complete(v, ret_out); \| } \|- *ret_out = qmp_output_get_qobject(qov); \|- \|-out: \| error_propagate(errp, err); Backports commit 3b098d56979d2f7fd707c5be85555d114353a28d from qemu	2018-02-25 01:20:03 -05:00
Eric Blake	ec53301cda	qmp-output-visitor: Favor new visit_free() function Now that we have a polymorphic visit_free(), we no longer need qmp_output_visitor_cleanup(); however, we still need to expose the subtype for qmp_output_get_qobject(). Backports commit 1830f22a6777cedaccd67a08f675d30f7a85ebfd from qemu	2018-02-25 01:12:27 -05:00
Eric Blake	f008d93ac0	qmp-input-visitor: Favor new visit_free() function Now that we have a polymorphic visit_free(), we no longer need qmp_input_visitor_cleanup(); which in turn means we no longer need to return a subtype from qmp_input_visitor_new() nor a public upcast function. Generated code changes to qmp-marshal.c look like: \|@@ -52,11 +52,10 @@ void qmp_marshal_add_fd(QDict *args, QOb \| { \| Error *err = NULL; \| AddfdInfo *retval; \|- QmpInputVisitor *qiv = qmp_input_visitor_new(QOBJECT(args), true); \| Visitor *v; \| q_obj_add_fd_arg arg = {0}; \| \|- v = qmp_input_get_visitor(qiv); \|+ v = qmp_input_visitor_new(QOBJECT(args), true); \| visit_start_struct(v, NULL, NULL, 0, &err); \| if (err) { \| goto out; Backports commit b70ce1018a251c0c33498d9c927a07cade655a5e from qemu	2018-02-25 01:10:53 -05:00
Eric Blake	e88a7e260b	string-input-visitor: Favor new visit_free() function Now that we have a polymorphic visit_free(), we no longer need string_input_visitor_cleanup(); which in turn means we no longer need to return a subtype from string_input_visitor_new() nor a public upcast function. Backports commit 7a0525c7be6b38d32d586e3fd12e7377ded21faa from qemu	2018-02-25 01:08:04 -05:00
Eric Blake	7f741a6c9b	qapi: Add new visit_free() function Making each visitor provide its own (awkwardly-named) FOO_cleanup() is unusual, when we can instead have a polymorphic visit_free() interface. Over the next few patches, we can use the polymorphic functions to eliminate the need for a FOO_get_visitor() function for accessing specific visitor functionality, once everything can be accessed directly through the Visitor* interfaces. The dealloc visitor is the first one converted to completely use the new entry point, since qapi_dealloc_visitor_cleanup() was the only reason that qapi_dealloc_get_visitor() existed, and only generated and testsuite code was even using it. With the new visit_free() entry point in place, we no longer need to expose the QapiDeallocVisitor subtype through qapi_dealloc_visitor_new(), and can get by with less generated code, with diffs that look like: \| void qapi_free_ACPIOSTInfo(ACPIOSTInfo *obj) \| { \|- QapiDeallocVisitor *qdv; \| Visitor *v; \| \| if (!obj) { \| return; \| } \| \|- qdv = qapi_dealloc_visitor_new(); \|- v = qapi_dealloc_get_visitor(qdv); \|+ v = qapi_dealloc_visitor_new(); \| visit_type_ACPIOSTInfo(v, NULL, &obj, NULL); \|- qapi_dealloc_visitor_cleanup(qdv); \|+ visit_free(v); \|} Backports commit 2c0ef9f411ae6081efa9eca5b3eab2dbeee45a6c from qemu	2018-02-25 01:05:41 -05:00
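The polymorphic-destructor idea behind visit_free() can be illustrated with a minimal sketch. Everything here is hypothetical and simplified for illustration (the struct layout, DeallocVisitorSketch, dealloc_sketch_new() and visit_free_sketch() are not QEMU's actual definitions): the constructor hands back only the base pointer, and a single entry point dispatches to whatever destructor the subtype installed.

```c
#include <stdlib.h>

/* Hypothetical, simplified base type; not QEMU's real Visitor struct. */
typedef struct Visitor Visitor;
struct Visitor {
    void (*free)(Visitor *v);   /* each subtype installs its own destructor */
};

/* Hypothetical subtype that embeds the base as its first member. */
typedef struct DeallocVisitorSketch {
    Visitor visitor;
    int *freed_flag;            /* illustration only: set to 1 on destruction */
} DeallocVisitorSketch;

static void dealloc_sketch_free(Visitor *v)
{
    /* Valid because the base is the first member of the subtype. */
    DeallocVisitorSketch *d = (DeallocVisitorSketch *)v;

    if (d->freed_flag) {
        *d->freed_flag = 1;
    }
    free(d);
}

/* The constructor returns the base pointer, so callers never see the subtype. */
static Visitor *dealloc_sketch_new(int *freed_flag)
{
    DeallocVisitorSketch *d = calloc(1, sizeof(*d));

    if (!d) {
        return NULL;
    }
    d->visitor.free = dealloc_sketch_free;
    d->freed_flag = freed_flag;
    return &d->visitor;
}

/* Single polymorphic entry point; passing NULL is a no-op. */
static void visit_free_sketch(Visitor *v)
{
    if (v) {
        v->free(v);
    }
}
```

With this shape, callers need neither a per-subtype cleanup function nor an upcast helper, which is the simplification the diff above shows in the generated code.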
Eric Blake	37ae4dfdfd	qapi: Add parameter to visit_end_* Rather than making the dealloc visitor keep track of a stack of pointers remembered during visit_start_* in order to free them during visit_end_*, it's a lot easier to just make all callers pass the same pointer to visit_end_*. The generated code has access to the same pointer, while all other users are doing virtual walks and can pass NULL. The dealloc visitor is then greatly simplified. All three visit_end_*() functions intentionally take a void**, even though the visit_start_*() functions differ between void**, GenericList**, and GenericAlternate**. This is done for several reasons: when doing a virtual walk, passing NULL doesn't care what the type is, but when doing a generated walk, we already have to cast the caller's specific FOO* to call visit_start, while using void** lets us use visit_end without a cast. Also, an upcoming patch will add a clone visitor that wants to use the same implementation for all three visit_end callbacks, which is made easier if all three share the same signature. For visitors which already track per-object state (the QMP visitors via a stack, and the string visitors which do not allow nesting), add an assertion that the caller is indeed passing the same pointer to paired calls. Backports commit 1158bb2a058fcdd0c8fc3e60dc77f7a57ddbb271 from qemu	2018-02-25 00:57:54 -05:00
Changlong Xie	2ca07642f1	qom: Fix comment typo It's qom_unref, not qdef_unref. Backports commit ada03a0e8423ef8950e30d216f56a9661a4070e2 from qemu	2018-02-25 00:46:15 -05:00
Markus Armbruster	eeef227560	range: Replace internal representation of Range Range represents a range as follows. Member @start is the inclusive lower bound, member @end is the exclusive upper bound. Zero @end is special: if @start is also zero, the range is empty, else @end is to be interpreted as 2^64. No other empty ranges may occur. The range [0,2^64-1] cannot be represented. If you try to create it with range_set_bounds1(), you get the empty range instead. If you try to create it with range_set_bounds() or range_extend(), assertions fail. Before range_set_bounds() existed, the open-coded creation usually got you the empty range instead. Open deathtrap. Moreover, the code dealing with the janus-faced @end is too clever by half. Dumb this down to a more pedestrian representation: members @lob and @upb are inclusive lower and upper bounds. The empty range is encoded as @lob = 1, @upb = 0. Backports commit 6dd726a2bf1b800289d90a84d5fcb5ce7b78a8e1 from qemu	2018-02-25 00:44:36 -05:00
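The inclusive-bounds representation described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: the member names lob and upb and the empty encoding (lob = 1, upb = 0) come from the message, while RangeSketch and the range_sketch_*() helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the Range struct described in the message. */
typedef struct RangeSketch {
    uint64_t lob;   /* inclusive lower bound */
    uint64_t upb;   /* inclusive upper bound */
} RangeSketch;

/* A range is empty exactly when its bounds are inverted. */
static inline bool range_sketch_is_empty(const RangeSketch *r)
{
    return r->lob > r->upb;
}

/* Canonical empty encoding from the message: lob = 1, upb = 0. */
static inline void range_sketch_make_empty(RangeSketch *r)
{
    r->lob = 1;
    r->upb = 0;
}

/* Contract: lob <= upb, so this never produces an empty range. */
static inline void range_sketch_set_bounds(RangeSketch *r,
                                           uint64_t lob, uint64_t upb)
{
    assert(lob <= upb);
    r->lob = lob;
    r->upb = upb;
}

/* Always false for the empty range, since no value is >= 1 and <= 0. */
static inline bool range_sketch_contains(const RangeSketch *r, uint64_t val)
{
    return val >= r->lob && val <= r->upb;
}
```

Because both bounds are inclusive, the full range [0, 2^64-1] is simply lob = 0, upb = UINT64_MAX, which the old start/end encoding could not express.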
Markus Armbruster	8b2a0c4ece	range: Eliminate direct Range member access Users of struct Range mess liberally with its members, which makes refactoring hard. Create a set of methods, and convert all users to call them instead of accessing members. The methods have carefully worded contracts, and use assertions to check them. Backports commit a0efbf16604770b9d805bcf210ec29942321134f from qemu	2018-02-25 00:39:43 -05:00
Alistair Francis	fbb0645fb3	bitops: Add MAKE_64BIT_MASK macro Add a macro that creates a 64bit value which has length number of ones shifted across by the value of shift. Backports commit ae2923b5c20a21c6457680330506a9c13873485c from qemu	2018-02-25 00:30:39 -05:00
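A macro of the kind described above might look like the following sketch. The name MAKE_64BIT_MASK_SKETCH is hypothetical, chosen so as not to claim this is the exact definition from the commit; it assumes 1 <= length <= 64 and shift + length <= 64, since shifting a 64-bit value by 64 or more is undefined in C.

```c
#include <stdint.h>

/*
 * Build a 64-bit value with `length` one bits, starting at bit `shift`.
 * Assumes 1 <= length <= 64 and shift + length <= 64 (a length of 0
 * would shift by 64, which is undefined behaviour).
 */
#define MAKE_64BIT_MASK_SKETCH(shift, length) \
    (((~0ULL) >> (64 - (length))) << (shift))
```

For example, a shift of 8 and a length of 4 gives four ones starting at bit 8, i.e. 0xF00.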
Peter Maydell	efc6cc2b83	memory: Assert that memory_region_init_rom_device() ops aren't NULL It doesn't make sense to pass a NULL ops argument to memory_region_init_rom_device(), because the effect will be that if the guest tries to write to the memory region then QEMU will segfault. Catch the bug earlier by sanity checking the arguments to this function, and remove the misleading documentation that suggests that passing NULL might be sensible. Backports commit 39e0b03dec518254fabd2acff29548d3f1d2b754 from qemu	2018-02-25 00:29:52 -05:00
Peter Maydell	334e951ec1	memory: Provide memory_region_init_rom() Provide a new helper function memory_region_init_rom() for memory regions which are read-only (and unlike those created by memory_region_init_rom_device() don't have special behaviour for writes). This has the same behaviour as calling memory_region_init_ram() and then memory_region_set_readonly() (which is what we do today in boards with pure ROMs) but is a more easily discoverable API for the purpose. Backports commit a1777f7f6462c66e1ee6e98f0d5c431bfe988aa5 from qemu	2018-02-25 00:28:17 -05:00
Alexey Kardashevskiy	7187d77cfa	memory: Add MemoryRegionIOMMUOps.notify_started/stopped callbacks The IOMMU driver may change behavior depending on whether a notifier client is present. In the case of POWER, this represents a change in the visibility of the IOTLB, for other drivers such as intel-iommu and future AMD-Vi emulation, notifier support is not yet enabled and this provides the opportunity to flag that incompatibility. Backports commit d22d8956b185c002b50a4d0883aff61f857347ef from qemu	2018-02-25 00:23:00 -05:00
Eric Blake	c14d8226ab	qapi: Fix memleak in string visitors on int lists Commit 7f8f9ef1 introduced the ability to store a list of integers as a sorted list of ranges, but when merging ranges, it leaks one or more ranges. It was also using range_get_last() incorrectly within range_compare() (a range is a start/end pair, but range_get_last() is for start/len pairs), and will also mishandle a range ending in UINT64_MAX (remember, we document that no range covers 2^64 bytes, but that ranges that end on UINT64_MAX have end < begin). The whole merge algorithm was rather complex, and included unnecessary passes over data within glib functions, and enough indirection to make it hard to easily plug the data leaks. Since we are already hard-coding things to a list of ranges, just rewrite the thing to open-code the traversal and comparisons, by making the range_compare() helper function give us an answer that is easier to use, at which point we avoid the need to pass any callbacks to g_list_*(). Then by reusing range_extend() instead of duplicating effort with range_merge(), we cover the corner cases correctly. Drop the now-unused range_merge() and ranges_can_merge(). Doing this lets test-string-{input,output}-visitor pass under valgrind without leaks. Backports commit db486cc334aafd3dbdaf107388e37fc3d6d3e171 from qemu	2018-02-25 00:20:34 -05:00
Eric Blake	ef357d06bc	qapi: Simplify use of range.h Calling our function g_list_insert_sorted_merged is a misnomer, since we are NOT writing a glib function. Furthermore, we are making every caller pass the same comparator function of range_merge(): any caller that would try otherwise would break in weird ways since our internal call to ranges_can_merge() is hard-coded to operate only on ranges, rather than paying attention to the caller's comparator. Better is to fix things so that callers don't have to care about our internal comparator, by picking a function name and updating the parameter type away from a gratuitous use of void*, to make it obvious that we are operating specifically on a list of ranges and not a generic list. Plus, refactoring the code here will make it easier to plug a memory leak in the next patch. range_compare() is now internal only, and moves to the .c file. Backports commit 7c47959d0cb05db43014141a156ada0b6d53a750 from qemu	2018-02-25 00:02:42 -05:00
Eric Blake	5e22c7e180	range: Create range.c for code that should not be inline g_list_insert_sorted_merged() is rather large to be an inline function; move it to its own file. range_merge() and ranges_can_merge() can likewise move, as they are only used internally. Also, it becomes obvious that the condition within range_merge() is already satisfied by its caller, and that the return value is not used. The diffstat is misleading, because of the copyright boilerplate. Backports commit fec0fc0a13ac7f1a1130433a6740cd850c3db34a from qemu	2018-02-24 23:59:13 -05:00
Aleksandar Markovic	6eb4fa54f6	softfloat: Implement run-time-configurable meaning of signaling NaN bit This patch modifies SoftFloat library so that it can be configured in run-time in relation to the meaning of signaling NaN bit, while, at the same time, strictly preserving its behavior on all existing platforms. Background: In floating-point calculations, there is a need for denoting undefined or unrepresentable values. This is achieved by defining certain floating-point numerical values to be NaNs (which stands for "not a number"). For additional reasons, virtually all modern floating-point unit implementations use two kinds of NaNs: quiet and signaling. The binary representations of these two kinds of NaNs, as a rule, differ only in one bit (that bit is, traditionally, the first bit of mantissa). Up to 2008, standards for floating-point did not specify all details about binary representation of NaNs. More specifically, the meaning of the bit that is used for distinguishing between signaling and quiet NaNs was not strictly prescribed. (IEEE 754-2008 was the first floating-point standard that defined that meaning clearly, see [1], p. 35) As a result, different platforms took different approaches, and that presented considerable challenge for multi-platform emulators like QEMU. Mips platform represents the most complex case among QEMU-supported platforms regarding signaling NaN bit. Up to the Release 6 of Mips architecture, "1" in signaling NaN bit denoted signaling NaN, which is opposite to IEEE 754-2008 standard. From Release 6 on, Mips architecture adopted IEEE standard prescription, and "0" denotes signaling NaN. On top of that, Mips architecture for SIMD (also known as MSA, or vector instructions) also specifies signaling bit in accordance to IEEE standard. MSA unit can be implemented with both pre-Release 6 and Release 6 main processor units. QEMU uses SoftFloat library to implement various floating-point-related instructions on all platforms. 
The current QEMU implementation allows for defining meaning of signaling NaN bit during build time, and is implemented via preprocessor macro called SNAN_BIT_IS_ONE. On the other hand, the change in this patch enables SoftFloat library to be configured in run-time. This configuration is meant to occur during CPU initialization, at the moment when it is definitely known what desired behavior for particular CPU (or any additional FPUs) is. The change is implemented so that it is consistent with existing implementation of similar cases. This means that structure float_status is used for passing the information about desired signaling NaN bit on each invocation of SoftFloat functions. The additional field in float_status is called snan_bit_is_one, which supersedes macro SNAN_BIT_IS_ONE. IMPORTANT: This change is not meant to create any change in emulator behavior or functionality on any platform. It just provides the means for SoftFloat library to be used in a more flexible way - in other words, it will just prepare SoftFloat library for usage related to Mips platform and its specifics regarding signaling bit meaning, which is done in some of subsequent patches from this series. Further break down of changes: 1) Added field snan_bit_is_one to the structure float_status, and correspondent setter function set_snan_bit_is_one(). 2) Constants <float16\|float32\|float64\|floatx80\|float128>_default_nan (used both internally and externally) converted to functions <float16\|float32\|float64\|floatx80\|float128>_default_nan(float_status). This is necessary since they are dependent on signaling bit meaning. At the same time, for the sake of code cleanup and simplicity, constants <floatx80\|float128>_default_nan_<low\|high> (used only internally within SoftFloat library) are removed, as not needed. 3) Added a float_status argument to SoftFloat library functions XXX_is_quiet_nan(XXX a_), XXX_is_signaling_nan(XXX a_), XXX_maybe_silence_nan(XXX a_). 
This argument must be present in order to enable correct invocation of new version of functions XXX_default_nan(). (XXX is <float16\|float32\|float64\|floatx80\|float128> here) 4) Updated code for all platforms to reflect changes in SoftFloat library. This change is twofold: it includes modifications of SoftFloat library function invocations, and the addition of an invocation of function set_snan_bit_is_one() during CPU initialization, with arguments that are appropriate for each particular platform. It was established that all platforms zero their main CPU data structures, so snan_bit_is_one(0) in appropriate places is not added, as it is not needed. [1] "IEEE Standard for Floating-Point Arithmetic", IEEE Computer Society, August 29, 2008. Backports commit af39bc8c49224771ec0d38f1b693ea78e221d7bc from qemu	2018-02-24 20:27:12 -05:00
Alexey Kardashevskiy	096ca207af	memory: Add reporting of supported page sizes Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate uses when translating, however this information is not available outside the translate context for various checks. This adds a get_min_page_size callback to MemoryRegionIOMMUOps and a wrapper for it so IOMMU users (such as VFIO) can know the minimum actual page size supported by an IOMMU. As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE as fallback. This removes vfio_container_granularity() and uses new helper in memory_region_iommu_replay() when replaying IOMMU mappings on added IOMMU memory region. Backports the relevant parts of commit f682e9c244af7166225f4a50cc18ff296bb9d43e from qemu	2018-02-24 19:23:28 -05:00
Peter Maydell	f893dacef0	bitops.h: Implement half-shuffle and half-unshuffle ops A half-shuffle operation takes a word with zeros in the high half: 0000 0000 0000 0000 ABCD EFGH IJKL MNOP and spreads the bits out so they are in every other bit of the word: 0A0B 0C0D 0E0F 0G0H 0I0J 0K0L 0M0N 0O0P A half-unshuffle performs the reverse operation. Provide functions in bitops.h which implement these operations for 32-bit and 64-bit inputs, and add tests for them. Backports commit b355438de52d0782983bf4bdc47936189a0c988b from qemu	2018-02-24 19:02:36 -05:00
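The 32-bit half-shuffle and its inverse can be sketched with the usual mask-and-shift bit interleaving technique. This is a minimal illustration, not necessarily the code in bitops.h; the function names half_shuffle32_sketch() and half_unshuffle32_sketch() are hypothetical.

```c
#include <stdint.h>

/*
 * Spread the low 16 bits of x into the even bit positions of the result.
 * Bits in the high half of the input are ignored.
 */
static inline uint32_t half_shuffle32_sketch(uint32_t x)
{
    x = ((x & 0xFF00) << 8) | (x & 0x00FF);   /* split into two bytes, 8 apart */
    x = ((x << 4) | x) & 0x0F0F0F0F;          /* split into nibbles */
    x = ((x << 2) | x) & 0x33333333;          /* split into bit pairs */
    x = ((x << 1) | x) & 0x55555555;          /* split into single bits */
    return x;
}

/*
 * Reverse operation: gather the even bit positions of x into the low
 * 16 bits. Odd bit positions of the input are ignored.
 */
static inline uint32_t half_unshuffle32_sketch(uint32_t x)
{
    x &= 0x55555555;
    x = ((x >> 1) | x) & 0x33333333;
    x = ((x >> 2) | x) & 0x0F0F0F0F;
    x = ((x >> 4) | x) & 0x00FF00FF;
    x = ((x >> 8) | x) & 0x0000FFFF;
    return x;
}
```

Each step doubles the spacing between groups of bits, so the 16-bit input needs four steps; a 64-bit variant would follow the same pattern with one extra step and wider masks.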
Bharata B Rao	851dec945d	qom: API to get instance_size of a type Add an API object_type_get_size(const char *typename) that returns the instance_size of the given typename. Backports commit 3f97b53a682d2595747c926c00d78b9d406f1be0 from qemu	2018-02-24 19:00:16 -05:00
Emilio G. Cota	ae3e22a689	tb hash: hash phys_pc, pc, and flags with xxhash For some workloads such as arm bootup, tb_phys_hash is performance-critical. This is due to the high frequency of accesses to the hash table, originated by (frequent) TLB flushes that wipe out the cpu-private tb_jmp_cache's. More info: https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg05098.html To dig further into this I modified an arm image booting debian jessie to immediately shut down after boot. Analysis revealed that quite a bit of time is unnecessarily spent in tb_phys_hash: the cause is poor hashing that results in very uneven loading of chains in the hash table's buckets; the longest observed chain had ~550 elements. The appended patch addresses this with two changes: 1) Use xxhash as the hash table's hash function. xxhash is a fast, high-quality hashing function. 2) Feed the hashing function with not just tb_phys, but also pc and flags. This improves performance over using just tb_phys for hashing, since that resulted in some hash buckets having many TB's, while others got very few; with these changes, the longest observed chain on a single hash bucket is brought down from ~550 to ~40. Tests show that the other element checked for in tb_find_physical, cs_base, is always a match when tb_phys+pc+flags are a match, so hashing cs_base is wasteful. It could be that this is an ARM-only thing, though. UPDATE: On Tue, Apr 05, 2016 at 08:41:43 -0700, Richard Henderson wrote: > The cs_base field is only used by i386 (in 16-bit modes), and sparc (for a TB > consisting of only a delay slot). > It may well still turn out to be reasonable to ignore cs_base for hashing. BTW, after this change the hash table should not be called "tb_hash_phys" anymore; this is addressed later in this series. This change gives consistent bootup time improvements. 
I tested two host machines: - Intel Xeon E5-2690: 11.6% less time - Intel i7-4790K: 19.2% less time Increasing the number of hash buckets yields further improvements. However, using a larger, fixed number of buckets can degrade performance for other workloads that do not translate as many blocks (600K+ for debian-jessie arm bootup). This is dealt with later in this series. Backports commit 42bd32287f3a18d823f2258b813824a39ed7c6d9 from qemu	2018-02-24 18:00:14 -05:00
Emilio G. Cota	9ef9de9cf8	exec: add tb_hash_func5, derived from xxhash This will be used by upcoming changes for hashing the tb hash. Add this into a separate file to include the copyright notice from xxhash. Backports commit dc8b295d05ec35a8c032f9abca421772347ba5d4 from qemu	2018-02-24 17:36:35 -05:00
Emilio G. Cota	8518f55df7	compiler.h: add QEMU_ALIGNED() to enforce struct alignment Backports commit 911a4d2215b05267b16925503218f49d607c6b29 from qemu	2018-02-24 17:32:43 -05:00
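An alignment-enforcing macro of this kind is typically a thin wrapper around the compiler's aligned attribute. The sketch below assumes GCC or Clang (for __attribute__((aligned)) and C11 _Alignof); ALIGNED_SKETCH and struct padded_counter are hypothetical names used only for illustration, not taken from compiler.h.

```c
#include <stddef.h>

/* Assumption: GCC/Clang attribute syntax; other compilers need a different spelling. */
#define ALIGNED_SKETCH(x) __attribute__((aligned(x)))

/*
 * Hypothetical example type: forcing 64-byte alignment keeps each instance
 * on its own cache line on hosts with 64-byte lines. The compiler also
 * rounds the struct size up to a multiple of the alignment.
 */
struct padded_counter {
    unsigned long value;
} ALIGNED_SKETCH(64);
```

Placing the attribute after the closing brace applies it to the struct type itself, so every variable and array element of that type inherits the alignment.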
Peter Maydell	d7dccff836	cpu-exec: Rename cpu_resume_from_signal() to cpu_loop_exit_noexc() The function cpu_resume_from_signal() is now always called with a NULL puc argument, and is rather misnamed since it is never called from a signal handler. It is essentially forcing an exit to the top level cpu loop but without raising any exception, so rename it to cpu_loop_exit_noexc() and drop the useless unused argument. Backports commit 6886b98036a8f8f5bce8b10756ce080084cef11b from qemu	2018-02-24 17:25:28 -05:00
Peter Maydell	8d0faac1dc	qemu-common.h: Drop WORDS_ALIGNED define The WORDS_ALIGNED #define is not used anywhere, and hasn't been since 2013 when commit 612d590ebc6cef rewrote the various ld<type>_<endian>_p functions to not use it. Remove the #define and the comment describing it. Also remove the line in the comment about TARGET_WORDS_ALIGNED, since it has never actually existed. Backports commit 0d5c21f2b3bf1e0b562a2c74e353d2e03f2f50ef from qemu	2018-02-24 17:01:55 -05:00
Paolo Bonzini	8df5ad80b1	exec: hide mr->ram_addr from qemu_get_ram_ptr users Let users of qemu_get_ram_ptr and qemu_ram_ptr_length pass in an address that is relative to the MemoryRegion. This basically means what address_space_translate returns. Because the semantics of the second parameter change, rename the function to qemu_map_ram_ptr. Backports commit 0878d0e11ba8013dd759c6921cbf05ba6a41bd71 from qemu	2018-02-24 16:17:49 -05:00
Paolo Bonzini	b2e1b34bcc	memory: split memory_region_from_host from qemu_ram_addr_from_host Move the old qemu_ram_addr_from_host to memory_region_from_host and make it return an offset within the region. For qemu_ram_addr_from_host return the ram_addr_t directly, similar to what it was before commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host", 2013-07-04). Backports commit 07bdaa4196b51bc7ffa7c3f74e9e4a9dc8a7966a from qemu	2018-02-24 16:06:49 -05:00
Paolo Bonzini	918c626847	exec: remove ram_addr argument from qemu_ram_block_from_host Of the two callers, one does not use it, and the other can compute it itself based on the other output argument (offset) and the RAMBlock. Backports commit f615f39616c4fd1a3a3b078af8d75bb4be6390de from qemu	2018-02-24 03:37:40 -05:00
Paolo Bonzini	f26f1f123c	memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr Remove direct uses of ram_addr_t and optimize memory_region_{get,set}_fd now that a MemoryRegion knows its RAMBlock directly. Backports commit 4ff87573df3606856a92c14eef3393a63d736d11 from qemu	2018-02-24 03:34:44 -05:00
Emilio G. Cota	ab569f5cde	atomics: do not emit consume barrier for atomic_rcu_read Currently we emit a consume-load in atomic_rcu_read. Because of limitations in current compilers, this is overkill for non-Alpha hosts and it is only useful to make Thread Sanitizer work. This patch leaves the consume-load in atomic_rcu_read when compiling with Thread Sanitizer enabled, and resorts to a relaxed load + smp_read_barrier_depends otherwise. On an RMO host architecture, such as aarch64, the performance improvement of this change is easily measurable. For instance, qht-bench performs an atomic_rcu_read on every lookup. Performance before and after applying this patch: $ tests/qht-bench -d 5 -n 1 Before: 9.78 MT/s After: 10.96 MT/s Backports commit 15487aa132109891482f79d78a30d6cfd465a391 from qemu	2018-02-24 03:28:11 -05:00
Emilio G. Cota	87ef2a2c5f	atomics: emit an smp_read_barrier_depends() barrier only for Alpha and Thread Sanitizer For correctness, smp_read_barrier_depends() is only required to emit a barrier on Alpha hosts. However, we are currently emitting a consume fence unconditionally, and most compilers currently treat consume and acquire fences as equivalent. Fix it by keeping the consume fence if we're compiling with Thread Sanitizer, since this might help prevent false warnings. Otherwise, only emit the barrier for Alpha hosts. Note that we still guarantee that smp_read_barrier_depends() is a compiler barrier. Backports commit c983895258a771f8a5e4a53950bfb7fd2216651c from qemu	2018-02-24 03:26:52 -05:00
Eduardo Habkost	aa3d46ef83	osdep: Move default qemu_hw_version() value to a macro The macro will be used by code that will stop calling qemu_hw_version() at runtime and just need a constant value. Backports commit d494352c2f7818aeba184a8ef757569083740bb2 from qemu	2018-02-24 03:16:34 -05:00
Fam Zheng	fb8135cd0d	memory: Remove code for mr->may_overlap The collision check does nothing and hasn't been used. Remove the variable together with related code. Backports commit b61359781958759317ee6fd1a45b59be0b7dbbe1 from qemu	2018-02-24 02:55:25 -05:00
Gonglei	feff56cc11	memory: drop find_ram_block() On the one hand, we already have qemu_get_ram_block(), whose function is similar. On the other hand, we can use mr->ram_block directly instead of searching for the RAMBlock by ram_addr, which is wasteful. Backports commit fa53a0e53efdc7002497ea4a76aacf6cceb170ef from qemu	2018-02-24 02:52:20 -05:00
Paolo Bonzini	9bb67a3f58	hw: clean up hw/hw.h includes Include qom/object.h and exec/memory.h instead of exec/ioport.h; exec/ioport.h was almost everywhere required only for those two includes, not for the content of the header itself. Remove block/aio.h, everybody is already including it through another path. With this change, include/hw/hw.h is freed from qemu-common.h. Backports commit df43d49cb8708b9c88a20afe0d1a3089b550a5b8 from qemu	2018-02-24 02:46:41 -05:00
Paolo Bonzini	d0d3712417	hw: remove pio_addr_t pio_addr_t is almost unused, because these days I/O ports are simply accessed through the address space. cpu_{in,out}[bwl] themselves are almost unused; monitor.c and xen-hvm.c could use address_space_read/write directly, since they have an integer size at hand. This leaves qtest as the only user of those functions. On the other hand even portio_* functions use this type; the only interesting use of pio_addr_t thus is include/hw/sysbus.h. I guess I could move it there, but I don't see much benefit in that either. Using uint32_t is enough and avoids the need to include ioport.h everywhere. Backports commit 89a80e7400f7225d9401b35ef32454b4ab29dc67 from qemu	2018-02-24 02:43:16 -05:00
Paolo Bonzini	9485b7c2e1	cpu: move exec-all.h inclusion out of cpu.h exec-all.h contains TCG-specific definitions. It is not needed outside TCG-specific files such as translate.c, exec.c or *helper.c. One generic function had snuck into include/exec/exec-all.h; move it to include/qom/cpu.h. Backports commit 63c915526d6a54a95919ebece83fa9ca631b2508 from qemu	2018-02-24 02:39:08 -05:00
Paolo Bonzini	58693409ea	exec: extract exec/tb-context.h TCG backends do not need most of exec-all.h; extract what they actually need to a separate file or move it directly to tcg.h. The next patch will stop including exec-all.h from everywhere. Backports commit 00f6da6a1a5d1ce085334eccbb50ec899ceed513 from qemu	2018-02-24 02:09:58 -05:00
Paolo Bonzini	f9b9d0ba0f	hw: explicitly include qemu/log.h Move the inclusion out of hw/hw.h, most files do not need it. Backports commit 03dd024ff57733a55cd2e455f361d053c81b1b29 from qemu	2018-02-24 02:00:45 -05:00
Paolo Bonzini	37f26922dd	qemu-common: push cpu.h inclusion out of qemu-common.h Backports commit 33c11879fd422b759483ed25fef133ea900ea8d7 from qemu	2018-02-24 01:50:56 -05:00

369 commits