This allows KVM with the Book3S radix MMU mode to take advantage of
THP and install larger pages in the partition scope page tables (the
host translation).
Backports commit 0c1272cc7c72dfe0ef66be8f283cf67c74b58586 from qemu
Actually enable the global memory barriers if supported by the OS.
Because only recent versions of Linux include the support, they
are disabled by default. Note that it also has to be disabled
for QEMU to run under Wine.
Before this patch, rcutorture reports 85 ns/read for my machine,
after the patch it reports 12.5 ns/read. On the other hand updates
go from 50 *micro*seconds to 20 *milli*seconds.
Backports commit a40161cbe9ccbcbab798c3e4d257c4bba99d153a from qemu
There are qemu_strtoNN functions for various sized integers. This adds two
more for plain int & unsigned int types, with suitable range checking.
Backports commit 473a2a331ee382703f7ca0067ba2545350cfa06c from qemu
Check for the presence of posix_memalign() in the configure script,
not using "defined(_POSIX_C_SOURCE) && !defined(__sun__)". This
lets qemu use posix_memalign() on NetBSD versions that have it,
instead of falling back to valloc() which is wasteful when the
required alignment is smaller than a page.
Backports commit 9bc5a7193fb422ee53187601eba577ee5d195522 from qemu
SPARC Linux has an oddity that it insists that mmap()
of MAP_FIXED memory must be at an alignment defined by
SHMLBA, which is more aligned than the page size
(typically, SHMLBA alignment is to 16K, and pages are 8K).
This is a relic of ancient hardware that had cache
aliasing constraints, but even on modern hardware the
kernel still insists on the alignment.
To ensure that we get mmap() alignment sufficient to
make the kernel happy, change QEMU_VMALLOC_ALIGN,
qemu_fd_getpagesize() and qemu_mempath_getpagesize()
to use the maximum of getpagesize() and SHMLBA.
In particular, this allows 'make check' to pass on Sparc:
we were previously failing the ivshmem tests.
Backports commit 57d1f6d7ce23e79a8ebe4a57bd2363b269b4664b from qemu
Provide helpers to convert bitmaps to little endian format. It can be
used when we want to send one bitmap via network to some other hosts.
One thing to mention is that, these helpers only solve the problem of
endianess, but it does not solve the problem of different word size on
machines (the bitmaps managing same count of bits may contains different
size when malloced). So we need to take care of the size alignment issue
on the callers for now.
Backports commit d7788151a0807d5d2d410e3f8944d8c8a651f8d2 from qemu
Nobody has mentioned AIX host support on the mailing list for years,
and we have no test systems for it so it is most likely broken.
We've advertised in configure for two releases now that we plan
to drop support for this host OS, and have had no complaints.
Drop the AIX host support code.
We can also drop the now-unused AIX version of sys_cache_info().
Note that the _CALL_AIX define used in the PPC tcg backend is
also used for Linux PPC64, and so that code should not be removed.
Backports commit 7872375219c03682bda3f6191fa5f6a58238ed36 from qemu
This include was forgotten when splitting cacheinfo.c out of
tcg/ppc/tcg-target.inc.c (see commit b255b2c8).
For a Centos7 host, the include path
<signal.h>
<bits/sigcontext.h>
<asm/sigcontext.h>
<asm/elf.h>
<asm/auxvec.h>
implicitly pulls in the desired AT_* defines.
Not so for Debian Jessie.
Backports commit 810d5cad4087236236e00fd3046a16adf26e9060 from qemu
Clang generates the following warning on aarch64 host:
CC util/cacheinfo.o
/home/pranith/qemu/util/cacheinfo.c:121:48: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
asm volatile("mrs\t%0, ctr_el0" : "=r"(ctr));
^
/home/pranith/qemu/util/cacheinfo.c:121:28: note: use constraint modifier "w"
asm volatile("mrs\t%0, ctr_el0" : "=r"(ctr));
^~
%w0
Constraint modifier 'w' is not (yet?) accepted by gcc. Fix this by increasing the ctr size.
Backports commit 2ae96c157ab3155baf6595c08cf5d3fe3c023a60 from qemu
Add helpers to gather cache info from the host at init-time.
For now, only export the host's I/D cache line sizes, which we
will use to improve cache locality to avoid false sharing.
Backports commit b255b2c8a5484742606e8760870ba3e14d0c9605 from qemu
This makes qemu_strtosz(), qemu_strtosz_mebi() and
qemu_strtosz_metric() similar to qemu_strtoi64(), except negative
values are rejected.
Backports commit f17fd4fdf0df3d2f3444399d04c38d22b9a3e1b7 from qemu
Change the qemu_strtosz() & friends to return -EINVAL when @endptr is
null and the conversion doesn't consume the string completely.
Matches how qemu_strtol() & friends work.
Only test_qemu_strtosz_simple() passes a null @endptr. No functional
change there, because its conversion consumes the string.
Simplify callers that use @endptr only to fail when it doesn't point
to '\0' to pass a null @endptr instead.
Backports commit 4fcdf65ae2c00ae69f7625f26ed41f37d77b403c from qemu
Writing QEMU_STRTOSZ_DEFSUFFIX_* instead of '*' gains nothing. Get
rid of these eyesores.
Backports commit 17f942560e54f8ee72996bc3276c697503606d7b from qemu
Most callers of qemu_strtosz_suffix() pass QEMU_STRTOSZ_DEFSUFFIX_B.
Capture the pattern in new qemu_strtosz().
Inline qemu_strtosz_suffix() into its only remaining caller.
Backports commit 466dea14e677555dd24465aca75d00a3537ad062 from qemu
With qemu_strtosz(), no suffix means mebibytes. It's used rarely.
I'm going to add a similar function where no suffix means bytes.
Rename qemu_strtosz() to qemu_strtosz_MiB() to make the name
qemu_strtosz() available for the new function.
Backports commit e591591b323772eea733de6027f5e8b50692d0ff from qemu
To parse numbers with metric suffixes, we use
qemu_strtosz_suffix_unit(nptr, &eptr, QEMU_STRTOSZ_DEFSUFFIX_B, 1000)
Capture this in a new function for legibility:
qemu_strtosz_metric(nptr, &eptr)
Replace test_qemu_strtosz_suffix_unit() by test_qemu_strtosz_metric().
Rename qemu_strtosz_suffix_unit() to do_strtosz() and give it internal
linkage.
Backports commit d2734d2629266006b0413433778474d5801c60be from qemu
Reorder check_strtox_error() to make it obvious that we always store
through a non-null @endptr.
Transform
if (some error) {
error case ...
err = value for error case;
} else {
normal case ...
err = value for normal case;
}
return err;
to
if (some error) {
error case ...
return value for error case;
}
normal case ...
return value for normal case;
Backports commit 4baef2679e029c76707be1e2ed54bf3dd21693fe from qemu
Name same things the same, different things differently.
* qemu_strtol()'s parameter @nptr is called @p in
check_strtox_error(). Rename the latter.
* qemu_strtol()'s parameter @endptr is called @next in
check_strtox_error(). Rename the latter.
* qemu_strtol()'s variable @p is called @endptr in
check_strtox_error(). Rename both to @ep.
* qemu_strtol()'s variable @err is *negative* errno,
check_strtox_error()'s parameter @err is *positive*. Rename the
latter to @libc_errno.
Same for qemu_strtoul(), qemu_strtoi64(), qemu_strtou64(), of course.
Backports commit 717adf960933da0650d995f050d457063d591914 from qemu
The name qemu_strtoll() suggests conversion to long long, but it
actually converts to int64_t. Rename to qemu_strtoi64().
The name qemu_strtoull() suggests conversion to unsigned long long,
but it actually converts to uint64_t. Rename to qemu_strtou64().
Backports commit b30d188677456b17c1cd68969e08ddc634cef644 from qemu
Fixes the following documentation bugs:
* Fails to document that null @nptr is safe.
* Fails to document that we return -EINVAL when no conversion could be
performed (commit 47d4be1).
* Confuses long long with int64_t, and unsigned long long with
uint64_t.
* Claims the unsigned conversions can underflow. They can't.
While there, mark problematic assumptions that int64_t is long long,
and uint64_t is unsigned long long with FIXME comments.
Backports commit 4295f879becfbbb9f4330489311586b96915d920 from qemu
1st mmap returns *ptr* which aligns to host page size,
| size + align |
------------------------------------------
ptr
input param *align* could be 1M, or 2M, or host page size. After
QEMU_ALIGN_UP, offset will >= 0
2nd mmap use flag MAP_FIXED, then it return ptr+offset, or else fail.
If it success, then we will have something like:
| offset | size |
--------------------------------------
ptr ptr1
*ptr1* is what we really want to return, it equals ptr+offset.
Backports commit 6e4c890e15b23f078650499fbde11760b8eccf10 from qemu
Include sys/user.h for declaration of 'struct kinfo_proc'.
Add -lutil to qemu-ga link for kinfo_getproc.
Backports commit a7764f1548ef9946af30a8f96be9cef10761f0c1 from qemu
The QmpOutputVisitor has no direct dependency on QMP. It is
valid to use it anywhere that one wants a QObject. Rename it
to better reflect its functionality as a generic QAPI
to QObject converter.
The commit before previous renamed the files, this one renames C
identifiers.
Backports commit 7d5e199ade76c53ec316ab6779800581bb47c50a from qemu
Use Neon instructions to perform zero checking of
buffer. This is helps in reducing total migration time.
Use case: Idle VM live migration with 4 VCPUS and 8GB ram
running CentOS 7.
Without Neon, the Total migration time is 3.5 Sec
Migration status: completed
total time: 3560 milliseconds
downtime: 33 milliseconds
setup: 5 milliseconds
transferred ram: 297907 kbytes
throughput: 685.76 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 2062760 pages
skipped: 0 pages
normal: 69808 pages
normal bytes: 279232 kbytes
dirty sync count: 3
With Neon, the total migration time is 2.9 Sec
Migration status: completed
total time: 2960 milliseconds
downtime: 65 milliseconds
setup: 4 milliseconds
transferred ram: 299869 kbytes
throughput: 830.19 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 2064313 pages
skipped: 0 pages
normal: 70294 pages
normal bytes: 281176 kbytes
dirty sync count: 3
Backports commit 7069532e3b944c25707d4f69998e68a739eabff9 from qemu
Tracked down with an ugly, brittle and probably buggy Perl script.
Also move includes converted to <...> up so they get included before
ours where that's obviously okay.
Backports commit a9c94277f07d19d3eb14f199c3e93491aa3eae0e from qemu
Range represents a range as follows. Member @start is the inclusive
lower bound, member @end is the exclusive upper bound. Zero @end is
special: if @start is also zero, the range is empty, else @end is to
be interpreted as 2^64. No other empty ranges may occur.
The range [0,2^64-1] cannot be represented. If you try to create it
with range_set_bounds1(), you get the empty range instead. If you try
to create it with range_set_bounds() or range_extend(), assertions
fail. Before range_set_bounds() existed, the open-coded creation
usually got you the empty range instead. Open deathtrap.
Moreover, the code dealing with the janus-faced @end is too clever by
half.
Dumb this down to a more pedestrian representation: members @lob and
@upb are inclusive lower and upper bounds. The empty range is encoded
as @lob = 1, @upb = 0.
Backports commit 6dd726a2bf1b800289d90a84d5fcb5ce7b78a8e1 from qemu
Users of struct Range mess liberally with its members, which makes
refactoring hard. Create a set of methods, and convert all users to
call them instead of accessing members. The methods have carefully
worded contracts, and use assertions to check them.
Backports commit a0efbf16604770b9d805bcf210ec29942321134f from qemu
Commit 7f8f9ef1 introduced the ability to store a list of
integers as a sorted list of ranges, but when merging ranges,
it leaks one or more ranges. It was also using range_get_last()
incorrectly within range_compare() (a range is a start/end pair,
but range_get_last() is for start/len pairs), and will also
mishandle a range ending in UINT64_MAX (remember, we document
that no range covers 2**64 bytes, but that ranges that end on
UINT64_MAX have end < begin).
The whole merge algorithm was rather complex, and included
unnecessary passes over data within glib functions, and enough
indirection to make it hard to easily plug the data leaks.
Since we are already hard-coding things to a list of ranges,
just rewrite the thing to open-code the traversal and
comparisons, by making the range_compare() helper function give
us an answer that is easier to use, at which point we avoid the
need to pass any callbacks to g_list_*(). Then by reusing
range_extend() instead of duplicating effort with range_merge(),
we cover the corner cases correctly.
Drop the now-unused range_merge() and ranges_can_merge().
Doing this lets test-string-{input,output}-visitor pass under
valgrind without leaks.
Backports commit db486cc334aafd3dbdaf107388e37fc3d6d3e171 from qemu
Calling our function g_list_insert_sorted_merged is a misnomer,
since we are NOT writing a glib function. Furthermore, we are
making every caller pass the same comparator function of
range_merge(): any caller that would try otherwise would break
in weird ways since our internal call to ranges_can_merge() is
hard-coded to operate only on ranges, rather than paying
attention to the caller's comparator.
Better is to fix things so that callers don't have to care about
our internal comparator, by picking a function name and updating
the parameter type away from a gratuitous use of void*, to make
it obvious that we are operating specifically on a list of ranges
and not a generic list. Plus, refactoring the code here will
make it easier to plug a memory leak in the next patch.
range_compare() is now internal only, and moves to the .c file.
Backports commit 7c47959d0cb05db43014141a156ada0b6d53a750 from qemu
g_list_insert_sorted_merged() is rather large to be an inline
function; move it to its own file. range_merge() and
ranges_can_merge() can likewise move, as they are only used
internally. Also, it becomes obvious that the condition within
range_merge() is already satisfied by its caller, and that the
return value is not used.
The diffstat is misleading, because of the copyright boilerplate.
Backports commit fec0fc0a13ac7f1a1130433a6740cd850c3db34a from qemu
For KVM to use Transparent Huge Pages (THP) we have to ensure that the
alignment of the userspace address of the KVM memory slot and the IPA
that the guest sees for a memory region have the same offset from the 2M
huge page size boundary.
One way to achieve this is to always align the IPA region at a 2M
boundary and ensure that the mmap alignment is also at 2M.
Unfortunately, we were only doing this for __arm__, not for __aarch64__,
so add this simple condition.
This fixes a performance regression using KVM/ARM on AArch64 platforms
that showed a performance penalty of more than 50%, introduced by the
following commit:
9fac18f (oslib: allocate PROT_NONE pages on top of RAM, 2015-09-10)
We were only lucky before the above commit, because we were allocating
large regions and naturally getting a 2M alignment on those allocations
then.
Backports commit ee1e0f8e5d3682c561edcdceccff72b9d9b16d8b from qemu
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)
Backports commit f348b6d1a53e5271cf1c9f9acc4646b4b98c1771 from qemu
Not only it makes sense, but it gets rid of checkpatch warning:
WARNING: consider using qemu_strtosz in preference to strtosz
Also remove get rid of tabs to please checkpatch.
Backports commit 4677bb40f809394bef5fa07329dea855c0371697 from qemu
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Backports commit da34e65cb4025728566d6504a99916f6e7e1dd6a from qemu
When &error_abort is passed in, the error reporting code
will print the current error message and then abort() the
process. Unfortunately at the time it aborts, we've not
yet appended the errno detail. This makes debugging certain
problems significantly harder as the log is incomplete.
Backports commit 20e2dec14954568848ad74e73aee9b3aeedd6584 from qemu
Similar to error_abort, but doesn't report where the error was
created, and terminates the process with exit(1) rather than abort().
Backports commit a29a37b994ca3c5a1d39fa0e8934f7e0f2cf57ef from qemu
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Backports commit aafd758410015e08b1aa8964d739ba8587ce58dc from qemu
getpagesize on Linux returns an int. Fix QEMU's implementation for
Windows to return an int (instead of size_t), too.
This fixes a compiler warning which was introduced recently
(commit 093e3c42).
Backports commit a28c2f2df7679e3a87789e9fb7ed69331f697297 from qemu
Anonymous and file-backed RAM allocation are now almost exactly the same.
Reduce code duplication by moving RAM mmap code out of oslib-posix.c and
exec.c.
Backports commit 794e8f301a17953efa78ab7538019ec43c59e82a from qemu