Commit graph

368 commits

Author SHA1 Message Date
Kirill A. Shutemov eb489625b5
x86: implement la57 paging mode
The new paging more is extension of IA32e mode with more additional page
table level.

It brings support of 57-bit vitrual address space (128PB) and 52-bit
physical address space (4PB).

The structure of new page table level is identical to pml4.

The feature is enumerated with CPUID.(EAX=07H, ECX=0):ECX[bit 16].

CR4.LA57[bit 12] need to be set when pageing enables to activate 5-level
paging mode.

Backports commit 6c7c3c21f95dd9af8a0691c0dd29b07247984122 from qemu
2018-03-01 11:02:07 -05:00
Doug Evans 7c874b1b2b
target-i386: Fix eflags.TF/#DB handling of syscall/sysret insns
The syscall and sysret instructions behave a bit differently:
TF is checked after the instruction completes.
This allows the o/s to disable #DB at a syscall by adding TF to FMASK.
And then when the sysret is executed the #DB is taken "as if" the
syscall insn just completed.

Backports commit c52ab08aee6f7d4717fc6b517174043126bd302f from qemu
2018-03-01 10:56:22 -05:00
Yi Sun f6e624d97b
target-i386: Add Intel SHA_NI instruction support.
Add SHA_NI feature bit. Its spec can be found at:
https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf

Backports commit 638cbd452d3a92a2ab18caee73078483d90f64eb from qemu
2018-03-01 10:52:54 -05:00
Paolo Bonzini 560515941a
target-i386: correctly propagate retaddr into SVM helpers
Commit 2afbdf8 ("target-i386: exception handling for memory helpers",
2015-09-15) changed tlb_fill's cpu_restore_state+raise_exception_err
to raise_exception_err_ra. After this change, the cpu_restore_state
and raise_exception_err's cpu_loop_exit are merged into
raise_exception_err_ra's cpu_loop_exit_restore.

This actually fixed some bugs, but when SVM is enabled there is a
second path from raise_exception_err_ra to cpu_loop_exit. This is
the VMEXIT path, and now cpu_vmexit is called without a
cpu_restore_state before.

The fix is to pass the retaddr to cpu_vmexit (via
cpu_svm_check_intercept_param). All helpers can now use GETPC() to pass
the correct retaddr, too.

Backports commit 823fb688ebc52a7d79c1308acb28c92b56820167 from qemu
2018-03-01 09:31:16 -05:00
Luwei Kang 57533d1adc
x86: add AVX512_4VNNIW and AVX512_4FMAPS features
The spec can be found in Intel Software Developer Manual or in
Instruction Set Extensions Programming Reference.

Backports commit 95ea69fb46266aaa46d0c8b7f0ba8c4903dbe4e3 from qemu
2018-03-01 08:51:09 -05:00
Emilio G. Cota 3dc16ebca3
target-i386: remove helper_lock()
It's been superseded by the atomic helpers.

The use of the atomic helpers provides a significant performance and scalability
improvement. Below is the result of running the atomic_add-test microbenchmark with:
$ x86_64-linux-user/qemu-x86_64 tests/atomic_add-bench -o 5000000 -r $r -n $n
, where $n is the number of threads and $r is the allowed range for the additions.

The scenarios measured are:
- atomic: implements x86' ADDL with the atomic_add helper (i.e. this patchset)
- cmpxchg: implement x86' ADDL with a TCG loop using the cmpxchg helper
- master: before this patchset

Results sorted in ascending range, i.e. descending degree of contention.
Y axis is Throughput in Mops/s. Tests are run on an AMD machine with 64
Opteron 6376 cores.

atomic_add-bench: 5000000 ops/thread, [0,1] range

25 ++---------+----------+---------+----------+----------+----------+---++
+ atomic +-E--+ + + + + + |
|cmpxchg +-H--+ |
20 +Emaster +-N--+ ++
|| |
|++ |
|| |
15 +++ ++
|N| |
|+| |
10 ++| ++
|+|+ |
| | -+E+------ +++ ---+E+------+E+------+E+-----+E+------+E|
|+E+E+- +++ +E+------+E+-- |
5 ++|+ ++
|+N+H+--- +++ |
++++N+--+H++----+++ + +++ --++H+------+H+------+H++----+H+---+--- |
0 ++---------+-----H----+---H-----+----------+----------+----------+---H+
0 10 20 30 40 50 60
Number of threads

atomic_add-bench: 5000000 ops/thread, [0,2] range

25 ++---------+----------+---------+----------+----------+----------+---++
++atomic +-E--+ + + + + + |
|cmpxchg +-H--+ |
20 ++master +-N--+ ++
|E| |
|++ |
||E |
15 ++| ++
|N|| |
|+|| ---+E+------+E+-----+E+------+E|
10 ++| | ---+E+------+E+-----+E+--- +++ +++
||H+E+--+E+-- |
|+++++ |
| || |
5 ++|+H+-- +++ ++
|+N+ - ---+H+------+H+------ |
+ +N+--+H++----+H+---+--+H+----++H+--- + + +H+---+--+H|
0 ++---------+----------+---------+----------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads

atomic_add-bench: 5000000 ops/thread, [0,8] range

40 ++---------+----------+---------+----------+----------+----------+---++
++atomic +-E--+ + + + + + |
35 +cmpxchg +-H--+ ++
| master +-N--+ ---+E+------+E+------+E+-----+E+------+E|
30 ++| ---+E+-- +++ ++
| | -+E+--- |
25 ++E ---- +++ ++
|+++++ -+E+ |
20 +E+ E-- +++ ++
|H|+++ |
|+| +H+------- |
15 ++H+ ---+++ +H+------ ++
|N++H+-- +++--- +H+------++|
10 ++ +++ - +++ ---+H+ +++ +H+
| | +H+-----+H+------+H+-- |
5 ++| +++ ++
++N+N+--+N++ + + + + + |
0 ++---------+----------+---------+----------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads

atomic_add-bench: 5000000 ops/thread, [0,128] range

160 ++---------+---------+----------+---------+----------+----------+---++
+ atomic +-E--+ + + + + + |
140 +cmpxchg +-H--+ +++ +++ ++
| master +-N--+ E--------E------+E+------++|
120 ++ --| | +++ E+
| -- +++ +++ ++|
100 ++ - ++
| +++- +++ ++|
80 ++ -+E+ -+H+------+H+------H--------++
| ---- ---- +++ H|
| ---+E+-----+E+- ---+H+ ++|
60 ++ +E+--- +++ ---+H+--- ++
| --+++ ---+H+-- |
40 ++ +E+-+H+--- ++
| +H+ |
20 +EE+ ++
+N+ + + + + + + |
0 ++N-N---N--+---------+----------+---------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads

atomic_add-bench: 5000000 ops/thread, [0,1024] range

350 ++---------+---------+----------+---------+----------+----------+---++
+ atomic +-E--+ + + + + + |
300 +cmpxchg +-H--+ +++
| master +-N--+ +++ ||
| +++ | ----E|
250 ++ | ----E---- ++
| ----E--- | ---+H|
200 ++ -+E+--- +++ ---+H+--- ++
| ---- -+H+-- |
| +E+ +++ ---- +++ |
150 ++ ---+++ ---+H+- ++
| --- -+H+-- |
100 ++ ---+E+ ---- +++ ++
| +++ ---+E+-----+H+- |
| -+E+------+H+-- |
50 ++ +E+ ++
+EE+ + + + + + + |
0 ++N-N---N--+---------+----------+---------+----------+----------+---++
0 10 20 30 40 50 60
Number of threads

hi-res: http://imgur.com/a/fMRmq

For master I stopped measuring master after 8 threads, because there is little
point in measuring the well-known performance collapse of a contended lock.

Backports commit 37b995f6e7a1cb6fa378c5cd4217b9dd9e1fc98b from qemu
2018-02-27 23:43:22 -05:00
Emilio G. Cota 9d9b7dedac
target-i386: emulate XCHG using atomic helper
Backports commit ea97ebe89f7a879ea9aba90140e40c29b5cbd653 from qemu
2018-02-27 23:40:20 -05:00
Emilio G. Cota 8f96b6beb9
target-i386: emulate LOCK'ed BTX ops using atomic helpers
Backports commit cfe819d309d472f75fd129faf1d1064a2498326c from qemu
2018-02-27 23:39:21 -05:00
Emilio G. Cota 089965fa8d
target-i386: emulate LOCK'ed XADD using atomic helper
Backports commit f53b01817f95781d2bcc8a82e057d1416601e13b from qemu
2018-02-27 23:06:28 -05:00
Emilio G. Cota f9ed728f27
target-i386: emulate LOCK'ed NEG using cmpxchg helper
Backports commit 8eb8c7385608b99bed6055a22d897ff727a6cb8e from qemu
2018-02-27 23:03:28 -05:00
Emilio G. Cota fedeb0f93e
target-i386: emulate LOCK'ed NOT using atomic helper
Backports commit 2a5fe8ae145ef7a3ab480922116d27efcc97b85d from qemu
2018-02-27 23:00:33 -05:00
Emilio G. Cota 05c94546d5
target-i386: emulate LOCK'ed INC using atomic helper
Backports commit 60e573462fcdb83aa1a41e66a9f31dc8a4364399 from qemu
2018-02-27 22:56:05 -05:00
Emilio G. Cota 7c7b0fe746
target-i386: emulate LOCK'ed OP instructions using atomic helpers
Backports commit a7cee522f3529c2fc85379237b391ea98823271e from qemu
2018-02-27 22:53:46 -05:00
Emilio G. Cota a386368f82
target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers
The diff here is uglier than necessary. All this does is to turn

FOO

into:

if (s->prefix & PREFIX_LOCK) {
BAR
} else {
FOO
}

where FOO is the original implementation of an unlocked cmpxchg.

Backports commit ae03f8de45427042ecd10b0941a005f21ecc064c from qemu
2018-02-27 22:38:37 -05:00
Paolo Bonzini be00a3e100
target-i386: fix 32-bit addresses in LEA
This was found with test-i386. The issue is that instructions
such as

addr32 lea (%eax), %rax

did not perform a 32-bit extension, because the LEA translation
skipped the gen_lea_v_seg step. That step does not just add
segments, it also takes care of extending from address size to
pointer size.

Backports commit 620abfb004543404bef1953e25da2ad77352941a from qemu
2018-02-26 10:06:08 -05:00
Eduardo Habkost b41bb81737
target-i386: Don't use cpu->migratable when filtering features
When explicitly enabling unmigratable flags using "-cpu host"
(e.g. "-cpu host,+invtsc"), the requested feature won't be
enabled because cpu->migratable is true by default.

This is inconsistent with all other CPU models, which don't have
the "migratable" option, making "+invtsc" work without the need
for extra options.

This happens because x86_cpu_filter_features() uses
cpu->migratable as an argument for
x86_cpu_get_supported_feature_word(). This is not useful
because:
2) on "-cpu host" it only makes QEMU disable features that were
explicitly enabled in the command-line;
1) on all the other CPU models, cpu->migratable is already false.

The fix is to just use 'false' as an argument to
x86_cpu_get_supported_feature_word() in
x86_cpu_filter_features().

Note that:

* This won't change anything for people using using
"-cpu host" or "-cpu host,migratable=<on|off>" (with no extra
features) because the x86_cpu_get_supported_feature_word() call
on the cpu->host_features check uses cpu->migratable as
argument.
* This won't change anything for any CPU model except "host"
because they all have cpu->migratable == false (and only "host"
has the "migratable" property that allows it to be changed).
* This will only change things for people using "-cpu host,+<feature>",
where <feature> is a non-migratable feature. The only existing
named non-migratable feature is "invtsc".

In other words, this change will only affect people using
"-cpu host,+invtsc" (that will now get what they asked for: the
invtsc flag will be enabled). All other use cases are unaffected.

Backports commit 46c032f3afcc05a0123914609f1003906ba63fda from qemu
2018-02-26 09:51:14 -05:00
Eduardo Habkost 4096ce0184
target-i386: x86_cpu_load_features() function
When probing for CPU model information, we need to reuse the code
that initializes CPUID fields, but not the remaining side-effects
of x86_cpu_realizefn(). Move that code to a separate function
that can be reused later.

Backports commit 41f3d4d69a423dadb8431fda65d8d7c68c0de0fc from qemu
2018-02-26 09:49:34 -05:00
Eduardo Habkost aa98c8a93f
target-i386: Move warning code outside x86_cpu_filter_features()
x86_cpu_filter_features() will be reused by code that shouldn't
print any warning. Move the warning code to a new
x86_cpu_report_filtered_features() function, and call it from
x86_cpu_realizefn().

Backports commit 8ca30e8673aff9bfcf8f969f8db4266b5f62e49c from qemu
2018-02-26 09:40:11 -05:00
Eduardo Habkost 08bfa41e1b
target-i386: xsave: Add FP and SSE bits to x86_ext_save_areas
Instead of treating the FP and SSE bits as special cases, add
them to the x86_ext_save_areas array. This will simplify the code
that calculates the supported xsave components and the size of
the xsave area.

Backports commit e3c9022b4e2b6a4deb6518361d2bbf33522b9198 from qemu
2018-02-26 09:37:48 -05:00
Eduardo Habkost 54bd827472
target-i386: Register properties for feature aliases manually
Instead of keeping the aliases inside the feature name arrays and
require parsing the strings, just register alias properties
manually. This simplifies the code for property registration and
lookup.

Backports commit 16d2fcaa509b1ca56eb2fcd8fe877279cf65cccc from qemu
2018-02-26 09:34:52 -05:00
Eduardo Habkost b508b9e02a
target-i386: Remove underscores from feat_names arrays
Instead of translating the feature name entries when adding
property names, store the actual property names in the feature
name array.

For reference, here is the full list of functions that use
FeatureWordInfo::feat_names:

* x86_cpu_get_migratable_flags(): not affected, as it just
check for non-NULL values.
* report_unavailable_features(): informative only. It will
start printing feature names with hyphens.
* x86_cpu_list(): informative only. It will start printing
feature names with hyphens
* x86_cpu_register_feature_bit_props(): not affected, as it
was already calling feat2prop(). Now we can remove the
feat2prop() calls safely.

So, the only user-visible effect of this patch are the new names
being used in help and error messages for users.

Backports commit fc7dfd205f3287893c436d932a167bffa30579c8 from qemu
2018-02-26 09:33:15 -05:00
Eduardo Habkost 6d1a7bccb5
target-i386: Disable VME by default with TCG
VME is already disabled automatically when using TCG. So, instead
of pretending it is there when reporting CPU model data on
query-cpu-* QMP commands (making every CPU model to be reported
as not runnable), we can disable it by default on all CPU models
when using TCG.

Do that by adding a tcg_default_props array that will work like
kvm_default_props.

Backports commit 04d99c3c61f4bdc0450dbeb6512b6dd743baca65 from qemu
2018-02-26 08:23:44 -05:00
Eduardo Habkost 594cbeaa06
target-i386: List CPU models using subclass list
Instead of using the builtin_x86_defs array, use the QOM subclass
list to list CPU models on "-cpu ?" and "query-cpu-definitions".

Backports commit ee465a3ef77c2b2975ffa71c72208c05b3f3970d from qemu
2018-02-26 08:17:04 -05:00
Evgeny Yakovlev fa9d708fbd
target-i386: Correct family/model/stepping for Opteron_G3
Current CPU definition for AMD Opteron third generation includes
features like SSE4a and LAHF_LM support in emulated CPUID. These
features are present in K8 rev.E or K10 CPUs and later. However,
current G3 family and model describe 2nd generation K8 cores instead.

This is incorrect but was considered harmless until our tests found a
problem with linux kernels >= 3.10 (and maybe earlier) which specifically
check for Opteron K8 model when parsing CPUID leaf 0x80000001:
http://lxr.free-electrons.com/source/arch/x86/kernel/cpu/amd.c?v=3.16#L552
This code will disable LAHF_LM feature in /proc/cpuinfo if model number
is inconsistent.

This change sets Opteron_G3 family/model/stepping to 16/2/3 which is
a proper Opteron 3rd generation 2350 CPU.

Backports commit 339892d758efb2d0954160d41736a0eac9875d67 from qemu
2018-02-26 04:59:18 -05:00
Eduardo Habkost b7f434373b
target-i386: Report known CPUID[EAX=0xD,ECX=0]:EAX bits as migratable
A regression was introduced by commit 96193c22a "target-i386:
Move xsave component mask to features array": all
CPUID[EAX=0xD,ECX=0]:EAX bits were being reported as unmigratable
because they don't have feature names defined. This broke
"-cpu host" because it enables only migratable features by
default.

This adds a new field to FeatureWordInfo: migratable_flags, which
will make those features be reported as migratable even if they
don't have a property name defined.

Backports commit 6fb2fff75dceed1716e757882a6dfbadd9042407 from qemu
2018-02-26 04:58:05 -05:00
Alex Bennée 33589eb75f
cpus: pass CPUState to run_on_cpu helpers
CPUState is a fairly common pointer to pass to these helpers. This means
if you need other arguments for the async_run_on_cpu case you end up
having to do a g_malloc to stuff additional data into the routine. For
the current users this isn't a massive deal but for MTTCG this gets
cumbersome when the only other parameter is often an address.

This adds the typedef run_on_cpu_func for helper functions which has an
explicit CPUState * passed as the first parameter. All the users of
run_on_cpu and async_run_on_cpu have had their helpers updated to use
CPUState where available.

Backports commit e0eeb4a21a3ca4b296220ce4449d8acef9de9049 from qemu
2018-02-26 04:54:55 -05:00
Eduardo Habkost 49c04d7104
target-i386: Clear KVM CPUID features if KVM is disabled
This will ensure all checks for features[FEAT_KVM] in the code
will be correct in case the KVM CPUID leaf is completely
disabled.

Backports commit aec661de86894e914d2d82431d9cefa9a9a40213 from qemu
2018-02-26 04:47:05 -05:00
Eduardo Habkost f29384c810
target-i386: Move xsave component mask to features array
This will reuse the existing check/enforce logic in
x86_cpu_filter_features() to check the xsave component bits
against GET_SUPPORTED_CPUID.

Backports commit 96193c22ab39ea24f81e386ad7883260ff24f5fd from qemu
2018-02-26 04:45:35 -05:00
Eduardo Habkost 3fb3e6672b
target-i386: xsave: Calculate set of xsave components on realize
Instead of doing complex calculations and calling
kvm_arch_get_supported_cpuid() inside cpu_x86_cpuid(), calculate
the set of required XSAVE components earlier, at realize time.

Backports commit 2ca8a8becc2eeb5262e478ce502f5daa53f3d0bc from qemu
2018-02-26 04:40:41 -05:00
Eduardo Habkost 28f002cbaf
target-i386: xsave: Helper function to calculate xsave area size
Move the xsave area size calculation from cpu_x86_cpuid() inside
its own function. While doing it, change it to use the XSAVE area
struct sizes for the initial size, instead of the magic 0x240
number.

Backports commit 1fda6198e4126af9988754c8824cfc9928649890 from qemu
2018-02-26 04:36:27 -05:00
Eduardo Habkost c35e9eb9af
target-i386: xsave: Simplify CPUID[0xD,0].{EAX,EDX} calculation
Instead of assigning individual bits in a loop, just copy the
values from ena_mask.

Backports commit 8057c621b1b17cbcb35fe67d1a09ada9055873a9 from qemu
2018-02-26 04:35:14 -05:00
Eduardo Habkost c7195afd32
target-i386: xsave: Calculate enabled components only once
Instead of checking both env->features and ena_mask at two
different places in the CPUID code, initialize ena_mask based on
the features that are enabled for the CPU, and then clear
unsupported bits based on kvm_arch_get_supported_cpuid().

The results should be exactly the same, but it will make it
easier to move the mask calculation elsewhare, and reuse
x86_cpu_filter_features() for the kvm_arch_get_supported_cpuid()
check.

Backports commit 4928cd6de6b4211a79f98c8dc39115be1e815c2b from qemu
2018-02-26 04:33:18 -05:00
Eduardo Habkost c3a0cba5b1
target-i386: Don't try to enable PT State xsave component
The code that calculates the set of supported XSAVE components on
CPUID looks at ext_save_areas to find out which components should
be enabled. However, if there are zeroed entries in the
ext_save_areas array, the
((env->features[esa->feature] & esa->bits) == esa->bits)
check will always succeed and QEMU will unconditionally try to
enable the component.

Luckily this never caused any problems because the only missing
entry in ext_save_areas is the PT State component (bit 8), and
KVM currently doesn't support it (so it was cleared on ena_mask).
But the code was still incorrect and would break if KVM starts
returning CPUID[EAX=0xD,ECX=0].EAX[bit 8] as supported on
GET_SUPPORTED_CPUID.

Fix the problem by changing the code to not enable a XSAVE
component if ExtSaveArea::bits is zero.

Backports commit 9646f4927faf68e8690588c2fd6dc9834c440b58 from qemu
2018-02-26 04:30:35 -05:00
Eduardo Habkost 6188c6d6e4
target-i386: Move feature name arrays inside FeatureWordInfo
It makes it easier to guarantee the arrays are the right size,
and to find information when looking at the code.

Backports commit 2d5312da566e4424a807d078da05f92ee7be3eec from qemu
2018-02-26 04:29:47 -05:00
Eduardo Habkost 74ae087743
target-i386: Enable CPUID[0x8000000A] if SVM is enabled
SVM needs CPUID[0x8000000A] to be available. So if SVM is enabled
in a CPU model or explicitly in the command-line, adjust CPUID
xlevel to expose the CPUID[0x8000000A] leaf.

Backports commit 0c3d7c0051576d220e6da0a8ac08f2d8482e2f0b from qemu
2018-02-26 04:05:47 -05:00
Eduardo Habkost 37406874ea
target-i386: Automatically set level/xlevel/xlevel2 when needed
Instead of requiring users and management software to be aware of
required CPUID level/xlevel/xlevel2 values for each feature,
automatically increase those values when features need them.

This was already done for CPUID[7].EBX, and is now made generic
for all CPUID feature flags. Unit test included, to make sure we
don't break ABI on older machine-types and don't mess with the
CPUID level values if they are explicitly set by the user.

Backports commit c39c0edf9bb3b968ba95484465a50c7b19f4aa3a from qemu
2018-02-26 04:03:09 -05:00
Eduardo Habkost 6861fe80cf
target-i386: Add a marker to end of the region zeroed on reset
Instead of using cpuid_level, use an empty struct as a marker
(like we already did with {start,end}_init_save). This will avoid
accidentaly resetting the wrong fields if we change the field
ordering on CPUX86State.

Backports commit 5e992a8e337e710ea2d02f35668ac55a80e15f99 from qemu
2018-02-26 03:59:03 -05:00
Eduardo Habkost c78d24b93c
target-i386: Remove unused X86CPUDefinition::xlevel2 field
No CPU model in builtin_x86_defs has xlevel2 set, so it is always
zero. Delete the field.

Note that this is not an user-visible change. It doesn't remove
the ability to set xlevel2 on the command-line, it just removes
an unused field in builtin_x86_defs.

Backports commit 0456441b5eb6694a561ad5bb8dad52483e6a08d0 from qemu
2018-02-26 03:57:02 -05:00
Richard Henderson 552ef4b3e6
target-i386: Use struct X86XSaveArea in fpu_helper.c
This avoids a double hand-full of magic numbers in the
xsave and xrstor helper functions.

Backports commit 3f32bd21df655e62eb271182a5c63280d631c7b3 from qemu
2018-02-26 03:38:53 -05:00
Pranith Kumar 533e083495
target-i386: Generate fences for x86
Backports commit cc19e497a047193db5083425957d7292c8dd3226 from qemu
2018-02-26 03:28:31 -05:00
Stanislav Shmarov 5f9552657e
target-i386: Fixed syscall posssible segfault
In user-mode emulation env->idt.base memory is
allocated in linux-user/main.c with
size 8*512 = 4096 (for 64-bit).
When fake interrupt EXCP_SYSCALL is thrown
do_interrupt_user checks destination privilege level
for this fake exception, and tries to read 4 bytes
at address base + (256 * 2^4)=4096, that causes
segfault.

Privlege level was checked only for int's, so lets
read dpl from memory only for this case.

Backports commit 885b7c44e4f8b7a012a92770a0dba8b238662caa from qemu
2018-02-26 02:36:09 -05:00
Paolo Bonzini d8d0d08262
target-i386: fix ordering of fields in CPUX86State
Make sure reset zeroes TSC_AUX, XCR0, PKRU. Move XSTATE_BV from the
"vmstate only" section to the "KVM only" section.

Backports commit 7616f1c2da1c0f336a474a56ad6d32e15ccd666e from qemu
2018-02-26 02:34:22 -05:00
Longpeng(Mike) 8b5400d675
target-i386: present virtual L3 cache info for vcpus
Some software algorithms are based on the hardware's cache info, for example,
for x86 linux kernel, when cpu1 want to wakeup a task on cpu2, cpu1 will trigger
a resched IPI and told cpu2 to do the wakeup if they don't share low level
cache. Oppositely, cpu1 will access cpu2's runqueue directly if they share llc.
The relevant linux-kernel code as bellow:

static void ttwu_queue(struct task_struct *p, int cpu)
{
struct rq *rq = cpu_rq(cpu);
......
if (... && !cpus_share_cache(smp_processor_id(), cpu)) {
......
ttwu_queue_remote(p, cpu); /* will trigger RES IPI */
return;
}
......
ttwu_do_activate(rq, p, 0); /* access target's rq directly */
......
}

In real hardware, the cpus on the same socket share L3 cache, so one won't
trigger a resched IPIs when wakeup a task on others. But QEMU doesn't present a
virtual L3 cache info for VM, then the linux guest will trigger lots of RES IPIs
under some workloads even if the virtual cpus belongs to the same virtual socket.

For KVM, there will be lots of vmexit due to guest send IPIs.
The workload is a SAP HANA's testsuite, we run it one round(about 40 minuates)
and observe the (Suse11sp3)Guest's amounts of RES IPIs which triggering during
the period:
No-L3 With-L3(applied this patch)
cpu0:	363890	44582
cpu1:	373405	43109
cpu2:	340783	43797
cpu3:	333854	43409
cpu4:	327170	40038
cpu5:	325491	39922
cpu6:	319129	42391
cpu7:	306480	41035
cpu8:	161139	32188
cpu9:	164649	31024
cpu10:	149823	30398
cpu11:	149823	32455
cpu12:	164830	35143
cpu13:	172269	35805
cpu14:	179979	33898
cpu15:	194505	32754
avg:	268963.6	40129.8

The VM's topology is "1*socket 8*cores 2*threads".
After present virtual L3 cache info for VM, the amounts of RES IPIs in guest
reduce 85%.

For KVM, vcpus send IPIs will cause vmexit which is expensive, so it can cause
severe performance degradation. We had tested the overall system performance if
vcpus actually run on sparate physical socket. With L3 cache, the performance
improves 7.2%~33.1%(avg:15.7%).

Backports commit 14c985cffa6cb177fc01a163d8bcf227c104718c from qemu
2018-02-25 23:16:14 -05:00
Luwei Kang af7b3995dd
target-i386: Add more Intel AVX-512 instructions support
Add more AVX512 feature bits, include AVX512DQ, AVX512IFMA,
AVX512BW, AVX512VL, AVX512VBMI. Its spec can be found at:
https://software.intel.com/sites/default/files/managed/b4/3a/319433-024.pdf

Backports commit cc728d1493eee3e20c1547191862e43d3f55e714 from qemu
2018-02-25 23:09:18 -05:00
Richard Henderson 1547048a22
tcg: Reorg TCGOp chaining
Instead of using -1 as end of chain, use 0, and link through the 0
entry as a fully circular double-linked list.

Backports commit dcb8e75870e2de199db853697f8839cb603beefe from qemu
2018-02-25 21:44:50 -05:00
Igor Mammedov d30410dc9a
target-i386: Add x86_cpu_unrealizefn()
First remove VCPU from exec loop and only then remove lapic.

Backports commit c884776e9dc947105827bd6c22192863f97267d2 from qemu
2018-02-25 20:54:13 -05:00
Igor Mammedov 298b0e6529
target-i386: Fix apic object leak when CPU is deleted
Backports commit 67e55caa6dcb91c80428cee6fe463f8dd8a755ab from qemu
2018-02-25 20:48:40 -05:00
Igor Mammedov e15fb246ab
target-i386: cpu: Do not ignore error and fix apic parent
object_property_add_child() silently fails with error that it can't
create duplicate propery 'apic' as we already have 'apic' property
registered for 'apic' feature. As result generic device_realize puts
apic into unattached container.

As it's programming error, abort if name collision happens in future
and fix property name for apic_state to 'lapic', this way apic is
a child of cpu instance.

Backports commit 6816b1b3811e839540df22855d975b6d76ae438b from qemu
2018-02-25 20:47:46 -05:00
Paolo Bonzini 403021183d
target-i386: Add support for UMIP and RDPID CPUID bits
These are both stored in CPUID[EAX=7,EBX=0].ECX. KVM is going to
be able to emulate both (albeit with a performance loss in the case
of RDPID, which therefore will be in KVM_GET_EMULATED_CPUID rather
than KVM_GET_SUPPORTED_CPUID).

It's also possible to implement both in TCG, but this is for 2.8.

Backports commit c2f193b538032accb9db504998bf2ea7c0ef65af from qemu
2018-02-25 20:46:40 -05:00
Igor Mammedov 6714284211
target-i386: Add socket/core/thread properties to X86CPU
These properties will be used by as address where to plug
CPU with help -device/device_add commands.

Backports commit d89c2b8b98e097b9cad5104b0f178bde1cfa011b from qemu
2018-02-25 20:45:35 -05:00