aboutsummaryrefslogtreecommitdiff
path: root/Documentation/virt
AgeCommit message (Collapse)AuthorFilesLines
2022-03-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmGravatar Linus Torvalds 3-72/+247
Pull kvm updates from Paolo Bonzini: "ARM: - Proper emulation of the OSLock feature of the debug architecture - Scalibility improvements for the MMU lock when dirty logging is on - New VMID allocator, which will eventually help with SVA in VMs - Better support for PMUs in heterogenous systems - PSCI 1.1 support, enabling support for SYSTEM_RESET2 - Implement CONFIG_DEBUG_LIST at EL2 - Make CONFIG_ARM64_ERRATUM_2077057 default y - Reduce the overhead of VM exit when no interrupt is pending - Remove traces of 32bit ARM host support from the documentation - Updated vgic selftests - Various cleanups, doc updates and spelling fixes RISC-V: - Prevent KVM_COMPAT from being selected - Optimize __kvm_riscv_switch_to() implementation - RISC-V SBI v0.3 support s390: - memop selftest - fix SCK locking - adapter interruptions virtualization for secure guests - add Claudio Imbrenda as maintainer - first step to do proper storage key checking x86: - Continue switching kvm_x86_ops to static_call(); introduce static_call_cond() and __static_call_ret0 when applicable. - Cleanup unused arguments in several functions - Synthesize AMD 0x80000021 leaf - Fixes and optimization for Hyper-V sparse-bank hypercalls - Implement Hyper-V's enlightened MSR bitmap for nested SVM - Remove MMU auditing - Eager splitting of page tables (new aka "TDP" MMU only) when dirty page tracking is enabled - Cleanup the implementation of the guest PGD cache - Preparation for the implementation of Intel IPI virtualization - Fix some segment descriptor checks in the emulator - Allow AMD AVIC support on systems with physical APIC ID above 255 - Better API to disable virtualization quirks - Fixes and optimizations for the zapping of page tables: - Zap roots in two passes, avoiding RCU read-side critical sections that last too long for very large guests backed by 4 KiB SPTEs. - Zap invalid and defunct roots asynchronously via concurrency-managed work queue. - Allowing yielding when zapping TDP MMU roots in response to the root's last reference being put. - Batch more TLB flushes with an RCU trick. Whoever frees the paging structure now holds RCU as a proxy for all vCPUs running in the guest, i.e. to prolongs the grace period on their behalf. It then kicks the the vCPUs out of guest mode before doing rcu_read_unlock(). Generic: - Introduce __vcalloc and use it for very large allocations that need memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits) KVM: use kvcalloc for array allocations KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 kvm: x86: Require const tsc for RT KVM: x86: synthesize CPUID leaf 0x80000021h if useful KVM: x86: add support for CPUID leaf 0x80000021 KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for get_mt_mask Revert "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()" kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU KVM: arm64: fix typos in comments KVM: arm64: Generalise VM features into a set of flags KVM: s390: selftests: Add error memop tests KVM: s390: selftests: Add more copy memop tests KVM: s390: selftests: Add named stages for memop test KVM: s390: selftests: Add macro as abstraction for MEM_OP KVM: s390: selftests: Split memop tests KVM: s390x: fix SCK locking RISC-V: KVM: Implement SBI HSM suspend call RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function RISC-V: Add SBI HSM suspend related defines RISC-V: KVM: Implement SBI v0.3 SRST extension ...
2022-03-21Merge tag 'docs-5.18' of git://git.lwn.net/linuxGravatar Linus Torvalds 1-1/+5
Pull documentation updates from Jonathan Corbet: "It has been a moderately busy cycle for documentation; some of the highlights are: - Numerous PDF-generation improvements - Kees's new document with guidelines for researchers studying the development community. - The ongoing stream of Chinese translations - Thorsten's new document on regression handling - A major reworking of the internal documentation for the kernel-doc script. Plus the usual stream of typo fixes and such" * tag 'docs-5.18' of git://git.lwn.net/linux: (80 commits) docs/kernel-parameters: update description of mem= docs/zh_CN: Add sched-nice-design Chinese translation docs: scheduler: Convert schedutil.txt to ReST Docs: ktap: add code-block type docs: serial: fix a reference file name in driver.rst docs: UML: Mention telnetd for port channel docs/zh_CN: add damon reclaim translation docs/zh_CN: add damon usage translation docs/zh_CN: add admin-guide damon start translation docs/zh_CN: add admin-guide damon index translation docs/zh_CN: Refactoring the admin-guide directory index zh_CN: Add translation for admin-guide/mm/index.rst zh_CN: Add translations for admin-guide/mm/ksm.rst Add Chinese translation for vm/ksm.rst docs/zh_CN: Add sched-stats Chinese translation docs/zh_CN: add devicetree of_unittest translation docs/zh_CN: add devicetree usage-model translation docs/zh_CN: add devicetree index translation Documentation: describe how to apply incremental stable patches docs/zh_CN: add peci subsystem translation ...
2022-03-21KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2Gravatar Oliver Upton 1-0/+50
KVM_CAP_DISABLE_QUIRKS is irrevocably broken. The capability does not advertise the set of quirks which may be disabled to userspace, so it is impossible to predict the behavior of KVM. Worse yet, KVM_CAP_DISABLE_QUIRKS will tolerate any value for cap->args[0], meaning it fails to reject attempts to set invalid quirk bits. The only valid workaround for the quirky quirks API is to add a new CAP. Actually advertise the set of quirks that can be disabled to userspace so it can predict KVM's behavior. Reject values for cap->args[0] that contain invalid bits. Finally, add documentation for the new capability and describe the existing quirks. Signed-off-by: Oliver Upton <oupton@google.com> Message-Id: <20220301060351.442881-5-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-18Merge tag 'kvmarm-5.18' of ↵Gravatar Paolo Bonzini 2-46/+82
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 5.18 - Proper emulation of the OSLock feature of the debug architecture - Scalibility improvements for the MMU lock when dirty logging is on - New VMID allocator, which will eventually help with SVA in VMs - Better support for PMUs in heterogenous systems - PSCI 1.1 support, enabling support for SYSTEM_RESET2 - Implement CONFIG_DEBUG_LIST at EL2 - Make CONFIG_ARM64_ERRATUM_2077057 default y - Reduce the overhead of VM exit when no interrupt is pending - Remove traces of 32bit ARM host support from the documentation - Updated vgic selftests - Various cleanups, doc updates and spelling fixes
2022-03-11docs: UML: Mention telnetd for port channelGravatar Vincent Whitchurch 1-1/+5
It is not obvious from the documentation that using the "port" channel for the console requires telnetd to be installed (see port_connection() in arch/um/drivers/port_user.c). Mention this, and the fact that UML will not boot until a client connects. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Link: https://lore.kernel.org/r/20220310124230.3069354-1-vincent.whitchurch@axis.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-03-09Merge branch kvm-arm64/misc-5.18 into kvmarm-master/nextGravatar Marc Zyngier 2-45/+44
* kvm-arm64/misc-5.18: : . : Misc fixes for KVM/arm64 5.18: : : - Drop unused kvm parameter to kvm_psci_version() : : - Implement CONFIG_DEBUG_LIST at EL2 : : - Make CONFIG_ARM64_ERRATUM_2077057 default y : : - Only do the interrupt dance if we have exited because of an interrupt : : - Remove traces of 32bit ARM host support from the documentation : . Documentation: KVM: Update documentation to indicate KVM is arm64-only KVM: arm64: Only open the interrupt window on exit due to an interrupt KVM: arm64: Enable Cortex-A510 erratum 2077057 by default Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-03-09Documentation: KVM: Update documentation to indicate KVM is arm64-onlyGravatar Oliver Upton 2-45/+44
KVM support for 32-bit ARM hosts (KVM/arm) has been removed from the kernel since commit 541ad0150ca4 ("arm: Remove 32bit KVM host support"). There still exists some remnants of the old architecture in the KVM documentation. Remove all traces of 32-bit host support from the documentation. Note that AArch32 guests are still supported. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220308172856.2997250-1-oupton@google.com
2022-03-04Merge branch 'kvm-bugfixes' into HEADGravatar Paolo Bonzini 1-1/+1
Merge bugfixes from 5.17 before merging more tricky work.
2022-03-01KVM: Drop KVM_REQ_MMU_RELOAD and update vcpu-requests.rst documentationGravatar Sean Christopherson 1-4/+3
Remove the now unused KVM_REQ_MMU_RELOAD, shift KVM_REQ_VM_DEAD into the unoccupied space, and update vcpu-requests.rst, which was missing an entry for KVM_REQ_VM_DEAD. Switching KVM_REQ_VM_DEAD to entry '1' also fixes the stale comment about bits 4-7 being reserved. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Message-Id: <20220225182248.3812651-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-25Merge branch kvm-arm64/psci-1.1 into kvmarm-master/nextGravatar Marc Zyngier 1-0/+5
* kvm-arm64/psci-1.1: : . : Limited PSCI-1.1 support from Will Deacon: : : This small series exposes the PSCI SYSTEM_RESET2 call to guests, which : allows the propagation of a "reset_type" and a "cookie" back to the VMM. : Although Linux guests only ever pass 0 for the type ("SYSTEM_WARM_RESET"), : the vendor-defined range can be used by a bootloader to provide additional : information about the reset, such as an error code. : . KVM: arm64: Remove unneeded semicolons KVM: arm64: Indicate SYSTEM_RESET2 in kvm_run::system_event flags field KVM: arm64: Expose PSCI SYSTEM_RESET2 call to the guest KVM: arm64: Bump guest PSCI version to 1.1 Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-02-25KVM: x86: Provide per VM capability for disabling PMU virtualizationGravatar David Dunn 1-0/+22
Add a new capability, KVM_CAP_PMU_CAPABILITY, that takes a bitmask of settings/features to allow userspace to configure PMU virtualization on a per-VM basis. For now, support a single flag, KVM_PMU_CAP_DISABLE, to allow disabling PMU virtualization for a VM even when KVM is configured with enable_pmu=true a module level. To keep KVM simple, disallow changing VM's PMU configuration after vCPUs have been created. Signed-off-by: David Dunn <daviddunn@google.com> Message-Id: <20220223225743.2703915-2-daviddunn@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-24Merge branch 'kvm-ppc-cap-210' into kvm-next-5.18Gravatar Paolo Bonzini 1-0/+14
2022-02-22Merge branch 'kvm-ppc-cap-210' into kvm-masterGravatar Paolo Bonzini 1-0/+14
By request of Nick Piggin: > Patch 3 requires a KVM_CAP_PPC number allocated. QEMU maintainers are > happy with it (link in changelog) just waiting on KVM upstreaming. Do > you have objections to the series going to ppc/kvm tree first, or > another option is you could take patch 3 alone first (it's relatively > independent of the other 2) and ppc/kvm gets it from you?
2022-02-22KVM: PPC: reserve capability 210 for KVM_CAP_PPC_AIL_MODE_3Gravatar Nicholas Piggin 1-0/+14
Add KVM_CAP_PPC_AIL_MODE_3 to advertise the capability to set the AIL resource mode to 3 with the H_SET_MODE hypercall. This capability differs between processor types and KVM types (PR, HV, Nested HV), and affects guest-visible behaviour. QEMU will implement a cap-ail-mode-3 to control this behaviour[1], and use the KVM CAP if available to determine KVM support[2]. Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-22KVM: s390: Clarify key argument for MEM_OP in api docsGravatar Janis Schoetterl-Glausch 1-1/+1
Clarify that the key argument represents the access key, not the whole storage key. Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Link: https://lore.kernel.org/r/20220221143657.3712481-1-scgl@linux.ibm.com Fixes: 5e35d0eb472b ("KVM: s390: Update api documentation for memop ioctl") Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-02-21KVM: arm64: Indicate SYSTEM_RESET2 in kvm_run::system_event flags fieldGravatar Will Deacon 1-0/+5
When handling reset and power-off PSCI calls from the guest, we initialise X0 to PSCI_RET_INTERNAL_FAILURE in case the VMM tries to re-run the vCPU after issuing the call. Unfortunately, this also means that the VMM cannot see which PSCI call was issued and therefore cannot distinguish between PSCI SYSTEM_RESET and SYSTEM_RESET2 calls, which is necessary in order to determine the validity of the "reset_type" in X1. Allocate bit 0 of the previously unused 'flags' field of the system_event structure so that we can indicate the PSCI call used to initiate the reset. Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220221153524.15397-4-will@kernel.org
2022-02-17KVM: x86: Add KVM_CAP_ENABLE_CAP to x86Gravatar Aaron Lewis 1-1/+1
Follow the precedent set by other architectures that support the VCPU ioctl, KVM_ENABLE_CAP, and advertise the VM extension, KVM_CAP_ENABLE_CAP. This way, userspace can ensure that KVM_ENABLE_CAP is available on a vcpu before using it. Fixes: 5c919412fe61 ("kvm/x86: Hyper-V synthetic interrupt controller") Signed-off-by: Aaron Lewis <aaronlewis@google.com> Message-Id: <20220214212950.1776943-1-aaronlewis@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-14KVM: s390: Update api documentation for memop ioctlGravatar Janis Schoetterl-Glausch 1-22/+90
Document all currently existing operations, flags and explain under which circumstances they are available. Document the recently introduced absolute operations and the storage key protection flag, as well as the existing SIDA operations. Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20220211182215.2730017-10-scgl@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-02-08KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPUGravatar Alexandru Elisei 1-1/+5
Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU device ioctl. If the VCPU is scheduled on a physical CPU which has a different PMU, the perf events needed to emulate a guest PMU won't be scheduled in and the guest performance counters will stop counting. Treat it as an userspace error and refuse to run the VCPU in this situation. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127161759.53553-7-alexandru.elisei@arm.com
2022-02-08KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attributeGravatar Alexandru Elisei 1-0/+28
When KVM creates an event and there are more than one PMUs present on the system, perf_init_event() will go through the list of available PMUs and will choose the first one that can create the event. The order of the PMUs in this list depends on the probe order, which can change under various circumstances, for example if the order of the PMU nodes change in the DTB or if asynchronous driver probing is enabled on the kernel command line (with the driver_async_probe=armv8-pmu option). Another consequence of this approach is that on heteregeneous systems all virtual machines that KVM creates will use the same PMU. This might cause unexpected behaviour for userspace: when a VCPU is executing on the physical CPU that uses this default PMU, PMU events in the guest work correctly; but when the same VCPU executes on another CPU, PMU events in the guest will suddenly stop counting. Fortunately, perf core allows user to specify on which PMU to create an event by using the perf_event_attr->type field, which is used by perf_init_event() as an index in the radix tree of available PMUs. Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU attribute to allow userspace to specify the arm_pmu that KVM will use when creating events for that VCPU. KVM will make no attempt to run the VCPU on the physical CPUs that share the PMU, leaving it up to userspace to manage the VCPU threads' affinity accordingly. To ensure that KVM doesn't expose an asymmetric system to the guest, the PMU set for one VCPU will be used by all other VCPUs. Once a VCPU has run, the PMU cannot be changed in order to avoid changing the list of available events for a VCPU, or to change the semantics of existing events. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127161759.53553-6-alexandru.elisei@arm.com
2022-02-08KVM: arm64: Do not change the PMU event filter after a VCPU has runGravatar Marc Zyngier 1-1/+1
Userspace can specify which events a guest is allowed to use with the KVM_ARM_VCPU_PMU_V3_FILTER attribute. The list of allowed events can be identified by a guest from reading the PMCEID{0,1}_EL0 registers. Changing the PMU event filter after a VCPU has run can cause reads of the registers performed before the filter is changed to return different values than reads performed with the new event filter in place. The architecture defines the two registers as read-only, and this behaviour contradicts that. Keep track when the first VCPU has run and deny changes to the PMU event filter to prevent this from happening. Signed-off-by: Marc Zyngier <maz@kernel.org> [ Alexandru E: Added commit message, updated ioctl documentation ] Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127161759.53553-2-alexandru.elisei@arm.com
2022-01-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmGravatar Linus Torvalds 1-1/+3
Pull kvm fixes from Paolo Bonzini: "Two larger x86 series: - Redo incorrect fix for SEV/SMAP erratum - Windows 11 Hyper-V workaround Other x86 changes: - Various x86 cleanups - Re-enable access_tracking_perf_test - Fix for #GP handling on SVM - Fix for CPUID leaf 0Dh in KVM_GET_SUPPORTED_CPUID - Fix for ICEBP in interrupt shadow - Avoid false-positive RCU splat - Enable Enlightened MSR-Bitmap support for real ARM: - Correctly update the shadow register on exception injection when running in nVHE mode - Correctly use the mm_ops indirection when performing cache invalidation from the page-table walker - Restrict the vgic-v3 workaround for SEIS to the two known broken implementations Generic code changes: - Dead code cleanup" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits) KVM: eventfd: Fix false positive RCU usage warning KVM: nVMX: Allow VMREAD when Enlightened VMCS is in use KVM: nVMX: Implement evmcs_field_offset() suitable for handle_vmread() KVM: nVMX: Rename vmcs_to_field_offset{,_table} KVM: nVMX: eVMCS: Filter out VM_EXIT_SAVE_VMX_PREEMPTION_TIMER KVM: nVMX: Also filter MSR_IA32_VMX_TRUE_PINBASED_CTLS when eVMCS selftests: kvm: check dynamic bits against KVM_X86_XCOMP_GUEST_SUPP KVM: x86: add system attribute to retrieve full set of supported xsave states KVM: x86: Add a helper to retrieve userspace address from kvm_device_attr selftests: kvm: move vm_xsave_req_perm call to amx_test KVM: x86: Sync the states size with the XCR0/IA32_XSS at, any time KVM: x86: Update vCPU's runtime CPUID on write to MSR_IA32_XSS KVM: x86: Keep MSR_IA32_XSS unchanged for INIT KVM: x86: Free kvm_cpuid_entry2 array on post-KVM_RUN KVM_SET_CPUID{,2} KVM: nVMX: WARN on any attempt to allocate shadow VMCS for vmcs02 KVM: selftests: Don't skip L2's VMCALL in SMM test for SVM guest KVM: x86: Check .flags in kvm_cpuid_check_equal() too KVM: x86: Forcibly leave nested virt when SMM state is toggled KVM: SVM: drop unnecessary code in svm_hv_vmcb_dirty_nested_enlightenments() KVM: SVM: hyper-v: Enable Enlightened MSR-Bitmap support for real ...
2022-01-28KVM: x86: add system attribute to retrieve full set of supported xsave statesGravatar Paolo Bonzini 1-1/+3
Because KVM_GET_SUPPORTED_CPUID is meant to be passed (by simple-minded VMMs) to KVM_SET_CPUID2, it cannot include any dynamic xsave states that have not been enabled. Probing those, for example so that they can be passed to ARCH_REQ_XCOMP_GUEST_PERM, requires a new ioctl or arch_prctl. The latter is in fact worse, even though that is what the rest of the API uses, because it would require supported_xcr0 to be moved from the KVM module to the kernel just for this use. In addition, the value would be nonsensical (or an error would have to be returned) until the KVM module is loaded in. Therefore, to limit the growth of system ioctls, add a /dev/kvm variant of KVM_{GET,HAS}_DEVICE_ATTR, and implement it in x86 with just one group (0) and attribute (KVM_X86_XCOMP_GUEST_SUPP). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmGravatar Linus Torvalds 1-3/+3
Pull more kvm updates from Paolo Bonzini: "Generic: - selftest compilation fix for non-x86 - KVM: avoid warning on s390 in mark_page_dirty x86: - fix page write-protection bug and improve comments - use binary search to lookup the PMU event filter, add test - enable_pmu module parameter support for Intel CPUs - switch blocked_vcpu_on_cpu_lock to raw spinlock - cleanups of blocked vCPU logic - partially allow KVM_SET_CPUID{,2} after KVM_RUN (5.16 regression) - various small fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits) docs: kvm: fix WARNINGs from api.rst selftests: kvm/x86: Fix the warning in lib/x86_64/processor.c selftests: kvm/x86: Fix the warning in pmu_event_filter_test.c kvm: selftests: Do not indent with spaces kvm: selftests: sync uapi/linux/kvm.h with Linux header selftests: kvm: add amx_test to .gitignore KVM: SVM: Nullify vcpu_(un)blocking() hooks if AVIC is disabled KVM: SVM: Move svm_hardware_setup() and its helpers below svm_x86_ops KVM: SVM: Drop AVIC's intermediate avic_set_running() helper KVM: VMX: Don't do full kick when handling posted interrupt wakeup KVM: VMX: Fold fallback path into triggering posted IRQ helper KVM: VMX: Pass desired vector instead of bool for triggering posted IRQ KVM: VMX: Don't do full kick when triggering posted interrupt "fails" KVM: SVM: Skip AVIC and IRTE updates when loading blocking vCPU KVM: SVM: Use kvm_vcpu_is_blocking() in AVIC load to handle preemption KVM: SVM: Remove unnecessary APICv/AVIC update in vCPU unblocking path KVM: SVM: Don't bother checking for "running" AVIC when kicking for IPIs KVM: SVM: Signal AVIC doorbell iff vCPU is in guest mode KVM: x86: Remove defunct pre_block/post_block kvm_x86_ops hooks KVM: x86: Unexport LAPIC's switch_to_{hv,sw}_timer() helpers ...
2022-01-20docs: kvm: fix WARNINGs from api.rstGravatar Wei Wang 1-3/+3
Use the api number 134 for KVM_GET_XSAVE2, instead of 42, which has been used by KVM_GET_XSAVE. Also, fix the WARNINGs of the underlines being too short. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Message-Id: <20220120045003.315177-1-wei.w.wang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmGravatar Linus Torvalds 2-9/+84
Pull kvm updates from Paolo Bonzini: "RISCV: - Use common KVM implementation of MMU memory caches - SBI v0.2 support for Guest - Initial KVM selftests support - Fix to avoid spurious virtual interrupts after clearing hideleg CSR - Update email address for Anup and Atish ARM: - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update s390: - fix sigp sense/start/stop/inconsistency - cleanups x86: - Clean up some function prototypes more - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery - completely remove potential TOC/TOU races in nested SVM consistency checks - update some PMCs on emulated instructions - Intel AMX support (joint work between Thomas and Intel) - large MMU cleanups - module parameter to disable PMU virtualization - cleanup register cache - first part of halt handling cleanups - Hyper-V enlightened MSR bitmap support for nested hypervisors Generic: - clean up Makefiles - introduce CONFIG_HAVE_KVM_DIRTY_RING - optimize memslot lookup using a tree - optimize vCPU array usage by converting to xarray" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (268 commits) x86/fpu: Fix inline prefix warnings selftest: kvm: Add amx selftest selftest: kvm: Move struct kvm_x86_state to header selftest: kvm: Reorder vcpu_load_state steps for AMX kvm: x86: Disable interception for IA32_XFD on demand x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() kvm: selftests: Add support for KVM_CAP_XSAVE2 kvm: x86: Add support for getting/setting expanded xstate buffer x86/fpu: Add uabi_size to guest_fpu kvm: x86: Add CPUID support for Intel AMX kvm: x86: Add XCR0 support for Intel AMX kvm: x86: Disable RDMSR interception of IA32_XFD_ERR kvm: x86: Emulate IA32_XFD_ERR for guest kvm: x86: Intercept #NM for saving IA32_XFD_ERR x86/fpu: Prepare xfd_err in struct fpu_guest kvm: x86: Add emulation for IA32_XFD x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2 x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM x86/fpu: Add guest support to xfd_enable_feature() ...
2022-01-14kvm: x86: Add support for getting/setting expanded xstate bufferGravatar Guang Zeng 1-2/+40
With KVM_CAP_XSAVE, userspace uses a hardcoded 4KB buffer to get/set xstate data from/to KVM. This doesn't work when dynamic xfeatures (e.g. AMX) are exposed to the guest as they require a larger buffer size. Introduce a new capability (KVM_CAP_XSAVE2). Userspace VMM gets the required xstate buffer size via KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). KVM_SET_XSAVE is extended to work with both legacy and new capabilities by doing properly-sized memdup_user() based on the guest fpu container. KVM_GET_XSAVE is kept for backward-compatible reason. Instead, KVM_GET_XSAVE2 is introduced under KVM_CAP_XSAVE2 as the preferred interface for getting xstate buffer (4KB or larger size) from KVM (Link: https://lkml.org/lkml/2021/12/15/510) Also, update the api doc with the new KVM_GET_XSAVE2 ioctl. Signed-off-by: Guang Zeng <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-19-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUIDGravatar Jing Liu 1-0/+4
KVM_GET_SUPPORTED_CPUID should not include any dynamic xstates in CPUID[0xD] if they have not been requested with prctl. Otherwise a process which directly passes KVM_GET_SUPPORTED_CPUID to KVM_SET_CPUID2 would now fail even if it doesn't intend to use a dynamically enabled feature. Userspace must know that prctl is required and allocate >4K xstate buffer before setting any dynamic bit. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-5-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel deliveryGravatar David Woodhouse 1-0/+21
This adds basic support for delivering 2 level event channels to a guest. Initially, it only supports delivery via the IRQ routing table, triggered by an eventfd. In order to do so, it has a kvm_xen_set_evtchn_fast() function which will use the pre-mapped shared_info page if it already exists and is still valid, while the slow path through the irqfd_inject workqueue will remap the shared_info page if necessary. It sets the bits in the shared_info page but not the vcpu_info; that is deferred to __kvm_xen_has_interrupt() which raises the vector to the appropriate vCPU. Add a 'verbose' mode to xen_shinfo_test while adding test cases for this. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211210163625.2886-5-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: x86/xen: Maintain valid mapping of Xen shared_info pageGravatar David Woodhouse 1-0/+12
Use the newly reinstated gfn_to_pfn_cache to maintain a kernel mapping of the Xen shared_info page so that it can be accessed in atomic context. Note that we do not participate in dirty tracking for the shared info page and we do not explicitly mark it dirty every single tim we deliver an event channel interrupts. We wouldn't want to do that even if we *did* have a valid vCPU context with which to do so. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211210163625.2886-4-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-21Merge tag 'kvm-s390-next-5.17-1' of ↵Gravatar Paolo Bonzini 1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fix and cleanup - fix sigp sense/start/stop/inconsistency - cleanups
2021-12-17crypto: ccp - Add SEV_INIT_EX supportGravatar David Rientjes 1-0/+6
Add new module parameter to allow users to use SEV_INIT_EX instead of SEV_INIT. This helps users who lock their SPI bus to use the PSP for SEV functionality. The 'init_ex_path' parameter defaults to NULL which means the kernel will use SEV_INIT, if a path is specified SEV_INIT_EX will be used with the data found at the path. On certain PSP commands this file is written to as the PSP updates the NV memory region. Depending on file system initialization this file open may fail during module init but the CCP driver for SEV already has sufficient retries for platform initialization. During normal operation of PSP system and SEV commands if the PSP has not been initialized it is at run time. If the file at 'init_ex_path' does not exist the PSP will not be initialized. The user must create the file prior to use with 32Kb of 0xFFs per spec. Signed-off-by: David Rientjes <rientjes@google.com> Co-developed-by: Peter Gonda <pgonda@google.com> Signed-off-by: Peter Gonda <pgonda@google.com> Reviewed-by: Marc Orr <marcorr@google.com> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Marc Orr <marcorr@google.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: David Rientjes <rientjes@google.com> Cc: John Allen <john.allen@amd.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-12-08KVM: X86: Rename gpte_is_8_bytes to has_4_byte_gpte and invert the directionGravatar Lai Jiangshan 1-4/+4
This bit is very close to mean "role.quadrant is not in use", except that it is false also when the MMU is mapping guest physical addresses directly. In that case, role.quadrant is indeed not in use, but there are no guest PTEs at all. Changing the name and direction of the bit removes the special case, since a guest with paging disabled, or not considering guest paging structures as is the case for two-dimensional paging, does not have to deal with 4-byte guest PTEs. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20211124122055.64424-10-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-07KVM: s390: Fix names of skey constants in api documentationGravatar Janis Schoetterl-Glausch 1-3/+3
They are defined in include/uapi/linux/kvm.h as KVM_S390_GET_SKEYS_NONE and KVM_S390_SKEYS_MAX, but the api documetation talks of KVM_S390_GET_KEYS_NONE and KVM_S390_SKEYS_ALLOC_MAX respectively. Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Message-Id: <20211118102522.569660-1-scgl@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2021-11-11Merge branch 'kvm-sev-move-context' into kvm-masterGravatar Paolo Bonzini 1-0/+14
Add support for AMD SEV and SEV-ES intra-host migration support. Intra host migration provides a low-cost mechanism for userspace VMM upgrades. In the common case for intra host migration, we can rely on the normal ioctls for passing data from one VMM to the next. SEV, SEV-ES, and other confidential compute environments make most of this information opaque, and render KVM ioctls such as "KVM_GET_REGS" irrelevant. As a result, we need the ability to pass this opaque metadata from one VMM to the next. The easiest way to do this is to leave this data in the kernel, and transfer ownership of the metadata from one KVM VM (or vCPU) to the next. In-kernel hand off makes it possible to move any data that would be unsafe/impossible for the kernel to hand directly to userspace, and cannot be reproduced using data that can be handed to userspace. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-11KVM: SEV: Add support for SEV intra host migrationGravatar Peter Gonda 1-0/+14
For SEV to work with intra host migration, contents of the SEV info struct such as the ASID (used to index the encryption key in the AMD SP) and the list of memory regions need to be transferred to the target VM. This change adds a commands for a target VMM to get a source SEV VM's sev info. Signed-off-by: Peter Gonda <pgonda@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Marc Orr <marcorr@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20211021174303.385706-3-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-04Merge tag 'char-misc-5.16-rc1' of ↵Gravatar Linus Torvalds 1-8/+13
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big set of char and misc and other tiny driver subsystem updates for 5.16-rc1. Loads of things in here, all of which have been in linux-next for a while with no reported problems (except for one called out below.) Included are: - habanana labs driver updates, including dma_buf usage, reviewed and acked by the dma_buf maintainers - iio driver update (going through this tree not staging as they really do not belong going through that tree anymore) - counter driver updates - hwmon driver updates that the counter drivers needed, acked by the hwmon maintainer - xillybus driver updates - binder driver updates - extcon driver updates - dma_buf module namespaces added (will cause a build error in arm64 for allmodconfig, but that change is on its way through the drm tree) - lkdtm driver updates - pvpanic driver updates - phy driver updates - virt acrn and nitr_enclaves driver updates - smaller char and misc driver updates" * tag 'char-misc-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (386 commits) comedi: dt9812: fix DMA buffers on stack comedi: ni_usb6501: fix NULL-deref in command paths arm64: errata: Enable TRBE workaround for write to out-of-range address arm64: errata: Enable workaround for TRBE overwrite in FILL mode coresight: trbe: Work around write to out of range coresight: trbe: Make sure we have enough space coresight: trbe: Add a helper to determine the minimum buffer size coresight: trbe: Workaround TRBE errata overwrite in FILL mode coresight: trbe: Add infrastructure for Errata handling coresight: trbe: Allow driver to choose a different alignment coresight: trbe: Decouple buffer base from the hardware base coresight: trbe: Add a helper to pad a given buffer area coresight: trbe: Add a helper to calculate the trace generated coresight: trbe: Defer the probe on offline CPUs coresight: trbe: Fix incorrect access of the sink specific data coresight: etm4x: Add ETM PID for Kryo-5XX coresight: trbe: Prohibit trace before disabling TRBE coresight: trbe: End the AUX handle on truncation coresight: trbe: Do not truncate buffer on IRQ coresight: trbe: Fix handling of spurious interrupts ...
2021-11-02Merge tag 'docs-5.16' of git://git.lwn.net/linuxGravatar Linus Torvalds 1-57/+62
Pull documentation updates from Jonathan Corbet: "This is a relatively unexciting cycle for documentation. - Some small scripts/kerneldoc fixes - More Chinese translation work, but at a much reduced rate. - The tip-tree maintainer's handbook ...plus the usual array of build fixes, typo fixes, etc" * tag 'docs-5.16' of git://git.lwn.net/linux: (53 commits) kernel-doc: support DECLARE_PHY_INTERFACE_MASK() docs/zh_CN: add core-api xarray translation docs/zh_CN: add core-api assoc_array translation speakup: Fix typo in documentation "boo" -> "boot" docs: submitting-patches: make section about the Link: tag more explicit docs: deprecated.rst: Clarify open-coded arithmetic with literals scripts: documentation-file-ref-check: fix bpf selftests path scripts: documentation-file-ref-check: ignore hidden files coding-style.rst: trivial: fix location of driver model macros docs: f2fs: fix text alignment docs/zh_CN add PCI pci.rst translation docs/zh_CN add PCI index.rst translation docs: translations: zh_CN: memory-hotplug.rst: fix a typo docs: translations: zn_CN: irq-affinity.rst: add a missing extension block: add documentation for inflight scripts: kernel-doc: Ignore __alloc_size() attribute docs: pdfdocs: Adjust \headheight for fancyhdr docs: UML: user_mode_linux_howto_v2 edits docs: use the lore redirector everywhere docs: proc.rst: mountinfo: align columns ...
2021-10-18selftests: KVM: Introduce system counter offset testGravatar Oliver Upton 1-14/+27
Introduce a KVM selftest to verify that userspace manipulation of the TSC (via the new vCPU attribute) results in the correct behavior within the guest. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Oliver Upton <oupton@google.com> Message-Id: <20210916181555.973085-6-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18KVM: x86: Expose TSC offset controls to userspaceGravatar Oliver Upton 1-0/+57
To date, VMM-directed TSC synchronization and migration has been a bit messy. KVM has some baked-in heuristics around TSC writes to infer if the VMM is attempting to synchronize. This is problematic, as it depends on host userspace writing to the guest's TSC within 1 second of the last write. A much cleaner approach to configuring the guest's views of the TSC is to simply migrate the TSC offset for every vCPU. Offsets are idempotent, and thus not subject to change depending on when the VMM actually reads/writes values from/to KVM. The VMM can then read the TSC once with KVM_GET_CLOCK to capture a (realtime, host_tsc) pair at the instant when the guest is paused. Cc: David Matlack <dmatlack@google.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210916181538.968978-8-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCKGravatar Oliver Upton 1-9/+39
Handling the migration of TSCs correctly is difficult, in part because Linux does not provide userspace with the ability to retrieve a (TSC, realtime) clock pair for a single instant in time. In lieu of a more convenient facility, KVM can report similar information in the kvm_clock structure. Provide userspace with a host TSC & realtime pair iff the realtime clock is based on the TSC. If userspace provides KVM_SET_CLOCK with a valid realtime value, advance the KVM clock by the amount of elapsed time. Do not step the KVM clock backwards, though, as it is a monotonic oscillator. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210916181538.968978-5-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-12docs: UML: user_mode_linux_howto_v2 editsGravatar Randy Dunlap 1-57/+62
Fix various typos, command syntax, punctuation, capitalization, and whitespace. Fixes: 04301bf5b072 ("docs: replace the old User Mode Linux HowTo with a new one") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: linux-um@lists.infradead.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Acked-By: anton ivanov <anton.ivanov@cambridgegreys.com> Link: https://lore.kernel.org/r/20211010064827.3405-1-rdunlap@infradead.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2021-10-05Merge tag 'kvm-riscv-5.16-1' of git://github.com/kvm-riscv/linux into HEADGravatar Paolo Bonzini 1-9/+184
Initial KVM RISC-V support Following features are supported by the initial KVM RISC-V support: 1. No RISC-V specific KVM IOCTL 2. Loadable KVM RISC-V module 3. Minimal possible KVM world-switch which touches only GPRs and few CSRs 4. Works on both RV64 and RV32 host 5. Full Guest/VM switch via vcpu_get/vcpu_put infrastructure 6. KVM ONE_REG interface for VCPU register access from KVM user-space 7. Interrupt controller emulation in KVM user-space 8. Timer and IPI emuation in kernel 9. Both Sv39x4 and Sv48x4 supported for RV64 host 10. MMU notifiers supported 11. Generic dirty log supported 12. FP lazy save/restore supported 13. SBI v0.1 emulation for Guest/VM 14. Forward unhandled SBI calls to KVM user-space 15. Hugepage support for Guest/VM 16. IOEVENTFD support for Vhost
2021-10-04RISC-V: KVM: Document RISC-V specific parts of KVM APIGravatar Anup Patel 1-9/+184
Document RISC-V specific parts of the KVM API, such as: - The interrupt numbers passed to the KVM_INTERRUPT ioctl. - The states supported by the KVM_{GET,SET}_MP_STATE ioctls. - The registers supported by the KVM_{GET,SET}_ONE_REG interface and the encoding of those register ids. - The exit reason KVM_EXIT_RISCV_SBI for SBI calls forwarded to userspace tool. CC: Jonathan Corbet <corbet@lwn.net> CC: linux-doc@vger.kernel.org Signed-off-by: Anup Patel <anup.patel@wdc.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-30kvm: rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDSGravatar Juergen Gross 2-2/+2
KVM_MAX_VCPU_ID is not specifying the highest allowed vcpu-id, but the number of allowed vcpu-ids. This has already led to confusion, so rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDS to make its semantics more clear Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210913135745.13944-3-jgross@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-14nitro_enclaves: Update documentation for Arm64 supportGravatar Andra Paraschiv 1-8/+13
Add references for hugepages and booting steps for Arm64. Include info about the current supported architectures for the NE kernel driver. Reviewed-by: George-Aurelian Popescu <popegeo@amazon.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Andra Paraschiv <andraprs@amazon.com> Link: https://lore.kernel.org/r/20210827154930.40608-3-andraprs@amazon.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmGravatar Linus Torvalds 2-8/+34
Pull KVM updates from Paolo Bonzini: "ARM: - Page ownership tracking between host EL1 and EL2 - Rely on userspace page tables to create large stage-2 mappings - Fix incompatibility between pKVM and kmemleak - Fix the PMU reset state, and improve the performance of the virtual PMU - Move over to the generic KVM entry code - Address PSCI reset issues w.r.t. save/restore - Preliminary rework for the upcoming pKVM fixed feature - A bunch of MM cleanups - a vGIC fix for timer spurious interrupts - Various cleanups s390: - enable interpretation of specification exceptions - fix a vcpu_idx vs vcpu_id mixup x86: - fast (lockless) page fault support for the new MMU - new MMU now the default - increased maximum allowed VCPU count - allow inhibit IRQs on KVM_RUN while debugging guests - let Hyper-V-enabled guests run with virtualized LAPIC as long as they do not enable the Hyper-V "AutoEOI" feature - fixes and optimizations for the toggling of AMD AVIC (virtualized LAPIC) - tuning for the case when two-dimensional paging (EPT/NPT) is disabled - bugfixes and cleanups, especially with respect to vCPU reset and choosing a paging mode based on CR0/CR4/EFER - support for 5-level page table on AMD processors Generic: - MMU notifier invalidation callbacks do not take mmu_lock unless necessary - improved caching of LRU kvm_memory_slot - support for histogram statistics - add statistics for halt polling and remote TLB flush requests" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (210 commits) KVM: Drop unused kvm_dirty_gfn_invalid() KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted KVM: MMU: mark role_regs and role accessors as maybe unused KVM: MIPS: Remove a "set but not used" variable x86/kvm: Don't enable IRQ when IRQ enabled in kvm_wait KVM: stats: Add VM stat for remote tlb flush requests KVM: Remove unnecessary export of kvm_{inc,dec}_notifier_count() KVM: x86/mmu: Move lpage_disallowed_link further "down" in kvm_mmu_page KVM: x86/mmu: Relocate kvm_mmu_page.tdp_mmu_page for better cache locality Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()" KVM: x86/mmu: Remove unused field mmio_cached in struct kvm_mmu_page kvm: x86: Increase KVM_SOFT_MAX_VCPUS to 710 kvm: x86: Increase MAX_VCPUS to 1024 kvm: x86: Set KVM_MAX_VCPU_ID to 4*KVM_MAX_VCPUS KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulation KVM: x86/mmu: Don't freak out if pml5_root is NULL on 4-level host KVM: s390: index kvm->arch.idle_mask by vcpu_idx KVM: s390: Enable specification exception interpretation KVM: arm64: Trim guest debug exception handling KVM: SVM: Add 5-level page table support for SVM ...
2021-09-01Merge tag 'docs-5.15' of git://git.lwn.net/linuxGravatar Linus Torvalds 1-14/+18
Pull documentation updates from Jonathan Corbet: "Yet another set of documentation changes: - A reworking of PDF generation to yield better results for documents using CJK fonts in particular. - A new set of translations into traditional Chinese, a dialect for which I am assured there is a community of interested readers. - A lot more regular Chinese translation work as well. ... plus the usual assortment of updates, fixes, typo tweaks, etc" * tag 'docs-5.15' of git://git.lwn.net/linux: (55 commits) docs: sphinx-requirements: Move sphinx_rtd_theme to top docs: pdfdocs: Enable language-specific font choice of zh_TW translations docs: pdfdocs: Teach xeCJK about character classes of quotation marks docs: pdfdocs: Permit AutoFakeSlant for CJK fonts docs: pdfdocs: One-half spacing for CJK translations docs: pdfdocs: Add conf.py local to translations for ascii-art alignment docs: pdfdocs: Preserve inter-phrase space in Korean translations docs: pdfdocs: Choose Serif font as CJK mainfont if possible docs: pdfdocs: Add CJK-language-specific font settings docs: pdfdocs: Refactor config for CJK document scripts/kernel-doc: Override -Werror from KCFLAGS with KDOC_WERROR docs/zh_CN: Add zh_CN/accounting/psi.rst doc: align Italian translation Documentation/features/vm: riscv supports THP now docs/zh_CN: add infiniband user_verbs translation docs/zh_CN: add infiniband user_mad translation docs/zh_CN: add infiniband tag_matching translation docs/zh_CN: add infiniband sysfs translation docs/zh_CN: add infiniband opa_vnic translation docs/zh_CN: add infiniband ipoib translation ...
2021-08-20KVM: x86: implement KVM_GUESTDBG_BLOCKIRQGravatar Maxim Levitsky 1-0/+1
KVM_GUESTDBG_BLOCKIRQ will allow KVM to block all interrupts while running. This change is mostly intended for more robust single stepping of the guest and it has the following benefits when enabled: * Resuming from a breakpoint is much more reliable. When resuming execution from a breakpoint, with interrupts enabled, more often than not, KVM would inject an interrupt and make the CPU jump immediately to the interrupt handler and eventually return to the breakpoint, to trigger it again. From the user point of view it looks like the CPU never executed a single instruction and in some cases that can even prevent forward progress, for example, when the breakpoint is placed by an automated script (e.g lx-symbols), which does something in response to the breakpoint and then continues the guest automatically. If the script execution takes enough time for another interrupt to arrive, the guest will be stuck on the same breakpoint RIP forever. * Normal single stepping is much more predictable, since it won't land the debugger into an interrupt handler. * RFLAGS.TF has less chance to be leaked to the guest: We set that flag behind the guest's back to do single stepping but if single step lands us into an interrupt/exception handler it will be leaked to the guest in the form of being pushed to the stack. This doesn't completely eliminate this problem as exceptions can still happen, but at least this reduces the chances of this happening. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210811122927.900604-6-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-20KVM: stats: Update doc for histogram statisticsGravatar Jing Zhang 1-8/+27
Add documentations for linear and logarithmic histogram statistics. Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210802165633.1866976-3-jingzhangos@google.com> [Small changes to the phrasing. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>