aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel/smp.c
diff options
context:
space:
mode:
authorGravatar Thomas Gleixner <tglx@linutronix.de> 2023-06-15 22:33:57 +0200
committerGravatar Thomas Gleixner <tglx@linutronix.de> 2023-06-20 14:51:47 +0200
commitd7893093a7417527c0d73c9832244e65c9d0114f (patch)
tree95c8ee15bad1d6c51449392e5dae3f3a5c09bab2 /arch/x86/kernel/smp.c
parentx86/smp: Use dedicated cache-line for mwait_play_dead() (diff)
downloadlinux-d7893093a7417527c0d73c9832244e65c9d0114f.tar.gz
linux-d7893093a7417527c0d73c9832244e65c9d0114f.tar.bz2
linux-d7893093a7417527c0d73c9832244e65c9d0114f.zip
x86/smp: Cure kexec() vs. mwait_play_dead() breakage
TLDR: It's a mess. When kexec() is executed on a system with offline CPUs, which are parked in mwait_play_dead() it can end up in a triple fault during the bootup of the kexec kernel or cause hard to diagnose data corruption. The reason is that kexec() eventually overwrites the previous kernel's text, page tables, data and stack. If it writes to the cache line which is monitored by a previously offlined CPU, MWAIT resumes execution and ends up executing the wrong text, dereferencing overwritten page tables or corrupting the kexec kernels data. Cure this by bringing the offlined CPUs out of MWAIT into HLT. Write to the monitored cache line of each offline CPU, which makes MWAIT resume execution. The written control word tells the offlined CPUs to issue HLT, which does not have the MWAIT problem. That does not help, if a stray NMI, MCE or SMI hits the offlined CPUs as those make it come out of HLT. A follow up change will put them into INIT, which protects at least against NMI and SMI. Fixes: ea53069231f9 ("x86, hotplug: Use mwait to offline a processor, fix the legacy case") Reported-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615193330.492257119@linutronix.de
Diffstat (limited to 'arch/x86/kernel/smp.c')
-rw-r--r--arch/x86/kernel/smp.c5
1 files changed, 5 insertions, 0 deletions
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index d842875f986f..174d6232b87f 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -21,6 +21,7 @@
#include <linux/interrupt.h>
#include <linux/cpu.h>
#include <linux/gfp.h>
+#include <linux/kexec.h>
#include <asm/mtrr.h>
#include <asm/tlbflush.h>
@@ -157,6 +158,10 @@ static void native_stop_other_cpus(int wait)
if (atomic_cmpxchg(&stopping_cpu, -1, cpu) != -1)
return;
+ /* For kexec, ensure that offline CPUs are out of MWAIT and in HLT */
+ if (kexec_in_progress)
+ smp_kick_mwait_play_dead();
+
/*
* 1) Send an IPI on the reboot vector to all other CPUs.
*