Merge "Merge remote-tracking branch 'remotes/origin/tmp-b8ce823' into msm-4.14"

commit 3056452f15
Author: Linux Build Service Account
Date:   2018-01-03 16:43:32 -08:00
Committed-by: Gerrit - the friendly Code Review server

668 files changed, 6978 insertions(+), 3563 deletions(-)

==== changed file ====

@@ -11,7 +11,7 @@ Required properties:
   be used, but a device adhering to this binding may leave out all except
   for usbVID,PID.
 - reg: the port number which this device is connecting to, the range
-  is 1-31.
+  is 1-255.
 Example:

==== changed file ====

@@ -4,7 +4,7 @@ ORC unwinder
 Overview
 --------
 
-The kernel CONFIG_ORC_UNWINDER option enables the ORC unwinder, which is
+The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
 similar in concept to a DWARF unwinder.  The difference is that the
 format of the ORC data is much simpler than DWARF, which in turn allows
 the ORC unwinder to be much simpler and faster.

==== changed file ====

@@ -1,6 +1,4 @@
-<previous description obsolete, deleted>
-
 Virtual memory map with 4 level page tables:
 
 0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
@@ -14,13 +12,15 @@ ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
 ... unused hole ...
 ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
 ... unused hole ...
+fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
 ... unused hole ...
 ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
 ... unused hole ...
 ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0
-ffffffffa0000000 - ffffffffff5fffff (=1526 MB) module mapping space (variable)
-ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls
+ffffffffa0000000 - [fixmap start]   (~1526 MB) module mapping space (variable)
+[fixmap start]   - ffffffffff5fffff kernel-internal fixmap range
+ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
 ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
 
 Virtual memory map with 5 level page tables:
@@ -34,21 +34,24 @@ ff92000000000000 - ffd1ffffffffffff (=54 bits) vmalloc/ioremap space
 ffd2000000000000 - ffd3ffffffffffff (=49 bits) hole
 ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
 ... unused hole ...
-ffd8000000000000 - fff7ffffffffffff (=53 bits) kasan shadow memory (8PB)
+ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
 ... unused hole ...
+fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
 ... unused hole ...
 ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
 ... unused hole ...
 ffffffff80000000 - ffffffff9fffffff (=512 MB) kernel text mapping, from phys 0
-ffffffffa0000000 - ffffffffff5fffff (=1526 MB) module mapping space
-ffffffffff600000 - ffffffffffdfffff (=8 MB) vsyscalls
+ffffffffa0000000 - [fixmap start]   (~1526 MB) module mapping space
+[fixmap start]   - ffffffffff5fffff kernel-internal fixmap range
+ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
 ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
 
 Architecture defines a 64-bit virtual address. Implementations can support
 less. Currently supported are 48- and 57-bit virtual addresses. Bits 63
-through to the most-significant implemented bit are set to either all ones
-or all zero. This causes hole between user space and kernel addresses.
+through to the most-significant implemented bit are sign extended.
+This causes hole between user space and kernel addresses if you interpret them
+as unsigned.
 
 The direct mapping covers all memory in the system up to the highest
 memory address (this means in some cases it can also include PCI memory
@@ -58,9 +61,6 @@ vmalloc space is lazily synchronized into the different PML4/PML5 pages of
 the processes using the page fault handler, with init_top_pgt as
 reference.
 
-Current X86-64 implementations support up to 46 bits of address space (64 TB),
-which is our current limit. This expands into MBZ space in the page tables.
-
 We map EFI runtime services in the 'efi_pgd' PGD in a 64Gb large virtual
 memory window (this size is arbitrary, it can be raised later if needed).
 The mappings are not part of any other kernel PGD and are only available
@@ -72,5 +72,3 @@ following fixmap section.
 Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
 physical memory, vmalloc/ioremap space and virtual memory map are randomized.
 Their order is preserved but their base will be offset early at boot time.
-
--Andi Kleen, Jul 2004
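
The reworded sign-extension paragraph above can be sanity-checked outside the kernel. A minimal user-space sketch, assuming nothing beyond standard C (and the usual arithmetic right shift on signed values, as GCC/Clang implement it), of the canonicality rule it describes:

#include <stdio.h>
#include <stdint.h>

/* For an implementation with va_bits of virtual address space, bits 63
 * down to va_bits-1 must all equal bit va_bits-1 (sign extension). */
static int is_canonical(uint64_t va, int va_bits)
{
	int64_t sext = (int64_t)(va << (64 - va_bits)) >> (64 - va_bits);
	return (uint64_t)sext == va;
}

int main(void)
{
	printf("%d\n", is_canonical(0x00007fffffffffffULL, 48)); /* 1: last user address */
	printf("%d\n", is_canonical(0xffff800000000000ULL, 48)); /* 1: first kernel address */
	printf("%d\n", is_canonical(0x0000800000000000ULL, 48)); /* 0: inside the hole */
	return 0;
}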

==== changed file ====

@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 VERSION = 4
 PATCHLEVEL = 14
-SUBLEVEL = 5
+SUBLEVEL = 10
 EXTRAVERSION =
 NAME = Petit Gorille
@@ -377,8 +377,6 @@ LDFLAGS_MODULE  =
 CFLAGS_KERNEL	=
 AFLAGS_KERNEL	=
 LDFLAGS_vmlinux =
-CFLAGS_GCOV	:= -fprofile-arcs -ftest-coverage -fno-tree-loop-im $(call cc-disable-warning,maybe-uninitialized,)
-CFLAGS_KCOV	:= $(call cc-option,-fsanitize-coverage=trace-pc,)
 
 # Use USERINCLUDE when you must reference the UAPI directories only.
 USERINCLUDE    := \
@@ -397,21 +395,19 @@ LINUXINCLUDE    := \
 		-I$(objtree)/include \
 		$(USERINCLUDE)
 
-KBUILD_CPPFLAGS := -D__KERNEL__
+KBUILD_AFLAGS   := -D__ASSEMBLY__
 KBUILD_CFLAGS   := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
 		   -fno-strict-aliasing -fno-common -fshort-wchar \
 		   -Werror-implicit-function-declaration \
 		   -Wno-format-security \
-		   -std=gnu89 $(call cc-option,-fno-PIE)
+		   -std=gnu89
+KBUILD_CPPFLAGS := -D__KERNEL__
 KBUILD_AFLAGS_KERNEL :=
 KBUILD_CFLAGS_KERNEL :=
-KBUILD_AFLAGS   := -D__ASSEMBLY__ $(call cc-option,-fno-PIE)
 KBUILD_AFLAGS_MODULE  := -DMODULE
 KBUILD_CFLAGS_MODULE  := -DMODULE
 KBUILD_LDFLAGS_MODULE := -T $(srctree)/scripts/module-common.lds
-GCC_PLUGINS_CFLAGS :=
 
 # Read KERNELRELEASE from include/config/kernel.release (if it exists)
 KERNELRELEASE = $(shell cat include/config/kernel.release 2> /dev/null)
@@ -424,7 +420,7 @@ export MAKE AWK GENKSYMS INSTALLKERNEL PERL PYTHON UTS_MACHINE
 export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
 export KBUILD_CPPFLAGS NOSTDINC_FLAGS LINUXINCLUDE OBJCOPYFLAGS LDFLAGS
-export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_GCOV CFLAGS_KCOV CFLAGS_KASAN CFLAGS_UBSAN
+export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_KASAN CFLAGS_UBSAN
 export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE
 export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE
 export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL
@@ -625,6 +621,12 @@ endif
 # Defaults to vmlinux, but the arch makefile usually adds further targets
 all: vmlinux
 
+KBUILD_CFLAGS	+= $(call cc-option,-fno-PIE)
+KBUILD_AFLAGS	+= $(call cc-option,-fno-PIE)
+CFLAGS_GCOV	:= -fprofile-arcs -ftest-coverage -fno-tree-loop-im $(call cc-disable-warning,maybe-uninitialized,)
+CFLAGS_KCOV	:= $(call cc-option,-fsanitize-coverage=trace-pc,)
+export CFLAGS_GCOV CFLAGS_KCOV
+
 # The arch Makefile can set ARCH_{CPP,A,C}FLAGS to override the default
 # values of the respective KBUILD_* variables
 ARCH_CPPFLAGS :=
@@ -947,8 +949,8 @@ ifdef CONFIG_STACK_VALIDATION
   ifeq ($(has_libelf),1)
     objtool_target := tools/objtool FORCE
   else
-    ifdef CONFIG_ORC_UNWINDER
-      $(error "Cannot generate ORC metadata for CONFIG_ORC_UNWINDER=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
+    ifdef CONFIG_UNWINDER_ORC
+      $(error "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
     else
       $(warning "Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel")
     endif

==== changed file ====

@@ -433,15 +433,6 @@
 				clock-names = "ipg", "per";
 			};
 
-			srtc: srtc@53fa4000 {
-				compatible = "fsl,imx53-rtc", "fsl,imx25-rtc";
-				reg = <0x53fa4000 0x4000>;
-				interrupts = <24>;
-				interrupt-parent = <&tzic>;
-				clocks = <&clks IMX5_CLK_SRTC_GATE>;
-				clock-names = "ipg";
-			};
-
 			iomuxc: iomuxc@53fa8000 {
 				compatible = "fsl,imx53-iomuxc";
 				reg = <0x53fa8000 0x4000>;

==== changed file ====

@@ -244,7 +244,7 @@ CONFIG_USB_STORAGE_ONETOUCH=m
 CONFIG_USB_STORAGE_KARMA=m
 CONFIG_USB_STORAGE_CYPRESS_ATACB=m
 CONFIG_USB_STORAGE_ENE_UB6250=m
-CONFIG_USB_UAS=m
+CONFIG_USB_UAS=y
 CONFIG_USB_DWC3=y
 CONFIG_USB_DWC2=y
 CONFIG_USB_HSIC_USB3503=y

==== changed file ====

@@ -518,4 +518,22 @@ THUMB(	orr	\reg , \reg , #PSR_T_BIT	)
 #endif
 	.endm
 
+	.macro	bug, msg, line
+#ifdef CONFIG_THUMB2_KERNEL
+1:	.inst	0xde02
+#else
+1:	.inst	0xe7f001f2
+#endif
+#ifdef CONFIG_DEBUG_BUGVERBOSE
+	.pushsection .rodata.str, "aMS", %progbits, 1
+2:	.asciz	"\msg"
+	.popsection
+	.pushsection __bug_table, "aw"
+	.align	2
+	.word	1b, 2b
+	.hword	\line
+	.popsection
+#endif
+	.endm
+
 #endif /* __ASM_ASSEMBLER_H__ */

==== changed file ====

@@ -161,8 +161,7 @@
 #else
 #define VTTBR_X		(5 - KVM_T0SZ)
 #endif
-#define VTTBR_BADDR_SHIFT (VTTBR_X - 1)
-#define VTTBR_BADDR_MASK  (((_AC(1, ULL) << (40 - VTTBR_X)) - 1) << VTTBR_BADDR_SHIFT)
+#define VTTBR_BADDR_MASK  (((_AC(1, ULL) << (40 - VTTBR_X)) - 1) << VTTBR_X)
 #define VTTBR_VMID_SHIFT  _AC(48, ULL)
 #define VTTBR_VMID_MASK(size)	(_AT(u64, (1 << size) - 1) << VTTBR_VMID_SHIFT)
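
This hunk (the arm64 header further down gets the same one-line fix) removes an off-by-one: the base-address field mask was shifted into place by VTTBR_X - 1 instead of VTTBR_X, so for VTTBR_X = 5 it covered bits [38:4] rather than [39:5]. A standalone C sketch of the two mask computations:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	unsigned x = 5; /* VTTBR_X from the #else branch (KVM_T0SZ = 0) */

	uint64_t old_mask = ((1ULL << (40 - x)) - 1) << (x - 1);
	uint64_t new_mask = ((1ULL << (40 - x)) - 1) << x;

	printf("old: %#018llx\n", (unsigned long long)old_mask); /* bits [38:4] */
	printf("new: %#018llx\n", (unsigned long long)new_mask); /* bits [39:5] */
	return 0;
}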

==== changed file ====

@@ -126,8 +126,7 @@ extern unsigned long profile_pc(struct pt_regs *regs);
 /*
  * kprobe-based event tracer support
  */
-#include <linux/stddef.h>
-#include <linux/types.h>
+#include <linux/compiler.h>
 #define MAX_REG_OFFSET (offsetof(struct pt_regs, ARM_ORIG_r0))
 
 extern int regs_query_register_offset(const char *name);

==== changed file ====

@@ -300,6 +300,8 @@
 	mov	r2, sp
 	ldr	r1, [r2, #\offset + S_PSR]	@ get calling cpsr
 	ldr	lr, [r2, #\offset + S_PC]!	@ get pc
+	tst	r1, #PSR_I_BIT | 0x0f
+	bne	1f
 	msr	spsr_cxsf, r1			@ save in spsr_svc
 #if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_32v6K)
 	@ We must avoid clrex due to Cortex-A15 erratum #830321
@@ -314,6 +316,7 @@
 	@ after ldm {}^
 	add	sp, sp, #\offset + PT_REGS_SIZE
 	movs	pc, lr				@ return & move spsr_svc into cpsr
+1:	bug	"Returning to usermode but unexpected PSR bits set?", \@
 #elif defined(CONFIG_CPU_V7M)
 	@ V7M restore.
 	@ Note that we don't need to do clrex here as clearing the local
@@ -329,6 +332,8 @@
 	ldr	r1, [sp, #\offset + S_PSR]	@ get calling cpsr
 	ldr	lr, [sp, #\offset + S_PC]	@ get pc
 	add	sp, sp, #\offset + S_SP
+	tst	r1, #PSR_I_BIT | 0x0f
+	bne	1f
 	msr	spsr_cxsf, r1			@ save in spsr_svc
 	@ We must avoid clrex due to Cortex-A15 erratum #830321
@@ -341,6 +346,7 @@
 	.endif
 	add	sp, sp, #PT_REGS_SIZE - S_SP
 	movs	pc, lr				@ return & move spsr_svc into cpsr
+1:	bug	"Returning to usermode but unexpected PSR bits set?", \@
 #endif	/* !CONFIG_THUMB2_KERNEL */
 	.endm
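
In C terms, the new tst/bne pair refuses to complete a return to user space with a saved PSR that does not describe plain user mode. A hedged restatement (the PSR_I_BIT constant is from arch/arm/include/asm/ptrace.h; the helper name is illustrative):

#include <stdbool.h>
#include <stdint.h>

#define PSR_I_BIT 0x00000080u /* IRQs masked while set */

/* Mirrors "tst r1, #PSR_I_BIT | 0x0f; bne 1f": USR mode is 0x10, so the
 * low nibble of a sane user-mode PSR is zero, and the I bit must be
 * clear on the way back to user space. */
static bool unexpected_user_psr(uint32_t psr)
{
	return (psr & (PSR_I_BIT | 0x0f)) != 0;
}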

==== changed file ====

@@ -14,8 +14,12 @@ LDFLAGS_vmlinux :=-p --no-undefined -X
 CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
 GZFLAGS	:=-9
 
-ifneq ($(CONFIG_RELOCATABLE),)
-LDFLAGS_vmlinux		+= -pie -shared -Bsymbolic
+ifeq ($(CONFIG_RELOCATABLE), y)
+# Pass --no-apply-dynamic-relocs to restore pre-binutils-2.27 behaviour
+# for relative relocs, since this leads to better Image compression
+# with the relocation offsets always being zero.
+LDFLAGS_vmlinux		+= -pie -shared -Bsymbolic \
+			$(call ld-option, --no-apply-dynamic-relocs)
 endif
 
 ifeq ($(CONFIG_ARM64_ERRATUM_843419),y)

==== changed file ====

@@ -301,6 +301,7 @@
 
 &usb1_phy {
 	status = "okay";
+	phy-supply = <&usb_otg_pwr>;
 };
 
 &usb0 {

==== changed file ====

@@ -215,7 +215,6 @@ typedef struct compat_siginfo {
 } compat_siginfo_t;
 
 #define COMPAT_OFF_T_MAX	0x7fffffff
-#define COMPAT_LOFF_T_MAX	0x7fffffffffffffffL
 
 /*
  * A pointer passed in from user mode. This should not

==== changed file ====

@@ -132,10 +132,8 @@ static inline void efi_set_pgd(struct mm_struct *mm)
 			 * Defer the switch to the current thread's TTBR0_EL1
 			 * until uaccess_enable(). Restore the current
 			 * thread's saved ttbr0 corresponding to its active_mm
-			 * (if different from init_mm).
 			 */
 			cpu_set_reserved_ttbr0();
-			if (current->active_mm != &init_mm)
 			update_saved_ttbr0(current, current->active_mm);
 		}
 	}

==== changed file ====

@@ -51,6 +51,13 @@ enum fixed_addresses {
 	FIX_EARLYCON_MEM_BASE,
 	FIX_TEXT_POKE0,
+
+#ifdef CONFIG_ACPI_APEI_GHES
+	/* Used for GHES mapping from assorted contexts */
+	FIX_APEI_GHES_IRQ,
+	FIX_APEI_GHES_NMI,
+#endif /* CONFIG_ACPI_APEI_GHES */
+
 	__end_of_permanent_fixed_addresses,
 
 	/*

==== changed file ====

@@ -170,8 +170,7 @@
 #define VTCR_EL2_FLAGS		(VTCR_EL2_COMMON_BITS | VTCR_EL2_TGRAN_FLAGS)
 #define VTTBR_X			(VTTBR_X_TGRAN_MAGIC - VTCR_EL2_T0SZ_IPA)
 
-#define VTTBR_BADDR_SHIFT (VTTBR_X - 1)
-#define VTTBR_BADDR_MASK  (((UL(1) << (PHYS_MASK_SHIFT - VTTBR_X)) - 1) << VTTBR_BADDR_SHIFT)
+#define VTTBR_BADDR_MASK  (((UL(1) << (PHYS_MASK_SHIFT - VTTBR_X)) - 1) << VTTBR_X)
 #define VTTBR_VMID_SHIFT  (UL(48))
 #define VTTBR_VMID_MASK(size)	(_AT(u64, (1 << size) - 1) << VTTBR_VMID_SHIFT)

==== changed file ====

@@ -162,29 +162,21 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu);
 
 #define init_new_context(tsk,mm)	({ atomic64_set(&(mm)->context.id, 0); 0; })
 
-/*
- * This is called when "tsk" is about to enter lazy TLB mode.
- *
- * mm:  describes the currently active mm context
- * tsk: task which is entering lazy tlb
- * cpu: cpu number which is entering lazy tlb
- *
- * tsk->mm will be NULL
- */
-static inline void
-enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
-{
-}
-
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
 static inline void update_saved_ttbr0(struct task_struct *tsk,
 				      struct mm_struct *mm)
 {
-	if (system_uses_ttbr0_pan()) {
-		BUG_ON(mm->pgd == swapper_pg_dir);
-		task_thread_info(tsk)->ttbr0 =
-			virt_to_phys(mm->pgd) | ASID(mm) << 48;
-	}
+	u64 ttbr;
+
+	if (!system_uses_ttbr0_pan())
+		return;
+
+	if (mm == &init_mm)
+		ttbr = __pa_symbol(empty_zero_page);
+	else
+		ttbr = virt_to_phys(mm->pgd) | ASID(mm) << 48;
+
+	task_thread_info(tsk)->ttbr0 = ttbr;
 }
 #else
 static inline void update_saved_ttbr0(struct task_struct *tsk,
@@ -193,6 +185,16 @@ static inline void update_saved_ttbr0(struct task_struct *tsk,
 }
 #endif
 
+static inline void
+enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
+{
+	/*
+	 * We don't actually care about the ttbr0 mapping, so point it at the
+	 * zero page.
+	 */
+	update_saved_ttbr0(tsk, &init_mm);
+}
+
 static inline void __switch_mm(struct mm_struct *next)
 {
 	unsigned int cpu = smp_processor_id();
@@ -220,10 +222,8 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next,
 	 * Update the saved TTBR0_EL1 of the scheduled-in task as the previous
 	 * value may have not been initialised yet (activate_mm caller) or the
 	 * ASID has changed since the last run (following the context switch
-	 * of another thread of the same process). Avoid setting the reserved
-	 * TTBR0_EL1 to swapper_pg_dir (init_mm; e.g. via idle_task_exit).
+	 * of another thread of the same process).
 	 */
-	if (next != &init_mm)
-		update_saved_ttbr0(tsk, next);
+	update_saved_ttbr0(tsk, next);
 }

==== changed file ====

@@ -150,12 +150,20 @@ static inline pte_t pte_mkwrite(pte_t pte)
 
 static inline pte_t pte_mkclean(pte_t pte)
 {
-	return clear_pte_bit(pte, __pgprot(PTE_DIRTY));
+	pte = clear_pte_bit(pte, __pgprot(PTE_DIRTY));
+	pte = set_pte_bit(pte, __pgprot(PTE_RDONLY));
+
+	return pte;
 }
 
 static inline pte_t pte_mkdirty(pte_t pte)
 {
-	return set_pte_bit(pte, __pgprot(PTE_DIRTY));
+	pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));
+
+	if (pte_write(pte))
+		pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));
+
+	return pte;
 }
 
 static inline pte_t pte_mkold(pte_t pte)
@@ -671,28 +679,23 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 /*
- * ptep_set_wrprotect - mark read-only while preserving the hardware update of
- * the Access Flag.
+ * ptep_set_wrprotect - mark read-only while trasferring potential hardware
+ * dirty status (PTE_DBM && !PTE_RDONLY) to the software PTE_DIRTY bit.
  */
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep)
 {
 	pte_t old_pte, pte;
 
-	/*
-	 * ptep_set_wrprotect() is only called on CoW mappings which are
-	 * private (!VM_SHARED) with the pte either read-only (!PTE_WRITE &&
-	 * PTE_RDONLY) or writable and software-dirty (PTE_WRITE &&
-	 * !PTE_RDONLY && PTE_DIRTY); see is_cow_mapping() and
-	 * protection_map[]. There is no race with the hardware update of the
-	 * dirty state: clearing of PTE_RDONLY when PTE_WRITE (a.k.a. PTE_DBM)
-	 * is set.
-	 */
-	VM_WARN_ONCE(pte_write(*ptep) && !pte_dirty(*ptep),
-		     "%s: potential race with hardware DBM", __func__);
 	pte = READ_ONCE(*ptep);
 	do {
 		old_pte = pte;
+		/*
+		 * If hardware-dirty (PTE_WRITE/DBM bit set and PTE_RDONLY
+		 * clear), set the PTE_DIRTY bit.
+		 */
+		if (pte_hw_dirty(pte))
+			pte = pte_mkdirty(pte);
 		pte = pte_wrprotect(pte);
 		pte_val(pte) = cmpxchg_relaxed(&pte_val(*ptep),
 					       pte_val(old_pte), pte_val(pte));
View File

@ -319,6 +319,15 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
memset(&p->thread.cpu_context, 0, sizeof(struct cpu_context)); memset(&p->thread.cpu_context, 0, sizeof(struct cpu_context));
/*
* In case p was allocated the same task_struct pointer as some
* other recently-exited task, make sure p is disassociated from
* any cpu that may have run that now-exited task recently.
* Otherwise we could erroneously skip reloading the FPSIMD
* registers for p.
*/
fpsimd_flush_task_state(p);
if (likely(!(p->flags & PF_KTHREAD))) { if (likely(!(p->flags & PF_KTHREAD))) {
*childregs = *current_pt_regs(); *childregs = *current_pt_regs();
childregs->regs[0] = 0; childregs->regs[0] = 0;

==== changed file ====

@@ -84,6 +84,9 @@ static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1)
 {
 	u64 reg;
 
+	/* Clear pmscr in case of early return */
+	*pmscr_el1 = 0;
+
 	/* SPE present on this CPU? */
 	if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
 						  ID_AA64DFR0_PMSVER_SHIFT))

==== changed file ====

@@ -389,7 +389,7 @@ void ptdump_check_wx(void)
 		.check_wx = true,
 	};
 
-	walk_pgd(&st, &init_mm, 0);
+	walk_pgd(&st, &init_mm, VA_START);
 	note_page(&st, 0, 0, 0);
 	if (st.wx_pages || st.uxn_pages)
 		pr_warn("Checked W+X mappings: FAILED, %lu W+X pages found, %lu non-UXN pages found\n",

==== changed file ====

@@ -478,6 +478,8 @@ void __init arm64_memblock_init(void)
 
 	reserve_elfcorehdr();
 
+	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
+
 	dma_contiguous_reserve(arm64_dma_phys_limit);
 
 	memblock_allow_resize();
@@ -504,7 +506,6 @@ void __init bootmem_init(void)
 	sparse_init();
 	zone_sizes_init(min, max);
 
-	high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
 	memblock_dump_all();
 }

==== changed file ====

@@ -321,11 +321,14 @@ config BF53x
 
 config GPIO_ADI
 	def_bool y
+	depends on !PINCTRL
 	depends on (BF51x || BF52x || BF53x || BF538 || BF539 || BF561)
 
-config PINCTRL
+config PINCTRL_BLACKFIN_ADI2
 	def_bool y
-	depends on BF54x || BF60x
+	depends on (BF54x || BF60x)
+	select PINCTRL
+	select PINCTRL_ADI2
 
 config MEM_MT48LC64M4A2FB_7E
 	bool

==== changed file ====

@@ -18,6 +18,7 @@ config DEBUG_VERBOSE
 
 config DEBUG_MMRS
 	tristate "Generate Blackfin MMR tree"
+	depends on !PINCTRL
 	select DEBUG_FS
 	help
 	  Create a tree of Blackfin MMRs via the debugfs tree.  If

==== changed file ====

@@ -200,7 +200,6 @@ typedef struct compat_siginfo {
 } compat_siginfo_t;
 
 #define COMPAT_OFF_T_MAX	0x7fffffff
-#define COMPAT_LOFF_T_MAX	0x7fffffffffffffffL
 
 /*
  * A pointer passed in from user mode. This should not

==== changed file ====

@@ -195,7 +195,6 @@ typedef struct compat_siginfo {
 } compat_siginfo_t;
 
 #define COMPAT_OFF_T_MAX	0x7fffffff
-#define COMPAT_LOFF_T_MAX	0x7fffffffffffffffL
 
 struct compat_ipc64_perm {
 	compat_key_t key;

==== changed file ====

@@ -878,9 +878,6 @@ ENTRY_CFI(syscall_exit_rfi)
 	STREG	%r19,PT_SR7(%r16)
 
 intr_return:
-	/* NOTE: Need to enable interrupts incase we schedule. */
-	ssm	PSW_SM_I, %r0
-
 	/* check for reschedule */
 	mfctl	%cr30,%r1
 	LDREG	TI_FLAGS(%r1),%r19	/* sched.h: TIF_NEED_RESCHED */
@@ -907,6 +904,11 @@ intr_check_sig:
 	LDREG	PT_IASQ1(%r16), %r20
 	cmpib,COND(=),n 0,%r20,intr_restore /* backward */
 
+	/* NOTE: We need to enable interrupts if we have to deliver
+	 * signals. We used to do this earlier but it caused kernel
+	 * stack overflows. */
+	ssm	PSW_SM_I, %r0
+
 	copy	%r0, %r25			/* long in_syscall = 0 */
 #ifdef CONFIG_64BIT
 	ldo	-16(%r30),%r29			/* Reference param save area */
@@ -958,6 +960,10 @@ intr_do_resched:
 	cmpib,COND(=)	0, %r20, intr_do_preempt
 	nop
 
+	/* NOTE: We need to enable interrupts if we schedule.  We used
+	 * to do this earlier but it caused kernel stack overflows. */
+	ssm	PSW_SM_I, %r0
+
 #ifdef CONFIG_64BIT
 	ldo	-16(%r30),%r29	/* Reference param save area */
 #endif

==== changed file ====

@@ -305,6 +305,7 @@ ENDPROC_CFI(os_hpmc)
 
 	__INITRODATA
+	.align 4
 	.export os_hpmc_size
 os_hpmc_size:
 	.word .os_hpmc_end-.os_hpmc

==== changed file ====

@@ -185,7 +185,6 @@ typedef struct compat_siginfo {
 } compat_siginfo_t;
 
 #define COMPAT_OFF_T_MAX	0x7fffffff
-#define COMPAT_LOFF_T_MAX	0x7fffffffffffffffL
 
 /*
  * A pointer passed in from user mode. This should not

==== changed file ====

@@ -76,6 +76,7 @@ struct machdep_calls {
 
 	void __noreturn	(*restart)(char *cmd);
 	void __noreturn (*halt)(void);
+	void		(*panic)(char *str);
 	void		(*cpu_die)(void);
 
 	long		(*time_init)(void); /* Optional, may be NULL */

==== changed file ====

@@ -114,9 +114,10 @@ static inline void enter_lazy_tlb(struct mm_struct *mm,
 #endif
 }
 
-static inline void arch_dup_mmap(struct mm_struct *oldmm,
-				 struct mm_struct *mm)
+static inline int arch_dup_mmap(struct mm_struct *oldmm,
+				struct mm_struct *mm)
 {
+	return 0;
 }
 
 static inline void arch_exit_mmap(struct mm_struct *mm)
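
The void-to-int change here (mirrored for um and unicore32 further down) exists so the generic fork path can propagate an arch_dup_mmap() failure instead of ignoring it. A sketch of the caller side under the new interface; this is a simplified stand-in, not the actual kernel/fork.c code:

struct mm_struct; /* opaque here; the real one lives in <linux/mm_types.h> */

int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm);

/* Hypothetical caller shape: copy the VMAs, then let the arch hook veto. */
static int dup_mmap_sketch(struct mm_struct *mm, struct mm_struct *oldmm)
{
	/* ... copy the VMA list from oldmm into mm ... */
	return arch_dup_mmap(oldmm, mm); /* 0 on success, negative errno on failure */
}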

==== changed file ====

@@ -24,6 +24,7 @@ extern void reloc_got2(unsigned long);
 
 void check_for_initrd(void);
 void initmem_init(void);
+void setup_panic(void);
 #define ARCH_PANIC_TIMEOUT 180
 
 #ifdef CONFIG_PPC_PSERIES

==== changed file ====

@@ -102,6 +102,7 @@ _GLOBAL(__setup_cpu_power9)
 	li	r0,0
 	mtspr	SPRN_PSSCR,r0
 	mtspr	SPRN_LPID,r0
+	mtspr	SPRN_PID,r0
 	mfspr	r3,SPRN_LPCR
 	LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE | LPCR_HEIC)
 	or	r3, r3, r4
@@ -126,6 +127,7 @@ _GLOBAL(__restore_cpu_power9)
 	li	r0,0
 	mtspr	SPRN_PSSCR,r0
 	mtspr	SPRN_LPID,r0
+	mtspr	SPRN_PID,r0
 	mfspr	r3,SPRN_LPCR
 	LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE | LPCR_HEIC)
 	or	r3, r3, r4

==== changed file ====

@@ -1453,25 +1453,6 @@ static void fadump_init_files(void)
 	return;
 }
 
-static int fadump_panic_event(struct notifier_block *this,
-			      unsigned long event, void *ptr)
-{
-	/*
-	 * If firmware-assisted dump has been registered then trigger
-	 * firmware-assisted dump and let firmware handle everything
-	 * else. If this returns, then fadump was not registered, so
-	 * go through the rest of the panic path.
-	 */
-	crash_fadump(NULL, ptr);
-
-	return NOTIFY_DONE;
-}
-
-static struct notifier_block fadump_panic_block = {
-	.notifier_call = fadump_panic_event,
-	.priority = INT_MIN /* may not return; must be done last */
-};
-
 /*
  * Prepare for firmware-assisted dump.
  */
@@ -1504,9 +1485,6 @@ int __init setup_fadump(void)
 	init_fadump_mem_struct(&fdm, fw_dump.reserve_dump_area_start);
 	fadump_init_files();
 
-	atomic_notifier_chain_register(&panic_notifier_list,
-					&fadump_panic_block);
-
 	return 1;
 }
 subsys_initcall(setup_fadump);

==== changed file ====

@@ -704,6 +704,30 @@ int check_legacy_ioport(unsigned long base_port)
 }
 EXPORT_SYMBOL(check_legacy_ioport);
 
+static int ppc_panic_event(struct notifier_block *this,
+			   unsigned long event, void *ptr)
+{
+	/*
+	 * If firmware-assisted dump has been registered then trigger
+	 * firmware-assisted dump and let firmware handle everything else.
+	 */
+	crash_fadump(NULL, ptr);
+	ppc_md.panic(ptr);  /* May not return */
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block ppc_panic_block = {
+	.notifier_call = ppc_panic_event,
+	.priority = INT_MIN /* may not return; must be done last */
+};
+
+void __init setup_panic(void)
+{
+	if (!ppc_md.panic)
+		return;
+	atomic_notifier_chain_register(&panic_notifier_list, &ppc_panic_block);
+}
+
 #ifdef CONFIG_CHECK_CACHE_COHERENCY
 /*
  * For platforms that have configurable cache-coherency.  This function
@@ -848,6 +872,9 @@ void __init setup_arch(char **cmdline_p)
 	/* Probe the machine type, establish ppc_md. */
 	probe_machine();
 
+	/* Setup panic notifier if requested by the platform. */
+	setup_panic();
+
 	/*
 	 * Configure ppc_md.power_save (ppc32 only, 64-bit machines do
 	 * it from their respective probe() function.

==== changed file ====

@@ -276,9 +276,12 @@ void arch_touch_nmi_watchdog(void)
 {
 	unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000;
 	int cpu = smp_processor_id();
+	u64 tb = get_tb();
 
-	if (get_tb() - per_cpu(wd_timer_tb, cpu) >= ticks)
-		watchdog_timer_interrupt(cpu);
+	if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) {
+		per_cpu(wd_timer_tb, cpu) = tb;
+		wd_smp_clear_cpu_pending(cpu, tb);
+	}
 }
 EXPORT_SYMBOL(arch_touch_nmi_watchdog);

==== changed file ====

@@ -725,7 +725,8 @@ u64 kvmppc_xive_get_icp(struct kvm_vcpu *vcpu)
 
 	/* Return the per-cpu state for state saving/migration */
 	return (u64)xc->cppr << KVM_REG_PPC_ICP_CPPR_SHIFT |
-	       (u64)xc->mfrr << KVM_REG_PPC_ICP_MFRR_SHIFT;
+	       (u64)xc->mfrr << KVM_REG_PPC_ICP_MFRR_SHIFT |
+	       (u64)0xff << KVM_REG_PPC_ICP_PPRI_SHIFT;
 }
 
 int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 icpval)
@@ -1558,7 +1559,7 @@ static int xive_set_source(struct kvmppc_xive *xive, long irq, u64 addr)
 
 	/*
 	 * Restore P and Q. If the interrupt was pending, we
-	 * force both P and Q, which will trigger a resend.
+	 * force Q and !P, which will trigger a resend.
 	 *
 	 * That means that a guest that had both an interrupt
 	 * pending (queued) and Q set will restore with only
@@ -1566,7 +1567,7 @@ static int xive_set_source(struct kvmppc_xive *xive, long irq, u64 addr)
 	 * is perfectly fine as coalescing interrupts that haven't
 	 * been presented yet is always allowed.
 	 */
-	if (val & KVM_XICS_PRESENTED || val & KVM_XICS_PENDING)
+	if (val & KVM_XICS_PRESENTED && !(val & KVM_XICS_PENDING))
 		state->old_p = true;
 	if (val & KVM_XICS_QUEUED || val & KVM_XICS_PENDING)
 		state->old_q = true;

==== changed file ====

@@ -762,7 +762,8 @@ emit_clear:
 		func = (u8 *) __bpf_call_base + imm;
 
 		/* Save skb pointer if we need to re-cache skb data */
-		if (bpf_helper_changes_pkt_data(func))
+		if ((ctx->seen & SEEN_SKB) &&
+		    bpf_helper_changes_pkt_data(func))
 			PPC_BPF_STL(3, 1, bpf_jit_stack_local(ctx));
 
 		bpf_jit_emit_func_call(image, ctx, (u64)func);
@@ -771,7 +772,8 @@ emit_clear:
 		PPC_MR(b2p[BPF_REG_0], 3);
 
 		/* refresh skb cache */
-		if (bpf_helper_changes_pkt_data(func)) {
+		if ((ctx->seen & SEEN_SKB) &&
+		    bpf_helper_changes_pkt_data(func)) {
 			/* reload skb pointer to r3 */
 			PPC_BPF_LL(3, 1, bpf_jit_stack_local(ctx));
 			bpf_jit_emit_skb_loads(image, ctx);

==== changed file ====

@@ -410,8 +410,12 @@ static __u64 power_pmu_bhrb_to(u64 addr)
 	int ret;
 	__u64 target;
 
-	if (is_kernel_addr(addr))
-		return branch_target((unsigned int *)addr);
+	if (is_kernel_addr(addr)) {
+		if (probe_kernel_read(&instr, (void *)addr, sizeof(instr)))
+			return 0;
+
+		return branch_target(&instr);
+	}
 
 	/* Userspace: need copy instruction here then translate it */
 	pagefault_disable();
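
The fix stops dereferencing the branch-history address directly: a stale BHRB entry can point at an unmapped kernel address, so the instruction word is copied through a fault-tolerant accessor first. The shape of the guarded read, with the kernel helpers declared extern so the sketch stands alone:

#include <stddef.h>

long probe_kernel_read(void *dst, const void *src, size_t size);
unsigned long branch_target(const unsigned int *instr);

static unsigned long to_target(unsigned long addr)
{
	unsigned int instr;

	if (probe_kernel_read(&instr, (const void *)addr, sizeof(instr)))
		return 0; /* address faulted: report "no target" instead of oopsing */
	return branch_target(&instr);
}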

==== changed file ====

@@ -540,7 +540,7 @@ static int memord(const void *d1, size_t s1, const void *d2, size_t s2)
 {
 	if (s1 < s2)
 		return 1;
-	if (s2 > s1)
+	if (s1 > s2)
 		return -1;
 
 	return memcmp(d1, d2, s1);
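
The memord() change fixes a comparator bug: the old second test (s2 > s1) was the same condition as the first, so unequal sizes with s1 > s2 fell through to memcmp() over s1 bytes of the shorter buffer. A standalone copy of the fixed function with a quick check:

#include <stdio.h>
#include <string.h>

static int memord(const void *d1, size_t s1, const void *d2, size_t s2)
{
	if (s1 < s2)
		return 1;
	if (s1 > s2)	/* old code tested s2 > s1 here, which is unreachable */
		return -1;

	return memcmp(d1, d2, s1);
}

int main(void)
{
	printf("%d\n", memord("ab", 2, "abc", 3));  /* 1 */
	printf("%d\n", memord("abc", 3, "ab", 2));  /* -1; was an out-of-bounds memcmp */
	printf("%d\n", memord("abc", 3, "abd", 3)); /* negative: plain memcmp */
	return 0;
}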

==== changed file ====

@@ -39,18 +39,18 @@ int __opal_async_get_token(void)
 	int token;
 
 	spin_lock_irqsave(&opal_async_comp_lock, flags);
-	token = find_first_bit(opal_async_complete_map, opal_max_async_tokens);
+	token = find_first_zero_bit(opal_async_token_map, opal_max_async_tokens);
 	if (token >= opal_max_async_tokens) {
 		token = -EBUSY;
 		goto out;
 	}
 
-	if (__test_and_set_bit(token, opal_async_token_map)) {
+	if (!__test_and_clear_bit(token, opal_async_complete_map)) {
 		token = -EBUSY;
 		goto out;
 	}
 
-	__clear_bit(token, opal_async_complete_map);
+	__set_bit(token, opal_async_token_map);
 
 out:
 	spin_unlock_irqrestore(&opal_async_comp_lock, flags);
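
After this rework, opal_async_token_map alone says whether a token is in use, and opal_async_complete_map only tracks not-yet-consumed completions. A toy model of the fixed allocation step, using plain arrays in place of the kernel bitmap helpers named in the comments, and assuming (as the driver does at init) that every token starts out marked complete:

#include <stdbool.h>

#define NTOKENS 8
static bool token_map[NTOKENS];    /* set = token in use */
static bool complete_map[NTOKENS]; /* set = completion seen, not yet consumed */

static int get_token(void)
{
	int t;

	/* find_first_zero_bit(opal_async_token_map, ...) */
	for (t = 0; t < NTOKENS && token_map[t]; t++)
		;
	if (t == NTOKENS)
		return -1; /* -EBUSY: every token in use */

	/* !__test_and_clear_bit(token, opal_async_complete_map) */
	if (!complete_map[t])
		return -1; /* -EBUSY: previous completion not consumed yet */
	complete_map[t] = false;

	token_map[t] = true; /* __set_bit(token, opal_async_token_map) */
	return t;
}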

==== changed file ====

@@ -191,9 +191,11 @@ static int opal_imc_counters_probe(struct platform_device *pdev)
 			break;
 		}
 
-		if (!imc_pmu_create(imc_dev, pmu_count, domain))
+		if (!imc_pmu_create(imc_dev, pmu_count, domain)) {
+			if (domain == IMC_DOMAIN_NEST)
 				pmu_count++;
+		}
 	}
 
 	return 0;
 }

==== changed file ====

@@ -319,7 +319,7 @@ static unsigned long pnv_get_proc_freq(unsigned int cpu)
 {
 	unsigned long ret_freq;
 
-	ret_freq = cpufreq_quick_get(cpu) * 1000ul;
+	ret_freq = cpufreq_get(cpu) * 1000ul;
 
 	/*
 	 * If the backend cpufreq driver does not exist,
View File

@ -104,6 +104,20 @@ static void __noreturn ps3_halt(void)
ps3_sys_manager_halt(); /* never returns */ ps3_sys_manager_halt(); /* never returns */
} }
static void ps3_panic(char *str)
{
DBG("%s:%d %s\n", __func__, __LINE__, str);
smp_send_stop();
printk("\n");
printk(" System does not reboot automatically.\n");
printk(" Please press POWER button.\n");
printk("\n");
while(1)
lv1_pause(1);
}
#if defined(CONFIG_FB_PS3) || defined(CONFIG_FB_PS3_MODULE) || \ #if defined(CONFIG_FB_PS3) || defined(CONFIG_FB_PS3_MODULE) || \
defined(CONFIG_PS3_FLASH) || defined(CONFIG_PS3_FLASH_MODULE) defined(CONFIG_PS3_FLASH) || defined(CONFIG_PS3_FLASH_MODULE)
static void __init prealloc(struct ps3_prealloc *p) static void __init prealloc(struct ps3_prealloc *p)
@ -255,6 +269,7 @@ define_machine(ps3) {
.probe = ps3_probe, .probe = ps3_probe,
.setup_arch = ps3_setup_arch, .setup_arch = ps3_setup_arch,
.init_IRQ = ps3_init_IRQ, .init_IRQ = ps3_init_IRQ,
.panic = ps3_panic,
.get_boot_time = ps3_get_boot_time, .get_boot_time = ps3_get_boot_time,
.set_dabr = ps3_set_dabr, .set_dabr = ps3_set_dabr,
.calibrate_decr = ps3_calibrate_decr, .calibrate_decr = ps3_calibrate_decr,

==== changed file ====

@@ -726,6 +726,7 @@ define_machine(pseries) {
 	.pcibios_fixup		= pSeries_final_fixup,
 	.restart		= rtas_restart,
 	.halt			= rtas_halt,
+	.panic			= rtas_os_term,
 	.get_boot_time		= rtas_get_boot_time,
 	.get_rtc_time		= rtas_get_rtc_time,
 	.set_rtc_time		= rtas_set_rtc_time,

==== changed file ====

@@ -1592,6 +1592,8 @@ ATTRIBUTE_GROUPS(vio_dev);
 void vio_unregister_device(struct vio_dev *viodev)
 {
 	device_unregister(&viodev->dev);
+	if (viodev->family == VDEVICE)
+		irq_dispose_mapping(viodev->irq);
 }
 EXPORT_SYMBOL(vio_unregister_device);

==== changed file ====

@@ -846,12 +846,12 @@ void ipic_disable_mcp(enum ipic_mcp_irq mcp_irq)
 
 u32 ipic_get_mcp_status(void)
 {
-	return ipic_read(primary_ipic->regs, IPIC_SERMR);
+	return ipic_read(primary_ipic->regs, IPIC_SERSR);
 }
 
 void ipic_clear_mcp_status(u32 mask)
 {
-	ipic_write(primary_ipic->regs, IPIC_SERMR, mask);
+	ipic_write(primary_ipic->regs, IPIC_SERSR, mask);
 }
 
 /* Return an interrupt vector or 0 if no interrupt is pending. */

==== changed file ====

@@ -530,14 +530,19 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
 
 waiting:
 	secondary = 1;
+	spin_begin();
 	while (secondary && !xmon_gate) {
 		if (in_xmon == 0) {
-			if (fromipi)
+			if (fromipi) {
+				spin_end();
 				goto leave;
+			}
 			secondary = test_and_set_bit(0, &in_xmon);
 		}
-		barrier();
+		spin_cpu_relax();
+		touch_nmi_watchdog();
 	}
+	spin_end();
 
 	if (!secondary && !xmon_gate) {
 		/* we are the first cpu to come in */
@@ -568,21 +573,25 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
 		mb();
 		xmon_gate = 1;
 		barrier();
+		touch_nmi_watchdog();
 	}
 
 cmdloop:
 	while (in_xmon) {
 		if (secondary) {
+			spin_begin();
 			if (cpu == xmon_owner) {
 				if (!test_and_set_bit(0, &xmon_taken)) {
 					secondary = 0;
+					spin_end();
 					continue;
 				}
 				/* missed it */
 				while (cpu == xmon_owner)
-					barrier();
+					spin_cpu_relax();
 			}
-			barrier();
+			spin_cpu_relax();
+			touch_nmi_watchdog();
 		} else {
 			cmd = cmds(regs);
 			if (cmd != 0) {
@@ -2475,6 +2484,11 @@ static void dump_xives(void)
 	unsigned long num;
 	int c;
 
+	if (!xive_enabled()) {
+		printf("Xive disabled on this system\n");
+		return;
+	}
+
 	c = inchar();
 	if (c == 'a') {
 		dump_all_xives();
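
The xmon hunks swap raw barrier() polling for the powerpc spin_begin()/spin_cpu_relax()/spin_end() helpers from <asm/processor.h>, which lower SMT thread priority while busy-waiting, and they feed the NMI watchdog so long waits in the debugger are not flagged. The resulting wait-loop shape, sketched as a standalone function with the helpers declared extern:

#include <stdbool.h>

void spin_begin(void);      /* drop SMT priority before polling */
void spin_cpu_relax(void);  /* keep it low between polls */
void spin_end(void);        /* restore priority once the condition holds */
void touch_nmi_watchdog(void);

static void wait_for(volatile bool *flag)
{
	spin_begin();
	while (!*flag) {
		spin_cpu_relax();
		touch_nmi_watchdog(); /* polling may outlast the watchdog period */
	}
	spin_end();
}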

==== changed file ====

@@ -263,7 +263,6 @@ typedef struct compat_siginfo {
 #define si_overrun	_sifields._timer._overrun
 
 #define COMPAT_OFF_T_MAX	0x7fffffff
-#define COMPAT_LOFF_T_MAX	0x7fffffffffffffffL
 
 /*
  * A pointer passed in from user mode. This should not

==== changed file ====

@@ -31,19 +31,18 @@ static inline void restore_access_regs(unsigned int *acrs)
 }
 
 #define switch_to(prev, next, last) do {				\
-	if (prev->mm) {							\
+	/* save_fpu_regs() sets the CIF_FPU flag, which enforces	\
+	 * a restore of the floating point / vector registers as	\
+	 * soon as the next task returns to user space			\
+	 */								\
 	save_fpu_regs();						\
 	save_access_regs(&prev->thread.acrs[0]);			\
 	save_ri_cb(prev->thread.ri_cb);					\
 	save_gs_cb(prev->thread.gs_cb);					\
-	}								\
 	update_cr_regs(next);						\
-	if (next->mm) {							\
-		set_cpu_flag(CIF_FPU);					\
 	restore_access_regs(&next->thread.acrs[0]);			\
 	restore_ri_cb(next->thread.ri_cb, prev->thread.ri_cb);		\
 	restore_gs_cb(next->thread.gs_cb);				\
-	}								\
 	prev = __switch_to(prev, next);					\
 } while (0)

==== changed file ====

@@ -263,6 +263,7 @@ COMPAT_SYSCALL_DEFINE2(s390_setgroups16, int, gidsetsize, u16 __user *, grouplis
 		return retval;
 	}
 
+	groups_sort(group_info);
 	retval = set_current_groups(group_info);
 	put_group_info(group_info);

==== changed file ====

@@ -370,10 +370,10 @@ SYSCALL(sys_recvmmsg,compat_sys_recvmmsg)
 SYSCALL(sys_sendmmsg,compat_sys_sendmmsg)
 SYSCALL(sys_socket,sys_socket)
 SYSCALL(sys_socketpair,compat_sys_socketpair)		/* 360 */
-SYSCALL(sys_bind,sys_bind)
-SYSCALL(sys_connect,sys_connect)
+SYSCALL(sys_bind,compat_sys_bind)
+SYSCALL(sys_connect,compat_sys_connect)
 SYSCALL(sys_listen,sys_listen)
-SYSCALL(sys_accept4,sys_accept4)
+SYSCALL(sys_accept4,compat_sys_accept4)
 SYSCALL(sys_getsockopt,compat_sys_getsockopt)		/* 365 */
 SYSCALL(sys_setsockopt,compat_sys_setsockopt)
 SYSCALL(sys_getsockname,compat_sys_getsockname)

==== changed file ====

@@ -235,8 +235,6 @@ static int try_handle_skey(struct kvm_vcpu *vcpu)
 		VCPU_EVENT(vcpu, 4, "%s", "retrying storage key operation");
 		return -EAGAIN;
 	}
-	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
-		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
 	return 0;
 }
 
@@ -247,6 +245,9 @@ static int handle_iske(struct kvm_vcpu *vcpu)
 	int reg1, reg2;
 	int rc;
 
+	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
+		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
+
 	rc = try_handle_skey(vcpu);
 	if (rc)
 		return rc != -EAGAIN ? rc : 0;
@@ -276,6 +277,9 @@ static int handle_rrbe(struct kvm_vcpu *vcpu)
 	int reg1, reg2;
 	int rc;
 
+	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
+		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
+
 	rc = try_handle_skey(vcpu);
 	if (rc)
 		return rc != -EAGAIN ? rc : 0;
@@ -311,6 +315,9 @@ static int handle_sske(struct kvm_vcpu *vcpu)
 	int reg1, reg2;
 	int rc;
 
+	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
+		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
+
 	rc = try_handle_skey(vcpu);
 	if (rc)
 		return rc != -EAGAIN ? rc : 0;

==== changed file ====

@@ -85,8 +85,6 @@ int crst_table_upgrade(struct mm_struct *mm, unsigned long end)
 
 	/* upgrade should only happen from 3 to 4, 3 to 5, or 4 to 5 levels */
 	VM_BUG_ON(mm->context.asce_limit < _REGION2_SIZE);
-	if (end >= TASK_SIZE_MAX)
-		return -ENOMEM;
 	rc = 0;
 	notify = 0;
 	while (mm->context.asce_limit < end) {

==== changed file ====

@@ -55,8 +55,7 @@ struct bpf_jit {
 #define SEEN_LITERAL	8	/* code uses literals */
 #define SEEN_FUNC	16	/* calls C functions */
 #define SEEN_TAIL_CALL	32	/* code uses tail calls */
-#define SEEN_SKB_CHANGE	64	/* code changes skb data */
-#define SEEN_REG_AX	128	/* code uses constant blinding */
+#define SEEN_REG_AX	64	/* code uses constant blinding */
 #define SEEN_STACK	(SEEN_FUNC | SEEN_MEM | SEEN_SKB)
 
 /*
@@ -448,13 +447,13 @@ static void bpf_jit_prologue(struct bpf_jit *jit)
 		EMIT6_DISP_LH(0xe3000000, 0x0024, REG_W1, REG_0,
 			      REG_15, 152);
 	}
-	if (jit->seen & SEEN_SKB)
+	if (jit->seen & SEEN_SKB) {
 		emit_load_skb_data_hlen(jit);
-	if (jit->seen & SEEN_SKB_CHANGE)
 		/* stg %b1,ST_OFF_SKBP(%r0,%r15) */
 		EMIT6_DISP_LH(0xe3000000, 0x0024, BPF_REG_1, REG_0, REG_15,
 			      STK_OFF_SKBP);
+	}
 }
 
 /*
 * Function epilogue
@@ -983,8 +982,8 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
 		EMIT2(0x0d00, REG_14, REG_W1);
 		/* lgr %b0,%r2: load return value into %b0 */
 		EMIT4(0xb9040000, BPF_REG_0, REG_2);
-		if (bpf_helper_changes_pkt_data((void *)func)) {
-			jit->seen |= SEEN_SKB_CHANGE;
+		if ((jit->seen & SEEN_SKB) &&
+		    bpf_helper_changes_pkt_data((void *)func)) {
 			/* lg %b1,ST_OFF_SKBP(%r15) */
 			EMIT6_DISP_LH(0xe3000000, 0x0004, BPF_REG_1, REG_0,
 				      REG_15, STK_OFF_SKBP);

==== changed file ====

@@ -209,7 +209,6 @@ typedef struct compat_siginfo {
 } compat_siginfo_t;
 
 #define COMPAT_OFF_T_MAX	0x7fffffff
-#define COMPAT_LOFF_T_MAX	0x7fffffffffffffffL
 
 /*
  * A pointer passed in from user mode. This should not

==== changed file ====

@@ -7,6 +7,7 @@
 #if defined(__sparc__) && defined(__arch64__)
 #ifndef __ASSEMBLY__
 
+#include <linux/compiler.h>
 #include <linux/threads.h>
 #include <asm/switch_to.h>

==== changed file ====

@@ -2540,9 +2540,16 @@ void __init mem_init(void)
 {
 	high_memory = __va(last_valid_pfn << PAGE_SHIFT);
 
-	register_page_bootmem_info();
 	free_all_bootmem();
 
 	/*
+	 * Must be done after boot memory is put on freelist, because here we
+	 * might set fields in deferred struct pages that have not yet been
+	 * initialized, and free_all_bootmem() initializes all the reserved
+	 * deferred pages for us.
+	 */
+	register_page_bootmem_info();
+
+	/*
 	 * Set up the zero page, mark it reserved, so that page count
 	 * is not manipulated when freeing the page from user ptes.

==== changed file ====

@@ -1245,14 +1245,16 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 		u8 *func = ((u8 *)__bpf_call_base) + imm;
 
 		ctx->saw_call = true;
+		if (ctx->saw_ld_abs_ind && bpf_helper_changes_pkt_data(func))
+			emit_reg_move(bpf2sparc[BPF_REG_1], L7, ctx);
 
 		emit_call((u32 *)func, ctx);
 		emit_nop(ctx);
 
 		emit_reg_move(O0, bpf2sparc[BPF_REG_0], ctx);
 
-		if (bpf_helper_changes_pkt_data(func) && ctx->saw_ld_abs_ind)
-			load_skb_regs(ctx, bpf2sparc[BPF_REG_6]);
+		if (ctx->saw_ld_abs_ind && bpf_helper_changes_pkt_data(func))
+			load_skb_regs(ctx, L7);
 		break;
 	}

==== changed file ====

@@ -173,7 +173,6 @@ typedef struct compat_siginfo {
 } compat_siginfo_t;
 
 #define COMPAT_OFF_T_MAX	0x7fffffff
-#define COMPAT_LOFF_T_MAX	0x7fffffffffffffffL
 
 struct compat_ipc64_perm {
 	compat_key_t key;

==== changed file ====

@@ -1,4 +1,5 @@
 generic-y += barrier.h
+generic-y += bpf_perf_event.h
 generic-y += bug.h
 generic-y += clkdev.h
 generic-y += current.h

==== changed file ====

@@ -15,9 +15,10 @@ extern void uml_setup_stubs(struct mm_struct *mm);
 /*
  * Needed since we do not use the asm-generic/mm_hooks.h:
  */
-static inline void arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
+static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
 {
 	uml_setup_stubs(mm);
+	return 0;
 }
 extern void arch_exit_mmap(struct mm_struct *mm);
 static inline void arch_unmap(struct mm_struct *mm,

==== changed file ====

@@ -41,7 +41,7 @@
 typedef int (*initcall_t)(void);
 typedef void (*exitcall_t)(void);
 
-#include <linux/compiler.h>
+#include <linux/compiler_types.h>
 
 /* These are for everybody (although not all archs will actually
    discard it in modules) */

==== changed file ====

@@ -81,9 +81,10 @@ do { \
 	} \
 } while (0)
 
-static inline void arch_dup_mmap(struct mm_struct *oldmm,
+static inline int arch_dup_mmap(struct mm_struct *oldmm,
 				struct mm_struct *mm)
 {
+	return 0;
 }
 
 static inline void arch_unmap(struct mm_struct *mm,

==== changed file ====

@@ -108,7 +108,7 @@ config X86
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_HUGE_VMAP		if X86_64 || X86_PAE
 	select HAVE_ARCH_JUMP_LABEL
-	select HAVE_ARCH_KASAN			if X86_64 && SPARSEMEM_VMEMMAP
+	select HAVE_ARCH_KASAN			if X86_64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_KMEMCHECK
 	select HAVE_ARCH_MMAP_RND_BITS		if MMU
@@ -171,7 +171,7 @@ config X86
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE
 	select HAVE_REGS_AND_STACK_ACCESS_API
-	select HAVE_RELIABLE_STACKTRACE		if X86_64 && FRAME_POINTER_UNWINDER && STACK_VALIDATION
+	select HAVE_RELIABLE_STACKTRACE		if X86_64 && UNWINDER_FRAME_POINTER && STACK_VALIDATION
	select HAVE_STACK_VALIDATION		if X86_64
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_UNSTABLE_SCHED_CLOCK
@@ -303,7 +303,6 @@ config ARCH_SUPPORTS_DEBUG_PAGEALLOC
 config KASAN_SHADOW_OFFSET
 	hex
 	depends on KASAN
-	default 0xdff8000000000000 if X86_5LEVEL
 	default 0xdffffc0000000000
 
 config HAVE_INTEL_TXT
@@ -926,7 +925,8 @@ config MAXSMP
 config NR_CPUS
 	int "Maximum number of CPUs" if SMP && !MAXSMP
 	range 2 8 if SMP && X86_32 && !X86_BIGSMP
-	range 2 512 if SMP && !MAXSMP && !CPUMASK_OFFSTACK
+	range 2 64 if SMP && X86_32 && X86_BIGSMP
+	range 2 512 if SMP && !MAXSMP && !CPUMASK_OFFSTACK && X86_64
 	range 2 8192 if SMP && !MAXSMP && CPUMASK_OFFSTACK && X86_64
 	default "1" if !SMP
 	default "8192" if MAXSMP

View File

@ -359,28 +359,14 @@ config PUNIT_ATOM_DEBUG
choice choice
prompt "Choose kernel unwinder" prompt "Choose kernel unwinder"
default FRAME_POINTER_UNWINDER default UNWINDER_ORC if X86_64
default UNWINDER_FRAME_POINTER if X86_32
---help--- ---help---
This determines which method will be used for unwinding kernel stack This determines which method will be used for unwinding kernel stack
traces for panics, oopses, bugs, warnings, perf, /proc/<pid>/stack, traces for panics, oopses, bugs, warnings, perf, /proc/<pid>/stack,
livepatch, lockdep, and more. livepatch, lockdep, and more.
config FRAME_POINTER_UNWINDER config UNWINDER_ORC
bool "Frame pointer unwinder"
select FRAME_POINTER
---help---
This option enables the frame pointer unwinder for unwinding kernel
stack traces.
The unwinder itself is fast and it uses less RAM than the ORC
unwinder, but the kernel text size will grow by ~3% and the kernel's
overall performance will degrade by roughly 5-10%.
This option is recommended if you want to use the livepatch
consistency model, as this is currently the only way to get a
reliable stack trace (CONFIG_HAVE_RELIABLE_STACKTRACE).
config ORC_UNWINDER
bool "ORC unwinder" bool "ORC unwinder"
depends on X86_64 depends on X86_64
select STACK_VALIDATION select STACK_VALIDATION
@@ -396,7 +382,22 @@ config ORC_UNWINDER
 	  Enabling this option will increase the kernel's runtime memory usage
 	  by roughly 2-4MB, depending on your kernel config.

-config GUESS_UNWINDER
+config UNWINDER_FRAME_POINTER
+	bool "Frame pointer unwinder"
+	select FRAME_POINTER
+	---help---
+	  This option enables the frame pointer unwinder for unwinding kernel
+	  stack traces.
+
+	  The unwinder itself is fast and it uses less RAM than the ORC
+	  unwinder, but the kernel text size will grow by ~3% and the kernel's
+	  overall performance will degrade by roughly 5-10%.
+
+	  This option is recommended if you want to use the livepatch
+	  consistency model, as this is currently the only way to get a
+	  reliable stack trace (CONFIG_HAVE_RELIABLE_STACKTRACE).
+
+config UNWINDER_GUESS
 	bool "Guess unwinder"
 	depends on EXPERT
 	---help---
@@ -411,7 +412,7 @@ config GUESS_UNWINDER
 endchoice

 config FRAME_POINTER
-	depends on !ORC_UNWINDER && !GUESS_UNWINDER
+	depends on !UNWINDER_ORC && !UNWINDER_GUESS
 	bool

 endmenu
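
The rename is mechanical but load-bearing: any code or defconfig keyed to the old symbols must move to the new ones. A minimal sketch of a hypothetical consumer, assuming only the renamed Kconfig symbols above (the function and strings are illustrative, not from this commit):

/* Hypothetical sketch: compile-time dispatch on the renamed unwinder
 * options. Only the CONFIG_UNWINDER_* names come from the diff above.
 */
static const char *unwinder_name(void)
{
#if defined(CONFIG_UNWINDER_ORC)
    return "orc";           /* ORC metadata unwinder, x86_64 only */
#elif defined(CONFIG_UNWINDER_FRAME_POINTER)
    return "frame pointer"; /* ~3% text growth, ~5-10% slowdown */
#else
    return "guess";         /* heuristic scan, EXPERT only */
#endif
}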


@@ -78,6 +78,7 @@ vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
 vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/kaslr.o
 ifdef CONFIG_X86_64
 	vmlinux-objs-$(CONFIG_RANDOMIZE_BASE) += $(obj)/pagetable.o
+	vmlinux-objs-y += $(obj)/pgtable_64.o
 endif

 $(obj)/eboot.o: KBUILD_CFLAGS += -fshort-wchar -mno-red-zone


@@ -289,10 +289,18 @@ ENTRY(startup_64)
 	leaq	boot_stack_end(%rbx), %rsp

 #ifdef CONFIG_X86_5LEVEL
-	/* Check if 5-level paging has already enabled */
-	movq	%cr4, %rax
-	testl	$X86_CR4_LA57, %eax
-	jnz	lvl5
+	/*
+	 * Check if we need to enable 5-level paging.
+	 * RSI holds real mode data and need to be preserved across
+	 * a function call.
+	 */
+	pushq	%rsi
+	call	l5_paging_required
+	popq	%rsi
+
+	/* If l5_paging_required() returned zero, we're done here. */
+	cmpq	$0, %rax
+	je	lvl5

 	/*
 	 * At this point we are in long mode with 4-level paging enabled,
View File

@@ -169,6 +169,16 @@ void __puthex(unsigned long value)
 	}
 }

+static bool l5_supported(void)
+{
+	/* Check if leaf 7 is supported. */
+	if (native_cpuid_eax(0) < 7)
+		return 0;
+
+	/* Check if la57 is supported. */
+	return native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31));
+}
+
 #if CONFIG_X86_NEED_RELOCS
 static void handle_relocations(void *output, unsigned long output_len,
 			       unsigned long virt_addr)
@@ -362,6 +372,12 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 	console_init();
 	debug_putstr("early console in extract_kernel\n");

+	if (IS_ENABLED(CONFIG_X86_5LEVEL) && !l5_supported()) {
+		error("This linux kernel as configured requires 5-level paging\n"
+		      "This CPU does not support the required 'cr4.la57' feature\n"
+		      "Unable to boot - please use a kernel appropriate for your CPU\n");
+	}
+
 	free_mem_ptr     = heap;	/* Heap */
 	free_mem_end_ptr = heap + BOOT_HEAP_SIZE;


@@ -0,0 +1,28 @@
+#include <asm/processor.h>
+
+/*
+ * __force_order is used by special_insns.h asm code to force instruction
+ * serialization.
+ *
+ * It is not referenced from the code, but GCC < 5 with -fPIE would fail
+ * due to an undefined symbol. Define it to make these ancient GCCs work.
+ */
+unsigned long __force_order;
+
+int l5_paging_required(void)
+{
+	/* Check if leaf 7 is supported. */
+	if (native_cpuid_eax(0) < 7)
+		return 0;
+
+	/* Check if la57 is supported. */
+	if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+		return 0;
+
+	/* Check if 5-level paging has already been enabled. */
+	if (native_read_cr4() & X86_CR4_LA57)
+		return 0;
+
+	return 1;
+}
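
The same probe can be reproduced outside the kernel; a hedged user-space sketch using GCC/clang's <cpuid.h> (bit 16 of CPUID.(EAX=7,ECX=0):ECX is la57, which is what the kernel's (X86_FEATURE_LA57 & 31) expression evaluates to):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 7, subleaf 0; __get_cpuid_count validates the max supported leaf. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        puts("CPUID leaf 7 not supported");
        return 1;
    }
    printf("la57 (5-level paging) %ssupported\n",
           (ecx & (1u << 16)) ? "" : "not ");
    return 0;
}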


@@ -1,5 +1,5 @@
 CONFIG_NOHIGHMEM=y
 # CONFIG_HIGHMEM4G is not set
 # CONFIG_HIGHMEM64G is not set
-CONFIG_GUESS_UNWINDER=y
-# CONFIG_FRAME_POINTER_UNWINDER is not set
+CONFIG_UNWINDER_GUESS=y
+# CONFIG_UNWINDER_FRAME_POINTER is not set


@@ -299,6 +299,7 @@ CONFIG_DEBUG_STACKOVERFLOW=y
 # CONFIG_DEBUG_RODATA_TEST is not set
 CONFIG_DEBUG_BOOT_PARAMS=y
 CONFIG_OPTIMIZE_INLINING=y
+CONFIG_UNWINDER_ORC=y
 CONFIG_SECURITY=y
 CONFIG_SECURITY_NETWORK=y
 CONFIG_SECURITY_SELINUX=y


@@ -59,13 +59,6 @@ static int encrypt(struct blkcipher_desc *desc,

 	salsa20_ivsetup(ctx, walk.iv);

-	if (likely(walk.nbytes == nbytes))
-	{
-		salsa20_encrypt_bytes(ctx, walk.src.virt.addr,
-				      walk.dst.virt.addr, nbytes);
-		return blkcipher_walk_done(desc, &walk, 0);
-	}
-
 	while (walk.nbytes >= 64) {
 		salsa20_encrypt_bytes(ctx, walk.src.virt.addr,
 				      walk.dst.virt.addr,


@@ -142,56 +142,25 @@ For 32-bit we have the following conventions - kernel is built with
 	UNWIND_HINT_REGS offset=\offset
 	.endm

-	.macro RESTORE_EXTRA_REGS offset=0
-	movq 0*8+\offset(%rsp), %r15
-	movq 1*8+\offset(%rsp), %r14
-	movq 2*8+\offset(%rsp), %r13
-	movq 3*8+\offset(%rsp), %r12
-	movq 4*8+\offset(%rsp), %rbp
-	movq 5*8+\offset(%rsp), %rbx
-	UNWIND_HINT_REGS offset=\offset extra=0
-	.endm
-
-	.macro RESTORE_C_REGS_HELPER rstor_rax=1, rstor_rcx=1, rstor_r11=1, rstor_r8910=1, rstor_rdx=1
-	.if \rstor_r11
-	movq 6*8(%rsp), %r11
-	.endif
-	.if \rstor_r8910
-	movq 7*8(%rsp), %r10
-	movq 8*8(%rsp), %r9
-	movq 9*8(%rsp), %r8
-	.endif
-	.if \rstor_rax
-	movq 10*8(%rsp), %rax
-	.endif
-	.if \rstor_rcx
-	movq 11*8(%rsp), %rcx
-	.endif
-	.if \rstor_rdx
-	movq 12*8(%rsp), %rdx
-	.endif
-	movq 13*8(%rsp), %rsi
-	movq 14*8(%rsp), %rdi
-	UNWIND_HINT_IRET_REGS offset=16*8
-	.endm
-
-	.macro RESTORE_C_REGS
-	RESTORE_C_REGS_HELPER 1,1,1,1,1
-	.endm
-
-	.macro RESTORE_C_REGS_EXCEPT_RAX
-	RESTORE_C_REGS_HELPER 0,1,1,1,1
-	.endm
-
-	.macro RESTORE_C_REGS_EXCEPT_RCX
-	RESTORE_C_REGS_HELPER 1,0,1,1,1
-	.endm
-
-	.macro RESTORE_C_REGS_EXCEPT_R11
-	RESTORE_C_REGS_HELPER 1,1,0,1,1
-	.endm
-
-	.macro RESTORE_C_REGS_EXCEPT_RCX_R11
-	RESTORE_C_REGS_HELPER 1,0,0,1,1
-	.endm
-
-	.macro REMOVE_PT_GPREGS_FROM_STACK addskip=0
-	subq $-(15*8+\addskip), %rsp
-	.endm
+	.macro POP_EXTRA_REGS
+	popq %r15
+	popq %r14
+	popq %r13
+	popq %r12
+	popq %rbp
+	popq %rbx
+	.endm
+
+	.macro POP_C_REGS
+	popq %r11
+	popq %r10
+	popq %r9
+	popq %r8
+	popq %rax
+	popq %rcx
+	popq %rdx
+	popq %rsi
+	popq %rdi
+	.endm

 	.macro icebp
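
The pop sequences work because SAVE_EXTRA_REGS/SAVE_C_REGS lay the registers out in struct pt_regs order, with r15 at the lowest address, so popping in that order simply walks up the frame. An illustrative C view of that layout (mirroring the mainline x86-64 struct pt_regs; shown only to make the ordering explicit, not real kernel code):

/* Illustrative only: the frame the POP macros walk.  POP_EXTRA_REGS pops
 * r15..rbx, POP_C_REGS pops r11..rdi; what remains on top is orig_ax and
 * the hardware IRET frame.
 */
struct pt_regs_layout {
    unsigned long r15, r14, r13, r12, bp, bx;  /* "extra" regs */
    unsigned long r11, r10, r9, r8;            /* caller-clobbered */
    unsigned long ax, cx, dx, si, di;
    unsigned long orig_ax;                     /* syscall nr / error code */
    unsigned long ip, cs, flags, sp, ss;       /* hardware IRET frame */
};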


@@ -941,9 +941,10 @@ ENTRY(debug)
 	movl	%esp, %eax			# pt_regs pointer

 	/* Are we currently on the SYSENTER stack? */
-	PER_CPU(cpu_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx)
-	subl	%eax, %ecx	/* ecx = (end of SYSENTER_stack) - esp */
-	cmpl	$SIZEOF_SYSENTER_stack, %ecx
+	movl	PER_CPU_VAR(cpu_entry_area), %ecx
+	addl	$CPU_ENTRY_AREA_entry_stack + SIZEOF_entry_stack, %ecx
+	subl	%eax, %ecx	/* ecx = (end of entry_stack) - esp */
+	cmpl	$SIZEOF_entry_stack, %ecx
 	jb	.Ldebug_from_sysenter_stack

 	TRACE_IRQS_OFF
@@ -984,9 +985,10 @@ ENTRY(nmi)
 	movl	%esp, %eax			# pt_regs pointer

 	/* Are we currently on the SYSENTER stack? */
-	PER_CPU(cpu_tss + CPU_TSS_SYSENTER_stack + SIZEOF_SYSENTER_stack, %ecx)
-	subl	%eax, %ecx	/* ecx = (end of SYSENTER_stack) - esp */
-	cmpl	$SIZEOF_SYSENTER_stack, %ecx
+	movl	PER_CPU_VAR(cpu_entry_area), %ecx
+	addl	$CPU_ENTRY_AREA_entry_stack + SIZEOF_entry_stack, %ecx
+	subl	%eax, %ecx	/* ecx = (end of entry_stack) - esp */
+	cmpl	$SIZEOF_entry_stack, %ecx
 	jb	.Lnmi_from_sysenter_stack

 	/* Not on SYSENTER stack. */


@@ -136,6 +136,64 @@ END(native_usergs_sysret64)
  * with them due to bugs in both AMD and Intel CPUs.
  */

+	.pushsection .entry_trampoline, "ax"
+
+/*
+ * The code in here gets remapped into cpu_entry_area's trampoline.  This means
+ * that the assembler and linker have the wrong idea as to where this code
+ * lives (and, in fact, it's mapped more than once, so it's not even at a
+ * fixed address).  So we can't reference any symbols outside the entry
+ * trampoline and expect it to work.
+ *
+ * Instead, we carefully abuse %rip-relative addressing.
+ * _entry_trampoline(%rip) refers to the start of the remapped) entry
+ * trampoline.  We can thus find cpu_entry_area with this macro:
+ */
+
+#define CPU_ENTRY_AREA \
+	_entry_trampoline - CPU_ENTRY_AREA_entry_trampoline(%rip)
+
+/* The top word of the SYSENTER stack is hot and is usable as scratch space. */
+#define RSP_SCRATCH	CPU_ENTRY_AREA_entry_stack + \
+			SIZEOF_entry_stack - 8 + CPU_ENTRY_AREA
+
+ENTRY(entry_SYSCALL_64_trampoline)
+	UNWIND_HINT_EMPTY
+	swapgs
+
+	/* Stash the user RSP. */
+	movq	%rsp, RSP_SCRATCH
+
+	/* Load the top of the task stack into RSP */
+	movq	CPU_ENTRY_AREA_tss + TSS_sp1 + CPU_ENTRY_AREA, %rsp
+
+	/* Start building the simulated IRET frame. */
+	pushq	$__USER_DS			/* pt_regs->ss */
+	pushq	RSP_SCRATCH			/* pt_regs->sp */
+	pushq	%r11				/* pt_regs->flags */
+	pushq	$__USER_CS			/* pt_regs->cs */
+	pushq	%rcx				/* pt_regs->ip */
+
+	/*
+	 * x86 lacks a near absolute jump, and we can't jump to the real
+	 * entry text with a relative jump.  We could push the target
+	 * address and then use retq, but this destroys the pipeline on
+	 * many CPUs (wasting over 20 cycles on Sandy Bridge).  Instead,
+	 * spill RDI and restore it in a second-stage trampoline.
+	 */
+	pushq	%rdi
+	movq	$entry_SYSCALL_64_stage2, %rdi
+	jmp	*%rdi
+END(entry_SYSCALL_64_trampoline)
+
+	.popsection
+
+ENTRY(entry_SYSCALL_64_stage2)
+	UNWIND_HINT_EMPTY
+	popq	%rdi
+	jmp	entry_SYSCALL_64_after_hwframe
+END(entry_SYSCALL_64_stage2)
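
In C terms, the CPU_ENTRY_AREA macro above is container_of-style arithmetic done with %rip-relative addressing: the trampoline knows its own (remapped) address, and subtracting the field offset yields the surrounding cpu_entry_area. A hedged sketch, using names from the diff but otherwise illustrative:

/* Not real kernel code: a C rendering of the CPU_ENTRY_AREA macro. */
extern char _entry_trampoline[];

static inline struct cpu_entry_area *containing_cpu_entry_area(void)
{
    return (struct cpu_entry_area *)
        ((unsigned long)_entry_trampoline -
         offsetof(struct cpu_entry_area, entry_trampoline));
}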
 ENTRY(entry_SYSCALL_64)
 	UNWIND_HINT_EMPTY
 	/*
@@ -221,10 +279,9 @@ entry_SYSCALL_64_fastpath:
 	TRACE_IRQS_ON		/* user mode is traced as IRQs on */
 	movq	RIP(%rsp), %rcx
 	movq	EFLAGS(%rsp), %r11
-	RESTORE_C_REGS_EXCEPT_RCX_R11
-	movq	RSP(%rsp), %rsp
+	addq	$6*8, %rsp	/* skip extra regs -- they were preserved */
 	UNWIND_HINT_EMPTY
-	USERGS_SYSRET64
+	jmp	.Lpop_c_regs_except_rcx_r11_and_sysret

 1:
 	/*
@@ -246,17 +303,18 @@ entry_SYSCALL64_slow_path:
 	call	do_syscall_64		/* returns with IRQs disabled */

 return_from_SYSCALL_64:
-	RESTORE_EXTRA_REGS
 	TRACE_IRQS_IRETQ		/* we're about to change IF */

 	/*
 	 * Try to use SYSRET instead of IRET if we're returning to
-	 * a completely clean 64-bit userspace context.
+	 * a completely clean 64-bit userspace context.  If we're not,
+	 * go to the slow exit path.
 	 */
 	movq	RCX(%rsp), %rcx
 	movq	RIP(%rsp), %r11
-	cmpq	%rcx, %r11	/* RCX == RIP */
-	jne	opportunistic_sysret_failed
+
+	cmpq	%rcx, %r11	/* SYSRET requires RCX == RIP */
+	jne	swapgs_restore_regs_and_return_to_usermode

 	/*
 	 * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
@@ -274,14 +332,14 @@ return_from_SYSCALL_64:

 	/* If this changed %rcx, it was not canonical */
 	cmpq	%rcx, %r11
-	jne	opportunistic_sysret_failed
+	jne	swapgs_restore_regs_and_return_to_usermode

 	cmpq	$__USER_CS, CS(%rsp)		/* CS must match SYSRET */
-	jne	opportunistic_sysret_failed
+	jne	swapgs_restore_regs_and_return_to_usermode

 	movq	R11(%rsp), %r11
 	cmpq	%r11, EFLAGS(%rsp)		/* R11 == RFLAGS */
-	jne	opportunistic_sysret_failed
+	jne	swapgs_restore_regs_and_return_to_usermode

 	/*
 	 * SYSCALL clears RF when it saves RFLAGS in R11 and SYSRET cannot
@@ -302,12 +360,12 @@ return_from_SYSCALL_64:
 	 * would never get past 'stuck_here'.
 	 */
 	testq	$(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
-	jnz	opportunistic_sysret_failed
+	jnz	swapgs_restore_regs_and_return_to_usermode

 	/* nothing to check for RSP */

 	cmpq	$__USER_DS, SS(%rsp)		/* SS must match SYSRET */
-	jne	opportunistic_sysret_failed
+	jne	swapgs_restore_regs_and_return_to_usermode

 	/*
 	 * We win! This label is here just for ease of understanding
@@ -315,14 +373,36 @@ return_from_SYSCALL_64:
 	 */
 syscall_return_via_sysret:
 	/* rcx and r11 are already restored (see code above) */
-	RESTORE_C_REGS_EXCEPT_RCX_R11
-	movq	RSP(%rsp), %rsp
 	UNWIND_HINT_EMPTY
-	USERGS_SYSRET64
+	POP_EXTRA_REGS
+.Lpop_c_regs_except_rcx_r11_and_sysret:
+	popq	%rsi	/* skip r11 */
+	popq	%r10
+	popq	%r9
+	popq	%r8
+	popq	%rax
+	popq	%rsi	/* skip rcx */
+	popq	%rdx
+	popq	%rsi

-opportunistic_sysret_failed:
-	SWAPGS
-	jmp	restore_c_regs_and_iret
+	/*
+	 * Now all regs are restored except RSP and RDI.
+	 * Save old stack pointer and switch to trampoline stack.
+	 */
+	movq	%rsp, %rdi
+	movq	PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp
+
+	pushq	RSP-RDI(%rdi)	/* RSP */
+	pushq	(%rdi)		/* RDI */
+
+	/*
+	 * We are on the trampoline stack.  All regs except RDI are live.
+	 * We can do future final exit work right here.
+	 */
+
+	popq	%rdi
+	popq	%rsp
+	USERGS_SYSRET64
 END(entry_SYSCALL_64)

 ENTRY(stub_ptregs_64)
@@ -423,8 +503,7 @@ ENTRY(ret_from_fork)
 	movq	%rsp, %rdi
 	call	syscall_return_slowpath	/* returns with IRQs disabled */
 	TRACE_IRQS_ON			/* user mode is traced as IRQS on */
-	SWAPGS
-	jmp	restore_regs_and_iret
+	jmp	swapgs_restore_regs_and_return_to_usermode

 1:
 	/* kernel thread */
@@ -457,12 +536,13 @@ END(irq_entries_start)

 .macro DEBUG_ENTRY_ASSERT_IRQS_OFF
 #ifdef CONFIG_DEBUG_ENTRY
-	pushfq
-	testl $X86_EFLAGS_IF, (%rsp)
+	pushq %rax
+	SAVE_FLAGS(CLBR_RAX)
+	testl $X86_EFLAGS_IF, %eax
 	jz .Lokay_\@
 	ud2
 .Lokay_\@:
-	addq $8, %rsp
+	popq %rax
 #endif
 .endm
@@ -554,6 +634,13 @@ END(irq_entries_start)
 /* 0(%rsp): ~(interrupt number) */
 	.macro interrupt func
 	cld
+
+	testb	$3, CS-ORIG_RAX(%rsp)
+	jz	1f
+	SWAPGS
+	call	switch_to_thread_stack
+1:
+
 	ALLOC_PT_GPREGS_ON_STACK
 	SAVE_C_REGS
 	SAVE_EXTRA_REGS
@@ -563,12 +650,8 @@ END(irq_entries_start)
 	jz	1f

 	/*
-	 * IRQ from user mode.  Switch to kernel gsbase and inform context
-	 * tracking that we're in kernel mode.
-	 */
-	SWAPGS
-
-	/*
+	 * IRQ from user mode.
+	 *
 	 * We need to tell lockdep that IRQs are off.  We can't do this until
 	 * we fix gsbase, and we should do it before enter_from_user_mode
 	 * (which can take locks).  Since TRACE_IRQS_OFF idempotent,
@@ -612,8 +695,52 @@ GLOBAL(retint_user)
 	mov	%rsp,%rdi
 	call	prepare_exit_to_usermode
 	TRACE_IRQS_IRETQ
+
+GLOBAL(swapgs_restore_regs_and_return_to_usermode)
+#ifdef CONFIG_DEBUG_ENTRY
+	/* Assert that pt_regs indicates user mode. */
+	testb	$3, CS(%rsp)
+	jnz	1f
+	ud2
+1:
+#endif
+	POP_EXTRA_REGS
+	popq	%r11
+	popq	%r10
+	popq	%r9
+	popq	%r8
+	popq	%rax
+	popq	%rcx
+	popq	%rdx
+	popq	%rsi
+
+	/*
+	 * The stack is now user RDI, orig_ax, RIP, CS, EFLAGS, RSP, SS.
+	 * Save old stack pointer and switch to trampoline stack.
+	 */
+	movq	%rsp, %rdi
+	movq	PER_CPU_VAR(cpu_tss_rw + TSS_sp0), %rsp
+
+	/* Copy the IRET frame to the trampoline stack. */
+	pushq	6*8(%rdi)	/* SS */
+	pushq	5*8(%rdi)	/* RSP */
+	pushq	4*8(%rdi)	/* EFLAGS */
+	pushq	3*8(%rdi)	/* CS */
+	pushq	2*8(%rdi)	/* RIP */
+
+	/* Push user RDI on the trampoline stack. */
+	pushq	(%rdi)
+
+	/*
+	 * We are on the trampoline stack.  All regs except RDI are live.
+	 * We can do future final exit work right here.
+	 */
+
+	/* Restore RDI. */
+	popq	%rdi
 	SWAPGS
-	jmp	restore_regs_and_iret
+	INTERRUPT_RETURN

 /* Returning to kernel space */
 retint_kernel:
@@ -633,15 +760,17 @@ retint_kernel:
 	 */
 	TRACE_IRQS_IRETQ

-	/*
-	 * At this label, code paths which return to kernel and to user,
-	 * which come from interrupts/exception and from syscalls, merge.
-	 */
-GLOBAL(restore_regs_and_iret)
-	RESTORE_EXTRA_REGS
-restore_c_regs_and_iret:
-	RESTORE_C_REGS
-	REMOVE_PT_GPREGS_FROM_STACK 8
+GLOBAL(restore_regs_and_return_to_kernel)
+#ifdef CONFIG_DEBUG_ENTRY
+	/* Assert that pt_regs indicates kernel mode. */
+	testb	$3, CS(%rsp)
+	jz	1f
+	ud2
+1:
+#endif
+	POP_EXTRA_REGS
+	POP_C_REGS
+	addq	$8, %rsp	/* skip regs->orig_ax */
 	INTERRUPT_RETURN

 ENTRY(native_iret)
@@ -805,7 +934,33 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt
 /*
  * Exception entry points.
  */
-#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
+#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + ((x) - 1) * 8)
+
+/*
+ * Switch to the thread stack.  This is called with the IRET frame and
+ * orig_ax on the stack.  (That is, RDI..R12 are not on the stack and
+ * space has not been allocated for them.)
+ */
+ENTRY(switch_to_thread_stack)
+	UNWIND_HINT_FUNC
+
+	pushq	%rdi
+	movq	%rsp, %rdi
+	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+	UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI
+
+	pushq	7*8(%rdi)		/* regs->ss */
+	pushq	6*8(%rdi)		/* regs->rsp */
+	pushq	5*8(%rdi)		/* regs->eflags */
+	pushq	4*8(%rdi)		/* regs->cs */
+	pushq	3*8(%rdi)		/* regs->ip */
+	pushq	2*8(%rdi)		/* regs->orig_ax */
+	pushq	8(%rdi)			/* return address */
+	UNWIND_HINT_FUNC
+
+	movq	(%rdi), %rdi
+	ret
+END(switch_to_thread_stack)

 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
@@ -818,17 +973,18 @@ ENTRY(\sym)
 	ASM_CLAC

-	.ifeq \has_error_code
+	.if \has_error_code == 0
 	pushq	$-1			/* ORIG_RAX: no syscall to restart */
 	.endif

 	ALLOC_PT_GPREGS_ON_STACK

-	.if \paranoid
-	.if \paranoid == 1
+	.if \paranoid < 2
 	testb	$3, CS(%rsp)		/* If coming from userspace, switch stacks */
-	jnz	1f
+	jnz	.Lfrom_usermode_switch_stack_\@
 	.endif
+
+	.if \paranoid
 	call	paranoid_entry
 	.else
 	call	error_entry
@@ -870,20 +1026,15 @@ ENTRY(\sym)
 	jmp	error_exit
 	.endif

-	.if \paranoid == 1
+	.if \paranoid < 2
 	/*
-	 * Paranoid entry from userspace.  Switch stacks and treat it
+	 * Entry from userspace.  Switch stacks and treat it
 	 * as a normal entry.  This means that paranoid handlers
 	 * run in real process context if user_mode(regs).
 	 */
-1:
+.Lfrom_usermode_switch_stack_\@:
 	call	error_entry

-	movq	%rsp, %rdi		/* pt_regs pointer */
-	call	sync_regs
-	movq	%rax, %rsp		/* switch stack */
-
 	movq	%rsp, %rdi		/* pt_regs pointer */
 	.if \has_error_code
@@ -1059,6 +1210,7 @@ idtentry int3 do_int3 has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
 idtentry stack_segment do_stack_segment has_error_code=1

 #ifdef CONFIG_XEN
+idtentry xennmi do_nmi has_error_code=0
 idtentry xendebug do_debug has_error_code=0
 idtentry xenint3 do_int3 has_error_code=0
 #endif
@@ -1112,17 +1264,14 @@ ENTRY(paranoid_exit)
 	DISABLE_INTERRUPTS(CLBR_ANY)
 	TRACE_IRQS_OFF_DEBUG
 	testl	%ebx, %ebx		/* swapgs needed? */
-	jnz	paranoid_exit_no_swapgs
+	jnz	.Lparanoid_exit_no_swapgs
 	TRACE_IRQS_IRETQ
 	SWAPGS_UNSAFE_STACK
-	jmp	paranoid_exit_restore
-paranoid_exit_no_swapgs:
+	jmp	.Lparanoid_exit_restore
+.Lparanoid_exit_no_swapgs:
 	TRACE_IRQS_IRETQ_DEBUG
-paranoid_exit_restore:
-	RESTORE_EXTRA_REGS
-	RESTORE_C_REGS
-	REMOVE_PT_GPREGS_FROM_STACK 8
-	INTERRUPT_RETURN
+.Lparanoid_exit_restore:
+	jmp restore_regs_and_return_to_kernel
 END(paranoid_exit)

 /*
@@ -1146,6 +1295,14 @@ ENTRY(error_entry)
 	SWAPGS

 .Lerror_entry_from_usermode_after_swapgs:
+	/* Put us onto the real thread stack. */
+	popq	%r12			/* save return addr in %12 */
+	movq	%rsp, %rdi		/* arg0 = pt_regs pointer */
+	call	sync_regs
+	movq	%rax, %rsp		/* switch stack */
+	ENCODE_FRAME_POINTER
+	pushq	%r12
+
 	/*
 	 * We need to tell lockdep that IRQs are off.  We can't do this until
 	 * we fix gsbase, and we should do it before enter_from_user_mode
@@ -1223,10 +1380,13 @@ ENTRY(error_exit)
 	jmp	retint_user
 END(error_exit)

-/* Runs on exception stack */
-/* XXX: broken on Xen PV */
+/*
+ * Runs on exception stack.  Xen PV does not go through this path at all,
+ * so we can use real assembly here.
+ */
 ENTRY(nmi)
+	UNWIND_HINT_IRET_REGS
 	/*
 	 * We allow breakpoints in NMIs.  If a breakpoint occurs, then
 	 * the iretq it performs will take us out of NMI context.
@@ -1284,7 +1444,7 @@ ENTRY(nmi)
 	 * stacks lest we corrupt the "NMI executing" variable.
 	 */

-	SWAPGS_UNSAFE_STACK
+	swapgs
 	cld
 	movq	%rsp, %rdx
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
@@ -1328,8 +1488,7 @@ ENTRY(nmi)
 	 * Return back to user mode.  We must *not* do the normal exit
 	 * work, because we don't want to enable interrupts.
 	 */
-	SWAPGS
-	jmp	restore_regs_and_iret
+	jmp	swapgs_restore_regs_and_return_to_usermode

 .Lnmi_from_kernel:
 	/*
@@ -1450,7 +1609,7 @@ nested_nmi_out:
 	popq	%rdx

 	/* We are returning to kernel mode, so this cannot result in a fault. */
-	INTERRUPT_RETURN
+	iretq

 first_nmi:
/* Restore rdx. */ /* Restore rdx. */
@ -1481,7 +1640,7 @@ first_nmi:
pushfq /* RFLAGS */ pushfq /* RFLAGS */
pushq $__KERNEL_CS /* CS */ pushq $__KERNEL_CS /* CS */
pushq $1f /* RIP */ pushq $1f /* RIP */
INTERRUPT_RETURN /* continues at repeat_nmi below */ iretq /* continues at repeat_nmi below */
UNWIND_HINT_IRET_REGS UNWIND_HINT_IRET_REGS
1: 1:
#endif #endif
@@ -1544,29 +1703,34 @@ end_repeat_nmi:
 nmi_swapgs:
 	SWAPGS_UNSAFE_STACK
 nmi_restore:
-	RESTORE_EXTRA_REGS
-	RESTORE_C_REGS
+	POP_EXTRA_REGS
+	POP_C_REGS

-	/* Point RSP at the "iret" frame. */
-	REMOVE_PT_GPREGS_FROM_STACK 6*8
+	/*
+	 * Skip orig_ax and the "outermost" frame to point RSP
+	 * at the "iret" frame.
+	 */
+	addq	$6*8, %rsp

 	/*
 	 * Clear "NMI executing".  Set DF first so that we can easily
 	 * distinguish the remaining code between here and IRET from
-	 * the SYSCALL entry and exit paths.  On a native kernel, we
-	 * could just inspect RIP, but, on paravirt kernels,
-	 * INTERRUPT_RETURN can translate into a jump into a
-	 * hypercall page.
+	 * the SYSCALL entry and exit paths.
+	 *
+	 * We arguably should just inspect RIP instead, but I (Andy) wrote
+	 * this code when I had the misapprehension that Xen PV supported
+	 * NMIs, and Xen PV would break that approach.
 	 */
 	std
 	movq	$0, 5*8(%rsp)		/* clear "NMI executing" */

 	/*
-	 * INTERRUPT_RETURN reads the "iret" frame and exits the NMI
-	 * stack in a single instruction.  We are returning to kernel
-	 * mode, so this cannot result in a fault.
+	 * iretq reads the "iret" frame and exits the NMI stack in a
+	 * single instruction.  We are returning to kernel mode, so this
+	 * cannot result in a fault.  Similarly, we don't need to worry
+	 * about espfix64 on the way back to kernel mode.
 	 */
-	INTERRUPT_RETURN
+	iretq
 END(nmi)

 ENTRY(ignore_sysret)


@@ -48,7 +48,7 @@
  */
 ENTRY(entry_SYSENTER_compat)
 	/* Interrupts are off on entry. */
-	SWAPGS_UNSAFE_STACK
+	SWAPGS
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp

 	/*
@@ -306,8 +306,11 @@ ENTRY(entry_INT80_compat)
 	 */
 	movl	%eax, %eax

 	/* Construct struct pt_regs on stack (iret frame is already on stack) */
 	pushq	%rax			/* pt_regs->orig_ax */
+
+	/* switch to thread stack expects orig_ax to be pushed */
+	call	switch_to_thread_stack
+
 	pushq	%rdi			/* pt_regs->di */
 	pushq	%rsi			/* pt_regs->si */
 	pushq	%rdx			/* pt_regs->dx */
@@ -337,8 +340,7 @@ ENTRY(entry_INT80_compat)

 	/* Go back to user mode. */
 	TRACE_IRQS_ON
-	SWAPGS
-	jmp	restore_regs_and_iret
+	jmp	swapgs_restore_regs_and_return_to_usermode
 END(entry_INT80_compat)

 ENTRY(stub32_clone)


@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
-out := $(obj)/../../include/generated/asm
-uapi := $(obj)/../../include/generated/uapi/asm
+out := arch/$(SRCARCH)/include/generated/asm
+uapi := arch/$(SRCARCH)/include/generated/uapi/asm

 # Create output directory if not already present
 _dummy := $(shell [ -d '$(out)' ] || mkdir -p '$(out)') \


@@ -37,6 +37,7 @@
 #include <asm/unistd.h>
 #include <asm/fixmap.h>
 #include <asm/traps.h>
+#include <asm/paravirt.h>

 #define CREATE_TRACE_POINTS
 #include "vsyscall_trace.h"
@@ -138,6 +139,10 @@ bool emulate_vsyscall(struct pt_regs *regs, unsigned long address)

 	WARN_ON_ONCE(address != regs->ip);

+	/* This should be unreachable in NATIVE mode. */
+	if (WARN_ON(vsyscall_mode == NATIVE))
+		return false;
+
 	if (vsyscall_mode == NONE) {
 		warn_bad_vsyscall(KERN_INFO, regs,
 				  "vsyscall attempted with vsyscall=none");
@@ -329,16 +334,47 @@ int in_gate_area_no_mm(unsigned long addr)
 	return vsyscall_mode != NONE && (addr & PAGE_MASK) == VSYSCALL_ADDR;
 }

+/*
+ * The VSYSCALL page is the only user-accessible page in the kernel address
+ * range.  Normally, the kernel page tables can have _PAGE_USER clear, but
+ * the tables covering VSYSCALL_ADDR need _PAGE_USER set if vsyscalls
+ * are enabled.
+ *
+ * Some day we may create a "minimal" vsyscall mode in which we emulate
+ * vsyscalls but leave the page not present.  If so, we skip calling
+ * this.
+ */
+static void __init set_vsyscall_pgtable_user_bits(void)
+{
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	pgd = pgd_offset_k(VSYSCALL_ADDR);
+	set_pgd(pgd, __pgd(pgd_val(*pgd) | _PAGE_USER));
+	p4d = p4d_offset(pgd, VSYSCALL_ADDR);
+#if CONFIG_PGTABLE_LEVELS >= 5
+	p4d->p4d |= _PAGE_USER;
+#endif
+	pud = pud_offset(p4d, VSYSCALL_ADDR);
+	set_pud(pud, __pud(pud_val(*pud) | _PAGE_USER));
+	pmd = pmd_offset(pud, VSYSCALL_ADDR);
+	set_pmd(pmd, __pmd(pmd_val(*pmd) | _PAGE_USER));
+}
+
 void __init map_vsyscall(void)
 {
 	extern char __vsyscall_page;
 	unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);

-	if (vsyscall_mode != NONE)
+	if (vsyscall_mode != NONE) {
 		__set_fixmap(VSYSCALL_PAGE, physaddr_vsyscall,
 			     vsyscall_mode == NATIVE
 			     ? PAGE_KERNEL_VSYSCALL
 			     : PAGE_KERNEL_VVAR);
+		set_vsyscall_pgtable_user_bits();
+	}

 	BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) !=
 		     (unsigned long)VSYSCALL_ADDR);
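
The walk above exists because user access to a virtual address requires _PAGE_USER at every paging level, not just in the leaf entry. A hedged sketch of the inverse check, assuming the usual pgtable accessors (the helper is hypothetical, not from this commit):

/* Hypothetical checker: true only if every level covering @addr is
 * user-accessible -- the property set_vsyscall_pgtable_user_bits()
 * establishes for VSYSCALL_ADDR.
 */
static bool vaddr_user_reachable(unsigned long addr)
{
    pgd_t *pgd = pgd_offset_k(addr);
    p4d_t *p4d = p4d_offset(pgd, addr);
    pud_t *pud = pud_offset(p4d, addr);
    pmd_t *pmd = pmd_offset(pud, addr);

    return (pgd_val(*pgd) & _PAGE_USER) && (p4d_val(*p4d) & _PAGE_USER) &&
           (pud_val(*pud) & _PAGE_USER) && (pmd_val(*pmd) & _PAGE_USER);
}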


@@ -2371,7 +2371,7 @@ static unsigned long get_segment_base(unsigned int segment)
 		struct ldt_struct *ldt;

 		/* IRQs are off, so this synchronizes with smp_store_release */
-		ldt = lockless_dereference(current->active_mm->context.ldt);
+		ldt = READ_ONCE(current->active_mm->context.ldt);
 		if (!ldt || idx >= ldt->nr_entries)
 			return 0;


@@ -2958,6 +2958,10 @@ static unsigned long intel_pmu_free_running_flags(struct perf_event *event)
 	if (event->attr.use_clockid)
 		flags &= ~PERF_SAMPLE_TIME;
+	if (!event->attr.exclude_kernel)
+		flags &= ~PERF_SAMPLE_REGS_USER;
+	if (event->attr.sample_regs_user & ~PEBS_REGS)
+		flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR);
 	return flags;
 }
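
From user space, the practical consequence is that only ring-3-limited events whose requested registers fall inside PEBS_REGS stay on the multi-entry "free running" PEBS path; anything else falls back to one PMI per sample. A hedged sketch of a perf_event_attr that keeps the fast path (field values are illustrative):

#include <linux/perf_event.h>
#include <asm/perf_regs.h>
#include <string.h>

static void fill_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->config = PERF_COUNT_HW_CPU_CYCLES;
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_REGS_USER;
    /* Both registers are within PEBS_REGS, so free-running PEBS can apply. */
    attr->sample_regs_user = (1ULL << PERF_REG_X86_IP) |
                             (1ULL << PERF_REG_X86_SP);
    attr->exclude_kernel = 1;  /* required: REGS_USER is ring-3-only here */
}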


@@ -85,13 +85,15 @@ struct amd_nb {
  * Flags PEBS can handle without an PMI.
  *
  * TID can only be handled by flushing at context switch.
+ * REGS_USER can be handled for events limited to ring 3.
  *
  */
 #define PEBS_FREERUNNING_FLAGS \
 	(PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR | \
 	PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID | \
 	PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER | \
-	PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR)
+	PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR | \
+	PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)

 /*
  * A debug store configuration.
@@ -110,6 +112,26 @@ struct debug_store {
 	u64	pebs_event_reset[MAX_PEBS_EVENTS];
 };

+#define PEBS_REGS \
+	(PERF_REG_X86_AX | \
+	 PERF_REG_X86_BX | \
+	 PERF_REG_X86_CX | \
+	 PERF_REG_X86_DX | \
+	 PERF_REG_X86_DI | \
+	 PERF_REG_X86_SI | \
+	 PERF_REG_X86_SP | \
+	 PERF_REG_X86_BP | \
+	 PERF_REG_X86_IP | \
+	 PERF_REG_X86_FLAGS | \
+	 PERF_REG_X86_R8 | \
+	 PERF_REG_X86_R9 | \
+	 PERF_REG_X86_R10 | \
+	 PERF_REG_X86_R11 | \
+	 PERF_REG_X86_R12 | \
+	 PERF_REG_X86_R13 | \
+	 PERF_REG_X86_R14 | \
+	 PERF_REG_X86_R15)
+
 /*
  * Per register state.
  */


@@ -113,7 +113,7 @@ void hyperv_init(void)
 	u64 guest_id;
 	union hv_x64_msr_hypercall_contents hypercall_msr;

-	if (x86_hyper != &x86_hyper_ms_hyperv)
+	if (x86_hyper_type != X86_HYPER_MS_HYPERV)
 		return;

 	/* Allocate percpu VP index */


@@ -45,7 +45,7 @@ static inline bool rdrand_long(unsigned long *v)
 	bool ok;
 	unsigned int retry = RDRAND_RETRY_LOOPS;
 	do {
-		asm volatile(RDRAND_LONG "\n\t"
+		asm volatile(RDRAND_LONG
			     CC_SET(c)
			     : CC_OUT(c) (ok), "=a" (*v));
		if (ok)
@@ -59,7 +59,7 @@ static inline bool rdrand_int(unsigned int *v)
 	bool ok;
 	unsigned int retry = RDRAND_RETRY_LOOPS;
 	do {
-		asm volatile(RDRAND_INT "\n\t"
+		asm volatile(RDRAND_INT
			     CC_SET(c)
			     : CC_OUT(c) (ok), "=a" (*v));
		if (ok)
@@ -71,7 +71,7 @@ static inline bool rdrand_int(unsigned int *v)
 static inline bool rdseed_long(unsigned long *v)
 {
 	bool ok;
-	asm volatile(RDSEED_LONG "\n\t"
+	asm volatile(RDSEED_LONG
		     CC_SET(c)
		     : CC_OUT(c) (ok), "=a" (*v));
 	return ok;
@@ -80,7 +80,7 @@ static inline bool rdseed_long(unsigned long *v)
 static inline bool rdseed_int(unsigned int *v)
 {
 	bool ok;
-	asm volatile(RDSEED_INT "\n\t"
+	asm volatile(RDSEED_INT
		     CC_SET(c)
		     : CC_OUT(c) (ok), "=a" (*v));
 	return ok;
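
The dropped "\n\t" only ever separated the instruction from the SETcc that CC_SET() emits on compilers without flag-output support; with flag outputs, the condition code is consumed directly and there is no second instruction to separate. A standalone sketch of the same pattern (which is also what the bitops.h changes below rely on), assuming a GCC 6+/clang x86-64 compiler:

#include <stdbool.h>

/* CF after RDRAND indicates success; "=@ccc" binds it straight to ok. */
static inline bool rdrand64_sketch(unsigned long long *v)
{
    bool ok;

    asm volatile("rdrand %1"
                 : "=@ccc" (ok), "=r" (*v));
    return ok;
}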


@@ -143,7 +143,7 @@ static __always_inline void __clear_bit(long nr, volatile unsigned long *addr)
 static __always_inline bool clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr)
 {
 	bool negative;
-	asm volatile(LOCK_PREFIX "andb %2,%1\n\t"
+	asm volatile(LOCK_PREFIX "andb %2,%1"
		CC_SET(s)
		: CC_OUT(s) (negative), ADDR
		: "ir" ((char) ~(1 << nr)) : "memory");
@@ -246,7 +246,7 @@ static __always_inline bool __test_and_set_bit(long nr, volatile unsigned long *
 {
 	bool oldbit;
-	asm("bts %2,%1\n\t"
+	asm("bts %2,%1"
	    CC_SET(c)
	    : CC_OUT(c) (oldbit), ADDR
	    : "Ir" (nr));
@@ -286,7 +286,7 @@ static __always_inline bool __test_and_clear_bit(long nr, volatile unsigned long
 {
 	bool oldbit;
-	asm volatile("btr %2,%1\n\t"
+	asm volatile("btr %2,%1"
		     CC_SET(c)
		     : CC_OUT(c) (oldbit), ADDR
		     : "Ir" (nr));
@@ -298,7 +298,7 @@ static __always_inline bool __test_and_change_bit(long nr, volatile unsigned lon
 {
 	bool oldbit;
-	asm volatile("btc %2,%1\n\t"
+	asm volatile("btc %2,%1"
		     CC_SET(c)
		     : CC_OUT(c) (oldbit), ADDR
		     : "Ir" (nr) : "memory");
@@ -329,7 +329,7 @@ static __always_inline bool variable_test_bit(long nr, volatile const unsigned l
 {
 	bool oldbit;
-	asm volatile("bt %2,%1\n\t"
+	asm volatile("bt %2,%1"
		     CC_SET(c)
		     : CC_OUT(c) (oldbit)
		     : "m" (*(unsigned long *)addr), "Ir" (nr));


@@ -7,6 +7,7 @@
  */
 #include <linux/types.h>
 #include <linux/sched.h>
+#include <linux/sched/task_stack.h>
 #include <asm/processor.h>
 #include <asm/user32.h>
 #include <asm/unistd.h>
@@ -209,7 +210,6 @@ typedef struct compat_siginfo {
 } compat_siginfo_t;

 #define COMPAT_OFF_T_MAX	0x7fffffff
-#define COMPAT_LOFF_T_MAX	0x7fffffffffffffffL

 struct compat_ipc64_perm {
 	compat_key_t key;


@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef _ASM_X86_CPU_ENTRY_AREA_H
+#define _ASM_X86_CPU_ENTRY_AREA_H
+
+#include <linux/percpu-defs.h>
+#include <asm/processor.h>
+
+/*
+ * cpu_entry_area is a percpu region that contains things needed by the CPU
+ * and early entry/exit code.  Real types aren't used for all fields here
+ * to avoid circular header dependencies.
+ *
+ * Every field is a virtual alias of some other allocated backing store.
+ * There is no direct allocation of a struct cpu_entry_area.
+ */
+struct cpu_entry_area {
+	char gdt[PAGE_SIZE];
+
+	/*
+	 * The GDT is just below entry_stack and thus serves (on x86_64) as
+	 * a a read-only guard page.
+	 */
+	struct entry_stack_page entry_stack_page;
+
+	/*
+	 * On x86_64, the TSS is mapped RO.  On x86_32, it's mapped RW because
+	 * we need task switches to work, and task switches write to the TSS.
+	 */
+	struct tss_struct tss;
+
+	char entry_trampoline[PAGE_SIZE];
+
+#ifdef CONFIG_X86_64
+	/*
+	 * Exception stacks used for IST entries.
+	 *
+	 * In the future, this should have a separate slot for each stack
+	 * with guard pages between them.
+	 */
+	char exception_stacks[(N_EXCEPTION_STACKS - 1) * EXCEPTION_STKSZ + DEBUG_STKSZ];
+#endif
+};
+
+#define CPU_ENTRY_AREA_SIZE	(sizeof(struct cpu_entry_area))
+#define CPU_ENTRY_AREA_TOT_SIZE	(CPU_ENTRY_AREA_SIZE * NR_CPUS)
+
+DECLARE_PER_CPU(struct cpu_entry_area *, cpu_entry_area);
+
+extern void setup_cpu_entry_areas(void);
+extern void cea_set_pte(void *cea_vaddr, phys_addr_t pa, pgprot_t flags);
+
+#define CPU_ENTRY_AREA_RO_IDT		CPU_ENTRY_AREA_BASE
+#define CPU_ENTRY_AREA_PER_CPU		(CPU_ENTRY_AREA_RO_IDT + PAGE_SIZE)
+
+#define CPU_ENTRY_AREA_RO_IDT_VADDR	((void *)CPU_ENTRY_AREA_RO_IDT)
+
+#define CPU_ENTRY_AREA_MAP_SIZE			\
+	(CPU_ENTRY_AREA_PER_CPU + CPU_ENTRY_AREA_TOT_SIZE - CPU_ENTRY_AREA_BASE)
+
+extern struct cpu_entry_area *get_cpu_entry_area(int cpu);
+
+static inline struct entry_stack *cpu_entry_stack(int cpu)
+{
+	return &get_cpu_entry_area(cpu)->entry_stack_page.stack;
+}
+
+#endif
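
Given the fixed base and the per-CPU stride implied by the defines above, locating a CPU's entry area is plain address arithmetic; a hedged sketch of what the exported get_cpu_entry_area() amounts to (the real helper lives elsewhere in the tree):

/* Sketch only: areas are laid out contiguously, one per CPU, starting at
 * CPU_ENTRY_AREA_PER_CPU.
 */
static inline struct cpu_entry_area *get_cpu_entry_area_sketch(int cpu)
{
    unsigned long va = CPU_ENTRY_AREA_PER_CPU +
                       (unsigned long)cpu * CPU_ENTRY_AREA_SIZE;

    return (struct cpu_entry_area *)va;
}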


@@ -126,16 +126,17 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 #define boot_cpu_has(bit)	cpu_has(&boot_cpu_data, bit)

 #define set_cpu_cap(c, bit)	set_bit(bit, (unsigned long *)((c)->x86_capability))
-#define clear_cpu_cap(c, bit)	clear_bit(bit, (unsigned long *)((c)->x86_capability))
-#define setup_clear_cpu_cap(bit) do { \
-	clear_cpu_cap(&boot_cpu_data, bit);	\
-	set_bit(bit, (unsigned long *)cpu_caps_cleared); \
-} while (0)
+
+extern void setup_clear_cpu_cap(unsigned int bit);
+extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
+
 #define setup_force_cpu_cap(bit) do { \
 	set_cpu_cap(&boot_cpu_data, bit);	\
 	set_bit(bit, (unsigned long *)cpu_caps_set);	\
 } while (0)
+
+#define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit)

 #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_X86_FAST_FEATURE_TESTS)
 /*
  * Static testing of CPU features. Used the same as boot_cpu_has().
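
Turning clear_cpu_cap()/setup_clear_cpu_cap() into out-of-line functions lets them consult a feature-dependency table, so clearing one capability transitively clears everything built on it. A hedged sketch of that idea (the real table and walker live in arch/x86/kernel/cpu/cpuid-deps.c; the entries and helper below are illustrative only):

struct cpuid_dep_sketch {
    unsigned int feature;
    unsigned int depends;
};

static const struct cpuid_dep_sketch deps_sketch[] = {
    { X86_FEATURE_AVX2, X86_FEATURE_AVX   },
    { X86_FEATURE_AVX,  X86_FEATURE_XSAVE },
    {}
};

static void clear_with_dependents(struct cpuinfo_x86 *c, unsigned int feature)
{
    const struct cpuid_dep_sketch *d;

    clear_bit(feature, (unsigned long *)c->x86_capability);
    for (d = deps_sketch; d->feature; d++)
        if (d->depends == feature)
            clear_with_dependents(c, d->feature); /* recurse downward */
}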


@@ -20,9 +20,12 @@
 * Note: If the comment begins with a quoted string, that string is used
 * in /proc/cpuinfo instead of the macro name. If the string is "",
 * this feature bit is not displayed in /proc/cpuinfo at all.
+ *
+ * When adding new features here that depend on other features,
+ * please update the table in kernel/cpu/cpuid-deps.c as well.
 */

-/* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
+/* Intel-defined CPU features, CPUID level 0x00000001 (EDX), word 0 */
 #define X86_FEATURE_FPU ( 0*32+ 0) /* Onboard FPU */
 #define X86_FEATURE_VME ( 0*32+ 1) /* Virtual Mode Extensions */
 #define X86_FEATURE_DE ( 0*32+ 2) /* Debugging Extensions */
@@ -37,8 +40,7 @@
 #define X86_FEATURE_MTRR ( 0*32+12) /* Memory Type Range Registers */
 #define X86_FEATURE_PGE ( 0*32+13) /* Page Global Enable */
 #define X86_FEATURE_MCA ( 0*32+14) /* Machine Check Architecture */
-#define X86_FEATURE_CMOV ( 0*32+15) /* CMOV instructions */
-/* (plus FCMOVcc, FCOMI with FPU) */
+#define X86_FEATURE_CMOV ( 0*32+15) /* CMOV instructions (plus FCMOVcc, FCOMI with FPU) */
 #define X86_FEATURE_PAT ( 0*32+16) /* Page Attribute Table */
 #define X86_FEATURE_PSE36 ( 0*32+17) /* 36-bit PSEs */
 #define X86_FEATURE_PN ( 0*32+18) /* Processor serial number */
@@ -58,15 +60,15 @@
 /* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
 /* Don't duplicate feature flags which are redundant with Intel! */
 #define X86_FEATURE_SYSCALL ( 1*32+11) /* SYSCALL/SYSRET */
-#define X86_FEATURE_MP ( 1*32+19) /* MP Capable. */
+#define X86_FEATURE_MP ( 1*32+19) /* MP Capable */
 #define X86_FEATURE_NX ( 1*32+20) /* Execute Disable */
 #define X86_FEATURE_MMXEXT ( 1*32+22) /* AMD MMX extensions */
 #define X86_FEATURE_FXSR_OPT ( 1*32+25) /* FXSAVE/FXRSTOR optimizations */
 #define X86_FEATURE_GBPAGES ( 1*32+26) /* "pdpe1gb" GB pages */
 #define X86_FEATURE_RDTSCP ( 1*32+27) /* RDTSCP */
-#define X86_FEATURE_LM ( 1*32+29) /* Long Mode (x86-64) */
-#define X86_FEATURE_3DNOWEXT ( 1*32+30) /* AMD 3DNow! extensions */
-#define X86_FEATURE_3DNOW ( 1*32+31) /* 3DNow! */
+#define X86_FEATURE_LM ( 1*32+29) /* Long Mode (x86-64, 64-bit support) */
+#define X86_FEATURE_3DNOWEXT ( 1*32+30) /* AMD 3DNow extensions */
+#define X86_FEATURE_3DNOW ( 1*32+31) /* 3DNow */

 /* Transmeta-defined CPU features, CPUID level 0x80860001, word 2 */
 #define X86_FEATURE_RECOVERY ( 2*32+ 0) /* CPU in recovery mode */
@@ -79,66 +81,67 @@
 #define X86_FEATURE_K6_MTRR ( 3*32+ 1) /* AMD K6 nonstandard MTRRs */
 #define X86_FEATURE_CYRIX_ARR ( 3*32+ 2) /* Cyrix ARRs (= MTRRs) */
 #define X86_FEATURE_CENTAUR_MCR ( 3*32+ 3) /* Centaur MCRs (= MTRRs) */
-/* cpu types for specific tunings: */
+
+/* CPU types for specific tunings: */
 #define X86_FEATURE_K8 ( 3*32+ 4) /* "" Opteron, Athlon64 */
 #define X86_FEATURE_K7 ( 3*32+ 5) /* "" Athlon */
 #define X86_FEATURE_P3 ( 3*32+ 6) /* "" P3 */
 #define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */
 #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
-#define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
-#define X86_FEATURE_ART ( 3*32+10) /* Platform has always running timer (ART) */
+#define X86_FEATURE_UP ( 3*32+ 9) /* SMP kernel running on UP */
+#define X86_FEATURE_ART ( 3*32+10) /* Always running timer (ART) */
 #define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
 #define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */
 #define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */
-#define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in ia32 userspace */
-#define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in ia32 userspace */
-#define X86_FEATURE_REP_GOOD ( 3*32+16) /* rep microcode works well */
-#define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) /* "" Mfence synchronizes RDTSC */
-#define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" Lfence synchronizes RDTSC */
+#define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in IA32 userspace */
+#define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in IA32 userspace */
+#define X86_FEATURE_REP_GOOD ( 3*32+16) /* REP microcode works well */
+#define X86_FEATURE_MFENCE_RDTSC ( 3*32+17) /* "" MFENCE synchronizes RDTSC */
+#define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" LFENCE synchronizes RDTSC */
 #define X86_FEATURE_ACC_POWER ( 3*32+19) /* AMD Accumulated Power Mechanism */
 #define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */
 #define X86_FEATURE_ALWAYS ( 3*32+21) /* "" Always-present feature */
-#define X86_FEATURE_XTOPOLOGY ( 3*32+22) /* cpu topology enum extensions */
+#define X86_FEATURE_XTOPOLOGY ( 3*32+22) /* CPU topology enum extensions */
 #define X86_FEATURE_TSC_RELIABLE ( 3*32+23) /* TSC is known to be reliable */
 #define X86_FEATURE_NONSTOP_TSC ( 3*32+24) /* TSC does not stop in C states */
 #define X86_FEATURE_CPUID ( 3*32+25) /* CPU has CPUID instruction itself */
-#define X86_FEATURE_EXTD_APICID ( 3*32+26) /* has extended APICID (8 bits) */
-#define X86_FEATURE_AMD_DCM ( 3*32+27) /* multi-node processor */
-#define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */
+#define X86_FEATURE_EXTD_APICID ( 3*32+26) /* Extended APICID (8 bits) */
+#define X86_FEATURE_AMD_DCM ( 3*32+27) /* AMD multi-node processor */
+#define X86_FEATURE_APERFMPERF ( 3*32+28) /* P-State hardware coordination feedback capability (APERF/MPERF MSRs) */
 #define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
 #define X86_FEATURE_TSC_KNOWN_FREQ ( 3*32+31) /* TSC has known frequency */

-/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
+/* Intel-defined CPU features, CPUID level 0x00000001 (ECX), word 4 */
 #define X86_FEATURE_XMM3 ( 4*32+ 0) /* "pni" SSE-3 */
 #define X86_FEATURE_PCLMULQDQ ( 4*32+ 1) /* PCLMULQDQ instruction */
 #define X86_FEATURE_DTES64 ( 4*32+ 2) /* 64-bit Debug Store */
-#define X86_FEATURE_MWAIT ( 4*32+ 3) /* "monitor" Monitor/Mwait support */
-#define X86_FEATURE_DSCPL ( 4*32+ 4) /* "ds_cpl" CPL Qual. Debug Store */
+#define X86_FEATURE_MWAIT ( 4*32+ 3) /* "monitor" MONITOR/MWAIT support */
+#define X86_FEATURE_DSCPL ( 4*32+ 4) /* "ds_cpl" CPL-qualified (filtered) Debug Store */
 #define X86_FEATURE_VMX ( 4*32+ 5) /* Hardware virtualization */
-#define X86_FEATURE_SMX ( 4*32+ 6) /* Safer mode */
+#define X86_FEATURE_SMX ( 4*32+ 6) /* Safer Mode eXtensions */
 #define X86_FEATURE_EST ( 4*32+ 7) /* Enhanced SpeedStep */
 #define X86_FEATURE_TM2 ( 4*32+ 8) /* Thermal Monitor 2 */
 #define X86_FEATURE_SSSE3 ( 4*32+ 9) /* Supplemental SSE-3 */
 #define X86_FEATURE_CID ( 4*32+10) /* Context ID */
 #define X86_FEATURE_SDBG ( 4*32+11) /* Silicon Debug */
 #define X86_FEATURE_FMA ( 4*32+12) /* Fused multiply-add */
-#define X86_FEATURE_CX16 ( 4*32+13) /* CMPXCHG16B */
+#define X86_FEATURE_CX16 ( 4*32+13) /* CMPXCHG16B instruction */
 #define X86_FEATURE_XTPR ( 4*32+14) /* Send Task Priority Messages */
-#define X86_FEATURE_PDCM ( 4*32+15) /* Performance Capabilities */
+#define X86_FEATURE_PDCM ( 4*32+15) /* Perf/Debug Capabilities MSR */
 #define X86_FEATURE_PCID ( 4*32+17) /* Process Context Identifiers */
 #define X86_FEATURE_DCA ( 4*32+18) /* Direct Cache Access */
 #define X86_FEATURE_XMM4_1 ( 4*32+19) /* "sse4_1" SSE-4.1 */
 #define X86_FEATURE_XMM4_2 ( 4*32+20) /* "sse4_2" SSE-4.2 */
-#define X86_FEATURE_X2APIC ( 4*32+21) /* x2APIC */
+#define X86_FEATURE_X2APIC ( 4*32+21) /* X2APIC */
 #define X86_FEATURE_MOVBE ( 4*32+22) /* MOVBE instruction */
 #define X86_FEATURE_POPCNT ( 4*32+23) /* POPCNT instruction */
-#define X86_FEATURE_TSC_DEADLINE_TIMER ( 4*32+24) /* Tsc deadline timer */
+#define X86_FEATURE_TSC_DEADLINE_TIMER ( 4*32+24) /* TSC deadline timer */
 #define X86_FEATURE_AES ( 4*32+25) /* AES instructions */
-#define X86_FEATURE_XSAVE ( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
-#define X86_FEATURE_OSXSAVE ( 4*32+27) /* "" XSAVE enabled in the OS */
+#define X86_FEATURE_XSAVE ( 4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV instructions */
+#define X86_FEATURE_OSXSAVE ( 4*32+27) /* "" XSAVE instruction enabled in the OS */
 #define X86_FEATURE_AVX ( 4*32+28) /* Advanced Vector Extensions */
-#define X86_FEATURE_F16C ( 4*32+29) /* 16-bit fp conversions */
-#define X86_FEATURE_RDRAND ( 4*32+30) /* The RDRAND instruction */
+#define X86_FEATURE_F16C ( 4*32+29) /* 16-bit FP conversions */
+#define X86_FEATURE_RDRAND ( 4*32+30) /* RDRAND instruction */
 #define X86_FEATURE_HYPERVISOR ( 4*32+31) /* Running on a hypervisor */

 /* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */
@@ -153,10 +156,10 @@
 #define X86_FEATURE_PMM ( 5*32+12) /* PadLock Montgomery Multiplier */
 #define X86_FEATURE_PMM_EN ( 5*32+13) /* PMM enabled */

-/* More extended AMD flags: CPUID level 0x80000001, ecx, word 6 */
+/* More extended AMD flags: CPUID level 0x80000001, ECX, word 6 */
 #define X86_FEATURE_LAHF_LM ( 6*32+ 0) /* LAHF/SAHF in long mode */
 #define X86_FEATURE_CMP_LEGACY ( 6*32+ 1) /* If yes HyperThreading not valid */
-#define X86_FEATURE_SVM ( 6*32+ 2) /* Secure virtual machine */
+#define X86_FEATURE_SVM ( 6*32+ 2) /* Secure Virtual Machine */
 #define X86_FEATURE_EXTAPIC ( 6*32+ 3) /* Extended APIC space */
 #define X86_FEATURE_CR8_LEGACY ( 6*32+ 4) /* CR8 in 32-bit mode */
 #define X86_FEATURE_ABM ( 6*32+ 5) /* Advanced bit manipulation */
@@ -170,16 +173,16 @@
 #define X86_FEATURE_WDT ( 6*32+13) /* Watchdog timer */
 #define X86_FEATURE_LWP ( 6*32+15) /* Light Weight Profiling */
 #define X86_FEATURE_FMA4 ( 6*32+16) /* 4 operands MAC instructions */
-#define X86_FEATURE_TCE ( 6*32+17) /* translation cache extension */
+#define X86_FEATURE_TCE ( 6*32+17) /* Translation Cache Extension */
 #define X86_FEATURE_NODEID_MSR ( 6*32+19) /* NodeId MSR */
-#define X86_FEATURE_TBM ( 6*32+21) /* trailing bit manipulations */
-#define X86_FEATURE_TOPOEXT ( 6*32+22) /* topology extensions CPUID leafs */
-#define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* core performance counter extensions */
+#define X86_FEATURE_TBM ( 6*32+21) /* Trailing Bit Manipulations */
+#define X86_FEATURE_TOPOEXT ( 6*32+22) /* Topology extensions CPUID leafs */
+#define X86_FEATURE_PERFCTR_CORE ( 6*32+23) /* Core performance counter extensions */
 #define X86_FEATURE_PERFCTR_NB ( 6*32+24) /* NB performance counter extensions */
-#define X86_FEATURE_BPEXT (6*32+26) /* data breakpoint extension */
-#define X86_FEATURE_PTSC ( 6*32+27) /* performance time-stamp counter */
+#define X86_FEATURE_BPEXT ( 6*32+26) /* Data breakpoint extension */
+#define X86_FEATURE_PTSC ( 6*32+27) /* Performance time-stamp counter */
 #define X86_FEATURE_PERFCTR_LLC ( 6*32+28) /* Last Level Cache performance counter extensions */
-#define X86_FEATURE_MWAITX ( 6*32+29) /* MWAIT extension (MONITORX/MWAITX) */
+#define X86_FEATURE_MWAITX ( 6*32+29) /* MWAIT extension (MONITORX/MWAITX instructions) */

 /*
 * Auxiliary flags: Linux defined - For features scattered in various
@@ -187,7 +190,7 @@
 *
 * Reuse free bits when adding new feature flags!
 */
-#define X86_FEATURE_RING3MWAIT ( 7*32+ 0) /* Ring 3 MONITOR/MWAIT */
+#define X86_FEATURE_RING3MWAIT ( 7*32+ 0) /* Ring 3 MONITOR/MWAIT instructions */
 #define X86_FEATURE_CPUID_FAULT ( 7*32+ 1) /* Intel CPUID faulting */
 #define X86_FEATURE_CPB ( 7*32+ 2) /* AMD Core Performance Boost */
 #define X86_FEATURE_EPB ( 7*32+ 3) /* IA32_ENERGY_PERF_BIAS support */
@@ -213,19 +216,19 @@
 #define X86_FEATURE_EPT ( 8*32+ 3) /* Intel Extended Page Table */
 #define X86_FEATURE_VPID ( 8*32+ 4) /* Intel Virtual Processor ID */

-#define X86_FEATURE_VMMCALL ( 8*32+15) /* Prefer vmmcall to vmcall */
+#define X86_FEATURE_VMMCALL ( 8*32+15) /* Prefer VMMCALL to VMCALL */
 #define X86_FEATURE_XENPV ( 8*32+16) /* "" Xen paravirtual guest */

-/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
-#define X86_FEATURE_FSGSBASE ( 9*32+ 0) /* {RD/WR}{FS/GS}BASE instructions*/
-#define X86_FEATURE_TSC_ADJUST ( 9*32+ 1) /* TSC adjustment MSR 0x3b */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (EBX), word 9 */
+#define X86_FEATURE_FSGSBASE ( 9*32+ 0) /* RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/
+#define X86_FEATURE_TSC_ADJUST ( 9*32+ 1) /* TSC adjustment MSR 0x3B */
 #define X86_FEATURE_BMI1 ( 9*32+ 3) /* 1st group bit manipulation extensions */
 #define X86_FEATURE_HLE ( 9*32+ 4) /* Hardware Lock Elision */
 #define X86_FEATURE_AVX2 ( 9*32+ 5) /* AVX2 instructions */
 #define X86_FEATURE_SMEP ( 9*32+ 7) /* Supervisor Mode Execution Protection */
 #define X86_FEATURE_BMI2 ( 9*32+ 8) /* 2nd group bit manipulation extensions */
-#define X86_FEATURE_ERMS ( 9*32+ 9) /* Enhanced REP MOVSB/STOSB */
+#define X86_FEATURE_ERMS ( 9*32+ 9) /* Enhanced REP MOVSB/STOSB instructions */
 #define X86_FEATURE_INVPCID ( 9*32+10) /* Invalidate Processor Context ID */
 #define X86_FEATURE_RTM ( 9*32+11) /* Restricted Transactional Memory */
 #define X86_FEATURE_CQM ( 9*32+12) /* Cache QoS Monitoring */
@ -233,8 +236,8 @@
#define X86_FEATURE_RDT_A ( 9*32+15) /* Resource Director Technology Allocation */ #define X86_FEATURE_RDT_A ( 9*32+15) /* Resource Director Technology Allocation */
#define X86_FEATURE_AVX512F ( 9*32+16) /* AVX-512 Foundation */ #define X86_FEATURE_AVX512F ( 9*32+16) /* AVX-512 Foundation */
#define X86_FEATURE_AVX512DQ ( 9*32+17) /* AVX-512 DQ (Double/Quad granular) Instructions */ #define X86_FEATURE_AVX512DQ ( 9*32+17) /* AVX-512 DQ (Double/Quad granular) Instructions */
#define X86_FEATURE_RDSEED ( 9*32+18) /* The RDSEED instruction */ #define X86_FEATURE_RDSEED ( 9*32+18) /* RDSEED instruction */
#define X86_FEATURE_ADX ( 9*32+19) /* The ADCX and ADOX instructions */ #define X86_FEATURE_ADX ( 9*32+19) /* ADCX and ADOX instructions */
#define X86_FEATURE_SMAP ( 9*32+20) /* Supervisor Mode Access Prevention */ #define X86_FEATURE_SMAP ( 9*32+20) /* Supervisor Mode Access Prevention */
#define X86_FEATURE_AVX512IFMA ( 9*32+21) /* AVX-512 Integer Fused Multiply-Add instructions */ #define X86_FEATURE_AVX512IFMA ( 9*32+21) /* AVX-512 Integer Fused Multiply-Add instructions */
#define X86_FEATURE_CLFLUSHOPT ( 9*32+23) /* CLFLUSHOPT instruction */ #define X86_FEATURE_CLFLUSHOPT ( 9*32+23) /* CLFLUSHOPT instruction */
@ -246,25 +249,26 @@
#define X86_FEATURE_AVX512BW ( 9*32+30) /* AVX-512 BW (Byte/Word granular) Instructions */ #define X86_FEATURE_AVX512BW ( 9*32+30) /* AVX-512 BW (Byte/Word granular) Instructions */
#define X86_FEATURE_AVX512VL ( 9*32+31) /* AVX-512 VL (128/256 Vector Length) Extensions */ #define X86_FEATURE_AVX512VL ( 9*32+31) /* AVX-512 VL (128/256 Vector Length) Extensions */
/* Extended state features, CPUID level 0x0000000d:1 (eax), word 10 */ /* Extended state features, CPUID level 0x0000000d:1 (EAX), word 10 */
#define X86_FEATURE_XSAVEOPT (10*32+ 0) /* XSAVEOPT */ #define X86_FEATURE_XSAVEOPT (10*32+ 0) /* XSAVEOPT instruction */
#define X86_FEATURE_XSAVEC (10*32+ 1) /* XSAVEC */ #define X86_FEATURE_XSAVEC (10*32+ 1) /* XSAVEC instruction */
#define X86_FEATURE_XGETBV1 (10*32+ 2) /* XGETBV with ECX = 1 */ #define X86_FEATURE_XGETBV1 (10*32+ 2) /* XGETBV with ECX = 1 instruction */
#define X86_FEATURE_XSAVES (10*32+ 3) /* XSAVES/XRSTORS */ #define X86_FEATURE_XSAVES (10*32+ 3) /* XSAVES/XRSTORS instructions */
/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (edx), word 11 */ /* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (EDX), word 11 */
#define X86_FEATURE_CQM_LLC (11*32+ 1) /* LLC QoS if 1 */ #define X86_FEATURE_CQM_LLC (11*32+ 1) /* LLC QoS if 1 */
/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (edx), word 12 */ /* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (EDX), word 12 */
#define X86_FEATURE_CQM_OCCUP_LLC (12*32+ 0) /* LLC occupancy monitoring if 1 */ #define X86_FEATURE_CQM_OCCUP_LLC (12*32+ 0) /* LLC occupancy monitoring */
#define X86_FEATURE_CQM_MBM_TOTAL (12*32+ 1) /* LLC Total MBM monitoring */ #define X86_FEATURE_CQM_MBM_TOTAL (12*32+ 1) /* LLC Total MBM monitoring */
#define X86_FEATURE_CQM_MBM_LOCAL (12*32+ 2) /* LLC Local MBM monitoring */ #define X86_FEATURE_CQM_MBM_LOCAL (12*32+ 2) /* LLC Local MBM monitoring */
/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */ /* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */
#define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */ #define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */
#define X86_FEATURE_IRPERF (13*32+ 1) /* Instructions Retired Count */ #define X86_FEATURE_IRPERF (13*32+ 1) /* Instructions Retired Count */
#define X86_FEATURE_XSAVEERPTR (13*32+ 2) /* Always save/restore FP error pointers */
/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */ /* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */ #define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
#define X86_FEATURE_IDA (14*32+ 1) /* Intel Dynamic Acceleration */ #define X86_FEATURE_IDA (14*32+ 1) /* Intel Dynamic Acceleration */
#define X86_FEATURE_ARAT (14*32+ 2) /* Always Running APIC Timer */ #define X86_FEATURE_ARAT (14*32+ 2) /* Always Running APIC Timer */
@ -276,7 +280,7 @@
#define X86_FEATURE_HWP_EPP (14*32+10) /* HWP Energy Perf. Preference */ #define X86_FEATURE_HWP_EPP (14*32+10) /* HWP Energy Perf. Preference */
#define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */ #define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */
/* AMD SVM Feature Identification, CPUID level 0x8000000a (edx), word 15 */ /* AMD SVM Feature Identification, CPUID level 0x8000000a (EDX), word 15 */
#define X86_FEATURE_NPT (15*32+ 0) /* Nested Page Table support */ #define X86_FEATURE_NPT (15*32+ 0) /* Nested Page Table support */
#define X86_FEATURE_LBRV (15*32+ 1) /* LBR Virtualization support */ #define X86_FEATURE_LBRV (15*32+ 1) /* LBR Virtualization support */
#define X86_FEATURE_SVML (15*32+ 2) /* "svm_lock" SVM locking MSR */ #define X86_FEATURE_SVML (15*32+ 2) /* "svm_lock" SVM locking MSR */
@ -291,15 +295,22 @@
#define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */ #define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */
#define X86_FEATURE_VGIF (15*32+16) /* Virtual GIF */ #define X86_FEATURE_VGIF (15*32+16) /* Virtual GIF */
/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 16 */ /* Intel-defined CPU features, CPUID level 0x00000007:0 (ECX), word 16 */
#define X86_FEATURE_AVX512VBMI (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/ #define X86_FEATURE_AVX512VBMI (16*32+ 1) /* AVX512 Vector Bit Manipulation instructions*/
#define X86_FEATURE_UMIP (16*32+ 2) /* User Mode Instruction Protection */
#define X86_FEATURE_PKU (16*32+ 3) /* Protection Keys for Userspace */ #define X86_FEATURE_PKU (16*32+ 3) /* Protection Keys for Userspace */
#define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */ #define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */
#define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */
#define X86_FEATURE_GFNI (16*32+ 8) /* Galois Field New Instructions */
#define X86_FEATURE_VAES (16*32+ 9) /* Vector AES */
#define X86_FEATURE_VPCLMULQDQ (16*32+10) /* Carry-Less Multiplication Double Quadword */
#define X86_FEATURE_AVX512_VNNI (16*32+11) /* Vector Neural Network Instructions */
#define X86_FEATURE_AVX512_BITALG (16*32+12) /* Support for VPOPCNT[B,W] and VPSHUF-BITQMB instructions */
#define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */ #define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */
#define X86_FEATURE_LA57 (16*32+16) /* 5-level page tables */ #define X86_FEATURE_LA57 (16*32+16) /* 5-level page tables */
#define X86_FEATURE_RDPID (16*32+22) /* RDPID instruction */ #define X86_FEATURE_RDPID (16*32+22) /* RDPID instruction */
/* AMD-defined CPU features, CPUID level 0x80000007 (ebx), word 17 */ /* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
#define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* MCA overflow recovery support */ #define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* MCA overflow recovery support */
#define X86_FEATURE_SUCCOR (17*32+ 1) /* Uncorrectable error containment and recovery */ #define X86_FEATURE_SUCCOR (17*32+ 1) /* Uncorrectable error containment and recovery */
#define X86_FEATURE_SMCA (17*32+ 3) /* Scalable MCA */ #define X86_FEATURE_SMCA (17*32+ 3) /* Scalable MCA */
@ -329,4 +340,5 @@
#define X86_BUG_SWAPGS_FENCE X86_BUG(11) /* SWAPGS without input dep on GS */ #define X86_BUG_SWAPGS_FENCE X86_BUG(11) /* SWAPGS without input dep on GS */
#define X86_BUG_MONITOR X86_BUG(12) /* IPI required to wake up remote CPU */ #define X86_BUG_MONITOR X86_BUG(12) /* IPI required to wake up remote CPU */
#define X86_BUG_AMD_E400 X86_BUG(13) /* CPU is among the affected by Erratum 400 */ #define X86_BUG_AMD_E400 X86_BUG(13) /* CPU is among the affected by Erratum 400 */
#endif /* _ASM_X86_CPUFEATURES_H */ #endif /* _ASM_X86_CPUFEATURES_H */
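These feature words are not meant to be tested with raw bit arithmetic; kernel code queries them through the cpufeature helpers. A minimal sketch of how the new word-16 bits might be consumed (the helper names are the kernel's usual ones, but this fragment is illustrative and not part of the diff):

	/* Illustrative only: gate a code path on two capability bits. */
	#include <asm/cpufeature.h>

	static bool vaes_usable(void)
	{
		/* boot_cpu_has() tests the capability bits gathered at boot. */
		return boot_cpu_has(X86_FEATURE_VAES) &&
		       boot_cpu_has(X86_FEATURE_AVX2);
	}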
@@ -7,6 +7,7 @@
#include <asm/mmu.h>
#include <asm/fixmap.h>
#include <asm/irq_vectors.h>
+#include <asm/cpu_entry_area.h>

#include <linux/smp.h>
#include <linux/percpu.h>
@@ -60,17 +61,10 @@ static inline struct desc_struct *get_current_gdt_rw(void)
	return this_cpu_ptr(&gdt_page)->gdt;
}

-/* Get the fixmap index for a specific processor */
-static inline unsigned int get_cpu_gdt_ro_index(int cpu)
-{
-	return FIX_GDT_REMAP_BEGIN + cpu;
-}
-
/* Provide the fixmap address of the remapped GDT */
static inline struct desc_struct *get_cpu_gdt_ro(int cpu)
{
-	unsigned int idx = get_cpu_gdt_ro_index(cpu);
-	return (struct desc_struct *)__fix_to_virt(idx);
+	return (struct desc_struct *)&get_cpu_entry_area(cpu)->gdt;
}

/* Provide the current read-only GDT */
@@ -185,7 +179,7 @@ static inline void set_tssldt_descriptor(void *d, unsigned long addr,
#endif
}

-static inline void __set_tss_desc(unsigned cpu, unsigned int entry, void *addr)
+static inline void __set_tss_desc(unsigned cpu, unsigned int entry, struct x86_hw_tss *addr)
{
	struct desc_struct *d = get_cpu_gdt_rw(cpu);
	tss_desc tss;
@@ -2,7 +2,7 @@
#ifndef _ASM_X86_ESPFIX_H
#define _ASM_X86_ESPFIX_H

-#ifdef CONFIG_X86_64
+#ifdef CONFIG_X86_ESPFIX64

#include <asm/percpu.h>
@@ -11,7 +11,8 @@ DECLARE_PER_CPU_READ_MOSTLY(unsigned long, espfix_waddr);
extern void init_espfix_bsp(void);
extern void init_espfix_ap(int cpu);
-#endif /* CONFIG_X86_64 */
+#else
+static inline void init_espfix_ap(int cpu) { }
+#endif

#endif /* _ASM_X86_ESPFIX_H */
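The point of replacing the bare #endif with an #else stub is that callers no longer need their own ifdeffery around init_espfix_ap(). A sketch of the call-site pattern this enables (the surrounding function is hypothetical):

	/* Hypothetical caller: compiles the same whether or not
	 * CONFIG_X86_ESPFIX64 is set, thanks to the inline stub above. */
	static void bringup_secondary(int cpu)
	{
		init_espfix_ap(cpu);	/* no-op when espfix64 is compiled out */
	}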
@@ -44,7 +44,6 @@ extern unsigned long __FIXADDR_TOP;
			 PAGE_SIZE)
#endif

/*
 * Here we define all the compile-time 'special' virtual
 * addresses. The point is to have a constant address at
@@ -84,7 +83,6 @@ enum fixed_addresses {
	FIX_IO_APIC_BASE_0,
	FIX_IO_APIC_BASE_END = FIX_IO_APIC_BASE_0 + MAX_IO_APICS - 1,
#endif
-	FIX_RO_IDT,	/* Virtual mapping for read-only IDT */
#ifdef CONFIG_X86_32
	FIX_KMAP_BEGIN,	/* reserved pte's for temporary kernel mappings */
	FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
@@ -100,9 +98,12 @@ enum fixed_addresses {
#ifdef CONFIG_X86_INTEL_MID
	FIX_LNW_VRTC,
#endif
-	/* Fixmap entries to remap the GDTs, one per processor. */
-	FIX_GDT_REMAP_BEGIN,
-	FIX_GDT_REMAP_END = FIX_GDT_REMAP_BEGIN + NR_CPUS - 1,
+#ifdef CONFIG_ACPI_APEI_GHES
+	/* Used for GHES mapping from assorted contexts */
+	FIX_APEI_GHES_IRQ,
+	FIX_APEI_GHES_NMI,
+#endif

	__end_of_permanent_fixed_addresses,
@@ -20,14 +20,22 @@
#ifndef _ASM_X86_HYPERVISOR_H
#define _ASM_X86_HYPERVISOR_H

+/* x86 hypervisor types */
+enum x86_hypervisor_type {
+	X86_HYPER_NATIVE = 0,
+	X86_HYPER_VMWARE,
+	X86_HYPER_MS_HYPERV,
+	X86_HYPER_XEN_PV,
+	X86_HYPER_XEN_HVM,
+	X86_HYPER_KVM,
+};
+
#ifdef CONFIG_HYPERVISOR_GUEST

#include <asm/kvm_para.h>
+#include <asm/x86_init.h>
#include <asm/xen/hypervisor.h>

-/*
- * x86 hypervisor information
- */
struct hypervisor_x86 {
	/* Hypervisor name */
	const char	*name;
@@ -35,40 +43,27 @@ struct hypervisor_x86 {
	/* Detection routine */
	uint32_t	(*detect)(void);

-	/* Platform setup (run once per boot) */
-	void		(*init_platform)(void);
-
-	/* X2APIC detection (run once per boot) */
-	bool		(*x2apic_available)(void);
-
-	/* pin current vcpu to specified physical cpu (run rarely) */
-	void		(*pin_vcpu)(int);
-
-	/* called during init_mem_mapping() to setup early mappings. */
-	void		(*init_mem_mapping)(void);
+	/* Hypervisor type */
+	enum x86_hypervisor_type type;
+
+	/* init time callbacks */
+	struct x86_hyper_init init;
+
+	/* runtime callbacks */
+	struct x86_hyper_runtime runtime;
};

-extern const struct hypervisor_x86 *x86_hyper;
+extern enum x86_hypervisor_type x86_hyper_type;

-/* Recognized hypervisors */
-extern const struct hypervisor_x86 x86_hyper_vmware;
-extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
-extern const struct hypervisor_x86 x86_hyper_xen_pv;
-extern const struct hypervisor_x86 x86_hyper_xen_hvm;
-extern const struct hypervisor_x86 x86_hyper_kvm;
-
extern void init_hypervisor_platform(void);
-extern bool hypervisor_x2apic_available(void);
-extern void hypervisor_pin_vcpu(int cpu);
-
-static inline void hypervisor_init_mem_mapping(void)
+static inline bool hypervisor_is_type(enum x86_hypervisor_type type)
{
-	if (x86_hyper && x86_hyper->init_mem_mapping)
-		x86_hyper->init_mem_mapping();
+	return x86_hyper_type == type;
}
#else
static inline void init_hypervisor_platform(void) { }
-static inline bool hypervisor_x2apic_available(void) { return false; }
-static inline void hypervisor_init_mem_mapping(void) { }
+static inline bool hypervisor_is_type(enum x86_hypervisor_type type)
+{
+	return type == X86_HYPER_NATIVE;
+}
#endif /* CONFIG_HYPERVISOR_GUEST */

#endif /* _ASM_X86_HYPERVISOR_H */
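With a single x86_hyper_type enum replacing the per-hypervisor extern descriptors, a detection check collapses to one comparison. A minimal sketch of the intended call pattern (the wrapper function is illustrative):

	/* Illustrative: ask "am I running as a Xen PV guest?" */
	static bool running_on_xen_pv(void)
	{
		return hypervisor_is_type(X86_HYPER_XEN_PV);
	}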
@@ -97,6 +97,16 @@
#define INAT_MAKE_GROUP(grp)	((grp << INAT_GRP_OFFS) | INAT_MODRM)
#define INAT_MAKE_IMM(imm)	(imm << INAT_IMM_OFFS)

+/* Identifiers for segment registers */
+#define INAT_SEG_REG_IGNORE	0
+#define INAT_SEG_REG_DEFAULT	1
+#define INAT_SEG_REG_CS		2
+#define INAT_SEG_REG_SS		3
+#define INAT_SEG_REG_DS		4
+#define INAT_SEG_REG_ES		5
+#define INAT_SEG_REG_FS		6
+#define INAT_SEG_REG_GS		7
+
/* Attribute search APIs */
extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode);
extern int inat_get_last_prefix_id(insn_byte_t last_pfx);
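The INAT_SEG_REG_* identifiers give the instruction decoder a compact name for a segment override. A sketch of how the legacy override prefix bytes could map onto them (the prefix byte values are architectural; the helper itself is illustrative and not part of this file):

	/* Illustrative: map an x86 segment-override prefix byte to an
	 * INAT_SEG_REG_* identifier; anything else keeps the default. */
	static int seg_reg_from_prefix(insn_byte_t pfx)
	{
		switch (pfx) {
		case 0x26: return INAT_SEG_REG_ES;
		case 0x2e: return INAT_SEG_REG_CS;
		case 0x36: return INAT_SEG_REG_SS;
		case 0x3e: return INAT_SEG_REG_DS;
		case 0x64: return INAT_SEG_REG_FS;
		case 0x65: return INAT_SEG_REG_GS;
		default:   return INAT_SEG_REG_DEFAULT;
		}
	}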
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_INVPCID
+#define _ASM_X86_INVPCID
+
+static inline void __invpcid(unsigned long pcid, unsigned long addr,
+			     unsigned long type)
+{
+	struct { u64 d[2]; } desc = { { pcid, addr } };
+
+	/*
+	 * The memory clobber is because the whole point is to invalidate
+	 * stale TLB entries and, especially if we're flushing global
+	 * mappings, we don't want the compiler to reorder any subsequent
+	 * memory accesses before the TLB flush.
+	 *
+	 * The hex opcode is invpcid (%ecx), %eax in 32-bit mode and
+	 * invpcid (%rcx), %rax in long mode.
+	 */
+	asm volatile (".byte 0x66, 0x0f, 0x38, 0x82, 0x01"
+		      : : "m" (desc), "a" (type), "c" (&desc) : "memory");
+}
+
+#define INVPCID_TYPE_INDIV_ADDR		0
+#define INVPCID_TYPE_SINGLE_CTXT	1
+#define INVPCID_TYPE_ALL_INCL_GLOBAL	2
+#define INVPCID_TYPE_ALL_NON_GLOBAL	3
+
+/* Flush all mappings for a given pcid and addr, not including globals. */
+static inline void invpcid_flush_one(unsigned long pcid,
+				     unsigned long addr)
+{
+	__invpcid(pcid, addr, INVPCID_TYPE_INDIV_ADDR);
+}
+
+/* Flush all mappings for a given PCID, not including globals. */
+static inline void invpcid_flush_single_context(unsigned long pcid)
+{
+	__invpcid(pcid, 0, INVPCID_TYPE_SINGLE_CTXT);
+}
+
+/* Flush all mappings, including globals, for all PCIDs. */
+static inline void invpcid_flush_all(void)
+{
+	__invpcid(0, 0, INVPCID_TYPE_ALL_INCL_GLOBAL);
+}
+
+/* Flush all mappings for all PCIDs except globals. */
+static inline void invpcid_flush_all_nonglobals(void)
+{
+	__invpcid(0, 0, INVPCID_TYPE_ALL_NON_GLOBAL);
+}
+
+#endif /* _ASM_X86_INVPCID */
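INVPCID faults with #UD on CPUs that do not advertise it, so callers are expected to gate these helpers on the feature bit. A minimal sketch, assuming the usual cpufeature helper (the wrapper itself is illustrative, not part of this file):

	/* Illustrative: only issue INVPCID when the CPU supports it. */
	static void flush_pcid_if_supported(unsigned long pcid)
	{
		if (static_cpu_has(X86_FEATURE_INVPCID))
			invpcid_flush_single_context(pcid);
	}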
@@ -142,6 +142,9 @@ static inline notrace unsigned long arch_local_irq_save(void)
	swapgs;					\
	sysretl

+#ifdef CONFIG_DEBUG_ENTRY
+#define SAVE_FLAGS(x)		pushfq; popq %rax
+#endif
#else
#define INTERRUPT_RETURN		iret
#define ENABLE_INTERRUPTS_SYSEXIT	sti; sysexit
@@ -26,6 +26,7 @@ extern void die(const char *, struct pt_regs *,long);
extern int __must_check __die(const char *, struct pt_regs *, long);
extern void show_stack_regs(struct pt_regs *regs);
extern void __show_regs(struct pt_regs *regs, int all);
+extern void show_iret_regs(struct pt_regs *regs);
extern unsigned long oops_begin(void);
extern void oops_end(unsigned long, struct pt_regs *, int signr);
@@ -1426,4 +1426,7 @@ static inline int kvm_cpu_get_apicid(int mps_cpu)
#endif
}

+void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
+		unsigned long start, unsigned long end);
+
#endif /* _ASM_X86_KVM_HOST_H */
@@ -3,6 +3,7 @@
#define _ASM_X86_MMU_H

#include <linux/spinlock.h>
+#include <linux/rwsem.h>
#include <linux/mutex.h>
#include <linux/atomic.h>
@@ -27,6 +28,7 @@ typedef struct {
	atomic64_t tlb_gen;

#ifdef CONFIG_MODIFY_LDT_SYSCALL
+	struct rw_semaphore ldt_usr_sem;
	struct ldt_struct *ldt;
#endif
@@ -57,10 +57,16 @@ struct ldt_struct {
/*
 * Used for LDT copy/destruction.
 */
-int init_new_context_ldt(struct task_struct *tsk, struct mm_struct *mm);
+static inline void init_new_context_ldt(struct mm_struct *mm)
+{
+	mm->context.ldt = NULL;
+	init_rwsem(&mm->context.ldt_usr_sem);
+}
+
+int ldt_dup_context(struct mm_struct *oldmm, struct mm_struct *mm);
void destroy_context_ldt(struct mm_struct *mm);
#else	/* CONFIG_MODIFY_LDT_SYSCALL */
-static inline int init_new_context_ldt(struct task_struct *tsk,
-				       struct mm_struct *mm)
+static inline void init_new_context_ldt(struct mm_struct *mm) { }
+static inline int ldt_dup_context(struct mm_struct *oldmm,
+				  struct mm_struct *mm)
{
	return 0;
@@ -73,8 +79,8 @@ static inline void load_mm_ldt(struct mm_struct *mm)
#ifdef CONFIG_MODIFY_LDT_SYSCALL
	struct ldt_struct *ldt;

-	/* lockless_dereference synchronizes with smp_store_release */
-	ldt = lockless_dereference(mm->context.ldt);
+	/* READ_ONCE synchronizes with smp_store_release */
+	ldt = READ_ONCE(mm->context.ldt);

	/*
	 * Any change to mm->context.ldt is followed by an IPI to all
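The READ_ONCE above is the reader half of a publish pairing; the writer side lives in the LDT code proper, outside this hunk, and is assumed to install a fully built table with a release store, roughly:

	/* Assumed writer side: publish the new ldt_struct with a release
	 * store so load_mm_ldt() never observes a half-initialized table. */
	static void install_ldt_sketch(struct mm_struct *mm,
				       struct ldt_struct *new_ldt)
	{
		smp_store_release(&mm->context.ldt, new_ldt);
	}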
@@ -132,6 +138,8 @@ void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk);
static inline int init_new_context(struct task_struct *tsk,
				   struct mm_struct *mm)
{
+	mutex_init(&mm->context.lock);
+
	mm->context.ctx_id = atomic64_inc_return(&last_mm_ctx_id);
	atomic64_set(&mm->context.tlb_gen, 0);
@@ -143,7 +151,8 @@ static inline int init_new_context(struct task_struct *tsk,
		mm->context.execute_only_pkey = -1;
	}
#endif
-	return init_new_context_ldt(tsk, mm);
+	init_new_context_ldt(mm);
+	return 0;
}

static inline void destroy_context(struct mm_struct *mm)
{
@@ -176,10 +185,10 @@ do {						\
} while (0)
#endif

-static inline void arch_dup_mmap(struct mm_struct *oldmm,
-				 struct mm_struct *mm)
+static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
{
	paravirt_arch_dup_mmap(oldmm, mm);
+	return ldt_dup_context(oldmm, mm);
}
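arch_dup_mmap() growing an int return matters because ldt_dup_context() can fail with -ENOMEM when duplicating a large LDT. The generic fork path is assumed to propagate that error, along these lines (a sketch of the caller contract, not the exact kernel/fork.c code):

	/* Assumed call-site shape in dup_mmap(): a failing arch_dup_mmap()
	 * must abort the fork and unwind the partially copied mm. */
	retval = arch_dup_mmap(oldmm, mm);
	if (retval)
		goto out;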
static inline void arch_exit_mmap(struct mm_struct *mm)
@@ -281,33 +290,6 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
	return __pkru_allows_pkey(vma_pkey(vma), write);
}

-/*
- * If PCID is on, ASID-aware code paths put the ASID+1 into the PCID
- * bits.  This serves two purposes.  It prevents a nasty situation in
- * which PCID-unaware code saves CR3, loads some other value (with PCID
- * == 0), and then restores CR3, thus corrupting the TLB for ASID 0 if
- * the saved ASID was nonzero.  It also means that any bugs involving
- * loading a PCID-enabled CR3 with CR4.PCIDE off will trigger
- * deterministically.
- */
-static inline unsigned long build_cr3(struct mm_struct *mm, u16 asid)
-{
-	if (static_cpu_has(X86_FEATURE_PCID)) {
-		VM_WARN_ON_ONCE(asid > 4094);
-		return __sme_pa(mm->pgd) | (asid + 1);
-	} else {
-		VM_WARN_ON_ONCE(asid != 0);
-		return __sme_pa(mm->pgd);
-	}
-}
-
-static inline unsigned long build_cr3_noflush(struct mm_struct *mm, u16 asid)
-{
-	VM_WARN_ON_ONCE(asid > 4094);
-	return __sme_pa(mm->pgd) | (asid + 1) | CR3_NOFLUSH;
-}
-
/*
 * This can be used from process context to figure out what the value of
 * CR3 is without needing to do a (slow) __read_cr3().
@@ -317,7 +299,7 @@ static inline unsigned long build_cr3_noflush(struct mm_struct *mm, u16 asid)
 */
static inline unsigned long __get_current_cr3_fast(void)
{
-	unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm),
+	unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd,
		this_cpu_read(cpu_tlbstate.loaded_mm_asid));

	/* For now, be very restrictive about when this can be called. */