* refs/heads/tmp-4c9e0a9
Linux 4.14.43
x86/bugs: Rename SSBD_NO to SSB_NO
KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
x86/bugs: Rework spec_ctrl base and mask logic
x86/bugs: Remove x86_spec_ctrl_set()
x86/bugs: Expose x86_spec_ctrl_base directly
x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
x86/speculation: Rework speculative_store_bypass_update()
x86/speculation: Add virtualized speculative store bypass disable support
x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
x86/speculation: Handle HT correctly on AMD
x86/cpufeatures: Add FEATURE_ZEN
x86/cpufeatures: Disentangle SSBD enumeration
x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
KVM: SVM: Move spec control call after restore of GS
x86/cpu: Make alternative_msr_write work for 32-bit code
x86/bugs: Fix the parameters alignment and missing void
x86/bugs: Make cpu_show_common() static
x86/bugs: Fix __ssb_select_mitigation() return type
Documentation/spec_ctrl: Do some minor cleanups
proc: Use underscores for SSBD in 'status'
x86/bugs: Rename _RDS to _SSBD
x86/speculation: Make "seccomp" the default mode for Speculative Store Bypass
seccomp: Move speculation migitation control to arch code
seccomp: Add filter flag to opt-out of SSB mitigation
seccomp: Use PR_SPEC_FORCE_DISABLE
prctl: Add force disable speculation
x86/bugs: Make boot modes __ro_after_init
seccomp: Enable speculation flaw mitigations
proc: Provide details on speculation flaw mitigations
nospec: Allow getting/setting on non-current task
x86/speculation: Add prctl for Speculative Store Bypass mitigation
x86/process: Allow runtime control of Speculative Store Bypass
prctl: Add speculation control prctls
x86/speculation: Create spec-ctrl.h to avoid include hell
x86/KVM/VMX: Expose SPEC_CTRL Bit(2) to the guest
x86/bugs/AMD: Add support to disable RDS on Fam[15,16,17]h if requested
x86/bugs: Whitelist allowed SPEC_CTRL MSR values
x86/bugs/intel: Set proper CPU features and setup RDS
x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation
x86/cpufeatures: Add X86_FEATURE_RDS
x86/bugs: Expose /sys/../spec_store_bypass
x86/bugs, KVM: Support the combination of guest and host IBRS
x86/bugs: Read SPEC_CTRL MSR during boot and re-use reserved bits
x86/bugs: Concentrate bug reporting into a separate function
x86/bugs: Concentrate bug detection into a separate function
x86/nospec: Simplify alternative_msr_write()
btrfs: fix reading stale metadata blocks after degraded raid1 mounts
btrfs: Fix delalloc inodes invalidation during transaction abort
btrfs: Split btrfs_del_delalloc_inode into 2 functions
btrfs: fix crash when trying to resume balance without the resume flag
btrfs: property: Set incompat flag if lzo/zstd compression is set
Btrfs: send, fix invalid access to commit roots due to concurrent snapshotting
Btrfs: fix xattr loss after power failure
ARM: 8772/1: kprobes: Prohibit kprobes on get_user functions
ARM: 8770/1: kprobes: Prohibit probing on optimized_callback
ARM: 8769/1: kprobes: Fix to use get_kprobe_ctlblk after irq-disabed
tick/broadcast: Use for_each_cpu() specially on UP kernels
x86/mm: Drop TS_COMPAT on 64-bit exec() syscall
ARM: 8771/1: kprobes: Prohibit kprobes on do_undefinstr
efi: Avoid potential crashes, fix the 'struct efi_pci_io_protocol_32' definition for mixed mode
x86/pkeys: Do not special case protection key 0
x86/pkeys: Override pkey when moving away from PROT_EXEC
s390: remove indirect branch from do_softirq_own_stack
s390/qdio: don't release memory in qdio_setup_irq()
s390/cpum_sf: ensure sample frequency of perf event attributes is non-zero
s390/qdio: fix access to uninitialized qdio_q fields
drm/i915/gen9: Add WaClearHIZ_WM_CHICKEN3 for bxt and glk
mm: don't allow deferred pages with NEED_PER_CPU_KM
radix tree: fix multi-order iteration race
lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly
drm: Match sysfs name in link removal to link creation
powerpc/powernv: Fix NVRAM sleep in invalid context when crashing
i2c: designware: fix poll-after-enable regression
netfilter: nf_socket: Fix out of bounds access in nf_sk_lookup_slow_v{4,6}
netfilter: nf_tables: can't fail after linking rule into active rule list
netfilter: nf_tables: free set name in error path
tee: shm: fix use-after-free via temporarily dropped reference
tracing/x86/xen: Remove zero data size trace events trace_xen_mmu_flush_tlb{_all}
vfio: ccw: fix cleanup if cp_prefetch fails
powerpc: Don't preempt_disable() in show_cpuinfo()
KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock
KVM: arm/arm64: VGIC/ITS save/restore: protect kvm_read_guest() calls
spi: bcm-qspi: Always read and set BSPI_MAST_N_BOOT_CTRL
spi: bcm-qspi: Avoid setting MSPI_CDRAM_PCS for spi-nor master
spi: pxa2xx: Allow 64-bit DMA
ALSA: control: fix a redundant-copy issue
ALSA: hda: Add Lenovo C50 All in one to the power_save blacklist
ALSA: usb: mixer: volume quirk for CM102-A+/102S+
usbip: usbip_host: fix bad unlock balance during stub_probe()
usbip: usbip_host: fix NULL-ptr deref and use-after-free errors
usbip: usbip_host: run rebind from exit when module is removed
usbip: usbip_host: delete device from busid_table after rebind
usbip: usbip_host: refine probe and disconnect debug msgs to be useful
Linux 4.14.42
proc: do not access cmdline nor environ from file-backed areas
l2tp: revert "l2tp: fix missing print session offset info"
xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)
btrfs: Take trans lock before access running trans in check_delayed_ref
xfrm: Use __skb_queue_tail in xfrm_trans_queue
scsi: aacraid: Correct hba_send to include iu_type
udp: fix SO_BINDTODEVICE
nsh: fix infinite loop
net/mlx5e: Allow offloading ipv4 header re-write for icmp
ipv6: fix uninit-value in ip6_multipath_l3_keys()
hv_netvsc: set master device
net/mlx5: Avoid cleaning flow steering table twice during error flow
net/mlx5e: TX, Use correct counter in dma_map error flow
net: sched: fix error path in tcf_proto_create() when modules are not configured
bonding: send learning packets for vlans on slave
bonding: do not allow rlb updates to invalid mac
tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
tcp: ignore Fast Open on repair mode
tcp_bbr: fix to zero idle_restart only upon S/ACKed data
sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
sctp: fix the issue that the cookie-ack with auth can't get processed
sctp: delay the authentication for the duplicated cookie-echo chunk
rds: do not leak kernel memory to user land
r8169: fix powering up RTL8168h
qmi_wwan: do not steal interfaces from class drivers
openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
net/tls: Fix connection stall on partial tls record
net/tls: Don't recursively call push_record during tls_write_space callbacks
net: support compat 64-bit time in {s,g}etsockopt
net_sched: fq: take care of throttled flows before reuse
net sched actions: fix refcnt leak in skbmod
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
net/mlx5e: Err if asked to offload TC match on frag being first
net/mlx4_en: Verify coalescing parameters are in range
net/mlx4_en: Fix an error handling path in 'mlx4_en_init_netdev()'
net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
net: ethernet: sun: niu set correct packet size in skb
llc: better deal with too small mtu
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
ipv4: fix fnhe usage by non-cached routes
dccp: fix tasklet usage
bridge: check iface upper dev when setting master via ioctl
8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
ANDROID: sdcardfs: Don't d_drop in d_revalidate
FROMLIST: brcmfmac: fix initialization of struct cfg80211_inform_bss variable
FROMLIST: brcmfmac: reports boottime_ns while informing bss
Change-Id: I43c27b71b153a2a87070de3ea393002769856960
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Merge 4.14.43 into android-4.14
Changes in 4.14.43
usbip: usbip_host: refine probe and disconnect debug msgs to be useful
usbip: usbip_host: delete device from busid_table after rebind
usbip: usbip_host: run rebind from exit when module is removed
usbip: usbip_host: fix NULL-ptr deref and use-after-free errors
usbip: usbip_host: fix bad unlock balance during stub_probe()
ALSA: usb: mixer: volume quirk for CM102-A+/102S+
ALSA: hda: Add Lenovo C50 All in one to the power_save blacklist
ALSA: control: fix a redundant-copy issue
spi: pxa2xx: Allow 64-bit DMA
spi: bcm-qspi: Avoid setting MSPI_CDRAM_PCS for spi-nor master
spi: bcm-qspi: Always read and set BSPI_MAST_N_BOOT_CTRL
KVM: arm/arm64: VGIC/ITS save/restore: protect kvm_read_guest() calls
KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock
powerpc: Don't preempt_disable() in show_cpuinfo()
vfio: ccw: fix cleanup if cp_prefetch fails
tracing/x86/xen: Remove zero data size trace events trace_xen_mmu_flush_tlb{_all}
tee: shm: fix use-after-free via temporarily dropped reference
netfilter: nf_tables: free set name in error path
netfilter: nf_tables: can't fail after linking rule into active rule list
netfilter: nf_socket: Fix out of bounds access in nf_sk_lookup_slow_v{4,6}
i2c: designware: fix poll-after-enable regression
powerpc/powernv: Fix NVRAM sleep in invalid context when crashing
drm: Match sysfs name in link removal to link creation
lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly
radix tree: fix multi-order iteration race
mm: don't allow deferred pages with NEED_PER_CPU_KM
drm/i915/gen9: Add WaClearHIZ_WM_CHICKEN3 for bxt and glk
s390/qdio: fix access to uninitialized qdio_q fields
s390/cpum_sf: ensure sample frequency of perf event attributes is non-zero
s390/qdio: don't release memory in qdio_setup_irq()
s390: remove indirect branch from do_softirq_own_stack
x86/pkeys: Override pkey when moving away from PROT_EXEC
x86/pkeys: Do not special case protection key 0
efi: Avoid potential crashes, fix the 'struct efi_pci_io_protocol_32' definition for mixed mode
ARM: 8771/1: kprobes: Prohibit kprobes on do_undefinstr
x86/mm: Drop TS_COMPAT on 64-bit exec() syscall
tick/broadcast: Use for_each_cpu() specially on UP kernels
ARM: 8769/1: kprobes: Fix to use get_kprobe_ctlblk after irq-disabed
ARM: 8770/1: kprobes: Prohibit probing on optimized_callback
ARM: 8772/1: kprobes: Prohibit kprobes on get_user functions
Btrfs: fix xattr loss after power failure
Btrfs: send, fix invalid access to commit roots due to concurrent snapshotting
btrfs: property: Set incompat flag if lzo/zstd compression is set
btrfs: fix crash when trying to resume balance without the resume flag
btrfs: Split btrfs_del_delalloc_inode into 2 functions
btrfs: Fix delalloc inodes invalidation during transaction abort
btrfs: fix reading stale metadata blocks after degraded raid1 mounts
x86/nospec: Simplify alternative_msr_write()
x86/bugs: Concentrate bug detection into a separate function
x86/bugs: Concentrate bug reporting into a separate function
x86/bugs: Read SPEC_CTRL MSR during boot and re-use reserved bits
x86/bugs, KVM: Support the combination of guest and host IBRS
x86/bugs: Expose /sys/../spec_store_bypass
x86/cpufeatures: Add X86_FEATURE_RDS
x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation
x86/bugs/intel: Set proper CPU features and setup RDS
x86/bugs: Whitelist allowed SPEC_CTRL MSR values
x86/bugs/AMD: Add support to disable RDS on Fam[15,16,17]h if requested
x86/KVM/VMX: Expose SPEC_CTRL Bit(2) to the guest
x86/speculation: Create spec-ctrl.h to avoid include hell
prctl: Add speculation control prctls
x86/process: Allow runtime control of Speculative Store Bypass
x86/speculation: Add prctl for Speculative Store Bypass mitigation
nospec: Allow getting/setting on non-current task
proc: Provide details on speculation flaw mitigations
seccomp: Enable speculation flaw mitigations
x86/bugs: Make boot modes __ro_after_init
prctl: Add force disable speculation
seccomp: Use PR_SPEC_FORCE_DISABLE
seccomp: Add filter flag to opt-out of SSB mitigation
seccomp: Move speculation migitation control to arch code
x86/speculation: Make "seccomp" the default mode for Speculative Store Bypass
x86/bugs: Rename _RDS to _SSBD
proc: Use underscores for SSBD in 'status'
Documentation/spec_ctrl: Do some minor cleanups
x86/bugs: Fix __ssb_select_mitigation() return type
x86/bugs: Make cpu_show_common() static
x86/bugs: Fix the parameters alignment and missing void
x86/cpu: Make alternative_msr_write work for 32-bit code
KVM: SVM: Move spec control call after restore of GS
x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
x86/cpufeatures: Disentangle SSBD enumeration
x86/cpufeatures: Add FEATURE_ZEN
x86/speculation: Handle HT correctly on AMD
x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
x86/speculation: Add virtualized speculative store bypass disable support
x86/speculation: Rework speculative_store_bypass_update()
x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
x86/bugs: Expose x86_spec_ctrl_base directly
x86/bugs: Remove x86_spec_ctrl_set()
x86/bugs: Rework spec_ctrl base and mask logic
x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
x86/bugs: Rename SSBD_NO to SSB_NO
Linux 4.14.43
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit ab1e8d8960b68f54af42b6484b5950bd13a4054b upstream.
It is unsafe to do virtual to physical translations before mm_init() is
called if struct page is needed in order to determine the memory section
number (see SECTION_IN_PAGE_FLAGS). This is because only in mm_init()
we initialize struct pages for all the allocated memory when deferred
struct pages are used.
My recent fix in commit c9e97a1997 ("mm: initialize pages on demand
during boot") exposed this problem, because it greatly reduced the number
of pages that are initialized before mm_init(), but the problem existed
even before my fix, as Fengguang Wu found.
Below is a more detailed explanation of the problem.
We initialize struct pages in four places:
1. Early in boot a small set of struct pages is initialized to fill the
first section, and lower zones.
2. During mm_init() we initialize "struct pages" for all the memory that
is allocated, i.e. reserved in memblock.
3. Using on-demand logic when pages are allocated after the mm_init() call
(when memblock is finished).
4. After smp_init(), when the remaining free deferred pages are initialized.
The problem occurs if we try to do va to phys translation of memory
between steps 1 and 2. Because we have not yet initialized struct pages
for all the reserved pages, it is inherently unsafe to do va to phys if
the translation itself requires access to "struct page", as is the case
with this combination: CONFIG_SPARSE && !CONFIG_SPARSE_VMEMMAP
The following path exposes the problem:
  start_kernel()
   trap_init()
    setup_cpu_entry_areas()
     setup_cpu_entry_area(cpu)
      get_cpu_gdt_paddr(cpu)
       per_cpu_ptr_to_phys(addr)
        pcpu_addr_to_page(addr)
         virt_to_page(addr)
          pfn_to_page(__pa(addr) >> PAGE_SHIFT)
We disable this path by not allowing NEED_PER_CPU_KM with deferred
struct pages feature.
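A toy model of the dependency described above may help. It assumes,
purely for illustration, that the section number sits in the top byte of
a flags word (the real layout depends on the configuration) and uses
hypothetical toy_* names; it is not the kernel's implementation. It shows
why a section number read from a struct page that has not been
initialized yet cannot be trusted:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the section number lives in the top 8 bits. */
#define TOY_SECTION_SHIFT 56

struct toy_page {
	uint64_t flags;	/* stays zero until the page is initialized */
};

/* Stands in for the part of page initialization that records the section. */
static void toy_init_page(struct toy_page *page, uint64_t section)
{
	assert(section < 256);	/* must fit in the 8 bits reserved above */
	page->flags = section << TOY_SECTION_SHIFT;
}

/* Stands in for reading the section number back out of the flags word. */
static uint64_t toy_page_to_section(const struct toy_page *page)
{
	return page->flags >> TOY_SECTION_SHIFT;
}
```

Before toy_init_page() runs, toy_page_to_section() reports section 0
regardless of where the page really lives, which is the kind of wrong
answer a translation between steps 1 and 2 would be built on.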
The problems are discussed in these threads:
http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel.com
http://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel.com
http://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180515175124.1770-1-pasha.tatashin@oracle.com
Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Dennis Zhou <dennisszhou@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Newer kernels allocate swap slots in batches to reduce
contention on the swap info lock. As a result, the max
write values that the swap ratio defines for the fast and
slow swap devices are effectively multiplied by the batch
size. This causes longer runs of writes to one particular
swap device, which defeats the swap ratio feature.
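The multiplication can be sketched with a small counting model. This is
an illustration only: the function name is hypothetical, and it assumes
each allocation call is counted as one write against the ratio while
handing out a whole batch of slots.

```c
#include <assert.h>

/*
 * Number of pages that land on one device before the ratio logic
 * switches to the other device, in this simplified model.
 */
static unsigned long toy_pages_before_switch(unsigned long max_writes,
					     unsigned long batch)
{
	unsigned long pages = 0;
	unsigned long writes_left = max_writes;

	assert(batch > 0);	/* a batch hands out at least one slot */

	while (writes_left > 0) {
		pages += batch;	/* one call hands out a whole batch ... */
		writes_left--;	/* ... but is counted as a single write */
	}
	return pages;
}
```

With a limit of 10 writes, a batch size of 1 switches devices after 10
pages, while an illustrative batch size of 64 switches only after 640.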
Change-Id: I9bb927b235fbf5b6f8b40bcdeb406ae6c48d9fb0
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
It is pointless to calculate the swap ratio when there is only
one swap device in the group. Moreover, the existing code would
result in a spinlock recursion because it does not take this into
consideration. Interestingly, this check is already performed
in swap_ratio_slow by this piece of code:

    if (&(*si)->avail_list == plist_last(&swap_avail_head)) {
            /* just to make skip work */
            n = *si;
            ret = -ENODEV;
            goto skip;
    }

But there is a window where we drop the swap_avail_lock before
invoking swap_ratio() and take it back again in swap_ratio_slow.
In this period the si can get removed from swap_avail_head,
which makes the above logic fail. So recheck again.
Similarly, bail out from swap_ratio() if the sysctl is disabled,
thus avoiding the overhead of taking unnecessary locks.
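The two checks can be sketched as follows. This is a single-threaded toy
with hypothetical names, not the patch itself: the lock is only indicated
by a comment, and "only one device in the group" is modelled as a simple
count.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_swap_state {
	int  nr_avail;		/* devices currently on the available list */
	bool ratio_enabled;	/* stands in for the sysctl */
};

/* Returns 0 if the ratio should be applied, -ENODEV if it must be skipped. */
static int toy_swap_ratio(const struct toy_swap_state *state)
{
	assert(state != NULL);

	/* Bail out early, before any lock would be taken. */
	if (!state->ratio_enabled)
		return -ENODEV;

	/* ... the lock would be re-acquired here in real code ... */

	/*
	 * Recheck: the list may have shrunk while the lock was dropped,
	 * so an earlier "more than one device" answer can be stale.
	 */
	if (state->nr_avail <= 1)
		return -ENODEV;

	return 0;
}
```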
Change-Id: I81a9dd61d24b7da55d5341c48a1f71d2b4b1978d
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
With this patch, do_swap_page() no longer needs to be aware of the two
different swap readahead algorithms: the cluster-based and vma-based
readahead function calls are unified.
Change-Id: I45eeedb6347c245fbdb38744ac484119ade07d9c
Link: http://lkml.kernel.org/r/1509520520-32367-3-git-send-email-minchan@kernel.org
Link: http://lkml.kernel.org/r/20180220085249.151400-3-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Git-commit: 9d1b4ad90ada396540a10ab5691671aa4c229a1d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Add support to receive a static ratio from userspace to
divide the swap pages between ZRAM and disk based swap
devices. The existing infrastructure allows multiple swap
devices to share the same priority, which results in a
round robin distribution of pages. With this patch,
the ratio can be defined.
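One way to picture a static ratio, as opposed to plain round robin, is
the sketch below. The names and the modulo scheme are assumptions made
for illustration; the accounting in the actual patch is not shown in
this log.

```c
#include <assert.h>

enum toy_dev { TOY_FAST, TOY_SLOW };

/*
 * Pick the device for the n-th page so that, over every window of
 * fast_share + slow_share pages, fast_share go to the fast device
 * and slow_share go to the slow one.
 */
static enum toy_dev toy_pick_device(unsigned long n,
				    unsigned int fast_share,
				    unsigned int slow_share)
{
	unsigned long period = (unsigned long)fast_share + slow_share;

	assert(period > 0);	/* at least one share must be non-zero */

	return (n % period) < fast_share ? TOY_FAST : TOY_SLOW;
}
```

With shares of 3 and 1, pages 0 to 2 go to the fast device, page 3 goes
to the slow device, and the pattern then repeats. Equal shares reduce to
the round robin behaviour described above.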
Change-Id: I54f54489db84cabb206569dd62d61a8a7a898991
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Looking at the recent swap readahead change, I am very unhappy with the
current code structure, which splits into two swap readahead algorithms
inside do_swap_page. This patch cleans it up.
The main motivation is that the fault handler doesn't need to be aware of
the readahead algorithms; it should just call swapin_readahead.
As a first step, this patch cleans things up a little but not completely
(it is split out to ease review); the next patch completes the goal.
Change-Id: Ia5420f4deb5f02e2f3e286ee2d9008a0cff4eeb4
Link: http://lkml.kernel.org/r/1509520520-32367-2-git-send-email-minchan@kernel.org
Link: http://lkml.kernel.org/r/20180220085249.151400-2-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Git-commit: f80207727aaca3aa34a9cd80659393534de69cad
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
These global variables are only set during initialization or rarely
change, so declare them as __read_mostly.
Change-Id: Iefa78448381eae5bb0e08eb7a5e0e61cc251a4b3
Link: http://lkml.kernel.org/r/1507802349-5554-1-git-send-email-changbin.du@intel.com
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 783cb68ee2d25d621326366c0b615bf2ccf3b402
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
James reported a bug in swap page-in that he found during testing:
do_swap_page doesn't release a locked page, so the system hangs due
to a deadlock on PG_locked.
It was introduced by 0bcac06f27d7 ("mm, swap: skip swapcache for swapin
of synchronous device") because I missed updating the swapcache variable
in the swap cache hit paths, which the other swapcache logic in
do_swap_page relies on.
This patch fixes it.
Debugged by James Bottomley.
Change-Id: Icadfcda54a0489b78d3680fa00bae71e8eb6ca1a
Link: http://lkml.kernel.org/r/<1514407817.4169.4.camel@HansenPartnership.com>
Link: http://lkml.kernel.org/r/20180102235606.GA19438@bbox
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: James Bottomley <James.Bottomley@hansenpartnership.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: f80207727aaca3aa34a9cd80659393534de69cad
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
When pages swapped in from a SWP_SYNCHRONOUS_IO device are shared by
several processes, skipping the swap cache can waste memory: on a read
swapin fault, the processes could share a single page if it were in the
swap cache, which avoids allocating new pages with the same content.
This patch makes the swapcache skipping work only if the swap pte is
non-sharable.
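The resulting condition can be sketched as a small predicate. The
function and parameter names are hypothetical, and "non-sharable" is
modelled here as a swap entry with exactly one reference, which is an
assumption of this sketch:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a swapin fault may bypass the swap cache.
 * synchronous_io: the backing device is a synchronous (rw_page style) one.
 * swap_count:     how many references the swap entry has.
 */
static bool toy_can_skip_swapcache(bool synchronous_io,
				   unsigned int swap_count)
{
	/* An entry being faulted in has at least one reference. */
	assert(swap_count >= 1);

	/* Skip only when nobody else could share the page via the cache. */
	return synchronous_io && swap_count == 1;
}
```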
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1507620825-5537-1-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: Id70ae9f748e8dc92aaca1ed0c85b7f297c5d0379
Git-commit: aa8d22a11da933dbf880b4933b58931f4aefe91c
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
With fast swap storage, platforms want to use swap more aggressively,
and swap-in is crucial to application latency.
The rw_page() based synchronous devices like zram, pmem and btt are such
fast storage. When I profiled swapin performance with a zram lz4
decompress test, S/W overhead was more than 70%. It may be even
bigger on nvdimm.
This patch aims to reduce swap-in latency by skipping the swapcache if the
swap device is a synchronous device such as an rw_page based device. It
improves my swapin test by 45% (5G sequential swapin, no readahead, from
2.41 sec to 1.64 sec).
Change-Id: I3f8d0c3b4487331e6e0c02a09bea3077223534b8
Link: http://lkml.kernel.org/r/1505886205-9671-5-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 0bcac06f27d7528591c27ac2b093ccd71c5d0168
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
If a particular cma region has fewer free blocks, increase
the number of retries to avoid allocation failures caused by
pages being temporarily busy.
Change-Id: I92021fd75315b266a978f7a5b0235344c800cba2
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
If rw-page based fast storage is used for swap devices, we need to
detect it to enhance swap IO operations. This patch prepares for
optimizing the swap-in operation in the next patch.
Change-Id: I25b0b93441fc602b9a697e5ee231eb7b5dd3dbfe
Link: http://lkml.kernel.org/r/1505886205-9671-4-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: 539a6fea7fdcade532bd3e77be2862a683f8f0c9
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Merge 4.14.42 into android-4.14
Changes in 4.14.42
8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
bridge: check iface upper dev when setting master via ioctl
dccp: fix tasklet usage
ipv4: fix fnhe usage by non-cached routes
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
llc: better deal with too small mtu
net: ethernet: sun: niu set correct packet size in skb
net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
net/mlx4_en: Fix an error handling path in 'mlx4_en_init_netdev()'
net/mlx4_en: Verify coalescing parameters are in range
net/mlx5e: Err if asked to offload TC match on frag being first
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
net sched actions: fix refcnt leak in skbmod
net_sched: fq: take care of throttled flows before reuse
net: support compat 64-bit time in {s,g}etsockopt
net/tls: Don't recursively call push_record during tls_write_space callbacks
net/tls: Fix connection stall on partial tls record
openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
qmi_wwan: do not steal interfaces from class drivers
r8169: fix powering up RTL8168h
rds: do not leak kernel memory to user land
sctp: delay the authentication for the duplicated cookie-echo chunk
sctp: fix the issue that the cookie-ack with auth can't get processed
sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
tcp_bbr: fix to zero idle_restart only upon S/ACKed data
tcp: ignore Fast Open on repair mode
tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
bonding: do not allow rlb updates to invalid mac
bonding: send learning packets for vlans on slave
net: sched: fix error path in tcf_proto_create() when modules are not configured
net/mlx5e: TX, Use correct counter in dma_map error flow
net/mlx5: Avoid cleaning flow steering table twice during error flow
hv_netvsc: set master device
ipv6: fix uninit-value in ip6_multipath_l3_keys()
net/mlx5e: Allow offloading ipv4 header re-write for icmp
nsh: fix infinite loop
udp: fix SO_BINDTODEVICE
scsi: aacraid: Correct hba_send to include iu_type
xfrm: Use __skb_queue_tail in xfrm_trans_queue
btrfs: Take trans lock before access running trans in check_delayed_ref
xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)
l2tp: revert "l2tp: fix missing print session offset info"
proc: do not access cmdline nor environ from file-backed areas
Linux 4.14.42
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 7f7ccc2ccc2e70c6054685f5e3522efa81556830 upstream.
proc_pid_cmdline_read() and environ_read() directly access the target
process' VM to retrieve the command line and environment. If this
process remaps these areas onto a file via mmap(), the requesting
process may experience various issues such as extra delays if the
underlying device is slow to respond.
Let's simply refuse to access file-backed areas in these functions.
For this we add a new FOLL_ANON gup flag that is passed to all calls
to access_remote_vm(). The code already takes care of such failures
(including unmapped areas). Accesses via /proc/pid/mem were not
changed though.
This was assigned CVE-2018-1120.
Note for stable backports: the patch may apply to kernels prior to 4.11
but silently miss one location; it must be checked that no call to
access_remote_vm() keeps zero as the last argument.
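The effect of the new flag can be sketched with a toy check. Everything
prefixed toy_ or TOY_ is hypothetical, including the flag value; the
sketch only illustrates the idea of refusing file-backed areas when the
caller asks for anonymous memory, and does not reproduce the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_FOLL_ANON 0x1u	/* hypothetical flag value */

struct toy_vma {
	bool file_backed;	/* true if the area maps a file */
};

/* Returns 0 if the access may proceed, -EFAULT if it is refused. */
static int toy_check_vma(const struct toy_vma *vma, unsigned int gup_flags)
{
	assert(vma != NULL);

	/* Callers that pass the flag never touch file-backed areas. */
	if ((gup_flags & TOY_FOLL_ANON) && vma->file_backed)
		return -EFAULT;

	return 0;
}
```

A caller that does not pass the flag keeps the old behaviour, which
matches the note above that accesses via /proc/pid/mem were not changed.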
Reported-by: Qualys Security Advisory <qsa@qualys.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cma entertains only movable requests, so it is better to fill up
the cma regions first so that MIGRATE_MOVABLE regions remain
available to satisfy a steal by unmovable requests, thus
improving the chances of unmovable allocation successes during
low memory situations.
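The preference order can be sketched as below. The names are
hypothetical and free space is reduced to two counters, so this only
illustrates the ordering, not the page allocator.

```c
enum toy_pool { TOY_POOL_CMA, TOY_POOL_MOVABLE, TOY_POOL_NONE };

/*
 * Choose where a movable request is served from: cma first, so that
 * ordinary movable regions stay free for unmovable requests to steal.
 */
static enum toy_pool toy_pick_pool_for_movable(unsigned long cma_free,
					       unsigned long movable_free)
{
	if (cma_free > 0)
		return TOY_POOL_CMA;
	if (movable_free > 0)
		return TOY_POOL_MOVABLE;
	return TOY_POOL_NONE;
}
```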
Change-Id: I01904b86feb5307c17c17072ca9360cf0c17b408
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Introduce a variable to save the bootloader-enforced memory limit and
restrict adding memory beyond this boundary during memory hotplug. Also,
export this symbol so that other kernel modules have access to it.
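A boundary check of this kind can be sketched as follows. The function
name and parameters are hypothetical; the sketch assumes the limit is an
exclusive upper physical address and is written so that the comparison
cannot overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Allow a hotplug request only if [start, start + size) ends at or
 * below boot_limit. Comparing against boot_limit - size avoids
 * computing start + size, which could wrap around.
 */
static bool toy_hotplug_allowed(uint64_t start, uint64_t size,
				uint64_t boot_limit)
{
	return size <= boot_limit && start <= boot_limit - size;
}
```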
Change-Id: I28c100644b7287ec4625c4c018b5fffc865e2e72
Signed-off-by: Arun KS <arunks@codeaurora.org>
[sudaraja@codeaurora.org: check limit with physical address of page]
Signed-off-by: Sudarshan Rajagopalan <sudaraja@codeaurora.org>
* refs/heads/tmp-04f740d
Linux 4.14.41
KVM: x86: remove APIC Timer periodic/oneshot spikes
KVM: PPC: Book3S HV: Fix handling of large pages in radix page fault handler
perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map()
perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[]
perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver
perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr
perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_*
tracing/uprobe_event: Fix strncpy corner case
sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
smb3: directory sync should not return an error
nvme: add quirk to force medium priority for SQ creation
thermal: exynos: Propagate error value from tmu_read()
thermal: exynos: Reading temperature makes sense only when TMU is turned on
Bluetooth: btusb: Only check needs_reset_resume DMI table for QCA rome chipsets
Bluetooth: btusb: Add Dell XPS 13 9360 to btusb_needs_reset_resume_table
Revert "Bluetooth: btusb: Fix quirk for Atheros 1525/QCA6174"
cpufreq: schedutil: Avoid using invalid next_freq
PCI / PM: Check device_may_wakeup() in pci_enable_wake()
PCI / PM: Always check PME wakeup capability for runtime wakeup support
atm: zatm: Fix potential Spectre v1
net: atm: Fix potential Spectre v1
drm/atomic: Clean private obj old_state/new_state in drm_atomic_state_default_clear()
drm/atomic: Clean old_state/new_state in drm_atomic_state_default_clear()
drm/nouveau: Fix deadlock in nv50_mstm_register_connector()
drm/i915: Fix drm:intel_enable_lvds ERROR message in kernel log
drm/vc4: Fix scaling of uni-planar formats
can: hi311x: Work around TX complete interrupt erratum
can: hi311x: Acquire SPI lock on ->do_get_berr_counter
can: kvaser_usb: Increase correct stats counter in kvaser_usb_rx_can_msg()
ceph: fix rsize/wsize capping in ceph_direct_read_write()
mm, oom: fix concurrent munlock and oom reaper unmap, v3
mm: sections are not offlined during memory hotremove
z3fold: fix reclaim lock-ups
tracing: Fix regex_match_front() to not over compare the test string
dm integrity: use kvfree for kvmalloc'd memory
libata: Apply NOLPM quirk for SanDisk SD7UB3Q*G1001 SSDs
rfkill: gpio: fix memory leak in probe error path
gpio: fix error path in lineevent_create
gpio: fix aspeed_gpio unmask irq
gpioib: do not free unrequested descriptors
compat: fix 4-byte infoleak via uninitialized struct field
arm64: Add work around for Arm Cortex-A55 Erratum 1024718
KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing
KVM: PPC: Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GEN
KVM: PPC: Book3S HV: Fix trap number return from __kvmppc_vcore_entry
bdi: Fix oops in wb_workfn()
bdi: wake up concurrent wb_shutdown() callers.
tcp: fix TCP_REPAIR_QUEUE bound checking
perf: Remove superfluous allocation error check
memcg: fix per_node_info cleanup
inetpeer: fix uninit-value in inet_getpeer
soreuseport: initialise timewait reuseport field
ipv4: fix uninit-value in ip_route_output_key_hash_rcu()
dccp: initialize ireq->ir_mark
net: fix uninit-value in __hw_addr_add_ex()
net: initialize skb->peeked when cloning
net: fix rtnh_ok()
netlink: fix uninit-value in netlink_sendmsg
crypto: af_alg - fix possible uninit-value in alg_bind()
kcm: Call strp_stop before strp_done in kcm_attach
netfilter: ebtables: don't attempt to allocate 0-sized compat array
ipvs: fix rtnl_lock lockups caused by start_sync_thread
ANDROID: goldfish: drop CONFIG_INPUT_KEYCHORD
Linux 4.14.40
tracing: Fix bad use of igrab in trace_uprobe.c
irqchip/qcom: Fix check for spurious interrupts
platform/x86: asus-wireless: Fix NULL pointer dereference
usb: musb: trace: fix NULL pointer dereference in musb_g_tx()
usb: musb: host: fix potential NULL pointer dereference
USB: serial: option: adding support for ublox R410M
USB: serial: option: reimplement interface masking
USB: Accept bulk endpoints with 1024-byte maxpacket
usb: dwc3: gadget: Fix list_del corruption in dwc3_ep_dequeue
USB: serial: visor: handle potential invalid device configuration
errseq: Always report a writeback error once
test_firmware: fix setting old custom fw path back on exit, second try
drm/bridge: vga-dac: Fix edid memory leak
drm/vmwgfx: Fix a buffer object leak
iw_cxgb4: Atomically flush per QP HW CQEs
IB/hfi1: Fix NULL pointer dereference when invalid num_vls is used
IB/hfi1: Fix loss of BECN with AHG
IB/hfi1: Fix handling of FECN marked multicast packet
IB/mlx5: Use unlimited rate when static rate is not supported
NET: usb: qmi_wwan: add support for ublox R410M PID 0x90b2
RDMA/mlx5: Protect from shift operand overflow
RDMA/mlx5: Fix multiple NULL-ptr deref errors in rereg_mr flow
RDMA/ucma: Allow resolving address w/o specifying source address
RDMA/cxgb4: release hw resources on device removal
xfs: prevent creating negative-sized file via INSERT_RANGE
rtlwifi: cleanup 8723be ant_sel definition
rtlwifi: btcoex: Add power_on_setting routine
Input: atmel_mxt_ts - add touchpad button mapping for Samsung Chromebook Pro
Input: leds - fix out of bound access
scsi: target: Fix fortify_panic kernel exception
tracepoint: Do not warn on ENOMEM
ALSA: aloop: Add missing cable lock to ctl API callbacks
ALSA: aloop: Mark paused device as inactive
ALSA: dice: fix kernel NULL pointer dereference due to invalid calculation for array index
ALSA: seq: Fix races at MIDI encoding in snd_virmidi_output_trigger()
ALSA: pcm: Check PCM state at xfern compat ioctl
ALSA: hda - Fix incorrect usage of IS_REACHABLE()
USB: serial: option: Add support for Quectel EP06
ACPI / button: make module loadable when booted in non-ACPI mode
crypto: talitos - fix IPsec cipher in length
percpu: include linux/sched.h for cond_resched()
net: don't call update_pmtu unconditionally
geneve: update skb dst pmtu on tx path
UPSTREAM: f2fs: avoid fsync() failure caused by EAGAIN in writepage()
UPSTREAM: f2fs: clear PageError on writepage - part 2
ANDROID: build.config: enforce trace_printk check
FROMLIST: staging: Fix sparse warnings in vsoc driver.
FROMLIST: staging: vsoc: Fix a i386-randconfig warning.
FROMLIST: staging: vsoc: Create wc kernel mapping for region shm.
Change-Id: I697004775203b8bb5cace4fdf7e6489cfd32b54b
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Merge 4.14.41 into android-4.14
Changes in 4.14.41
ipvs: fix rtnl_lock lockups caused by start_sync_thread
netfilter: ebtables: don't attempt to allocate 0-sized compat array
kcm: Call strp_stop before strp_done in kcm_attach
crypto: af_alg - fix possible uninit-value in alg_bind()
netlink: fix uninit-value in netlink_sendmsg
net: fix rtnh_ok()
net: initialize skb->peeked when cloning
net: fix uninit-value in __hw_addr_add_ex()
dccp: initialize ireq->ir_mark
ipv4: fix uninit-value in ip_route_output_key_hash_rcu()
soreuseport: initialise timewait reuseport field
inetpeer: fix uninit-value in inet_getpeer
memcg: fix per_node_info cleanup
perf: Remove superfluous allocation error check
tcp: fix TCP_REPAIR_QUEUE bound checking
bdi: wake up concurrent wb_shutdown() callers.
bdi: Fix oops in wb_workfn()
KVM: PPC: Book3S HV: Fix trap number return from __kvmppc_vcore_entry
KVM: PPC: Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GEN
KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing
arm64: Add work around for Arm Cortex-A55 Erratum 1024718
compat: fix 4-byte infoleak via uninitialized struct field
gpioib: do not free unrequested descriptors
gpio: fix aspeed_gpio unmask irq
gpio: fix error path in lineevent_create
rfkill: gpio: fix memory leak in probe error path
libata: Apply NOLPM quirk for SanDisk SD7UB3Q*G1001 SSDs
dm integrity: use kvfree for kvmalloc'd memory
tracing: Fix regex_match_front() to not over compare the test string
z3fold: fix reclaim lock-ups
mm: sections are not offlined during memory hotremove
mm, oom: fix concurrent munlock and oom reaper unmap, v3
ceph: fix rsize/wsize capping in ceph_direct_read_write()
can: kvaser_usb: Increase correct stats counter in kvaser_usb_rx_can_msg()
can: hi311x: Acquire SPI lock on ->do_get_berr_counter
can: hi311x: Work around TX complete interrupt erratum
drm/vc4: Fix scaling of uni-planar formats
drm/i915: Fix drm:intel_enable_lvds ERROR message in kernel log
drm/nouveau: Fix deadlock in nv50_mstm_register_connector()
drm/atomic: Clean old_state/new_state in drm_atomic_state_default_clear()
drm/atomic: Clean private obj old_state/new_state in drm_atomic_state_default_clear()
net: atm: Fix potential Spectre v1
atm: zatm: Fix potential Spectre v1
PCI / PM: Always check PME wakeup capability for runtime wakeup support
PCI / PM: Check device_may_wakeup() in pci_enable_wake()
cpufreq: schedutil: Avoid using invalid next_freq
Revert "Bluetooth: btusb: Fix quirk for Atheros 1525/QCA6174"
Bluetooth: btusb: Add Dell XPS 13 9360 to btusb_needs_reset_resume_table
Bluetooth: btusb: Only check needs_reset_resume DMI table for QCA rome chipsets
thermal: exynos: Reading temperature makes sense only when TMU is turned on
thermal: exynos: Propagate error value from tmu_read()
nvme: add quirk to force medium priority for SQ creation
smb3: directory sync should not return an error
sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
tracing/uprobe_event: Fix strncpy corner case
perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_*
perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr
perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver
perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[]
perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map()
KVM: PPC: Book3S HV: Fix handling of large pages in radix page fault handler
KVM: x86: remove APIC Timer periodic/oneshot spikes
Linux 4.14.41
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 27ae357fa82be5ab73b2ef8d39dcb8ca2563483a upstream.
Since exit_mmap() is done without the protection of mm->mmap_sem, it is
possible for the oom reaper to concurrently operate on an mm until
MMF_OOM_SKIP is set.
This allows munlock_vma_pages_all() to concurrently run while the oom
reaper is operating on a vma. Since munlock_vma_pages_range() depends
on clearing VM_LOCKED from vm_flags before actually doing the munlock to
determine if any other vmas are locking the same memory, the check for
VM_LOCKED in the oom reaper is racy.
This is especially noticeable on architectures such as powerpc where
clearing a huge pmd requires serialize_against_pte_lookup(). If the pmd
is zapped by the oom reaper during follow_page_mask() after the check
for pmd_none() is bypassed, this ends up dereferencing a NULL ptl or
triggering a kernel oops.
Fix this by manually freeing all possible memory from the mm before
doing the munlock and then setting MMF_OOM_SKIP. The oom reaper can not
run on the mm anymore so the munlock is safe to do in exit_mmap(). It
also matches the logic that the oom reaper currently uses for
determining when to set MMF_OOM_SKIP itself, so there's no new risk of
excessive oom killing.
This fixes CVE-2018-1000200.
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com
Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: David Rientjes <rientjes@google.com>
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 27227c733852f71008e9bf165950bb2edaed3a90 upstream.
Memory hotplug and hotremove operate with per-block granularity. If the
machine has a large amount of memory (more than 64G), the size of a
memory block can span multiple sections. By mistake, during hotremove
we set only the first section to offline state.
The bug was discovered because a kernel selftest started to fail after
commit "mm/memory_hotplug: optimize probe routine":
https://lkml.kernel.org/r/20180423011247.GK5563@yexl-desktop
However, the bug is older than that commit. The optimization also added
a check that sections are in a proper state during hotplug operations.
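The difference between the buggy and the fixed behaviour can be sketched with a simplified model in which each section carries a single online flag. The types and names are hypothetical and do not correspond to the kernel's data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one flag per memory section. */
struct model_section {
	bool online;
};

/*
 * Mark every section of a memory block offline.
 * The bug described above corresponds to touching only sections[first].
 */
static void model_offline_block(struct model_section *sections,
				size_t first, size_t sections_per_block)
{
	for (size_t i = 0; i < sections_per_block; i++)
		sections[first + i].online = false;
}
```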
Link: http://lkml.kernel.org/r/20180427145257.15222-1-pasha.tatashin@oracle.com
Fixes: 2d070eab2e82 ("mm: consider zone which is not fully populated to have holes")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6098d7e136692f9c6e23ae362c62ec822343e4d5 upstream.
Do not try to optimize in-page object layout while the page is under
reclaim. This fixes lock-ups on reclaim and improves reclaim
performance at the same time.
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20180430125800.444cae9706489f412ad12621@gmail.com
Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: <Oleksiy.Avramchenko@sony.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8236b0ae31c837d2b3a2565c5f8d77f637e824cc upstream.
syzbot is reporting hung tasks at wait_on_bit(WB_shutting_down) in
wb_shutdown() [1]. This seems to be because commit 5318ce7d46866e1d ("bdi:
Shutdown writeback on all cgwbs in cgwb_bdi_destroy()") forgot to call
wake_up_bit(WB_shutting_down) after clear_bit(WB_shutting_down).
Introduce a helper function clear_and_wake_up_bit() and use it, in order
to avoid similar errors in future.
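The idea of pairing the clear with the wake-up in one helper can be modelled in userspace with a mutex and a condition variable. This is only an analogy under assumed names; the kernel helper operates on bit waitqueues and memory barriers rather than pthreads.

```c
#include <pthread.h>

/* Hypothetical userspace model of a flag word with waiters. */
struct model_flags {
	pthread_mutex_t lock;
	pthread_cond_t changed;
	unsigned long bits;
};

/* Clear the bit and wake all waiters in one step, so neither is forgotten. */
static void model_clear_and_wake_up_bit(struct model_flags *f, int bit)
{
	pthread_mutex_lock(&f->lock);
	f->bits &= ~(1UL << bit);
	pthread_cond_broadcast(&f->changed);
	pthread_mutex_unlock(&f->lock);
}

/* Block until the bit is clear. */
static void model_wait_on_bit(struct model_flags *f, int bit)
{
	pthread_mutex_lock(&f->lock);
	while (f->bits & (1UL << bit))
		pthread_cond_wait(&f->changed, &f->lock);
	pthread_mutex_unlock(&f->lock);
}
```

If a caller cleared the bit without the broadcast, a thread already sleeping in model_wait_on_bit() would never be woken, which is the same class of hang the commit describes.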
[1] https://syzkaller.appspot.com/bug?id=b297474817af98d5796bc544e1bb806fc3da0e5e
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+c0cf869505e03bdf1a24@syzkaller.appspotmail.com>
Fixes: 5318ce7d46866e1d ("bdi: Shutdown writeback on all cgwbs in cgwb_bdi_destroy()")
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4eaf431f6f71bbed40a4c733ffe93a7e8cedf9d9 upstream.
syzbot has triggered a NULL ptr dereference when allocation fault
injection enforces a failure and alloc_mem_cgroup_per_node_info
initializes memcg->nodeinfo only half way through.
But __mem_cgroup_free still tries to free all per-node data and
dereferences pn->lruvec_stat_cpu unconditionally, even if the specific
per-node data hasn't been initialized.
The bug is quite unlikely to hit because small allocations do not fail
and it would take a large number of NUMA nodes to make struct
mem_cgroup_per_node large enough to cross the costly order.
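The general pattern, cleanup code that tolerates a partially initialized array, can be sketched as follows. The structures are hypothetical simplifications, and the NULL check is an assumption about the shape of the fix rather than a copy of it.

```c
#include <stdlib.h>

#define MODEL_NR_NODES 4

/* Hypothetical, simplified per-node data. */
struct model_per_node {
	int *stat;
};

struct model_group {
	struct model_per_node *nodeinfo[MODEL_NR_NODES];
};

/* Free one node's data; tolerate a slot that was never allocated. */
static void model_free_node(struct model_group *g, int node)
{
	struct model_per_node *pn = g->nodeinfo[node];

	if (!pn)
		return;

	free(pn->stat);
	free(pn);
	g->nodeinfo[node] = NULL;
}

/* Safe even if initialization stopped half way through the nodes. */
static void model_free_group(struct model_group *g)
{
	for (int node = 0; node < MODEL_NR_NODES; node++)
		model_free_node(g, node);
}
```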
Link: http://lkml.kernel.org/r/20180406100906.17790-1-mhocko@kernel.org
Reported-by: syzbot+8a5de3cce7cdc70e9ebe@syzkaller.appspotmail.com
Fixes: 00f3ca2c2d66 ("mm: memcontrol: per-lruvec stats infrastructure")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add arm64 to the list of architectures which support
memory hotplug.
Change-Id: Iefeb8294bf06eaebb17a3b3aa8b33bb3b7133099
Signed-off-by: Arun KS <arunks@codeaurora.org>
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Sudarshan Rajagopalan <sudaraja@codeaurora.org>
Indirectly reclaimable memory can consume a significant part of total
memory and it's actually reclaimable (it will be released under actual
memory pressure).
So, the overcommit logic should treat it as free.
Otherwise, it's possible to cause random system-wide memory allocation
failures by consuming a significant amount of memory by indirectly
reclaimable memory, e.g. dentry external names.
If the GUESS overcommit policy is used, this could be exploited for a
denial-of-service attack under some conditions.
The following program illustrates the approach. It causes the kernel to
allocate an unreclaimable kmalloc-256 chunk for each stat() call, so
that at some point the overcommit logic may start blocking large
allocations system-wide.
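    #include <stdio.h>
    #include <sys/stat.h>

    int main(void)
    {
            /* One extra byte so the suffix's trailing NUL stays in bounds. */
            char buf[257];
            unsigned long i;
            struct stat statbuf;

            /* Build a 256-character path: '/' followed by underscores. */
            buf[0] = '/';
            for (i = 1; i < sizeof(buf) - 1; i++)
                    buf[i] = '_';

            /* Give each lookup a different 8-character suffix. */
            for (i = 0; 1; i++) {
                    snprintf(&buf[248], sizeof(buf) - 248, "%8lu", i);
                    stat(buf, &statbuf);
            }

            return 0;
    }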
This patch in combination with related indirectly reclaimable memory
patches closes this issue.
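The accounting change amounts to a small piece of arithmetic: a byte counter is converted to whole pages and added to the free estimate. A minimal sketch, assuming 4 KiB pages and using hypothetical names:

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumes 4 KiB pages */

/*
 * Hypothetical helper: pages the overcommit heuristic may treat as free
 * once indirectly reclaimable memory (tracked in bytes) is included.
 * Partial pages are rounded down.
 */
static uint64_t model_free_pages(uint64_t free_pages,
				 uint64_t indirectly_reclaimable_bytes)
{
	return free_pages + (indirectly_reclaimable_bytes >> MODEL_PAGE_SHIFT);
}
```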
Link: http://lkml.kernel.org/r/20180313130041.8078-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-Commit: d79f7aa496fc94d763f67b833a1f36f4c171176f
Git-Repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Change-Id: I6daf49a77a687446135c5d21828932e28a79fc19
Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
Patch series "indirectly reclaimable memory", v2.
This patchset introduces the concept of indirectly reclaimable memory
and applies it to fix the issue where a large number of dentries with
external names can significantly affect the MemAvailable value.
This patch (of 3):
Introduce the concept of indirectly reclaimable memory and add the
corresponding memory counter and /proc/vmstat item.
Indirectly reclaimable memory is any sort of memory used by the kernel
(except for reclaimable slabs) that is actually reclaimable, i.e. will
be released under memory pressure.
The counter is in bytes, as it's not always possible to count such
objects in pages. The name contains BYTES by analogy to
NR_KERNEL_STACK_KB.
Link: http://lkml.kernel.org/r/20180305133743.12746-2-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-Commit: eb59254608bc1d42c4c6afdcdce9c0d3ce02b318
Git-Repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Change-Id: I6ea0d449210973c92f57f3b7f5173e1ec85c81f8
Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org>
Port support for retrying CMA allocations from 3.10
to 3.18 to help resolve CMA allocation failures.
It was observed that CMA pages are sometimes
pinned by BG processes that are scheduled out in their exit
path. Since BG processes have lower priority, they get
smaller time slices from the scheduler and therefore take
longer to free up CMA pages.
Also, when a process is being forked, copy_one_pte
may create copy-on-write mappings. When this is done,
the page _count and page _mapcount are each
incremented sequentially. If the process is context
switched out after incrementing the _count but before
incrementing the _mapcount, the page will appear
temporarily pinned.
So instead of failing to allocate and directly
returning an error on the CMA allocation path, we do 2
retries, with sleeps, to give the system an opportunity
to unpin any pinned pages.
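The retry scheme can be sketched as a generic loop. All names are hypothetical, the use of -EBUSY to signal a temporarily pinned page is an assumption, and the sleep length is left to the caller because the commit text does not state one.

```c
#include <errno.h>
#include <time.h>

#define MODEL_MAX_RETRIES 2	/* "2 retries", as in the text above */

typedef int (*model_alloc_fn)(void *ctx);

/*
 * Try once, then retry up to MODEL_MAX_RETRIES times while the attempt
 * reports -EBUSY, sleeping sleep_ns nanoseconds (below one second)
 * before each retry.
 */
static int model_alloc_with_retry(model_alloc_fn try_alloc, void *ctx,
				  long sleep_ns)
{
	struct timespec delay = { 0, sleep_ns };
	int ret = try_alloc(ctx);

	for (int retry = 0; ret == -EBUSY && retry < MODEL_MAX_RETRIES; retry++) {
		nanosleep(&delay, NULL);
		ret = try_alloc(ctx);
	}
	return ret;
}

/* Example attempt function: fails with -EBUSY until *ctx reaches zero. */
static int model_flaky_alloc(void *ctx)
{
	int *busy_left = ctx;

	if (*busy_left > 0) {
		(*busy_left)--;
		return -EBUSY;
	}
	return 0;
}
```

With model_flaky_alloc, a page that stays busy for two attempts is obtained on the final retry, while one that stays busy for three attempts still results in -EBUSY.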
Change-Id: I022a9341f8ee44f281c7cb34769695843e97d684
Signed-off-by: Susheel Khiani <skhiani@codeaurora.org>
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Commit "mm: vmscan: fix the page state calculation in too_many_isolated"
fixed an issue where a number of tasks were blocked in the reclaim path
for seconds because vmstat_diff was not synced in time.
A similar problem can happen in isolate_migratepages_block, where
a similar calculation is performed. This patch fixes that.
Change-Id: Ie74f108ef770da688017b515fe37faea6f384589
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
vmstat events currently count pgpgout, but that includes
only the writebacks, and not the reclaim of clean
pages. Add an event to count clean page evictions. This is
helpful to evaluate page thrashing cases.
Change-Id: Icfb797877a544a58c289074bdc290dfbc1384514
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Allow other functions to dump the list of tasks.
Useful when debugging memory leaks.
Change-Id: I76c33a118a9765b4c2276e8c76de36399c78dbf6
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>