12063 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
818299f6bd This is the 4.14.56 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltNuVYACgkQONu9yGCS
 aT7kTA/+MRHC5oFvdnhSsF6jAHsY9rgJNQXPtZCFhZnHhhYHtubQ2OJOmSZ7IfM0
 9yhz/7vijC9+tLufXQxQnu2UUL3ojNu1+l+q9s0U1GUzNiONlJ9q/CyB4xjXFRCS
 1RdiDZaQbIqUCYs38UCTsEJF65uKjzQ6dpF21XdIXp5FPxgiZawo4HpjQRJswbAl
 Du97ybMEPN3XnAn207GjZwy58ubRLF5HDG1sqNGfjVWJ7oMTi+QJOCvY3PJtU3j2
 unS0qjxLU432rOyDfaJK7Yj9s61zu0PurbJrHo+dw3O3hd/Og7soqoqohUEjZWXd
 z7jjrntXZOZ/0st2yHmygfAPUJm/8jsh7Pd39Jgyfeu/3Clo51gO494rwATQsyE5
 mwIdllyzyMNBEJI2F2fxE60WlFsbTjeBOX3BaOwnF8pGRJWsCAfbFknRbuKh1fO5
 czFbUSOi00POw4WHT1rxV9u0yDBXmP47fy9zHquOim+PfK8pFvWuf6GSFjvqRTv8
 20w1w7eixMi09ZXOkgTJ3S00MKHSpxoaenI3n2NcEVVRgDEVfh3C/zelvvfCDMHD
 i36DN39Sj41PNA/R4n0TIA4W+ab9qBVzQl16yaj9JURR2rA92GyMVC1+Xjqo1Py3
 GRFOf2Gprlm0/vfkiRsMu9coAJuKV6+8fHXQU4mzHulKUaDWuJ0=
 =/wBU
 -----END PGP SIGNATURE-----

Merge 4.14.56 into android-4.14

Changes in 4.14.56
	media: rc: mce_kbd decoder: fix stuck keys
	ASoC: mediatek: preallocate pages use platform device
	MIPS: Call dump_stack() from show_regs()
	MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
	MIPS: Fix ioremap() RAM check
	mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states
	mmc: dw_mmc: fix card threshold control configuration
	ibmasm: don't write out of bounds in read handler
	staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data().
	staging: r8822be: Fix RTL8822be can't find any wireless AP
	ata: Fix ZBC_OUT command block check
	ata: Fix ZBC_OUT all bit handling
	vmw_balloon: fix inflation with batching
	ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
	USB: serial: ch341: fix type promotion bug in ch341_control_in()
	USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
	USB: serial: keyspan_pda: fix modem-status error handling
	USB: yurex: fix out-of-bounds uaccess in read handler
	USB: serial: mos7840: fix status-register error handling
	usb: quirks: add delay quirks for Corsair Strafe
	xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
	devpts: hoist out check for DEVPTS_SUPER_MAGIC
	devpts: resolve devpts bind-mounts
	Fix up non-directory creation in SGID directories
	genirq/affinity: assign vectors to all possible CPUs
	scsi: megaraid_sas: use adapter_type for all gen controllers
	scsi: megaraid_sas: replace instance->ctrl_context checks with instance->adapter_type
	scsi: megaraid_sas: replace is_ventura with adapter_type checks
	scsi: megaraid_sas: Create separate functions to allocate ctrl memory
	scsi: megaraid_sas: fix selection of reply queue
	ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION
	ALSA: hda - Handle pm failure during hotplug
	mm: do not drop unused pages when userfaultd is running
	fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
	fs, elf: make sure to page align bss in load_elf_library
	mm: do not bug_on on incorrect length in __mm_populate()
	tracing: Reorder display of TGID to be after PID
	kbuild: delete INSTALL_FW_PATH from kbuild documentation
	arm64: neon: Fix function may_use_simd() return error status
	tools build: fix # escaping in .cmd files for future Make
	IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values
	i2c: tegra: Fix NACK error handling
	iw_cxgb4: correctly enforce the max reg_mr depth
	xen: setup pv irq ops vector earlier
	nvme-pci: Remap CMB SQ entries on every controller reset
	crypto: x86/salsa20 - remove x86 salsa20 implementations
	uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
	netfilter: nf_queue: augment nfqa_cfg_policy
	netfilter: x_tables: initialise match/target check parameter struct
	loop: add recursion validation to LOOP_CHANGE_FD
	PM / hibernate: Fix oops at snapshot_write()
	RDMA/ucm: Mark UCM interface as BROKEN
	loop: remember whether sysfs_create_group() was done
	f2fs: give message and set need_fsck given broken node id
	Linux 4.14.56

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-17 12:29:15 +02:00
Michal Hocko
81ebc9decd mm: do not bug_on on incorrect length in __mm_populate()
commit bb177a732c4369bb58a1fe1df8f552b6f0f7db5f upstream.

syzbot has noticed that a specially crafted library can easily hit
VM_BUG_ON in __mm_populate

  kernel BUG at mm/gup.c:1242!
  invalid opcode: 0000 [#1] SMP
  CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
  RIP: 0010:__mm_populate+0x1e2/0x1f0
  Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
  Call Trace:
     vm_brk_flags+0xc3/0x100
     vm_brk+0x1f/0x30
     load_elf_library+0x281/0x2e0
     __ia32_sys_uselib+0x170/0x1e0
     do_fast_syscall_32+0xca/0x420
     entry_SYSENTER_compat+0x70/0x7f

The reason is that the length of the new brk is not page aligned when we
try to populate the it.  There is no reason to bug on that though.
do_brk_flags already aligns the length properly so the mapping is
expanded as it should.  All we need is to tell mm_populate about it.
Besides that there is absolutely no reason to to bug_on in the first
place.  The worst thing that could happen is that the last page wouldn't
get populated and that is far from putting system into an inconsistent
state.

Fix the issue by moving the length sanitization code from do_brk_flags
up to vm_brk_flags.  The only other caller of do_brk_flags is brk
syscall entry and it makes sure to provide the proper length so t here
is no need for sanitation and so we can use do_brk_flags without it.

Also remove the bogus BUG_ONs.

[osalvador@techadventures.net: fix up vm_brk_flags s@request@len@]
Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: syzbot <syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-17 11:39:29 +02:00
Christian Borntraeger
684a2d8ed5 mm: do not drop unused pages when userfaultd is running
commit bce73e4842390f7b7309c8e253e139db71288ac3 upstream.

KVM guests on s390 can notify the host of unused pages.  This can result
in pte_unused callbacks to be true for KVM guest memory.

If a page is unused (checked with pte_unused) we might drop this page
instead of paging it.  This can have side-effects on userfaultd, when
the page in question was already migrated:

The next access of that page will trigger a fault and a user fault
instead of faulting in a new and empty zero page.  As QEMU does not
expect a userfault on an already migrated page this migration will fail.

The most straightforward solution is to ignore the pte_unused hint if a
userfault context is active for this VMA.

Link: http://lkml.kernel.org/r/20180703171854.63981-1-borntraeger@de.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-17 11:39:29 +02:00
Greg Kroah-Hartman
2e9aed164f This is the 4.14.55 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltGFEUACgkQONu9yGCS
 aT5jMw//Y70OdIlEj7R/AjZZqAzwczZQhW/00ytJvCUZPzTazEhXxSbyc4d65VjQ
 4mdrl6nfYLOR0bF8gdWlGFCZRc405CXAc9INvixEUbj0w5MPWPQOmqT9gyOCg+Mu
 Iv5FXXEhk+H5vCIpN7g2PnoPFDFX7cC1vlqtbFfKRXCyGUZirmPl2vVcgID6scWN
 gB3+oWWcgNaCWuyz+tXrzzEQOJhMW84Y55wga1T1gjpE3yBreMU0j6DOXPTxrf/E
 VFs/h75ObR9yNB8O38d7zPrzQpaJHK1rhtqpJB+Thftxr0nO3Bn4Bg2FjnzMp8qP
 HNQKseeFfn0C7uNPjl3Pc5DH5BWfveOUPfbUHzuzyQZbK8E5O22BLhMxu+yS9PO2
 xzlN0OF8vP1VIR+gs12qopF9aGRCBM88YVCALb93fK+vEHhVOOa1kmfyTu3rCf/p
 M3rqw1YuW3TSwcskeL2MlSjnmxmM7HR/PmLJGD4xdmCwQtLAljVTD/sIUZOiPchh
 fH8CQc6QJEWo25oNSvdjQTdQtTTORMaU7JZ8TxEfbE7DRb4ziBpLNIxAanYc8vEw
 qXRXkigTdOW/Fb2X7vLxANXxXc5Xd4gRxjRJZfvN0ekw8GSkyk7wpNyURGDGt9UY
 kPMal06BUg7zEjHc16xVhrIed7PzE+FfTTzEspBOtbMkVzmHCTk=
 =dpg4
 -----END PGP SIGNATURE-----

Merge 4.14.55 into android-4.14

Changes in 4.14.55
	userfaultfd: hugetlbfs: fix userfaultfd_huge_must_wait() pte access
	mm: hugetlb: yield when prepping struct pages
	tracing: Fix missing return symbol in function_graph output
	scsi: sg: mitigate read/write abuse
	scsi: target: Fix truncated PR-in ReadKeys response
	s390: Correct register corruption in critical section cleanup
	drbd: fix access after free
	vfio: Use get_user_pages_longterm correctly
	cifs: Fix use after free of a mid_q_entry
	cifs: Fix memory leak in smb2_set_ea()
	cifs: Fix infinite loop when using hard mount option
	cifs: Fix slab-out-of-bounds in send_set_info() on SMB2 ACE setting
	drm: Use kvzalloc for allocating blob property memory
	drm/udl: fix display corruption of the last line
	jbd2: don't mark block as modified if the handle is out of credits
	ext4: add corruption check in ext4_xattr_set_entry()
	ext4: always verify the magic number in xattr blocks
	ext4: make sure bitmaps and the inode table don't overlap with bg descriptors
	ext4: always check block group bounds in ext4_init_block_bitmap()
	ext4: only look at the bg_flags field if it is valid
	ext4: verify the depth of extent tree in ext4_find_extent()
	ext4: include the illegal physical block in the bad map ext4_error msg
	ext4: clear i_data in ext4_inode_info when removing inline data
	ext4: never move the system.data xattr out of the inode body
	ext4: avoid running out of journal credits when appending to an inline file
	ext4: add more inode number paranoia checks
	ext4: add more mount time checks of the superblock
	ext4: check superblock mapped prior to committing
	block: factor out __blkdev_issue_zero_pages()
	block: cope with WRITE ZEROES failing in blkdev_issue_zeroout()
	HID: i2c-hid: Fix "incomplete report" noise
	HID: hiddev: fix potential Spectre v1
	HID: debug: check length before copy_to_user()
	irq/core: Fix boot crash when the irqaffinity= boot parameter is passed on CPUMASK_OFFSTACK=y kernels(v1)
	mm: hwpoison: disable memory error handling on 1GB hugepage
	media: vb2: core: Finish buffers at the end of the stream
	f2fs: truncate preallocated blocks in error case
	Revert "dpaa_eth: fix error in dpaa_remove()"
	Kbuild: fix # escaping in .cmd files for future Make
	media: cx25840: Use subdev host data for PLL override
	mtd: rawnand: mxc: set spare area size register explicitly
	fs: allow per-device dax status checking for filesystems
	dax: change bdev_dax_supported() to support boolean returns
	dax: check for QUEUE_FLAG_DAX in bdev_dax_supported()
	dm: set QUEUE_FLAG_DAX accordingly in dm_table_set_restrictions()
	dm: prevent DAX mounts if not supported
	mtd: cfi_cmdset_0002: Change definition naming to retry write operation
	mtd: cfi_cmdset_0002: Change erase functions to retry for error
	mtd: cfi_cmdset_0002: Change erase functions to check chip good only
	netfilter: nf_log: don't hold nf_log_mutex during user access
	staging: comedi: quatech_daqp_cs: fix no-op loop daqp_ao_insn_write()
	sched, tracing: Fix trace_sched_pi_setprio() for deboosting
	Revert mm/vmstat.c: fix vmstat_update() preemption BUG
	Linux 4.14.55

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-11 16:46:10 +02:00
Sebastian Andrzej Siewior
b3ef356a09 Revert mm/vmstat.c: fix vmstat_update() preemption BUG
commit 28557cc106e6d2aa8b8c5c7687ea9f8055ff3911 upstream.

Revert commit c7f26ccfb2c3 ("mm/vmstat.c: fix vmstat_update() preemption
BUG").  Steven saw a "using smp_processor_id() in preemptible" message
and added a preempt_disable() section around it to keep it quiet.  This
is not the right thing to do it does not fix the real problem.

vmstat_update() is invoked by a kworker on a specific CPU.  This worker
it bound to this CPU.  The name of the worker was "kworker/1:1" so it
should have been a worker which was bound to CPU1.  A worker which can
run on any CPU would have a `u' before the first digit.

smp_processor_id() can be used in a preempt-enabled region as long as
the task is bound to a single CPU which is the case here.  If it could
run on an arbitrary CPU then this is the problem we have an should seek
to resolve.

Not only this smp_processor_id() must not be migrated to another CPU but
also refresh_cpu_vm_stats() which might access wrong per-CPU variables.
Not to mention that other code relies on the fact that such a worker
runs on one specific CPU only.

Therefore revert that commit and we should look instead what broke the
affinity mask of the kworker.

Link: http://lkml.kernel.org/r/20180504104451.20278-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-11 16:29:24 +02:00
Naoya Horiguchi
b16a6af974 mm: hwpoison: disable memory error handling on 1GB hugepage
commit 31286a8484a85e8b4e91ddb0f5415aee8a416827 upstream.

Recently the following BUG was reported:

    Injecting memory failure for pfn 0x3c0000 at process virtual address 0x7fe300000000
    Memory failure: 0x3c0000: recovery action for huge page: Recovered
    BUG: unable to handle kernel paging request at ffff8dfcc0003000
    IP: gup_pgd_range+0x1f0/0xc20
    PGD 17ae72067 P4D 17ae72067 PUD 0
    Oops: 0000 [#1] SMP PTI
    ...
    CPU: 3 PID: 5467 Comm: hugetlb_1gb Not tainted 4.15.0-rc8-mm1-abc+ #3
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014

You can easily reproduce this by calling madvise(MADV_HWPOISON) twice on
a 1GB hugepage.  This happens because get_user_pages_fast() is not aware
of a migration entry on pud that was created in the 1st madvise() event.

I think that conversion to pud-aligned migration entry is working, but
other MM code walking over page table isn't prepared for it.  We need
some time and effort to make all this work properly, so this patch
avoids the reported bug by just disabling error handling for 1GB
hugepage.

[n-horiguchi@ah.jp.nec.com: v2]
  Link: http://lkml.kernel.org/r/1517284444-18149-1-git-send-email-n-horiguchi@ah.jp.nec.com
Link: http://lkml.kernel.org/r/1517207283-15769-1-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-11 16:29:20 +02:00
Cannon Matthews
48b019a51a mm: hugetlb: yield when prepping struct pages
commit 520495fe96d74e05db585fc748351e0504d8f40d upstream.

When booting with very large numbers of gigantic (i.e.  1G) pages, the
operations in the loop of gather_bootmem_prealloc, and specifically
prep_compound_gigantic_page, takes a very long time, and can cause a
softlockup if enough pages are requested at boot.

For example booting with 3844 1G pages requires prepping
(set_compound_head, init the count) over 1 billion 4K tail pages, which
takes considerable time.

Add a cond_resched() to the outer loop in gather_bootmem_prealloc() to
prevent this lockup.

Tested: Booted with softlockup_panic=1 hugepagesz=1G hugepages=3844 and
no softlockup is reported, and the hugepages are reported as
successfully setup.

Link: http://lkml.kernel.org/r/20180627214447.260804-1-cannonmatthews@google.com
Signed-off-by: Cannon Matthews <cannonmatthews@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-11 16:29:13 +02:00
Greg Kroah-Hartman
57c28741d0 This is the 4.14.53 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAls7QPEACgkQONu9yGCS
 aT5Zuw//UYR0Hahnjiv61N2NCo5cH+uSOc0XjR/a8iTBHVa5lN459dmrKVUDJKyS
 JrIJjwsaUL5H/VHN/XrdRUQMqo38osQ395t+sVCzVaouaJ0nYlEaxVexI0E87mpk
 zsd7qF0HfgGxOEEVfCcxlwKDzgstSNMP3KWprTZZ/5V04NjPlOXPsNOnKj6PWKTI
 4XCp7OrVQhL5zFQKm0kPok9CHrunjjYpF0pgftKblhdB/RPi0E/XbpLrW5hDxOvY
 MxnzKWKHsbEzV6PJKFNmEvFc4D3/Dm3mDG9aI7fL4FbnSBxkxKrzkAX8HP163Lc1
 cNiwhqo4v2IsfVvuJcV9+toVsg+UHcmPETd02hfhIBnN7lCo56+IBoo2FTsV9BRy
 AIWtwzpBj52j0gXTHhORYRhQqa6Jd/N7+9Aay40avWs8NI1tokOGfgifLoJlbXqE
 spfMZdK1ihiUNav2PmY7WklPlN4OeGGcMKvt0bJ4IY2nprI/oeKEUvAkwC5CVRo+
 w/Qvgp94vJDALWRA7e0dUR2cQMN0Y9ELLCy08KgdzRDTUY5f0xVw9Qz0Swx1Zxgk
 DwD+nxscEzr4n0wKtcLkkt2wu9sS/eUeAAHKFqNKRtHQvgqx0oymgow35pw4XHjt
 04sXUemWUXzR73T55HC960vWBrpu67HbNAyGqlCbiATX63euEDY=
 =YCfp
 -----END PGP SIGNATURE-----

Merge 4.14.53 into android-4.14

Changes in 4.14.53
	x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()
	x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
	x86/mce: Improve error message when kernel cannot recover
	x86/mce: Check for alternate indication of machine check recovery on Skylake
	x86/mce: Fix incorrect "Machine check from unknown source" message
	x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
	x86: Call fixup_exception() before notify_die() in math_error()
	m68k/mm: Adjust VM area to be unmapped by gap size for __iounmap()
	m68k/mac: Fix SWIM memory resource end address
	serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version
	signal/xtensa: Consistenly use SIGBUS in do_unaligned_user
	PM / Domains: Fix error path during attach in genpd
	PM / core: Fix supplier device runtime PM usage counter imbalance
	PM / OPP: Update voltage in case freq == old_freq
	usb: do not reset if a low-speed or full-speed device timed out
	1wire: family module autoload fails because of upper/lower case mismatch.
	ASoC: dapm: delete dapm_kcontrol_data paths list before freeing it
	ASoC: cs35l35: Add use_single_rw to regmap config
	ASoC: cirrus: i2s: Fix LRCLK configuration
	ASoC: cirrus: i2s: Fix {TX|RX}LinCtrlData setup
	thermal: bcm2835: Stop using printk format %pCr
	clk: renesas: cpg-mssr: Stop using printk format %pCr
	lib/vsprintf: Remove atomic-unsafe support for %pCr
	ftrace/selftest: Have the reset_trigger code be a bit more careful
	mips: ftrace: fix static function graph tracing
	branch-check: fix long->int truncation when profiling branches
	ipmi:bt: Set the timeout before doing a capabilities check
	Bluetooth: hci_qca: Avoid missing rampatch failure with userspace fw loader
	printk: fix possible reuse of va_list variable
	fuse: fix congested state leak on aborted connections
	fuse: atomic_o_trunc should truncate pagecache
	fuse: don't keep dead fuse_conn at fuse_fill_super().
	fuse: fix control dir setup and teardown
	powerpc/mm/hash: Add missing isync prior to kernel stack SLB switch
	powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
	powerpc/perf: Fix memory allocation for core-imc based on num_possible_cpus()
	powerpc/ptrace: Fix enforcement of DAWR constraints
	powerpc/powernv/ioda2: Remove redundant free of TCE pages
	powerpc/powernv: copy/paste - Mask SO bit in CR
	powerpc/powernv/cpuidle: Init all present cpus for deep states
	cpuidle: powernv: Fix promotion from snooze if next state disabled
	powerpc/fadump: Unregister fadump on kexec down path.
	soc: rockchip: power-domain: Fix wrong value when power up pd with writemask
	cxl: Disable prefault_mode in Radix mode
	ARM: 8764/1: kgdb: fix NUMREGBYTES so that gdb_regs[] is the correct size
	ARM: dts: Fix SPI node for Arria10
	ARM: dts: socfpga: Fix NAND controller node compatible
	ARM: dts: socfpga: Fix NAND controller clock supply
	ARM: dts: socfpga: Fix NAND controller node compatible for Arria10
	arm64: Fix syscall restarting around signal suppressed by tracer
	arm64: kpti: Use early_param for kpti= command-line option
	arm64: mm: Ensure writes to swapper are ordered wrt subsequent cache maintenance
	ARM64: dts: meson: disable sd-uhs modes on the libretech-cc
	of: overlay: validate offset from property fixups
	of: unittest: for strings, account for trailing \0 in property length field
	of: platform: stop accessing invalid dev in of_platform_device_destroy
	tpm: fix use after free in tpm2_load_context()
	tpm: fix race condition in tpm_common_write()
	IB/qib: Fix DMA api warning with debug kernel
	IB/{hfi1, qib}: Add handling of kernel restart
	IB/mlx4: Mark user MR as writable if actual virtual memory is writable
	IB/core: Make testing MR flags for writability a static inline function
	IB/mlx5: Fetch soft WQE's on fatal error state
	IB/isert: Fix for lib/dma_debug check_sync warning
	IB/isert: fix T10-pi check mask setting
	IB/hfi1: Fix fault injection init/exit issues
	IB/hfi1: Reorder incorrect send context disable
	IB/hfi1: Optimize kthread pointer locking when queuing CQ entries
	IB/hfi1: Fix user context tail allocation for DMA_RTAIL
	RDMA/mlx4: Discard unknown SQP work requests
	xprtrdma: Return -ENOBUFS when no pages are available
	mtd: cfi_cmdset_0002: Change write buffer to check correct value
	mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock()
	mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips
	mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boudary
	mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking.
	MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum
	PCI: hv: Make sure the bus domain is really unique
	PCI: Add ACS quirk for Intel 7th & 8th Gen mobile
	PCI: Add ACS quirk for Intel 300 series
	PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
	auxdisplay: fix broken menu
	pinctrl: samsung: Correct EINTG banks order
	pinctrl: devicetree: Fix pctldev pointer overwrite
	cpufreq: intel_pstate: Fix scaling max/min limits with Turbo 3.0
	MIPS: io: Add barrier after register read in inX()
	time: Make sure jiffies_to_msecs() preserves non-zero time periods
	irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node
	X.509: unpack RSA signatureValue field from BIT STRING
	Btrfs: fix return value on rename exchange failure
	iio: adc: ad7791: remove sample freq sysfs attributes
	iio: sca3000: Fix an error handling path in 'sca3000_probe()'
	mm: fix __gup_device_huge vs unmap
	scsi: hpsa: disable device during shutdown
	scsi: qla2xxx: Fix setting lower transfer speed if GPSC fails
	scsi: qla2xxx: Mask off Scope bits in retry delay
	scsi: zfcp: fix missing SCSI trace for result of eh_host_reset_handler
	scsi: zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF
	scsi: zfcp: fix misleading REC trigger trace where erp_action setup failed
	scsi: zfcp: fix missing REC trigger trace on terminate_rport_io early return
	scsi: zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED
	scsi: zfcp: fix missing REC trigger trace for all objects in ERP_FAILED
	scsi: zfcp: fix missing REC trigger trace on enqueue without ERP thread
	linvdimm, pmem: Preserve read-only setting for pmem devices
	clk: at91: PLL recalc_rate() now using cached MUL and DIV values
	rtc: sun6i: Fix bit_idx value for clk_register_gate
	md: fix two problems with setting the "re-add" device state.
	rpmsg: smd: do not use mananged resources for endpoints and channels
	ubi: fastmap: Cancel work upon detach
	ubi: fastmap: Correctly handle interrupted erasures in EBA
	UBIFS: Fix potential integer overflow in allocation
	backlight: as3711_bl: Fix Device Tree node lookup
	backlight: max8925_bl: Fix Device Tree node lookup
	backlight: tps65217_bl: Fix Device Tree node lookup
	mfd: intel-lpss: Program REMAP register in PIO mode
	mfd: intel-lpss: Fix Intel Cannon Lake LPSS I2C input clock
	arm: dts: mt7623: fix invalid memory node being generated
	perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
	perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
	perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
	perf intel-pt: Fix MTC timing after overflow
	perf intel-pt: Fix "Unexpected indirect branch" error
	perf intel-pt: Fix packet decoding of CYC packets
	perf vendor events: Add Goldmont Plus V1 event file
	perf/x86/intel/uncore: Add event constraint for BDX PCU
	media: vsp1: Release buffers for each video node
	media: v4l2-compat-ioctl32: prevent go past max size
	media: cx231xx: Add support for AverMedia DVD EZMaker 7
	media: dvb_frontend: fix locking issues at dvb_frontend_get_event()
	nfsd: restrict rd_maxcount to svc_max_payload in nfsd_encode_readdir
	NFSv4: Fix possible 1-byte stack overflow in nfs_idmap_read_and_verify_message
	NFSv4: Revert commit 5f83d86cf531d ("NFSv4.x: Fix wraparound issues..")
	NFSv4: Fix a typo in nfs41_sequence_process
	video: uvesafb: Fix integer overflow in allocation
	ACPI / LPSS: Add missing prv_offset setting for byt/cht PWM devices
	Input: elan_i2c - add ELAN0618 (Lenovo v330 15IKB) ACPI ID
	pwm: lpss: platform: Save/restore the ctrl register over a suspend/resume
	rbd: flush rbd_dev->watch_dwork after watch is unregistered
	mm/ksm.c: ignore STABLE_FLAG of rmap_item->address in rmap_walk_ksm()
	mm: fix devmem_is_allowed() for sub-page System RAM intersections
	xen: Remove unnecessary BUG_ON from __unbind_from_irq()
	udf: Detect incorrect directory size
	Input: xpad - fix GPD Win 2 controller name
	Input: elan_i2c_smbus - fix more potential stack buffer overflows
	Input: elantech - enable middle button of touchpads on ThinkPad P52
	Input: elantech - fix V4 report decoding for module with middle key
	ALSA: timer: Fix UBSAN warning at SNDRV_TIMER_IOCTL_NEXT_DEVICE ioctl
	ALSA: hda/realtek - Fix pop noise on Lenovo P50 & co
	ALSA: hda/realtek - Add a quirk for FSC ESPRIMO U9210
	ALSA: hda/realtek - Fix the problem of two front mics on more machines
	slub: fix failure when we delete and create a slab cache
	block: Fix transfer when chunk sectors exceeds max
	block: Fix cloning of requests with a special payload
	x86/efi: Fix efi_call_phys_epilog() with CONFIG_X86_5LEVEL=y
	dm zoned: avoid triggering reclaim from inside dmz_map()
	dm thin: handle running out of data space vs concurrent discard
	xhci: Fix use-after-free in xhci_free_virt_device
	Linux 4.14.53

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-03 18:26:32 +02:00
Mikulas Patocka
804a0db743 slub: fix failure when we delete and create a slab cache
commit d50d82faa0c964e31f7a946ba8aba7c715ca7ab0 upstream.

In kernel 4.17 I removed some code from dm-bufio that did slab cache
merging (commit 21bb13276768: "dm bufio: remove code that merges slab
caches") - both slab and slub support merging caches with identical
attributes, so dm-bufio now just calls kmem_cache_create and relies on
implicit merging.

This uncovered a bug in the slub subsystem - if we delete a cache and
immediatelly create another cache with the same attributes, it fails
because of duplicate filename in /sys/kernel/slab/.  The slub subsystem
offloads freeing the cache to a workqueue - and if we create the new
cache before the workqueue runs, it complains because of duplicate
filename in sysfs.

This patch fixes the bug by moving the call of kobject_del from
sysfs_slab_remove_workfn to shutdown_cache.  kobject_del must be called
while we hold slab_mutex - so that the sysfs entry is deleted before a
cache with the same attributes could be created.

Running device-mapper-test-suite with:

  dmtest run --suite thin-provisioning -n /commit_failure_causes_fallback/

triggered:

  Buffer I/O error on dev dm-0, logical block 1572848, async page read
  device-mapper: thin: 253:1: metadata operation 'dm_pool_alloc_data_block' failed: error = -5
  device-mapper: thin: 253:1: aborting current metadata transaction
  sysfs: cannot create duplicate filename '/kernel/slab/:a-0000144'
  CPU: 2 PID: 1037 Comm: kworker/u48:1 Not tainted 4.17.0.snitm+ #25
  Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
  Workqueue: dm-thin do_worker [dm_thin_pool]
  Call Trace:
   dump_stack+0x5a/0x73
   sysfs_warn_dup+0x58/0x70
   sysfs_create_dir_ns+0x77/0x80
   kobject_add_internal+0xba/0x2e0
   kobject_init_and_add+0x70/0xb0
   sysfs_slab_add+0xb1/0x250
   __kmem_cache_create+0x116/0x150
   create_cache+0xd9/0x1f0
   kmem_cache_create_usercopy+0x1c1/0x250
   kmem_cache_create+0x18/0x20
   dm_bufio_client_create+0x1ae/0x410 [dm_bufio]
   dm_block_manager_create+0x5e/0x90 [dm_persistent_data]
   __create_persistent_data_objects+0x38/0x940 [dm_thin_pool]
   dm_pool_abort_metadata+0x64/0x90 [dm_thin_pool]
   metadata_operation_failed+0x59/0x100 [dm_thin_pool]
   alloc_data_block.isra.53+0x86/0x180 [dm_thin_pool]
   process_cell+0x2a3/0x550 [dm_thin_pool]
   do_worker+0x28d/0x8f0 [dm_thin_pool]
   process_one_work+0x171/0x370
   worker_thread+0x49/0x3f0
   kthread+0xf8/0x130
   ret_from_fork+0x35/0x40
  kobject_add_internal failed for :a-0000144 with -EEXIST, don't try to register things with the same name in the same directory.
  kmem_cache_create(dm_bufio_buffer-16) failed with error -17

Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1806151817130.6333@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Mike Snitzer <snitzer@redhat.com>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:25:04 +02:00
Jia He
6f23028480 mm/ksm.c: ignore STABLE_FLAG of rmap_item->address in rmap_walk_ksm()
commit 1105a2fc022f3c7482e32faf516e8bc44095f778 upstream.

In our armv8a server(QDF2400), I noticed lots of WARN_ON caused by
PAGE_SIZE unaligned for rmap_item->address under memory pressure
tests(start 20 guests and run memhog in the host).

  WARNING: CPU: 4 PID: 4641 at virt/kvm/arm/mmu.c:1826 kvm_age_hva_handler+0xc0/0xc8
  CPU: 4 PID: 4641 Comm: memhog Tainted: G        W 4.17.0-rc3+ #8
  Call trace:
   kvm_age_hva_handler+0xc0/0xc8
   handle_hva_to_gpa+0xa8/0xe0
   kvm_age_hva+0x4c/0xe8
   kvm_mmu_notifier_clear_flush_young+0x54/0x98
   __mmu_notifier_clear_flush_young+0x6c/0xa0
   page_referenced_one+0x154/0x1d8
   rmap_walk_ksm+0x12c/0x1d0
   rmap_walk+0x94/0xa0
   page_referenced+0x194/0x1b0
   shrink_page_list+0x674/0xc28
   shrink_inactive_list+0x26c/0x5b8
   shrink_node_memcg+0x35c/0x620
   shrink_node+0x100/0x430
   do_try_to_free_pages+0xe0/0x3a8
   try_to_free_pages+0xe4/0x230
   __alloc_pages_nodemask+0x564/0xdc0
   alloc_pages_vma+0x90/0x228
   do_anonymous_page+0xc8/0x4d0
   __handle_mm_fault+0x4a0/0x508
   handle_mm_fault+0xf8/0x1b0
   do_page_fault+0x218/0x4b8
   do_translation_fault+0x90/0xa0
   do_mem_abort+0x68/0xf0
   el0_da+0x24/0x28

In rmap_walk_ksm, the rmap_item->address might still have the
STABLE_FLAG, then the start and end in handle_hva_to_gpa might not be
PAGE_SIZE aligned.  Thus it will cause exceptions in handle_hva_to_gpa
on arm64.

This patch fixes it by ignoring (not removing) the low bits of address
when doing rmap_walk_ksm.

IMO, it should be backported to stable tree.  the storm of WARN_ONs is
very easy for me to reproduce.  More than that, I watched a panic (not
reproducible) as follows:

  page:ffff7fe003742d80 count:-4871 mapcount:-2126053375 mapping: (null) index:0x0
  flags: 0x1fffc00000000000()
  raw: 1fffc00000000000 0000000000000000 0000000000000000 ffffecf981470000
  raw: dead000000000100 dead000000000200 ffff8017c001c000 0000000000000000
  page dumped because: nonzero _refcount
  CPU: 29 PID: 18323 Comm: qemu-kvm Tainted: G W 4.14.15-5.hxt.aarch64 #1
  Hardware name: <snip for confidential issues>
  Call trace:
    dump_backtrace+0x0/0x22c
    show_stack+0x24/0x2c
    dump_stack+0x8c/0xb0
    bad_page+0xf4/0x154
    free_pages_check_bad+0x90/0x9c
    free_pcppages_bulk+0x464/0x518
    free_hot_cold_page+0x22c/0x300
    __put_page+0x54/0x60
    unmap_stage2_range+0x170/0x2b4
    kvm_unmap_hva_handler+0x30/0x40
    handle_hva_to_gpa+0xb0/0xec
    kvm_unmap_hva_range+0x5c/0xd0

I even injected a fault on purpose in kvm_unmap_hva_range by seting
size=size-0x200, the call trace is similar as above.  So I thought the
panic is similarly caused by the root cause of WARN_ON.

Andrea said:

: It looks a straightforward safe fix, on x86 hva_to_gfn_memslot would
: zap those bits and hide the misalignment caused by the low metadata
: bits being erroneously left set in the address, but the arm code
: notices when that's the last page in the memslot and the hva_end is
: getting aligned and the size is below one page.
:
: I think the problem triggers in the addr += PAGE_SIZE of
: unmap_stage2_ptes that never matches end because end is aligned but
: addr is not.
:
: 	} while (pte++, addr += PAGE_SIZE, addr != end);
:
: x86 again only works on hva_start/hva_end after converting it to
: gfn_start/end and that being in pfn units the bits are zapped before
: they risk to cause trouble.

Jia He said:

: I've tested by myself in arm64 server (QDF2400,46 cpus,96G mem) Without
: this patch, the WARN_ON is very easy for reproducing.  After this patch, I
: have run the same benchmarch for a whole day without any WARN_ONs

Link: http://lkml.kernel.org/r/1525403506-6750-1-git-send-email-hejianet@gmail.com
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Jia He <hejianet@gmail.com>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:25:03 +02:00
Dan Williams
2d329968a8 mm: fix __gup_device_huge vs unmap
commit a9b6de77b1a3ff729f7bfc54b2e17711776a416c upstream.

get_user_pages_fast() for device pages is missing the typical validation
that all page references have been taken while the mapping was valid.
Without this validation truncate operations can not reliably coordinate
against new page reference events like O_DIRECT.

Cc: <stable@vger.kernel.org>
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:57 +02:00
Greg Kroah-Hartman
08850d51f9 This is the 4.14.52 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlsxg4wACgkQONu9yGCS
 aT5gohAAz5xy4C1KerI0nbJTpmGC1RNRTI1Ynwx8g+E69cEe0DhDqak6o+ZZBgNq
 asoVrDDUi9FkpeJfX2gK023pkbMcdFU9uXadlWtmMXFeXyAteVyw6OgSJOM1qMlH
 4H2XsHyEROpE6lwqVsT5Qk+UnzzjT7ypG3b1czn89szFeJf0mGzExtSTo01VaJad
 wccCwZ5MA1djhS34YZqZfSz1Nb0SUlT7zAoyES8+Cc70wTxT0xv/OhmXtukvTKzW
 5Yr/QS+OEa6eWMt2ObqkJsLB2bZogoR/QIkhEQCPnq+V8/QVrRu0dE0PbjJ2Ocn5
 tpORkQVELl/V7cTjevtcFH6dyH/7C82qHAlW7qRHLvYAuwamTppyt+a0jwyhnOEt
 vkb15A7GRgqwTLDS89M4kUxvR3Kkz5cOdFk95jgv3dkYc43nQvstV6GrXjtW+6oT
 P1tD/2oucwKIrOOx2FLkhETG9vCV408lOBQXK0Jb1bxBUVQTtl8b5mk4xIdmQF5E
 a8WJQYIs3NpCXzIbS2AAp6u82q2Cs931n13vqjIPlQ/fl8uImxZyyC+6hSne3X6y
 dhqERs9uHk9xKSp18K7BxBflyXaW5fWKGh/CmExxIKfIIrNDYAf6HFoSWhKcIbwT
 /g2S3eR5QaeYCmSA02ReBjb8D5PLhpdtM+FEo+xeI0UkGhblhf0=
 =8PkZ
 -----END PGP SIGNATURE-----

Merge 4.14.52 into android-4.14

Changes in 4.14.52
	bonding: re-evaluate force_primary when the primary slave name changes
	cdc_ncm: avoid padding beyond end of skb
	ipv6: allow PMTU exceptions to local routes
	net: dsa: add error handling for pskb_trim_rcsum
	net/sched: act_simple: fix parsing of TCA_DEF_DATA
	tcp: verify the checksum of the first data segment in a new connection
	socket: close race condition between sock_close() and sockfs_setattr()
	udp: fix rx queue len reported by diag and proc interface
	net: in virtio_net_hdr only add VLAN_HLEN to csum_start if payload holds vlan
	hv_netvsc: Fix a network regression after ifdown/ifup
	tls: fix use-after-free in tls_push_record
	NFSv4.1: Fix up replays of interrupted requests
	ext4: fix hole length detection in ext4_ind_map_blocks()
	ext4: update mtime in ext4_punch_hole even if no blocks are released
	ext4: do not allow external inodes for inline data
	ext4: bubble errors from ext4_find_inline_data_nolock() up to ext4_iget()
	ext4: correctly handle a zero-length xattr with a non-zero e_value_offs
	ext4: fix fencepost error in check for inode count overflow during resize
	driver core: Don't ignore class_dir_create_and_add() failure.
	Btrfs: fix clone vs chattr NODATASUM race
	Btrfs: fix memory and mount leak in btrfs_ioctl_rm_dev_v2()
	btrfs: return error value if create_io_em failed in cow_file_range
	btrfs: scrub: Don't use inode pages for device replace
	ALSA: hda/realtek - Enable mic-mute hotkey for several Lenovo AIOs
	ALSA: hda/conexant - Add fixup for HP Z2 G4 workstation
	ALSA: hda - Handle kzalloc() failure in snd_hda_attach_pcm_stream()
	ALSA: hda: add dock and led support for HP EliteBook 830 G5
	ALSA: hda: add dock and led support for HP ProBook 640 G4
	x86/MCE: Fix stack out-of-bounds write in mce-inject.c: Flags_read()
	smb3: fix various xid leaks
	smb3: on reconnect set PreviousSessionId field
	CIFS: 511c54a2f69195b28afb9dd119f03787b1625bb4 adds a check for session expiry
	cifs: For SMB2 security informaion query, check for minimum sized security descriptor instead of sizeof FileAllInformation class
	nbd: fix nbd device deletion
	nbd: update size when connected
	nbd: use bd_set_size when updating disk size
	blk-mq: reinit q->tag_set_list entry only after grace period
	bdi: Move cgroup bdi_writeback to a dedicated low concurrency workqueue
	cpufreq: Fix new policy initialization during limits updates via sysfs
	cpufreq: governors: Fix long idle detection logic in load calculation
	libata: zpodd: small read overflow in eject_tray()
	libata: Drop SanDisk SD7UB3Q*G1001 NOLPM quirk
	w1: mxc_w1: Enable clock before calling clk_get_rate() on it
	x86/intel_rdt: Enable CMT and MBM on new Skylake stepping
	iwlwifi: fw: harden page loading code
	orangefs: set i_size on new symlink
	orangefs: report attributes_mask and attributes for statx
	HID: intel_ish-hid: ipc: register more pm callbacks to support hibernation
	HID: wacom: Correct logical maximum Y for 2nd-gen Intuos Pro large
	vhost: fix info leak due to uninitialized memory
	fs/binfmt_misc.c: do not allow offset overflow
	mm, page_alloc: do not break __GFP_THISNODE by zonelist reset
	Linux 4.14.52

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-06-26 09:14:49 +08:00
Vlastimil Babka
1d26c11295 mm, page_alloc: do not break __GFP_THISNODE by zonelist reset
commit 7810e6781e0fcbca78b91cf65053f895bf59e85f upstream.

In __alloc_pages_slowpath() we reset zonelist and preferred_zoneref for
allocations that can ignore memory policies.  The zonelist is obtained
from current CPU's node.  This is a problem for __GFP_THISNODE
allocations that want to allocate on a different node, e.g.  because the
allocating thread has been migrated to a different CPU.

This has been observed to break SLAB in our 4.4-based kernel, because
there it relies on __GFP_THISNODE working as intended.  If a slab page
is put on wrong node's list, then further list manipulations may corrupt
the list because page_to_nid() is used to determine which node's
list_lock should be locked and thus we may take a wrong lock and race.

Current SLAB implementation seems to be immune by luck thanks to commit
511e3a058812 ("mm/slab: make cache_grow() handle the page allocated on
arbitrary node") but there may be others assuming that __GFP_THISNODE
works as promised.

We can fix it by simply removing the zonelist reset completely.  There
is actually no reason to reset it, because memory policies and cpusets
don't affect the zonelist choice in the first place.  This was different
when commit 183f6371aac2 ("mm: ignore mempolicies when using
ALLOC_NO_WATERMARK") introduced the code, as mempolicies provided their
own restricted zonelists.

We might consider this for 4.17 although I don't know if there's
anything currently broken.

SLAB is currently not affected, but in kernels older than 4.7 that don't
yet have 511e3a058812 ("mm/slab: make cache_grow() handle the page
allocated on arbitrary node") it is.  That's at least 4.4 LTS.  Older
ones I'll have to check.

So stable backports should be more important, but will have to be
reviewed carefully, as the code went through many changes.  BTW I think
that also the ac->preferred_zoneref reset is currently useless if we
don't also reset ac->nodemask from a mempolicy to NULL first (which we
probably should for the OOM victims etc?), but I would leave that for a
separate patch.

Link: http://lkml.kernel.org/r/20180525130853.13915-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 183f6371aac2 ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK")
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-26 08:06:33 +08:00
Tejun Heo
67b46304b9 bdi: Move cgroup bdi_writeback to a dedicated low concurrency workqueue
commit f183464684190bacbfb14623bd3e4e51b7575b4c upstream.

From 0aa2e9b921d6db71150633ff290199554f0842a8 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Wed, 23 May 2018 10:29:00 -0700

cgwb_release() punts the actual release to cgwb_release_workfn() on
system_wq.  Depending on the number of cgroups or block devices, there
can be a lot of cgwb_release_workfn() in flight at the same time.

We're periodically seeing close to 256 kworkers getting stuck with the
following stack trace and overtime the entire system gets stuck.

  [<ffffffff810ee40c>] _synchronize_rcu_expedited.constprop.72+0x2fc/0x330
  [<ffffffff810ee634>] synchronize_rcu_expedited+0x24/0x30
  [<ffffffff811ccf23>] bdi_unregister+0x53/0x290
  [<ffffffff811cd1e9>] release_bdi+0x89/0xc0
  [<ffffffff811cd645>] wb_exit+0x85/0xa0
  [<ffffffff811cdc84>] cgwb_release_workfn+0x54/0xb0
  [<ffffffff810a68d0>] process_one_work+0x150/0x410
  [<ffffffff810a71fd>] worker_thread+0x6d/0x520
  [<ffffffff810ad3dc>] kthread+0x12c/0x160
  [<ffffffff81969019>] ret_from_fork+0x29/0x40
  [<ffffffffffffffff>] 0xffffffffffffffff

The events leading to the lockup are...

1. A lot of cgwb_release_workfn() is queued at the same time and all
   system_wq kworkers are assigned to execute them.

2. They all end up calling synchronize_rcu_expedited().  One of them
   wins and tries to perform the expedited synchronization.

3. However, that invovles queueing rcu_exp_work to system_wq and
   waiting for it.  Because #1 is holding all available kworkers on
   system_wq, rcu_exp_work can't be executed.  cgwb_release_workfn()
   is waiting for synchronize_rcu_expedited() which in turn is waiting
   for cgwb_release_workfn() to free up some of the kworkers.

We shouldn't be scheduling hundreds of cgwb_release_workfn() at the
same time.  There's nothing to be gained from that.  This patch
updates cgwb release path to use a dedicated percpu workqueue with
@max_active of 1.

While this resolves the problem at hand, it might be a good idea to
isolate rcu_exp_work to its own workqueue too as it can be used from
various paths and is prone to this sort of indirect A-A deadlocks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-26 08:06:32 +08:00
Greg Kroah-Hartman
a51b40cc70 This is the 4.14.51 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlsqpOYACgkQONu9yGCS
 aT7cZw/+NE0Bmn8BhIyf2h//jWKqQ50epMtuOrROhaB9onBS3gbH00JsH6Aop9jh
 9SJdJPveHb+cBEcNGIlx5u/WLvRxG64mDd1GgNcGoFnYOxl9y68XPS+2zlFGI66F
 CUqCDQS4DNS5KoXiLBJ48cDtuZNoSdlt8H5bC5qlFs16WIpj41CCG4cbkUk1eDzH
 CCR44mw7GxnmsF/44xuswhZZjCzGuOACWnhuYh8/dspGPZYOS0vBCX9RvhjBUFwD
 taLu9cm1kq8kQZBwt70+M36+OTwSS/rtVj/2g96l6QrLLCBk+OIjGO0yGaLXcTPx
 WA5Lxkt3stQbuttayddNkRsFsE+Cvi0r/wye9zKFxVqhaPad4/87aklHzKAnEehg
 Eu1JDR3ds2R4zSjifl7ACo2hWM//xIUcEDz4BvVjJSjVYTQamdsFHatRNl2NEW96
 TYgmrbJALdYPIl5AD6hmeCwU2WqjrJPZnV0X5jVcWgVTp07mIag6qxibwUmY0TOa
 IfBEXG1zHzAgYycAbQw1OFz0IHavX10tmpmoKZE4ay4vi3Rnt/OIsCZtXnabZbjy
 xpiBumMUz3GGdU+5yKT4Iw1Cfg4EEAp9+sWSiJzx+frrB9pn5pafK2/RhdvOCF+8
 MGyLOTbjz5v2IvprA5v76lUT1CjXcRbRE+YxmRSemAu1ruetBWY=
 =eyGS
 -----END PGP SIGNATURE-----

Merge 4.14.51 into android-4.14

Changes in 4.14.51
	clocksource/drivers/imx-tpm: Correct some registers operation flow
	Input: synaptics-rmi4 - fix an unchecked out of memory error path
	KVM: X86: fix incorrect reference of trace_kvm_pi_irte_update
	x86: Add check for APIC access address for vmentry of L2 guests
	MIPS: io: Prevent compiler reordering writeX()
	nfp: ignore signals when communicating with management FW
	perf report: Fix switching to another perf.data file
	fsnotify: fix ignore mask logic in send_to_group()
	MIPS: io: Add barrier after register read in readX()
	s390/smsgiucv: disable SMSG on module unload
	isofs: fix potential memory leak in mount option parsing
	MIPS: dts: Boston: Fix PCI bus dtc warnings:
	spi: sh-msiof: Fix bit field overflow writes to TSCR/RSCR
	doc: Add vendor prefix for Kieback & Peter GmbH
	dt-bindings: pinctrl: sunxi: Fix reference to driver
	dt-bindings: serial: sh-sci: Add support for r8a77965 (H)SCIF
	dt-bindings: dmaengine: rcar-dmac: document R8A77965 support
	clk: honor CLK_MUX_ROUND_CLOSEST in generic clk mux
	ASoC: rt5514: Add the missing register in the readable table
	eCryptfs: don't pass up plaintext names when using filename encryption
	soc: bcm: raspberrypi-power: Fix use of __packed
	soc: bcm2835: Make !RASPBERRYPI_FIRMWARE dummies return failure
	PCI: kirin: Fix reset gpio name
	ASoC: topology: Fix bugs of freeing soc topology
	xen: xenbus_dev_frontend: Really return response string
	ASoC: topology: Check widget kcontrols before deref.
	spi: cadence: Add usleep_range() for cdns_spi_fill_tx_fifo()
	blkcg: don't hold blkcg lock when deactivating policy
	tipc: fix infinite loop when dumping link monitor summary
	scsi: iscsi: respond to netlink with unicast when appropriate
	scsi: megaraid_sas: Do not log an error if FW successfully initializes.
	scsi: target: fix crash with iscsi target and dvd
	netfilter: nf_tables: NAT chain and extensions require NF_TABLES
	netfilter: nf_tables: fix out-of-bounds in nft_chain_commit_update
	ASoC: msm8916-wcd-analog: use threaded context for mbhc events
	drm/msm: Fix possible null dereference on failure of get_pages()
	drm/msm/dsi: use correct enum in dsi_get_cmd_fmt
	drm/msm: don't deref error pointer in the msm_fbdev_create error path
	blkcg: init root blkcg_gq under lock
	net: hns: Avoid action name truncation
	vfs: Undo an overly zealous MS_RDONLY -> SB_RDONLY conversion
	parisc: time: Convert read_persistent_clock() to read_persistent_clock64()
	scsi: storvsc: Set up correct queue depth values for IDE devices
	scsi: isci: Fix infinite loop in while loop
	mm, pagemap: fix swap offset value for PMD migration entry
	proc: revalidate kernel thread inodes to root:root
	kexec_file: do not add extra alignment to efi memmap
	mm: memcg: add __GFP_NOWARN in __memcg_schedule_kmem_cache_create()
	usb: typec: ucsi: fix tracepoint related build error
	ACPI / PM: Blacklist Low Power S0 Idle _DSM for ThinkPad X1 Tablet(2016)
	dt-bindings: meson-uart: DT fix s/clocks-names/clock-names/
	powerpc/powernv/memtrace: Let the arch hotunplug code flush cache
	net: phy: marvell: clear wol event before setting it
	ARM: dts: da850: fix W=1 warnings with pinmux node
	ACPI / watchdog: Prefer iTCO_wdt on Lenovo Z50-70
	drm/amdkfd: fix clock counter retrieval for node without GPU
	thermal: int3403_thermal: Fix NULL pointer deref on module load / probe
	net: ethtool: Add missing kernel doc for FEC parameters
	arm64: ptrace: remove addr_limit manipulation
	HID: lenovo: Add support for IBM/Lenovo Scrollpoint mice
	HID: wacom: Release device resource data obtained by devres_alloc()
	selftests: ftrace: Add a testcase for multiple actions on trigger
	rds: ib: Fix missing call to rds_ib_dev_put in rds_ib_setup_qp
	perf/x86/intel: Don't enable freeze-on-smi for PerfMon V1
	remoteproc: qcom: Fix potential device node leaks
	rpmsg: added MODULE_ALIAS for rpmsg_char
	HID: intel-ish-hid: use put_device() instead of kfree()
	blk-mq: fix sysfs inflight counter
	arm64: fix possible spectre-v1 in ptrace_hbp_get_event()
	KVM: arm/arm64: vgic: fix possible spectre-v1 in vgic_mmio_read_apr()
	libahci: Allow drivers to override stop_engine
	ata: ahci: mvebu: override ahci_stop_engine for mvebu AHCI
	x86/cpu/intel: Add missing TLB cpuid values
	bpf: fix uninitialized variable in bpf tools
	i2c: sprd: Prevent i2c accesses after suspend is called
	i2c: sprd: Fix the i2c count issue
	tipc: fix bug in function tipc_nl_node_dump_monitor
	nvme: depend on INFINIBAND_ADDR_TRANS
	nvmet-rdma: depend on INFINIBAND_ADDR_TRANS
	ib_srpt: depend on INFINIBAND_ADDR_TRANS
	ib_srp: depend on INFINIBAND_ADDR_TRANS
	IB: make INFINIBAND_ADDR_TRANS configurable
	IB/uverbs: Fix validating mandatory attributes
	RDMA/cma: Fix use after destroy access to net namespace for IPoIB
	RDMA/iwpm: fix memory leak on map_info
	IB/rxe: add RXE_START_MASK for rxe_opcode IB_OPCODE_RC_SEND_ONLY_INV
	IB/rxe: avoid double kfree_skb
	<linux/stringhash.h>: fix end_name_hash() for 64bit long
	IB/core: Make ib_mad_client_id atomic
	ARM: davinci: board-da830-evm: fix GPIO lookup for MMC/SD
	ARM: davinci: board-da850-evm: fix GPIO lookup for MMC/SD
	ARM: davinci: board-omapl138-hawk: fix GPIO numbers for MMC/SD lookup
	ARM: davinci: board-dm355-evm: fix broken networking
	dt-bindings: panel: lvds: Fix path to display timing bindings
	ARM: OMAP2+: powerdomain: use raw_smp_processor_id() for trace
	ARM: dts: logicpd-som-lv: Fix WL127x Startup Issues
	ARM: dts: logicpd-som-lv: Fix Audio Mute
	Input: atmel_mxt_ts - fix the firmware update
	hexagon: add memset_io() helper
	hexagon: export csum_partial_copy_nocheck
	scsi: vmw-pvscsi: return DID_BUS_BUSY for adapter-initated aborts
	bpf, x64: fix memleak when not converging after image
	parisc: drivers.c: Fix section mismatches
	stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
	kthread, sched/wait: Fix kthread_parkme() wait-loop
	arm64: tegra: Make BCM89610 PHY interrupt as active low
	iommu/vt-d: fix shift-out-of-bounds in bug checking
	nvme: fix potential memory leak in option parsing
	nvme: Set integrity flag for user passthrough commands
	ARM: OMAP1: ams-delta: fix deferred_fiq handler
	smc: fix sendpage() call
	IB/hfi1 Use correct type for num_user_context
	IB/hfi1: Fix memory leak in exception path in get_irq_affinity()
	RDMA/cma: Do not query GID during QP state transition to RTR
	spi: bcm2835aux: ensure interrupts are enabled for shared handler
	sched/core: Introduce set_special_state()
	sh: fix build failure for J2 cpu with SMP disabled
	tee: check shm references are consistent in offset/size
	mac80211: Adjust SAE authentication timeout
	drm/omap: silence unititialized variable warning
	drm/omap: fix uninitialized ret variable
	drm/omap: fix possible NULL ref issue in tiler_reserve_2d
	drm/omap: check return value from soc_device_match
	drm/omap: handle alloc failures in omap_connector
	driver core: add __printf verification to __ata_ehi_pushv_desc
	ARM: dts: cygnus: fix irq type for arm global timer
	mac80211: use timeout from the AddBA response instead of the request
	x86/xen: Reset VCPU0 info pointer after shared_info remap
	net: aquantia: driver should correctly declare vlan_features bits
	can: dev: increase bus-off message severity
	arm64: Add MIDR encoding for NVIDIA CPUs
	cifs: smb2ops: Fix listxattr() when there are no EAs
	agp: uninorth: make two functions static
	tipc: eliminate KMSAN uninit-value in strcmp complaint
	qed: Fix l2 initializations over iWARP personality
	qede: Fix gfp flags sent to rdma event node allocation
	rxrpc: Fix error reception on AF_INET6 sockets
	rxrpc: Fix the min security level for kernel calls
	KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs
	x86: Delay skip of emulated hypercall instruction
	ixgbe: return error on unsupported SFP module when resetting
	net sched actions: fix invalid pointer dereferencing if skbedit flags missing
	init: fix false positives in W+X checking
	proc/kcore: don't bounds check against address 0
	ocfs2: take inode cluster lock before moving reflinked inode from orphan dir
	kprobes/x86: Prohibit probing on exception masking instructions
	uprobes/x86: Prohibit probing on MOV SS instruction
	objtool, kprobes/x86: Sync the latest <asm/insn.h> header with tools/objtool/arch/x86/include/asm/insn.h
	x86/pkeys/selftests: Adjust the self-test to fresh distros that export the pkeys ABI
	x86/mpx/selftests: Adjust the self-test to fresh distros that export the MPX ABI
	x86/selftests: Add mov_to_ss test
	x86/pkeys/selftests: Give better unexpected fault error messages
	x86/pkeys/selftests: Stop using assert()
	x86/pkeys/selftests: Remove dead debugging code, fix dprint_in_signal
	x86/pkeys/selftests: Allow faults on unknown keys
	x86/pkeys/selftests: Factor out "instruction page"
	x86/pkeys/selftests: Add PROT_EXEC test
	x86/pkeys/selftests: Fix pkey exhaustion test off-by-one
	x86/pkeys/selftests: Fix pointer math
	x86/pkeys/selftests: Save off 'prot' for allocations
	x86/pkeys/selftests: Add a test for pkey 0
	mtd: Fix comparison in map_word_andequal()
	afs: Fix the non-encryption of calls
	usb: musb: fix remote wakeup racing with suspend
	ARM: keystone: fix platform_domain_notifier array overrun
	i2c: pmcmsp: return message count on master_xfer success
	i2c: pmcmsp: fix error return from master_xfer
	i2c: viperboard: return message count on master_xfer success
	ARM: davinci: dm646x: fix timer interrupt generation
	ARM: davinci: board-dm646x-evm: pass correct I2C adapter id for VPIF
	ARM: davinci: board-dm646x-evm: set VPIF capture card name
	clk: imx6ull: use OSC clock during AXI rate change
	locking/rwsem: Add a new RWSEM_ANONYMOUSLY_OWNED flag
	locking/percpu-rwsem: Annotate rwsem ownership transfer by setting RWSEM_OWNER_UNKNOWN
	drm/dumb-buffers: Integer overflow in drm_mode_create_ioctl()
	sched/debug: Move the print_rt_rq() and print_dl_rq() declarations to kernel/sched/sched.h
	sched/deadline: Make the grub_reclaim() function static
	parisc: Move setup_profiling_timer() out of init section
	efi/libstub/arm64: Handle randomized TEXT_OFFSET
	ARM: 8753/1: decompressor: add a missing parameter to the addruart macro
	ARM: 8758/1: decompressor: restore r1 and r2 just before jumping to the kernel
	ARM: kexec: fix kdump register saving on panic()
	Revert "Btrfs: fix scrub to repair raid6 corruption"
	Btrfs: fix scrub to repair raid6 corruption
	Btrfs: make raid6 rebuild retry more
	tcp: do not overshoot window_clamp in tcp_rcv_space_adjust()
	Linux 4.14.51

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-06-21 05:46:51 +09:00
Minchan Kim
6d2707f268 mm: memcg: add __GFP_NOWARN in __memcg_schedule_kmem_cache_create()
[ Upstream commit c892fd82cc0632d425ae011a4dd75eb59e9f84ee ]

If there is heavy memory pressure, page allocation with __GFP_NOWAIT
fails easily although it's order-0 request.  I got below warning 9 times
for normal boot.

     <snip >: page allocation failure: order:0, mode:0x2200000(GFP_NOWAIT|__GFP_NOTRACK)
     .. snip ..
     Call trace:
       dump_backtrace+0x0/0x4
       dump_stack+0xa4/0xc0
       warn_alloc+0xd4/0x15c
       __alloc_pages_nodemask+0xf88/0x10fc
       alloc_slab_page+0x40/0x18c
       new_slab+0x2b8/0x2e0
       ___slab_alloc+0x25c/0x464
       __kmalloc+0x394/0x498
       memcg_kmem_get_cache+0x114/0x2b8
       kmem_cache_alloc+0x98/0x3e8
       mmap_region+0x3bc/0x8c0
       do_mmap+0x40c/0x43c
       vm_mmap_pgoff+0x15c/0x1e4
       sys_mmap+0xb0/0xc8
       el0_svc_naked+0x24/0x28
     Mem-Info:
     active_anon:17124 inactive_anon:193 isolated_anon:0
      active_file:7898 inactive_file:712955 isolated_file:55
      unevictable:0 dirty:27 writeback:18 unstable:0
      slab_reclaimable:12250 slab_unreclaimable:23334
      mapped:19310 shmem:212 pagetables:816 bounce:0
      free:36561 free_pcp:1205 free_cma:35615
     Node 0 active_anon:68496kB inactive_anon:772kB active_file:31592kB inactive_file:2851820kB unevictable:0kB isolated(anon):0kB isolated(file):220kB mapped:77240kB dirty:108kB writeback:72kB shmem:848kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
     DMA free:142188kB min:3056kB low:3820kB high:4584kB active_anon:10052kB inactive_anon:12kB active_file:312kB inactive_file:1412620kB unevictable:0kB writepending:0kB present:1781412kB managed:1604728kB mlocked:0kB slab_reclaimable:3592kB slab_unreclaimable:876kB kernel_stack:400kB pagetables:52kB bounce:0kB free_pcp:1436kB local_pcp:124kB free_cma:142492kB
     lowmem_reserve[]: 0 1842 1842
     Normal free:4056kB min:4172kB low:5212kB high:6252kB active_anon:58376kB inactive_anon:760kB active_file:31348kB inactive_file:1439040kB unevictable:0kB writepending:180kB present:2000636kB managed:1923688kB mlocked:0kB slab_reclaimable:45408kB slab_unreclaimable:92460kB kernel_stack:9680kB pagetables:3212kB bounce:0kB free_pcp:3392kB local_pcp:688kB free_cma:0kB
     lowmem_reserve[]: 0 0 0
     DMA: 0*4kB 0*8kB 1*16kB (C) 0*32kB 0*64kB 0*128kB 1*256kB (C) 1*512kB (C) 0*1024kB 1*2048kB (C) 34*4096kB (C) = 142096kB
     Normal: 228*4kB (UMEH) 172*8kB (UMH) 23*16kB (UH) 24*32kB (H) 5*64kB (H) 1*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3872kB
     721350 total pagecache pages
     0 pages in swap cache
     Swap cache stats: add 0, delete 0, find 0/0
     Free swap  = 0kB
     Total swap = 0kB
     945512 pages RAM
     0 pages HighMem/MovableOnly
     63408 pages reserved
     51200 pages cma reserved

__memcg_schedule_kmem_cache_create() tries to create a shadow slab cache
and the worker allocation failure is not really critical because we will
retry on the next kmem charge.  We might miss some charges but that
shouldn't be critical.  The excessive allocation failure report is not
very helpful.

[mhocko@kernel.org: changelog update]
Link: http://lkml.kernel.org/r/20180418022912.248417-1-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-21 04:02:46 +09:00
Greg Kroah-Hartman
37f5b3d9c7 This is the 4.14.49 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlse4FIACgkQONu9yGCS
 aT5ZqxAAqCPguhXiTIWJPL0760M4I8C/cTLgl0JWpD946cHaQJpUhmuiXfc91+KO
 1mhgjW6PQjojWJmj5VYM2aCkvUoZ4sowtMaiGuhjLL1Hr+s879wI0wB+9uO2gy3O
 8si9gLV1qoa6QOAe5hbGiyOhkv0OYpRq290ar/NFVS+sKucwJEMFY/rLhjedNGlY
 BmT4C8xK5D/s/r6rRAGCyxQNar8RcFd7C+iEZXvQsybi4euJAd9DcpvileKORNxj
 bfBhHyFikbvN6Pfb6ooD9q88y8U3Tvk+sCZJI15acRePIJ4eJPtkjKkunsGms3jM
 cIA9WuC7NHSGMf8ZJAFAaxfNCepWpeb5lPSdkMuIjWhwQ22OhZR4eS+PQvhjojUl
 Z3Ry8fvxtMz4hseaeg+8mUJohLP1+v6GAIAGm9XRwElyTC0RKzKaUkK//zlrmmy3
 wR8vpgdPBZJlvqE61+aNWcnLpKxRp/aWMTklXNVBsnL2FVfl0+8iBz6ylDjfmAx5
 kRGdbWuCcHVfVq91fAwIwOIrHZg2Pi3/pIHQQXljqpSxG7vQrmPj2uiaYPTyAjZM
 LAVtE21TdtZX+4idHDZ8nNew204cr+ug1X373vugS8/NdLovpjb7vmxUusNy9ghG
 A5hif1fa8HEb3bCsUEBgZdkpD46uvk1IZc8js/e2zb5FmR3V8vM=
 =px2V
 -----END PGP SIGNATURE-----

Merge 4.14.49 into android-4.14

Changes in 4.14.49
	scsi: sd_zbc: Fix potential memory leak
	scsi: sd_zbc: Avoid that resetting a zone fails sporadically
	mmap: introduce sane default mmap limits
	mmap: relax file size limit for regular files
	btrfs: define SUPER_FLAG_METADUMP_V2
	kconfig: Avoid format overflow warning from GCC 8.1
	be2net: Fix error detection logic for BE3
	bnx2x: use the right constant
	dccp: don't free ccid2_hc_tx_sock struct in dccp_disconnect()
	enic: set DMA mask to 47 bit
	ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
	ip6_tunnel: remove magic mtu value 0xFFF8
	ipmr: properly check rhltable_init() return value
	ipv4: remove warning in ip_recv_error
	ipv6: omit traffic class when calculating flow hash
	isdn: eicon: fix a missing-check bug
	kcm: Fix use-after-free caused by clonned sockets
	netdev-FAQ: clarify DaveM's position for stable backports
	net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy
	net: metrics: add proper netlink validation
	net/packet: refine check for priv area size
	net: phy: broadcom: Fix bcm_write_exp()
	net: usb: cdc_mbim: add flag FLAG_SEND_ZLP
	packet: fix reserve calculation
	qed: Fix mask for physical address in ILT entry
	sctp: not allow transport timeout value less than HZ/5 for hb_timer
	team: use netdev_features_t instead of u32
	vhost: synchronize IOTLB message with dev cleanup
	vrf: check the original netdevice for generating redirect
	ipv6: sr: fix memory OOB access in seg6_do_srh_encap/inline
	net: phy: broadcom: Fix auxiliary control register reads
	net-sysfs: Fix memory leak in XPS configuration
	virtio-net: correctly transmit XDP buff after linearizing
	net/mlx4: Fix irq-unsafe spinlock usage
	tun: Fix NULL pointer dereference in XDP redirect
	virtio-net: correctly check num_buf during err path
	net/mlx5e: When RXFCS is set, add FCS data into checksum calculation
	virtio-net: fix leaking page for gso packet during mergeable XDP
	rtnetlink: validate attributes in do_setlink()
	cls_flower: Fix incorrect idr release when failing to modify rule
	PCI: hv: Do not wait forever on a device that has disappeared
	drm: set FMODE_UNSIGNED_OFFSET for drm files
	Linux 4.14.49

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-06-12 18:33:52 +02:00
Linus Torvalds
af760b568e mmap: relax file size limit for regular files
commit 423913ad4ae5b3e8fb8983f70969fb522261ba26 upstream.

Commit be83bbf80682 ("mmap: introduce sane default mmap limits") was
introduced to catch problems in various ad-hoc character device drivers
doing mmap and getting the size limits wrong.  In the process, it used
"known good" limits for the normal cases of mapping regular files and
block device drivers.

It turns out that the "s_maxbytes" limit was less "known good" than I
thought.  In particular, /proc doesn't set it, but exposes one regular
file to mmap: /proc/vmcore.  As a result, that file got limited to the
default MAX_INT s_maxbytes value.

This went unnoticed for a while, because apparently the only thing that
needs it is the s390 kernel zfcpdump, but there might be other tools
that use this too.

Vasily suggested just changing s_maxbytes for all of /proc, which isn't
wrong, but makes me nervous at this stage.  So instead, just make the
new mmap limit always be MAX_LFS_FILESIZE for regular files, which won't
affect anything else.  It wasn't the regular file case I was worried
about.

I'd really prefer for maxsize to have been per-inode, but that is not
how things are today.

Fixes: be83bbf80682 ("mmap: introduce sane default mmap limits")
Reported-by: Vasily Gorbik <gor@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-11 22:49:18 +02:00
Linus Torvalds
16d7ceb04b mmap: introduce sane default mmap limits
commit be83bbf806822b1b89e0a0f23cd87cddc409e429 upstream.

The internal VM "mmap()" interfaces are based on the mmap target doing
everything using page indexes rather than byte offsets, because
traditionally (ie 32-bit) we had the situation that the byte offset
didn't fit in a register.  So while the mmap virtual address was limited
by the word size of the architecture, the backing store was not.

So we're basically passing "pgoff" around as a page index, in order to
be able to describe backing store locations that are much bigger than
the word size (think files larger than 4GB etc).

But while this all makes a ton of sense conceptually, we've been dogged
by various drivers that don't really understand this, and internally
work with byte offsets, and then try to work with the page index by
turning it into a byte offset with "pgoff << PAGE_SHIFT".

Which obviously can overflow.

Adding the size of the mapping to it to get the byte offset of the end
of the backing store just exacerbates the problem, and if you then use
this overflow-prone value to check various limits of your device driver
mmap capability, you're just setting yourself up for problems.

The correct thing for drivers to do is to do their limit math in page
indices, the way the interface is designed.  Because the generic mmap
code _does_ test that the index doesn't overflow, since that's what the
mmap code really cares about.

HOWEVER.

Finding and fixing various random drivers is a sisyphean task, so let's
just see if we can just make the core mmap() code do the limiting for
us.  Realistically, the only "big" backing stores we need to care about
are regular files and block devices, both of which are known to do this
properly, and which have nice well-defined limits for how much data they
can access.

So let's special-case just those two known cases, and then limit other
random mmap users to a backing store that still fits in "unsigned long".
Realistically, that's not much of a limit at all on 64-bit, and on
32-bit architectures the only worry might be the GPU drivers, which can
have big physical address spaces.

To make it possible for drivers like that to say that they are 64-bit
clean, this patch does repurpose the "FMODE_UNSIGNED_OFFSET" bit in the
file flags to allow drivers to mark their file descriptors as safe in
the full 64-bit mmap address space.

[ The timing for doing this is less than optimal, and this should really
  go in a merge window. But realistically, this needs wide testing more
  than it needs anything else, and being main-line is the only way to do
  that.

  So the earlier the better, even if it's outside the proper development
  cycle        - Linus ]

Cc: Kees Cook <keescook@chromium.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-11 22:49:18 +02:00
Greg Kroah-Hartman
eca84e5091 This is the 4.14.48 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlsWWugACgkQONu9yGCS
 aT5tPRAAqGZxmPsWiqiVDiEFRG1SyVTRYyQW/7OLgXcaXfs2rvZLMDThrP9egxyy
 FSExZicVRqNH3RHnt6MLiv9tiv7fNIW6ZCQ0/AoDt61c5cEIjmE2VksvvdcKCuzR
 71Yx3HYmK2mz3wDz2xn+YHDFlfQOol1oyMLDr484mBVNSccI9pDaDgeP6l3Z4v6w
 g5jwVwB+lKfJGryW8orn2jSGEz6wWnfRRegzrtshnYk7zRQp7NkjXATcwMxnSz3t
 f5ihztHW/5hBjWjfPP2nIsp29hXZuvwdXikUOJeT/dT8V/gSJdTbO7sjh58BhXUx
 kGgJXwGnj+Y5SPmWp5Z8M4BCyt0jPDodteYllpFjL2SJXUjUKMIE3ucZgL7ah26I
 saUpXhMjumoi3f0eYhRh+AtZfNE7DVFmmZOLjrHo2S8s2GiN1CVbasC+cs9RXfyH
 rsFvkKfhI86L4K6+OIWSsnxiQ8HEr9DbMnyVQlS1gAwstXibInWs+drj9Os4CgQY
 XygJU5+556S02eL0ZyE8y5Hwo6htYDTOoA4Na0wGcvOwPzR0zOxHUPCje6ogXU2s
 u+VqyNfTMvSBDnR6BbWnK1zHxsfTInsx5UGnXYBhkuN440pXLkS9tG5cjSNFTa9z
 vLK9pr3qtPg0RgWOrthzSTgVVuXZhbcdql+jEIOSsNMkNC6iTDc=
 =XFiy
 -----END PGP SIGNATURE-----

Merge 4.14.48 into android-4.14

Changes in 4.14.48
	fix io_destroy()/aio_complete() race
	mm: fix the NULL mapping case in __isolate_lru_page()
	objtool: Support GCC 8's cold subfunctions
	objtool: Support GCC 8 switch tables
	objtool: Detect RIP-relative switch table references
	objtool: Detect RIP-relative switch table references, part 2
	objtool: Fix "noreturn" detection for recursive sibling calls
	x86/mce/AMD: Carve out SMCA get_block_address() code
	x86/MCE/AMD: Cache SMCA MISC block addresses
	Revert "pinctrl: msm: Use dynamic GPIO numbering"
	PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()
	xfs: convert XFS_AGFL_SIZE to a helper function
	xfs: detect agfl count corruption and reset agfl
	Input: synaptics - Lenovo Carbon X1 Gen5 (2017) devices should use RMI
	Input: synaptics - Lenovo Thinkpad X1 Carbon G5 (2017) with Elantech trackpoints should use RMI
	Input: synaptics - add Intertouch support on X1 Carbon 6th and X280
	Input: synaptics - add Lenovo 80 series ids to SMBus
	Input: elan_i2c_smbus - fix corrupted stack
	tracing: Fix crash when freeing instances with event triggers
	tracing: Make the snapshot trigger work with instances
	selinux: KASAN: slab-out-of-bounds in xattr_getsecurity
	cfg80211: further limit wiphy names to 64 bytes
	kbuild: clang: remove crufty HOSTCFLAGS
	drm/i915: Always sanity check engine state upon idling
	dma-buf: remove redundant initialization of sg_table
	drm/amd/powerplay: Fix enum mismatch
	rtlwifi: rtl8192cu: Remove variable self-assignment in rf.c
	ASoC: Intel: sst: remove redundant variable dma_dev_name
	platform/chrome: cros_ec_lpc: remove redundant pointer request
	kbuild: clang: disable unused variable warnings only when constant
	tcp: avoid integer overflows in tcp_rcv_space_adjust()
	iio: ad7793: implement IIO_CHAN_INFO_SAMP_FREQ
	iio:buffer: make length types match kfifo types
	iio:kfifo_buf: check for uint overflow
	iio: adc: select buffer for at91-sama5d2_adc
	MIPS: lantiq: gphy: Drop reboot/remove reset asserts
	MIPS: ptrace: Fix PTRACE_PEEKUSR requests for 64-bit FGRs
	MIPS: prctl: Disallow FRE without FR with PR_SET_FP_MODE requests
	scsi: scsi_transport_srp: Fix shost to rport translation
	stm class: Use vmalloc for the master map
	hwtracing: stm: fix build error on some arches
	IB/core: Fix error code for invalid GID entry
	mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()
	Revert "rt2800: use TXOP_BACKOFF for probe frames"
	intel_th: Use correct device when freeing buffers
	drm/psr: Fix missed entry in PSR setup time table.
	drm/i915/lvds: Move acpi lid notification registration to registration phase
	drm/i915: Disable LVDS on Radiant P845
	powerpc/mm/slice: Remove intermediate bitmap copy
	powerpc/mm/slice: create header files dedicated to slices
	powerpc/mm/slice: Enhance for supporting PPC32
	powerpc/mm/slice: Fix hugepage allocation at hint address on 8xx
	Linux 4.14.48

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-06-05 11:52:33 +02:00
Hugh Dickins
a7027b7d69 mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()
commit 2d077d4b59924acd1f5180c6fb73b57f4771fde6 upstream.

Swapping load on huge=always tmpfs (with khugepaged tuned up to be very
eager, but I'm not sure that is relevant) soon hung uninterruptibly,
waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most
often when "cp -a" was trying to write to a smallish file.  Debug showed
that the page in question was not locked, and page->mapping NULL by now,
but page->index consistent with having been in a huge page before.

Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764
("mm/huge_memory.c: reorder operations in __split_huge_page_tail()") added
in; but took hours to reproduce on a 4.17 kernel (no idea why).

The culprit proved to be the __ClearPageDirty() on tails beyond i_size in
__split_huge_page(): the non-atomic __bitoperation may have been safe when
4.8's baa355fd3314 ("thp: file pages support for split_huge_page()")
introduced it, but liable to erase PageWaiters after 4.10's 62906027091f
("mm: add PageWaiters indicating tasks are waiting for a page bit").

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805291841070.3197@eggly.anvils
Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-05 11:41:59 +02:00
Hugh Dickins
b968dd7650 mm: fix the NULL mapping case in __isolate_lru_page()
commit 145e1a71e090575c74969e3daa8136d1e5b99fc8 upstream.

George Boole would have noticed a slight error in 4.16 commit
69d763fc6d3a ("mm: pin address_space before dereferencing it while
isolating an LRU page").  Fix it, to match both the comment above it,
and the original behaviour.

Although anonymous pages are not marked PageDirty at first, we have an
old habit of calling SetPageDirty when a page is removed from swap
cache: so there's a category of ex-swap pages that are easily
migratable, but were inadvertently excluded from compaction's async
migration in 4.16.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805302014001.12558@eggly.anvils
Fixes: 69d763fc6d3a ("mm: pin address_space before dereferencing it while isolating an LRU page")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by:  Ivan Kalvachev <ikalvachev@gmail.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-05 11:41:54 +02:00
Rik van Riel
019b711f9e ANDROID: add extra free kbytes tunable
Add a userspace visible knob to tell the VM to keep an extra amount
of memory free, by increasing the gap between each zone's min and
low watermarks.

This is useful for realtime applications that call system
calls and have a bound on the number of allocations that happen
in any short time period.  In this application, extra_free_kbytes
would be left at an amount equal to or larger than than the
maximum number of allocations that happen in any burst.

It may also be useful to reduce the memory use of virtual
machines (temporarily?), in a way that does not cause memory
fragmentation like ballooning does.

[ccross]
Revived for use on old kernels where no other solution exists.
The tunable will be removed on kernels that do better at avoiding
direct reclaim.

[surenb]
Will be reverted as soon as Android framework is reworked to
use upstream-supported watermark_scale_factor instead of
extra_free_kbytes.

Bug: 86445363
Change-Id: I765a42be8e964bfd3e2886d1ca85a29d60c3bb3e
Signed-off-by: Rik van Riel<riel@redhat.com>
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2018-06-04 15:48:04 +00:00
Greg Kroah-Hartman
503f6fecb8 This is the 4.14.45 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlsOPCoACgkQONu9yGCS
 aT4vYBAAoESFP3oUtpyrPQU2yWQx7sRq/Dd8WyNlHlq2nRU8Y42ynB8TdRpAIces
 3aP7vPwFLaK4H0SZt4oA+NialRMhC/bN6BmKaoTUXq2nmE2XzDkcPDu0zHnqQt9C
 vc5wa2hd+H95wj9cdkkPwdlmgVhHztowJ3uqqNaPql2MVjDLKxziNVMv7lAIGPk3
 TycD9SihGAEKFjI2WIXaX6hm+3gGRnuK2ovlqnlF24dLRFiGIBL+fUp5ZGoxVlRP
 W260tQnTv/TvWUJ7V3x6rZ04kgV7LcaZrwSyN7GLJmhoi9Bw0BmL1N3cEAfEZdy2
 YoGqDemLW9bEiHBhFuPOcFr7tyAz8EsVH4/KUwkIMgWNbV8DmTKT2nbfzG9ju6Hb
 q9q3OJyLPBamGxTuiXUspRhQJrVrMX6sahHQDj5786AVgBDoGVFw1d+v9kJCoSAv
 lnA7qTbCFeq288dJ3sU7OZhmApC1oMPjMjmfVWwuQKBz81xqsquAjQRkBY3Odw+j
 yreZ9PS2Krk3bpf9QoDf/NGM+zpFyyy3xbrHpMkIEv48VGYrpe0nP6TZRfEgF65L
 036uZCPzpH+vFdyjMPWUPPXGZCD7q6DGk+wKit2eMFKOXB477yKA2+qAWs0GAeKo
 g7N0Rql7YZQK+Zu+1YvtfqF4WUBBP0uAb7FSuyVKVIzI3LfPCQk=
 =m2qv
 -----END PGP SIGNATURE-----

Merge 4.14.45 into android-4.14

Changes in 4.14.45
	MIPS: c-r4k: Fix data corruption related to cache coherence
	MIPS: ptrace: Expose FIR register through FP regset
	MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs
	KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable"
	affs_lookup(): close a race with affs_remove_link()
	fs: don't scan the inode cache before SB_BORN is set
	aio: fix io_destroy(2) vs. lookup_ioctx() race
	ALSA: timer: Fix pause event notification
	do d_instantiate/unlock_new_inode combinations safely
	mmc: sdhci-iproc: remove hard coded mmc cap 1.8v
	mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register
	mmc: sdhci-iproc: add SDHCI_QUIRK2_HOST_OFF_CARD_ON for cygnus
	libata: Blacklist some Sandisk SSDs for NCQ
	libata: blacklist Micron 500IT SSD with MU01 firmware
	xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent
	drm/vmwgfx: Fix 32-bit VMW_PORT_HB_[IN|OUT] macros
	arm64: lse: Add early clobbers to some input/output asm operands
	powerpc/64s: Clear PCR on boot
	IB/hfi1: Use after free race condition in send context error path
	IB/umem: Use the correct mm during ib_umem_release
	sr: pass down correctly sized SCSI sense buffer
	idr: fix invalid ptr dereference on item delete
	Revert "ipc/shm: Fix shmat mmap nil-page protection"
	ipc/shm: fix shmat() nil address after round-down when remapping
	mm/kasan: don't vfree() nonexistent vm_area
	kasan: free allocated shadow memory on MEM_CANCEL_ONLINE
	kasan: fix memory hotplug during boot
	kernel/sys.c: fix potential Spectre v1 issue
	KVM/VMX: Expose SSBD properly to guests
	KVM: s390: vsie: fix < 8k check for the itdba
	KVM: x86: Update cpuid properly when CR4.OSXAVE or CR4.PKE is changed
	kvm: x86: IA32_ARCH_CAPABILITIES is always supported
	x86/kvm: fix LAPIC timer drift when guest uses periodic mode
	powerpc/64s: Improve RFI L1-D cache flush fallback
	powerpc/pseries: Support firmware disable of RFI flush
	powerpc/powernv: Support firmware disable of RFI flush
	powerpc/rfi-flush: Move the logic to avoid a redo into the debugfs code
	powerpc/rfi-flush: Make it possible to call setup_rfi_flush() again
	powerpc/rfi-flush: Always enable fallback flush on pseries
	powerpc/rfi-flush: Differentiate enabled and patched flush types
	powerpc/rfi-flush: Call setup_rfi_flush() after LPM migration
	powerpc/pseries: Add new H_GET_CPU_CHARACTERISTICS flags
	powerpc: Add security feature flags for Spectre/Meltdown
	powerpc/pseries: Set or clear security feature flags
	powerpc/powernv: Set or clear security feature flags
	powerpc/64s: Move cpu_show_meltdown()
	powerpc/64s: Enhance the information in cpu_show_meltdown()
	powerpc/powernv: Use the security flags in pnv_setup_rfi_flush()
	powerpc/pseries: Use the security flags in pseries_setup_rfi_flush()
	powerpc/64s: Wire up cpu_show_spectre_v1()
	powerpc/64s: Wire up cpu_show_spectre_v2()
	powerpc/pseries: Fix clearing of security feature flags
	powerpc: Move default security feature flags
	powerpc/pseries: Restore default security feature flags on setup
	powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
	powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit
	MIPS: generic: Fix machine compatible matching
	mac80211: mesh: fix wrong mesh TTL offset calculation
	ARC: Fix malformed ARC_EMUL_UNALIGNED default
	ptr_ring: prevent integer overflow when calculating size
	arm64: dts: rockchip: fix rock64 gmac2io stability issues
	arm64: dts: rockchip: correct ep-gpios for rk3399-sapphire
	libata: Fix compile warning with ATA_DEBUG enabled
	selftests: sync: missing CFLAGS while compiling
	selftest/vDSO: fix O=
	selftests: pstore: Adding config fragment CONFIG_PSTORE_RAM=m
	selftests: memfd: add config fragment for fuse
	ARM: OMAP2+: timer: fix a kmemleak caused in omap_get_timer_dt
	ARM: OMAP3: Fix prm wake interrupt for resume
	ARM: OMAP2+: Fix sar_base inititalization for HS omaps
	ARM: OMAP1: clock: Fix debugfs_create_*() usage
	ibmvnic: Wait until reset is complete to set carrier on
	ibmvnic: Free RX socket buffer in case of adapter error
	ibmvnic: Clean RX pool buffers during device close
	tls: retrun the correct IV in getsockopt
	xhci: workaround for AMD Promontory disabled ports wakeup
	IB/uverbs: Fix method merging in uverbs_ioctl_merge
	IB/uverbs: Fix possible oops with duplicate ioctl attributes
	IB/uverbs: Fix unbalanced unlock on error path for rdma_explicit_destroy
	arm64: dts: rockchip: Fix DWMMC clocks
	ARM: dts: rockchip: Fix DWMMC clocks
	iwlwifi: mvm: fix security bug in PN checking
	iwlwifi: mvm: fix IBSS for devices that support station type API
	iwlwifi: mvm: always init rs with 20mhz bandwidth rates
	NFC: llcp: Limit size of SDP URI
	rxrpc: Work around usercopy check
	MD: Free bioset when md_run fails
	md: fix md_write_start() deadlock w/o metadata devices
	s390/dasd: fix handling of internal requests
	xfrm: do not call rcu_read_unlock when afinfo is NULL in xfrm_get_tos
	mac80211: round IEEE80211_TX_STATUS_HEADROOM up to multiple of 4
	mac80211: fix a possible leak of station stats
	mac80211: fix calling sleeping function in atomic context
	cfg80211: clear wep keys after disconnection
	mac80211: Do not disconnect on invalid operating class
	mac80211: Fix sending ADDBA response for an ongoing session
	gpu: ipu-v3: pre: fix device node leak in ipu_pre_lookup_by_phandle
	gpu: ipu-v3: prg: fix device node leak in ipu_prg_lookup_by_phandle
	md raid10: fix NULL deference in handle_write_completed()
	drm/exynos: g2d: use monotonic timestamps
	drm/exynos: fix comparison to bitshift when dealing with a mask
	drm/meson: fix vsync buffer update
	arm64: perf: correct PMUVer probing
	RDMA/bnxt_re: Unpin SQ and RQ memory if QP create fails
	RDMA/bnxt_re: Fix system crash during load/unload
	ibmvnic: Check for NULL skb's in NAPI poll routine
	net/mlx5e: Return error if prio is specified when offloading eswitch vlan push
	locking/xchg/alpha: Add unconditional memory barrier to cmpxchg()
	md: raid5: avoid string overflow warning
	virtio_net: fix XDP code path in receive_small()
	kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
	bug.h: work around GCC PR82365 in BUG()
	selftests/memfd: add run_fuse_test.sh to TEST_FILES
	seccomp: add a selftest for get_metadata
	soc: imx: gpc: de-register power domains only if initialized
	powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access
	s390/cio: fix ccw_device_start_timeout API
	s390/cio: fix return code after missing interrupt
	s390/cio: clear timer when terminating driver I/O
	selftests/bpf/test_maps: exit child process without error in ENOMEM case
	PKCS#7: fix direct verification of SignerInfo signature
	arm64: dts: cavium: fix PCI bus dtc warnings
	nfs: system crashes after NFS4ERR_MOVED recovery
	ARM: OMAP: Fix dmtimer init for omap1
	smsc75xx: fix smsc75xx_set_features()
	regulatory: add NUL to request alpha2
	integrity/security: fix digsig.c build error with header file
	x86/intel_rdt: Fix incorrect returned value when creating rdgroup sub-directory in resctrl file system
	locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs
	x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations
	mac80211: drop frames with unexpected DS bits from fast-rx to slow path
	arm64: fix unwind_frame() for filtered out fn for function graph tracing
	macvlan: fix use-after-free in macvlan_common_newlink()
	KVM: nVMX: Don't halt vcpu when L1 is injecting events to L2
	kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD builds
	ARM: dts: imx6dl: Include correct dtsi file for Engicam i.CoreM6 DualLite/Solo RQS
	fs: dcache: Avoid livelock between d_alloc_parallel and __d_add
	fs: dcache: Use READ_ONCE when accessing i_dir_seq
	md: fix a potential deadlock of raid5/raid10 reshape
	md/raid1: fix NULL pointer dereference
	batman-adv: fix packet checksum in receive path
	batman-adv: invalidate checksum on fragment reassembly
	netfilter: ipt_CLUSTERIP: put config struct if we can't increment ct refcount
	netfilter: ipt_CLUSTERIP: put config instead of freeing it
	netfilter: ebtables: convert BUG_ONs to WARN_ONs
	batman-adv: Ignore invalid batadv_iv_gw during netlink send
	batman-adv: Ignore invalid batadv_v_gw during netlink send
	batman-adv: Fix netlink dumping of BLA claims
	batman-adv: Fix netlink dumping of BLA backbones
	nvme-pci: Fix nvme queue cleanup if IRQ setup fails
	clocksource/drivers/fsl_ftm_timer: Fix error return checking
	libceph, ceph: avoid memory leak when specifying same option several times
	ceph: fix dentry leak when failing to init debugfs
	xen/pvcalls: fix null pointer dereference on map->sock
	ARM: orion5x: Revert commit 4904dbda41c8.
	qrtr: add MODULE_ALIAS macro to smd
	selftests/futex: Fix line continuation in Makefile
	r8152: fix tx packets accounting
	virtio-gpu: fix ioctl and expose the fixed status to userspace.
	dmaengine: rcar-dmac: fix max_chunk_size for R-Car Gen3
	bcache: fix kcrashes with fio in RAID5 backend dev
	ip_gre: fix IFLA_MTU ignored on NEWLINK
	ip6_tunnel: fix IFLA_MTU ignored on NEWLINK
	sit: fix IFLA_MTU ignored on NEWLINK
	nbd: fix return value in error handling path
	ARM: dts: NSP: Fix amount of RAM on BCM958625HR
	ARM: dts: bcm283x: Fix unit address of local_intc
	powerpc/boot: Fix random libfdt related build errors
	clocksource/drivers/mips-gic-timer: Use correct shift count to extract data
	gianfar: Fix Rx byte accounting for ndev stats
	net/tcp/illinois: replace broken algorithm reference link
	nvmet: fix PSDT field check in command format
	net/smc: use link_id of server in confirm link reply
	mlxsw: core: Fix flex keys scratchpad offset conflict
	mlxsw: spectrum: Treat IPv6 unregistered multicast as broadcast
	spectrum: Reference count VLAN entries
	ARC: mcip: halt GFRC counter when ARC cores halt
	ARC: mcip: update MCIP debug mask when the new cpu came online
	ARC: setup cpu possible mask according to possible-cpus dts property
	ipvs: remove IPS_NAT_MASK check to fix passive FTP
	IB/mlx: Set slid to zero in Ethernet completion struct
	RDMA/bnxt_re: Unconditionly fence non wire memory operations
	RDMA/bnxt_re: Fix incorrect DB offset calculation
	RDMA/bnxt_re: Fix the ib_reg failure cleanup
	xen/pirq: fix error path cleanup when binding MSIs
	drm/amd/amdgpu: Correct VRAM width for APUs with GMC9
	xfrm: Fix ESN sequence number handling for IPsec GSO packets.
	arm64: dts: rockchip: Fix rk3399-gru-* s2r (pinctrl hogs, wifi reset)
	drm/sun4i: Fix dclk_set_phase
	btrfs: use kvzalloc to allocate btrfs_fs_info
	Btrfs: send, fix issuing write op when processing hole in no data mode
	Btrfs: fix log replay failure after linking special file and fsync
	ceph: fix potential memory leak in init_caches()
	block: display the correct diskname for bio
	nvme-pci: Fix EEH failure on ppc
	nvme: pci: pass max vectors as num_possible_cpus() to pci_alloc_irq_vectors
	selftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable
	net: ethtool: don't ignore return from driver get_fecparam method
	iwlwifi: mvm: fix TX of CCMP 256
	iwlwifi: mvm: Fix channel switch for count 0 and 1
	iwlwifi: mvm: fix assert 0x2B00 on older FWs
	iwlwifi: avoid collecting firmware dump if not loaded
	iwlwifi: mvm: fix "failed to remove key" message
	iwlwifi: mvm: Direct multicast frames to the correct station
	iwlwifi: mvm: Correctly set the tid for mcast queue
	rds: Incorrect reference counting in TCP socket creation
	watchdog: f71808e_wdt: Fix magic close handling
	watchdog: sbsa: use 32-bit read for WCV
	batman-adv: Fix multicast packet loss with a single WANT_ALL_IPV4/6 flag
	hv_netvsc: use napi_schedule_irqoff
	hv_netvsc: filter multicast/broadcast
	hv_netvsc: propagate rx filters to VF
	ARM: dts: rockchip: Add missing #sound-dai-cells on rk3288
	perf record: Fix crash in pipe mode
	e1000e: Fix check_for_link return value with autoneg off
	e1000e: allocate ring descriptors with dma_zalloc_coherent
	ia64/err-inject: Use get_user_pages_fast()
	RDMA/qedr: Fix kernel panic when running fio over NFSoRDMA
	RDMA/qedr: Fix iWARP write and send with immediate
	IB/mlx4: Fix corruption of RoCEv2 IPv4 GIDs
	IB/mlx4: Include GID type when deleting GIDs from HW table under RoCE
	IB/mlx5: Fix an error code in __mlx5_ib_modify_qp()
	fbdev: Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in sbusfb_ioctl_helper().
	fsl/fman: avoid sleeping in atomic context while adding an address
	qed: Free RoCE ILT Memory on rmmod qedr
	net: qcom/emac: Use proper free methods during TX
	net: smsc911x: Fix unload crash when link is up
	IB/core: Fix possible crash to access NULL netdev
	cxgb4: do not set needs_free_netdev for mgmt dev's
	xen-blkfront: move negotiate_mq to cover all cases of new VBDs
	xen: xenbus: use put_device() instead of kfree()
	hv_netvsc: fix filter flags
	hv_netvsc: fix locking for rx_mode
	hv_netvsc: fix locking during VF setup
	ARM: davinci: fix the GPIO lookup for omapl138-hawk
	arm64: Relax ARM_SMCCC_ARCH_WORKAROUND_1 discovery
	selftests/vm/run_vmtests: adjust hugetlb size according to nr_cpus
	lib/test_kmod.c: fix limit check on number of test devices created
	dmaengine: mv_xor_v2: Fix clock resource by adding a register clock
	netfilter: ebtables: fix erroneous reject of last rule
	can: m_can: change comparison to bitshift when dealing with a mask
	can: m_can: select pinctrl state in each suspend/resume function
	bnxt_en: Check valid VNIC ID in bnxt_hwrm_vnic_set_tpa().
	workqueue: use put_device() instead of kfree()
	ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu
	sunvnet: does not support GSO for sctp
	KVM: arm/arm64: vgic: Add missing irq_lock to vgic_mmio_read_pending
	gpu: ipu-v3: prg: avoid possible array underflow
	drm/imx: move arming of the vblank event to atomic_flush
	drm/nouveau/bl: fix backlight regression
	xfrm: fix rcu_read_unlock usage in xfrm_local_error
	iwlwifi: mvm: set the correct tid when we flush the MCAST sta
	iwlwifi: mvm: Correctly set IGTK for AP
	iwlwifi: mvm: fix error checking for multi/broadcast sta
	net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off
	vlan: Fix out of order vlan headers with reorder header off
	batman-adv: fix header size check in batadv_dbg_arp()
	net/sched: fix NULL dereference in the error path of tcf_sample_init()
	batman-adv: Fix skbuff rcsum on packet reroute
	vti4: Don't count header length twice on tunnel setup
	ip_tunnel: Clamp MTU to bounds on new link
	vti4: Don't override MTU passed on link creation via IFLA_MTU
	vti6: Fix dev->max_mtu setting
	iwlwifi: mvm: Increase session protection time after CS
	iwlwifi: mvm: clear tx queue id when unreserving aggregation queue
	iwlwifi: mvm: make sure internal station has a valid id
	iwlwifi: mvm: fix array out of bounds reference
	drm/tegra: Shutdown on driver unbind
	perf/cgroup: Fix child event counting bug
	brcmfmac: Fix check for ISO3166 code
	kbuild: make scripts/adjust_autoksyms.sh robust against timestamp races
	RDMA/ucma: Correct option size check using optlen
	RDMA/qedr: fix QP's ack timeout configuration
	RDMA/qedr: Fix rc initialization on CNQ allocation failure
	RDMA/qedr: Fix QP state initialization race
	net/sched: fix idr leak on the error path of tcf_bpf_init()
	net/sched: fix idr leak in the error path of tcf_simp_init()
	net/sched: fix idr leak in the error path of tcf_act_police_init()
	net/sched: fix idr leak in the error path of tcp_pedit_init()
	net/sched: fix idr leak in the error path of __tcf_ipt_init()
	net/sched: fix idr leak in the error path of tcf_skbmod_init()
	net: dsa: Fix functional dsa-loop dependency on FIXED_PHY
	drm/ast: Fixed 1280x800 Display Issue
	mm/mempolicy.c: avoid use uninitialized preferred_node
	mm, thp: do not cause memcg oom for thp
	xfrm: Fix transport mode skb control buffer usage.
	selftests: ftrace: Add probe event argument syntax testcase
	selftests: ftrace: Add a testcase for string type with kprobe_event
	selftests: ftrace: Add a testcase for probepoint
	drm/amdkfd: Fix scratch memory with HWS enabled
	batman-adv: fix multicast-via-unicast transmission with AP isolation
	batman-adv: fix packet loss for broadcasted DHCP packets to a server
	ARM: 8748/1: mm: Define vdso_start, vdso_end as array
	lan78xx: Set ASD in MAC_CR when EEE is enabled.
	net: qmi_wwan: add BroadMobi BM806U 2020:2033
	bonding: fix the err path for dev hwaddr sync in bond_enslave
	net: dsa: mt7530: fix module autoloading for OF platform drivers
	net/mlx5: Make eswitch support to depend on switchdev
	perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs
	x86/alternatives: Fixup alternative_call_2
	llc: properly handle dev_queue_xmit() return value
	builddeb: Fix header package regarding dtc source links
	qede: Fix barrier usage after tx doorbell write.
	mm, slab: memcg_link the SLAB's kmem_cache
	mm/page_owner: fix recursion bug after changing skip entries
	mm/vmstat.c: fix vmstat_update() preemption BUG
	mm/kmemleak.c: wait for scan completion before disabling free
	hv_netvsc: enable multicast if necessary
	qede: Do not drop rx-checksum invalidated packets.
	net: Fix untag for vlan packets without ethernet header
	vlan: Fix vlan insertion for packets without ethernet header
	net: mvneta: fix enable of all initialized RXQs
	sh: fix debug trap failure to process signals before return to user
	firmware: dmi_scan: Fix UUID length safety check
	nvme: don't send keep-alives to the discovery controller
	Btrfs: clean up resources during umount after trans is aborted
	Btrfs: fix loss of prealloc extents past i_size after fsync log replay
	x86/pgtable: Don't set huge PUD/PMD on non-leaf entries
	x86/mm: Do not forbid _PAGE_RW before init for __ro_after_init
	fs/proc/proc_sysctl.c: fix potential page fault while unregistering sysctl table
	swap: divide-by-zero when zero length swap file on ssd
	z3fold: fix memory leak
	sr: get/drop reference to device in revalidate and check_events
	Force log to disk before reading the AGF during a fstrim
	cpufreq: CPPC: Initialize shared perf capabilities of CPUs
	powerpc/fscr: Enable interrupts earlier before calling get_user()
	perf tools: Fix perf builds with clang support
	perf clang: Add support for recent clang versions
	dp83640: Ensure against premature access to PHY registers after reset
	ibmvnic: Zero used TX descriptor counter on reset
	mm/ksm: fix interaction with THP
	mm: fix races between address_space dereference and free in page_evicatable
	mm: thp: fix potential clearing to referenced flag in page_idle_clear_pte_refs_one()
	Btrfs: bail out on error during replay_dir_deletes
	Btrfs: fix NULL pointer dereference in log_dir_items
	btrfs: Fix possible softlock on single core machines
	IB/rxe: Fix for oops in rxe_register_device on ppc64le arch
	ocfs2/dlm: don't handle migrate lockres if already in shutdown
	powerpc/64s/idle: Fix restore of AMOR on POWER9 after deep sleep
	sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning
	x86/mm: Fix bogus warning during EFI bootup, use boot_cpu_has() instead of this_cpu_has() in build_cr3_noflush()
	KVM: VMX: raise internal error for exception during invalid protected mode state
	lan78xx: Connect phy early
	fscache: Fix hanging wait on page discarded by writeback
	sparc64: Make atomic_xchg() an inline function rather than a macro.
	net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()
	net: bgmac: Correctly annotate register space
	powerpc/64s: sreset panic if there is no debugger or crash dump handlers
	btrfs: tests/qgroup: Fix wrong tree backref level
	Btrfs: fix copy_items() return value when logging an inode
	btrfs: fix lockdep splat in btrfs_alloc_subvolume_writers
	btrfs: qgroup: Fix root item corruption when multiple same source snapshots are created with quota enabled
	rxrpc: Fix Tx ring annotation after initial Tx failure
	rxrpc: Don't treat call aborts as conn aborts
	xen/acpi: off by one in read_acpi_id()
	drivers: macintosh: rack-meter: really fix bogus memsets
	ACPI: acpi_pad: Fix memory leak in power saving threads
	powerpc/mpic: Check if cpu_possible() in mpic_physmask()
	ieee802154: ca8210: fix uninitialised data read
	ath10k: advertize beacon_int_min_gcd
	iommu/amd: Take into account that alloc_dev_data() may return NULL
	intel_th: Use correct method of finding hub
	m68k: set dma and coherent masks for platform FEC ethernets
	iwlwifi: mvm: check if mac80211_queue is valid in iwl_mvm_disable_txq
	parisc/pci: Switch LBA PCI bus from Hard Fail to Soft Fail mode
	hwmon: (nct6775) Fix writing pwmX_mode
	powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer
	powerpc/perf: Fix kernel address leak via sampling registers
	rsi: fix kernel panic observed on 64bit machine
	tools/thermal: tmon: fix for segfault
	selftests: Print the test we're running to /dev/kmsg
	net/mlx5: Protect from command bit overflow
	watchdog: davinci_wdt: fix error handling in davinci_wdt_probe()
	ath10k: Fix kernel panic while using worker (ath10k_sta_rc_update_wk)
	nvme-pci: disable APST for Samsung NVMe SSD 960 EVO + ASUS PRIME Z370-A
	ath9k: fix crash in spectral scan
	cxgb4: Setup FW queues before registering netdev
	ima: Fix Kconfig to select TPM 2.0 CRB interface
	ima: Fallback to the builtin hash algorithm
	watchdog: aspeed: Allow configuring for alternate boot
	virtio-net: Fix operstate for virtio when no VIRTIO_NET_F_STATUS
	arm: dts: socfpga: fix GIC PPI warning
	ext4: don't complain about incorrect features when probing
	drm/vmwgfx: Unpin the screen object backup buffer when not used
	iommu/mediatek: Fix protect memory setting
	cpufreq: cppc_cpufreq: Fix cppc_cpufreq_init() failure path
	IB/mlx5: Set the default active rate and width to QDR and 4X
	zorro: Set up z->dev.dma_mask for the DMA API
	bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set
	remoteproc: imx_rproc: Fix an error handling path in 'imx_rproc_probe()'
	dt-bindings: add device tree binding for Allwinner H6 main CCU
	ACPICA: Events: add a return on failure from acpi_hw_register_read
	ACPICA: Fix memory leak on unusual memory leak
	ACPICA: acpi: acpica: fix acpi operand cache leak in nseval.c
	cxgb4: Fix queue free path of ULD drivers
	i2c: mv64xxx: Apply errata delay only in standard mode
	KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use
	perf top: Fix top.call-graph config option reading
	perf stat: Fix core dump when flag T is used
	IB/core: Honor port_num while resolving GID for IB link layer
	drm/amdkfd: add missing include of mm.h
	coresight: Use %px to print pcsr instead of %p
	regulator: gpio: Fix some error handling paths in 'gpio_regulator_probe()'
	spi: bcm-qspi: fIX some error handling paths
	net/smc: pay attention to MAX_ORDER for CQ entries
	MIPS: ath79: Fix AR724X_PLL_REG_PCIE_CONFIG offset
	PCI: Restore config space on runtime resume despite being unbound
	watchdog: dw: RMW the control register
	watchdog: aspeed: Fix translation of reset mode to ctrl register
	ipmi_ssif: Fix kernel panic at msg_done_handler
	drm/meson: Fix some error handling paths in 'meson_drv_bind_master()'
	drm/meson: Fix an un-handled error path in 'meson_drv_bind_master()'
	powerpc: Add missing prototype for arch_irq_work_raise()
	powerpc/powernv/npu: Fix deadlock in mmio_invalidate()
	cxl: Check if PSL data-cache is available before issue flush request
	f2fs: fix to set KEEP_SIZE bit in f2fs_zero_range
	f2fs: fix to clear CP_TRIMMED_FLAG
	f2fs: fix to check extent cache in f2fs_drop_extent_tree
	perf/core: Fix installing cgroup events on CPU
	max17042: propagate of_node to power supply device
	perf/core: Fix perf_output_read_group()
	drm/panel: simple: Fix the bus format for the Ontat panel
	hwmon: (pmbus/max8688) Accept negative page register values
	hwmon: (pmbus/adm1275) Accept negative page register values
	perf/x86/intel: Properly save/restore the PMU state in the NMI handler
	cdrom: do not call check_disk_change() inside cdrom_open()
	efi/arm*: Only register page tables when they exist
	perf/x86/intel: Fix large period handling on Broadwell CPUs
	perf/x86/intel: Fix event update for auto-reload
	arm64: dts: qcom: Fix SPI5 config on MSM8996
	soc: qcom: wcnss_ctrl: Fix increment in NV upload
	gfs2: Fix fallocate chunk size
	x86/devicetree: Initialize device tree before using it
	x86/devicetree: Fix device IRQ settings in DT
	phy: rockchip-emmc: retry calpad busy trimming
	ALSA: vmaster: Propagate slave error
	phy: qcom-qmp: Fix phy pipe clock gating
	drm/bridge: sii902x: Retry status read after DDI I2C
	tools: hv: fix compiler warnings about major/target_fname
	block: null_blk: fix 'Invalid parameters' when loading module
	dmaengine: pl330: fix a race condition in case of threaded irqs
	dmaengine: rcar-dmac: Check the done lists in rcar_dmac_chan_get_residue()
	enic: enable rq before updating rq descriptors
	watchdog: asm9260_wdt: fix error handling in asm9260_wdt_probe()
	hwrng: stm32 - add reset during probe
	pinctrl: devicetree: Fix dt_to_map_one_config handling of hogs
	pinctrl: artpec6: dt: add missing pin group uart5nocts
	vfio-ccw: fence off transport mode
	dmaengine: qcom: bam_dma: get num-channels and num-ees from dt
	drm: omapdrm: dss: Move initialization code from component bind to probe
	ARM: dts: dra71-evm: Correct evm_sd regulator max voltage
	drm/amdgpu: disable GFX ring and disable PQ wptr in hw_fini
	drm/amdgpu: adjust timeout for ib_ring_tests(v2)
	net: stmmac: ensure that the device has released ownership before reading data
	net: stmmac: ensure that the MSS desc is the last desc to set the own bit
	cpufreq: Reorder cpufreq_online() error code path
	dpaa_eth: fix SG mapping
	PCI: Add function 1 DMA alias quirk for Marvell 88SE9220
	udf: Provide saner default for invalid uid / gid
	ixgbe: prevent ptp_rx_hang from running when in FILTER_ALL mode
	sh_eth: fix TSU init on SH7734/R8A7740
	power: supply: ltc2941-battery-gauge: Fix temperature units
	ARM: dts: bcm283x: Fix probing of bcm2835-i2s
	ARM: dts: bcm283x: Fix pin function of JTAG pins
	PCMCIA / PM: Avoid noirq suspend aborts during suspend-to-idle
	audit: return on memory error to avoid null pointer dereference
	net: stmmac: call correct function in stmmac_mac_config_rx_queues_routing()
	rcu: Call touch_nmi_watchdog() while printing stall warnings
	pinctrl: sh-pfc: r8a7796: Fix MOD_SEL register pin assignment for SSI pins group
	dpaa_eth: fix pause capability advertisement logic
	MIPS: Octeon: Fix logging messages with spurious periods after newlines
	drm/rockchip: Respect page offset for PRIME mmap calls
	x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified
	perf test: Fix test case inet_pton to accept inlines.
	perf report: Fix wrong jump arrow
	perf tests: Use arch__compare_symbol_names to compare symbols
	perf report: Fix memory corruption in --branch-history mode --branch-history
	perf tests: Fix dwarf unwind for stripped binaries
	selftests/net: fixes psock_fanout eBPF test case
	netlabel: If PF_INET6, check sk_buff ip header version
	drm: rcar-du: lvds: Fix LVDS startup on R-Car Gen3
	drm: rcar-du: lvds: Fix LVDS startup on R-Car Gen2
	ARM: dts: at91: tse850: use the correct compatible for the eeprom
	regmap: Correct comparison in regmap_cached
	i40e: Add delay after EMP reset for firmware to recover
	ARM: dts: imx7d: cl-som-imx7: fix pinctrl_enet
	ARM: dts: porter: Fix HDMI output routing
	regulator: of: Add a missing 'of_node_put()' in an error handling path of 'of_regulator_match()'
	pinctrl: msm: Use dynamic GPIO numbering
	pinctrl: mcp23s08: spi: Fix regmap debugfs entries
	kdb: make "mdr" command repeat
	drm/vmwgfx: Set dmabuf_size when vmw_dmabuf_init is successful
	Linux 4.14.45

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-05-30 13:17:17 +02:00
Yang Shi
5ade3c9618 mm: thp: fix potential clearing to referenced flag in page_idle_clear_pte_refs_one()
[ Upstream commit f0849ac0b8e072073ec5fcc7fadd05a77434364e ]

For PTE-mapped THP, the compound THP has not been split to normal 4K
pages yet, the whole THP is considered referenced if any one of sub page
is referenced.

When walking PTE-mapped THP by pvmw, all relevant PTEs will be checked
to retrieve referenced bit.  But, the current code just returns the
result of the last PTE.  If the last PTE has not referenced, the
referenced flag will be cleared.

Just set referenced when ptep{pmdp}_clear_young_notify() returns true.

Link: http://lkml.kernel.org/r/1518212451-87134-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reported-by: Gang Deng <gavin.dg@linux.alibaba.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:24 +02:00
Huang Ying
8d700626fb mm: fix races between address_space dereference and free in page_evicatable
[ Upstream commit e92bb4dd9673945179b1fc738c9817dd91bfb629 ]

When page_mapping() is called and the mapping is dereferenced in
page_evicatable() through shrink_active_list(), it is possible for the
inode to be truncated and the embedded address space to be freed at the
same time.  This may lead to the following race.

CPU1                                                CPU2

truncate(inode)                                     shrink_active_list()
  ...                                                 page_evictable(page)
  truncate_inode_page(mapping, page);
    delete_from_page_cache(page)
      spin_lock_irqsave(&mapping->tree_lock, flags);
        __delete_from_page_cache(page, NULL)
          page_cache_tree_delete(..)
            ...                                         mapping = page_mapping(page);
            page->mapping = NULL;
            ...
      spin_unlock_irqrestore(&mapping->tree_lock, flags);
      page_cache_free_page(mapping, page)
        put_page(page)
          if (put_page_testzero(page)) -> false
- inode now has no pages and can be freed including embedded address_space

                                                        mapping_unevictable(mapping)
							  test_bit(AS_UNEVICTABLE, &mapping->flags);
- we've dereferenced mapping which is potentially already free.

Similar race exists between swap cache freeing and page_evicatable()
too.

The address_space in inode and swap cache will be freed after a RCU
grace period.  So the races are fixed via enclosing the page_mapping()
and address_space usage in rcu_read_lock/unlock().  Some comments are
added in code to make it clear what is protected by the RCU read lock.

Link: http://lkml.kernel.org/r/20180212081227.1940-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:24 +02:00
Claudio Imbrenda
763111d9f3 mm/ksm: fix interaction with THP
[ Upstream commit 77da2ba0648a4fd52e5ff97b8b2b8dd312aec4b0 ]

This patch fixes a corner case for KSM.  When two pages belong or
belonged to the same transparent hugepage, and they should be merged,
KSM fails to split the page, and therefore no merging happens.

This bug can be reproduced by:
* making sure ksm is running (in case disabling ksmtuned)
* enabling transparent hugepages
* allocating a THP-aligned 1-THP-sized buffer
  e.g. on amd64: posix_memalign(&p, 1<<21, 1<<21)
* filling it with the same values
  e.g. memset(p, 42, 1<<21)
* performing madvise to make it mergeable
  e.g. madvise(p, 1<<21, MADV_MERGEABLE)
* waiting for KSM to perform a few scans

The expected outcome is that the all the pages get merged (1 shared and
the rest sharing); the actual outcome is that no pages get merged (1
unshared and the rest volatile)

The reason of this behaviour is that we increase the reference count
once for both pages we want to merge, but if they belong to the same
hugepage (or compound page), the reference counter used in both cases is
the one of the head of the compound page.  This means that
split_huge_page will find a value of the reference counter too high and
will fail.

This patch solves this problem by testing if the two pages to merge
belong to the same hugepage when attempting to merge them.  If so, the
hugepage is split safely.  This means that the hugepage is not split if
not necessary.

Link: http://lkml.kernel.org/r/1521548069-24758-1-git-send-email-imbrenda@linux.vnet.ibm.com
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Co-authored-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:24 +02:00
Xidong Wang
3a0de65acd z3fold: fix memory leak
[ Upstream commit 1ec6995d1290bfb87cc3a51f0836c889e857cef9 ]

In z3fold_create_pool(), the memory allocated by __alloc_percpu() is not
released on the error path that pool->compact_wq , which holds the
return value of create_singlethread_workqueue(), is NULL.  This will
result in a memory leak bug.

[akpm@linux-foundation.org: fix oops on kzalloc() failure, check __alloc_percpu() retval]
Link: http://lkml.kernel.org/r/1522803111-29209-1-git-send-email-wangxidong_97@163.com
Signed-off-by: Xidong Wang <wangxidong_97@163.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:23 +02:00
Tom Abraham
2ab7738102 swap: divide-by-zero when zero length swap file on ssd
[ Upstream commit a06ad633a37c64a0cd4c229fc605cee8725d376e ]

Calling swapon() on a zero length swap file on SSD can lead to a
divide-by-zero.

Although creating such files isn't possible with mkswap and they woud be
considered invalid, it would be better for the swapon code to be more
robust and handle this condition gracefully (return -EINVAL).
Especially since the fix is small and straightforward.

To help with wear leveling on SSD, the swapon syscall calculates a
random position in the swap file using modulo p->highest_bit, which is
set to maxpages - 1 in read_swap_header.

If the swap file is zero length, read_swap_header sets maxpages=1 and
last_page=0, resulting in p->highest_bit=0 and we divide-by-zero when we
modulo p->highest_bit in swapon syscall.

This can be prevented by having read_swap_header return zero if
last_page is zero.

Link: http://lkml.kernel.org/r/5AC747C1020000A7001FA82C@prv-mh.provo.novell.com
Signed-off-by: Thomas Abraham <tabraham@suse.com>
Reported-by: <Mark.Landis@Teradata.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:23 +02:00
Vinayak Menon
28bbb0d963 mm/kmemleak.c: wait for scan completion before disabling free
[ Upstream commit 914b6dfff790544d9b77dfd1723adb3745ec9700 ]

A crash is observed when kmemleak_scan accesses the object->pointer,
likely due to the following race.

  TASK A             TASK B                     TASK C
  kmemleak_write
   (with "scan" and
   NOT "scan=on")
  kmemleak_scan()
                     create_object
                     kmem_cache_alloc fails
                     kmemleak_disable
                     kmemleak_do_cleanup
                     kmemleak_free_enabled = 0
                                                kfree
                                                kmemleak_free bails out
                                                 (kmemleak_free_enabled is 0)
                                                slub frees object->pointer
  update_checksum
  crash - object->pointer
   freed (DEBUG_PAGEALLOC)

kmemleak_do_cleanup waits for the scan thread to complete, but not for
direct call to kmemleak_scan via kmemleak_write.  So add a wait for
kmemleak_scan completion before disabling kmemleak_free, and while at it
fix the comment on stop_scan_thread.

[vinmenon@codeaurora.org: fix stop_scan_thread comment]
  Link: http://lkml.kernel.org/r/1522219972-22809-1-git-send-email-vinmenon@codeaurora.org
Link: http://lkml.kernel.org/r/1522063429-18992-1-git-send-email-vinmenon@codeaurora.org
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:21 +02:00
Steven J. Hill
08e9dbd518 mm/vmstat.c: fix vmstat_update() preemption BUG
[ Upstream commit c7f26ccfb2c31eb1bf810ba13d044fcf583232db ]

Attempting to hotplug CPUs with CONFIG_VM_EVENT_COUNTERS enabled can
cause vmstat_update() to report a BUG due to preemption not being
disabled around smp_processor_id().

Discovered on Ubiquiti EdgeRouter Pro with Cavium Octeon II processor.

  BUG: using smp_processor_id() in preemptible [00000000] code:
  kworker/1:1/269
  caller is vmstat_update+0x50/0xa0
  CPU: 0 PID: 269 Comm: kworker/1:1 Not tainted
  4.16.0-rc4-Cavium-Octeon-00009-gf83bbd5-dirty #1
  Workqueue: mm_percpu_wq vmstat_update
  Call Trace:
    show_stack+0x94/0x128
    dump_stack+0xa4/0xe0
    check_preemption_disabled+0x118/0x120
    vmstat_update+0x50/0xa0
    process_one_work+0x144/0x348
    worker_thread+0x150/0x4b8
    kthread+0x110/0x140
    ret_from_kernel_thread+0x14/0x1c

Link: http://lkml.kernel.org/r/1520881552-25659-1-git-send-email-steven.hill@cavium.com
Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:21 +02:00
Maninder Singh
d2a5d00dcd mm/page_owner: fix recursion bug after changing skip entries
[ Upstream commit 299815a4fba9f3c7a81434dba0072148f1690608 ]

This patch fixes commit 5f48f0bd4e36 ("mm, page_owner: skip unnecessary
stack_trace entries").

Because if we skip first two entries then logic of checking count value
as 2 for recursion is broken and code will go in one depth recursion.

so we need to check only one call of _RET_IP(__set_page_owner) while
checking for recursion.

Current Backtrace while checking for recursion:-

  (save_stack)             from (__set_page_owner)  // (But recursion returns true here)
  (__set_page_owner)       from (get_page_from_freelist)
  (get_page_from_freelist) from (__alloc_pages_nodemask)
  (__alloc_pages_nodemask) from (depot_save_stack)
  (depot_save_stack)       from (save_stack)       // recursion should return true here
  (save_stack)             from (__set_page_owner)
  (__set_page_owner)       from (get_page_from_freelist)
  (get_page_from_freelist) from (__alloc_pages_nodemask+)
  (__alloc_pages_nodemask) from (depot_save_stack)
  (depot_save_stack)       from (save_stack)
  (save_stack)             from (__set_page_owner)
  (__set_page_owner)       from (get_page_from_freelist)

Correct Backtrace with fix:

  (save_stack)             from (__set_page_owner) // recursion returned true here
  (__set_page_owner)       from (get_page_from_freelist)
  (get_page_from_freelist) from (__alloc_pages_nodemask+)
  (__alloc_pages_nodemask) from (depot_save_stack)
  (depot_save_stack)       from (save_stack)
  (save_stack)             from (__set_page_owner)
  (__set_page_owner)       from (get_page_from_freelist)

Link: http://lkml.kernel.org/r/1521607043-34670-1-git-send-email-maninder1.s@samsung.com
Fixes: 5f48f0bd4e36 ("mm, page_owner: skip unnecessary stack_trace entries")
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@techadventures.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ayush Mittal <ayush.m@samsung.com>
Cc: Prakash Gupta <guptap@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Vasyl Gomonovych <gomonovych@gmail.com>
Cc: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: <pankaj.m@samsung.com>
Cc: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:21 +02:00
Shakeel Butt
da9ec481d6 mm, slab: memcg_link the SLAB's kmem_cache
[ Upstream commit 880cd276dff17ea29e9a8404275c9502b265afa7 ]

All the root caches are linked into slab_root_caches which was
introduced by the commit 510ded33e075 ("slab: implement slab_root_caches
list") but it missed to add the SLAB's kmem_cache.

While experimenting with opt-in/opt-out kmem accounting, I noticed
system crashes due to NULL dereference inside cache_from_memcg_idx()
while deferencing kmem_cache.memcg_params.memcg_caches.  The upstream
clean kernel will not see these crashes but SLAB should be consistent
with SLUB which does linked its boot caches (kmem_cache_node and
kmem_cache) into slab_root_caches.

Link: http://lkml.kernel.org/r/20180319210020.60289-1-shakeelb@google.com
Fixes: 510ded33e075c ("slab: implement slab_root_caches list")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:21 +02:00
David Rientjes
49f4a8c52e mm, thp: do not cause memcg oom for thp
[ Upstream commit 9d3c3354bb85bab4d865fe95039443f09a4c8394 ]

Commit 2516035499b9 ("mm, thp: remove __GFP_NORETRY from khugepaged and
madvised allocations") changed the page allocator to no longer detect
thp allocations based on __GFP_NORETRY.

It did not, however, modify the mem cgroup try_charge() path to avoid
oom kill for either khugepaged collapsing or thp faulting.  It is never
expected to oom kill a process to allocate a hugepage for thp; reclaim
is governed by the thp defrag mode and MADV_HUGEPAGE, but allocations
(and charging) should fallback instead of oom killing processes.

Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803191409420.124411@chino.kir.corp.google.com
Fixes: 2516035499b9 ("mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations")
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:19 +02:00
Yisheng Xie
6ca473201d mm/mempolicy.c: avoid use uninitialized preferred_node
[ Upstream commit 8970a63e965b43288c4f5f40efbc2bbf80de7f16 ]

Alexander reported a use of uninitialized memory in __mpol_equal(),
which is caused by incorrect use of preferred_node.

When mempolicy in mode MPOL_PREFERRED with flags MPOL_F_LOCAL, it uses
numa_node_id() instead of preferred_node, however, __mpol_equal() uses
preferred_node without checking whether it is MPOL_F_LOCAL or not.

[akpm@linux-foundation.org: slight comment tweak]
Link: http://lkml.kernel.org/r/4ebee1c2-57f6-bcb8-0e2d-1833d1ee0bb7@huawei.com
Fixes: fc36b8d3d819 ("mempolicy: use MPOL_F_LOCAL to Indicate Preferred Local Policy")
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:52:19 +02:00
David Hildenbrand
1da530fe15 kasan: fix memory hotplug during boot
commit 3f1959721558a976aaf9c2024d5bc884e6411bf7 upstream.

Using module_init() is wrong.  E.g.  ACPI adds and onlines memory before
our memory notifier gets registered.

This makes sure that ACPI memory detected during boot up will not result
in a kernel crash.

Easily reproducible with QEMU, just specify a DIMM when starting up.

Link: http://lkml.kernel.org/r/20180522100756.18478-3-david@redhat.com
Fixes: 786a8959912e ("kasan: disable memory hotplug")
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:51:50 +02:00
David Hildenbrand
b052960484 kasan: free allocated shadow memory on MEM_CANCEL_ONLINE
commit ed1596f9ab958dd156a66c9ff1029d3761c1786a upstream.

We have to free memory again when we cancel onlining, otherwise a later
onlining attempt will fail.

Link: http://lkml.kernel.org/r/20180522100756.18478-2-david@redhat.com
Fixes: fa69b5989bb0 ("mm/kasan: add support for memory hotplug")
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:51:49 +02:00
Andrey Ryabinin
9c7821c67a mm/kasan: don't vfree() nonexistent vm_area
commit 0f901dcbc31f88ae41a2aaa365f7802b5d520a28 upstream.

KASAN uses different routines to map shadow for hot added memory and
memory obtained in boot process.  Attempt to offline memory onlined by
normal boot process leads to this:

    Trying to vfree() nonexistent vm area (000000005d3b34b9)
    WARNING: CPU: 2 PID: 13215 at mm/vmalloc.c:1525 __vunmap+0x147/0x190

    Call Trace:
     kasan_mem_notifier+0xad/0xb9
     notifier_call_chain+0x166/0x260
     __blocking_notifier_call_chain+0xdb/0x140
     __offline_pages+0x96a/0xb10
     memory_subsys_offline+0x76/0xc0
     device_offline+0xb8/0x120
     store_mem_state+0xfa/0x120
     kernfs_fop_write+0x1d5/0x320
     __vfs_write+0xd4/0x530
     vfs_write+0x105/0x340
     SyS_write+0xb0/0x140

Obviously we can't call vfree() to free memory that wasn't allocated via
vmalloc().  Use find_vm_area() to see if we can call vfree().

Unfortunately it's a bit tricky to properly unmap and free shadow
allocated during boot, so we'll have to keep it.  If memory will come
online again that shadow will be reused.

Matthew asked: how can you call vfree() on something that isn't a
vmalloc address?

  vfree() is able to free any address returned by
  __vmalloc_node_range().  And __vmalloc_node_range() gives you any
  address you ask.  It doesn't have to be an address in [VMALLOC_START,
  VMALLOC_END] range.

  That's also how the module_alloc()/module_memfree() works on
  architectures that have designated area for modules.

[aryabinin@virtuozzo.com: improve comments]
  Link: http://lkml.kernel.org/r/dabee6ab-3a7a-51cd-3b86-5468718e0390@virtuozzo.com
[akpm@linux-foundation.org: fix typos, reflow comment]
Link: http://lkml.kernel.org/r/20180201163349.8700-1-aryabinin@virtuozzo.com
Fixes: fa69b5989bb0 ("mm/kasan: add support for memory hotplug")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Paul Menzel <pmenzel+linux-kasan-dev@molgen.mpg.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:51:49 +02:00
Greg Kroah-Hartman
4c9e0a9b25 This is the 4.14.43 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlsESzAACgkQONu9yGCS
 aT71uhAAtwH5Dvy395KPNS+IqabGaFnEeVpIEsxtBlIa7crspp9eUqiqEWP6nAGg
 dPeBE4jLEf8lVed0ErZ+p0eJTuhjgUmve4/5+LBWQtZIz+9ppttwklRysxCfPixs
 /cPBfSbfjQTqeQqpB3jOpQAZXnyeipxFMMjxlLoXEcKxcVM9qr3b+oNJ1lw/ETH3
 3NMIYL+PSKyYp2cnAFUpUeU7grJQeTAwPDVy+ziZ8tF0aU5JbHMNRL19d9NxhQCX
 efk4sr8smkKUv9wayM63FMtjlm/MYc6cxLRz2DsWEAQuC6qkEEqwf7vZ4XEGrqci
 1tGWibzzTpo1v+01r57U5VXkS+DMyjYajikZNTe3ixUp19iKQyMSsMrBNupapOMy
 s2x+lZLKFa3q8PGpIy0kJ8yCYw2DZMlrEC+VAfr1S9M3vz9pPzLv398r7eYcHhJb
 Q8hHPdWgX3dcsYhju5/gekDFn7M41dsU3vtoooz50HKDcqVovJNwZNgzsLR8Fs4F
 X3yanXyP5rjBnM9dQUnhi0PvJA6E/ZWDmp6LF9ZiySX1xJ9+5gflI+MnvxRvVuXk
 UP3f8ace87x3zWYzmGin7vouUzsIOueCJXKZCGCvcV5/NLMGAW3NBGCZWnnH6OTy
 RPsDUeKj36QBmalitR9yYF25Ss/zDx1b8RRdeVkD1E0YpfgMubg=
 =opxx
 -----END PGP SIGNATURE-----

Merge 4.14.43 into android-4.14

Changes in 4.14.43
	usbip: usbip_host: refine probe and disconnect debug msgs to be useful
	usbip: usbip_host: delete device from busid_table after rebind
	usbip: usbip_host: run rebind from exit when module is removed
	usbip: usbip_host: fix NULL-ptr deref and use-after-free errors
	usbip: usbip_host: fix bad unlock balance during stub_probe()
	ALSA: usb: mixer: volume quirk for CM102-A+/102S+
	ALSA: hda: Add Lenovo C50 All in one to the power_save blacklist
	ALSA: control: fix a redundant-copy issue
	spi: pxa2xx: Allow 64-bit DMA
	spi: bcm-qspi: Avoid setting MSPI_CDRAM_PCS for spi-nor master
	spi: bcm-qspi: Always read and set BSPI_MAST_N_BOOT_CTRL
	KVM: arm/arm64: VGIC/ITS save/restore: protect kvm_read_guest() calls
	KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock
	powerpc: Don't preempt_disable() in show_cpuinfo()
	vfio: ccw: fix cleanup if cp_prefetch fails
	tracing/x86/xen: Remove zero data size trace events trace_xen_mmu_flush_tlb{_all}
	tee: shm: fix use-after-free via temporarily dropped reference
	netfilter: nf_tables: free set name in error path
	netfilter: nf_tables: can't fail after linking rule into active rule list
	netfilter: nf_socket: Fix out of bounds access in nf_sk_lookup_slow_v{4,6}
	i2c: designware: fix poll-after-enable regression
	powerpc/powernv: Fix NVRAM sleep in invalid context when crashing
	drm: Match sysfs name in link removal to link creation
	lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly
	radix tree: fix multi-order iteration race
	mm: don't allow deferred pages with NEED_PER_CPU_KM
	drm/i915/gen9: Add WaClearHIZ_WM_CHICKEN3 for bxt and glk
	s390/qdio: fix access to uninitialized qdio_q fields
	s390/cpum_sf: ensure sample frequency of perf event attributes is non-zero
	s390/qdio: don't release memory in qdio_setup_irq()
	s390: remove indirect branch from do_softirq_own_stack
	x86/pkeys: Override pkey when moving away from PROT_EXEC
	x86/pkeys: Do not special case protection key 0
	efi: Avoid potential crashes, fix the 'struct efi_pci_io_protocol_32' definition for mixed mode
	ARM: 8771/1: kprobes: Prohibit kprobes on do_undefinstr
	x86/mm: Drop TS_COMPAT on 64-bit exec() syscall
	tick/broadcast: Use for_each_cpu() specially on UP kernels
	ARM: 8769/1: kprobes: Fix to use get_kprobe_ctlblk after irq-disabed
	ARM: 8770/1: kprobes: Prohibit probing on optimized_callback
	ARM: 8772/1: kprobes: Prohibit kprobes on get_user functions
	Btrfs: fix xattr loss after power failure
	Btrfs: send, fix invalid access to commit roots due to concurrent snapshotting
	btrfs: property: Set incompat flag if lzo/zstd compression is set
	btrfs: fix crash when trying to resume balance without the resume flag
	btrfs: Split btrfs_del_delalloc_inode into 2 functions
	btrfs: Fix delalloc inodes invalidation during transaction abort
	btrfs: fix reading stale metadata blocks after degraded raid1 mounts
	x86/nospec: Simplify alternative_msr_write()
	x86/bugs: Concentrate bug detection into a separate function
	x86/bugs: Concentrate bug reporting into a separate function
	x86/bugs: Read SPEC_CTRL MSR during boot and re-use reserved bits
	x86/bugs, KVM: Support the combination of guest and host IBRS
	x86/bugs: Expose /sys/../spec_store_bypass
	x86/cpufeatures: Add X86_FEATURE_RDS
	x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation
	x86/bugs/intel: Set proper CPU features and setup RDS
	x86/bugs: Whitelist allowed SPEC_CTRL MSR values
	x86/bugs/AMD: Add support to disable RDS on Fam[15,16,17]h if requested
	x86/KVM/VMX: Expose SPEC_CTRL Bit(2) to the guest
	x86/speculation: Create spec-ctrl.h to avoid include hell
	prctl: Add speculation control prctls
	x86/process: Allow runtime control of Speculative Store Bypass
	x86/speculation: Add prctl for Speculative Store Bypass mitigation
	nospec: Allow getting/setting on non-current task
	proc: Provide details on speculation flaw mitigations
	seccomp: Enable speculation flaw mitigations
	x86/bugs: Make boot modes __ro_after_init
	prctl: Add force disable speculation
	seccomp: Use PR_SPEC_FORCE_DISABLE
	seccomp: Add filter flag to opt-out of SSB mitigation
	seccomp: Move speculation migitation control to arch code
	x86/speculation: Make "seccomp" the default mode for Speculative Store Bypass
	x86/bugs: Rename _RDS to _SSBD
	proc: Use underscores for SSBD in 'status'
	Documentation/spec_ctrl: Do some minor cleanups
	x86/bugs: Fix __ssb_select_mitigation() return type
	x86/bugs: Make cpu_show_common() static
	x86/bugs: Fix the parameters alignment and missing void
	x86/cpu: Make alternative_msr_write work for 32-bit code
	KVM: SVM: Move spec control call after restore of GS
	x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
	x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
	x86/cpufeatures: Disentangle SSBD enumeration
	x86/cpufeatures: Add FEATURE_ZEN
	x86/speculation: Handle HT correctly on AMD
	x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
	x86/speculation: Add virtualized speculative store bypass disable support
	x86/speculation: Rework speculative_store_bypass_update()
	x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
	x86/bugs: Expose x86_spec_ctrl_base directly
	x86/bugs: Remove x86_spec_ctrl_set()
	x86/bugs: Rework spec_ctrl base and mask logic
	x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
	KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
	x86/bugs: Rename SSBD_NO to SSB_NO
	Linux 4.14.43

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-05-22 20:17:10 +02:00
Pavel Tatashin
fc170bda22 mm: don't allow deferred pages with NEED_PER_CPU_KM
commit ab1e8d8960b68f54af42b6484b5950bd13a4054b upstream.

It is unsafe to do virtual to physical translations before mm_init() is
called if struct page is needed in order to determine the memory section
number (see SECTION_IN_PAGE_FLAGS).  This is because only in mm_init()
we initialize struct pages for all the allocated memory when deferred
struct pages are used.

My recent fix in commit c9e97a1997 ("mm: initialize pages on demand
during boot") exposed this problem, because it greatly reduced number of
pages that are initialized before mm_init(), but the problem existed
even before my fix, as Fengguang Wu found.

Below is a more detailed explanation of the problem.

We initialize struct pages in four places:

1. Early in boot a small set of struct pages is initialized to fill the
   first section, and lower zones.

2. During mm_init() we initialize "struct pages" for all the memory that
   is allocated, i.e reserved in memblock.

3. Using on-demand logic when pages are allocated after mm_init call
   (when memblock is finished)

4. After smp_init() when the rest free deferred pages are initialized.

The problem occurs if we try to do va to phys translation of a memory
between steps 1 and 2.  Because we have not yet initialized struct pages
for all the reserved pages, it is inherently unsafe to do va to phys if
the translation itself requires access of "struct page" as in case of
this combination: CONFIG_SPARSE && !CONFIG_SPARSE_VMEMMAP

The following path exposes the problem:

  start_kernel()
   trap_init()
    setup_cpu_entry_areas()
     setup_cpu_entry_area(cpu)
      get_cpu_gdt_paddr(cpu)
       per_cpu_ptr_to_phys(addr)
        pcpu_addr_to_page(addr)
         virt_to_page(addr)
          pfn_to_page(__pa(addr) >> PAGE_SHIFT)

We disable this path by not allowing NEED_PER_CPU_KM with deferred
struct pages feature.

The problems are discussed in these threads:
  http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel.com
  http://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel.com
  http://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com

Link: http://lkml.kernel.org/r/20180515175124.1770-1-pasha.tatashin@oracle.com
Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Dennis Zhou <dennisszhou@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-22 18:53:58 +02:00
Greg Kroah-Hartman
2b59cb7780 This is the 4.14.42 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlr/3ksACgkQONu9yGCS
 aT5vdg/+NrZhrryO0+MisGGRnym0awDDo+TV0Wxuw2VCoCxAGbH0sGSJp9DtKcet
 TDtLmw8RuJFU2NPBcN4aPuGFby5kLmlOslQhKg32mKcW0tnhK67DFhiqceZB/FeY
 JdReYzvMv0UBsr5QFzPA3F5rbwjGV8N//3+spXOt3DykjtwR9wddGp7GxqWxIm/x
 wF28tHr9LAdVuwPHw/Tpkl5ouDn8TGsuNejgv544EDWbACurZCKxxG7IYKD0vFTG
 vrDPTuBoAXpzW/QI2kF7j6hy1hlzREGRak9CLYz2YAcMvXi2Lxlx5eL8lYMjTk5M
 3uvkZQ6lXjIZpKd8mRxUzj6TtZ/g3iM/mTozLBFw/JIsnCNIzyHheVZRuPARd5xT
 PF56P0cLrpO4d7Tdsn5bTcjuZDqNHn+II2ZvB9TaynJD1kDw5bpbfLi/KwZWAEHj
 2KVl4AR1swpoGsQBcjH+w2k3zYHhX1WmrAzMaN/wnybcVwxwVizpWpIIMb6t6ejk
 llG8va2ZSF8UA+OfwrTLUr483kSg3hYW72+85DdvL64K8yMOvmYhV2TncEQBH4aK
 YGjomZDKcT10afIpY5/vAVFdtCBvSB3ar/6pMS/tio0UK/SBwTV81nYCoPWoB8R5
 2gq6JJxjf92AMQhhbGnmPX8knDmbBOodDq3W8thLISIOG1qnJBA=
 =w3oc
 -----END PGP SIGNATURE-----

Merge 4.14.42 into android-4.14

Changes in 4.14.42
	8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
	bridge: check iface upper dev when setting master via ioctl
	dccp: fix tasklet usage
	ipv4: fix fnhe usage by non-cached routes
	ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
	llc: better deal with too small mtu
	net: ethernet: sun: niu set correct packet size in skb
	net: ethernet: ti: cpsw: fix packet leaking in dual_mac mode
	net/mlx4_en: Fix an error handling path in 'mlx4_en_init_netdev()'
	net/mlx4_en: Verify coalescing parameters are in range
	net/mlx5e: Err if asked to offload TC match on frag being first
	net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
	net sched actions: fix refcnt leak in skbmod
	net_sched: fq: take care of throttled flows before reuse
	net: support compat 64-bit time in {s,g}etsockopt
	net/tls: Don't recursively call push_record during tls_write_space callbacks
	net/tls: Fix connection stall on partial tls record
	openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found
	qmi_wwan: do not steal interfaces from class drivers
	r8169: fix powering up RTL8168h
	rds: do not leak kernel memory to user land
	sctp: delay the authentication for the duplicated cookie-echo chunk
	sctp: fix the issue that the cookie-ack with auth can't get processed
	sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
	sctp: remove sctp_chunk_put from fail_mark err path in sctp_ulpevent_make_rcvmsg
	sctp: use the old asoc when making the cookie-ack chunk in dupcook_d
	tcp_bbr: fix to zero idle_restart only upon S/ACKed data
	tcp: ignore Fast Open on repair mode
	tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent().
	bonding: do not allow rlb updates to invalid mac
	bonding: send learning packets for vlans on slave
	net: sched: fix error path in tcf_proto_create() when modules are not configured
	net/mlx5e: TX, Use correct counter in dma_map error flow
	net/mlx5: Avoid cleaning flow steering table twice during error flow
	hv_netvsc: set master device
	ipv6: fix uninit-value in ip6_multipath_l3_keys()
	net/mlx5e: Allow offloading ipv4 header re-write for icmp
	nsh: fix infinite loop
	udp: fix SO_BINDTODEVICE
	scsi: aacraid: Correct hba_send to include iu_type
	xfrm: Use __skb_queue_tail in xfrm_trans_queue
	btrfs: Take trans lock before access running trans in check_delayed_ref
	xfrm: fix xfrm_do_migrate() with AEAD e.g(AES-GCM)
	l2tp: revert "l2tp: fix missing print session offset info"
	proc: do not access cmdline nor environ from file-backed areas
	Linux 4.14.42

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-05-19 13:54:30 +02:00
Willy Tarreau
5c9a9508de proc: do not access cmdline nor environ from file-backed areas
commit 7f7ccc2ccc2e70c6054685f5e3522efa81556830 upstream.

proc_pid_cmdline_read() and environ_read() directly access the target
process' VM to retrieve the command line and environment. If this
process remaps these areas onto a file via mmap(), the requesting
process may experience various issues such as extra delays if the
underlying device is slow to respond.

Let's simply refuse to access file-backed areas in these functions.
For this we add a new FOLL_ANON gup flag that is passed to all calls
to access_remote_vm(). The code already takes care of such failures
(including unmapped areas). Accesses via /proc/pid/mem were not
changed though.

This was assigned CVE-2018-1120.

Note for stable backports: the patch may apply to kernels prior to 4.11
but silently miss one location; it must be checked that no call to
access_remote_vm() keeps zero as the last argument.

Reported-by: Qualys Security Advisory <qsa@qualys.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-19 10:20:27 +02:00
Greg Kroah-Hartman
04f740d4da This is the 4.14.41 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlr753gACgkQONu9yGCS
 aT7p/Q//TIC9EKe21E2Lb1Kh4lL5SDjmwe/rkA3PxiqxbkXfUDBehMCfDk4YVNVG
 TlH1TXOubzpS/8cZJPRFHEkrYXPKIA3+hKlAvJukUJCBQqmW1ILEAX5m7jrSmf+B
 tLe/r0ijOtlfB1xQdUs5RxXGIndw0gMGhpo/QTXPAC0hGh0Ykd8v2s4YAjxOvdKw
 z4DaUKtZGEPBWFVK/Bx1Fv3iAmJMt2yerERUqz8MVegYXJt+2RUGoJtsxHuvOk1p
 9q0lzHBWYihQVt1tJ0es/8cB7WsYt8txnVmeN907sryUhDjvTWIxQJb5jEV0gxxK
 AL89PHy4Hfki6l6r+tqYi92frFda8aLfsaSseOhlmqsv0MlwngW2dx3UbjaYd4If
 IQA6n0hWHuxUvjrjsPpsMAa4lvTW+/kFilb0mD6Vixy3ru+/RelKnuawJm6kbMNu
 Cb8QSVSJrhvC/UZLvwO7a3viJdKoI5B9pTh5FTKcY5wUPI1k01pg3WlWNxmnv4ZJ
 LPImR06aoJYhvbutf94AvxbCOt/au8sY4s/yk9oHgvGUEIccrGYf3BwX6ciWRt4b
 r4ZN92C9ZuD+u/ATFgi/akngtjjixw5YrZ20aX86dYcBZ25hYOiIMoc482tYQ12Z
 1vqyvKg9o1oMypG9orF09PWstbNRu3ihGATKdXL9lfAhDklOTKc=
 =zWTK
 -----END PGP SIGNATURE-----

Merge 4.14.41 into android-4.14

Changes in 4.14.41
	ipvs: fix rtnl_lock lockups caused by start_sync_thread
	netfilter: ebtables: don't attempt to allocate 0-sized compat array
	kcm: Call strp_stop before strp_done in kcm_attach
	crypto: af_alg - fix possible uninit-value in alg_bind()
	netlink: fix uninit-value in netlink_sendmsg
	net: fix rtnh_ok()
	net: initialize skb->peeked when cloning
	net: fix uninit-value in __hw_addr_add_ex()
	dccp: initialize ireq->ir_mark
	ipv4: fix uninit-value in ip_route_output_key_hash_rcu()
	soreuseport: initialise timewait reuseport field
	inetpeer: fix uninit-value in inet_getpeer
	memcg: fix per_node_info cleanup
	perf: Remove superfluous allocation error check
	tcp: fix TCP_REPAIR_QUEUE bound checking
	bdi: wake up concurrent wb_shutdown() callers.
	bdi: Fix oops in wb_workfn()
	KVM: PPC: Book3S HV: Fix trap number return from __kvmppc_vcore_entry
	KVM: PPC: Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GEN
	KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing
	arm64: Add work around for Arm Cortex-A55 Erratum 1024718
	compat: fix 4-byte infoleak via uninitialized struct field
	gpioib: do not free unrequested descriptors
	gpio: fix aspeed_gpio unmask irq
	gpio: fix error path in lineevent_create
	rfkill: gpio: fix memory leak in probe error path
	libata: Apply NOLPM quirk for SanDisk SD7UB3Q*G1001 SSDs
	dm integrity: use kvfree for kvmalloc'd memory
	tracing: Fix regex_match_front() to not over compare the test string
	z3fold: fix reclaim lock-ups
	mm: sections are not offlined during memory hotremove
	mm, oom: fix concurrent munlock and oom reaper unmap, v3
	ceph: fix rsize/wsize capping in ceph_direct_read_write()
	can: kvaser_usb: Increase correct stats counter in kvaser_usb_rx_can_msg()
	can: hi311x: Acquire SPI lock on ->do_get_berr_counter
	can: hi311x: Work around TX complete interrupt erratum
	drm/vc4: Fix scaling of uni-planar formats
	drm/i915: Fix drm:intel_enable_lvds ERROR message in kernel log
	drm/nouveau: Fix deadlock in nv50_mstm_register_connector()
	drm/atomic: Clean old_state/new_state in drm_atomic_state_default_clear()
	drm/atomic: Clean private obj old_state/new_state in drm_atomic_state_default_clear()
	net: atm: Fix potential Spectre v1
	atm: zatm: Fix potential Spectre v1
	PCI / PM: Always check PME wakeup capability for runtime wakeup support
	PCI / PM: Check device_may_wakeup() in pci_enable_wake()
	cpufreq: schedutil: Avoid using invalid next_freq
	Revert "Bluetooth: btusb: Fix quirk for Atheros 1525/QCA6174"
	Bluetooth: btusb: Add Dell XPS 13 9360 to btusb_needs_reset_resume_table
	Bluetooth: btusb: Only check needs_reset_resume DMI table for QCA rome chipsets
	thermal: exynos: Reading temperature makes sense only when TMU is turned on
	thermal: exynos: Propagate error value from tmu_read()
	nvme: add quirk to force medium priority for SQ creation
	smb3: directory sync should not return an error
	sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
	tracing/uprobe_event: Fix strncpy corner case
	perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_*
	perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr
	perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver
	perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[]
	perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map()
	KVM: PPC: Book3S HV: Fix handling of large pages in radix page fault handler
	KVM: x86: remove APIC Timer periodic/oneshot spikes
	Linux 4.14.41

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-05-16 11:40:03 +02:00
David Rientjes
2270dfcc4b mm, oom: fix concurrent munlock and oom reaper unmap, v3
commit 27ae357fa82be5ab73b2ef8d39dcb8ca2563483a upstream.

Since exit_mmap() is done without the protection of mm->mmap_sem, it is
possible for the oom reaper to concurrently operate on an mm until
MMF_OOM_SKIP is set.

This allows munlock_vma_pages_all() to concurrently run while the oom
reaper is operating on a vma.  Since munlock_vma_pages_range() depends
on clearing VM_LOCKED from vm_flags before actually doing the munlock to
determine if any other vmas are locking the same memory, the check for
VM_LOCKED in the oom reaper is racy.

This is especially noticeable on architectures such as powerpc where
clearing a huge pmd requires serialize_against_pte_lookup().  If the pmd
is zapped by the oom reaper during follow_page_mask() after the check
for pmd_none() is bypassed, this ends up deferencing a NULL ptl or a
kernel oops.

Fix this by manually freeing all possible memory from the mm before
doing the munlock and then setting MMF_OOM_SKIP.  The oom reaper can not
run on the mm anymore so the munlock is safe to do in exit_mmap().  It
also matches the logic that the oom reaper currently uses for
determining when to set MMF_OOM_SKIP itself, so there's no new risk of
excessive oom killing.

This issue fixes CVE-2018-1000200.

Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com
Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: David Rientjes <rientjes@google.com>
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-16 10:10:27 +02:00
Pavel Tatashin
8ee7dabb31 mm: sections are not offlined during memory hotremove
commit 27227c733852f71008e9bf165950bb2edaed3a90 upstream.

Memory hotplug and hotremove operate with per-block granularity.  If the
machine has a large amount of memory (more than 64G), the size of a
memory block can span multiple sections.  By mistake, during hotremove
we set only the first section to offline state.

The bug was discovered because kernel selftest started to fail:
  https://lkml.kernel.org/r/20180423011247.GK5563@yexl-desktop

After commit, "mm/memory_hotplug: optimize probe routine".  But, the bug
is older than this commit.  In this optimization we also added a check
for sections to be in a proper state during hotplug operation.

Link: http://lkml.kernel.org/r/20180427145257.15222-1-pasha.tatashin@oracle.com
Fixes: 2d070eab2e82 ("mm: consider zone which is not fully populated to have holes")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-16 10:10:27 +02:00
Vitaly Wool
21fb6d8bc5 z3fold: fix reclaim lock-ups
commit 6098d7e136692f9c6e23ae362c62ec822343e4d5 upstream.

Do not try to optimize in-page object layout while the page is under
reclaim.  This fixes lock-ups on reclaim and improves reclaim
performance at the same time.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20180430125800.444cae9706489f412ad12621@gmail.com
Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: <Oleksiy.Avramchenko@sony.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-16 10:10:27 +02:00
Tetsuo Handa
6b5a99167a bdi: wake up concurrent wb_shutdown() callers.
commit 8236b0ae31c837d2b3a2565c5f8d77f637e824cc upstream.

syzbot is reporting hung tasks at wait_on_bit(WB_shutting_down) in
wb_shutdown() [1]. This seems to be because commit 5318ce7d46866e1d ("bdi:
Shutdown writeback on all cgwbs in cgwb_bdi_destroy()") forgot to call
wake_up_bit(WB_shutting_down) after clear_bit(WB_shutting_down).

Introduce a helper function clear_and_wake_up_bit() and use it, in order
to avoid similar errors in future.

[1] https://syzkaller.appspot.com/bug?id=b297474817af98d5796bc544e1bb806fc3da0e5e

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+c0cf869505e03bdf1a24@syzkaller.appspotmail.com>
Fixes: 5318ce7d46866e1d ("bdi: Shutdown writeback on all cgwbs in cgwb_bdi_destroy()")
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-16 10:10:25 +02:00
Michal Hocko
17ffa29c35 memcg: fix per_node_info cleanup
commit 4eaf431f6f71bbed40a4c733ffe93a7e8cedf9d9 upstream.

syzbot has triggered a NULL ptr dereference when allocation fault
injection enforces a failure and alloc_mem_cgroup_per_node_info
initializes memcg->nodeinfo only half way through.

But __mem_cgroup_free still tries to free all per-node data and
dereferences pn->lruvec_stat_cpu unconditioanlly even if the specific
per-node data hasn't been initialized.

The bug is quite unlikely to hit because small allocations do not fail
and we would need quite some numa nodes to make struct
mem_cgroup_per_node large enough to cross the costly order.

Link: http://lkml.kernel.org/r/20180406100906.17790-1-mhocko@kernel.org
Reported-by: syzbot+8a5de3cce7cdc70e9ebe@syzkaller.appspotmail.com
Fixes: 00f3ca2c2d66 ("mm: memcontrol: per-lruvec stats infrastructure")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-16 10:10:24 +02:00
Greg Kroah-Hartman
c89418ee18 This is the 4.14.40 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlryqJwACgkQONu9yGCS
 aT4TlxAApPkv3brRm/VkYEQKp+JXU9Yz9MvX+UQ8tmqmwAE2HxmKZTScMNGD+dvu
 tgoZEdS7i23G52Qjn1RFn1Zw4HKgW3ZJMAvrRJJJVOlNnccRrvx9wUFOiqYeUFFC
 NCgiKckBPbFZvLe7IMWVz1KyrotogXEWP12scuV4CG792hxzZRa1unBfrIvUi9p4
 fp4IcyYucTcmQqzW4EPmOeE+iahJGTTLngyqL5uwZVegzxwQzVB+Kpc05pU0PpIo
 vgThcBNWaxHD8cyXFVISWoHjdMvUjvkNuDWJPUACT8Tpo4Q/r2ToPEnoEJ2meVos
 jlzBDQ0IwXb7L2GkTlvRLBlCCFcYizTN59LMiaApVSW1bwXS+eJef+zWlHCvmaRs
 /g5SU4OQUnv58j4nr/Uumjx79u4rcpVyINbuvzpKi85wInrrswVFQ5Eo+nac1r7j
 3ttifyhKfxrTHCbPULX5nNYF98tP38iz4I+M8Q5jjAGB71vJ5Lrvfl5nf6K/VamS
 jy1R0rLo/DCkb7bym49nI+WUBs9M8+TfAEtRAB9BklPBvdA8ktrkuD/OVcJ4pWej
 GKmd6yi6gfuPBHDJxQyqb8Ll8IUVDTld0dMg+WZa0GsJpko39K8XuqgEwwBewlUJ
 yCFYrm6F939Ra2WWvUEpPjChYeoG90vaaDZGuvPB7EOeFWJjcEQ=
 =IWvJ
 -----END PGP SIGNATURE-----

Merge 4.14.40 into android-4.14

Changes in 4.14.40
	geneve: update skb dst pmtu on tx path
	net: don't call update_pmtu unconditionally
	percpu: include linux/sched.h for cond_resched()
	crypto: talitos - fix IPsec cipher in length
	ACPI / button: make module loadable when booted in non-ACPI mode
	USB: serial: option: Add support for Quectel EP06
	ALSA: hda - Fix incorrect usage of IS_REACHABLE()
	ALSA: pcm: Check PCM state at xfern compat ioctl
	ALSA: seq: Fix races at MIDI encoding in snd_virmidi_output_trigger()
	ALSA: dice: fix kernel NULL pointer dereference due to invalid calculation for array index
	ALSA: aloop: Mark paused device as inactive
	ALSA: aloop: Add missing cable lock to ctl API callbacks
	tracepoint: Do not warn on ENOMEM
	scsi: target: Fix fortify_panic kernel exception
	Input: leds - fix out of bound access
	Input: atmel_mxt_ts - add touchpad button mapping for Samsung Chromebook Pro
	rtlwifi: btcoex: Add power_on_setting routine
	rtlwifi: cleanup 8723be ant_sel definition
	xfs: prevent creating negative-sized file via INSERT_RANGE
	RDMA/cxgb4: release hw resources on device removal
	RDMA/ucma: Allow resolving address w/o specifying source address
	RDMA/mlx5: Fix multiple NULL-ptr deref errors in rereg_mr flow
	RDMA/mlx5: Protect from shift operand overflow
	NET: usb: qmi_wwan: add support for ublox R410M PID 0x90b2
	IB/mlx5: Use unlimited rate when static rate is not supported
	IB/hfi1: Fix handling of FECN marked multicast packet
	IB/hfi1: Fix loss of BECN with AHG
	IB/hfi1: Fix NULL pointer dereference when invalid num_vls is used
	iw_cxgb4: Atomically flush per QP HW CQEs
	drm/vmwgfx: Fix a buffer object leak
	drm/bridge: vga-dac: Fix edid memory leak
	test_firmware: fix setting old custom fw path back on exit, second try
	errseq: Always report a writeback error once
	USB: serial: visor: handle potential invalid device configuration
	usb: dwc3: gadget: Fix list_del corruption in dwc3_ep_dequeue
	USB: Accept bulk endpoints with 1024-byte maxpacket
	USB: serial: option: reimplement interface masking
	USB: serial: option: adding support for ublox R410M
	usb: musb: host: fix potential NULL pointer dereference
	usb: musb: trace: fix NULL pointer dereference in musb_g_tx()
	platform/x86: asus-wireless: Fix NULL pointer dereference
	irqchip/qcom: Fix check for spurious interrupts
	tracing: Fix bad use of igrab in trace_uprobe.c
	Linux 4.14.40

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-05-09 12:56:13 +02:00
Tejun Heo
e9caf1e1d5 percpu: include linux/sched.h for cond_resched()
commit 71546d100422bcc2c543dadeb9328728997cd23a upstream.

microblaze build broke due to missing declaration of the
cond_resched() invocation added recently.  Let's include linux/sched.h
explicitly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-09 09:51:48 +02:00