12248 Commits

Author SHA1 Message Date
Blagovest Kolenichev
2973dadc19 Merge android-4.14.56 (818299f) into msm-4.14
* refs/heads/tmp-818299f
  Linux 4.14.56
  f2fs: give message and set need_fsck given broken node id
  loop: remember whether sysfs_create_group() was done
  RDMA/ucm: Mark UCM interface as BROKEN
  PM / hibernate: Fix oops at snapshot_write()
  loop: add recursion validation to LOOP_CHANGE_FD
  netfilter: x_tables: initialise match/target check parameter struct
  netfilter: nf_queue: augment nfqa_cfg_policy
  uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
  crypto: x86/salsa20 - remove x86 salsa20 implementations
  nvme-pci: Remap CMB SQ entries on every controller reset
  xen: setup pv irq ops vector earlier
  iw_cxgb4: correctly enforce the max reg_mr depth
  i2c: tegra: Fix NACK error handling
  IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values
  tools build: fix # escaping in .cmd files for future Make
  arm64: neon: Fix function may_use_simd() return error status
  kbuild: delete INSTALL_FW_PATH from kbuild documentation
  tracing: Reorder display of TGID to be after PID
  mm: do not bug_on on incorrect length in __mm_populate()
  fs, elf: make sure to page align bss in load_elf_library
  fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
  mm: do not drop unused pages when userfaultd is running
  ALSA: hda - Handle pm failure during hotplug
  ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION
  scsi: megaraid_sas: fix selection of reply queue
  scsi: megaraid_sas: Create separate functions to allocate ctrl memory
  scsi: megaraid_sas: replace is_ventura with adapter_type checks
  scsi: megaraid_sas: replace instance->ctrl_context checks with instance->adapter_type
  scsi: megaraid_sas: use adapter_type for all gen controllers
  genirq/affinity: assign vectors to all possible CPUs
  Fix up non-directory creation in SGID directories
  devpts: resolve devpts bind-mounts
  devpts: hoist out check for DEVPTS_SUPER_MAGIC
  xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
  usb: quirks: add delay quirks for Corsair Strafe
  USB: serial: mos7840: fix status-register error handling
  USB: yurex: fix out-of-bounds uaccess in read handler
  USB: serial: keyspan_pda: fix modem-status error handling
  USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
  USB: serial: ch341: fix type promotion bug in ch341_control_in()
  ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
  vmw_balloon: fix inflation with batching
  ata: Fix ZBC_OUT all bit handling
  ata: Fix ZBC_OUT command block check
  staging: r8822be: Fix RTL8822be can't find any wireless AP
  staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data().
  ibmasm: don't write out of bounds in read handler
  mmc: dw_mmc: fix card threshold control configuration
  mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states
  MIPS: Fix ioremap() RAM check
  MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
  MIPS: Call dump_stack() from show_regs()
  ASoC: mediatek: preallocate pages use platform device
  media: rc: mce_kbd decoder: fix stuck keys
  ANDROID: Fix massive cpufreq_times memory leaks
  ANDROID: Reduce use of #ifdef CONFIG_CPU_FREQ_TIMES

Change-Id: I8181c52138e12e6cdd25b9cf0ffba19469593ab2
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2018-07-17 06:56:37 -07:00
Greg Kroah-Hartman
818299f6bd This is the 4.14.56 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltNuVYACgkQONu9yGCS
 aT7kTA/+MRHC5oFvdnhSsF6jAHsY9rgJNQXPtZCFhZnHhhYHtubQ2OJOmSZ7IfM0
 9yhz/7vijC9+tLufXQxQnu2UUL3ojNu1+l+q9s0U1GUzNiONlJ9q/CyB4xjXFRCS
 1RdiDZaQbIqUCYs38UCTsEJF65uKjzQ6dpF21XdIXp5FPxgiZawo4HpjQRJswbAl
 Du97ybMEPN3XnAn207GjZwy58ubRLF5HDG1sqNGfjVWJ7oMTi+QJOCvY3PJtU3j2
 unS0qjxLU432rOyDfaJK7Yj9s61zu0PurbJrHo+dw3O3hd/Og7soqoqohUEjZWXd
 z7jjrntXZOZ/0st2yHmygfAPUJm/8jsh7Pd39Jgyfeu/3Clo51gO494rwATQsyE5
 mwIdllyzyMNBEJI2F2fxE60WlFsbTjeBOX3BaOwnF8pGRJWsCAfbFknRbuKh1fO5
 czFbUSOi00POw4WHT1rxV9u0yDBXmP47fy9zHquOim+PfK8pFvWuf6GSFjvqRTv8
 20w1w7eixMi09ZXOkgTJ3S00MKHSpxoaenI3n2NcEVVRgDEVfh3C/zelvvfCDMHD
 i36DN39Sj41PNA/R4n0TIA4W+ab9qBVzQl16yaj9JURR2rA92GyMVC1+Xjqo1Py3
 GRFOf2Gprlm0/vfkiRsMu9coAJuKV6+8fHXQU4mzHulKUaDWuJ0=
 =/wBU
 -----END PGP SIGNATURE-----

Merge 4.14.56 into android-4.14

Changes in 4.14.56
	media: rc: mce_kbd decoder: fix stuck keys
	ASoC: mediatek: preallocate pages use platform device
	MIPS: Call dump_stack() from show_regs()
	MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
	MIPS: Fix ioremap() RAM check
	mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states
	mmc: dw_mmc: fix card threshold control configuration
	ibmasm: don't write out of bounds in read handler
	staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data().
	staging: r8822be: Fix RTL8822be can't find any wireless AP
	ata: Fix ZBC_OUT command block check
	ata: Fix ZBC_OUT all bit handling
	vmw_balloon: fix inflation with batching
	ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS
	USB: serial: ch341: fix type promotion bug in ch341_control_in()
	USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick
	USB: serial: keyspan_pda: fix modem-status error handling
	USB: yurex: fix out-of-bounds uaccess in read handler
	USB: serial: mos7840: fix status-register error handling
	usb: quirks: add delay quirks for Corsair Strafe
	xhci: xhci-mem: off by one in xhci_stream_id_to_ring()
	devpts: hoist out check for DEVPTS_SUPER_MAGIC
	devpts: resolve devpts bind-mounts
	Fix up non-directory creation in SGID directories
	genirq/affinity: assign vectors to all possible CPUs
	scsi: megaraid_sas: use adapter_type for all gen controllers
	scsi: megaraid_sas: replace instance->ctrl_context checks with instance->adapter_type
	scsi: megaraid_sas: replace is_ventura with adapter_type checks
	scsi: megaraid_sas: Create separate functions to allocate ctrl memory
	scsi: megaraid_sas: fix selection of reply queue
	ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION
	ALSA: hda - Handle pm failure during hotplug
	mm: do not drop unused pages when userfaultd is running
	fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
	fs, elf: make sure to page align bss in load_elf_library
	mm: do not bug_on on incorrect length in __mm_populate()
	tracing: Reorder display of TGID to be after PID
	kbuild: delete INSTALL_FW_PATH from kbuild documentation
	arm64: neon: Fix function may_use_simd() return error status
	tools build: fix # escaping in .cmd files for future Make
	IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values
	i2c: tegra: Fix NACK error handling
	iw_cxgb4: correctly enforce the max reg_mr depth
	xen: setup pv irq ops vector earlier
	nvme-pci: Remap CMB SQ entries on every controller reset
	crypto: x86/salsa20 - remove x86 salsa20 implementations
	uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
	netfilter: nf_queue: augment nfqa_cfg_policy
	netfilter: x_tables: initialise match/target check parameter struct
	loop: add recursion validation to LOOP_CHANGE_FD
	PM / hibernate: Fix oops at snapshot_write()
	RDMA/ucm: Mark UCM interface as BROKEN
	loop: remember whether sysfs_create_group() was done
	f2fs: give message and set need_fsck given broken node id
	Linux 4.14.56

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-17 12:29:15 +02:00
Michal Hocko
81ebc9decd mm: do not bug_on on incorrect length in __mm_populate()
commit bb177a732c4369bb58a1fe1df8f552b6f0f7db5f upstream.

syzbot has noticed that a specially crafted library can easily hit
VM_BUG_ON in __mm_populate

  kernel BUG at mm/gup.c:1242!
  invalid opcode: 0000 [#1] SMP
  CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
  RIP: 0010:__mm_populate+0x1e2/0x1f0
  Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
  Call Trace:
     vm_brk_flags+0xc3/0x100
     vm_brk+0x1f/0x30
     load_elf_library+0x281/0x2e0
     __ia32_sys_uselib+0x170/0x1e0
     do_fast_syscall_32+0xca/0x420
     entry_SYSENTER_compat+0x70/0x7f

The reason is that the length of the new brk is not page aligned when we
try to populate it.  There is no reason to bug on that though.
do_brk_flags already aligns the length properly so the mapping is
expanded as it should.  All we need is to tell mm_populate about it.
Besides that there is absolutely no reason to bug_on in the first
place.  The worst thing that could happen is that the last page wouldn't
get populated and that is far from putting system into an inconsistent
state.

Fix the issue by moving the length sanitization code from do_brk_flags
up to vm_brk_flags.  The only other caller of do_brk_flags is brk
syscall entry and it makes sure to provide the proper length so there
is no need for sanitation and so we can use do_brk_flags without it.

Also remove the bogus BUG_ONs.
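
The length sanitization described above can be illustrated in userspace.
This is only a sketch, not the kernel code: the 4 KiB page size and the
helper name page_align_len() are assumptions made for illustration.  It
rounds a length up to a whole number of pages, as the kernel's
PAGE_ALIGN() macro does, and rejects a length whose rounding wraps
around.

```c
/* Assumption: 4 KiB pages; the real value is architecture dependent. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Round len up to a whole number of pages (hypothetical helper).
 * Returns 0 and stores the aligned length in *out on success,
 * or -1 if rounding up would overflow unsigned long.
 */
static int page_align_len(unsigned long len, unsigned long *out)
{
	unsigned long aligned =
		(len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);

	if (aligned < len)	/* the addition wrapped around */
		return -1;
	*out = aligned;
	return 0;
}
```

With this rounding done once at the entry point, a populate step that
receives the aligned length never sees a partial page.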

[osalvador@techadventures.net: fix up vm_brk_flags s@request@len@]
Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: syzbot <syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-17 11:39:29 +02:00
Christian Borntraeger
684a2d8ed5 mm: do not drop unused pages when userfaultd is running
commit bce73e4842390f7b7309c8e253e139db71288ac3 upstream.

KVM guests on s390 can notify the host of unused pages.  This can result
in pte_unused callbacks to be true for KVM guest memory.

If a page is unused (checked with pte_unused) we might drop this page
instead of paging it.  This can have side-effects on userfaultd, when
the page in question was already migrated:

The next access of that page will trigger a fault and a user fault
instead of faulting in a new and empty zero page.  As QEMU does not
expect a userfault on an already migrated page this migration will fail.

The most straightforward solution is to ignore the pte_unused hint if a
userfault context is active for this VMA.

Link: http://lkml.kernel.org/r/20180703171854.63981-1-borntraeger@de.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-17 11:39:29 +02:00
qctecmdr Service
de4318de7e Merge "Merge android-4.14.55 (2e9aed1) into msm-4.14" 2018-07-12 16:49:04 -07:00
Isaac J. Manjarres
179f58aa64 Merge android-4.14.55 (2e9aed1) into msm-4.14
* remotes/origin/tmp-2e9aed1:
  Linux 4.14.55
  Revert mm/vmstat.c: fix vmstat_update() preemption BUG
  sched, tracing: Fix trace_sched_pi_setprio() for deboosting
  staging: comedi: quatech_daqp_cs: fix no-op loop daqp_ao_insn_write()
  netfilter: nf_log: don't hold nf_log_mutex during user access
  mtd: cfi_cmdset_0002: Change erase functions to check chip good only
  mtd: cfi_cmdset_0002: Change erase functions to retry for error
  mtd: cfi_cmdset_0002: Change definition naming to retry write operation
  dm: prevent DAX mounts if not supported
  dm: set QUEUE_FLAG_DAX accordingly in dm_table_set_restrictions()
  dax: check for QUEUE_FLAG_DAX in bdev_dax_supported()
  dax: change bdev_dax_supported() to support boolean returns
  fs: allow per-device dax status checking for filesystems
  mtd: rawnand: mxc: set spare area size register explicitly
  media: cx25840: Use subdev host data for PLL override
  Kbuild: fix # escaping in .cmd files for future Make
  Revert "dpaa_eth: fix error in dpaa_remove()"
  f2fs: truncate preallocated blocks in error case
  media: vb2: core: Finish buffers at the end of the stream
  mm: hwpoison: disable memory error handling on 1GB hugepage
  irq/core: Fix boot crash when the irqaffinity= boot parameter is passed on CPUMASK_OFFSTACK=y kernels(v1)
  HID: debug: check length before copy_to_user()
  HID: hiddev: fix potential Spectre v1
  HID: i2c-hid: Fix "incomplete report" noise
  block: cope with WRITE ZEROES failing in blkdev_issue_zeroout()
  block: factor out __blkdev_issue_zero_pages()
  ext4: check superblock mapped prior to committing
  ext4: add more mount time checks of the superblock
  ext4: add more inode number paranoia checks
  ext4: avoid running out of journal credits when appending to an inline file
  ext4: never move the system.data xattr out of the inode body
  ext4: clear i_data in ext4_inode_info when removing inline data
  ext4: include the illegal physical block in the bad map ext4_error msg
  ext4: verify the depth of extent tree in ext4_find_extent()
  ext4: only look at the bg_flags field if it is valid
  ext4: always check block group bounds in ext4_init_block_bitmap()
  ext4: make sure bitmaps and the inode table don't overlap with bg descriptors
  ext4: always verify the magic number in xattr blocks
  ext4: add corruption check in ext4_xattr_set_entry()
  jbd2: don't mark block as modified if the handle is out of credits
  drm/udl: fix display corruption of the last line
  drm: Use kvzalloc for allocating blob property memory
  cifs: Fix slab-out-of-bounds in send_set_info() on SMB2 ACE setting
  cifs: Fix infinite loop when using hard mount option
  cifs: Fix memory leak in smb2_set_ea()
  cifs: Fix use after free of a mid_q_entry
  vfio: Use get_user_pages_longterm correctly
  drbd: fix access after free
  s390: Correct register corruption in critical section cleanup
  scsi: target: Fix truncated PR-in ReadKeys response
  scsi: sg: mitigate read/write abuse
  tracing: Fix missing return symbol in function_graph output
  mm: hugetlb: yield when prepping struct pages
  userfaultfd: hugetlbfs: fix userfaultfd_huge_must_wait() pte access
  arm64: fix show_data fallout from KERN_CONT changes

Change-Id: I020c73a87142daffcc219230e476069e3bc98d2d
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2018-07-12 07:05:40 -07:00
Liam Mark
c3fea2d9c7 mm, oom: rate limit oom reaper logging
The oom reaper logging is sometimes flooding the logs and causing a
watchdog timeout, so rate limit the oom reaper logging.
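
The message does not say which mechanism is used.  As a generic
illustration of rate limiting by window and burst, here is a userspace
sketch; struct ratelimit and ratelimit_ok() are hypothetical names, and
the time unit is whatever the caller passes in.

```c
struct ratelimit {
	long interval;	/* window length, in caller-defined ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in the current window */
};

/* Returns 1 if a message may be emitted at time now, 0 if suppressed. */
static int ratelimit_ok(struct ratelimit *rl, long now)
{
	if (now - rl->begin >= rl->interval) {
		/* the window has expired: start a new one */
		rl->begin = now;
		rl->printed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return 1;
	}
	return 0;
}
```

A caller would guard each log statement with ratelimit_ok() and skip
the message whenever it returns 0.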

Change-Id: Ic4ddf2d6273839e43501f08f51b1c6406d7cc4d2
Signed-off-by: Liam Mark <lmark@codeaurora.org>
2018-07-11 16:31:44 -07:00
qctecmdr Service
8946e7a5cc Merge "defconfig: sm8150: Enable memory region offlining support" 2018-07-11 08:53:19 -07:00
Greg Kroah-Hartman
2e9aed164f This is the 4.14.55 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltGFEUACgkQONu9yGCS
 aT5jMw//Y70OdIlEj7R/AjZZqAzwczZQhW/00ytJvCUZPzTazEhXxSbyc4d65VjQ
 4mdrl6nfYLOR0bF8gdWlGFCZRc405CXAc9INvixEUbj0w5MPWPQOmqT9gyOCg+Mu
 Iv5FXXEhk+H5vCIpN7g2PnoPFDFX7cC1vlqtbFfKRXCyGUZirmPl2vVcgID6scWN
 gB3+oWWcgNaCWuyz+tXrzzEQOJhMW84Y55wga1T1gjpE3yBreMU0j6DOXPTxrf/E
 VFs/h75ObR9yNB8O38d7zPrzQpaJHK1rhtqpJB+Thftxr0nO3Bn4Bg2FjnzMp8qP
 HNQKseeFfn0C7uNPjl3Pc5DH5BWfveOUPfbUHzuzyQZbK8E5O22BLhMxu+yS9PO2
 xzlN0OF8vP1VIR+gs12qopF9aGRCBM88YVCALb93fK+vEHhVOOa1kmfyTu3rCf/p
 M3rqw1YuW3TSwcskeL2MlSjnmxmM7HR/PmLJGD4xdmCwQtLAljVTD/sIUZOiPchh
 fH8CQc6QJEWo25oNSvdjQTdQtTTORMaU7JZ8TxEfbE7DRb4ziBpLNIxAanYc8vEw
 qXRXkigTdOW/Fb2X7vLxANXxXc5Xd4gRxjRJZfvN0ekw8GSkyk7wpNyURGDGt9UY
 kPMal06BUg7zEjHc16xVhrIed7PzE+FfTTzEspBOtbMkVzmHCTk=
 =dpg4
 -----END PGP SIGNATURE-----

Merge 4.14.55 into android-4.14

Changes in 4.14.55
	userfaultfd: hugetlbfs: fix userfaultfd_huge_must_wait() pte access
	mm: hugetlb: yield when prepping struct pages
	tracing: Fix missing return symbol in function_graph output
	scsi: sg: mitigate read/write abuse
	scsi: target: Fix truncated PR-in ReadKeys response
	s390: Correct register corruption in critical section cleanup
	drbd: fix access after free
	vfio: Use get_user_pages_longterm correctly
	cifs: Fix use after free of a mid_q_entry
	cifs: Fix memory leak in smb2_set_ea()
	cifs: Fix infinite loop when using hard mount option
	cifs: Fix slab-out-of-bounds in send_set_info() on SMB2 ACE setting
	drm: Use kvzalloc for allocating blob property memory
	drm/udl: fix display corruption of the last line
	jbd2: don't mark block as modified if the handle is out of credits
	ext4: add corruption check in ext4_xattr_set_entry()
	ext4: always verify the magic number in xattr blocks
	ext4: make sure bitmaps and the inode table don't overlap with bg descriptors
	ext4: always check block group bounds in ext4_init_block_bitmap()
	ext4: only look at the bg_flags field if it is valid
	ext4: verify the depth of extent tree in ext4_find_extent()
	ext4: include the illegal physical block in the bad map ext4_error msg
	ext4: clear i_data in ext4_inode_info when removing inline data
	ext4: never move the system.data xattr out of the inode body
	ext4: avoid running out of journal credits when appending to an inline file
	ext4: add more inode number paranoia checks
	ext4: add more mount time checks of the superblock
	ext4: check superblock mapped prior to committing
	block: factor out __blkdev_issue_zero_pages()
	block: cope with WRITE ZEROES failing in blkdev_issue_zeroout()
	HID: i2c-hid: Fix "incomplete report" noise
	HID: hiddev: fix potential Spectre v1
	HID: debug: check length before copy_to_user()
	irq/core: Fix boot crash when the irqaffinity= boot parameter is passed on CPUMASK_OFFSTACK=y kernels(v1)
	mm: hwpoison: disable memory error handling on 1GB hugepage
	media: vb2: core: Finish buffers at the end of the stream
	f2fs: truncate preallocated blocks in error case
	Revert "dpaa_eth: fix error in dpaa_remove()"
	Kbuild: fix # escaping in .cmd files for future Make
	media: cx25840: Use subdev host data for PLL override
	mtd: rawnand: mxc: set spare area size register explicitly
	fs: allow per-device dax status checking for filesystems
	dax: change bdev_dax_supported() to support boolean returns
	dax: check for QUEUE_FLAG_DAX in bdev_dax_supported()
	dm: set QUEUE_FLAG_DAX accordingly in dm_table_set_restrictions()
	dm: prevent DAX mounts if not supported
	mtd: cfi_cmdset_0002: Change definition naming to retry write operation
	mtd: cfi_cmdset_0002: Change erase functions to retry for error
	mtd: cfi_cmdset_0002: Change erase functions to check chip good only
	netfilter: nf_log: don't hold nf_log_mutex during user access
	staging: comedi: quatech_daqp_cs: fix no-op loop daqp_ao_insn_write()
	sched, tracing: Fix trace_sched_pi_setprio() for deboosting
	Revert mm/vmstat.c: fix vmstat_update() preemption BUG
	Linux 4.14.55

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-11 16:46:10 +02:00
Sebastian Andrzej Siewior
b3ef356a09 Revert mm/vmstat.c: fix vmstat_update() preemption BUG
commit 28557cc106e6d2aa8b8c5c7687ea9f8055ff3911 upstream.

Revert commit c7f26ccfb2c3 ("mm/vmstat.c: fix vmstat_update() preemption
BUG").  Steven saw a "using smp_processor_id() in preemptible" message
and added a preempt_disable() section around it to keep it quiet.  This
is not the right thing to do; it does not fix the real problem.

vmstat_update() is invoked by a kworker on a specific CPU.  This worker
is bound to this CPU.  The name of the worker was "kworker/1:1" so it
should have been a worker which was bound to CPU1.  A worker which can
run on any CPU would have a `u' before the first digit.
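
The naming rule above can be checked mechanically.  The following is a
userspace sketch only; kworker_bound_cpu() is a hypothetical helper and
"kworker/u8:2" is an illustrative unbound name, not one taken from the
report.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Classify a kworker thread name (hypothetical helper).
 * Returns the CPU number for a per-CPU worker such as "kworker/1:1",
 * -1 for an unbound worker such as "kworker/u8:2",
 * and -2 if the string is not a kworker name.
 */
static int kworker_bound_cpu(const char *name)
{
	static const char prefix[] = "kworker/";
	const char *p;
	char *end;
	long cpu;

	if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
		return -2;
	p = name + sizeof(prefix) - 1;
	if (*p == 'u')			/* `u' before the first digit */
		return -1;
	if (*p < '0' || *p > '9')
		return -2;
	cpu = strtol(p, &end, 10);
	if (*end != ':')
		return -2;
	return (int)cpu;
}
```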

smp_processor_id() can be used in a preempt-enabled region as long as
the task is bound to a single CPU which is the case here.  If it could
run on an arbitrary CPU then this is the problem we have and should seek
to resolve.

Not only this smp_processor_id() must not be migrated to another CPU but
also refresh_cpu_vm_stats() which might access wrong per-CPU variables.
Not to mention that other code relies on the fact that such a worker
runs on one specific CPU only.

Therefore revert that commit and we should look instead what broke the
affinity mask of the kworker.

Link: http://lkml.kernel.org/r/20180504104451.20278-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-11 16:29:24 +02:00
Naoya Horiguchi
b16a6af974 mm: hwpoison: disable memory error handling on 1GB hugepage
commit 31286a8484a85e8b4e91ddb0f5415aee8a416827 upstream.

Recently the following BUG was reported:

    Injecting memory failure for pfn 0x3c0000 at process virtual address 0x7fe300000000
    Memory failure: 0x3c0000: recovery action for huge page: Recovered
    BUG: unable to handle kernel paging request at ffff8dfcc0003000
    IP: gup_pgd_range+0x1f0/0xc20
    PGD 17ae72067 P4D 17ae72067 PUD 0
    Oops: 0000 [#1] SMP PTI
    ...
    CPU: 3 PID: 5467 Comm: hugetlb_1gb Not tainted 4.15.0-rc8-mm1-abc+ #3
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014

You can easily reproduce this by calling madvise(MADV_HWPOISON) twice on
a 1GB hugepage.  This happens because get_user_pages_fast() is not aware
of a migration entry on pud that was created in the 1st madvise() event.

I think that conversion to pud-aligned migration entry is working, but
other MM code walking over page table isn't prepared for it.  We need
some time and effort to make all this work properly, so this patch
avoids the reported bug by just disabling error handling for 1GB
hugepage.

[n-horiguchi@ah.jp.nec.com: v2]
  Link: http://lkml.kernel.org/r/1517284444-18149-1-git-send-email-n-horiguchi@ah.jp.nec.com
Link: http://lkml.kernel.org/r/1517207283-15769-1-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-11 16:29:20 +02:00
Cannon Matthews
48b019a51a mm: hugetlb: yield when prepping struct pages
commit 520495fe96d74e05db585fc748351e0504d8f40d upstream.

When booting with very large numbers of gigantic (i.e.  1G) pages, the
operations in the loop of gather_bootmem_prealloc, and specifically
prep_compound_gigantic_page, take a very long time, and can cause a
softlockup if enough pages are requested at boot.

For example booting with 3844 1G pages requires prepping
(set_compound_head, init the count) over 1 billion 4K tail pages, which
takes considerable time.
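
The "over 1 billion" figure follows from simple arithmetic, sketched
here in userspace C.  tail_pages_for_1g() is a hypothetical helper; it
assumes that each 1G compound page consists of one head page and that
every remaining 4K page is a tail page.

```c
/* Number of 4K tail pages behind count gigantic pages of 1G each. */
static unsigned long long tail_pages_for_1g(unsigned long long count)
{
	const unsigned long long huge = 1ULL << 30;	/* 1G */
	const unsigned long long base = 1ULL << 12;	/* 4K */

	/* 262144 base pages per gigantic page, minus the head page */
	return count * (huge / base - 1);
}
```

For count = 3844 this gives 3844 * 262143 = 1007677692 tail pages.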

Add a cond_resched() to the outer loop in gather_bootmem_prealloc() to
prevent this lockup.

Tested: Booted with softlockup_panic=1 hugepagesz=1G hugepages=3844 and
no softlockup is reported, and the hugepages are reported as
successfully setup.

Link: http://lkml.kernel.org/r/20180627214447.260804-1-cannonmatthews@google.com
Signed-off-by: Cannon Matthews <cannonmatthews@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-11 16:29:13 +02:00
qctecmdr Service
a980a6a325 Merge "lowmemorykiller: Introduce sysfs node for ALMK and PPR adj threshold" 2018-07-10 11:49:41 -07:00
Sudarshan Rajagopalan
911c18bb7e arm64: mm/memblock: Update memory limit calculation
The system RAM region would not be ideally contiguous from
start to end of DRAM. Detect any offset/holes present in the
DDR region and update the memory limit being set, so that the
DRAM end address is aligned to the memory block size. This is
a requirement for mem-offline framework to work properly.
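
The alignment requirement can be illustrated with a small userspace
sketch.  This is not the code of this change: align_end_down() is a
hypothetical helper, and it assumes the memory block size is a nonzero
power of two.

```c
#include <stdint.h>

/*
 * Round a DRAM end address down to a multiple of block_size.
 * Assumption: block_size is a nonzero power of two.
 */
static uint64_t align_end_down(uint64_t end, uint64_t block_size)
{
	return end & ~(block_size - 1);
}
```

For example, with a 512MB block size an end address of 0x17FE00000
would be lowered to 0x160000000, while an end address that is already
aligned is left unchanged.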

Change-Id: I97cb0cce70e7c414ae2d200a4a3f714d145c0e64
Signed-off-by: Sudarshan Rajagopalan <sudaraja@codeaurora.org>
2018-07-10 11:39:00 -07:00
qctecmdr Service
41a65c888c Merge "mm: swap: free up swap on mm reap" 2018-07-09 12:31:38 -07:00
Suyog Sarda
f326985b26 lowmemorykiller: Introduce sysfs node for ALMK and PPR adj threshold
The grouping of tasks based on oom_score_adj values changes from
one framework to another. This requires corresponding changes in
the threshold values set for almk and per process reclaim.
Introduce sysfs nodes to set threshold adj for process reclaim
and adaptive LMK dynamically.

Change-Id: Ib7565bfd5d2e93aa4ff8fdd20414cac0a0f38bf7
Signed-off-by: Suyog Sarda <ssarda@codeaurora.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-07-09 12:37:23 +05:30
qctecmdr Service
6c86746573 Merge "mm: run the showmem notifier in alloc failure" 2018-07-07 12:29:07 -07:00
qctecmdr Service
f4c73f910e Merge "Merge android-4.14.53 (57c2874) into msm-4.14" 2018-07-07 12:29:02 -07:00
Prakash Gupta
17e3983d31 mm: run the showmem notifier in alloc failure
When the page allocation fails, it's useful to
be able to see the state of unaccounted memory in the system. Call the
showmem notifier to get other clients to dump out their state.

This is an example output with this patch.

[  457.125478] SLUB: Unable to allocate memory on node -1, gfp=0x2008000(GFP_NOWAIT|__GFP_ZERO)
[  457.133982]   cache: kmalloc-128, object size: 128, buffer size: 640, default order: 2, min order: 0
[  457.143179]   node 0: slabs: 5903, objs: 132755, free: 26
[  457.906076] BootAnimation: page allocation failure: order:0, mode:0x2204000(GFP_NOWAIT|__GFP_COMP|__GFP_NOTRACK)
[  457.916395] CPU: 2 PID: 4752 Comm: BootAnimation Not tainted 4.9.37+ #43
[  457.916398] Hardware name: Qualcomm Technologies, Inc. SDM845 v1 MTP (DT)
[  457.916402] Call trace:
[  457.916420] [<ffffff82c5c89504>] dump_backtrace+0x0/0x2c4
[  457.916426] [<ffffff82c5c897e8>] show_stack+0x20/0x28
[  457.916434] [<ffffff82c5fea888>] dump_stack+0xb8/0xf4
[  457.916442] [<ffffff82c5dd136c>] warn_alloc+0x154/0x170
[  457.916447] [<ffffff82c5dd184c>] __alloc_pages_nodemask+0x430/0xcdc
[  457.916454] [<ffffff82c5e1c154>] new_slab+0x344/0x430
[  457.916458] [<ffffff82c5e1e404>] ___slab_alloc.constprop.72+0x2f4/0x398
[  457.916463] [<ffffff82c5e1e4f0>] __slab_alloc.isra.69.constprop.71+0x48/0x80
[  457.916467] [<ffffff82c5e1ea24>] kmem_cache_alloc_trace+0x210/0x2dc
[  457.916476] [<ffffff82c68ebf0c>] binder_transaction+0x280/0x2008
[  457.916480] [<ffffff82c68ee68c>] binder_thread_write+0x9f8/0x136c
[  457.916484] [<ffffff82c68f08d0>] binder_ioctl_write_read+0x14c/0x3b0
[  457.916488] [<ffffff82c68f0df0>] binder_ioctl+0x2bc/0x868
[  457.916494] [<ffffff82c5e451e4>] do_vfs_ioctl+0xd0/0x858
[  457.916498] [<ffffff82c5e459fc>] SyS_ioctl+0x90/0xa4
[  457.916503] [<ffffff82c5c83770>] el0_svc_naked+0x24/0x28
[  457.916505] Mem-Info:
[  457.916515] active_anon:83629 inactive_anon:212 isolated_anon:0\x0a active_file:5955 inactive_file:5745 isolated_file:0\x0a unevictable:630956 dirty:0 writeback:0 unstable:0\x0a slab_reclaimable:16602 slab_unreclaimable:69384\x0a mapped:3609 shmem:308 pagetables:5737 bounce:0\x0a free:4446 free_pcp:482 free_cma:112
[  457.916524] Node 0 active_anon:334516kB inactive_anon:848kB active_file:23820kB inactive_file:22980kB unevictable:2523824kB isolated(anon):0kB isolated(file):0kB mapped:14436kB dirty:0kB writeback:0kB shmem:1232kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
[  457.916534] DMA free:12516kB min:3352kB low:4924kB high:6496kB active_anon:45336kB inactive_anon:64kB active_file:3588kB inactive_file:3404kB unevictable:1564224kB writepending:0kB present:1854120kB managed:1748064kB mlocked:1564224kB slab_reclaimable:10664kB slab_unreclaimable:24400kB kernel_stack:2608kB pagetables:6376kB bounce:0kB free_pcp:572kB local_pcp:0kB free_cma:448kB
[  457.916536] lowmem_reserve[]: 0 1901 1901
[  457.916550] Normal free:5268kB min:4148kB low:6092kB high:8036kB active_anon:289180kB inactive_anon:784kB active_file:20232kB inactive_file:19576kB unevictable:959600kB writepending:0kB present:2068224kB managed:1984196kB mlocked:959600kB slab_reclaimable:55744kB slab_unreclaimable:253136kB kernel_stack:19232kB pagetables:16572kB bounce:0kB free_pcp:1356kB local_pcp:116kB free_cma:0kB
[  457.916552] lowmem_reserve[]: 0 0 0
[  457.916560] DMA: 819*4kB (UMEC) 280*8kB (UMEC) 224*16kB (UME) 98*32kB (UME) 6*64kB (UME) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 12620kB
[  457.916594] Normal: 1118*4kB (UMEH) 38*8kB (UMH) 1*16kB (H) 2*32kB (H) 1*64kB (H) 1*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5048kB
[  457.916627] 12071 total pagecache pages
[  457.916630] 0 pages in swap cache
[  457.916633] Swap cache stats: add 0, delete 0, find 0/0
[  457.916634] Free swap  = 0kB
[  457.916636] Total swap = 0kB
[  457.916639] 980586 pages RAM
[  457.916641] 0 pages HighMem/MovableOnly
[  457.916643] 47521 pages reserved
[  457.916645] 51200 pages cma reserved
[  457.916651] cma: cma-0 pages: => 0 used of 2048 total pages
[  457.916660] cma: cma-1 pages: => 0 used of 23552 total pages
[  457.916665] cma: cma-2 pages: => 1695 used of 3072 total pages
[  457.916670] cma: cma-3 pages: => 8277 used of 9216 total pages
[  457.916674] cma: cma-4 pages: => 186 used of 5120 total pages
[  457.916679] cma: cma-5 pages: => 3792 used of 8192 total pages
[  457.916685]        Heap name  Total heap size Total orphaned size
[  457.916687] ---------------------------------
[  457.916691]           qsecom 0x           ba000 0x               0
[  457.916694]           system 0x         44db000 0x          500000
[  457.916705] -------------------------------------------------
[  457.916708] uncached pool = 31027200 cached pool = 0 secure pool = 0
[  457.916710] pool total (uncached + cached + secure) = 31027200
[  457.916712] -------------------------------------------------
[  457.916715]             adsp 0x          614000 0x               0
[  457.916720]             spss 0x               0 0x               0
[  457.916725]   secure_display 0x               0 0x               0
[  457.916727]      secure_heap 0x               0 0x               0

Change-Id: Id01cce4abf331ff9c1c7ab9f0c0f9b1fc4146467
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
2018-07-05 11:41:58 -07:00
Prakash Gupta
09e1bae59c mm, oom: run the showmem notifier in oom
When the OOM killer starts killing processes, it is useful to
be able to see the state of unaccounted memory in the system. Call the
showmem notifier so that other clients dump out their state.

Change-Id: Id32ff6d6747fee7d0889447323ddce25282e93f6
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
2018-07-05 11:41:57 -07:00
Prakash Gupta
9ed5f4f274 mm: cma: Register with show_mem notification framework
Register with the show_mem notification framework
to let cma dump out data for debugging.

This is an example output with this patch.

cma: cma-0 pages: => 0 used of 2048 total pages
cma: cma-1 pages: => 0 used of 23552 total pages
cma: cma-2 pages: => 1696 used of 3072 total pages
cma: cma-3 pages: => 8277 used of 9216 total pages
cma: cma-4 pages: => 186 used of 5120 total pages
cma: cma-5 pages: => 3792 used of 8192 total pages

Change-Id: I80d9291444fd9361f5586bdf60373ed8b5b41705
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
2018-07-05 11:41:56 -07:00
Charan Teja Reddy
b95aeb6151 lowmemorykiller: use oom reaper to free pages of task killed by lmk
Use the oom_reaper to free, in parallel, the pages of a task that
was killed by LMK. Returning these pages to the buddy system well
in advance may reduce the number of kills LMK has to perform.

Change-Id: I5e1ed183437ab243f12cbbf3ae10d9ca5211fc06
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
2018-07-05 11:41:50 -07:00
Vinayak Menon
22d014495f mm: swap: free up swap on mm reap
With the swap slots feature, the freeing up of swap space
during the mm reap (e.g. kill or oom reap) is delayed
till a threshold is reached. This is done mainly to reduce
the fragmentation of swap space. The caching is done per
cpu. But with zram, this means the zram space consumed is
not freed immediately on task kill or oom reap. Since
we don't use THP swap, fragmentation is not a concern. So
free up the slots without caching when the swap device is
a synchronous one. Note that this does not disable the swap
slots feature, which can still take effect during swap slots
allocation.

Change-Id: I9985edfdd88723d38ae905496556e07fb8bf09af
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-07-05 17:19:46 +05:30
Vinayak Menon
59906b3de3 mm: vmpressure: make vmpressure window variable
Right now the vmpressure window is of constant size 2MB, which
works well with the following exceptions.
1) False vmpressure triggers are seen when the RAM size is greater
than 3GB. This results in lowmemorykiller, which uses vmpressure
events, killing tasks unnecessarily.
2) Vmpressure events are received late under memory pressure. This
behaviour is seen prominently in <=2GB RAM targets. This results in
lowmemorykiller kicking in late to kill tasks resulting in avoidable
page cache reclaim.

The problem analysis shows that the issue is with the constant size
of the vmpressure window which does not adapt to the varying memory
conditions. This patch recalculates the vmpressure window size at
the end of each window. The chosen window size is proportional to
the total of free and cached memory at that point.
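
As a rough illustration of the recalculation described above (a sketch
only, not the code from this patch), the next window can be derived from
the free and cached page counts and clamped to fixed bounds. The divisor,
the upper bound and the function name are assumptions made for the
example; only the 2MB starting point comes from the text.

```c
/* Illustrative values only; none of these constants are taken from the patch. */
#define WINDOW_MIN_PAGES 512UL   /* 2MB with 4kB pages, the old fixed window */
#define WINDOW_MAX_PAGES 16384UL /* hypothetical upper bound */
#define WINDOW_DIVISOR   64UL    /* hypothetical proportionality factor */

/* Return the size, in pages, of the next vmpressure window. */
unsigned long next_vmpressure_window(unsigned long free_pages,
				     unsigned long cached_pages)
{
	unsigned long window = (free_pages + cached_pages) / WINDOW_DIVISOR;

	/* Never shrink below the old constant, never grow without bound. */
	if (window < WINDOW_MIN_PAGES)
		window = WINDOW_MIN_PAGES;
	if (window > WINDOW_MAX_PAGES)
		window = WINDOW_MAX_PAGES;
	return window;
}
```

Called at the end of each window, a helper like this lets the window
grow while plenty of memory is free or cached and fall back towards the
old 2MB value as memory gets tight.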

Change-Id: I7e9ef4ddd82e2c2dd04ce09ec8d58a8829cfb64d
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-07-04 12:18:22 +05:30
Vinayak Menon
2493065a69 mm: vmpressure: account allocstalls only on higher pressures
At present any vmpressure value is scaled up if the pages are
reclaimed through direct reclaim. This can result in false
vmpressure values. Consider a case where a device is booted up
and most of the memory is occupied by file pages. kswapd will
make sure that high watermark is maintained. Now when a sudden
huge allocation request comes in, the system will definitely
have to get into direct reclaims. The vmpressures can be very low,
but because of allocstall accounting logic even these low values
will be scaled to values nearing 100. This can result in
unnecessary LMK kills for example. So define a tunable threshold
for vmpressure above which the allocstalls will be accounted.
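
The gating can be pictured with the small sketch below. It is an
illustration under assumptions, not the patch itself: the function name
and parameters are hypothetical, and "boosted" stands for whatever value
the allocstall accounting would otherwise have produced.

```c
/*
 * Return the pressure to report (0..100).
 *
 * pressure:  raw vmpressure value for the window
 * boosted:   value after allocstall scaling (hypothetical input)
 * threshold: tunable below which allocstalls are ignored
 */
unsigned long account_allocstalls(unsigned long pressure,
				  unsigned long boosted,
				  unsigned long threshold)
{
	/* Low pressure: report the raw value, do not scale it up. */
	if (pressure < threshold)
		return pressure;

	/* High pressure: allocstalls are accounted, capped at 100. */
	return boosted > 100 ? 100 : boosted;
}
```

With a threshold of 70, a raw pressure of 20 is reported as 20 even if
direct reclaim was entered, so a sudden large allocation on a system
full of file pages no longer looks like a near-100 event.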

Change-Id: Idd7c6724264ac89f1f68f2e9d70a32390ffca3e5
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-07-04 12:18:21 +05:30
Vinayak Menon
68ce3bbbaa mm: vmpressure: scale pressure based on reclaim context
The existing calculation of vmpressure takes into account only
the ratio of reclaimed to scanned pages, but not the time spent
or the difficulty in reclaiming those pages. For example, when
there are quite a number of file pages in the system, an allocation
request can be satisfied by reclaiming the file pages alone. If
such a reclaim is successful, the vmpressure value will remain low
irrespective of the time spent by the reclaim code to free up the
file pages. With a feature like lowmemorykiller, killing a task
can be faster than reclaiming the file pages alone. So if the
vmpressure values reflect the reclaim difficulty level, clients
can make a decision based on that, for example to kill a task early.

This patch monitors the number of pages scanned in the direct
reclaim path and scales the vmpressure level according to that.
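
A minimal sketch of the idea, assuming a simple linear boost: the base
value is the usual reclaimed-to-scanned ratio, and the share of the
window that was scanned in direct reclaim pushes it towards 100. Both
function names and the linear formula are assumptions for illustration;
the patch may scale differently.

```c
/* Base pressure (0..100) from the reclaimed/scanned ratio, simplified. */
unsigned long base_pressure(unsigned long scanned, unsigned long reclaimed)
{
	if (scanned == 0 || reclaimed >= scanned)
		return 0;
	return 100 - (reclaimed * 100 / scanned);
}

/*
 * Scale a base pressure (0..100) by how much of the window was scanned
 * in the direct reclaim path. Hypothetical linear scaling.
 */
unsigned long scaled_pressure(unsigned long pressure,
			      unsigned long direct_scanned,
			      unsigned long window)
{
	unsigned long boost;

	if (window == 0)
		return pressure;

	/* Percentage of the window covered by direct reclaim, capped at 100. */
	boost = direct_scanned >= window ? 100 : direct_scanned * 100 / window;

	/* Move the value towards 100 in proportion to the boost. */
	return pressure + (100 - pressure) * boost / 100;
}
```

In this sketch a window where 900 of 1000 scanned pages were reclaimed
has a base pressure of 10; if half of the window was scanned in direct
reclaim, the reported value rises to 55.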

Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Change-Id: I6e643d29a9a1aa0814309253a8b690ad86ec0b13
2018-07-04 12:18:21 +05:30
Vinayak Menon
7964b3ce47 mm: process reclaim: vmpressure based process reclaim
With this patch, anon pages of inactive tasks can be reclaimed,
depending on memory pressure. Memory pressure is detected
using vmpressure events. The 'N' best tasks in terms of anon
size are selected and pages proportional to their task size
are reclaimed. The total number of pages reclaimed at each
run of the swap work can be tuned from userspace, the
default being SWAP_CLUSTER_MAX * 32.

The patch also adds tracepoints to debug and tune the
feature.

echo 1 > /sys/module/process_reclaim/parameters/enable_process_reclaim
to enable the feature.

echo <pages> > /sys/module/process_reclaim/parameters/per_swap_size,
to set the number of pages reclaimed in each scan.

/sys/module/process_reclaim/parameters/reclaim_avg_efficiency, provides
the average efficiency (scan to reclaim ratio) of the algorithm.

/sys/module/process_reclaim/parameters/swap_eff_win, to set the window
period (in unit of number of times reclaim is triggered) to detect
low efficiency runs.

/sys/module/process_reclaim/parameters/swap_opt_eff, to set the optimal
efficiency threshold for low efficiency detection.
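
The proportional split mentioned at the top of this message can be
sketched as follows. This is an illustration, not the driver code: the
function name and array-based interface are hypothetical, and
SWAP_CLUSTER_MAX is defined here as 32 to match the mainline value.

```c
#include <stddef.h>

#define SWAP_CLUSTER_MAX      32UL
#define PER_SWAP_SIZE_DEFAULT (SWAP_CLUSTER_MAX * 32) /* 1024 pages */

/*
 * Split a reclaim budget (in pages) across the selected tasks in
 * proportion to their anon size. Results are rounded down.
 */
void split_reclaim_budget(const unsigned long *anon_pages, size_t ntasks,
			  unsigned long budget, unsigned long *to_reclaim)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < ntasks; i++)
		total += anon_pages[i];

	for (i = 0; i < ntasks; i++)
		to_reclaim[i] = total ? budget * anon_pages[i] / total : 0;
}
```

With the default budget and three tasks holding 3000, 2000 and 1000
anon pages, the shares come out as 512, 341 and 170 pages.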

Change-Id: I895986f10c997d1715761eaaadc4bbbee60db9d2
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-07-04 12:18:09 +05:30
Liam Mark
8df57cb4fa android/lowmemorykiller: Selectively count free CMA pages
In certain memory configurations there can be a large number of
CMA pages which are not suitable to satisfy certain memory
requests.

This large number of unsuitable pages can cause the
lowmemorykiller to not kill any tasks because the
lowmemorykiller counts all free pages.
In order to ensure the lowmemorykiller properly evaluates the
free memory, only count the free CMA pages if they are suitable
for satisfying the memory request.
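
A small sketch of the counting rule, under the assumption that the free
page count already includes free CMA pages. The function name and the
boolean describing the request are hypothetical.

```c
#include <stdbool.h>

/*
 * Free pages as seen by the lowmemorykiller for a given request.
 * free_pages is assumed to include free_cma_pages.
 */
unsigned long lmk_usable_free_pages(unsigned long free_pages,
				    unsigned long free_cma_pages,
				    bool request_can_use_cma)
{
	if (request_can_use_cma)
		return free_pages;

	/* CMA pages cannot satisfy this request, so leave them out. */
	return free_pages > free_cma_pages ? free_pages - free_cma_pages : 0;
}
```

For a request that cannot use CMA, 10000 free pages of which 8000 are
CMA count as only 2000, which lets the lowmemorykiller thresholds
trigger instead of being masked by unusable memory.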

Change-Id: I7f06d53e2d8cfe7439e5561fe6e5209ce73b1c90
Signed-off-by: Liam Mark <lmark@codeaurora.org>
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-07-03 18:24:32 -07:00
Isaac J. Manjarres
44984fa144 Merge android-4.14.53 (57c2874) into msm-4.14
* remotes/origin/tmp-57c2874:
  Linux 4.14.53
  xhci: Fix use-after-free in xhci_free_virt_device
  dm thin: handle running out of data space vs concurrent discard
  dm zoned: avoid triggering reclaim from inside dmz_map()
  x86/efi: Fix efi_call_phys_epilog() with CONFIG_X86_5LEVEL=y
  block: Fix cloning of requests with a special payload
  block: Fix transfer when chunk sectors exceeds max
  slub: fix failure when we delete and create a slab cache
  ALSA: hda/realtek - Fix the problem of two front mics on more machines
  ALSA: hda/realtek - Add a quirk for FSC ESPRIMO U9210
  ALSA: hda/realtek - Fix pop noise on Lenovo P50 & co
  ALSA: timer: Fix UBSAN warning at SNDRV_TIMER_IOCTL_NEXT_DEVICE ioctl
  Input: elantech - fix V4 report decoding for module with middle key
  Input: elantech - enable middle button of touchpads on ThinkPad P52
  Input: elan_i2c_smbus - fix more potential stack buffer overflows
  Input: xpad - fix GPD Win 2 controller name
  udf: Detect incorrect directory size
  xen: Remove unnecessary BUG_ON from __unbind_from_irq()
  mm: fix devmem_is_allowed() for sub-page System RAM intersections
  mm/ksm.c: ignore STABLE_FLAG of rmap_item->address in rmap_walk_ksm()
  rbd: flush rbd_dev->watch_dwork after watch is unregistered
  pwm: lpss: platform: Save/restore the ctrl register over a suspend/resume
  Input: elan_i2c - add ELAN0618 (Lenovo v330 15IKB) ACPI ID
  ACPI / LPSS: Add missing prv_offset setting for byt/cht PWM devices
  video: uvesafb: Fix integer overflow in allocation
  NFSv4: Fix a typo in nfs41_sequence_process
  NFSv4: Revert commit 5f83d86cf531d ("NFSv4.x: Fix wraparound issues..")
  NFSv4: Fix possible 1-byte stack overflow in nfs_idmap_read_and_verify_message
  nfsd: restrict rd_maxcount to svc_max_payload in nfsd_encode_readdir
  media: dvb_frontend: fix locking issues at dvb_frontend_get_event()
  media: cx231xx: Add support for AverMedia DVD EZMaker 7
  media: v4l2-compat-ioctl32: prevent go past max size
  media: vsp1: Release buffers for each video node
  perf/x86/intel/uncore: Add event constraint for BDX PCU
  perf vendor events: Add Goldmont Plus V1 event file
  perf intel-pt: Fix packet decoding of CYC packets
  perf intel-pt: Fix "Unexpected indirect branch" error
  perf intel-pt: Fix MTC timing after overflow
  perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
  perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
  perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
  arm: dts: mt7623: fix invalid memory node being generated
  mfd: intel-lpss: Fix Intel Cannon Lake LPSS I2C input clock
  mfd: intel-lpss: Program REMAP register in PIO mode
  backlight: tps65217_bl: Fix Device Tree node lookup
  backlight: max8925_bl: Fix Device Tree node lookup
  backlight: as3711_bl: Fix Device Tree node lookup
  UBIFS: Fix potential integer overflow in allocation
  ubi: fastmap: Correctly handle interrupted erasures in EBA
  ubi: fastmap: Cancel work upon detach
  rpmsg: smd: do not use mananged resources for endpoints and channels
  md: fix two problems with setting the "re-add" device state.
  rtc: sun6i: Fix bit_idx value for clk_register_gate
  clk: at91: PLL recalc_rate() now using cached MUL and DIV values
  linvdimm, pmem: Preserve read-only setting for pmem devices
  scsi: zfcp: fix missing REC trigger trace on enqueue without ERP thread
  scsi: zfcp: fix missing REC trigger trace for all objects in ERP_FAILED
  scsi: zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED
  scsi: zfcp: fix missing REC trigger trace on terminate_rport_io early return
  scsi: zfcp: fix misleading REC trigger trace where erp_action setup failed
  scsi: zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF
  scsi: zfcp: fix missing SCSI trace for result of eh_host_reset_handler
  scsi: qla2xxx: Mask off Scope bits in retry delay
  scsi: qla2xxx: Fix setting lower transfer speed if GPSC fails
  scsi: hpsa: disable device during shutdown
  mm: fix __gup_device_huge vs unmap
  iio: sca3000: Fix an error handling path in 'sca3000_probe()'
  iio: adc: ad7791: remove sample freq sysfs attributes
  Btrfs: fix return value on rename exchange failure
  X.509: unpack RSA signatureValue field from BIT STRING
  irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node
  time: Make sure jiffies_to_msecs() preserves non-zero time periods
  MIPS: io: Add barrier after register read in inX()
  cpufreq: intel_pstate: Fix scaling max/min limits with Turbo 3.0
  pinctrl: devicetree: Fix pctldev pointer overwrite
  pinctrl: samsung: Correct EINTG banks order
  auxdisplay: fix broken menu
  PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
  PCI: Add ACS quirk for Intel 300 series
  PCI: Add ACS quirk for Intel 7th & 8th Gen mobile
  PCI: hv: Make sure the bus domain is really unique
  MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum
  mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking.
  mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boudary
  mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips
  mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock()
  mtd: cfi_cmdset_0002: Change write buffer to check correct value
  xprtrdma: Return -ENOBUFS when no pages are available
  RDMA/mlx4: Discard unknown SQP work requests
  IB/hfi1: Fix user context tail allocation for DMA_RTAIL
  IB/hfi1: Optimize kthread pointer locking when queuing CQ entries
  IB/hfi1: Reorder incorrect send context disable
  IB/hfi1: Fix fault injection init/exit issues
  IB/isert: fix T10-pi check mask setting
  IB/isert: Fix for lib/dma_debug check_sync warning
  IB/mlx5: Fetch soft WQE's on fatal error state
  IB/core: Make testing MR flags for writability a static inline function
  IB/mlx4: Mark user MR as writable if actual virtual memory is writable
  IB/{hfi1, qib}: Add handling of kernel restart
  IB/qib: Fix DMA api warning with debug kernel
  tpm: fix race condition in tpm_common_write()
  tpm: fix use after free in tpm2_load_context()
  of: platform: stop accessing invalid dev in of_platform_device_destroy
  of: unittest: for strings, account for trailing \0 in property length field
  of: overlay: validate offset from property fixups
  ARM64: dts: meson: disable sd-uhs modes on the libretech-cc
  arm64: mm: Ensure writes to swapper are ordered wrt subsequent cache maintenance
  arm64: kpti: Use early_param for kpti= command-line option
  arm64: Fix syscall restarting around signal suppressed by tracer
  ARM: dts: socfpga: Fix NAND controller node compatible for Arria10
  ARM: dts: socfpga: Fix NAND controller clock supply
  ARM: dts: socfpga: Fix NAND controller node compatible
  ARM: dts: Fix SPI node for Arria10
  ARM: 8764/1: kgdb: fix NUMREGBYTES so that gdb_regs[] is the correct size
  cxl: Disable prefault_mode in Radix mode
  soc: rockchip: power-domain: Fix wrong value when power up pd with writemask
  powerpc/fadump: Unregister fadump on kexec down path.
  cpuidle: powernv: Fix promotion from snooze if next state disabled
  powerpc/powernv/cpuidle: Init all present cpus for deep states
  powerpc/powernv: copy/paste - Mask SO bit in CR
  powerpc/powernv/ioda2: Remove redundant free of TCE pages
  powerpc/ptrace: Fix enforcement of DAWR constraints
  powerpc/perf: Fix memory allocation for core-imc based on num_possible_cpus()
  powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
  powerpc/mm/hash: Add missing isync prior to kernel stack SLB switch
  fuse: fix control dir setup and teardown
  fuse: don't keep dead fuse_conn at fuse_fill_super().
  fuse: atomic_o_trunc should truncate pagecache
  fuse: fix congested state leak on aborted connections
  printk: fix possible reuse of va_list variable
  Bluetooth: hci_qca: Avoid missing rampatch failure with userspace fw loader
  ipmi:bt: Set the timeout before doing a capabilities check
  branch-check: fix long->int truncation when profiling branches
  mips: ftrace: fix static function graph tracing
  ftrace/selftest: Have the reset_trigger code be a bit more careful
  lib/vsprintf: Remove atomic-unsafe support for %pCr
  clk: renesas: cpg-mssr: Stop using printk format %pCr
  thermal: bcm2835: Stop using printk format %pCr
  ASoC: cirrus: i2s: Fix {TX|RX}LinCtrlData setup
  ASoC: cirrus: i2s: Fix LRCLK configuration
  ASoC: cs35l35: Add use_single_rw to regmap config
  ASoC: dapm: delete dapm_kcontrol_data paths list before freeing it
  1wire: family module autoload fails because of upper/lower case mismatch.
  usb: do not reset if a low-speed or full-speed device timed out
  PM / OPP: Update voltage in case freq == old_freq
  PM / core: Fix supplier device runtime PM usage counter imbalance
  PM / Domains: Fix error path during attach in genpd
  signal/xtensa: Consistenly use SIGBUS in do_unaligned_user
  serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version
  m68k/mac: Fix SWIM memory resource end address
  m68k/mm: Adjust VM area to be unmapped by gap size for __iounmap()
  x86: Call fixup_exception() before notify_die() in math_error()
  x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
  x86/mce: Fix incorrect "Machine check from unknown source" message
  x86/mce: Check for alternate indication of machine check recovery on Skylake
  x86/mce: Improve error message when kernel cannot recover
  x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
  x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()
  FROMLIST: trace: Reorder display of TGID to be after PID

Change-Id: I2e5135127f9d81a39dc77bc84fa50c76ec0b58af
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2018-07-03 12:43:20 -07:00
Greg Kroah-Hartman
57c28741d0 This is the 4.14.53 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAls7QPEACgkQONu9yGCS
 aT5Zuw//UYR0Hahnjiv61N2NCo5cH+uSOc0XjR/a8iTBHVa5lN459dmrKVUDJKyS
 JrIJjwsaUL5H/VHN/XrdRUQMqo38osQ395t+sVCzVaouaJ0nYlEaxVexI0E87mpk
 zsd7qF0HfgGxOEEVfCcxlwKDzgstSNMP3KWprTZZ/5V04NjPlOXPsNOnKj6PWKTI
 4XCp7OrVQhL5zFQKm0kPok9CHrunjjYpF0pgftKblhdB/RPi0E/XbpLrW5hDxOvY
 MxnzKWKHsbEzV6PJKFNmEvFc4D3/Dm3mDG9aI7fL4FbnSBxkxKrzkAX8HP163Lc1
 cNiwhqo4v2IsfVvuJcV9+toVsg+UHcmPETd02hfhIBnN7lCo56+IBoo2FTsV9BRy
 AIWtwzpBj52j0gXTHhORYRhQqa6Jd/N7+9Aay40avWs8NI1tokOGfgifLoJlbXqE
 spfMZdK1ihiUNav2PmY7WklPlN4OeGGcMKvt0bJ4IY2nprI/oeKEUvAkwC5CVRo+
 w/Qvgp94vJDALWRA7e0dUR2cQMN0Y9ELLCy08KgdzRDTUY5f0xVw9Qz0Swx1Zxgk
 DwD+nxscEzr4n0wKtcLkkt2wu9sS/eUeAAHKFqNKRtHQvgqx0oymgow35pw4XHjt
 04sXUemWUXzR73T55HC960vWBrpu67HbNAyGqlCbiATX63euEDY=
 =YCfp
 -----END PGP SIGNATURE-----

Merge 4.14.53 into android-4.14

Changes in 4.14.53
	x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()
	x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
	x86/mce: Improve error message when kernel cannot recover
	x86/mce: Check for alternate indication of machine check recovery on Skylake
	x86/mce: Fix incorrect "Machine check from unknown source" message
	x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
	x86: Call fixup_exception() before notify_die() in math_error()
	m68k/mm: Adjust VM area to be unmapped by gap size for __iounmap()
	m68k/mac: Fix SWIM memory resource end address
	serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version
	signal/xtensa: Consistenly use SIGBUS in do_unaligned_user
	PM / Domains: Fix error path during attach in genpd
	PM / core: Fix supplier device runtime PM usage counter imbalance
	PM / OPP: Update voltage in case freq == old_freq
	usb: do not reset if a low-speed or full-speed device timed out
	1wire: family module autoload fails because of upper/lower case mismatch.
	ASoC: dapm: delete dapm_kcontrol_data paths list before freeing it
	ASoC: cs35l35: Add use_single_rw to regmap config
	ASoC: cirrus: i2s: Fix LRCLK configuration
	ASoC: cirrus: i2s: Fix {TX|RX}LinCtrlData setup
	thermal: bcm2835: Stop using printk format %pCr
	clk: renesas: cpg-mssr: Stop using printk format %pCr
	lib/vsprintf: Remove atomic-unsafe support for %pCr
	ftrace/selftest: Have the reset_trigger code be a bit more careful
	mips: ftrace: fix static function graph tracing
	branch-check: fix long->int truncation when profiling branches
	ipmi:bt: Set the timeout before doing a capabilities check
	Bluetooth: hci_qca: Avoid missing rampatch failure with userspace fw loader
	printk: fix possible reuse of va_list variable
	fuse: fix congested state leak on aborted connections
	fuse: atomic_o_trunc should truncate pagecache
	fuse: don't keep dead fuse_conn at fuse_fill_super().
	fuse: fix control dir setup and teardown
	powerpc/mm/hash: Add missing isync prior to kernel stack SLB switch
	powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
	powerpc/perf: Fix memory allocation for core-imc based on num_possible_cpus()
	powerpc/ptrace: Fix enforcement of DAWR constraints
	powerpc/powernv/ioda2: Remove redundant free of TCE pages
	powerpc/powernv: copy/paste - Mask SO bit in CR
	powerpc/powernv/cpuidle: Init all present cpus for deep states
	cpuidle: powernv: Fix promotion from snooze if next state disabled
	powerpc/fadump: Unregister fadump on kexec down path.
	soc: rockchip: power-domain: Fix wrong value when power up pd with writemask
	cxl: Disable prefault_mode in Radix mode
	ARM: 8764/1: kgdb: fix NUMREGBYTES so that gdb_regs[] is the correct size
	ARM: dts: Fix SPI node for Arria10
	ARM: dts: socfpga: Fix NAND controller node compatible
	ARM: dts: socfpga: Fix NAND controller clock supply
	ARM: dts: socfpga: Fix NAND controller node compatible for Arria10
	arm64: Fix syscall restarting around signal suppressed by tracer
	arm64: kpti: Use early_param for kpti= command-line option
	arm64: mm: Ensure writes to swapper are ordered wrt subsequent cache maintenance
	ARM64: dts: meson: disable sd-uhs modes on the libretech-cc
	of: overlay: validate offset from property fixups
	of: unittest: for strings, account for trailing \0 in property length field
	of: platform: stop accessing invalid dev in of_platform_device_destroy
	tpm: fix use after free in tpm2_load_context()
	tpm: fix race condition in tpm_common_write()
	IB/qib: Fix DMA api warning with debug kernel
	IB/{hfi1, qib}: Add handling of kernel restart
	IB/mlx4: Mark user MR as writable if actual virtual memory is writable
	IB/core: Make testing MR flags for writability a static inline function
	IB/mlx5: Fetch soft WQE's on fatal error state
	IB/isert: Fix for lib/dma_debug check_sync warning
	IB/isert: fix T10-pi check mask setting
	IB/hfi1: Fix fault injection init/exit issues
	IB/hfi1: Reorder incorrect send context disable
	IB/hfi1: Optimize kthread pointer locking when queuing CQ entries
	IB/hfi1: Fix user context tail allocation for DMA_RTAIL
	RDMA/mlx4: Discard unknown SQP work requests
	xprtrdma: Return -ENOBUFS when no pages are available
	mtd: cfi_cmdset_0002: Change write buffer to check correct value
	mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock()
	mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips
	mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boudary
	mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking.
	MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum
	PCI: hv: Make sure the bus domain is really unique
	PCI: Add ACS quirk for Intel 7th & 8th Gen mobile
	PCI: Add ACS quirk for Intel 300 series
	PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
	auxdisplay: fix broken menu
	pinctrl: samsung: Correct EINTG banks order
	pinctrl: devicetree: Fix pctldev pointer overwrite
	cpufreq: intel_pstate: Fix scaling max/min limits with Turbo 3.0
	MIPS: io: Add barrier after register read in inX()
	time: Make sure jiffies_to_msecs() preserves non-zero time periods
	irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node
	X.509: unpack RSA signatureValue field from BIT STRING
	Btrfs: fix return value on rename exchange failure
	iio: adc: ad7791: remove sample freq sysfs attributes
	iio: sca3000: Fix an error handling path in 'sca3000_probe()'
	mm: fix __gup_device_huge vs unmap
	scsi: hpsa: disable device during shutdown
	scsi: qla2xxx: Fix setting lower transfer speed if GPSC fails
	scsi: qla2xxx: Mask off Scope bits in retry delay
	scsi: zfcp: fix missing SCSI trace for result of eh_host_reset_handler
	scsi: zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF
	scsi: zfcp: fix misleading REC trigger trace where erp_action setup failed
	scsi: zfcp: fix missing REC trigger trace on terminate_rport_io early return
	scsi: zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED
	scsi: zfcp: fix missing REC trigger trace for all objects in ERP_FAILED
	scsi: zfcp: fix missing REC trigger trace on enqueue without ERP thread
	linvdimm, pmem: Preserve read-only setting for pmem devices
	clk: at91: PLL recalc_rate() now using cached MUL and DIV values
	rtc: sun6i: Fix bit_idx value for clk_register_gate
	md: fix two problems with setting the "re-add" device state.
	rpmsg: smd: do not use mananged resources for endpoints and channels
	ubi: fastmap: Cancel work upon detach
	ubi: fastmap: Correctly handle interrupted erasures in EBA
	UBIFS: Fix potential integer overflow in allocation
	backlight: as3711_bl: Fix Device Tree node lookup
	backlight: max8925_bl: Fix Device Tree node lookup
	backlight: tps65217_bl: Fix Device Tree node lookup
	mfd: intel-lpss: Program REMAP register in PIO mode
	mfd: intel-lpss: Fix Intel Cannon Lake LPSS I2C input clock
	arm: dts: mt7623: fix invalid memory node being generated
	perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
	perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
	perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
	perf intel-pt: Fix MTC timing after overflow
	perf intel-pt: Fix "Unexpected indirect branch" error
	perf intel-pt: Fix packet decoding of CYC packets
	perf vendor events: Add Goldmont Plus V1 event file
	perf/x86/intel/uncore: Add event constraint for BDX PCU
	media: vsp1: Release buffers for each video node
	media: v4l2-compat-ioctl32: prevent go past max size
	media: cx231xx: Add support for AverMedia DVD EZMaker 7
	media: dvb_frontend: fix locking issues at dvb_frontend_get_event()
	nfsd: restrict rd_maxcount to svc_max_payload in nfsd_encode_readdir
	NFSv4: Fix possible 1-byte stack overflow in nfs_idmap_read_and_verify_message
	NFSv4: Revert commit 5f83d86cf531d ("NFSv4.x: Fix wraparound issues..")
	NFSv4: Fix a typo in nfs41_sequence_process
	video: uvesafb: Fix integer overflow in allocation
	ACPI / LPSS: Add missing prv_offset setting for byt/cht PWM devices
	Input: elan_i2c - add ELAN0618 (Lenovo v330 15IKB) ACPI ID
	pwm: lpss: platform: Save/restore the ctrl register over a suspend/resume
	rbd: flush rbd_dev->watch_dwork after watch is unregistered
	mm/ksm.c: ignore STABLE_FLAG of rmap_item->address in rmap_walk_ksm()
	mm: fix devmem_is_allowed() for sub-page System RAM intersections
	xen: Remove unnecessary BUG_ON from __unbind_from_irq()
	udf: Detect incorrect directory size
	Input: xpad - fix GPD Win 2 controller name
	Input: elan_i2c_smbus - fix more potential stack buffer overflows
	Input: elantech - enable middle button of touchpads on ThinkPad P52
	Input: elantech - fix V4 report decoding for module with middle key
	ALSA: timer: Fix UBSAN warning at SNDRV_TIMER_IOCTL_NEXT_DEVICE ioctl
	ALSA: hda/realtek - Fix pop noise on Lenovo P50 & co
	ALSA: hda/realtek - Add a quirk for FSC ESPRIMO U9210
	ALSA: hda/realtek - Fix the problem of two front mics on more machines
	slub: fix failure when we delete and create a slab cache
	block: Fix transfer when chunk sectors exceeds max
	block: Fix cloning of requests with a special payload
	x86/efi: Fix efi_call_phys_epilog() with CONFIG_X86_5LEVEL=y
	dm zoned: avoid triggering reclaim from inside dmz_map()
	dm thin: handle running out of data space vs concurrent discard
	xhci: Fix use-after-free in xhci_free_virt_device
	Linux 4.14.53

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-07-03 18:26:32 +02:00
David Rientjes
157ea568b9 mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes
Since the 2.6 kernel, the oom killer has slightly biased away from
CAP_SYS_ADMIN processes by discounting some of its memory usage in
comparison to other processes.

This has always been implicit and nothing exactly relies on the behavior.

Gaurav notices that __task_cred() can dereference a potentially freed
pointer if the task under consideration is exiting because a reference to
the task_struct is not held.

Remove the CAP_SYS_ADMIN bias so that all processes are treated equally.

If any CAP_SYS_ADMIN process would like to be biased against, it is always
allowed to adjust /proc/pid/oom_score_adj.
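
The effect on the score can be sketched in isolation as below. The
function names are hypothetical and the real oom_badness() does
considerably more; only the 3% discount itself is modelled here.

```c
#include <stdbool.h>

/* Before this change: CAP_SYS_ADMIN tasks had 3% of their points discounted. */
unsigned long badness_with_bonus(unsigned long points, bool has_cap_sys_admin)
{
	if (has_cap_sys_admin)
		points -= (points * 3) / 100;
	return points;
}

/* After this change: the capability no longer influences the score. */
unsigned long badness_without_bonus(unsigned long points, bool has_cap_sys_admin)
{
	(void)has_cap_sys_admin; /* kept only to mirror the old signature */
	return points;
}
```

Dropping the capability check also removes the need to look at the
task's credentials at this point, which is where the potentially freed
pointer was dereferenced.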

Change-Id: Ib5aabf6e1669301e9367b2495d26f21924ae7209
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803071548510.6996@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Git-commit: a62ca4dbf28fc5caad697f2603bcd5acbadce330
Git-repo: http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/
Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org>
2018-07-03 05:57:31 -07:00
Mikulas Patocka
804a0db743 slub: fix failure when we delete and create a slab cache
commit d50d82faa0c964e31f7a946ba8aba7c715ca7ab0 upstream.

In kernel 4.17 I removed some code from dm-bufio that did slab cache
merging (commit 21bb13276768: "dm bufio: remove code that merges slab
caches") - both slab and slub support merging caches with identical
attributes, so dm-bufio now just calls kmem_cache_create and relies on
implicit merging.

This uncovered a bug in the slub subsystem - if we delete a cache and
immediately create another cache with the same attributes, it fails
because of duplicate filename in /sys/kernel/slab/.  The slub subsystem
offloads freeing the cache to a workqueue - and if we create the new
cache before the workqueue runs, it complains because of duplicate
filename in sysfs.

This patch fixes the bug by moving the call of kobject_del from
sysfs_slab_remove_workfn to shutdown_cache.  kobject_del must be called
while we hold slab_mutex - so that the sysfs entry is deleted before a
cache with the same attributes could be created.

Running device-mapper-test-suite with:

  dmtest run --suite thin-provisioning -n /commit_failure_causes_fallback/

triggered:

  Buffer I/O error on dev dm-0, logical block 1572848, async page read
  device-mapper: thin: 253:1: metadata operation 'dm_pool_alloc_data_block' failed: error = -5
  device-mapper: thin: 253:1: aborting current metadata transaction
  sysfs: cannot create duplicate filename '/kernel/slab/:a-0000144'
  CPU: 2 PID: 1037 Comm: kworker/u48:1 Not tainted 4.17.0.snitm+ #25
  Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
  Workqueue: dm-thin do_worker [dm_thin_pool]
  Call Trace:
   dump_stack+0x5a/0x73
   sysfs_warn_dup+0x58/0x70
   sysfs_create_dir_ns+0x77/0x80
   kobject_add_internal+0xba/0x2e0
   kobject_init_and_add+0x70/0xb0
   sysfs_slab_add+0xb1/0x250
   __kmem_cache_create+0x116/0x150
   create_cache+0xd9/0x1f0
   kmem_cache_create_usercopy+0x1c1/0x250
   kmem_cache_create+0x18/0x20
   dm_bufio_client_create+0x1ae/0x410 [dm_bufio]
   dm_block_manager_create+0x5e/0x90 [dm_persistent_data]
   __create_persistent_data_objects+0x38/0x940 [dm_thin_pool]
   dm_pool_abort_metadata+0x64/0x90 [dm_thin_pool]
   metadata_operation_failed+0x59/0x100 [dm_thin_pool]
   alloc_data_block.isra.53+0x86/0x180 [dm_thin_pool]
   process_cell+0x2a3/0x550 [dm_thin_pool]
   do_worker+0x28d/0x8f0 [dm_thin_pool]
   process_one_work+0x171/0x370
   worker_thread+0x49/0x3f0
   kthread+0xf8/0x130
   ret_from_fork+0x35/0x40
  kobject_add_internal failed for :a-0000144 with -EEXIST, don't try to register things with the same name in the same directory.
  kmem_cache_create(dm_bufio_buffer-16) failed with error -17

Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1806151817130.6333@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Mike Snitzer <snitzer@redhat.com>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:25:04 +02:00
Jia He
6f23028480 mm/ksm.c: ignore STABLE_FLAG of rmap_item->address in rmap_walk_ksm()
commit 1105a2fc022f3c7482e32faf516e8bc44095f778 upstream.

On our armv8a server (QDF2400), I noticed many WARN_ONs caused by
rmap_item->address not being PAGE_SIZE aligned under memory pressure
tests (start 20 guests and run memhog in the host).

  WARNING: CPU: 4 PID: 4641 at virt/kvm/arm/mmu.c:1826 kvm_age_hva_handler+0xc0/0xc8
  CPU: 4 PID: 4641 Comm: memhog Tainted: G        W 4.17.0-rc3+ #8
  Call trace:
   kvm_age_hva_handler+0xc0/0xc8
   handle_hva_to_gpa+0xa8/0xe0
   kvm_age_hva+0x4c/0xe8
   kvm_mmu_notifier_clear_flush_young+0x54/0x98
   __mmu_notifier_clear_flush_young+0x6c/0xa0
   page_referenced_one+0x154/0x1d8
   rmap_walk_ksm+0x12c/0x1d0
   rmap_walk+0x94/0xa0
   page_referenced+0x194/0x1b0
   shrink_page_list+0x674/0xc28
   shrink_inactive_list+0x26c/0x5b8
   shrink_node_memcg+0x35c/0x620
   shrink_node+0x100/0x430
   do_try_to_free_pages+0xe0/0x3a8
   try_to_free_pages+0xe4/0x230
   __alloc_pages_nodemask+0x564/0xdc0
   alloc_pages_vma+0x90/0x228
   do_anonymous_page+0xc8/0x4d0
   __handle_mm_fault+0x4a0/0x508
   handle_mm_fault+0xf8/0x1b0
   do_page_fault+0x218/0x4b8
   do_translation_fault+0x90/0xa0
   do_mem_abort+0x68/0xf0
   el0_da+0x24/0x28

In rmap_walk_ksm, rmap_item->address might still carry the
STABLE_FLAG, so the start and end passed to handle_hva_to_gpa might not
be PAGE_SIZE aligned.  This causes exceptions in handle_hva_to_gpa
on arm64.

This patch fixes it by ignoring (not removing) the low bits of address
when doing rmap_walk_ksm.
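
A minimal user-space sketch of that idea, assuming 4 KiB pages and a flag
kept in the low bits of the stored address.  The constants and the helper
name ksm_walk_addr are illustrative and are not taken from this patch:

```c
#include <assert.h>

/* Assumed values for illustration: 4 KiB pages, flag stored below PAGE_SIZE. */
#define PAGE_SIZE	4096UL
#define PAGE_MASK	(~(PAGE_SIZE - 1))
#define STABLE_FLAG	0x200UL

/*
 * Hypothetical helper: return the page-aligned address to walk with.
 * The stored value is passed by value, so its flag bits stay intact
 * in the caller (ignored, not removed).
 */
static unsigned long ksm_walk_addr(unsigned long stored_address)
{
	/* The sketch only works if the flag lives below the page boundary. */
	assert((STABLE_FLAG & PAGE_MASK) == 0);
	return stored_address & PAGE_MASK;
}
```

Masking a local copy keeps the metadata in place for other users of the
stored address while the walk itself only ever sees aligned values.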

IMO, it should be backported to the stable tree.  The storm of WARN_ONs is
very easy for me to reproduce.  Beyond that, I saw a panic (not
reproducible) as follows:

  page:ffff7fe003742d80 count:-4871 mapcount:-2126053375 mapping: (null) index:0x0
  flags: 0x1fffc00000000000()
  raw: 1fffc00000000000 0000000000000000 0000000000000000 ffffecf981470000
  raw: dead000000000100 dead000000000200 ffff8017c001c000 0000000000000000
  page dumped because: nonzero _refcount
  CPU: 29 PID: 18323 Comm: qemu-kvm Tainted: G W 4.14.15-5.hxt.aarch64 #1
  Hardware name: <snip for confidential issues>
  Call trace:
    dump_backtrace+0x0/0x22c
    show_stack+0x24/0x2c
    dump_stack+0x8c/0xb0
    bad_page+0xf4/0x154
    free_pages_check_bad+0x90/0x9c
    free_pcppages_bulk+0x464/0x518
    free_hot_cold_page+0x22c/0x300
    __put_page+0x54/0x60
    unmap_stage2_range+0x170/0x2b4
    kvm_unmap_hva_handler+0x30/0x40
    handle_hva_to_gpa+0xb0/0xec
    kvm_unmap_hva_range+0x5c/0xd0

I even injected a fault on purpose in kvm_unmap_hva_range by setting
size=size-0x200; the call trace is similar to the one above.  So I think
the panic shares the root cause of the WARN_ON.

Andrea said:

: It looks a straightforward safe fix, on x86 hva_to_gfn_memslot would
: zap those bits and hide the misalignment caused by the low metadata
: bits being erroneously left set in the address, but the arm code
: notices when that's the last page in the memslot and the hva_end is
: getting aligned and the size is below one page.
:
: I think the problem triggers in the addr += PAGE_SIZE of
: unmap_stage2_ptes that never matches end because end is aligned but
: addr is not.
:
: 	} while (pte++, addr += PAGE_SIZE, addr != end);
:
: x86 again only works on hva_start/hva_end after converting it to
: gfn_start/end and that being in pfn units the bits are zapped before
: they risk to cause trouble.
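
The termination problem Andrea describes can be reproduced in user space.
This is only a sketch: the page size is assumed, and the iteration limit is
an addition so the demonstration can stop (the quoted loop has no such
limit):

```c
#include <assert.h>

#define PAGE_SIZE 4096UL	/* assumed page size for the sketch */

/*
 * Mimic the shape of the quoted loop.  Returns the number of iterations,
 * or -1 if `limit` iterations pass without addr ever equalling end.
 */
static long walk_until_end(unsigned long addr, unsigned long end, long limit)
{
	long n = 0;

	assert(end % PAGE_SIZE == 0);	/* the quote says end is aligned */
	do {
		if (++n > limit)
			return -1;
	} while (addr += PAGE_SIZE, addr != end);
	return n;
}
```

With an aligned start the loop meets end exactly; with a start that still
carries low flag bits, addr steps past end without ever comparing equal.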

Jia He said:

: I've tested this myself on an arm64 server (QDF2400, 46 cpus, 96G mem).
: Without this patch, the WARN_ON is very easy to reproduce.  After this
: patch, I have run the same benchmark for a whole day without any WARN_ONs.

Link: http://lkml.kernel.org/r/1525403506-6750-1-git-send-email-hejianet@gmail.com
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Jia He <hejianet@gmail.com>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:25:03 +02:00
Dan Williams
2d329968a8 mm: fix __gup_device_huge vs unmap
commit a9b6de77b1a3ff729f7bfc54b2e17711776a416c upstream.

get_user_pages_fast() for device pages is missing the typical validation
that all page references have been taken while the mapping was valid.
Without this validation truncate operations can not reliably coordinate
against new page reference events like O_DIRECT.
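
The validation pattern referred to here can be sketched in user space as
"take the references, then re-check that the mapping entry did not change,
and undo on mismatch".  The types and the function name below are
hypothetical stand-ins, not the kernel's:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page with a reference count. */
struct fake_page { atomic_int refcount; };

/*
 * Take one reference on each of `nr` pages, then confirm that the mapping
 * entry still holds the value seen before the references were taken.
 * Returns false, with all references dropped again, if it changed.
 */
static bool grab_refs_validated(_Atomic unsigned long *entry,
				struct fake_page *pages, int nr)
{
	unsigned long orig = atomic_load(entry);	/* snapshot the entry */
	int i;

	assert(nr >= 0);
	for (i = 0; i < nr; i++)
		atomic_fetch_add(&pages[i].refcount, 1);

	/* A changed entry means the mapping may already be gone: undo. */
	if (atomic_load(entry) != orig) {
		for (i = 0; i < nr; i++)
			atomic_fetch_sub(&pages[i].refcount, 1);
		return false;
	}
	return true;
}
```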

Cc: <stable@vger.kernel.org>
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:24:57 +02:00
qctecmdr Service
b2f1f26ad0 Merge "ion: invalidate the pool pointers after free" 2018-06-29 19:34:49 -07:00
Minchan Kim
08d7320937 mm: disable fault around on emulated access bit architecture
fault_around aims to reduce minor faults of file-backed pages via
speculative ahead pte mapping and relying on readahead logic.  However,
on non-HW access bit architecture the benefit is highly limited because
they should emulate the young bit with minor faults for reclaim's page
aging algorithm.  IOW, we cannot reduce minor faults on those
architectures.

I did a quick test on my ARM machine.

512M file mmap sequential every word read on eSATA drive 4 times.
stddev is stable.

  = fault_around 4096 =
  elapsed time(usec): 6747645

  = fault_around 65536 =
  elapsed time(usec): 6709263

  0.5% gain.
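
The gain figure follows from the two elapsed times above.  A small sketch
of the arithmetic (the helper name percent_gain is illustrative):

```c
#include <assert.h>

/* Relative improvement of `after_usec` over `before_usec`, in percent. */
static double percent_gain(double before_usec, double after_usec)
{
	assert(before_usec > 0.0);
	return (before_usec - after_usec) / before_usec * 100.0;
}
```

For the two runs quoted, percent_gain(6747645, 6709263) evaluates to
roughly 0.57, which the message rounds down to 0.5%.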

Even when I tested it with eMMC there is no gain because I guess with
slow storage the major fault is the dominant factor.

Also, fault_around has the side effect of shrinking slab more
aggressively and causes higher vmpressure, so if such speculation fails,
it can evict slab more which can result in page I/O (e.g., inode cache).
In the end, that would void any benefit of fault_around.

So let's make the default "disabled" on those architectures.

Change-Id: I5e6b74943c95f6779b3a6e463b4d0a8b27eaac01
Link: http://lkml.kernel.org/r/20160518014229.GB21538@bbox
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Git-commit: d0834a6c2c5b0c76cfb806bd7dba6556d8b4edbb
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-06-28 23:22:08 -07:00
Vinayak Menon
ad50bf626a mm: make faultaround produce old ptes
Based on Kirill's patch [1].

Currently, faultaround code produces young pte.  This can screw up
vmscan behaviour[2], as it makes vmscan think that these pages are hot
and not push them out on first round.

During sparse file access faultaround gets more pages mapped and all of
them are young. Under memory pressure, this makes vmscan swap out anon
pages instead, or to drop other page cache pages which otherwise stay
resident.

Modify faultaround to produce old ptes if sysctl 'want_old_faultaround_pte'
is set, so they can easily be reclaimed under memory pressure.
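
As a rough illustration of the switch, assuming a pte modelled as a plain
bit field (the PTE_YOUNG bit position and the helper name are made up for
this sketch, only the sysctl name comes from the text above):

```c
#include <assert.h>
#include <stdint.h>

#define PTE_YOUNG (1u << 0)	/* assumed bit position, illustration only */

/*
 * Build the pte that faultaround would install.  `want_old` stands in for
 * the value of the 'want_old_faultaround_pte' sysctl (0 or 1).
 */
static uint32_t faultaround_pte(uint32_t pte, int want_old)
{
	assert(want_old == 0 || want_old == 1);
	if (want_old)
		return pte & ~PTE_YOUNG;	/* old: easy to reclaim */
	return pte | PTE_YOUNG;			/* young: looks recently used */
}
```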

This can to some extent defeat the purpose of faultaround on machines
without hardware accessed bit as it will not help us with reducing the
number of minor page faults.

Making the faultaround ptes old results in a unixbench regression for some
architectures [3][4]. But on some architectures like arm64 it is not found
to cause any regression.

unixbench shell8 scores on arm64 v8.2 hardware with CONFIG_ARM64_HW_AFDBM
enabled  (5 runs min, max, avg):
Base: (741,748,744)
With this patch: (739,748,743)

So by default produce young ptes and provide a sysctl option to make the
ptes old.

[1] https://marc.info/?l=linux-mm&m=146348837703148
[2] https://lkml.org/lkml/2016/4/18/612
[3] https://marc.info/?l=linux-kernel&m=146582237922378&w=2
[4] https://marc.info/?l=linux-mm&m=146589376909424&w=2

Change-Id: I193185cc953bc33a44fc24963a9df9e555906d95
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Patch-mainline: linux-mm @ Fri, 19 Jan 2018 17:24:54
[vinmenon@codeaurora.org: enable by default since arm works well
with old fault_around ptes + edit the links in commit message to
fix checkpatch issues]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-06-28 22:34:46 -07:00
qctecmdr Service
2d21e5e538 Merge "add documentation about reclaim knob on proc.txt" 2018-06-28 21:55:27 -07:00
Isaac J. Manjarres
d46b5c945c Merge android-4.14.52 (08850d5) into msm-4.14
* remotes/origin/tmp-08850d5:
  Linux 4.14.52
  mm, page_alloc: do not break __GFP_THISNODE by zonelist reset
  fs/binfmt_misc.c: do not allow offset overflow
  vhost: fix info leak due to uninitialized memory
  HID: wacom: Correct logical maximum Y for 2nd-gen Intuos Pro large
  HID: intel_ish-hid: ipc: register more pm callbacks to support hibernation
  orangefs: report attributes_mask and attributes for statx
  orangefs: set i_size on new symlink
  iwlwifi: fw: harden page loading code
  x86/intel_rdt: Enable CMT and MBM on new Skylake stepping
  w1: mxc_w1: Enable clock before calling clk_get_rate() on it
  libata: Drop SanDisk SD7UB3Q*G1001 NOLPM quirk
  libata: zpodd: small read overflow in eject_tray()
  cpufreq: governors: Fix long idle detection logic in load calculation
  cpufreq: Fix new policy initialization during limits updates via sysfs
  bdi: Move cgroup bdi_writeback to a dedicated low concurrency workqueue
  blk-mq: reinit q->tag_set_list entry only after grace period
  nbd: use bd_set_size when updating disk size
  nbd: update size when connected
  nbd: fix nbd device deletion
  cifs: For SMB2 security informaion query, check for minimum sized security descriptor instead of sizeof FileAllInformation class
  CIFS: 511c54a2f69195b28afb9dd119f03787b1625bb4 adds a check for session expiry
  smb3: on reconnect set PreviousSessionId field
  smb3: fix various xid leaks
  x86/MCE: Fix stack out-of-bounds write in mce-inject.c: Flags_read()
  ALSA: hda: add dock and led support for HP ProBook 640 G4
  ALSA: hda: add dock and led support for HP EliteBook 830 G5
  ALSA: hda - Handle kzalloc() failure in snd_hda_attach_pcm_stream()
  ALSA: hda/conexant - Add fixup for HP Z2 G4 workstation
  ALSA: hda/realtek - Enable mic-mute hotkey for several Lenovo AIOs
  btrfs: scrub: Don't use inode pages for device replace
  btrfs: return error value if create_io_em failed in cow_file_range
  Btrfs: fix memory and mount leak in btrfs_ioctl_rm_dev_v2()
  Btrfs: fix clone vs chattr NODATASUM race
  driver core: Don't ignore class_dir_create_and_add() failure.
  ext4: fix fencepost error in check for inode count overflow during resize
  ext4: correctly handle a zero-length xattr with a non-zero e_value_offs
  ext4: bubble errors from ext4_find_inline_data_nolock() up to ext4_iget()
  ext4: do not allow external inodes for inline data
  ext4: update mtime in ext4_punch_hole even if no blocks are released
  ext4: fix hole length detection in ext4_ind_map_blocks()
  NFSv4.1: Fix up replays of interrupted requests
  tls: fix use-after-free in tls_push_record
  hv_netvsc: Fix a network regression after ifdown/ifup
  net: in virtio_net_hdr only add VLAN_HLEN to csum_start if payload holds vlan
  udp: fix rx queue len reported by diag and proc interface
  socket: close race condition between sock_close() and sockfs_setattr()
  tcp: verify the checksum of the first data segment in a new connection
  net/sched: act_simple: fix parsing of TCA_DEF_DATA
  net: dsa: add error handling for pskb_trim_rcsum
  ipv6: allow PMTU exceptions to local routes
  cdc_ncm: avoid padding beyond end of skb
  bonding: re-evaluate force_primary when the primary slave name changes
  ANDROID: sdcardfs: fix potential crash when reserved_mb is not zero
  ANDROID: xt_qtaguid: Remove unnecessary null checks to device's name
  ANDROID: Add kconfig to make dm-verity check_at_most_once default enabled

Conflicts:
	net/netfilter/xt_qtaguid.c

Change-Id: I5c94ff8a691b9d84899d7863fbd309aa41c5c338
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2018-06-28 09:55:21 -07:00
Isaac J. Manjarres
bbea3fef30 Merge android-4.14.51 (a51b40c) into msm-4.14
* remotes/origin/tmp-a51b40c:
  Linux 4.14.51
  tcp: do not overshoot window_clamp in tcp_rcv_space_adjust()
  Btrfs: make raid6 rebuild retry more
  Btrfs: fix scrub to repair raid6 corruption
  Revert "Btrfs: fix scrub to repair raid6 corruption"
  ARM: kexec: fix kdump register saving on panic()
  ARM: 8758/1: decompressor: restore r1 and r2 just before jumping to the kernel
  ARM: 8753/1: decompressor: add a missing parameter to the addruart macro
  efi/libstub/arm64: Handle randomized TEXT_OFFSET
  parisc: Move setup_profiling_timer() out of init section
  sched/deadline: Make the grub_reclaim() function static
  sched/debug: Move the print_rt_rq() and print_dl_rq() declarations to kernel/sched/sched.h
  drm/dumb-buffers: Integer overflow in drm_mode_create_ioctl()
  locking/percpu-rwsem: Annotate rwsem ownership transfer by setting RWSEM_OWNER_UNKNOWN
  locking/rwsem: Add a new RWSEM_ANONYMOUSLY_OWNED flag
  clk: imx6ull: use OSC clock during AXI rate change
  ARM: davinci: board-dm646x-evm: set VPIF capture card name
  ARM: davinci: board-dm646x-evm: pass correct I2C adapter id for VPIF
  ARM: davinci: dm646x: fix timer interrupt generation
  i2c: viperboard: return message count on master_xfer success
  i2c: pmcmsp: fix error return from master_xfer
  i2c: pmcmsp: return message count on master_xfer success
  ARM: keystone: fix platform_domain_notifier array overrun
  usb: musb: fix remote wakeup racing with suspend
  afs: Fix the non-encryption of calls
  mtd: Fix comparison in map_word_andequal()
  x86/pkeys/selftests: Add a test for pkey 0
  x86/pkeys/selftests: Save off 'prot' for allocations
  x86/pkeys/selftests: Fix pointer math
  x86/pkeys/selftests: Fix pkey exhaustion test off-by-one
  x86/pkeys/selftests: Add PROT_EXEC test
  x86/pkeys/selftests: Factor out "instruction page"
  x86/pkeys/selftests: Allow faults on unknown keys
  x86/pkeys/selftests: Remove dead debugging code, fix dprint_in_signal
  x86/pkeys/selftests: Stop using assert()
  x86/pkeys/selftests: Give better unexpected fault error messages
  x86/selftests: Add mov_to_ss test
  x86/mpx/selftests: Adjust the self-test to fresh distros that export the MPX ABI
  x86/pkeys/selftests: Adjust the self-test to fresh distros that export the pkeys ABI
  objtool, kprobes/x86: Sync the latest <asm/insn.h> header with tools/objtool/arch/x86/include/asm/insn.h
  uprobes/x86: Prohibit probing on MOV SS instruction
  kprobes/x86: Prohibit probing on exception masking instructions
  ocfs2: take inode cluster lock before moving reflinked inode from orphan dir
  proc/kcore: don't bounds check against address 0
  init: fix false positives in W+X checking
  net sched actions: fix invalid pointer dereferencing if skbedit flags missing
  ixgbe: return error on unsupported SFP module when resetting
  x86: Delay skip of emulated hypercall instruction
  KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs
  rxrpc: Fix the min security level for kernel calls
  rxrpc: Fix error reception on AF_INET6 sockets
  qede: Fix gfp flags sent to rdma event node allocation
  qed: Fix l2 initializations over iWARP personality
  tipc: eliminate KMSAN uninit-value in strcmp complaint
  agp: uninorth: make two functions static
  cifs: smb2ops: Fix listxattr() when there are no EAs
  arm64: Add MIDR encoding for NVIDIA CPUs
  can: dev: increase bus-off message severity
  net: aquantia: driver should correctly declare vlan_features bits
  x86/xen: Reset VCPU0 info pointer after shared_info remap
  mac80211: use timeout from the AddBA response instead of the request
  ARM: dts: cygnus: fix irq type for arm global timer
  driver core: add __printf verification to __ata_ehi_pushv_desc
  drm/omap: handle alloc failures in omap_connector
  drm/omap: check return value from soc_device_match
  drm/omap: fix possible NULL ref issue in tiler_reserve_2d
  drm/omap: fix uninitialized ret variable
  drm/omap: silence unititialized variable warning
  mac80211: Adjust SAE authentication timeout
  tee: check shm references are consistent in offset/size
  sh: fix build failure for J2 cpu with SMP disabled
  sched/core: Introduce set_special_state()
  spi: bcm2835aux: ensure interrupts are enabled for shared handler
  RDMA/cma: Do not query GID during QP state transition to RTR
  IB/hfi1: Fix memory leak in exception path in get_irq_affinity()
  IB/hfi1 Use correct type for num_user_context
  smc: fix sendpage() call
  ARM: OMAP1: ams-delta: fix deferred_fiq handler
  nvme: Set integrity flag for user passthrough commands
  nvme: fix potential memory leak in option parsing
  iommu/vt-d: fix shift-out-of-bounds in bug checking
  arm64: tegra: Make BCM89610 PHY interrupt as active low
  kthread, sched/wait: Fix kthread_parkme() wait-loop
  stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
  parisc: drivers.c: Fix section mismatches
  bpf, x64: fix memleak when not converging after image
  scsi: vmw-pvscsi: return DID_BUS_BUSY for adapter-initated aborts
  hexagon: export csum_partial_copy_nocheck
  hexagon: add memset_io() helper
  Input: atmel_mxt_ts - fix the firmware update
  ARM: dts: logicpd-som-lv: Fix Audio Mute
  ARM: dts: logicpd-som-lv: Fix WL127x Startup Issues
  ARM: OMAP2+: powerdomain: use raw_smp_processor_id() for trace
  dt-bindings: panel: lvds: Fix path to display timing bindings
  ARM: davinci: board-dm355-evm: fix broken networking
  ARM: davinci: board-omapl138-hawk: fix GPIO numbers for MMC/SD lookup
  ARM: davinci: board-da850-evm: fix GPIO lookup for MMC/SD
  ARM: davinci: board-da830-evm: fix GPIO lookup for MMC/SD
  IB/core: Make ib_mad_client_id atomic
  <linux/stringhash.h>: fix end_name_hash() for 64bit long
  IB/rxe: avoid double kfree_skb
  IB/rxe: add RXE_START_MASK for rxe_opcode IB_OPCODE_RC_SEND_ONLY_INV
  RDMA/iwpm: fix memory leak on map_info
  RDMA/cma: Fix use after destroy access to net namespace for IPoIB
  IB/uverbs: Fix validating mandatory attributes
  IB: make INFINIBAND_ADDR_TRANS configurable
  ib_srp: depend on INFINIBAND_ADDR_TRANS
  ib_srpt: depend on INFINIBAND_ADDR_TRANS
  nvmet-rdma: depend on INFINIBAND_ADDR_TRANS
  nvme: depend on INFINIBAND_ADDR_TRANS
  tipc: fix bug in function tipc_nl_node_dump_monitor
  i2c: sprd: Fix the i2c count issue
  i2c: sprd: Prevent i2c accesses after suspend is called
  bpf: fix uninitialized variable in bpf tools
  x86/cpu/intel: Add missing TLB cpuid values
  ata: ahci: mvebu: override ahci_stop_engine for mvebu AHCI
  libahci: Allow drivers to override stop_engine
  KVM: arm/arm64: vgic: fix possible spectre-v1 in vgic_mmio_read_apr()
  arm64: fix possible spectre-v1 in ptrace_hbp_get_event()
  blk-mq: fix sysfs inflight counter
  HID: intel-ish-hid: use put_device() instead of kfree()
  rpmsg: added MODULE_ALIAS for rpmsg_char
  remoteproc: qcom: Fix potential device node leaks
  perf/x86/intel: Don't enable freeze-on-smi for PerfMon V1
  rds: ib: Fix missing call to rds_ib_dev_put in rds_ib_setup_qp
  selftests: ftrace: Add a testcase for multiple actions on trigger
  HID: wacom: Release device resource data obtained by devres_alloc()
  HID: lenovo: Add support for IBM/Lenovo Scrollpoint mice
  arm64: ptrace: remove addr_limit manipulation
  net: ethtool: Add missing kernel doc for FEC parameters
  thermal: int3403_thermal: Fix NULL pointer deref on module load / probe
  drm/amdkfd: fix clock counter retrieval for node without GPU
  ACPI / watchdog: Prefer iTCO_wdt on Lenovo Z50-70
  ARM: dts: da850: fix W=1 warnings with pinmux node
  net: phy: marvell: clear wol event before setting it
  powerpc/powernv/memtrace: Let the arch hotunplug code flush cache
  dt-bindings: meson-uart: DT fix s/clocks-names/clock-names/
  ACPI / PM: Blacklist Low Power S0 Idle _DSM for ThinkPad X1 Tablet(2016)
  usb: typec: ucsi: fix tracepoint related build error
  mm: memcg: add __GFP_NOWARN in __memcg_schedule_kmem_cache_create()
  kexec_file: do not add extra alignment to efi memmap
  proc: revalidate kernel thread inodes to root:root
  mm, pagemap: fix swap offset value for PMD migration entry
  scsi: isci: Fix infinite loop in while loop
  scsi: storvsc: Set up correct queue depth values for IDE devices
  parisc: time: Convert read_persistent_clock() to read_persistent_clock64()
  vfs: Undo an overly zealous MS_RDONLY -> SB_RDONLY conversion
  net: hns: Avoid action name truncation
  blkcg: init root blkcg_gq under lock
  drm/msm: don't deref error pointer in the msm_fbdev_create error path
  drm/msm/dsi: use correct enum in dsi_get_cmd_fmt
  drm/msm: Fix possible null dereference on failure of get_pages()
  ASoC: msm8916-wcd-analog: use threaded context for mbhc events
  netfilter: nf_tables: fix out-of-bounds in nft_chain_commit_update
  netfilter: nf_tables: NAT chain and extensions require NF_TABLES
  scsi: target: fix crash with iscsi target and dvd
  scsi: megaraid_sas: Do not log an error if FW successfully initializes.
  scsi: iscsi: respond to netlink with unicast when appropriate
  tipc: fix infinite loop when dumping link monitor summary
  blkcg: don't hold blkcg lock when deactivating policy
  spi: cadence: Add usleep_range() for cdns_spi_fill_tx_fifo()
  ASoC: topology: Check widget kcontrols before deref.
  xen: xenbus_dev_frontend: Really return response string
  ASoC: topology: Fix bugs of freeing soc topology
  PCI: kirin: Fix reset gpio name
  soc: bcm2835: Make !RASPBERRYPI_FIRMWARE dummies return failure
  soc: bcm: raspberrypi-power: Fix use of __packed
  eCryptfs: don't pass up plaintext names when using filename encryption
  ASoC: rt5514: Add the missing register in the readable table
  clk: honor CLK_MUX_ROUND_CLOSEST in generic clk mux
  dt-bindings: dmaengine: rcar-dmac: document R8A77965 support
  dt-bindings: serial: sh-sci: Add support for r8a77965 (H)SCIF
  dt-bindings: pinctrl: sunxi: Fix reference to driver
  doc: Add vendor prefix for Kieback & Peter GmbH
  spi: sh-msiof: Fix bit field overflow writes to TSCR/RSCR
  MIPS: dts: Boston: Fix PCI bus dtc warnings:
  isofs: fix potential memory leak in mount option parsing
  s390/smsgiucv: disable SMSG on module unload
  MIPS: io: Add barrier after register read in readX()
  fsnotify: fix ignore mask logic in send_to_group()
  perf report: Fix switching to another perf.data file
  nfp: ignore signals when communicating with management FW
  MIPS: io: Prevent compiler reordering writeX()
  x86: Add check for APIC access address for vmentry of L2 guests
  KVM: X86: fix incorrect reference of trace_kvm_pi_irte_update
  Input: synaptics-rmi4 - fix an unchecked out of memory error path
  clocksource/drivers/imx-tpm: Correct some registers operation flow

  stop_machine: Disable preemption when waking two stopper threads

  When cpu_stop_queue_two_works() begins to wake the stopper
  threads, it does so without preemption disabled, which leads
  to the following race condition:

  The source CPU calls cpu_stop_queue_two_works(), with cpu1
  as the source CPU, and cpu2 as the destination CPU. When
  adding the stopper threads to the wake queue used in this
  function, the source CPU stopper thread is added first,
  and the destination CPU stopper thread is added last.

  When wake_up_q() is invoked to wake the stopper threads, the
  threads are woken up in the order that they are queued in,
  so the source CPU's stopper thread is woken up first, and
  it preempts the thread running on the source CPU.

  The stopper thread will then execute on the source CPU,
  disable preemption, and begin executing multi_cpu_stop()
  and wait for an ack from the destination CPU's stopper thread,
  with preemption still disabled. Since the worker thread that
  woke up the stopper thread on the source CPU is affine to the
  source CPU, and preemption is disabled on the source CPU, that
  thread will never run to dequeue the destination CPU's stopper
  thread from the wake queue, and thus, the destination CPU's
  stopper thread will never run, causing the source CPU's stopper
  thread to wait forever, and stall.

  Disable preemption when waking the stopper threads in
  cpu_stop_queue_two_works() to ensure that the worker thread
  that is waking up the stopper threads isn't preempted
  by the source CPU's stopper thread, and permanently
  scheduled out, leaving the remaining stopper thread asleep
  in the wake queue.

Conflicts:
	drivers/gpu/drm/msm/msm_gem.c
	include/linux/sched.h
	kernel/kthread.c

Change-Id: I177cb8516cdfe50d61cb948ed342d330e61376a1
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2018-06-28 09:30:40 -07:00
qctecmdr Service
26ca81d660 Merge "Revert "mm: memory: reduce fault_around_bytes"" 2018-06-26 17:37:43 -07:00
Vinayak Menon
0b4f73c9b4 Revert "mm: memory: reduce fault_around_bytes"
This reverts commit 23f8a2f5f2e0.

fault_around_bytes was reduced because it was found to cause reclaim
issues. The reclaim issues were mainly caused by faultaround
producing young ptes. The following patches will make faultaround
produce old ptes, so revert fault_around_bytes to its
original value.

Change-Id: If9a385eead5a5d7a0d40fed41ddf4753e15d2998
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-06-26 13:23:12 +05:30
Minchan Kim
62478c622a mm: Support address range reclaim
This patch adds address range reclaim for a process.
The requirement is as follows.

Some applications, like webkit1, use a single address space to handle
multiple tabs.  IOW, they use a *one* process model, so all tabs share
the address space of the process. In such a scenario, per-process
reclaim is rather coarse-grained, so this patch supports finer-grained
reclaim of a target address range of the process.
To reclaim a target range, use the following format.

	echo [addr] [size-byte] > /proc/pid/reclaim

The addr should be page-aligned.

So the reclaim knob's interface is now as follows.

echo file > /proc/pid/reclaim
	reclaim file-backed pages only
echo anon > /proc/pid/reclaim
	reclaim anonymous pages only
echo all > /proc/pid/reclaim
	reclaim all pages
echo 0x100000 8K > /proc/pid/reclaim
	reclaim pages in (0x100000 - 0x102000)
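
A user-space sketch of how such a range string could be parsed, assuming
4 KiB pages and K/M/G size suffixes.  The struct and function names are
hypothetical, and the kernel side may parse the input differently:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define PAGE_SIZE 4096UL	/* assumed page size */

struct reclaim_range { unsigned long start, end; };

/*
 * Parse "<addr> <size>[K|M|G]" into a half-open range [start, end).
 * Returns false on malformed input or a non-page-aligned address.
 */
static bool parse_reclaim_range(const char *buf, struct reclaim_range *out)
{
	char *p;
	unsigned long addr, size;

	assert(buf && out);
	addr = strtoul(buf, &p, 0);
	if (p == buf || *p != ' ')
		return false;
	buf = p + 1;
	size = strtoul(buf, &p, 0);
	if (p == buf)
		return false;
	switch (*p) {
	case 'G': size <<= 10; /* fall through */
	case 'M': size <<= 10; /* fall through */
	case 'K': size <<= 10; p++; break;
	default: break;
	}
	if (*p != '\0' && *p != '\n')
		return false;
	if (addr % PAGE_SIZE || size == 0)
		return false;
	out->start = addr;
	out->end = addr + size;
	return true;
}
```

Under these assumptions "0x100000 8K" yields start 0x100000 and end
0x102000, matching the last example above, while keywords such as "file"
are rejected by this parser and would be handled separately.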

Change-Id: I111131d31be1cfcfa246617b634a9a8bc4078098
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 08:39:01
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-06-26 11:18:05 +05:30
Minchan Kim
434fe4e62b mm: Enhance per process reclaim to consider shared pages
Some pages could be shared by several processes (e.g., libc).
In that case, it is undesirable to reclaim them right from the start.

This patch causes the VM to keep them in memory until the last task
tries to reclaim them, so shared pages will be reclaimed only if
all of the tasks have swapped them out.

This feature doesn't handle non-linear mappings on ramfs because
doing so is very time-consuming, doesn't guarantee reclaim, and is
not common.

Change-Id: I7e5f34f2e947f5db6d405867fe2ad34863ca40f7
Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:27
[vinmenon@codeaurora.org: trivial merge conflict fixes + changes
to make the patch work with 4.14 kernel + Avoid targetted reclaim
from ksm since vma_address(page) is not valid for a ksm page]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-06-26 11:18:04 +05:30
Minchan Kim
fc50cf4478 mm: Remove shrink_page
With the previous patch, shrink_page_list can handle pages from
multiple zones, so let's remove shrink_page.

Change-Id: I3526377aa6ee6142b8f3ec63396e7ada1e442505
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 22 Apr 2013 17:45:03
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-06-26 11:18:04 +05:30
Minchan Kim
54d3ac19a0 mm: make shrink_page_list with pages work from multiple zones
shrink_page_list expects all pages to come from the same zone,
which is too limiting.

This patch removes that dependency so the next patch can use
shrink_page_list with pages from multiple zones.

Change-Id: I34469b7f0a79f2b79e30e40033ba8b3e1dd5f2d0
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:25
[vinmenon@codeaurora.org: changes for node based lrus.
shrink_page_list expects all pages come from same node.]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-06-26 11:18:03 +05:30
Minchan Kim
e6caa6025c mm: Per process reclaim
These days, there are many platforms available in the embedded market
and they are smarter than the kernel, which has very limited information
about the working set, so they want to be involved in memory management
more heavily, like android's lowmemory killer and ashmem or the many
recent lowmemory notifiers.

One simple scenario for userspace's intelligence is that the platform
can manage tasks as foreground and background, so it would be better
to reclaim a background task's pages for the end-user's *responsiveness*
even though it has frequently referenced pages.

This patch adds a new knob "reclaim" under /proc/<pid>/ so a task manager
can reclaim any target process anytime, anywhere. It gives the platform
another method for using memory efficiently.

It can avoid killing processes to get free memory, which was a really
terrible experience: I lost my best-ever game score after I switched to
a phone call while I was enjoying the game.

Reclaim file-backed pages only.
	echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
	echo anon > /proc/PID/reclaim
Reclaim all pages
	echo all > /proc/PID/reclaim
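
From a task manager written in C, the same writes could look roughly like
this.  It is a sketch only: the helper names are hypothetical, and the
file exists only on kernels that carry this patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Accept only the three keywords listed above. */
static int reclaim_mode_valid(const char *mode)
{
	assert(mode != NULL);
	return !strcmp(mode, "file") || !strcmp(mode, "anon") ||
	       !strcmp(mode, "all");
}

/*
 * Write `mode` to /proc/<pid>/reclaim.  Returns 0 on success, -1 on any
 * error, including a kernel where the knob is not present.
 */
static int reclaim_process(int pid, const char *mode)
{
	char path[64];
	FILE *f;
	int ret;

	if (!reclaim_mode_valid(mode))
		return -1;
	snprintf(path, sizeof(path), "/proc/%d/reclaim", pid);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ret = fputs(mode, f) < 0 ? -1 : 0;
	if (fclose(f) != 0)
		ret = -1;
	return ret;
}
```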

Change-Id: Iabdb7bc2ef3dc4d94e3ea005fbe18f4cd06739ab
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:24
[vinmenon@codeaurora.org: trivial merge conflict fixes,
and minor tweak of the commit msg]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-06-26 11:18:03 +05:30
Minchan Kim
16ef91359c mm: prevent to write out dirty page in CMA by may_writepage
Now, the local variable references in shrink_page_list defaults to
PAGEREF_RECLAIM_CLEAN. This is to prevent reclaiming dirty pages
when CMA tries to migrate pages.
Strictly speaking, we don't need it because CMA already disallows
writeout via .may_writepage = 0 in reclaim_clean_pages_from_list.

Moreover, it prevents anonymous pages from being swapped out when
we use force_reclaim = true in shrink_page_list (e.g., per process
reclaim can do that).

So this patch changes the default value of references to PAGEREF_RECLAIM
and declares .may_writepage = 0 in the scan_control of the CMA part to
make the code clearer.

Change-Id: I5edc3c955d106ecebc4949ce27daf5b7b7a18089
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mgorman@suse.de>
Reported-by: Minkyung Kim <minkyung88@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Patch-mainline: linux-mm @ 9 May 2013 16:21:23
[vinmenon@codeaurora.org: trivial merge conflict fixes]
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-06-26 11:18:02 +05:30
Vinayak Menon
1cf9b916b6 mm: vmpressure: allow in-kernel clients to subscribe for events
Currently, vmpressure is tied to memcg and its events are
available only to userspace clients. This patch removes
the dependency on CONFIG_MEMCG and adds a mechanism for
in-kernel clients to subscribe for vmpressure events (in
fact raw vmpressure values are delivered instead of vmpressure
levels, to provide clients more flexibility to take actions
on custom pressure levels which are not currently defined
by vmpressure module).
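
The subscription idea can be sketched outside the kernel as a simple notifier
list. This is an illustration under assumptions, not the interface added by the
patch: the classes PressureNotifier and ThresholdClient are hypothetical, and
the raw pressure value is assumed to be a number from 0 to 100.

```python
class PressureNotifier:
    """Toy stand-in for a chain of in-kernel vmpressure subscribers."""

    def __init__(self):
        self._subscribers = []

    def register(self, callback):
        self._subscribers.append(callback)

    def unregister(self, callback):
        self._subscribers.remove(callback)

    def notify(self, pressure):
        # Deliver the raw value; each client decides what it means.
        for callback in list(self._subscribers):
            callback(pressure)


class ThresholdClient:
    """Client that acts on its own custom pressure level."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.actions = []   # raw pressure values that triggered an action

    def __call__(self, pressure):
        if pressure >= self.threshold:
            self.actions.append(pressure)
```

Because clients receive the raw value rather than a predefined level, two
clients registered on the same notifier can react at different thresholds, for
example one at 70 and another at 95.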

Change-Id: I38010f166546e8d7f12f5f355b5dbfd6ba04d587
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
2018-06-26 11:18:01 +05:30
Greg Kroah-Hartman
08850d51f9 This is the 4.14.52 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlsxg4wACgkQONu9yGCS
 aT5gohAAz5xy4C1KerI0nbJTpmGC1RNRTI1Ynwx8g+E69cEe0DhDqak6o+ZZBgNq
 asoVrDDUi9FkpeJfX2gK023pkbMcdFU9uXadlWtmMXFeXyAteVyw6OgSJOM1qMlH
 4H2XsHyEROpE6lwqVsT5Qk+UnzzjT7ypG3b1czn89szFeJf0mGzExtSTo01VaJad
 wccCwZ5MA1djhS34YZqZfSz1Nb0SUlT7zAoyES8+Cc70wTxT0xv/OhmXtukvTKzW
 5Yr/QS+OEa6eWMt2ObqkJsLB2bZogoR/QIkhEQCPnq+V8/QVrRu0dE0PbjJ2Ocn5
 tpORkQVELl/V7cTjevtcFH6dyH/7C82qHAlW7qRHLvYAuwamTppyt+a0jwyhnOEt
 vkb15A7GRgqwTLDS89M4kUxvR3Kkz5cOdFk95jgv3dkYc43nQvstV6GrXjtW+6oT
 P1tD/2oucwKIrOOx2FLkhETG9vCV408lOBQXK0Jb1bxBUVQTtl8b5mk4xIdmQF5E
 a8WJQYIs3NpCXzIbS2AAp6u82q2Cs931n13vqjIPlQ/fl8uImxZyyC+6hSne3X6y
 dhqERs9uHk9xKSp18K7BxBflyXaW5fWKGh/CmExxIKfIIrNDYAf6HFoSWhKcIbwT
 /g2S3eR5QaeYCmSA02ReBjb8D5PLhpdtM+FEo+xeI0UkGhblhf0=
 =8PkZ
 -----END PGP SIGNATURE-----

Merge 4.14.52 into android-4.14

Changes in 4.14.52
	bonding: re-evaluate force_primary when the primary slave name changes
	cdc_ncm: avoid padding beyond end of skb
	ipv6: allow PMTU exceptions to local routes
	net: dsa: add error handling for pskb_trim_rcsum
	net/sched: act_simple: fix parsing of TCA_DEF_DATA
	tcp: verify the checksum of the first data segment in a new connection
	socket: close race condition between sock_close() and sockfs_setattr()
	udp: fix rx queue len reported by diag and proc interface
	net: in virtio_net_hdr only add VLAN_HLEN to csum_start if payload holds vlan
	hv_netvsc: Fix a network regression after ifdown/ifup
	tls: fix use-after-free in tls_push_record
	NFSv4.1: Fix up replays of interrupted requests
	ext4: fix hole length detection in ext4_ind_map_blocks()
	ext4: update mtime in ext4_punch_hole even if no blocks are released
	ext4: do not allow external inodes for inline data
	ext4: bubble errors from ext4_find_inline_data_nolock() up to ext4_iget()
	ext4: correctly handle a zero-length xattr with a non-zero e_value_offs
	ext4: fix fencepost error in check for inode count overflow during resize
	driver core: Don't ignore class_dir_create_and_add() failure.
	Btrfs: fix clone vs chattr NODATASUM race
	Btrfs: fix memory and mount leak in btrfs_ioctl_rm_dev_v2()
	btrfs: return error value if create_io_em failed in cow_file_range
	btrfs: scrub: Don't use inode pages for device replace
	ALSA: hda/realtek - Enable mic-mute hotkey for several Lenovo AIOs
	ALSA: hda/conexant - Add fixup for HP Z2 G4 workstation
	ALSA: hda - Handle kzalloc() failure in snd_hda_attach_pcm_stream()
	ALSA: hda: add dock and led support for HP EliteBook 830 G5
	ALSA: hda: add dock and led support for HP ProBook 640 G4
	x86/MCE: Fix stack out-of-bounds write in mce-inject.c: Flags_read()
	smb3: fix various xid leaks
	smb3: on reconnect set PreviousSessionId field
	CIFS: 511c54a2f69195b28afb9dd119f03787b1625bb4 adds a check for session expiry
	cifs: For SMB2 security informaion query, check for minimum sized security descriptor instead of sizeof FileAllInformation class
	nbd: fix nbd device deletion
	nbd: update size when connected
	nbd: use bd_set_size when updating disk size
	blk-mq: reinit q->tag_set_list entry only after grace period
	bdi: Move cgroup bdi_writeback to a dedicated low concurrency workqueue
	cpufreq: Fix new policy initialization during limits updates via sysfs
	cpufreq: governors: Fix long idle detection logic in load calculation
	libata: zpodd: small read overflow in eject_tray()
	libata: Drop SanDisk SD7UB3Q*G1001 NOLPM quirk
	w1: mxc_w1: Enable clock before calling clk_get_rate() on it
	x86/intel_rdt: Enable CMT and MBM on new Skylake stepping
	iwlwifi: fw: harden page loading code
	orangefs: set i_size on new symlink
	orangefs: report attributes_mask and attributes for statx
	HID: intel_ish-hid: ipc: register more pm callbacks to support hibernation
	HID: wacom: Correct logical maximum Y for 2nd-gen Intuos Pro large
	vhost: fix info leak due to uninitialized memory
	fs/binfmt_misc.c: do not allow offset overflow
	mm, page_alloc: do not break __GFP_THISNODE by zonelist reset
	Linux 4.14.52

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2018-06-26 09:14:49 +08:00