238 Commits

Author SHA1 Message Date
Richard Raya
b6eaf5c762 Revert "mm: move buddy list manipulations into helpers"
This reverts commit be1968b3980cb973d3034605735ab12e1fa4672a.

Change-Id: I203cdaca4f6fa23584a7b4c1ea15c73ff2137227
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-09-30 01:46:45 -03:00
Richard Raya
3143685e95 Merge branch 'linux-4.14.y' of https://github.com/openela/kernel-lts
* 'linux-4.14.y' of https://github.com/openela/kernel-lts: (278 commits)
  LTS: Update to 4.14.348
  docs: kernel_include.py: Cope with docutils 0.21
  serial: kgdboc: Fix NMI-safety problems from keyboard reset code
  btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks()
  dm: limit the number of targets and parameter size area
  Revert "selftests: mm: fix map_hugetlb failure on 64K page size systems"
  LTS: Update to 4.14.347
  rds: Fix build regression.
  RDS: IB: Use DEFINE_PER_CPU_SHARED_ALIGNED for rds_ib_stats
  af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().
  net: fix out-of-bounds access in ops_init
  drm/vmwgfx: Fix invalid reads in fence signaled events
  dyndbg: fix old BUG_ON in >control parser
  tipc: fix UAF in error path
  usb: gadget: f_fs: Fix a race condition when processing setup packets.
  usb: gadget: composite: fix OS descriptors w_value logic
  firewire: nosy: ensure user_length is taken into account when fetching packet contents
  af_unix: Fix garbage collector racing against connect()
  af_unix: Do not use atomic ops for unix_sk(sk)->inflight.
  ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()
  ...

Change-Id: If329d39dd4e95e14045bb7c58494c197d1352d60
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-06-04 16:33:29 -03:00
Vlastimil Babka
ce55bbe342 mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations
commit 803de9000f334b771afacb6ff3e78622916668b0 upstream.

Sven reports an infinite loop in __alloc_pages_slowpath() for costly order
__GFP_RETRY_MAYFAIL allocations that are also GFP_NOIO.  Such combination
can happen in a suspend/resume context where a GFP_KERNEL allocation can
have __GFP_IO masked out via gfp_allowed_mask.

Quoting Sven:

1. try to do a "costly" allocation (order > PAGE_ALLOC_COSTLY_ORDER)
   with __GFP_RETRY_MAYFAIL set.

2. page alloc's __alloc_pages_slowpath tries to get a page from the
   freelist. This fails because there is nothing free of that costly
   order.

3. page alloc tries to reclaim by calling __alloc_pages_direct_reclaim,
   which bails out because a zone is ready to be compacted; it pretends
   to have made a single page of progress.

4. page alloc tries to compact, but this always bails out early because
   __GFP_IO is not set (it's not passed by the snd allocator, and even
   if it were, we are suspending so the __GFP_IO flag would be cleared
   anyway).

5. page alloc believes reclaim progress was made (because of the
   pretense in item 3) and so it checks whether it should retry
   compaction. The compaction retry logic thinks it should try again,
   because:
    a) reclaim is needed because of the early bail-out in item 4
    b) a zonelist is suitable for compaction

6. goto 2. indefinite stall.

(end quote)

The immediate root cause is confusing the COMPACT_SKIPPED returned from
__alloc_pages_direct_compact() (step 4) due to lack of __GFP_IO to be
indicating a lack of order-0 pages, and in step 5 evaluating that in
should_compact_retry() as a reason to retry, before incrementing and
limiting the number of retries.  There are however other places that
wrongly assume that compaction can happen while we lack __GFP_IO.

To fix this, introduce gfp_compaction_allowed() to abstract the __GFP_IO
evaluation and switch the open-coded test in try_to_compact_pages() to use
it.

Also use the new helper in:
- compaction_ready(), which will make reclaim not bail out in step 3, so
  there's at least one attempt to actually reclaim, even if chances are
  small for a costly order
- in_reclaim_compaction() which will make should_continue_reclaim()
  return false and we don't over-reclaim unnecessarily
- in __alloc_pages_slowpath() to set a local variable can_compact,
  which is then used to avoid retrying reclaim/compaction for costly
  allocations (step 5) if we can't compact and also to skip the early
  compaction attempt that we do in some cases

Link: https://lkml.kernel.org/r/20240221114357.13655-2-vbabka@suse.cz
Fixes: 3250845d0526 ("Revert "mm, oom: prevent premature OOM killer invocation for high order request"")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sven van Ashbrook <svenva@chromium.org>
Closes: https://lore.kernel.org/all/CAG-rBihs_xMKb3wrMO1%2B-%2Bp4fowP9oy1pa_OTkfxBzPUVOZF%2Bg@mail.gmail.com/
Tested-by: Karthikeyan Ramasubramanian <kramasub@chromium.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Curtis Malainey <cujomalainey@chromium.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c82a659cc8bb7a7f8a8348fc7f203c412ae3636f)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-05-30 09:00:42 +00:00
Dan Williams
be1968b398
mm: move buddy list manipulations into helpers
In preparation for runtime randomization of the zone lists, take all
(well, most of) the list_*() functions in the buddy allocator and put
them in helper functions.  Provide a common control point for injecting
additional behavior when freeing pages.

[dan.j.williams@intel.com: fix buddy list helpers]
  Link: http://lkml.kernel.org/r/155033679702.1773410.13041474192173212653.stgit@dwillia2-desk3.amr.corp.intel.com
[vbabka@suse.cz: remove del_page_from_free_area() migratetype parameter]
  Link: http://lkml.kernel.org/r/4672701b-6775-6efd-0797-b6242591419e@suse.cz
Link: http://lkml.kernel.org/r/154899812264.3165233.5219320056406926223.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Robert Elliott <elliott@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-08-25 14:41:21 +00:00
Yu Zhao
bb4a01c36e
BACKPORT: mm/swap.c: don't pass "enum lru_list" to del_page_from_lru_list()
The parameter is redundant in the sense that it can be potentially
extracted from the "struct page" parameter by page_lru(). We need to
make sure that existing PageActive() or PageUnevictable() remains
until the function returns. A few places don't conform, and simple
reordering fixes them.

This patch may have left page_off_lru() seemingly odd, and we'll take
care of it in the next patch.

Link: https://lore.kernel.org/linux-mm/20201207220949.830352-6-yuzhao@google.com/
Link: https://lkml.kernel.org/r/20210122220600.906146-6-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 46ae6b2cc2a47904a368d238425531ea91f3a2a5)
Bug: 228114874
Change-Id: I1e14dcbf4111b39cf155ed3512423448865eb324
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-07-01 09:15:23 +00:00
azrim
acf071c96b
Revert "BACKPORT: mm/swap.c: don't pass "enum lru_list" to del_page_from_lru_list()"
This reverts commit ab4b57fe4682a6246055e00f8dca5b90ea37dce0.
2022-07-01 09:04:09 +00:00
Yu Zhao
ab4b57fe46
BACKPORT: mm/swap.c: don't pass "enum lru_list" to del_page_from_lru_list()
The parameter is redundant in the sense that it can be potentially
extracted from the "struct page" parameter by page_lru(). We need to
make sure that existing PageActive() or PageUnevictable() remains
until the function returns. A few places don't conform, and simple
reordering fixes them.

This patch may have left page_off_lru() seemingly odd, and we'll take
care of it in the next patch.

Link: https://lore.kernel.org/linux-mm/20201207220949.830352-6-yuzhao@google.com/
Link: https://lkml.kernel.org/r/20210122220600.906146-6-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 46ae6b2cc2a47904a368d238425531ea91f3a2a5)

BUG=b:123039911
TEST=Built

Change-Id: Iaf7a9c8c71da7d41c40f566ef9be8ac33c4e012d
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2913975
Reviewed-by: Yu Zhao <yuzhao@chromium.org>
Commit-Queue: Yu Zhao <yuzhao@chromium.org>
Tested-by: Yu Zhao <yuzhao@chromium.org>
Signed-off-by: azrim <mirzaspc@gmail.com>
2022-06-10 15:54:17 +07:00
Blagovest Kolenichev
fbbfccc5a5 Merge android-4.14-q.147 (5c8d069) into msm-4.14
* refs/heads/tmp-5c8d069:
  Revert "net: qrtr: Stop rx_worker before freeing node"
  Linux 4.14.147
  Btrfs: fix race setting up and completing qgroup rescan workers
  btrfs: qgroup: Drop quota_root and fs_info parameters from update_qgroup_status_item
  mm/compaction.c: clear total_{migrate,free}_scanned before scanning a new zone
  md/raid0: avoid RAID0 data corruption due to layout confusion.
  CIFS: Fix oplock handling for SMB 2.1+ protocols
  CIFS: fix max ea value size
  i2c: riic: Clear NACK in tend isr
  hwrng: core - don't wait on add_early_randomness()
  quota: fix wrong condition in is_quota_modification()
  ext4: fix punch hole for inline_data file systems
  ext4: fix warning inside ext4_convert_unwritten_extents_endio
  /dev/mem: Bail out upon SIGKILL.
  cfg80211: Purge frame registrations on iftype change
  md: only call set_in_sync() when it is expected to succeed.
  md: don't report active array_state until after revalidate_disk() completes.
  md/raid6: Set R5_ReadError when there is read failure on parity disk
  btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space
  btrfs: Relinquish CPUs in btrfs_compare_trees
  Btrfs: fix use-after-free when using the tree modification log
  ovl: filter of trusted xattr results in audit
  memcg, kmem: do not fail __GFP_NOFAIL charges
  memcg, oom: don't require __GFP_FS when invoking memcg OOM killer
  gfs2: clear buf_in_tr when ending a transaction in sweep_bh_for_rgrps
  regulator: Defer init completion for a while after late_initcall
  alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP
  arm64: dts: rockchip: limit clock rate of MMC controllers for RK3328
  ARM: zynq: Use memcpy_toio instead of memcpy on smp bring-up
  ARM: samsung: Fix system restart on S3C6410
  ASoC: Intel: Fix use of potentially uninitialized variable
  ASoC: Intel: Skylake: Use correct function to access iomem space
  ASoC: Intel: NHLT: Fix debug print format
  binfmt_elf: Do not move brk for INTERP-less ET_EXEC
  media: sn9c20x: Add MSI MS-1039 laptop to flip_dmi_table
  KVM: x86: Manually calculate reserved bits when loading PDPTRS
  KVM: x86: set ctxt->have_exception in x86_decode_insn()
  KVM: x86: always stop emulation on page fault
  x86/retpolines: Fix up backport of a9d57ef15cbe
  parisc: Disable HP HSC-PCI Cards to prevent kernel crash
  fuse: fix missing unlock_page in fuse_writepage()
  ALSA: hda/realtek - Fixup mute led on HP Spectre x360
  randstruct: Check member structs in is_pure_ops_struct()
  IB/hfi1: Define variables as unsigned long to fix KASAN warning
  printk: Do not lose last line in kmsg buffer dump
  scsi: scsi_dh_rdac: zero cdb in send_mode_select()
  ALSA: firewire-tascam: check intermediate state of clock status and retry
  ALSA: firewire-tascam: handle error code when getting current source of clock
  PM / devfreq: passive: fix compiler warning
  media: omap3isp: Set device on omap3isp subdevs
  btrfs: extent-tree: Make sure we only allocate extents from block groups with the same type
  ALSA: hda/realtek - Blacklist PC beep for Lenovo ThinkCentre M73/93
  media: ttusb-dec: Fix info-leak in ttusb_dec_send_command()
  drm/amd/powerplay/smu7: enforce minimal VBITimeout (v2)
  ALSA: hda - Drop unsol event handler for Intel HDMI codecs
  e1000e: add workaround for possible stalled packet
  libertas: Add missing sentinel at end of if_usb.c fw_table
  raid5: don't increment read_errors on EILSEQ return
  mmc: sdhci: Fix incorrect switch to HS mode
  mmc: core: Clarify sdio_irq_pending flag for MMC_CAP2_SDIO_IRQ_NOTHREAD
  raid5: don't set STRIPE_HANDLE to stripe which is in batch list
  ASoC: dmaengine: Make the pcm->name equal to pcm->id if the name is not set
  s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding
  kprobes: Prohibit probing on BUG() and WARN() address
  dmaengine: ti: edma: Do not reset reserved paRAM slots
  md/raid1: fail run raid1 array when active disk less than one
  hwmon: (acpi_power_meter) Change log level for 'unsafe software power cap'
  ACPI / PCI: fix acpi_pci_irq_enable() memory leak
  ACPI: custom_method: fix memory leaks
  ARM: dts: exynos: Mark LDO10 as always-on on Peach Pit/Pi Chromebooks
  libtraceevent: Change users plugin directory
  iommu/iova: Avoid false sharing on fq_timer_on
  iommu/amd: Silence warnings under memory pressure
  nvmet: fix data units read and written counters in SMART log
  arm64: kpti: ensure patched kernel text is fetched from PoU
  ACPI / CPPC: do not require the _PSD method
  ASoC: es8316: fix headphone mixer volume table
  media: ov9650: add a sanity check
  perf trace beauty ioctl: Fix off-by-one error in cmd->string table
  media: saa7134: fix terminology around saa7134_i2c_eeprom_md7134_gate()
  media: cpia2_usb: fix memory leaks
  media: saa7146: add cleanup in hexium_attach()
  media: cec-notifier: clear cec_adap in cec_notifier_unregister
  PM / devfreq: exynos-bus: Correct clock enable sequence
  PM / devfreq: passive: Use non-devm notifiers
  EDAC/amd64: Decode syndrome before translating address
  EDAC/amd64: Recognize DRAM device type ECC capability
  libperf: Fix alignment trap with xyarray contents in 'perf stat'
  media: dvb-core: fix a memory leak bug
  nbd: add missing config put
  media: hdpvr: add terminating 0 at end of string
  media: radio/si470x: kill urb on error
  ARM: dts: imx7d: cl-som-imx7: make ethernet work again
  net: lpc-enet: fix printk format strings
  media: imx: mipi csi-2: Don't fail if initial state times-out
  media: omap3isp: Don't set streaming state on random subdevs
  media: i2c: ov5645: Fix power sequence
  perf record: Support aarch64 random socket_id assignment
  dmaengine: iop-adma: use correct printk format strings
  media: rc: imon: Allow iMON RC protocol for ffdc 7e device
  media: fdp1: Reduce FCP not found message level to debug
  media: mtk-mdp: fix reference count on old device tree
  perf test vfs_getname: Disable ~/.perfconfig to get default output
  media: gspca: zero usb_buf on error
  sched/fair: Use rq_lock/unlock in online_fair_sched_group
  efi: cper: print AER info of PCIe fatal error
  EDAC, pnd2: Fix ioremap() size in dnv_rd_reg()
  ACPI / processor: don't print errors for processorIDs == 0xff
  md: don't set In_sync if array is frozen
  md: don't call spare_active in md_reap_sync_thread if all member devices can't work
  md/raid1: end bio when the device faulty
  ASoC: rsnd: don't call clk_get_rate() under atomic context
  EDAC/altera: Use the proper type for the IRQ status bits
  ia64:unwind: fix double free for mod->arch.init_unw_table
  ALSA: usb-audio: Skip bSynchAddress endpoint check if it is invalid
  base: soc: Export soc_device_register/unregister APIs
  media: iguanair: add sanity checks
  EDAC/mc: Fix grain_bits calculation
  ALSA: i2c: ak4xxx-adda: Fix a possible null pointer dereference in build_adc_controls()
  ALSA: hda - Show the fatal CORB/RIRB error more clearly
  x86/apic: Soft disable APIC before initializing it
  x86/reboot: Always use NMI fallback when shutdown via reboot vector IPI fails
  sched/core: Fix CPU controller for !RT_GROUP_SCHED
  sched/fair: Fix imbalance due to CPU affinity
  media: i2c: ov5640: Check for devm_gpiod_get_optional() error
  media: hdpvr: Add device num check and handling
  media: exynos4-is: fix leaked of_node references
  media: mtk-cir: lower de-glitch counter for rc-mm protocol
  media: dib0700: fix link error for dibx000_i2c_set_speed
  leds: leds-lp5562 allow firmware files up to the maximum length
  dmaengine: bcm2835: Print error in case setting DMA mask fails
  ASoC: sgtl5000: Fix charge pump source assignment
  regulator: lm363x: Fix off-by-one n_voltages for lm3632 ldo_vpos/ldo_vneg
  ALSA: hda: Flush interrupts on disabling
  nfc: enforce CAP_NET_RAW for raw sockets
  ieee802154: enforce CAP_NET_RAW for raw sockets
  ax25: enforce CAP_NET_RAW for raw sockets
  appletalk: enforce CAP_NET_RAW for raw sockets
  mISDN: enforce CAP_NET_RAW for raw sockets
  net/mlx5: Add device ID of upcoming BlueField-2
  usbnet: sanity checking of packet sizes and device mtu
  usbnet: ignore endpoints with invalid wMaxPacketSize
  skge: fix checksum byte order
  sch_netem: fix a divide by zero in tabledist()
  ppp: Fix memory leak in ppp_write
  openvswitch: change type of UPCALL_PID attribute to NLA_UNSPEC
  net_sched: add max len check for TCA_KIND
  net/sched: act_sample: don't push mac header on ip6gre ingress
  net: qrtr: Stop rx_worker before freeing node
  net/phy: fix DP83865 10 Mbps HDX loopback disable function
  macsec: drop skb sk before calling gro_cells_receive
  cdc_ncm: fix divide-by-zero caused by invalid wMaxPacketSize
  arcnet: provide a buffer big enough to actually receive packets
  f2fs: use generic EFSBADCRC/EFSCORRUPTED
  Bluetooth: btrtl: Additional Realtek 8822CE Bluetooth devices
  xfs: don't crash on null attr fork xfs_bmapi_read
  ACPI: video: Add new hw_changes_brightness quirk, set it on PB Easynote MZ35
  net: don't warn in inet diag when IPV6 is disabled
  drm: Flush output polling on shutdown
  f2fs: fix to do sanity check on segment bitmap of LFS curseg
  dm zoned: fix invalid memory access
  Revert "f2fs: avoid out-of-range memory access"
  blk-mq: move cancel of requeue_work to the front of blk_exit_queue
  PCI: hv: Avoid use of hv_pci_dev->pci_slot after freeing it
  f2fs: check all the data segments against all node ones
  irqchip/gic-v3-its: Fix LPI release for Multi-MSI devices
  locking/lockdep: Add debug_locks check in __lock_downgrade()
  power: supply: sysfs: ratelimit property read error message
  pinctrl: sprd: Use define directive for sprd_pinconf_params values
  objtool: Clobber user CFLAGS variable
  ALSA: hda - Apply AMD controller workaround for Raven platform
  ALSA: hda - Add laptop imic fixup for ASUS M9V laptop
  arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field
  ASoC: fsl: Fix of-node refcount unbalance in fsl_ssi_probe_from_dt()
  media: tvp5150: fix switch exit in set control handler
  iwlwifi: mvm: send BCAST management frames to the right station
  crypto: talitos - fix missing break in switch statement
  mtd: cfi_cmdset_0002: Use chip_good() to retry in do_write_oneword()
  HID: hidraw: Fix invalid read in hidraw_ioctl
  HID: logitech: Fix general protection fault caused by Logitech driver
  HID: sony: Fix memory corruption issue on cleanup.
  HID: prodikeys: Fix general protection fault during probe
  IB/core: Add an unbound WQ type to the new CQ API
  objtool: Query pkg-config for libelf location
  powerpc/xive: Fix bogus error code returned by OPAL
  Revert "Bluetooth: validate BLE connection interval updates"

Conflicts:
	drivers/mmc/core/sdio_irq.c
	fs/f2fs/data.c
	fs/f2fs/f2fs.h
	fs/f2fs/inode.c

Change-Id: I757f54737e4d58319f2866f687a39123f0889e1e
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2019-10-11 05:19:38 -07:00
Greg Kroah-Hartman
5c8d069c5d This is the 4.14.147 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2YdO8ACgkQONu9yGCS
 aT50DBAAuLQNqoTz2C6HkEHqU6l9v491n8Aw6ILuUoILcoWGIOZEtdmKtRXXBWjR
 v0s1T4FbeOZkKHTJGXuCd8Di6BYBDBSSBRSvuvWFisCyo/nwM3i1bNVz+i0IBAbS
 Ug3mQOkCHfZgvht17XCDjcHgE2pdHThof5T9/3IIewfSyx4I0WZKlr4Xq1HF535g
 30w1Ox+rWLQ4735LlGfMTd+oqYIaP91xyuFuGcsp2Pgy7nLFTtaa69fHD93qVH9o
 2rdIowlYFU/uIkGx9qW/sMDq4Gp34xczdp/ABNiRylEL1Rf1cOkcn8KZX14ePodk
 5x4hXHDkCWqquu6HnuTB6bzR7gwVsqxBEucT4v7wnkHTIFucEOeBc7E6T2llGTRe
 dfESrc28lXN3E9WGu7gAqi7Hvr2oDGVffthySwR6Yq4WoVSppHTc/SekZ4p2qhAl
 8jp4V86U5Fwr6ERCwZ0LcQ8TUK1j9KptpJ1P4Lb/w4wT2csq8DasunDm8/7lYFfp
 ISa9OE4fF8bhSI45bP+o4WYac6x7F4A8RpdTGJ1qRp0crKDL7oKP5YtFzJTGyt2a
 FDnONywuYu2Iayt76fqYU8Lh7yDpxzLSY/66VXRAPcb4Xtc55BKjlRaoEqXOPXqB
 8sid+r28LHYiOlHDfb+J/IKI8YyGAiB5ac7Pakw7Q8d07fsxiuI=
 =OgXW
 -----END PGP SIGNATURE-----

Merge 4.14.147 into android-4.14-q

Changes in 4.14.147
	Revert "Bluetooth: validate BLE connection interval updates"
	powerpc/xive: Fix bogus error code returned by OPAL
	objtool: Query pkg-config for libelf location
	IB/core: Add an unbound WQ type to the new CQ API
	HID: prodikeys: Fix general protection fault during probe
	HID: sony: Fix memory corruption issue on cleanup.
	HID: logitech: Fix general protection fault caused by Logitech driver
	HID: hidraw: Fix invalid read in hidraw_ioctl
	mtd: cfi_cmdset_0002: Use chip_good() to retry in do_write_oneword()
	crypto: talitos - fix missing break in switch statement
	iwlwifi: mvm: send BCAST management frames to the right station
	media: tvp5150: fix switch exit in set control handler
	ASoC: fsl: Fix of-node refcount unbalance in fsl_ssi_probe_from_dt()
	arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field
	ALSA: hda - Add laptop imic fixup for ASUS M9V laptop
	ALSA: hda - Apply AMD controller workaround for Raven platform
	objtool: Clobber user CFLAGS variable
	pinctrl: sprd: Use define directive for sprd_pinconf_params values
	power: supply: sysfs: ratelimit property read error message
	locking/lockdep: Add debug_locks check in __lock_downgrade()
	irqchip/gic-v3-its: Fix LPI release for Multi-MSI devices
	f2fs: check all the data segments against all node ones
	PCI: hv: Avoid use of hv_pci_dev->pci_slot after freeing it
	blk-mq: move cancel of requeue_work to the front of blk_exit_queue
	Revert "f2fs: avoid out-of-range memory access"
	dm zoned: fix invalid memory access
	f2fs: fix to do sanity check on segment bitmap of LFS curseg
	drm: Flush output polling on shutdown
	net: don't warn in inet diag when IPV6 is disabled
	ACPI: video: Add new hw_changes_brightness quirk, set it on PB Easynote MZ35
	xfs: don't crash on null attr fork xfs_bmapi_read
	Bluetooth: btrtl: Additional Realtek 8822CE Bluetooth devices
	f2fs: use generic EFSBADCRC/EFSCORRUPTED
	arcnet: provide a buffer big enough to actually receive packets
	cdc_ncm: fix divide-by-zero caused by invalid wMaxPacketSize
	macsec: drop skb sk before calling gro_cells_receive
	net/phy: fix DP83865 10 Mbps HDX loopback disable function
	net: qrtr: Stop rx_worker before freeing node
	net/sched: act_sample: don't push mac header on ip6gre ingress
	net_sched: add max len check for TCA_KIND
	openvswitch: change type of UPCALL_PID attribute to NLA_UNSPEC
	ppp: Fix memory leak in ppp_write
	sch_netem: fix a divide by zero in tabledist()
	skge: fix checksum byte order
	usbnet: ignore endpoints with invalid wMaxPacketSize
	usbnet: sanity checking of packet sizes and device mtu
	net/mlx5: Add device ID of upcoming BlueField-2
	mISDN: enforce CAP_NET_RAW for raw sockets
	appletalk: enforce CAP_NET_RAW for raw sockets
	ax25: enforce CAP_NET_RAW for raw sockets
	ieee802154: enforce CAP_NET_RAW for raw sockets
	nfc: enforce CAP_NET_RAW for raw sockets
	ALSA: hda: Flush interrupts on disabling
	regulator: lm363x: Fix off-by-one n_voltages for lm3632 ldo_vpos/ldo_vneg
	ASoC: sgtl5000: Fix charge pump source assignment
	dmaengine: bcm2835: Print error in case setting DMA mask fails
	leds: leds-lp5562 allow firmware files up to the maximum length
	media: dib0700: fix link error for dibx000_i2c_set_speed
	media: mtk-cir: lower de-glitch counter for rc-mm protocol
	media: exynos4-is: fix leaked of_node references
	media: hdpvr: Add device num check and handling
	media: i2c: ov5640: Check for devm_gpiod_get_optional() error
	sched/fair: Fix imbalance due to CPU affinity
	sched/core: Fix CPU controller for !RT_GROUP_SCHED
	x86/reboot: Always use NMI fallback when shutdown via reboot vector IPI fails
	x86/apic: Soft disable APIC before initializing it
	ALSA: hda - Show the fatal CORB/RIRB error more clearly
	ALSA: i2c: ak4xxx-adda: Fix a possible null pointer dereference in build_adc_controls()
	EDAC/mc: Fix grain_bits calculation
	media: iguanair: add sanity checks
	base: soc: Export soc_device_register/unregister APIs
	ALSA: usb-audio: Skip bSynchAddress endpoint check if it is invalid
	ia64:unwind: fix double free for mod->arch.init_unw_table
	EDAC/altera: Use the proper type for the IRQ status bits
	ASoC: rsnd: don't call clk_get_rate() under atomic context
	md/raid1: end bio when the device faulty
	md: don't call spare_active in md_reap_sync_thread if all member devices can't work
	md: don't set In_sync if array is frozen
	ACPI / processor: don't print errors for processorIDs == 0xff
	EDAC, pnd2: Fix ioremap() size in dnv_rd_reg()
	efi: cper: print AER info of PCIe fatal error
	sched/fair: Use rq_lock/unlock in online_fair_sched_group
	media: gspca: zero usb_buf on error
	perf test vfs_getname: Disable ~/.perfconfig to get default output
	media: mtk-mdp: fix reference count on old device tree
	media: fdp1: Reduce FCP not found message level to debug
	media: rc: imon: Allow iMON RC protocol for ffdc 7e device
	dmaengine: iop-adma: use correct printk format strings
	perf record: Support aarch64 random socket_id assignment
	media: i2c: ov5645: Fix power sequence
	media: omap3isp: Don't set streaming state on random subdevs
	media: imx: mipi csi-2: Don't fail if initial state times-out
	net: lpc-enet: fix printk format strings
	ARM: dts: imx7d: cl-som-imx7: make ethernet work again
	media: radio/si470x: kill urb on error
	media: hdpvr: add terminating 0 at end of string
	nbd: add missing config put
	media: dvb-core: fix a memory leak bug
	libperf: Fix alignment trap with xyarray contents in 'perf stat'
	EDAC/amd64: Recognize DRAM device type ECC capability
	EDAC/amd64: Decode syndrome before translating address
	PM / devfreq: passive: Use non-devm notifiers
	PM / devfreq: exynos-bus: Correct clock enable sequence
	media: cec-notifier: clear cec_adap in cec_notifier_unregister
	media: saa7146: add cleanup in hexium_attach()
	media: cpia2_usb: fix memory leaks
	media: saa7134: fix terminology around saa7134_i2c_eeprom_md7134_gate()
	perf trace beauty ioctl: Fix off-by-one error in cmd->string table
	media: ov9650: add a sanity check
	ASoC: es8316: fix headphone mixer volume table
	ACPI / CPPC: do not require the _PSD method
	arm64: kpti: ensure patched kernel text is fetched from PoU
	nvmet: fix data units read and written counters in SMART log
	iommu/amd: Silence warnings under memory pressure
	iommu/iova: Avoid false sharing on fq_timer_on
	libtraceevent: Change users plugin directory
	ARM: dts: exynos: Mark LDO10 as always-on on Peach Pit/Pi Chromebooks
	ACPI: custom_method: fix memory leaks
	ACPI / PCI: fix acpi_pci_irq_enable() memory leak
	hwmon: (acpi_power_meter) Change log level for 'unsafe software power cap'
	md/raid1: fail run raid1 array when active disk less than one
	dmaengine: ti: edma: Do not reset reserved paRAM slots
	kprobes: Prohibit probing on BUG() and WARN() address
	s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding
	ASoC: dmaengine: Make the pcm->name equal to pcm->id if the name is not set
	raid5: don't set STRIPE_HANDLE to stripe which is in batch list
	mmc: core: Clarify sdio_irq_pending flag for MMC_CAP2_SDIO_IRQ_NOTHREAD
	mmc: sdhci: Fix incorrect switch to HS mode
	raid5: don't increment read_errors on EILSEQ return
	libertas: Add missing sentinel at end of if_usb.c fw_table
	e1000e: add workaround for possible stalled packet
	ALSA: hda - Drop unsol event handler for Intel HDMI codecs
	drm/amd/powerplay/smu7: enforce minimal VBITimeout (v2)
	media: ttusb-dec: Fix info-leak in ttusb_dec_send_command()
	ALSA: hda/realtek - Blacklist PC beep for Lenovo ThinkCentre M73/93
	btrfs: extent-tree: Make sure we only allocate extents from block groups with the same type
	media: omap3isp: Set device on omap3isp subdevs
	PM / devfreq: passive: fix compiler warning
	ALSA: firewire-tascam: handle error code when getting current source of clock
	ALSA: firewire-tascam: check intermediate state of clock status and retry
	scsi: scsi_dh_rdac: zero cdb in send_mode_select()
	printk: Do not lose last line in kmsg buffer dump
	IB/hfi1: Define variables as unsigned long to fix KASAN warning
	randstruct: Check member structs in is_pure_ops_struct()
	ALSA: hda/realtek - Fixup mute led on HP Spectre x360
	fuse: fix missing unlock_page in fuse_writepage()
	parisc: Disable HP HSC-PCI Cards to prevent kernel crash
	x86/retpolines: Fix up backport of a9d57ef15cbe
	KVM: x86: always stop emulation on page fault
	KVM: x86: set ctxt->have_exception in x86_decode_insn()
	KVM: x86: Manually calculate reserved bits when loading PDPTRS
	media: sn9c20x: Add MSI MS-1039 laptop to flip_dmi_table
	binfmt_elf: Do not move brk for INTERP-less ET_EXEC
	ASoC: Intel: NHLT: Fix debug print format
	ASoC: Intel: Skylake: Use correct function to access iomem space
	ASoC: Intel: Fix use of potentially uninitialized variable
	ARM: samsung: Fix system restart on S3C6410
	ARM: zynq: Use memcpy_toio instead of memcpy on smp bring-up
	arm64: dts: rockchip: limit clock rate of MMC controllers for RK3328
	alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP
	regulator: Defer init completion for a while after late_initcall
	gfs2: clear buf_in_tr when ending a transaction in sweep_bh_for_rgrps
	memcg, oom: don't require __GFP_FS when invoking memcg OOM killer
	memcg, kmem: do not fail __GFP_NOFAIL charges
	ovl: filter of trusted xattr results in audit
	Btrfs: fix use-after-free when using the tree modification log
	btrfs: Relinquish CPUs in btrfs_compare_trees
	btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space
	md/raid6: Set R5_ReadError when there is read failure on parity disk
	md: don't report active array_state until after revalidate_disk() completes.
	md: only call set_in_sync() when it is expected to succeed.
	cfg80211: Purge frame registrations on iftype change
	/dev/mem: Bail out upon SIGKILL.
	ext4: fix warning inside ext4_convert_unwritten_extents_endio
	ext4: fix punch hole for inline_data file systems
	quota: fix wrong condition in is_quota_modification()
	hwrng: core - don't wait on add_early_randomness()
	i2c: riic: Clear NACK in tend isr
	CIFS: fix max ea value size
	CIFS: Fix oplock handling for SMB 2.1+ protocols
	md/raid0: avoid RAID0 data corruption due to layout confusion.
	mm/compaction.c: clear total_{migrate,free}_scanned before scanning a new zone
	btrfs: qgroup: Drop quota_root and fs_info parameters from update_qgroup_status_item
	Btrfs: fix race setting up and completing qgroup rescan workers
	Linux 4.14.147

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
2019-10-06 12:38:20 +02:00
Yafang Shao
0861bcab4f mm/compaction.c: clear total_{migrate,free}_scanned before scanning a new zone
[ Upstream commit a94b525241c0fff3598809131d7cfcfe1d572d8c ]

total_{migrate,free}_scanned will be added to COMPACTMIGRATE_SCANNED and
COMPACTFREE_SCANNED in compact_zone().  We should clear them before
scanning a new zone.  In the proc triggered compaction, we forgot clearing
them.

[laoar.shao@gmail.com: introduce a helper compact_zone_counters_init()]
  Link: http://lkml.kernel.org/r/1563869295-25748-1-git-send-email-laoar.shao@gmail.com
[akpm@linux-foundation.org: expand compact_zone_counters_init() into its single callsite, per mhocko]
[vbabka@suse.cz: squash compact_zone() list_head init as well]
  Link: http://lkml.kernel.org/r/1fb6f7da-f776-9e42-22f8-bbb79b030b98@suse.cz
[akpm@linux-foundation.org: kcompactd_do_work(): avoid unnecessary initialization of cc.zone]
Link: http://lkml.kernel.org/r/1563789275-9639-1-git-send-email-laoar.shao@gmail.com
Fixes: 7f354a548d1c ("mm, compaction: add vmstats for kcompactd work")
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Yafang Shao <shaoyafang@didiglobal.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-05 12:48:13 +02:00
Blagovest Kolenichev
070370f0ae Merge android-4.14.108 (4344de2) into msm-4.14
* refs/heads/tmp-4344de2:
  Linux 4.14.108
  s390/setup: fix boot crash for machine without EDAT-1
  KVM: nVMX: Ignore limit checks on VMX instructions using flat segments
  KVM: nVMX: Apply addr size mask to effective address for VMX instructions
  KVM: nVMX: Sign extend displacements of VMX instr's mem operands
  KVM: x86/mmu: Do not cache MMIO accesses while memslots are in flux
  KVM: x86/mmu: Detect MMIO generation wrap in any address space
  KVM: Call kvm_arch_memslots_updated() before updating memslots
  drm/radeon/evergreen_cs: fix missing break in switch statement
  media: imx: csi: Stop upstream before disabling IDMA channel
  media: imx: csi: Disable CSI immediately after last EOF
  media: vimc: Add vimc-streamer for stream control
  media: uvcvideo: Avoid NULL pointer dereference at the end of streaming
  media: imx: prpencvf: Stop upstream before disabling IDMA channel
  rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
  tpm: Unify the send callback behaviour
  tpm/tpm_crb: Avoid unaligned reads in crb_recv()
  md: Fix failed allocation of md_register_thread
  perf intel-pt: Fix divide by zero when TSC is not available
  perf intel-pt: Fix overlap calculation for padding
  perf auxtrace: Define auxtrace record alignment
  perf intel-pt: Fix CYC timestamp calculation after OVF
  x86/unwind/orc: Fix ORC unwind table alignment
  bcache: never writeback a discard operation
  PM / wakeup: Rework wakeup source timer cancellation
  NFSv4.1: Reinitialise sequence results before retransmitting a request
  nfsd: fix wrong check in write_v4_end_grace()
  nfsd: fix memory corruption caused by readdir
  NFS: Don't recoalesce on error in nfs_pageio_complete_mirror()
  NFS: Fix an I/O request leakage in nfs_do_recoalesce
  NFS: Fix I/O request leakages
  cpcap-charger: generate events for userspace
  dm integrity: limit the rate of error messages
  dm: fix to_sector() for 32bit
  arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2
  arm64: debug: Ensure debug handlers check triggering exception level
  arm64: Fix HCR.TGE status for NMI contexts
  ARM: s3c24xx: Fix boolean expressions in osiris_dvs_notify
  powerpc/traps: Fix the message printed when stack overflows
  powerpc/traps: fix recoverability of machine check handling on book3s/32
  powerpc/hugetlb: Don't do runtime allocation of 16G pages in LPAR configuration
  powerpc/ptrace: Simplify vr_get/set() to avoid GCC warning
  powerpc: Fix 32-bit KVM-PR lockup and host crash with MacOS guest
  powerpc/83xx: Also save/restore SPRG4-7 during suspend
  powerpc/powernv: Make opal log only readable by root
  powerpc/wii: properly disable use of BATs when requested.
  powerpc/32: Clear on-stack exception marker upon exception return
  security/selinux: fix SECURITY_LSM_NATIVE_LABELS on reused superblock
  jbd2: fix compile warning when using JBUFFER_TRACE
  jbd2: clear dirty flag when revoking a buffer from an older transaction
  serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup()
  serial: 8250_pci: Fix number of ports for ACCES serial cards
  serial: 8250_of: assume reg-shift of 2 for mrvl,mmp-uart
  serial: uartps: Fix stuck ISR if RX disabled with non-empty FIFO
  drm/i915: Relax mmap VMA check
  crypto: arm64/aes-neonbs - fix returning final keystream block
  i2c: tegra: fix maximum transfer size
  parport_pc: fix find_superio io compare code, should use equal test.
  intel_th: Don't reference unassigned outputs
  device property: Fix the length used in PROPERTY_ENTRY_STRING()
  kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv
  mm/vmalloc: fix size check for remap_vmalloc_range_partial()
  mm: hwpoison: fix thp split handing in soft_offline_in_use_page()
  nfit: acpi_nfit_ctl(): Check out_obj->type in the right place
  usb: chipidea: tegra: Fix missed ci_hdrc_remove_device()
  clk: ingenic: Fix doc of ingenic_cgu_div_info
  clk: ingenic: Fix round_rate misbehaving with non-integer dividers
  clk: clk-twl6040: Fix imprecise external abort for pdmclk
  clk: uniphier: Fix update register for CPU-gear
  ext2: Fix underflow in ext2_max_size()
  cxl: Wrap iterations over afu slices inside 'afu_list_lock'
  IB/hfi1: Close race condition on user context disable and close
  ext4: fix crash during online resizing
  ext4: add mask of ext4 flags to swap
  cpufreq: pxa2xx: remove incorrect __init annotation
  cpufreq: tegra124: add missing of_node_put()
  x86/kprobes: Prohibit probing on optprobe template code
  irqchip/gic-v3-its: Avoid parsing _indirect_ twice for Device table
  libertas_tf: don't set URB_ZERO_PACKET on IN USB transfer
  crypto: pcbc - remove bogus memcpy()s with src == dest
  Btrfs: fix corruption reading shared and compressed extents after hole punching
  btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
  Btrfs: setup a nofs context for memory allocation at __btrfs_set_acl
  m68k: Add -ffreestanding to CFLAGS
  splice: don't merge into linked buffers
  fs/devpts: always delete dcache dentry-s in dput()
  scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock
  scsi: sd: Optimal I/O size should be a multiple of physical block size
  scsi: aacraid: Fix performance issue on logical drives
  scsi: virtio_scsi: don't send sc payload with tmfs
  s390/virtio: handle find on invalid queue gracefully
  s390/setup: fix early warning messages
  clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown
  clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR
  regulator: s2mpa01: Fix step values for some LDOs
  regulator: max77620: Initialize values for DT properties
  regulator: s2mps11: Fix steps for buck7, buck8 and LDO35
  spi: pxa2xx: Setup maximum supported DMA transfer length
  spi: ti-qspi: Fix mmap read when more than one CS in use
  mmc: sdhci-esdhc-imx: fix HS400 timing issue
  ACPI / device_sysfs: Avoid OF modalias creation for removed device
  xen: fix dom0 boot on huge systems
  tracing: Do not free iter->trace in fail path of tracing_open_pipe()
  tracing: Use strncpy instead of memcpy for string keys in hist triggers
  CIFS: Fix read after write for files with read caching
  CIFS: Do not reset lease state to NONE on lease break
  crypto: arm64/aes-ccm - fix bugs in non-NEON fallback routine
  crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling
  crypto: testmgr - skip crc32c context test for ahash algorithms
  crypto: hash - set CRYPTO_TFM_NEED_KEY if ->setkey() fails
  crypto: arm64/crct10dif - revert to C code for short inputs
  crypto: arm/crct10dif - revert to C code for short inputs
  fix cgroup_do_mount() handling of failure exits
  libnvdimm: Fix altmap reservation size calculation
  libnvdimm/pmem: Honor force_raw for legacy pmem regions
  libnvdimm, pfn: Fix over-trim in trim_pfn_device()
  libnvdimm/label: Clear 'updating' flag after label-set update
  stm class: Prevent division by zero
  media: videobuf2-v4l2: drop WARN_ON in vb2_warn_zero_bytesused()
  tmpfs: fix uninitialized return value in shmem_link
  net: set static variable an initial value in atl2_probe()
  nfp: bpf: fix ALU32 high bits clearance bug
  nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K
  net: thunderx: make CFG_DONE message to run through generic send-ack sequence
  mac80211_hwsim: propagate genlmsg_reply return code
  phonet: fix building with clang
  ARCv2: support manual regfile save on interrupts
  ARC: uacces: remove lp_start, lp_end from clobber list
  ARCv2: lib: memcpy: fix doing prefetchw outside of buffer
  ixgbe: fix older devices that do not support IXGBE_MRQC_L3L4TXSWEN
  tmpfs: fix link accounting when a tmpfile is linked in
  net: marvell: mvneta: fix DMA debug warning
  arm64: Relax GIC version check during early boot
  qed: Fix iWARP syn packet mac address validation.
  ASoC: topology: free created components in tplg load error
  mailbox: bcm-flexrm-mailbox: Fix FlexRM ring flush timeout issue
  net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe()
  qmi_wwan: apply SET_DTR quirk to Sierra WP7607
  pinctrl: meson: meson8b: fix the sdxc_a data 1..3 pins
  net: systemport: Fix reception of BPDUs
  scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task
  keys: Fix dependency loop between construction record and auth key
  assoc_array: Fix shortcut creation
  af_key: unconditionally clone on broadcast
  ARM: 8824/1: fix a migrating irq bug when hotplug cpu
  esp: Skip TX bytes accounting when sending from a request socket
  clk: sunxi: A31: Fix wrong AHB gate number
  clk: sunxi-ng: v3s: Fix TCON reset de-assert bit
  Input: st-keyscan - fix potential zalloc NULL dereference
  auxdisplay: ht16k33: fix potential user-after-free on module unload
  i2c: bcm2835: Clear current buffer pointers and counts after a transfer
  i2c: cadence: Fix the hold bit setting
  net: hns: Fix object reference leaks in hns_dsaf_roce_reset()
  mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs
  Revert "mm: use early_pfn_to_nid in page_ext_init"
  mm/gup: fix gup_pmd_range() for dax
  NFS: Don't use page_file_mapping after removing the page
  floppy: check_events callback should not return a negative number
  ipvs: fix dependency on nf_defrag_ipv6
  mac80211: Fix Tx aggregation session tear down with ITXQs
  Input: matrix_keypad - use flush_delayed_work()
  Input: ps2-gpio - flush TX work when closing port
  Input: cap11xx - switch to using set_brightness_blocking()
  ARM: OMAP2+: fix lack of timer interrupts on CPU1 after hotplug
  KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded
  ASoC: rsnd: fixup rsnd_ssi_master_clk_start() user count check
  ASoC: dapm: fix out-of-bounds accesses to DAPM lookup tables
  ARM: OMAP2+: Variable "reg" in function omap4_dsi_mux_pads() could be uninitialized
  Input: pwm-vibra - stop regulator after disabling pwm, not before
  Input: pwm-vibra - prevent unbalanced regulator
  s390/dasd: fix using offset into zero size array error
  gpu: ipu-v3: Fix CSI offsets for imx53
  drm/imx: imx-ldb: add missing of_node_puts
  gpu: ipu-v3: Fix i.MX51 CSI control registers offset
  drm/imx: ignore plane updates on disabled crtcs
  crypto: rockchip - update new iv to device in multiple operations
  crypto: rockchip - fix scatterlist nents error
  crypto: ahash - fix another early termination in hash walk
  crypto: caam - fixed handling of sg list
  stm class: Fix an endless loop in channel allocation
  iio: adc: exynos-adc: Fix NULL pointer exception on unbind
  ASoC: fsl_esai: fix register setting issue in RIGHT_J mode
  9p/net: fix memory leak in p9_client_create
  9p: use inode->i_lock to protect i_size_write() under 32-bit
  FROMLIST: psi: introduce psi monitor
  FROMLIST: refactor header includes to allow kthread.h inclusion in psi_types.h
  FROMLIST: psi: track changed states
  FROMLIST: psi: split update_stats into parts
  FROMLIST: psi: rename psi fields in preparation for psi trigger addition
  FROMLIST: psi: make psi_enable static
  FROMLIST: psi: introduce state_mask to represent stalled psi states
  ANDROID: cuttlefish_defconfig: Enable CONFIG_INPUT_MOUSEDEV
  ANDROID: cuttlefish_defconfig: Enable CONFIG_PSI
  BACKPORT: kernel: cgroup: add poll file operation
  BACKPORT: fs: kernfs: add poll file operation
  UPSTREAM: psi: avoid divide-by-zero crash inside virtual machines
  UPSTREAM: psi: clarify the Kconfig text for the default-disable option
  UPSTREAM: psi: fix aggregation idle shut-off
  UPSTREAM: psi: fix reference to kernel commandline enable
  UPSTREAM: psi: make disabling/enabling easier for vendor kernels
  UPSTREAM: kernel/sched/psi.c: simplify cgroup_move_task()
  BACKPORT: psi: cgroup support
  UPSTREAM: psi: pressure stall information for CPU, memory, and IO
  UPSTREAM: sched: introduce this_rq_lock_irq()
  UPSTREAM: sched: sched.h: make rq locking and clock functions available in stats.h
  UPSTREAM: sched: loadavg: make calc_load_n() public
  BACKPORT: sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
  UPSTREAM: delayacct: track delays from thrashing cache pages
  UPSTREAM: mm: workingset: tell cache transitions from workingset thrashing
  sched/fair: fix energy compute when a cluster is only a cpu core in multi-cluster system

Conflicts:
	arch/arm/kernel/irq.c
	drivers/scsi/sd.c
	include/linux/sched.h
	include/uapi/linux/taskstats.h
	kernel/sched/Makefile
	sound/soc/soc-dapm.c

Change-Id: I12ebb57a34da9101ee19458d7e1f96ecc769c39a
Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org>
2019-05-15 07:44:57 -07:00
Johannes Weiner
94eab674ff UPSTREAM: psi: pressure stall information for CPU, memory, and IO
When systems are overcommitted and resources become contended, it's hard
to tell exactly the impact this has on workload productivity, or how close
the system is to lockups and OOM kills.  In particular, when machines work
multiple jobs concurrently, the impact of overcommit in terms of latency
and throughput on the individual job can be enormous.

In order to maximize hardware utilization without sacrificing individual
job health or risk complete machine lockups, this patch implements a way
to quantify resource pressure in the system.

A kernel built with CONFIG_PSI=y creates files in /proc/pressure/ that
expose the percentage of time the system is stalled on CPU, memory, or IO,
respectively.  Stall states are aggregate versions of the per-task delay
accounting delays:

       cpu: some tasks are runnable but not executing on a CPU
       memory: tasks are reclaiming, or waiting for swapin or thrashing cache
       io: tasks are waiting for io completions

These percentages of walltime can be thought of as pressure percentages,
and they give a general sense of system health and productivity loss
incurred by resource overcommit.  They can also indicate when the system
is approaching lockup scenarios and OOMs.

To do this, psi keeps track of the task states associated with each CPU
and samples the time they spend in stall states.  Every 2 seconds, the
samples are averaged across CPUs - weighted by the CPUs' non-idle time to
eliminate artifacts from unused CPUs - and translated into percentages of
walltime.  A running average of those percentages is maintained over 10s,
1m, and 5m periods (similar to the loadaverage).

[hannes@cmpxchg.org: doc fixlet, per Randy]
  Link: http://lkml.kernel.org/r/20180828205625.GA14030@cmpxchg.org
[hannes@cmpxchg.org: code optimization]
  Link: http://lkml.kernel.org/r/20180907175015.GA8479@cmpxchg.org
[hannes@cmpxchg.org: rename psi_clock() to psi_update_work(), per Peter]
  Link: http://lkml.kernel.org/r/20180907145404.GB11088@cmpxchg.org
[hannes@cmpxchg.org: fix build]
  Link: http://lkml.kernel.org/r/20180913014222.GA2370@cmpxchg.org
Link: http://lkml.kernel.org/r/20180828172258.3185-9-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Daniel Drake <drake@endlessm.com>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <jweiner@fb.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Enderborg <peter.enderborg@sony.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit eb414681d5a07d28d2ff90dc05f69ec6b232ebd2)

Bug: 127712811
Test: lmkd in PSI mode
Change-Id: Id00d23c977169b0c4636d92016fc1fee0274be05
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
2019-03-21 16:19:55 -07:00
Vinayak Menon
6093c334a7 mm: compaction: fix the page state calculation in too_many_isolated
Commit "mm: vmscan: fix the page state calculation in too_many_isolated"
fixed an issue where a number of tasks were blocked in reclaim path
for seconds, because of vmstat_diff not being synced in time.
A similar problem can happen in isolate_migratepages_block, where
similar calculation is performed. This patch fixes that.

Change-Id: Ie74f108ef770da688017b515fe37faea6f384589
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
2018-05-10 23:32:47 -07:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Davidlohr Bueso
6818600ff0 mm,compaction: serialize waitqueue_active() checks (for real)
Andrea brought to my attention that the L->{L,S} guarantees are
completely bogus for this case.  I was looking at the diagram, from the
offending commit, when that _is_ the race, we had the load reordered
already.

What we need is at least S->L semantics, thus simply use
wq_has_sleeper() to serialize the call for good.

Link: http://lkml.kernel.org/r/20170914175313.GB811@linux-80c1.suse
Fixes: 46acef048a6 (mm,compaction: serialize waitqueue_active() checks)
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Andrea Parri <parri.andrea@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-03 17:54:24 -07:00
Michal Hocko
ccbe1e4dde mm, compaction: skip over holes in __reset_isolation_suitable
__reset_isolation_suitable walks the whole zone pfn range and it tries
to jump over holes by checking the zone for each page.  It might still
stumble over offline pages, though.  Skip those by checking
pfn_to_online_page()

Link: http://lkml.kernel.org/r/20170515085827.16474-9-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Tobias Regnery <tobias.regnery@gmail.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-06 16:24:32 -07:00
Vlastimil Babka
baf6a9a1db mm, compaction: finish whole pageblock to reduce fragmentation
The main goal of direct compaction is to form a high-order page for
allocation, but it should also help against long-term fragmentation when
possible.

Most lower-than-pageblock-order compactions are for non-movable
allocations, which means that if we compact in a movable pageblock and
terminate as soon as we create the high-order page, it's unlikely that
the fallback heuristics will claim the whole block.  Instead there might
be a single unmovable page in a pageblock full of movable pages, and the
next unmovable allocation might pick another pageblock and increase
long-term fragmentation.

To help against such scenarios, this patch changes the termination
criteria for compaction so that the current pageblock is finished even
though the high-order page already exists.  Note that it might be
possible that the high-order page formed elsewhere in the zone due to
parallel activity, but this patch doesn't try to detect that.

This is only done with sync compaction, because async compaction is
limited to pageblock of the same migratetype, where it cannot result in
a migratetype fallback.  (Async compaction also eagerly skips
order-aligned blocks where isolation fails, which is against the goal of
migrating away as much of the pageblock as possible.)

As a result of this patch, long-term memory fragmentation should be
reduced.

In testing based on 4.9 kernel with stress-highalloc from mmtests
configured for order-4 GFP_KERNEL allocations, this patch has reduced
the number of unmovable allocations falling back to movable pageblocks
by 20%.  The number

Link: http://lkml.kernel.org/r/20170307131545.28577-9-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:10 -07:00
Vlastimil Babka
282722b0d2 mm, compaction: restrict async compaction to pageblocks of same migratetype
The migrate scanner in async compaction is currently limited to
MIGRATE_MOVABLE pageblocks.  This is a heuristic intended to reduce
latency, based on the assumption that non-MOVABLE pageblocks are
unlikely to contain movable pages.

However, with the exception of THP's, most high-order allocations are
not movable.  Should the async compaction succeed, this increases the
chance that the non-MOVABLE allocations will fallback to a MOVABLE
pageblock, making the long-term fragmentation worse.

This patch attempts to help the situation by changing async direct
compaction so that the migrate scanner only scans the pageblocks of the
requested migratetype.  If it's a non-MOVABLE type and there are such
pageblocks that do contain movable pages, chances are that the
allocation can succeed within one of such pageblocks, removing the need
for a fallback.  If that fails, the subsequent sync attempt will ignore
this restriction.

In testing based on 4.9 kernel with stress-highalloc from mmtests
configured for order-4 GFP_KERNEL allocations, this patch has reduced
the number of unmovable allocations falling back to movable pageblocks
by 30%.  The number of movable allocations falling back is reduced by
12%.

Link: http://lkml.kernel.org/r/20170307131545.28577-8-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:10 -07:00
Vlastimil Babka
d39773a062 mm, compaction: add migratetype to compact_control
Preparation patch.  We are going to need migratetype at lower layers
than compact_zone() and compact_finished().

Link: http://lkml.kernel.org/r/20170307131545.28577-7-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:10 -07:00
Vlastimil Babka
b682debd97 mm, compaction: change migrate_async_suitable() to suitable_migration_source()
Preparation for making the decisions more complex and depending on
compact_control flags.  No functional change.

Link: http://lkml.kernel.org/r/20170307131545.28577-6-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:10 -07:00
Vlastimil Babka
228d7e3390 mm, compaction: remove redundant watermark check in compact_finished()
When detecting whether compaction has succeeded in forming a high-order
page, __compact_finished() employs a watermark check, followed by an own
search for a suitable page in the freelists.  This is not ideal for two
reasons:

 - The watermark check also searches high-order freelists, but has a
   less strict criteria wrt fallback. It's therefore redundant and waste
   of cycles. This was different in the past when high-order watermark
   check attempted to apply reserves to high-order pages.

 - The watermark check might actually fail due to lack of order-0 pages.
   Compaction can't help with that, so there's no point in continuing
   because of that. It's possible that high-order page still exists and
   it terminates.

This patch therefore removes the watermark check.  This should save some
cycles and terminate compaction sooner in some cases.

Link: http://lkml.kernel.org/r/20170307131545.28577-3-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:09 -07:00
Yisheng Xie
1ef36db2a9 mm/compaction: ignore block suitable after check large free page
By reviewing code, I find that if the migrate target is a large free
page and we ignore suitable, it may splite large target free page into
smaller block which is not good for defrag.  So move the ignore block
suitable after check large free page.

As Vlastimil pointed out in RFC version that this patch is just based on
logical analyses which might be better for future-proofing the function
and it is most likely won't have any visible effect right now, for
direct compaction shouldn't have to be called if there's a
>=pageblock_order page already available.

Link: http://lkml.kernel.org/r/1489490743-5364-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-03 15:52:09 -07:00
Ingo Molnar
174cd4b1e5 sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h>
Fix up affected files that include this signal functionality via sched.h.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:32 +01:00
Yisheng Xie
9e5bcd610f mm/migration: make isolate_movable_page() return int type
Patch series "HWPOISON: soft offlining for non-lru movable page", v6.

After Minchan's commit bda807d44454 ("mm: migrate: support non-lru
movable page migration"), some type of non-lru page like zsmalloc and
virtio-balloon page also support migration.

Therefore, we can:

1) soft offlining no-lru movable pages, which means when memory
   corrected errors occur on a non-lru movable page, we can stop to use
   it by migrating data onto another page and disable the original
   (maybe half-broken) one.

2) enable memory hotplug for non-lru movable pages, i.e. we may offline
   blocks, which include such pages, by using non-lru page migration.

This patchset is heavily dependent on non-lru movable page migration.

This patch (of 4):

Change the return type of isolate_movable_page() from bool to int.  It
will return 0 when isolate movable page successfully, and return -EBUSY
when it isolates failed.

There is no functional change within this patch but prepare for later
patch.

[xieyisheng1@huawei.com: v6]
  Link: http://lkml.kernel.org/r/1486108770-630-2-git-send-email-xieyisheng1@huawei.com
Link: http://lkml.kernel.org/r/1485867981-16037-2-git-send-email-ysxie@foxmail.com
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:55 -08:00
Davidlohr Bueso
46acef048a mm,compaction: serialize waitqueue_active() checks
Without a memory barrier, the following race can occur with a high-order
allocation:

wakeup_kcompactd(order == 1)  		     kcompactd()
  [L] waitqueue_active(kcompactd_wait)
						[S] prepare_to_wait_event(kcompactd_wait)
						[L] (kcompactd_max_order == 0)
  [S] kcompactd_max_order = order;		      schedule()

Where the waitqueue_active() check is speculatively re-ordered to before
setting the actual condition (max_order), not seeing the threads that's
going to block; making us miss a wakeup.  There are a couple of options
to fix this, including calling wq_has_sleepers() which adds a full
barrier, or unconditionally doing the wake_up_interruptible() and
serialize on the q->lock.  However, to make use of the control
dependency, we just need to add L->L guarantees.

While this bug is theoretical, there have been other offenders of the
lockless waitqueue_active() in the past -- this is also documented in
the call itself.

Link: http://lkml.kernel.org/r/1483975528-24342-1-git-send-email-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-22 16:41:29 -08:00
David Rientjes
7f354a548d mm, compaction: add vmstats for kcompactd work
A "compact_daemon_wake" vmstat exists that represents the number of
times kcompactd has woken up.  This doesn't represent how much work it
actually did, though.

It's useful to understand how much compaction work is being done by
kcompactd versus other methods such as direct compaction and explicitly
triggered per-node (or system) compaction.

This adds two new vmstats: "compact_daemon_migrate_scanned" and
"compact_daemon_free_scanned" to represent the number of pages kcompactd
has scanned as part of its migration scanner and freeing scanner,
respectively.

These values are still accounted for in the general
"compact_migrate_scanned" and "compact_free_scanned" for compatibility.

It could be argued that explicitly triggered compaction could also be
tracked separately, and that could be added if others find it useful.

Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1612071749390.69852@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-22 16:41:29 -08:00
Michal Hocko
73e64c51af mm, compaction: allow compaction for GFP_NOFS requests
compaction has been disabled for GFP_NOFS and GFP_NOIO requests since
the direct compaction was introduced by commit 56de7263fcf3 ("mm:
compaction: direct compact when a high-order allocation fails").  The
main reason is that the migration of page cache pages might recurse back
to fs/io layer and we could potentially deadlock.  This is overly
conservative because all the anonymous memory is migrateable in the
GFP_NOFS context just fine.  This might be a large portion of the memory
in many/most workkloads.

Remove the GFP_NOFS restriction and make sure that we skip all fs pages
(those with a mapping) while isolating pages to be migrated.  We cannot
consider clean fs pages because they might need a metadata update so
only isolate pages without any mapping for nofs requests.

The effect of this patch will be probably very limited in many/most
workloads because higher order GFP_NOFS requests are quite rare,
although different configurations might lead to very different results.
David Chinner has mentioned a heavy metadata workload with 64kB block
which to quote him:

: Unfortunately, there was an era of cargo cult configuration tweaks in the
: Ceph community that has resulted in a large number of production machines
: with XFS filesystems configured this way.  And a lot of them store large
: numbers of small files and run under significant sustained memory
: pressure.
:
: I slowly working towards getting rid of these high order allocations and
: replacing them with the equivalent number of single page allocations, but
: I haven't got that (complex) change working yet.

We can do the following to simulate that workload:
$ mkfs.xfs -f -n size=64k <dev>
$ mount <dev> /mnt/scratch
$ time ./fs_mark  -D  10000  -S0  -n  100000  -s  0  -L  32 \
        -d  /mnt/scratch/0  -d  /mnt/scratch/1 \
        -d  /mnt/scratch/2  -d  /mnt/scratch/3 \
        -d  /mnt/scratch/4  -d  /mnt/scratch/5 \
        -d  /mnt/scratch/6  -d  /mnt/scratch/7 \
        -d  /mnt/scratch/8  -d  /mnt/scratch/9 \
        -d  /mnt/scratch/10  -d  /mnt/scratch/11 \
        -d  /mnt/scratch/12  -d  /mnt/scratch/13 \
        -d  /mnt/scratch/14  -d  /mnt/scratch/15

and indeed is hammers the system with many high order GFP_NOFS requests as
per a simle tracepoint during the load:
$ echo '!(gfp_flags & 0x80) && (gfp_flags &0x400000)' > $TRACE_MNT/events/kmem/mm_page_alloc/filter
I am getting
5287609 order=0
     37 order=1
1594905 order=2
3048439 order=3
6699207 order=4
  66645 order=5

My testing was done in a kvm guest so performance numbers should be
taken with a grain of salt but there seems to be a difference when the
patch is applied:

* Original kernel
FSUse%        Count         Size    Files/sec     App Overhead
     1      1600000            0       4300.1         20745838
     3      3200000            0       4239.9         23849857
     5      4800000            0       4243.4         25939543
     6      6400000            0       4248.4         19514050
     8      8000000            0       4262.1         20796169
     9      9600000            0       4257.6         21288675
    11     11200000            0       4259.7         19375120
    13     12800000            0       4220.7         22734141
    14     14400000            0       4238.5         31936458
    16     16000000            0       4231.5         23409901
    18     17600000            0       4045.3         23577700
    19     19200000            0       2783.4         58299526
    21     20800000            0       2678.2         40616302
    23     22400000            0       2693.5         83973996

and xfs complaining about memory allocation not making progress
[ 2304.372647] XFS: fs_mark(3289) possible memory allocation deadlock size 65624 in kmem_alloc (mode:0x2408240)
[ 2304.443323] XFS: fs_mark(3285) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x2408240)
[ 4796.772477] XFS: fs_mark(3424) possible memory allocation deadlock size 46936 in kmem_alloc (mode:0x2408240)
[ 4796.775329] XFS: fs_mark(3423) possible memory allocation deadlock size 51416 in kmem_alloc (mode:0x2408240)
[ 4797.388808] XFS: fs_mark(3424) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x2408240)

* Patched kernel
FSUse%        Count         Size    Files/sec     App Overhead
     1      1600000            0       4289.1         19243934
     3      3200000            0       4241.6         32828865
     5      4800000            0       4248.7         32884693
     6      6400000            0       4314.4         19608921
     8      8000000            0       4269.9         24953292
     9      9600000            0       4270.7         33235572
    11     11200000            0       4346.4         40817101
    13     12800000            0       4285.3         29972397
    14     14400000            0       4297.2         20539765
    16     16000000            0       4219.6         18596767
    18     17600000            0       4273.8         49611187
    19     19200000            0       4300.4         27944451
    21     20800000            0       4270.6         22324585
    22     22400000            0       4317.6         22650382
    24     24000000            0       4065.2         22297964

So the dropdown at Count 19200000 didn't happen and there was only a
single warning about allocation not making progress
[ 3063.815003] XFS: fs_mark(3272) possible memory allocation deadlock size 65624 in kmem_alloc (mode:0x2408240)

This suggests that the patch has helped even though there is not all that
much of anonymous memory as the workload mostly generates fs metadata.  I
assume the success rate would be higher with more anonymous memory which
should be the case in many workloads.

[akpm@linux-foundation.org: fix comment]
Link: http://lkml.kernel.org/r/20161012114721.31853-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Linus Torvalds
e34bac726d Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - various misc bits

 - most of MM (quite a lot of MM material is awaiting the merge of
   linux-next dependencies)

 - kasan

 - printk updates

 - procfs updates

 - MAINTAINERS

 - /lib updates

 - checkpatch updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (123 commits)
  init: reduce rootwait polling interval time to 5ms
  binfmt_elf: use vmalloc() for allocation of vma_filesz
  checkpatch: don't emit unified-diff error for rename-only patches
  checkpatch: don't check c99 types like uint8_t under tools
  checkpatch: avoid multiple line dereferences
  checkpatch: don't check .pl files, improve absolute path commit log test
  scripts/checkpatch.pl: fix spelling
  checkpatch: don't try to get maintained status when --no-tree is given
  lib/ida: document locking requirements a bit better
  lib/rbtree.c: fix typo in comment of ____rb_erase_color
  lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM
  MAINTAINERS: add drm and drm/i915 irc channels
  MAINTAINERS: add "C:" for URI for chat where developers hang out
  MAINTAINERS: add drm and drm/i915 bug filing info
  MAINTAINERS: add "B:" for URI where to file bugs
  get_maintainer: look for arbitrary letter prefixes in sections
  printk: add Kconfig option to set default console loglevel
  printk/sound: handle more message headers
  printk/btrfs: handle more message headers
  printk/kdb: handle more message headers
  ...
2016-12-12 20:50:02 -08:00
Ming Ling
6afcf8ef0c mm, compaction: fix NR_ISOLATED_* stats for pfn based migration
Since commit bda807d44454 ("mm: migrate: support non-lru movable page
migration") isolate_migratepages_block) can isolate !PageLRU pages which
would acct_isolated account as NR_ISOLATED_*.  Accounting these non-lru
pages NR_ISOLATED_{ANON,FILE} doesn't make any sense and it can misguide
heuristics based on those counters such as pgdat_reclaimable_pages resp.
too_many_isolated which would lead to unexpected stalls during the
direct reclaim without any good reason.  Note that
__alloc_contig_migrate_range can isolate a lot of pages at once.

On mobile devices such as 512M ram android Phone, it may use a big zram
swap.  In some cases zram(zsmalloc) uses too many non-lru but
migratedable pages, such as:

      MemTotal: 468148 kB
      Normal free:5620kB
      Free swap:4736kB
      Total swap:409596kB
      ZRAM: 164616kB(zsmalloc non-lru pages)
      active_anon:60700kB
      inactive_anon:60744kB
      active_file:34420kB
      inactive_file:37532kB

Fix this by only accounting lru pages to NR_ISOLATED_* in
isolate_migratepages_block right after they were isolated and we still
know they were on LRU.  Drop acct_isolated because it is called after
the fact and we've lost that information.  Batching per-cpu counter
doesn't make much improvement anyway.  Also make sure that we uncharge
only LRU pages when putting them back on the LRU in
putback_movable_pages resp.  when unmap_and_move migrates the page.

[mhocko@suse.com: replace acct_isolated() with direct counting]
Fixes: bda807d44454 ("mm: migrate: support non-lru movable page migration")
Link: http://lkml.kernel.org/r/20161019080240.9682-1-mhocko@kernel.org
Signed-off-by: Ming Ling <ming.ling@spreadtrum.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:07 -08:00
Anna-Maria Gleixner
e46b1db249 mm/compaction: Convert to hotplug state machine
Install the callbacks via the state machine. Should the hotplug init fail then
no threads are spawned.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-mm@kvack.org
Cc: rt@linutronix.de
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: http://lkml.kernel.org/r/20161126231350.10321-15-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-02 00:52:37 +01:00
Vlastimil Babka
2031142028 mm, compaction: restrict fragindex to costly orders
Fragmentation index and the vm.extfrag_threshold sysctl is meant as a
heuristic to prevent excessive compaction for costly orders (i.e.  THP).
It's unlikely to make any difference for non-costly orders, especially
with the default threshold.  But we cannot afford any uncertainty for
the non-costly orders where the only alternative to successful
reclaim/compaction is OOM.  After the recent patches we are guaranteed
maximum effort without heuristics from compaction before deciding OOM,
and fragindex is the last remaining heuristic.  Therefore skip fragindex
altogether for non-costly orders.

Suggested-by: Michal Hocko <mhocko@suse.com>
Link: http://lkml.kernel.org/r/20160926162025.21555-5-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:29 -07:00
Vlastimil Babka
cc5c9f098f mm, compaction: ignore fragindex from compaction_zonelist_suitable()
The compaction_zonelist_suitable() function tries to determine if
compaction will be able to proceed after sufficient reclaim, i.e.
whether there are enough reclaimable pages to provide enough order-0
freepages for compaction.

This addition of reclaimable pages to the free pages works well for the
order-0 watermark check, but in the fragmentation index check we only
consider truly free pages.  Thus we can get fragindex value close to 0
which indicates failure do to lack of memory, and wrongly decide that
compaction won't be suitable even after reclaim.

Instead of trying to somehow adjust fragindex for reclaimable pages,
let's just skip it from compaction_zonelist_suitable().

Link: http://lkml.kernel.org/r/20160926162025.21555-4-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:29 -07:00
Vlastimil Babka
9f7e338793 mm, compaction: make full priority ignore pageblock suitability
Several people have reported premature OOMs for order-2 allocations
(stack) due to OOM rework in 4.7.  In the scenario (parallel kernel
build and dd writing to two drives) many pageblocks get marked as
Unmovable and compaction free scanner struggles to isolate free pages.
Joonsoo Kim pointed out that the free scanner skips pageblocks that are
not movable to prevent filling them and forcing non-movable allocations
to fallback to other pageblocks.  Such heuristic makes sense to help
prevent long-term fragmentation, but premature OOMs are relatively more
urgent problem.  As a compromise, this patch disables the heuristic only
for the ultimate compaction priority.

Link: http://lkml.kernel.org/r/20160906135258.18335-5-vbabka@suse.cz
Reported-by: Ralf-Peter Rohbeck <Ralf-Peter.Rohbeck@quantum.com>
Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Reported-by: Olaf Hering <olaf@aepfle.de>
Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:29 -07:00
Vlastimil Babka
8348faf91f mm, compaction: require only min watermarks for non-costly orders
The __compaction_suitable() function checks the low watermark plus a
compact_gap() gap to decide if there's enough free memory to perform
compaction.  Then __isolate_free_page uses low watermark check to decide
if particular free page can be isolated.  In the latter case, using low
watermark is needlessly pessimistic, as the free page isolations are
only temporary.  For __compaction_suitable() the higher watermark makes
sense for high-order allocations where more freepages increase the
chance of success, and we can typically fail with some order-0 fallback
when the system is struggling to reach that watermark.  But for
low-order allocation, forming the page should not be that hard.  So
using low watermark here might just prevent compaction from even trying,
and eventually lead to OOM killer even if we are above min watermarks.

So after this patch, we use min watermark for non-costly orders in
__compaction_suitable(), and for all orders in __isolate_free_page().

[vbabka@suse.cz: clarify __isolate_free_page() comment]
 Link: http://lkml.kernel.org/r/7ae4baec-4eca-e70b-2a69-94bea4fb19fa@suse.cz
Link: http://lkml.kernel.org/r/20160810091226.6709-11-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Lorenzo Stoakes <lstoakes@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:27 -07:00
Vlastimil Babka
984fdba6a3 mm, compaction: use proper alloc_flags in __compaction_suitable()
The __compaction_suitable() function checks the low watermark plus a
compact_gap() gap to decide if there's enough free memory to perform
compaction.  This check uses direct compactor's alloc_flags, but that's
wrong, since these flags are not applicable for freepage isolation.

For example, alloc_flags may indicate access to memory reserves, making
compaction proceed, and then fail watermark check during the isolation.

A similar problem exists for ALLOC_CMA, which may be part of
alloc_flags, but not during freepage isolation.  In this case however it
makes sense to use ALLOC_CMA both in __compaction_suitable() and
__isolate_free_page(), since there's actually nothing preventing the
freepage scanner to isolate from CMA pageblocks, with the assumption
that a page that could be migrated once by compaction can be migrated
also later by CMA allocation.  Thus we should count pages in CMA
pageblocks when considering compaction suitability and when isolating
freepages.

To sum up, this patch should remove some false positives from
__compaction_suitable(), and allow compaction to proceed when free pages
required for compaction reside in the CMA pageblocks.

Link: http://lkml.kernel.org/r/20160810091226.6709-10-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:27 -07:00
Vlastimil Babka
9861a62c33 mm, compaction: create compact_gap wrapper
Compaction uses a watermark gap of (2UL << order) pages at various
places and it's not immediately obvious why.  Abstract it through a
compact_gap() wrapper to create a single place with a thorough
explanation.

[vbabka@suse.cz: clarify the comment of compact_gap()]
 Link: http://lkml.kernel.org/r/7b6aed1f-fdf8-2063-9ff4-bbe4de712d37@suse.cz
Link: http://lkml.kernel.org/r/20160810091226.6709-9-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:27 -07:00
Vlastimil Babka
f2b8228c5f mm, compaction: use correct watermark when checking compaction success
The __compact_finished() function uses low watermark in a check that has
to pass if the direct compaction is to finish and allocation should
succeed.  This is too pessimistic, as the allocation will typically use
min watermark.  It may happen that during compaction, we drop below the
low watermark (due to parallel activity), but still form the target
high-order page.  By checking against low watermark, we might needlessly
continue compaction.

Similarly, __compaction_suitable() uses low watermark in a check whether
allocation can succeed without compaction.  Again, this is unnecessarily
pessimistic.

After this patch, these check will use direct compactor's alloc_flags to
determine the watermark, which is effectively the min watermark.

Link: http://lkml.kernel.org/r/20160810091226.6709-8-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:27 -07:00
Vlastimil Babka
a8e025e55b mm, compaction: add the ultimate direct compaction priority
During reclaim/compaction loop, it's desirable to get a final answer
from unsuccessful compaction so we can either fail the allocation or
invoke the OOM killer.  However, heuristics such as deferred compaction
or pageblock skip bits can cause compaction to skip parts or whole zones
and lead to premature OOM's, failures or excessive reclaim/compaction
retries.

To remedy this, we introduce a new direct compaction priority called
COMPACT_PRIO_SYNC_FULL, which instructs direct compaction to:

 - ignore deferred compaction status for a zone
 - ignore pageblock skip hints
 - ignore cached scanner positions and scan the whole zone

The new priority should get eventually picked up by
should_compact_retry() and this should improve success rates for costly
allocations using __GFP_REPEAT, such as hugetlbfs allocations, and
reduce some corner-case OOM's for non-costly allocations.

Link: http://lkml.kernel.org/r/20160810091226.6709-6-vbabka@suse.cz
[vbabka@suse.cz: use the MIN_COMPACT_PRIORITY alias]
  Link: http://lkml.kernel.org/r/d443b884-87e7-1c93-8684-3a3a35759fb1@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:27 -07:00
Vlastimil Babka
7ceb009a22 mm, compaction: don't recheck watermarks after COMPACT_SUCCESS
Joonsoo has reminded me that in a later patch changing watermark checks
throughout compaction I forgot to update checks in
try_to_compact_pages() and compactd_do_work().  Closer inspection
however shows that they are redundant now in the success case, because
compact_zone() now reliably reports this with COMPACT_SUCCESS.  So
effectively the checks just repeat (a subset) of checks that have just
passed.  So instead of checking watermarks again, just test the return
value.

Note it's also possible that compaction would declare failure e.g.
because its find_suitable_fallback() is more strict than simple
watermark check, and then the watermark check we are removing would then
still succeed.  After this patch this is not possible and it's arguably
better, because for long-term fragmentation avoidance we should rather
try a different zone than allocate with the unsuitable fallback.  If
compaction of all zones fail and the allocation is important enough, it
will retry and succeed anyway.

Also remove the stray "bool success" variable from kcompactd_do_work().

Link: http://lkml.kernel.org/r/20160810091226.6709-5-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Tested-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:27 -07:00
Vlastimil Babka
cf378319d3 mm, compaction: rename COMPACT_PARTIAL to COMPACT_SUCCESS
COMPACT_PARTIAL has historically meant that compaction returned after
doing some work without fully compacting a zone.  It however didn't
distinguish if compaction terminated because it succeeded in creating
the requested high-order page.  This has changed recently and now we
only return COMPACT_PARTIAL when compaction thinks it succeeded, or the
high-order watermark check in compaction_suitable() passes and no
compaction needs to be done.

So at this point we can make the return value clearer by renaming it to
COMPACT_SUCCESS.  The next patch will remove some redundant tests for
success where compaction just returned COMPACT_SUCCESS.

Link: http://lkml.kernel.org/r/20160810091226.6709-4-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:27 -07:00
Vlastimil Babka
791cae9620 mm, compaction: cleanup unused functions
Since kswapd compaction moved to kcompactd, compact_pgdat() is not
called anymore, so we remove it.  The only caller of __compact_pgdat()
is compact_node(), so we merge them and remove code that was only
reachable from kswapd.

Link: http://lkml.kernel.org/r/20160810091226.6709-3-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:27 -07:00
Vlastimil Babka
06ed29989f mm, compaction: make whole_zone flag ignore cached scanner positions
Patch series "make direct compaction more deterministic")

This is mostly a followup to Michal's oom detection rework, which
highlighted the need for direct compaction to provide better feedback in
reclaim/compaction loop, so that it can reliably recognize when
compaction cannot make further progress, and allocation should invoke
OOM killer or fail.  We've discussed this at LSF/MM [1] where I proposed
expanding the async/sync migration mode used in compaction to more
general "priorities".  This patchset adds one new priority that just
overrides all the heuristics and makes compaction fully scan all zones.
I don't currently think that we need more fine-grained priorities, but
we'll see.  Other than that there's some smaller fixes and cleanups,
mainly related to the THP-specific hacks.

I've tested this with stress-highalloc in GFP_KERNEL order-4 and
THP-like order-9 scenarios.  There's some improvement for compaction
stats for the order-4, which is likely due to the better watermarks
handling.  In the previous version I reported mostly noise wrt
compaction stats, and decreased direct reclaim - now the reclaim is
without difference.  I believe this is due to the less aggressive
compaction priority increase in patch 6.

"before" is a mmotm tree prior to 4.7 release plus the first part of the
series that was sent and merged separately

                                    before        after
order-4:

Compaction stalls                    27216       30759
Compaction success                   19598       25475
Compaction failures                   7617        5283
Page migrate success                370510      464919
Page migrate failure                 25712       27987
Compaction pages isolated           849601     1041581
Compaction migrate scanned       143146541   101084990
Compaction free scanned          208355124   144863510
Compaction cost                       1403        1210

order-9:

Compaction stalls                     7311        7401
Compaction success                    1634        1683
Compaction failures                   5677        5718
Page migrate success                194657      183988
Page migrate failure                  4753        4170
Compaction pages isolated           498790      456130
Compaction migrate scanned          565371      524174
Compaction free scanned            4230296     4250744
Compaction cost                        215         203

[1] https://lwn.net/Articles/684611/

This patch (of 11):

A recent patch has added whole_zone flag that compaction sets when
scanning starts from the zone boundary, in order to report that zone has
been fully scanned in one attempt.  For allocations that want to try
really hard or cannot fail, we will want to introduce a mode where
scanning whole zone is guaranteed regardless of the cached positions.

This patch reuses the whole_zone flag in a way that if it's already
passed true to compaction, the cached scanner positions are ignored.
Employing this flag during reclaim/compaction loop will be done in the
next patch.  This patch however converts compaction invoked from
userspace via procfs to use this flag.  Before this patch, the cached
positions were first reset to zone boundaries and then read back from
struct zone, so there was a window where a parallel compaction could
replace the reset values, making the manual compaction less effective.
Using the flag instead of performing reset is more robust.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20160810091226.6709-2-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:27 -07:00
Vlastimil Babka
c3486f5376 mm, compaction: simplify contended compaction handling
Async compaction detects contention either due to failing trylock on
zone->lock or lru_lock, or by need_resched().  Since 1f9efdef4f3f ("mm,
compaction: khugepaged should not give up due to need_resched()") the
code got quite complicated to distinguish these two up to the
__alloc_pages_slowpath() level, so different decisions could be taken
for khugepaged allocations.

After the recent changes, khugepaged allocations don't check for
contended compaction anymore, so we again don't need to distinguish lock
and sched contention, and simplify the current convoluted code a lot.

However, I believe it's also possible to simplify even more and
completely remove the check for contended compaction after the initial
async compaction for costly orders, which was originally aimed at THP
page fault allocations.  There are several reasons why this can be done
now:

- with the new defaults, THP page faults no longer do reclaim/compaction at
  all, unless the system admin has overridden the default, or application has
  indicated via madvise that it can benefit from THP's. In both cases, it
  means that the potential extra latency is expected and worth the benefits.
- even if reclaim/compaction proceeds after this patch where it previously
  wouldn't, the second compaction attempt is still async and will detect the
  contention and back off, if the contention persists
- there are still heuristics like deferred compaction and pageblock skip bits
  in place that prevent excessive THP page fault latencies

Link: http://lkml.kernel.org/r/20160721073614.24395-9-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Vlastimil Babka
a5508cd83f mm, compaction: introduce direct compaction priority
In the context of direct compaction, for some types of allocations we
would like the compaction to either succeed or definitely fail while
trying as hard as possible.  Current async/sync_light migration mode is
insufficient, as there are heuristics such as caching scanner positions,
marking pageblocks as unsuitable or deferring compaction for a zone.  At
least the final compaction attempt should be able to override these
heuristics.

To communicate how hard compaction should try, we replace migration mode
with a new enum compact_priority and change the relevant function
signatures.  In compact_zone_order() where struct compact_control is
constructed, the priority is mapped to suitable control flags.  This
patch itself has no functional change, as the current priority levels
are mapped back to the same migration modes as before.  Expanding them
will be done next.

Note that !CONFIG_COMPACTION variant of try_to_compact_pages() is
removed, as the only caller exists under CONFIG_COMPACTION.

Link: http://lkml.kernel.org/r/20160721073614.24395-8-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Hugh Dickins
1d2047fefa mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode
At present MIGRATE_SYNC_LIGHT is allowing __isolate_lru_page() to
isolate a PageWriteback page, which __unmap_and_move() then rejects with
-EBUSY: of course the writeback might complete in between, but that's
not what we usually expect, so probably better not to isolate it.

When tested by stress-highalloc from mmtests, this has reduced the
number of page migrate failures by 60-70%.

Link: http://lkml.kernel.org/r/20160721073614.24395-2-vbabka@suse.cz
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman
5a1c84b404 mm: remove reclaim and compaction retry approximations
If per-zone LRU accounting is available then there is no point
approximating whether reclaim and compaction should retry based on pgdat
statistics.  This is effectively a revert of "mm, vmstat: remove zone
and node double accounting by approximating retries" with the difference
that inactive/active stats are still available.  This preserves the
history of why the approximation was retried and why it had to be
reverted to handle OOM kills on 32-bit systems.

Link: http://lkml.kernel.org/r/1469110261-7365-4-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman
bca6759258 mm, vmstat: remove zone and node double accounting by approximating retries
The number of LRU pages, dirty pages and writeback pages must be
accounted for on both zones and nodes because of the reclaim retry
logic, compaction retry logic and highmem calculations all depending on
per-zone stats.

Many lowmem allocations are immune from OOM kill due to a check in
__alloc_pages_may_oom for (ac->high_zoneidx < ZONE_NORMAL) since commit
03668b3ceb0c ("oom: avoid oom killer for lowmem allocations").  The
exception is costly high-order allocations or allocations that cannot
fail.  If the __alloc_pages_may_oom avoids OOM-kill for low-order lowmem
allocations then it would fall through to __alloc_pages_direct_compact.

This patch will blindly retry reclaim for zone-constrained allocations
in should_reclaim_retry up to MAX_RECLAIM_RETRIES.  This is not ideal
but without per-zone stats there are not many alternatives.  The impact
it that zone-constrained allocations may delay before considering the
OOM killer.

As there is no guarantee enough memory can ever be freed to satisfy
compaction, this patch avoids retrying compaction for zone-contrained
allocations.

In combination, that means that the per-node stats can be used when
deciding whether to continue reclaim using a rough approximation.  While
it is possible this will make the wrong decision on occasion, it will
not infinite loop as the number of reclaim attempts is capped by
MAX_RECLAIM_RETRIES.

The final step is calculating the number of dirtyable highmem pages.  As
those calculations only care about the global count of file pages in
highmem.  This patch uses a global counter used instead of per-zone
stats as it is sufficient.

In combination, this allows the per-zone LRU and dirty state counters to
be removed.

[mgorman@techsingularity.net: fix acct_highmem_file_pages()]
  Link: http://lkml.kernel.org/r/1468853426-12858-4-git-send-email-mgorman@techsingularity.netLink: http://lkml.kernel.org/r/1467970510-21195-35-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Suggested by: Michal Hocko <mhocko@kernel.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman
599d0c954f mm, vmscan: move LRU lists to node
This moves the LRU lists from the zone to the node and related data such
as counters, tracing, congestion tracking and writeback tracking.

Unfortunately, due to reclaim and compaction retry logic, it is
necessary to account for the number of LRU pages on both zone and node
logic.  Most reclaim logic is based on the node counters but the retry
logic uses the zone counters which do not distinguish inactive and
active sizes.  It would be possible to leave the LRU counters on a
per-zone basis but it's a heavier calculation across multiple cache
lines that is much more frequent than the retry checks.

Other than the LRU counters, this is mostly a mechanical patch but note
that it introduces a number of anomalies.  For example, the scans are
per-zone but using per-node counters.  We also mark a node as congested
when a zone is congested.  This causes weird problems that are fixed
later but is easier to review.

In the event that there is excessive overhead on 32-bit systems due to
the nodes being on LRU then there are two potential solutions

1. Long-term isolation of highmem pages when reclaim is lowmem

   When pages are skipped, they are immediately added back onto the LRU
   list. If lowmem reclaim persisted for long periods of time, the same
   highmem pages get continually scanned. The idea would be that lowmem
   keeps those pages on a separate list until a reclaim for highmem pages
   arrives that splices the highmem pages back onto the LRU. It potentially
   could be implemented similar to the UNEVICTABLE list.

   That would reduce the skip rate with the potential corner case is that
   highmem pages have to be scanned and reclaimed to free lowmem slab pages.

2. Linear scan lowmem pages if the initial LRU shrink fails

   This will break LRU ordering but may be preferable and faster during
   memory pressure than skipping LRU pages.

Link: http://lkml.kernel.org/r/1467970510-21195-4-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Mel Gorman
a52633d8e9 mm, vmscan: move lru_lock to the node
Node-based reclaim requires node-based LRUs and locking.  This is a
preparation patch that just moves the lru_lock to the node so later
patches are easier to review.  It is a mechanical change but note this
patch makes contention worse because the LRU lock is hotter and direct
reclaim and kswapd can contend on the same lock even when reclaiming
from different zones.

Link: http://lkml.kernel.org/r/1467970510-21195-3-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Ganesh Mahendran
b2b331f966 mm/compaction: remove unnecessary order check in try_to_compact_pages()
The caller __alloc_pages_direct_compact() already checked (order == 0)
so there's no need to check again.

Link: http://lkml.kernel.org/r/1465973568-3496-1-git-send-email-opensource.ganesh@gmail.com
Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00