mirror of
https://github.com/rd-stuffs/msm-4.14.git
synced 2025-02-20 11:45:48 +08:00
454 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
57c2bb95b7 |
Merge android-4.14-p.105 (e742253) into msm-4.14
* refs/heads/tmp-e742253: Linux 4.14.105 x86/uaccess: Don't leak the AC flag into __put_user() value evaluation MIPS: eBPF: Fix icache flush end address MIPS: fix truncation in __cmpxchg_small for short values mm: enforce min addr even if capable() in expand_downwards() mmc: sdhci-esdhc-imx: correct the fix of ERR004536 mmc: tmio: fix access width of Block Count Register mmc: tmio_mmc_core: don't claim spurious interrupts mmc: spi: Fix card detection during probe powerpc: Always initialize input array when calling epapr_hypercall() KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1 svm: Fix AVIC incomplete IPI emulation cfg80211: extend range deviation for DMG mac80211: Add attribute aligned(2) to struct 'action' mac80211: don't initiate TDLS connection if station is not associated to AP ibmveth: Do not process frames after calling napi_reschedule net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP net: usb: asix: ax88772_bind return error when hw_reset fail hv_netvsc: Fix ethtool change hash key error net: altera_tse: fix connect_local_phy error path scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state() writeback: synchronize sync(2) against cgroup writeback membership switches direct-io: allow direct writes to empty inodes staging: android: ion: Support cpu access during dma_buf_detach serial: fsl_lpuart: fix maximum acceptable baud rate with over-sampling drm/amd/powerplay: OD setting fix on Vega10 locking/rwsem: Fix (possible) missed wakeup futex: Fix (possible) missed wakeup sched/wait: Fix rcuwait_wake_up() ordering mac80211: fix miscounting of ttl-dropped frames staging: rtl8723bs: Fix build error with Clang when inlining is disabled drivers: thermal: int340x_thermal: Fix sysfs race condition ARC: fix __ffs return value to avoid build warnings selftests: gpio-mockup-chardev: Check asprintf() for error selftests: seccomp: use LDLIBS instead of LDFLAGS ASoC: imx-audmux: change snprintf to scnprintf for possible overflow ASoC: dapm: change snprintf to scnprintf for possible overflow genirq: Make sure the initial affinity is not empty usb: gadget: Potential NULL dereference on allocation error usb: dwc3: gadget: Fix the uninitialized link_state when udc starts usb: dwc3: gadget: synchronize_irq dwc irq in suspend thermal: int340x_thermal: Fix a NULL vs IS_ERR() check clk: vc5: Abort clock configuration without upstream clock ASoC: Variable "val" in function rt274_i2c_probe() could be uninitialized ALSA: compress: prevent potential divide by zero bugs ASoC: Intel: Haswell/Broadwell: fix setting for .dynamic field drm/msm: Unblock writer if reader closes file scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached net: stmmac: Disable ACS Feature for GMAC >= 4 net: stmmac: Fix reception of Broadcom switches tags Revert "loop: Fold __loop_release into loop_release" Revert "loop: Get rid of loop_index_mutex" Revert "loop: Fix double mutex_unlock(&loop_ctl_mutex) in loop_control_ioctl()" FROMGIT: binder: create node flag to request sender's security context Modify include/uapi/linux/android/binder.h, as commit: FROMGIT: binder: create node flag to request sender's security context introduces enums and structures, which are already defined in other userspace files that include the binder uapi file. Thus, the redeclaration of these enums and structures can lead to build errors. To avoid this, guard the redundant declarations in the uapi header with the __KERNEL__ header guard, so they are not exported to userspace. Conflicts: drivers/gpu/drm/msm/msm_rd.c drivers/staging/android/ion/ion.c include/uapi/linux/android/binder.h sound/core/compress_offload.c Change-Id: I5d470f222a6a1baa284813a11f847cfcbe6ee0a6 Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org> Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> |
||
|
e742253322 |
This is the 4.14.105 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlx+qpsACgkQONu9yGCS aT6+Aw//WiHXgsyk+jnwuOzJ0rvx38ky6l7RRm6uZ5r2M+aYQJ12FEc+AVomyu6T lA7ciLeIWNQFO0xQJ5Tg8LMPumL4JbA/EVElPu5h5yHMXySkQuC6WLXnH083d+dZ QbXEk6O9xuIxbk2sy58BmQnmqdXipTnpNIXpT/137Rlz1/jzxyv8PpsihitFI9/C 3HrMHgSodKEOnZbWdruYv30Ac6nMrWolGmGaQiRszE7FfoFLyGl3KEX7kp6sMAKU P4L+9qUDswRVs4Zfa07XiNlXAmZUYDzZPkQVstsgKOmXNHtBrSLE94pYllsHqW8C GDbtW5+ZpUm89LObhlZWPcxLXQkOUJUOb2SJChtaPUmOT0rEoExTB0bH1SADQM5d Mb3jLGD9CVrCjgxB4J6871xeCZPIL1XIXEKs0lDz7YDOSF7WTYXTrQOAiA+PZLlR VcD7WO7Z25XYP15WcU4ypzswsLPGro3QQi/nzk9qyh/NQt65LlpW/USZ3hiyp/Sh u4Wh2rrvAWyrOi01ShRmn87S9X/IuzB9cuzthxM7rWrF+/uj+vHOQp+/BAEFkkub njis6oFbkB2xfykRA99w1tLe6to3qNcWFtJuOBxjVFdbB24pqIWYmj4dwyE0TQCD 7QLfDCXx3vBDAg30Qbxx4Bh2w0ruggtHWa+D24Qz4YehJv1ppNY= =Ofi4 -----END PGP SIGNATURE----- Merge 4.14.105 into android-4.14-p Changes in 4.14.105 Revert "loop: Fix double mutex_unlock(&loop_ctl_mutex) in loop_control_ioctl()" Revert "loop: Get rid of loop_index_mutex" Revert "loop: Fold __loop_release into loop_release" net: stmmac: Fix reception of Broadcom switches tags net: stmmac: Disable ACS Feature for GMAC >= 4 scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached drm/msm: Unblock writer if reader closes file ASoC: Intel: Haswell/Broadwell: fix setting for .dynamic field ALSA: compress: prevent potential divide by zero bugs ASoC: Variable "val" in function rt274_i2c_probe() could be uninitialized clk: vc5: Abort clock configuration without upstream clock thermal: int340x_thermal: Fix a NULL vs IS_ERR() check usb: dwc3: gadget: synchronize_irq dwc irq in suspend usb: dwc3: gadget: Fix the uninitialized link_state when udc starts usb: gadget: Potential NULL dereference on allocation error genirq: Make sure the initial affinity is not empty ASoC: dapm: change snprintf to scnprintf for possible overflow ASoC: imx-audmux: change snprintf to scnprintf for possible overflow selftests: seccomp: use LDLIBS instead of LDFLAGS selftests: gpio-mockup-chardev: Check asprintf() for error ARC: fix __ffs return value to avoid build warnings drivers: thermal: int340x_thermal: Fix sysfs race condition staging: rtl8723bs: Fix build error with Clang when inlining is disabled mac80211: fix miscounting of ttl-dropped frames sched/wait: Fix rcuwait_wake_up() ordering futex: Fix (possible) missed wakeup locking/rwsem: Fix (possible) missed wakeup drm/amd/powerplay: OD setting fix on Vega10 serial: fsl_lpuart: fix maximum acceptable baud rate with over-sampling staging: android: ion: Support cpu access during dma_buf_detach direct-io: allow direct writes to empty inodes writeback: synchronize sync(2) against cgroup writeback membership switches scsi: csiostor: fix NULL pointer dereference in csio_vport_set_state() net: altera_tse: fix connect_local_phy error path hv_netvsc: Fix ethtool change hash key error net: usb: asix: ax88772_bind return error when hw_reset fail net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP ibmveth: Do not process frames after calling napi_reschedule mac80211: don't initiate TDLS connection if station is not associated to AP mac80211: Add attribute aligned(2) to struct 'action' cfg80211: extend range deviation for DMG svm: Fix AVIC incomplete IPI emulation KVM: nSVM: clear events pending from svm_complete_interrupts() when exiting to L1 powerpc: Always initialize input array when calling epapr_hypercall() mmc: spi: Fix card detection during probe mmc: tmio_mmc_core: don't claim spurious interrupts mmc: tmio: fix access width of Block Count Register mmc: sdhci-esdhc-imx: correct the fix of ERR004536 mm: enforce min addr even if capable() in expand_downwards() MIPS: fix truncation in __cmpxchg_small for short values MIPS: eBPF: Fix icache flush end address x86/uaccess: Don't leak the AC flag into __put_user() value evaluation Linux 4.14.105 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
f581706924 |
mm: enforce min addr even if capable() in expand_downwards()
commit 0a1d52994d440e21def1c2174932410b4f2a98a1 upstream. security_mmap_addr() does a capability check with current_cred(), but we can reach this code from contexts like a VFS write handler where current_cred() must not be used. This can be abused on systems without SMAP to make NULL pointer dereferences exploitable again. Fixes: 8869477a49c3 ("security: protect from stack expansion into low vm addresses") Cc: stable@kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
0ceec5e7d7 |
mm, oom: remove oom_lock from oom_reaper
oom_reaper used to rely on the oom_lock since e2fe14564d33 ("oom_reaper: close race with exiting task"). We do not really need the lock anymore though. 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") has removed serialization with the exit path based on the mm reference count and so we do not really rely on the oom_lock anymore. Tetsuo was arguing that at least MMF_OOM_SKIP should be set under the lock to prevent from races when the page allocator didn't manage to get the freed (reaped) memory in __alloc_pages_may_oom but it sees the flag later on and move on to another victim. Although this is possible in principle let's wait for it to actually happen in real life before we make the locking more complex again. Therefore remove the oom_lock for oom_reaper paths (both exit_mmap and oom_reap_task_mm). The reaper serializes with exit_mmap by mmap_sem + MMF_OOM_SKIP flag. There is no synchronization with out_of_memory path now. Change-Id: Ifca2e3b07d4382e7a4c535306a83fbde6640ba1c Link: https://patchwork.kernel.org/patch/10533755/ Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Git-commit: af5679fbc669f31f7ebd0d473bca76c24c07de30 Suggested-by: David Rientjes <rientjes@google.com> Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org> |
||
|
2973dadc19 |
Merge android-4.14.56 (818299f) into msm-4.14
* refs/heads/tmp-818299f Linux 4.14.56 f2fs: give message and set need_fsck given broken node id loop: remember whether sysfs_create_group() was done RDMA/ucm: Mark UCM interface as BROKEN PM / hibernate: Fix oops at snapshot_write() loop: add recursion validation to LOOP_CHANGE_FD netfilter: x_tables: initialise match/target check parameter struct netfilter: nf_queue: augment nfqa_cfg_policy uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn() crypto: x86/salsa20 - remove x86 salsa20 implementations nvme-pci: Remap CMB SQ entries on every controller reset xen: setup pv irq ops vector earlier iw_cxgb4: correctly enforce the max reg_mr depth i2c: tegra: Fix NACK error handling IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values tools build: fix # escaping in .cmd files for future Make arm64: neon: Fix function may_use_simd() return error status kbuild: delete INSTALL_FW_PATH from kbuild documentation tracing: Reorder display of TGID to be after PID mm: do not bug_on on incorrect length in __mm_populate() fs, elf: make sure to page align bss in load_elf_library fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps* mm: do not drop unused pages when userfaultd is running ALSA: hda - Handle pm failure during hotplug ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION scsi: megaraid_sas: fix selection of reply queue scsi: megaraid_sas: Create separate functions to allocate ctrl memory scsi: megaraid_sas: replace is_ventura with adapter_type checks scsi: megaraid_sas: replace instance->ctrl_context checks with instance->adapter_type scsi: megaraid_sas: use adapter_type for all gen controllers genirq/affinity: assign vectors to all possible CPUs Fix up non-directory creation in SGID directories devpts: resolve devpts bind-mounts devpts: hoist out check for DEVPTS_SUPER_MAGIC xhci: xhci-mem: off by one in xhci_stream_id_to_ring() usb: quirks: add delay quirks for Corsair Strafe USB: serial: mos7840: fix status-register error handling USB: yurex: fix out-of-bounds uaccess in read handler USB: serial: keyspan_pda: fix modem-status error handling USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick USB: serial: ch341: fix type promotion bug in ch341_control_in() ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS vmw_balloon: fix inflation with batching ata: Fix ZBC_OUT all bit handling ata: Fix ZBC_OUT command block check staging: r8822be: Fix RTL8822be can't find any wireless AP staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data(). ibmasm: don't write out of bounds in read handler mmc: dw_mmc: fix card threshold control configuration mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states MIPS: Fix ioremap() RAM check MIPS: Use async IPIs for arch_trigger_cpumask_backtrace() MIPS: Call dump_stack() from show_regs() ASoC: mediatek: preallocate pages use platform device media: rc: mce_kbd decoder: fix stuck keys ANDROID: Fix massive cpufreq_times memory leaks ANDROID: Reduce use of #ifdef CONFIG_CPU_FREQ_TIMES Change-Id: I8181c52138e12e6cdd25b9cf0ffba19469593ab2 Signed-off-by: Blagovest Kolenichev <bkolenichev@codeaurora.org> Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> |
||
|
818299f6bd |
This is the 4.14.56 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAltNuVYACgkQONu9yGCS aT7kTA/+MRHC5oFvdnhSsF6jAHsY9rgJNQXPtZCFhZnHhhYHtubQ2OJOmSZ7IfM0 9yhz/7vijC9+tLufXQxQnu2UUL3ojNu1+l+q9s0U1GUzNiONlJ9q/CyB4xjXFRCS 1RdiDZaQbIqUCYs38UCTsEJF65uKjzQ6dpF21XdIXp5FPxgiZawo4HpjQRJswbAl Du97ybMEPN3XnAn207GjZwy58ubRLF5HDG1sqNGfjVWJ7oMTi+QJOCvY3PJtU3j2 unS0qjxLU432rOyDfaJK7Yj9s61zu0PurbJrHo+dw3O3hd/Og7soqoqohUEjZWXd z7jjrntXZOZ/0st2yHmygfAPUJm/8jsh7Pd39Jgyfeu/3Clo51gO494rwATQsyE5 mwIdllyzyMNBEJI2F2fxE60WlFsbTjeBOX3BaOwnF8pGRJWsCAfbFknRbuKh1fO5 czFbUSOi00POw4WHT1rxV9u0yDBXmP47fy9zHquOim+PfK8pFvWuf6GSFjvqRTv8 20w1w7eixMi09ZXOkgTJ3S00MKHSpxoaenI3n2NcEVVRgDEVfh3C/zelvvfCDMHD i36DN39Sj41PNA/R4n0TIA4W+ab9qBVzQl16yaj9JURR2rA92GyMVC1+Xjqo1Py3 GRFOf2Gprlm0/vfkiRsMu9coAJuKV6+8fHXQU4mzHulKUaDWuJ0= =/wBU -----END PGP SIGNATURE----- Merge 4.14.56 into android-4.14 Changes in 4.14.56 media: rc: mce_kbd decoder: fix stuck keys ASoC: mediatek: preallocate pages use platform device MIPS: Call dump_stack() from show_regs() MIPS: Use async IPIs for arch_trigger_cpumask_backtrace() MIPS: Fix ioremap() RAM check mmc: sdhci-esdhc-imx: allow 1.8V modes without 100/200MHz pinctrl states mmc: dw_mmc: fix card threshold control configuration ibmasm: don't write out of bounds in read handler staging: rtl8723bs: Prevent an underflow in rtw_check_beacon_data(). staging: r8822be: Fix RTL8822be can't find any wireless AP ata: Fix ZBC_OUT command block check ata: Fix ZBC_OUT all bit handling vmw_balloon: fix inflation with batching ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS USB: serial: ch341: fix type promotion bug in ch341_control_in() USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick USB: serial: keyspan_pda: fix modem-status error handling USB: yurex: fix out-of-bounds uaccess in read handler USB: serial: mos7840: fix status-register error handling usb: quirks: add delay quirks for Corsair Strafe xhci: xhci-mem: off by one in xhci_stream_id_to_ring() devpts: hoist out check for DEVPTS_SUPER_MAGIC devpts: resolve devpts bind-mounts Fix up non-directory creation in SGID directories genirq/affinity: assign vectors to all possible CPUs scsi: megaraid_sas: use adapter_type for all gen controllers scsi: megaraid_sas: replace instance->ctrl_context checks with instance->adapter_type scsi: megaraid_sas: replace is_ventura with adapter_type checks scsi: megaraid_sas: Create separate functions to allocate ctrl memory scsi: megaraid_sas: fix selection of reply queue ALSA: hda/realtek - two more lenovo models need fixup of MIC_LOCATION ALSA: hda - Handle pm failure during hotplug mm: do not drop unused pages when userfaultd is running fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps* fs, elf: make sure to page align bss in load_elf_library mm: do not bug_on on incorrect length in __mm_populate() tracing: Reorder display of TGID to be after PID kbuild: delete INSTALL_FW_PATH from kbuild documentation arm64: neon: Fix function may_use_simd() return error status tools build: fix # escaping in .cmd files for future Make IB/hfi1: Fix incorrect mixing of ERR_PTR and NULL return values i2c: tegra: Fix NACK error handling iw_cxgb4: correctly enforce the max reg_mr depth xen: setup pv irq ops vector earlier nvme-pci: Remap CMB SQ entries on every controller reset crypto: x86/salsa20 - remove x86 salsa20 implementations uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn() netfilter: nf_queue: augment nfqa_cfg_policy netfilter: x_tables: initialise match/target check parameter struct loop: add recursion validation to LOOP_CHANGE_FD PM / hibernate: Fix oops at snapshot_write() RDMA/ucm: Mark UCM interface as BROKEN loop: remember whether sysfs_create_group() was done f2fs: give message and set need_fsck given broken node id Linux 4.14.56 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
81ebc9decd |
mm: do not bug_on on incorrect length in __mm_populate()
commit bb177a732c4369bb58a1fe1df8f552b6f0f7db5f upstream. syzbot has noticed that a specially crafted library can easily hit VM_BUG_ON in __mm_populate kernel BUG at mm/gup.c:1242! invalid opcode: 0000 [#1] SMP CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 RIP: 0010:__mm_populate+0x1e2/0x1f0 Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb Call Trace: vm_brk_flags+0xc3/0x100 vm_brk+0x1f/0x30 load_elf_library+0x281/0x2e0 __ia32_sys_uselib+0x170/0x1e0 do_fast_syscall_32+0xca/0x420 entry_SYSENTER_compat+0x70/0x7f The reason is that the length of the new brk is not page aligned when we try to populate the it. There is no reason to bug on that though. do_brk_flags already aligns the length properly so the mapping is expanded as it should. All we need is to tell mm_populate about it. Besides that there is absolutely no reason to to bug_on in the first place. The worst thing that could happen is that the last page wouldn't get populated and that is far from putting system into an inconsistent state. Fix the issue by moving the length sanitization code from do_brk_flags up to vm_brk_flags. The only other caller of do_brk_flags is brk syscall entry and it makes sure to provide the proper length so t here is no need for sanitation and so we can use do_brk_flags without it. Also remove the bogus BUG_ONs. [osalvador@techadventures.net: fix up vm_brk_flags s@request@len@] Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: syzbot <syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com> Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
6b00789828 | Merge "Merge android-4.14.49 into msm-4.14" | ||
|
37f5b3d9c7 |
This is the 4.14.49 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlse4FIACgkQONu9yGCS aT5ZqxAAqCPguhXiTIWJPL0760M4I8C/cTLgl0JWpD946cHaQJpUhmuiXfc91+KO 1mhgjW6PQjojWJmj5VYM2aCkvUoZ4sowtMaiGuhjLL1Hr+s879wI0wB+9uO2gy3O 8si9gLV1qoa6QOAe5hbGiyOhkv0OYpRq290ar/NFVS+sKucwJEMFY/rLhjedNGlY BmT4C8xK5D/s/r6rRAGCyxQNar8RcFd7C+iEZXvQsybi4euJAd9DcpvileKORNxj bfBhHyFikbvN6Pfb6ooD9q88y8U3Tvk+sCZJI15acRePIJ4eJPtkjKkunsGms3jM cIA9WuC7NHSGMf8ZJAFAaxfNCepWpeb5lPSdkMuIjWhwQ22OhZR4eS+PQvhjojUl Z3Ry8fvxtMz4hseaeg+8mUJohLP1+v6GAIAGm9XRwElyTC0RKzKaUkK//zlrmmy3 wR8vpgdPBZJlvqE61+aNWcnLpKxRp/aWMTklXNVBsnL2FVfl0+8iBz6ylDjfmAx5 kRGdbWuCcHVfVq91fAwIwOIrHZg2Pi3/pIHQQXljqpSxG7vQrmPj2uiaYPTyAjZM LAVtE21TdtZX+4idHDZ8nNew204cr+ug1X373vugS8/NdLovpjb7vmxUusNy9ghG A5hif1fa8HEb3bCsUEBgZdkpD46uvk1IZc8js/e2zb5FmR3V8vM= =px2V -----END PGP SIGNATURE----- Merge 4.14.49 into android-4.14 Changes in 4.14.49 scsi: sd_zbc: Fix potential memory leak scsi: sd_zbc: Avoid that resetting a zone fails sporadically mmap: introduce sane default mmap limits mmap: relax file size limit for regular files btrfs: define SUPER_FLAG_METADUMP_V2 kconfig: Avoid format overflow warning from GCC 8.1 be2net: Fix error detection logic for BE3 bnx2x: use the right constant dccp: don't free ccid2_hc_tx_sock struct in dccp_disconnect() enic: set DMA mask to 47 bit ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds ip6_tunnel: remove magic mtu value 0xFFF8 ipmr: properly check rhltable_init() return value ipv4: remove warning in ip_recv_error ipv6: omit traffic class when calculating flow hash isdn: eicon: fix a missing-check bug kcm: Fix use-after-free caused by clonned sockets netdev-FAQ: clarify DaveM's position for stable backports net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy net: metrics: add proper netlink validation net/packet: refine check for priv area size net: phy: broadcom: Fix bcm_write_exp() net: usb: cdc_mbim: add flag FLAG_SEND_ZLP packet: fix reserve calculation qed: Fix mask for physical address in ILT entry sctp: not allow transport timeout value less than HZ/5 for hb_timer team: use netdev_features_t instead of u32 vhost: synchronize IOTLB message with dev cleanup vrf: check the original netdevice for generating redirect ipv6: sr: fix memory OOB access in seg6_do_srh_encap/inline net: phy: broadcom: Fix auxiliary control register reads net-sysfs: Fix memory leak in XPS configuration virtio-net: correctly transmit XDP buff after linearizing net/mlx4: Fix irq-unsafe spinlock usage tun: Fix NULL pointer dereference in XDP redirect virtio-net: correctly check num_buf during err path net/mlx5e: When RXFCS is set, add FCS data into checksum calculation virtio-net: fix leaking page for gso packet during mergeable XDP rtnetlink: validate attributes in do_setlink() cls_flower: Fix incorrect idr release when failing to modify rule PCI: hv: Do not wait forever on a device that has disappeared drm: set FMODE_UNSIGNED_OFFSET for drm files Linux 4.14.49 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
af760b568e |
mmap: relax file size limit for regular files
commit 423913ad4ae5b3e8fb8983f70969fb522261ba26 upstream. Commit be83bbf80682 ("mmap: introduce sane default mmap limits") was introduced to catch problems in various ad-hoc character device drivers doing mmap and getting the size limits wrong. In the process, it used "known good" limits for the normal cases of mapping regular files and block device drivers. It turns out that the "s_maxbytes" limit was less "known good" than I thought. In particular, /proc doesn't set it, but exposes one regular file to mmap: /proc/vmcore. As a result, that file got limited to the default MAX_INT s_maxbytes value. This went unnoticed for a while, because apparently the only thing that needs it is the s390 kernel zfcpdump, but there might be other tools that use this too. Vasily suggested just changing s_maxbytes for all of /proc, which isn't wrong, but makes me nervous at this stage. So instead, just make the new mmap limit always be MAX_LFS_FILESIZE for regular files, which won't affect anything else. It wasn't the regular file case I was worried about. I'd really prefer for maxsize to have been per-inode, but that is not how things are today. Fixes: be83bbf80682 ("mmap: introduce sane default mmap limits") Reported-by: Vasily Gorbik <gor@linux.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
16d7ceb04b |
mmap: introduce sane default mmap limits
commit be83bbf806822b1b89e0a0f23cd87cddc409e429 upstream. The internal VM "mmap()" interfaces are based on the mmap target doing everything using page indexes rather than byte offsets, because traditionally (ie 32-bit) we had the situation that the byte offset didn't fit in a register. So while the mmap virtual address was limited by the word size of the architecture, the backing store was not. So we're basically passing "pgoff" around as a page index, in order to be able to describe backing store locations that are much bigger than the word size (think files larger than 4GB etc). But while this all makes a ton of sense conceptually, we've been dogged by various drivers that don't really understand this, and internally work with byte offsets, and then try to work with the page index by turning it into a byte offset with "pgoff << PAGE_SHIFT". Which obviously can overflow. Adding the size of the mapping to it to get the byte offset of the end of the backing store just exacerbates the problem, and if you then use this overflow-prone value to check various limits of your device driver mmap capability, you're just setting yourself up for problems. The correct thing for drivers to do is to do their limit math in page indices, the way the interface is designed. Because the generic mmap code _does_ test that the index doesn't overflow, since that's what the mmap code really cares about. HOWEVER. Finding and fixing various random drivers is a sisyphean task, so let's just see if we can just make the core mmap() code do the limiting for us. Realistically, the only "big" backing stores we need to care about are regular files and block devices, both of which are known to do this properly, and which have nice well-defined limits for how much data they can access. So let's special-case just those two known cases, and then limit other random mmap users to a backing store that still fits in "unsigned long". Realistically, that's not much of a limit at all on 64-bit, and on 32-bit architectures the only worry might be the GPU drivers, which can have big physical address spaces. To make it possible for drivers like that to say that they are 64-bit clean, this patch does repurpose the "FMODE_UNSIGNED_OFFSET" bit in the file flags to allow drivers to mark their file descriptors as safe in the full 64-bit mmap address space. [ The timing for doing this is less than optimal, and this should really go in a merge window. But realistically, this needs wide testing more than it needs anything else, and being main-line is the only way to do that. So the earlier the better, even if it's outside the proper development cycle - Linus ] Cc: Kees Cook <keescook@chromium.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Willy Tarreau <w@1wt.eu> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
eba2eb1ba0 |
mm: protect mm_rb tree with a rwlock
This change is inspired by the Peter's proposal patch [1] which was protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in that particular case, and it is introducing major performance degradation due to excessive scheduling operations. To allow access to the mm_rb tree without grabbing the mmap_sem, this patch is protecting it access using a rwlock. As the mm_rb tree is a O(log n) search it is safe to protect it using such a lock. The VMA cache is not protected by the new rwlock and it should not be used without holding the mmap_sem. To allow the picked VMA structure to be used once the rwlock is released, a use count is added to the VMA structure. When the VMA is allocated it is set to 1. Each time the VMA is picked with the rwlock held its use count is incremented. Each time the VMA is released it is decremented. When the use count hits zero, this means that the VMA is no more used and should be freed. This patch is preparing for 2 kind of VMA access : - as usual, under the control of the mmap_sem, - without holding the mmap_sem for the speculative page fault handler. Access done under the control the mmap_sem doesn't require to grab the rwlock to protect read access to the mm_rb tree, but access in write must be done under the protection of the rwlock too. This affects inserting and removing of elements in the RB tree. The patch is introducing 2 new functions: - vma_get() to find a VMA based on an address by holding the new rwlock. - vma_put() to release the VMA when its no more used. These services are designed to be used when access are made to the RB tree without holding the mmap_sem. When a VMA is removed from the RB tree, its vma->vm_rb field is cleared and we rely on the WMB done when releasing the rwlock to serialize the write with the RMB done in a later patch to check for the VMA's validity. When free_vma is called, the file associated with the VMA is closed immediately, but the policy and the file structure remained in used until the VMA's use count reach 0, which may happens later when exiting an in progress speculative page fault. [1] https://patchwork.kernel.org/patch/5108281/ Change-Id: I9ecc922b8efa4b28975cc6a8e9531284c24ac14e Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Patch-mainline: linux-mm @ Tue, 17 Apr 2018 16:33:23 [vinmenon@codeaurora.org: fix the return of put_vma] Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> |
||
|
f831a42848 |
mm: protect mremap() against SPF hanlder
If a thread is remapping an area while another one is faulting on the destination area, the SPF handler may fetch the vma from the RB tree before the pte has been moved by the other thread. This means that the moved ptes will overwrite those create by the page fault handler leading to page leaked. CPU 1 CPU2 enter mremap() unmap the dest area copy_vma() Enter speculative page fault handler >> at this time the dest area is present in the RB tree fetch the vma matching dest area create a pte as the VMA matched Exit the SPF handler <data written in the new page> move_ptes() > it is assumed that the dest area is empty, > the move ptes overwrite the page mapped by the CPU2. To prevent that, when the VMA matching the dest area is extended or created by copy_vma(), it should be marked as non available to the SPF handler. The usual way to so is to rely on vm_write_begin()/end(). This is already in __vma_adjust() called by copy_vma() (through vma_merge()). But __vma_adjust() is calling vm_write_end() before returning which create a window for another thread. This patch adds a new parameter to vma_merge() which is passed down to vma_adjust(). The assumption is that copy_vma() is returning a vma which should be released by calling vm_raw_write_end() by the callee once the ptes have been moved. Change-Id: Icd338ad6e9b3c97b7334d3b8d30a8badfa2a4efa Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Patch-mainline: linux-mm @ Tue, 17 Apr 2018 16:33:16 [vinmenon@codeaurora.org: changes in vma_merge arguments related to the anon vma user name which is not suppported upstream.] Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> |
||
|
4153099359 |
mm: protect VMA modifications using VMA sequence count
The VMA sequence count has been introduced to allow fast detection of VMA modification when running a page fault handler without holding the mmap_sem. This patch provides protection against the VMA modification done in : - madvise() - mpol_rebind_policy() - vma_replace_policy() - change_prot_numa() - mlock(), munlock() - mprotect() - mmap_region() - collapse_huge_page() - userfaultd registering services In addition, VMA fields which will be read during the speculative fault path needs to be written using WRITE_ONCE to prevent write to be split and intermediate values to be pushed to other CPUs. Change-Id: Ic36046b7254e538b6baf7144c50ae577ee7f2074 Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Patch-mainline: linux-mm @ Tue, 17 Apr 2018 16:33:15 [vinmenon@codeaurora.org: trivial merge conflict fixes] Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> |
||
|
184ccc18de |
mm: VMA sequence count
Wrap the VMA modifications (vma_adjust/unmap_page_range) with sequence counts such that we can easily test if a VMA is changed. The unmap_page_range() one allows us to make assumptions about page-tables; when we find the seqcount hasn't changed we can assume page-tables are still valid. The flip side is that we cannot distinguish between a vma_adjust() and the unmap_page_range() -- where with the former we could have re-checked the vma bounds against the address. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> [Port to 4.12 kernel] [Build depends on CONFIG_SPECULATIVE_PAGE_FAULT] [Introduce vm_write_* inline function depending on CONFIG_SPECULATIVE_PAGE_FAULT] [Fix lock dependency between mapping->i_mmap_rwsem and vma->vm_sequence by using vm_raw_write* functions] [Fix a lock dependency warning in mmap_region() when entering the error path] [move sequence initialisation INIT_VMA()] Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Change-Id: Ibc23ef3b9dbb80323c0f24cb06da34b4c3a8fa71 Patch-mainline: linux-mm @ 17 Apr 2018 16:33:14 [vinmenon@codeaurora.org: trivial merge conflict fixes] Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> |
||
|
f98b74c615 |
mm: introduce INIT_VMA()
Some VMA struct fields need to be initialized once the VMA structure is allocated. Currently this only concerns anon_vma_chain field but some other will be added to support the speculative page fault. Instead of spreading the initialization calls all over the code, let's introduce a dedicated inline function. Change-Id: I9f6b29dc74055354318b548e2b6b22c37d4c61bb Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Patch-mainline: linux-mm @ Tue, 17 Apr 2018 16:33:13 [vinmenon@codeaurora.org: trivial merge conflict fixes] Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> |
||
|
04f740d4da |
This is the 4.14.41 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlr753gACgkQONu9yGCS aT7p/Q//TIC9EKe21E2Lb1Kh4lL5SDjmwe/rkA3PxiqxbkXfUDBehMCfDk4YVNVG TlH1TXOubzpS/8cZJPRFHEkrYXPKIA3+hKlAvJukUJCBQqmW1ILEAX5m7jrSmf+B tLe/r0ijOtlfB1xQdUs5RxXGIndw0gMGhpo/QTXPAC0hGh0Ykd8v2s4YAjxOvdKw z4DaUKtZGEPBWFVK/Bx1Fv3iAmJMt2yerERUqz8MVegYXJt+2RUGoJtsxHuvOk1p 9q0lzHBWYihQVt1tJ0es/8cB7WsYt8txnVmeN907sryUhDjvTWIxQJb5jEV0gxxK AL89PHy4Hfki6l6r+tqYi92frFda8aLfsaSseOhlmqsv0MlwngW2dx3UbjaYd4If IQA6n0hWHuxUvjrjsPpsMAa4lvTW+/kFilb0mD6Vixy3ru+/RelKnuawJm6kbMNu Cb8QSVSJrhvC/UZLvwO7a3viJdKoI5B9pTh5FTKcY5wUPI1k01pg3WlWNxmnv4ZJ LPImR06aoJYhvbutf94AvxbCOt/au8sY4s/yk9oHgvGUEIccrGYf3BwX6ciWRt4b r4ZN92C9ZuD+u/ATFgi/akngtjjixw5YrZ20aX86dYcBZ25hYOiIMoc482tYQ12Z 1vqyvKg9o1oMypG9orF09PWstbNRu3ihGATKdXL9lfAhDklOTKc= =zWTK -----END PGP SIGNATURE----- Merge 4.14.41 into android-4.14 Changes in 4.14.41 ipvs: fix rtnl_lock lockups caused by start_sync_thread netfilter: ebtables: don't attempt to allocate 0-sized compat array kcm: Call strp_stop before strp_done in kcm_attach crypto: af_alg - fix possible uninit-value in alg_bind() netlink: fix uninit-value in netlink_sendmsg net: fix rtnh_ok() net: initialize skb->peeked when cloning net: fix uninit-value in __hw_addr_add_ex() dccp: initialize ireq->ir_mark ipv4: fix uninit-value in ip_route_output_key_hash_rcu() soreuseport: initialise timewait reuseport field inetpeer: fix uninit-value in inet_getpeer memcg: fix per_node_info cleanup perf: Remove superfluous allocation error check tcp: fix TCP_REPAIR_QUEUE bound checking bdi: wake up concurrent wb_shutdown() callers. bdi: Fix oops in wb_workfn() KVM: PPC: Book3S HV: Fix trap number return from __kvmppc_vcore_entry KVM: PPC: Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GEN KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing arm64: Add work around for Arm Cortex-A55 Erratum 1024718 compat: fix 4-byte infoleak via uninitialized struct field gpioib: do not free unrequested descriptors gpio: fix aspeed_gpio unmask irq gpio: fix error path in lineevent_create rfkill: gpio: fix memory leak in probe error path libata: Apply NOLPM quirk for SanDisk SD7UB3Q*G1001 SSDs dm integrity: use kvfree for kvmalloc'd memory tracing: Fix regex_match_front() to not over compare the test string z3fold: fix reclaim lock-ups mm: sections are not offlined during memory hotremove mm, oom: fix concurrent munlock and oom reaper unmap, v3 ceph: fix rsize/wsize capping in ceph_direct_read_write() can: kvaser_usb: Increase correct stats counter in kvaser_usb_rx_can_msg() can: hi311x: Acquire SPI lock on ->do_get_berr_counter can: hi311x: Work around TX complete interrupt erratum drm/vc4: Fix scaling of uni-planar formats drm/i915: Fix drm:intel_enable_lvds ERROR message in kernel log drm/nouveau: Fix deadlock in nv50_mstm_register_connector() drm/atomic: Clean old_state/new_state in drm_atomic_state_default_clear() drm/atomic: Clean private obj old_state/new_state in drm_atomic_state_default_clear() net: atm: Fix potential Spectre v1 atm: zatm: Fix potential Spectre v1 PCI / PM: Always check PME wakeup capability for runtime wakeup support PCI / PM: Check device_may_wakeup() in pci_enable_wake() cpufreq: schedutil: Avoid using invalid next_freq Revert "Bluetooth: btusb: Fix quirk for Atheros 1525/QCA6174" Bluetooth: btusb: Add Dell XPS 13 9360 to btusb_needs_reset_resume_table Bluetooth: btusb: Only check needs_reset_resume DMI table for QCA rome chipsets thermal: exynos: Reading temperature makes sense only when TMU is turned on thermal: exynos: Propagate error value from tmu_read() nvme: add quirk to force medium priority for SQ creation smb3: directory sync should not return an error sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[] tracing/uprobe_event: Fix strncpy corner case perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_* perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[] perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map() KVM: PPC: Book3S HV: Fix handling of large pages in radix page fault handler KVM: x86: remove APIC Timer periodic/oneshot spikes Linux 4.14.41 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
2270dfcc4b |
mm, oom: fix concurrent munlock and oom reaper unmap, v3
commit 27ae357fa82be5ab73b2ef8d39dcb8ca2563483a upstream. Since exit_mmap() is done without the protection of mm->mmap_sem, it is possible for the oom reaper to concurrently operate on an mm until MMF_OOM_SKIP is set. This allows munlock_vma_pages_all() to concurrently run while the oom reaper is operating on a vma. Since munlock_vma_pages_range() depends on clearing VM_LOCKED from vm_flags before actually doing the munlock to determine if any other vmas are locking the same memory, the check for VM_LOCKED in the oom reaper is racy. This is especially noticeable on architectures such as powerpc where clearing a huge pmd requires serialize_against_pte_lookup(). If the pmd is zapped by the oom reaper during follow_page_mask() after the check for pmd_none() is bypassed, this ends up deferencing a NULL ptl or a kernel oops. Fix this by manually freeing all possible memory from the mm before doing the munlock and then setting MMF_OOM_SKIP. The oom reaper can not run on the mm anymore so the munlock is safe to do in exit_mmap(). It also matches the logic that the oom reaper currently uses for determining when to set MMF_OOM_SKIP itself, so there's no new risk of excessive oom killing. This issue fixes CVE-2018-1000200. Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: David Rientjes <rientjes@google.com> Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> [4.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
cb3422d0ff |
ANDROID: mm: Export do_munmap
The 0-day build bot reports the following build error, seen if SDCARD_FS is built as module. ERROR: "do_munmap" undefined! Fixes: 84a1b7d3d312 ("Included sdcardfs source code for kernel 3.0") Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Guenter Roeck <groeck@chromium.org> |
||
|
7237d3f322 |
This is the 4.14.8 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlo6Km0ACgkQONu9yGCS aT7vKA//WO6PM1s+/eDNZu7IuBcvrvSuFI7FffvbDuWsIFk5fKwLen2M9RLT9oZF 8XxnZG0fWKZCdvLTeC14C4oKedcOMSR6VZIg5CddMqibq5uWNLisfZUb0KKG4DE3 Mljvw8AM8OJupmXtDHu/nG4/J1vP03jWi6liMAwLyzxTGOMjtq0a7Nbt3NiTArFz cI7Xs+zEnS2fgKX2SrJGB+6xYKQe4Q4JSjZ1lQAM3LW3G96n731HsU/oRKkU9FyQ ZBysxNITbPSwKP9SuvgAl5rYo7ZKIQ9UbUDyJmm82gvwxDupdaa4hkrBo/+imESN ngDO/EFirsPXTEj2H7SPw3+sRh9j41IDAcqDPZKFNuwEe0g0fwJ/g3/7EaZeXpR2 kXT+sHrwkzXcDXeuEQkUtsPULCJuYfgVUXnSIGtuPv7sG1IOaeOyofXHYixGsONu HRZMNPQcdqT3aYQIlMIOWEJidpOtxGztwdz1fWWzUQwNPBBNH2tlgEXhyR0ofDaM VVRRw9V6IyS4/5HjaCs/nIp+WN+vyW0p+xijQIzzLTAFRITB5IMDWwggPT5z4JPk Yf41I8bjkTUA0oMWL/b737u2krM8wYmmwvdc+9+lrA+HM51B63ViuvWyXrEAd2Sv 37X8oZa8dj4lIDZoTll+8wXcY80VHuz7+vvVYIRUwnSEg1bLKIU= =eZgu -----END PGP SIGNATURE----- Merge 4.14.8 into android-4.14 Changes in 4.14.8 mfd: fsl-imx25: Clean up irq settings during removal crypto: algif_aead - fix reference counting of null skcipher crypto: rsa - fix buffer overread when stripping leading zeroes crypto: hmac - require that the underlying hash algorithm is unkeyed crypto: salsa20 - fix blkcipher_walk API usage crypto: af_alg - fix NULL pointer dereference in cifs: fix NULL deref in SMB2_read string.h: workaround for increased stack usage autofs: fix careless error in recent commit kernel: make groups_sort calling a responsibility group_info allocators mm, oom_reaper: fix memory corruption tracing: Allocate mask_str buffer dynamically USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID USB: core: prevent malicious bNumInterfaces overflow ovl: Pass ovl_get_nlink() parameters in right order ovl: update ctx->pos on impure dir iteration usbip: fix stub_rx: get_pipe() to validate endpoint number usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input usbip: prevent vhci_hcd driver from leaking a socket pointer address usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer mmc: core: apply NO_CMD23 quirk to some specific cards ceph: drop negative child dentries before try pruning inode's alias usb: xhci: fix TDS for MTK xHCI1.1 xhci: Don't add a virt_dev to the devs array before it's fully allocated IB/core: Bound check alternate path port number IB/core: Don't enforce PKey security on SMI MADs nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests arm64: mm: Fix pte_mkclean, pte_mkdirty semantics arm64: Initialise high_memory global variable earlier arm64: fix CONFIG_DEBUG_WX address reporting scsi: core: Fix a scsi_show_rq() NULL pointer dereference scsi: libsas: fix length error in sas_smp_handler() sched/rt: Do not pull from current CPU if only one CPU to pull dm: fix various targets to dm_register_target after module __init resources created SUNRPC: Fix a race in the receive code path iw_cxgb4: only insert drain cqes if wq is flushed x86/boot/compressed/64: Detect and handle 5-level paging at boot-time x86/boot/compressed/64: Print error if 5-level paging is not supported eeprom: at24: change nvmem stride to 1 posix-timer: Properly check sigevent->sigev_notify dmaengine: dmatest: move callback wait queue to thread context Revert "exec: avoid RLIMIT_STACK races with prlimit()" ext4: support fast symlinks from ext3 file systems ext4: fix fdatasync(2) after fallocate(2) operation ext4: add missing error check in __ext4_new_inode() ext4: fix crash when a directory's i_size is too small IB/mlx4: Fix RSS's QPC attributes assignments HID: cp2112: fix broken gpio_direction_input callback sfc: don't warn on successful change of MAC fbdev: controlfb: Add missing modes to fix out of bounds access video: udlfb: Fix read EDID timeout video: fbdev: au1200fb: Release some resources if a memory allocation fails video: fbdev: au1200fb: Return an error code if a memory allocation fails rtc: pcf8563: fix output clock rate scsi: aacraid: use timespec64 instead of timeval drm/amdgpu: bypass lru touch for KIQ ring submission PM / s2idle: Clear the events_check_enabled flag ASoC: Intel: Skylake: Fix uuid_module memory leak in failure case dmaengine: ti-dma-crossbar: Correct am335x/am43xx mux value type mlxsw: spectrum: Fix error return code in mlxsw_sp_port_create() PCI/PME: Handle invalid data when reading Root Status powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo PCI: Do not allocate more buses than available in parent iommu/mediatek: Fix driver name thunderbolt: tb: fix use after free in tb_activate_pcie_devices netfilter: ipvs: Fix inappropriate output of procfs powerpc/opal: Fix EBUSY bug in acquiring tokens powerpc/ipic: Fix status get and status clear powerpc/pseries/vio: Dispose of virq mapping on vdevice unregister platform/x86: intel_punit_ipc: Fix resource ioremap warning target/iscsi: Detect conn_cmd_list corruption early target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd() iscsi-target: fix memory leak in lio_target_tiqn_addtpg() target:fix condition return in core_pr_dump_initiator_port() target/file: Do not return error for UNMAP if length is zero badblocks: fix wrong return value in badblocks_set if badblocks are disabled iommu/amd: Limit the IOVA page range to the specified addresses xfs: truncate pagecache before writeback in xfs_setattr_size() arm-ccn: perf: Prevent module unload while PMU is in use crypto: tcrypt - fix buffer lengths in test_aead_speed() mm: Handle 0 flags in _calc_vm_trans() macro net: hns3: fix for getting advertised_caps in hns3_get_link_ksettings net: hns3: Fix a misuse to devm_free_irq staging: rtl8188eu: Revert part of "staging: rtl8188eu: fix comments with lines over 80 characters" clk: mediatek: add the option for determining PLL source clock clk: imx: imx7d: Fix parent clock for OCRAM_CLK clk: imx6: refine hdmi_isfr's parent to make HDMI work on i.MX6 SoCs w/o VPU media: camss-vfe: always initialize reg at vfe_set_xbar_cfg() clk: hi6220: mark clock cs_atb_syspll as critical blk-mq-sched: dispatch from scheduler IFF progress is made in ->dispatch clk: tegra: Use readl_relaxed_poll_timeout_atomic() in tegra210_clock_init() clk: tegra: Fix cclk_lp divisor register ppp: Destroy the mutex when cleanup ASoC: rsnd: rsnd_ssi_run_mods() needs to care ssi_parent_mod thermal/drivers/step_wise: Fix temperature regulation misbehavior misc: pci_endpoint_test: Fix failure path return values in probe misc: pci_endpoint_test: Avoid triggering a BUG() scsi: scsi_debug: write_same: fix error report GFS2: Take inode off order_write list when setting jdata flag media: usbtv: fix brightness and contrast controls rpmsg: glink: Initialize the "intent_req_comp" completion variable bcache: explicitly destroy mutex while exiting bcache: fix wrong cache_misses statistics Ib/hfi1: Return actual operational VLs in port info query Bluetooth: hci_ldisc: Fix another race when closing the tty. arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27 btrfs: fix false EIO for missing device btrfs: Explicitly handle btrfs_update_root failure btrfs: undo writable superblocke when sprouting fails btrfs: avoid null pointer dereference on fs_info when calling btrfs_crit btrfs: tests: Fix a memory leak in error handling path in 'run_test()' qtnfmac: modify full Tx queue error reporting mtd: spi-nor: stm32-quadspi: Fix uninitialized error return code ARM64: dts: meson-gxbb-odroidc2: fix usb1 power supply Bluetooth: btusb: Add new NFA344A entry. samples/bpf: adjust rlimit RLIMIT_MEMLOCK for xdp1 liquidio: fix kernel panic in VF driver platform/x86: hp_accel: Add quirk for HP ProBook 440 G4 nvme: use kref_get_unless_zero in nvme_find_get_ns l2tp: cleanup l2tp_tunnel_delete calls xfs: fix log block underflow during recovery cycle verification xfs: return a distinct error code value for IGET_INCORE cache misses xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real net: dsa: lan9303: Do not disable switch fabric port 0 at .probe net: hns3: fix a bug in hclge_uninit_client_instance net: hns3: add nic_client check when initialize roce base information net: hns3: fix the bug of hns3_set_txbd_baseinfo RDMA/cxgb4: Declare stag as __be32 PCI: Detach driver before procfs & sysfs teardown on device remove scsi: hisi_sas: fix the risk of freeing slot twice scsi: hpsa: cleanup sas_phy structures in sysfs when unloading scsi: hpsa: destroy sas transport properties before scsi_host mfd: mxs-lradc: Fix error handling in mxs_lradc_probe() net: hns3: fix the TX/RX ring.queue_index in hns3_ring_get_cfg net: hns3: fix the bug when map buffer fail net: hns3: fix a bug when alloc new buffer serdev: ttyport: enforce tty-driver open() requirement powerpc/perf/hv-24x7: Fix incorrect comparison in memord powerpc/xmon: Check before calling xive functions soc: mediatek: pwrap: fix compiler errors ipv4: ipv4_default_advmss() should use route mtu KVM: nVMX: Fix EPT switching advertising tty fix oops when rmmod 8250 dev/dax: fix uninitialized variable build warning pinctrl: adi2: Fix Kconfig build problem raid5: Set R5_Expanded on parity devices as well as data. scsi: scsi_devinfo: Add REPORTLUN2 to EMC SYMMETRIX blacklist entry IB/core: Fix use workqueue without WQ_MEM_RECLAIM IB/core: Fix calculation of maximum RoCE MTU vt6655: Fix a possible sleep-in-atomic bug in vt6655_suspend IB/hfi1: Mask out A bit from psn trace rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_createbss_cmd rtl8188eu: Fix a possible sleep-in-atomic bug in rtw_disassoc_cmd ipmi_si: fix memory leak on new_smi nullb: fix error return code in null_init() scsi: sd: change manage_start_stop to bool in sysfs interface scsi: sd: change allow_restart to bool in sysfs interface scsi: bfa: integer overflow in debugfs raid5-ppl: check recovery_offset when performing ppl recovery md-cluster: fix wrong condition check in raid1_write_request xprtrdma: Don't defer fencing an async RPC's chunks udf: Avoid overflow when session starts at large offset macvlan: Only deliver one copy of the frame to the macvlan interface IB/core: Fix endianness annotation in rdma_is_multicast_addr() RDMA/cma: Avoid triggering undefined behavior IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop icmp: don't fail on fragment reassembly time exceeded lightnvm: pblk: prevent gc kicks when gc is not operational lightnvm: pblk: fix changing GC group list for a line lightnvm: pblk: use right flag for GC allocation lightnvm: pblk: initialize debug stat counter lightnvm: pblk: fix min size for page mempool lightnvm: pblk: protect line bitmap while submitting meta io ath9k: fix tx99 potential info leak ath10k: fix core PCI suspend when WoWLAN is supported but disabled ath10k: fix build errors with !CONFIG_PM usb: musb: da8xx: fix babble condition handling Linux 4.14.8 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
c5c36272cd |
This is the 4.14.4 stable release
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAlomdF8ACgkQONu9yGCS aT668xAAtbsoX6y/g11RT5/DsBoeIYgvNTzfcU3dGJC2/1rEx+pBOLFkGVNcK7nR wXD/DUFHWQRSsynke+gP8mjmWsRxwmoo0bv04eZ3Xdf8GGAVKIJQjUXV5jXOCPtw fMWshZkQlM11aus/bxEW0H7vqBK4DBLoYJ7H21i5SKkWubyUmDV6rX0So1w6sKYo RSvVG1MTkLjRSrSStgBKTBMoOdj6PfCKcQRmaqjPNZRP2+uqD+8NuUlbMZijxuYw U3lhXv8czRt0NSyA3pc9ucFR6DwAvc6VgVRvLec1+XzKHlvmCgBo9Tmsq5DcfT1B 9owFlS53yzyEMk8o7FYznX5rDd32MBIejjAgpCKyXxurkv58NiwSs6VJIzHcNHJK 2xc1nmZH8wIrUaYo7ecq6e7hN+TMvPK9wWyhsKauiofaJUY4c7pI2Qb37ddNPxpE 11j3Vb0OlqxK3rAc+ElDmTe6GZ3rd2hLZU6nyPIqIWOrwgXf2zlB5X9ytZzR4gMi rZrzDyKNO3lRNhteb5qzGzT6bH5wMvDZUp6DhviSBd4FVSXfTT53AEDoYgk9OLE2 rhaMVTu4zgRQi7AvM1PRyiVisQHwnXQUU6pGiXDWltFJMz9uPvHmMT8iZlCODePG 3x/Hj4ZAXHARNKkDQCwvPz3zWffwugRdXzMiPN1oyDzxgzQuC/c= =bxe4 -----END PGP SIGNATURE----- Merge 4.14.4 into android-4.14 Changes in 4.14.4 platform/x86: hp-wmi: Fix tablet mode detection for convertibles mm, memory_hotplug: do not back off draining pcp free pages from kworker context mm, oom_reaper: gather each vma to prevent leaking TLB entry mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d() mm/cma: fix alloc_contig_range ret code/potential leak mm: fix device-dax pud write-faults triggered by get_user_pages() mm, hugetlbfs: introduce ->split() to vm_operations_struct device-dax: implement ->split() to catch invalid munmap attempts mm: introduce get_user_pages_longterm mm: fail get_vaddr_frames() for filesystem-dax mappings v4l2: disable filesystem-dax mapping support IB/core: disable memory registration of filesystem-dax vmas exec: avoid RLIMIT_STACK races with prlimit() mm/madvise.c: fix madvise() infinite loop under special circumstances mm: migrate: fix an incorrect call of prep_transhuge_page() mm, memcg: fix mem_cgroup_swapout() for THPs fs/fat/inode.c: fix sb_rdonly() change autofs: revert "autofs: take more care to not update last_used on path walk" autofs: revert "autofs: fix AT_NO_AUTOMOUNT not being honored" mm/hugetlb: fix NULL-pointer dereference on 5-level paging machine btrfs: clear space cache inode generation always nfsd: Fix stateid races between OPEN and CLOSE nfsd: Fix another OPEN stateid race nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat crypto: algif_aead - skip SGL entries with NULL page crypto: af_alg - remove locking in async callback crypto: skcipher - Fix skcipher_walk_aead_common lockd: lost rollback of set_grace_period() in lockd_down_net() s390: revert ELF_ET_DYN_BASE base changes drm: omapdrm: Fix DPI on platforms using the DSI VDDS omapdrm: hdmi4: Correct the SoC revision matching apparmor: fix oops in audit_signal_cb hook arm64: module-plts: factor out PLT generation code for ftrace arm64: ftrace: emit ftrace-mod.o contents through code powerpc/powernv: Fix kexec crashes caused by tlbie tracing powerpc/kexec: Fix kexec/kdump in P9 guest kernels KVM: x86: pvclock: Handle first-time write to pvclock-page contains random junk KVM: x86: Exit to user-mode on #UD intercept when emulator requires KVM: x86: inject exceptions produced by x86_decode_insn KVM: lapic: Split out x2apic ldr calculation KVM: lapic: Fixup LDR on load in x2apic mmc: sdhci: Avoid swiotlb buffer being full mmc: block: Fix missing blk_put_request() mmc: block: Check return value of blk_get_request() mmc: core: Do not leave the block driver in a suspended state mmc: block: Ensure that debugfs files are removed mmc: core: prepend 0x to pre_eol_info entry in sysfs mmc: core: prepend 0x to OCR entry in sysfs ACPI / EC: Fix regression related to PM ops support in ECDT device eeprom: at24: fix reading from 24MAC402/24MAC602 eeprom: at24: correctly set the size for at24mac402 eeprom: at24: check at24_read/write arguments i2c: i801: Fix Failed to allocate irq -2147483648 error cxl: Check if vphb exists before iterating over AFU devices bcache: Fix building error on MIPS bcache: only permit to recovery read error when cache device is clean bcache: recover data from backing when data is clean hwmon: (jc42) optionally try to disable the SMBUS timeout nvme-pci: add quirk for delay before CHK RDY for WDC SN200 Revert "drm/radeon: dont switch vt on suspend" drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs() drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories() drm/amdgpu: correct reference clock value on vega10 drm/amdgpu: fix error handling in amdgpu_bo_do_create drm/amdgpu: Properly allocate VM invalidate eng v2 drm/amdgpu: Remove check which is not valid for certain VBIOS drm/ttm: fix ttm_bo_cleanup_refs_or_queue once more dma-buf: make reservation_object_copy_fences rcu save drm/amdgpu: reserve root PD while releasing it drm/ttm: Always and only destroy bo->ttm_resv in ttm_bo_release_list drm/vblank: Fix flip event vblank count drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug drm/tilcdc: Precalculate total frametime in tilcdc_crtc_set_mode() drm/radeon: fix atombios on big endian drm/panel: simple: Add missing panel_simple_unprepare() calls drm/hisilicon: Ensure LDI regs are properly configured. drm/ttm: once more fix ttm_buffer_object_transfer drm/amd/pp: fix typecast error in powerplay. drm/fb_helper: Disable all crtc's when initial setup fails. drm/fsl-dcu: Don't set connector DPMS property drm/edid: Don't send non-zero YQ in AVI infoframe for HDMI 1.x sinks drm/amdgpu: move UVD/VCE and VCN structure out from union drm/amdgpu: Set adev->vcn.irq.num_types for VCN include/linux/compiler-clang.h: handle randomizable anonymous structs IB/core: Do not warn on lid conversions for OPA IB/hfi1: Do not warn on lid conversions for OPA e1000e: fix the use of magic numbers for buffer overrun issue md: forbid a RAID5 from having both a bitmap and a journal. drm/i915: Fix false-positive assert_rpm_wakelock_held in i915_pmic_bus_access_notifier v2 drm/i915: Re-register PMIC bus access notifier on runtime resume drm/i915/fbdev: Serialise early hotplug events with async fbdev config drm/i915/gvt: Correct ADDR_4K/2M/1G_MASK definition drm/i915: Don't try indexed reads to alternate slave addresses drm/i915: Prevent zero length "index" write Revert "x86/entry/64: Add missing irqflags tracing to native_load_gs_index()" Linux 4.14.4 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
|
55fe4698d8 |
mm, oom_reaper: fix memory corruption
commit 4837fe37adff1d159904f0c013471b1ecbcb455e upstream. David Rientjes has reported the following memory corruption while the oom reaper tries to unmap the victims address space BUG: Bad page map in process oom_reaper pte:6353826300000000 pmd:00000000 addr:00007f50cab1d000 vm_flags:08100073 anon_vma:ffff9eea335603f0 mapping: (null) index:7f50cab1d file: (null) fault: (null) mmap: (null) readpage: (null) CPU: 2 PID: 1001 Comm: oom_reaper Call Trace: unmap_page_range+0x1068/0x1130 __oom_reap_task_mm+0xd5/0x16b oom_reaper+0xff/0x14c kthread+0xc1/0xe0 Tetsuo Handa has noticed that the synchronization inside exit_mmap is insufficient. We only synchronize with the oom reaper if tsk_is_oom_victim which is not true if the final __mmput is called from a different context than the oom victim exit path. This can trivially happen from context of any task which has grabbed mm reference (e.g. to read /proc/<pid>/ file which requires mm etc.). The race would look like this oom_reaper oom_victim task mmget_not_zero do_exit mmput __oom_reap_task_mm mmput __mmput exit_mmap remove_vma unmap_page_range Fix this issue by providing a new mm_is_oom_victim() helper which operates on the mm struct rather than a task. Any context which operates on a remote mm struct should use this helper in place of tsk_is_oom_victim. The flag is set in mark_oom_victim and never cleared so it is stable in the exit_mmap path. Debugged by Tetsuo Handa. Link: http://lkml.kernel.org/r/20171210095130.17110-1-mhocko@kernel.org Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: David Rientjes <rientjes@google.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Andrea Argangeli <andrea@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
8392add7be |
ANDROID: mm: add a field to store names for private anonymous memory
Userspace processes often have multiple allocators that each do anonymous mmaps to get memory. When examining memory usage of individual processes or systems as a whole, it is useful to be able to break down the various heaps that were allocated by each layer and examine their size, RSS, and physical memory usage. This patch adds a user pointer to the shared union in vm_area_struct that points to a null terminated string inside the user process containing a name for the vma. vmas that point to the same address will be merged, but vmas that point to equivalent strings at different addresses will not be merged. Userspace can set the name for a region of memory by calling prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, (unsigned long)name); Setting the name to NULL clears it. The names of named anonymous vmas are shown in /proc/pid/maps as [anon:<name>] and in /proc/pid/smaps in a new "Name" field that is only present for named vmas. If the userspace pointer is no longer valid all or part of the name will be replaced with "<fault>". The idea to store a userspace pointer to reduce the complexity within mm (at the expense of the complexity of reading /proc/pid/mem) came from Dave Hansen. This results in no runtime overhead in the mm subsystem other than comparing the anon_name pointers when considering vma merging. The pointer is stored in a union with fieds that are only used on file-backed mappings, so it does not increase memory usage. Includes fix from Jed Davis <jld@mozilla.com> for typo in prctl_set_vma_anon_name, which could attempt to set the name across two vmas at the same time due to a typo, which might corrupt the vma list. Fix it to use tmp instead of end to limit the name setting to a single vma at a time. Change-Id: I9aa7b6b5ef536cd780599ba4e2fba8ceebe8b59f Signed-off-by: Dmitry Shmidt <dimitrysh@google.com> [AmitP: Fix get_user_pages_remote() call to align with upstream commit 5b56d49fc31d ("mm: add locked parameter to get_user_pages_remote()")] Signed-off-by: Amit Pundir <amit.pundir@linaro.org> |
||
|
c6c78a1d45 |
mm, hugetlbfs: introduce ->split() to vm_operations_struct
commit 31383c6865a578834dd953d9dbc88e6b19fe3997 upstream. Patch series "device-dax: fix unaligned munmap handling" When device-dax is operating in huge-page mode we want it to behave like hugetlbfs and fail attempts to split vmas into unaligned ranges. It would be messy to teach the munmap path about device-dax alignment constraints in the same (hstate) way that hugetlbfs communicates this constraint. Instead, these patches introduce a new ->split() vm operation. This patch (of 2): The device-dax interface has similar constraints as hugetlbfs in that it requires the munmap path to unmap in huge page aligned units. Rather than add more custom vma handling code in __split_vma() introduce a new vm operation to perform this vma specific check. Link: http://lkml.kernel.org/r/151130418135.4029.6783191281930729710.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: dee410792419 ("/dev/dax, core: file operations and dax-mmap") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
f808c13fd3 |
lib/interval_tree: fast overlap detection
Allow interval trees to quickly check for overlaps to avoid unnecesary tree lookups in interval_tree_iter_first(). As of this patch, all interval tree flavors will require using a 'rb_root_cached' such that we can have the leftmost node easily available. While most users will make use of this feature, those with special functions (in addition to the generic insert, delete, search calls) will avoid using the cached option as they can do funky things with insertions -- for example, vma_interval_tree_insert_after(). [jglisse@redhat.com: fix deadlock from typo vm_lock_anon_vma()] Link: http://lkml.kernel.org/r/20170808225719.20723-1-jglisse@redhat.com Link: http://lkml.kernel.org/r/20170719014603.19029-12-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Doug Ledford <dledford@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: David Airlie <airlied@linux.ie> Cc: Jason Wang <jasowang@redhat.com> Cc: Christian Benvenuti <benve@cisco.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
2129258024 |
mm: oom: let oom_reap_task and exit_mmap run concurrently
This is purely required because exit_aio() may block and exit_mmap() may never start, if the oom_reap_task cannot start running on a mm with mm_users == 0. At the same time if the OOM reaper doesn't wait at all for the memory of the current OOM candidate to be freed by exit_mmap->unmap_vmas, it would generate a spurious OOM kill. If it wasn't because of the exit_aio or similar blocking functions in the last mmput, it would be enough to change the oom_reap_task() in the case it finds mm_users == 0, to wait for a timeout or to wait for __mmput to set MMF_OOM_SKIP itself, but it's not just exit_mmap the problem here so the concurrency of exit_mmap and oom_reap_task is apparently warranted. It's a non standard runtime, exit_mmap() runs without mmap_sem, and oom_reap_task runs with the mmap_sem for reading as usual (kind of MADV_DONTNEED). The race between the two is solved with a combination of tsk_is_oom_victim() (serialized by task_lock) and MMF_OOM_SKIP (serialized by a dummy down_write/up_write cycle on the same lines of the ksm_exit method). If the oom_reap_task() may be running concurrently during exit_mmap, exit_mmap will wait it to finish in down_write (before taking down mm structures that would make the oom_reap_task fail with use after free). If exit_mmap comes first, oom_reap_task() will skip the mm if MMF_OOM_SKIP is already set and in turn all memory is already freed and furthermore the mm data structures may already have been taken down by free_pgtables. [aarcange@redhat.com: incremental one liner] Link: http://lkml.kernel.org/r/20170726164319.GC29716@redhat.com [rientjes@google.com: remove unused mmput_async] Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1708141733130.50317@chino.kir.corp.google.com [aarcange@redhat.com: microoptimization] Link: http://lkml.kernel.org/r/20170817171240.GB5066@redhat.com Link: http://lkml.kernel.org/r/20170726162912.GA29716@redhat.com Fixes: 26db62f179d1 ("oom: keep mm of the killed task available") Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Michal Hocko <mhocko@suse.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
2376dd7ced |
userfaultfd: call userfaultfd_unmap_prep only if __split_vma succeeds
A __split_vma is not a worthy event to report, and it's definitely not a unmap so it would be incorrect to report unmap for the whole region to the userfaultfd manager if a __split_vma fails. So only call userfaultfd_unmap_prep after the __vma_splitting is over and do_munmap cannot fail anymore. Also add unlikely because it's better to optimize for the vast majority of apps that aren't using userfaultfd in a non cooperative way. Ideally we should also find a way to eliminate the branch entirely if CONFIG_USERFAULTFD=n, but it would complicate things so stick to unlikely for now. Link: http://lkml.kernel.org/r/20170802165145.22628-5-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Alexey Perevalov <a.perevalov@samsung.com> Cc: Maxime Coquelin <maxime.coquelin@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
c41f012ade |
mm: rename global_page_state to global_zone_page_state
global_page_state is error prone as a recent bug report pointed out [1]. It only returns proper values for zone based counters as the enum it gets suggests. We already have global_node_page_state so let's rename global_page_state to global_zone_page_state to be more explicit here. All existing users seems to be correct: $ git grep "global_page_state(NR_" | sed 's@.*(\(NR_[A-Z_]*\)).*@\1@' | sort | uniq -c 2 NR_BOUNCE 2 NR_FREE_CMA_PAGES 11 NR_FREE_PAGES 1 NR_KERNEL_STACK_KB 1 NR_MLOCK 2 NR_PAGETABLE This patch shouldn't introduce any functional change. [1] http://lkml.kernel.org/r/201707260628.v6Q6SmaS030814@www262.sakura.ne.jp Link: http://lkml.kernel.org/r/20170801134256.5400-2-hannes@cmpxchg.org Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
37511fb5c9 |
mm: fix overflow check in expand_upwards()
Jörn Engel noticed that the expand_upwards() function might not return -ENOMEM in case the requested address is (unsigned long)-PAGE_SIZE and if the architecture didn't defined TASK_SIZE as multiple of PAGE_SIZE. Affected architectures are arm, frv, m68k, blackfin, h8300 and xtensa which all define TASK_SIZE as 0xffffffff, but since none of those have an upwards-growing stack we currently have no actual issue. Nevertheless let's fix this just in case any of the architectures with an upward-growing stack (currently parisc, metag and partly ia64) define TASK_SIZE similar. Link: http://lkml.kernel.org/r/20170702192452.GA11868@p100.box Fixes: bd726c90b6b8 ("Allow stack to grow up to address space limit") Signed-off-by: Helge Deller <deller@gmx.de> Reported-by: Jörn Engel <joern@purestorage.com> Cc: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
24c79d8e0a |
mm: use dedicated helper to access rlimit value
Use rlimit() helper instead of manually writing whole chain from current task to rlim_cur. Link: http://lkml.kernel.org/r/20170705172811.8027-1-k.opasiak@samsung.com Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
32e4e6d5cb |
mm/mmap.c: expand_downwards: don't require the gap if !vm_prev
expand_stack(vma) fails if address < stack_guard_gap even if there is no vma->vm_prev. I don't think this makes sense, and we didn't do this before the recent commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas"). We do not need a gap in this case, any address is fine as long as security_mmap_addr() doesn't object. This also simplifies the code, we know that address >= prev->vm_end and thus underflow is not possible. Link: http://lkml.kernel.org/r/20170628175258.GA24881@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Cc: Larry Woodman <lwoodman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
561b5e0709 |
mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") has introduced a regression in some rust and Java environments which are trying to implement their own stack guard page. They are punching a new MAP_FIXED mapping inside the existing stack Vma. This will confuse expand_{downwards,upwards} into thinking that the stack expansion would in fact get us too close to an existing non-stack vma which is a correct behavior wrt safety. It is a real regression on the other hand. Let's work around the problem by considering PROT_NONE mapping as a part of the stack. This is a gros hack but overflowing to such a mapping would trap anyway an we only can hope that usespace knows what it is doing and handle it propely. Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas") Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Debugged-by: Vlastimil Babka <vbabka@suse.cz> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Willy Tarreau <w@1wt.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
09b56d5a41 |
Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King: - add support for ftrace-with-registers, which is needed for kgraft and other ftrace tools - support for mremap() for the sigpage/vDSO so that checkpoint/restore can work - add timestamps to each line of the register dump output - remove the unused KTHREAD_SIZE from nommu - align the ARM bitops APIs with the generic API (using unsigned long pointers rather than void pointers) - make the configuration of userspace Thumb support an expert option so that we can default it on, and avoid some hard to debug userspace crashes * 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8684/1: NOMMU: Remove unused KTHREAD_SIZE definition ARM: 8683/1: ARM32: Support mremap() for sigpage/vDSO ARM: 8679/1: bitops: Align prototypes to generic API ARM: 8678/1: ftrace: Adds support for CONFIG_DYNAMIC_FTRACE_WITH_REGS ARM: make configuration of userspace Thumb support an expert option ARM: 8673/1: Fix __show_regs output timestamps |
||
|
ac34ceaf1c |
mm/mmap.c: mark protection_map as __ro_after_init
The protection map is only modified by per-arch init code so it can be protected from writes after the init code runs. This change was extracted from PaX where it's part of KERNEXEC. Link: http://lkml.kernel.org/r/20170510174441.26163-1-danielmicay@gmail.com Signed-off-by: Daniel Micay <danielmicay@gmail.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
bd726c90b6 |
Allow stack to grow up to address space limit
Fix expand_upwards() on architectures with an upward-growing stack (parisc, metag and partly IA-64) to allow the stack to reliably grow exactly up to the address space limit given by TASK_SIZE. Signed-off-by: Helge Deller <deller@gmx.de> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
f4cb767d76 |
mm: fix new crash in unmapped_area_topdown()
Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of mmap testing. That's the VM_BUG_ON(gap_end < gap_start) at the end of unmapped_area_topdown(). Linus points out how MAP_FIXED (which does not have to respect our stack guard gap intentions) could result in gap_end below gap_start there. Fix that, and the similar case in its alternative, unmapped_area(). Cc: stable@vger.kernel.org Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas") Reported-by: Dave Jones <davej@codemonkey.org.uk> Debugged-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
280e87e98c |
ARM: 8683/1: ARM32: Support mremap() for sigpage/vDSO
CRIU restores application mappings on the same place where they were before Checkpoint. That means, that we need to move vDSO and sigpage during restore on exactly the same place where they were before C/R. Make mremap() code update mm->context.{sigpage,vdso} pointers during VMA move. Sigpage is used for landing after handling a signal - if the pointer is not updated during moving, the application might crash on any signal after mremap(). vDSO pointer on ARM32 is used only for setting auxv at this moment, update it during mremap() in case of future usage. Without those updates, current work of CRIU on ARM32 is not reliable. Historically, we error Checkpointing if we find vDSO page on ARM32 and suggest user to disable CONFIG_VDSO. But that's not correct - it goes from x86 where signal processing is ended in vDSO blob. For arm32 it's sigpage, which is not disabled with `CONFIG_VDSO=n'. Looks like C/R was working by luck - because userspace on ARM32 at this moment always sets SA_RESTORER. Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: linux-arm-kernel@lists.infradead.org Cc: Will Deacon <will.deacon@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Christopher Covington <cov@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> |
||
|
1be7107fbe |
mm: larger stack guard gap, between vmas
Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
20ac28933c |
mm/mmap: replace SHM_HUGE_MASK with MAP_HUGE_MASK inside mmap_pgoff
Commit 091d0d55b286 ("shm: fix null pointer deref when userspace specifies invalid hugepage size") had replaced MAP_HUGE_MASK with SHM_HUGE_MASK. Though both of them contain the same numeric value of 0x3f, MAP_HUGE_MASK flag sounds more appropriate than the other one in the context. Hence change it back. Link: http://lkml.kernel.org/r/20170404045635.616-1-khandual@linux.vnet.ibm.com Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
94e877d0fb |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile two from Al Viro: - orangefs fix - series of fs/namei.c cleanups from me - VFS stuff coming from overlayfs tree * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: orangefs: Use RCU for destroy_inode vfs: use helper for calling f_op->fsync() mm: use helper for calling f_op->mmap() vfs: use helpers for calling f_op->{read,write}_iter() vfs: pass type instead of fn to do_{loop,iter}_readv_writev() vfs: extract common parts of {compat_,}do_readv_writev() vfs: wrap write f_ops with file_{start,end}_write() vfs: deny copy_file_range() for non regular files vfs: deny fallocate() on directory vfs: create vfs helper vfs_tmpfile() namei.c: split unlazy_walk() namei.c: fold the check for DCACHE_OP_REVALIDATE into d_revalidate() lookup_fast(): clean up the logics around the fallback to non-rcu mode namei: fold unlazy_link() into its sole caller |
||
|
653a7746fa |
Merge remote-tracking branch 'ovl/for-viro' into for-linus
Overlayfs-related series from Miklos and Amir |
||
|
def5efe037 |
mm, madvise: fail with ENOMEM when splitting vma will hit max_map_count
If madvise(2) advice will result in the underlying vma being split and the number of areas mapped by the process will exceed /proc/sys/vm/max_map_count as a result, return ENOMEM instead of EAGAIN. EAGAIN is returned by madvise(2) when a kernel resource, such as slab, is temporarily unavailable. It indicates that userspace should retry the advice in the near future. This is important for advice such as MADV_DONTNEED which is often used by malloc implementations to free memory back to the system: we really do want to free memory back when madvise(2) returns EAGAIN because slab allocations (for vmas, anon_vmas, or mempolicies) cannot be allocated. Encountering /proc/sys/vm/max_map_count is not a temporary failure, however, so return ENOMEM to indicate this is a more serious issue. A followup patch to the man page will specify this behavior. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1701241431120.42507@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
897ab3e0c4 |
userfaultfd: non-cooperative: add event for memory unmaps
When a non-cooperative userfaultfd monitor copies pages in the background, it may encounter regions that were already unmapped. Addition of UFFD_EVENT_UNMAP allows the uffd monitor to track precisely changes in the virtual memory layout. Since there might be different uffd contexts for the affected VMAs, we first should create a temporary representation for the unmap event for each uffd context and then notify them one by one to the appropriate userfault file descriptors. The event notification occurs after the mmap_sem has been released. [arnd@arndb.de: fix nommu build] Link: http://lkml.kernel.org/r/20170203165141.3665284-1-arnd@arndb.de [mhocko@suse.com: fix nommu build] Link: http://lkml.kernel.org/r/20170202091503.GA22823@dhcp22.suse.cz Link: http://lkml.kernel.org/r/1485542673-24387-3-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
846b1a0f1d |
mm: call vm_munmap in munmap syscall instead of using open coded version
Patch series "userfaultfd: non-cooperative: better tracking for mapping changes", v2. These patches try to address issues I've encountered during integration of userfaultfd with CRIU. Previously added userfaultfd events for fork(), madvise() and mremap() unfortunately do not cover all possible changes to a process virtual memory layout required for uffd monitor. When one or more VMAs is removed from the process mm, the external uffd monitor has no way to detect those changes and will attempt to fill the removed regions with userfaultfd_copy. Another problematic event is the exit() of the process. Here again, the external uffd monitor will try to use userfaultfd_copy, although mm owning the memory has already gone. The first patch in the series is a minor cleanup and it's not strictly related to the rest of the series. The patches 2 and 3 below add UFFD_EVENT_UNMAP and UFFD_EVENT_EXIT to allow the uffd monitor track changes in the memory layout of a process. The patches 4 and 5 amend error codes returned by userfaultfd_copy to make the uffd monitor able to cope with races that might occur between delivery of unmap and exit events and outstanding userfaultfd_copy's. This patch (of 5): Commit dc0ef0df7b6a ("mm: make mmap_sem for write waits killable for mm syscalls") replaced call to vm_munmap in munmap syscall with open coded version to allow different waits on mmap_sem in munmap syscall and vm_munmap. Now both functions use down_write_killable, so we can restore the call to vm_munmap from the munmap system call. Link: http://lkml.kernel.org/r/1485542673-24387-2-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
3edf41d845 |
mm: fix comments for mmap_init()
mmap_init() is no longer associated with VMA slab. So fix it. Link: http://lkml.kernel.org/r/1485182601-9294-1-git-send-email-iamyooon@gmail.com Signed-off-by: seokhoon.yoon <iamyooon@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
11bac80004 |
mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [arnd@arndb.de: fix ARM build] Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
16e72e9b30 |
powerpc: do not make the entire heap executable
On 32-bit powerpc the ELF PLT sections of binaries (built with --bss-plt, or with a toolchain which defaults to it) look like this: [17] .sbss NOBITS 0002aff8 01aff8 000014 00 WA 0 0 4 [18] .plt NOBITS 0002b00c 01aff8 000084 00 WAX 0 0 4 [19] .bss NOBITS 0002b090 01aff8 0000a4 00 WA 0 0 4 Which results in an ELF load header: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x019c70 0x00029c70 0x00029c70 0x01388 0x014c4 RWE 0x10000 This is all correct, the load region containing the PLT is marked as executable. Note that the PLT starts at 0002b00c but the file mapping ends at 0002aff8, so the PLT falls in the 0 fill section described by the load header, and after a page boundary. Unfortunately the generic ELF loader ignores the X bit in the load headers when it creates the 0 filled non-file backed mappings. It assumes all of these mappings are RW BSS sections, which is not the case for PPC. gcc/ld has an option (--secure-plt) to not do this, this is said to incur a small performance penalty. Currently, to support 32-bit binaries with PLT in BSS kernel maps *entire brk area* with executable rights for all binaries, even --secure-plt ones. Stop doing that. Teach the ELF loader to check the X bit in the relevant load header and create 0 filled anonymous mappings that are executable if the load header requests that. Test program showing the difference in /proc/$PID/maps: int main() { char buf[16*1024]; char *p = malloc(123); /* make "[heap]" mapping appear */ int fd = open("/proc/self/maps", O_RDONLY); int len = read(fd, buf, sizeof(buf)); write(1, buf, len); printf("%p\n", p); return 0; } Compiled using: gcc -mbss-plt -m32 -Os test.c -otest Unpatched ppc64 kernel: 00100000-00120000 r-xp 00000000 00:00 0 [vdso] 0fe10000-0ffd0000 r-xp 00000000 fd:00 67898094 /usr/lib/libc-2.17.so 0ffd0000-0ffe0000 r--p 001b0000 fd:00 67898094 /usr/lib/libc-2.17.so 0ffe0000-0fff0000 rw-p 001c0000 fd:00 67898094 /usr/lib/libc-2.17.so 10000000-10010000 r-xp 00000000 fd:00 100674505 /home/user/test 10010000-10020000 r--p 00000000 fd:00 100674505 /home/user/test 10020000-10030000 rw-p 00010000 fd:00 100674505 /home/user/test 10690000-106c0000 rwxp 00000000 00:00 0 [heap] f7f70000-f7fa0000 r-xp 00000000 fd:00 67898089 /usr/lib/ld-2.17.so f7fa0000-f7fb0000 r--p 00020000 fd:00 67898089 /usr/lib/ld-2.17.so f7fb0000-f7fc0000 rw-p 00030000 fd:00 67898089 /usr/lib/ld-2.17.so ffa90000-ffac0000 rw-p 00000000 00:00 0 [stack] 0x10690008 Patched ppc64 kernel: 00100000-00120000 r-xp 00000000 00:00 0 [vdso] 0fe10000-0ffd0000 r-xp 00000000 fd:00 67898094 /usr/lib/libc-2.17.so 0ffd0000-0ffe0000 r--p 001b0000 fd:00 67898094 /usr/lib/libc-2.17.so 0ffe0000-0fff0000 rw-p 001c0000 fd:00 67898094 /usr/lib/libc-2.17.so 10000000-10010000 r-xp 00000000 fd:00 100674505 /home/user/test 10010000-10020000 r--p 00000000 fd:00 100674505 /home/user/test 10020000-10030000 rw-p 00010000 fd:00 100674505 /home/user/test 10180000-101b0000 rw-p 00000000 00:00 0 [heap] ^^^^ this has changed f7c60000-f7c90000 r-xp 00000000 fd:00 67898089 /usr/lib/ld-2.17.so f7c90000-f7ca0000 r--p 00020000 fd:00 67898089 /usr/lib/ld-2.17.so f7ca0000-f7cb0000 rw-p 00030000 fd:00 67898089 /usr/lib/ld-2.17.so ff860000-ff890000 rw-p 00000000 00:00 0 [stack] 0x10180008 The patch was originally posted in 2012 by Jason Gunthorpe and apparently ignored: https://lkml.org/lkml/2012/9/30/138 Lightly run-tested. Link: http://lkml.kernel.org/r/20161215131950.23054-1-dvlasenk@redhat.com Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Florian Weimer <fweimer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
f74ac01520 |
mm: use helper for calling f_op->mmap()
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> |
||
|
7c0f6ba682 |
Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
8f26e0b176 |
mm: vma_merge: correct false positive from __vma_unlink->validate_mm_rb
The old code was always doing: vma->vm_end = next->vm_end vma_rb_erase(next) // in __vma_unlink vma->vm_next = next->vm_next // in __vma_unlink next = vma->vm_next vma_gap_update(next) The new code still does the above for remove_next == 1 and 2, but for remove_next == 3 it has been changed and it does: next->vm_start = vma->vm_start vma_rb_erase(vma) // in __vma_unlink vma_gap_update(next) In the latter case, while unlinking "vma", validate_mm_rb() is told to ignore "vma" that is being removed, but next->vm_start was reduced instead. So for the new case, to avoid the false positive from validate_mm_rb, it should be "next" that is ignored when "vma" is being unlinked. "vma" and "next" in the above comment, considered pre-swap(). Link: http://lkml.kernel.org/r/1474492522-2261-4-git-send-email-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Shaun Tancheff <shaun.tancheff@seagate.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Jan Vorlicek <janvorli@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |