msm-4.14

mirror of https://github.com/rd-stuffs/msm-4.14.git synced 2025-02-20 11:45:48 +08:00

Author	SHA1	Message	Date
laoyi	bb32c75f6c	fs: proc: Update perms of process_reclaim node Other userspace apps like AppCompaction would like to use this node, so update permission. Change-Id: Ied22bd6ad489bef4028cde943ac185d1354ab971 Signed-off-by: <laoyi@codeaurora.org> Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>	2021-06-05 14:55:41 +05:30
John Dias	f1033fb035	fs: Improve eventpoll logging to stop indicting timerfd timerfd doesn't create any wakelocks; eventpoll can, and is creating the wakelocks we see called "[timerfd]". eventpoll creates two kinds of wakelocks: a single top-level lock associated with the eventpoll fd itself, and one additional lock for each fd it is polling that needs such a lock (e.g. those using EPOLLWAKEUP). Current code names the per-fd locks using the undecorated names of the fds' associated files (hence "[timerfd]"), and is naming the top-level lock after the PID of the caller and the name of the file behind the first fd for which a per-fd lock is created. To make things clearer, the top-level lock is now named using the caller PID and an "epollfd" designation, while the per-fd locks are also named with the caller's PID (to associate them with the top-level lock) and their respective fds' file names. Port of fix already applied to previous 2 generations. Note that this set of changes does not fully solve the problem of eventpoll/timerfd wakelock attribution to the original process, since most activity is relayed through system_server, but it does at least ensure that different eventpoll wakelocks - and their stats - are properly disambiguated. Test: Ran on device and observed new wakelock naming in /d/wakeup_sources and (file naming in) lsof output. Bug: 116363986 Change-Id: I34bada5ddab04cf3830762c745f46bfcd1549cb8 Signed-off-by: John Dias <joaodias@google.com> Signed-off-by: Kelly Rossmoyer <krossmo@google.com> Signed-off-by: Miguel de Dios <migueldedios@google.com> Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>	2021-06-05 14:55:41 +05:30
Adithya R	4ad4df234e	Merge tag 'LA.UM.9.1.r1-10200-SMxxx0.0' of https://source.codeaurora.org/quic/la/kernel/msm-4.14 into staging/LA.UM.9.x "LA.UM.9.1.r1-10200-SMxxx0.0"	2021-06-05 14:39:57 +05:30
Adhitya Mohan	8a4166ccd4	fs/fuse: shortcircuit: Make it compile	2021-05-25 19:26:37 +05:30
LibXZR	0f1df8b6c0	fs/fuse: shortcircuit: Disable logging	2021-05-25 19:26:37 +05:30
LibXZR	a650eef6fc	fs: fuse: Implement fuse short circuit * This significantly improves i/o performance under /sdcard * From OnePlus 8T Oxygen OS 11.0.8.11.KB05AA and OnePlus 8 Oxygen OS 11.0.5.5.IN21AA and OnePlus 8 Pro Oxygen OS 11.0.5.5.IN11AA RealJohnGalt: make proper Kconfig, add back dependencies from OnePlus source onto our CAF tree. Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-05-25 19:26:37 +05:30
Theodore Ts'o	b8328115cd	ext4: Improve smp scalability for inode generation ->s_next_generation is protected by s_next_gen_lock but its usage pattern is very primitive. We don't actually need sequentially increasing new generation numbers, so let's use prandom_u32() instead. Reported-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: kdrag0n <dragon@khronodragon.com>	2021-05-25 19:26:37 +05:30
Dave Kleikamp	0179baa847	AIO: Don't plug the I/O queue in do_io_submit() Asynchronous I/O latency to a solid-state disk greatly increased between the 2.6.32 and 3.0 kernels. By removing the plug from do_io_submit(), we observed a 34% improvement in the I/O latency. Unfortunately, at this level, we don't know if the request is to a rotating disk or not. Change-Id: I7101df956473ed9fd5dcff18e473dd93b688a5c1 Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: linux-aio@kvack.org Cc: Chris Mason <chris.mason@oracle.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jeff Moyer <jmoyer@redhat.com>	2021-05-25 19:26:34 +05:30
Park Ju Hyung	6da4dd3220	kernfs: Use kmem_cache pool for struct kernfs_open_node/file These get allocated and freed millions of times on this kernel tree. Use a dedicated kmem_cache pool and avoid costly dynamic memory allocations. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>	2021-05-25 19:23:54 +05:30
Park Ju Hyung	1af6fc556c	sdcardfs: Use kmem_cache pool for struct sdcardfs_file_info These get allocated and freed millions of times on this kernel tree. Use a dedicated kmem_cache pool and avoid costly dynamic memory allocations. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>	2021-05-25 19:23:54 +05:30
Pradeep P V K	71070edfaa	fuse: Set fuse request error upon fuse abort connection There is a minor race in setting the fuse out request error between fuse_abort_conn() and fuse_dev_do_read() as explained below. Thread-1 Thread-2 ======== ======== ->fuse_simple_request() ->shutdown ->__fuse_request_send() ->queue_request() ->fuse_abort_conn() ->fuse_dev_do_read() ->acquire(fpq->lock) ->wait_for(fpq->lock) ->set err to all req's in fpq->io ->release(fpq->lock) ->acquire(fpq->lock) ->add req to fpq->io The above scenario may cause Thread-1 request to add into fpq->io list after Thread-2 sets -ECONNABORTED err to all its requests in fpq->io list. This leaves Thread-1 request with unset err and this further misleads as a completed request without an err set upon request_end(). Handle this by setting the err appropriately. Change-Id: I7961c421939423290f6b78a619490d2cc3fbcb33 Signed-off-by: Pradeep P V K <pragalla@codeaurora.org>	2021-05-24 00:07:39 -07:00
Sayali Lokhande	2493b43cc1	f2fs: Avoid double lock for cp_rwsem during checkpoint There could be a scenario where f2fs_sync_node_pages gets called during checkpoint, which in turn tries to flush inline data and calls iput(). This results in deadlock as iput() tries to hold cp_rwsem, which is already held at the beginning by checkpoint->block_operations(). Call stack : Thread A Thread B f2fs_write_checkpoint() - block_operations(sbi) - f2fs_lock_all(sbi); - down_write(&sbi->cp_rwsem); - open() - igrab() - write() write inline data - unlink() - f2fs_sync_node_pages() - if (is_inline_node(page)) - flush_inline_data() - ilookup() page = f2fs_pagecache_get_page() if (!page) goto iput_out; iput_out: -close() -iput() iput(inode); - f2fs_evict_inode() - f2fs_truncate_blocks() - f2fs_lock_op() - down_read(&sbi->cp_rwsem); Change-Id: I048bbf42c0b11108e2444f4d9df5d58e7a779c3c Fixes: 2049d4fcb057 ("f2fs: avoid multiple node page writes due to inline_data") Signed-off-by: Sayali Lokhande <sayalil@codeaurora.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Git-commit: 34c061ad85a2f5d5e9e3b045d72f3b211db6e282 Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/ Signed-off-by: Sayali Lokhande <sayalil@codeaurora.org> Signed-off-by: celtare21 <celtare21@gmail.com>	2021-05-22 16:11:26 +05:30
Adithya R	295a88ddbd	fs/ext4: inode: Remove an unused variable	2021-05-15 13:52:20 +05:30
Sultan Alsawaf	c65672b0d7	mm: Perform PID map reads on the little CPU cluster PID map reads for processes with thousands of mappings can be done extensively by certain Android apps, burning through CPU time on higher-performance CPUs even though reading PID maps is never a performance-critical task. We can relieve the load on the important CPUs by moving PID map reads to little CPUs via sched_migrate_to_cpumask_*(). Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>	2021-05-11 20:45:50 +05:30
Alexey Dobriyan	2777028c65	proc: reject "." and ".." as filenames Various subsystems can create files and directories in /proc with names directly controlled by userspace. Which means "/", "." and ".." are no-no. "/" split is already taken care of, do the other 2 prohibited names. Link: http://lkml.kernel.org/r/20180310001223.GB12443@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 175721be2e505649ee481a9cd2bf14b228d12f2b) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:35 +05:30
Alexey Dobriyan	465a23a06b	proc: do mmput ASAP for /proc/*/map_files mm_struct is not needed while printing as all the data was already extracted. Link: http://lkml.kernel.org/r/20180309223120.GC3843@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 77c98ef42fa00e1d6e5631ed41c98d75fefd48e3) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:35 +05:30
Alexey Dobriyan	6cf273d41e	proc: register filesystem last As soon as register_filesystem() exits, filesystem can be mounted. It is better to present fully operational /proc. Of course it doesn't matter because /proc is not modular but do it anyway. Drop error check, it should be handled by panicking. Link: http://lkml.kernel.org/r/20180309222709.GA3843@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 6f695f71eb903c41491e75e25dcfff47ec3a6db7) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:29 +05:30
Alexey Dobriyan	76decf60fd	proc: fix /proc/*/map_files lookup some more I totally forgot that _parse_integer() accepts arbitrary amount of leading zeroes leading to the following lookups: OK # readlink /proc/1/map_files/56427ecba000-56427eddc000 /lib/systemd/systemd bogus # readlink /proc/1/map_files/00000000000056427ecba000-56427eddc000 /lib/systemd/systemd # readlink /proc/1/map_files/56427ecba000-00000000000056427eddc000 /lib/systemd/systemd Link: http://lkml.kernel.org/r/20180303215130.GA23480@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit a97c1ccade29b8cf7a7ca614ece30929b045a143) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:28 +05:30
Danilo Krummrich	f4bf867a08	fs/proc/proc_sysctl.c: remove redundant link check in proc_sys_link_fill_cache() proc_sys_link_fill_cache() does not need to check whether we're called for a link - it's already done by scan(). Link: http://lkml.kernel.org/r/20180228013506.4915-2-danilokrummrich@dk-develop.de Signed-off-by: Danilo Krummrich <danilokrummrich@dk-develop.de> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Luis R . Rodriguez" <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit b7c0004afe47f4d2e830d1b4a971253911bd861c) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:28 +05:30
Alexey Dobriyan	81cec1ffcc	proc: use seq_puts() at /proc/*/wchan Link: http://lkml.kernel.org/r/20180217072011.GB16074@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 05cca699535589d40946e4eb5e613547b02b5f8f) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:28 +05:30
Alexey Dobriyan	c526f505f7	proc: check permissions earlier for /proc/*/wchan get_wchan() accesses stack page before permissions are checked, let's not play this game. Link: http://lkml.kernel.org/r/20180217071923.GA16074@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 52dfa0c717dafa851b93cab220d168c7ebd2eaec) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:27 +05:30
Andrei Vagin	90778f2436	proc: replace seq_printf by seq_put_smth to speed up /proc/pid/status seq_printf() works slower than seq_puts(), seq_putc(), etc. == test_proc.c int main(int argc, char **argv) { int n, i, fd; char buf[16384]; n = atoi(argv[1]); for (i = 0; i < n; i++) { fd = open(argv[2], O_RDONLY); if (fd < 0) return 1; if (read(fd, buf, sizeof(buf)) <= 0) return 1; close(fd); } return 0; } == $ time ./test_proc 1000000 /proc/1/status == Before patch == real 0m5.171s user 0m0.328s sys 0m4.783s == After patch == real 0m4.761s user 0m0.334s sys 0m4.366s Link: http://lkml.kernel.org/r/20180212074931.7227-4-avagin@openvz.org Signed-off-by: Andrei Vagin <avagin@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 23e929be65a987090ef4f723112865018b339b0f) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:27 +05:30
Andrei Vagin	a3fd0359df	proc: optimize single-symbol delimiters to speed up seq_put_decimal_ull A delimiter is a string which is printed before a number. A single-symbol delimiter can be printed by seq_putc(), which works faster than printing it with seq_puts(). == test_proc.c int main(int argc, char **argv) { int n, i, fd; char buf[16384]; n = atoi(argv[1]); for (i = 0; i < n; i++) { fd = open(argv[2], O_RDONLY); if (fd < 0) return 1; if (read(fd, buf, sizeof(buf)) <= 0) return 1; close(fd); } return 0; } == $ time ./test_proc 1000000 /proc/1/stat == Before patch == real 0m3.820s user 0m0.337s sys 0m3.394s == After patch == real 0m3.110s user 0m0.324s sys 0m2.700s Link: http://lkml.kernel.org/r/20180212074931.7227-3-avagin@openvz.org Signed-off-by: Andrei Vagin <avagin@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 089d07e8ab9cb3a49baba464378303fbc452cc4b) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:26 +05:30
Andrei Vagin	6a13533dcb	proc: replace seq_printf on seq_putc to speed up /proc/pid/smaps seq_putc() works much faster than seq_printf() == Before patch == $ time python test_smaps.py real 0m3.828s user 0m0.413s sys 0m3.408s == After patch == $ time python test_smaps.py real 0m3.405s user 0m0.401s sys 0m3.003s == Before patch == - 75.51% 4.62% python [kernel.kallsyms] [k] show_smap.isra.33 - 70.88% show_smap.isra.33 + 24.82% seq_put_decimal_ull_aligned + 19.78% __walk_page_range + 12.74% seq_printf + 11.08% show_map_vma.isra.23 + 1.68% seq_puts == After patch == - 69.16% 5.70% python [kernel.kallsyms] [k] show_smap.isra.33 - 63.46% show_smap.isra.33 + 25.98% seq_put_decimal_ull_aligned + 20.90% __walk_page_range + 12.60% show_map_vma.isra.23 1.56% seq_putc + 1.55% seq_puts Link: http://lkml.kernel.org/r/20180212074931.7227-2-avagin@openvz.org Signed-off-by: Andrei Vagin <avagin@openvz.org> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 8857ea44552da119f9b8997adeedb6bb4e3e6853) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:25 +05:30
Andrei Vagin	5ec7cad9b9	proc: add seq_put_decimal_ull_width to speed up /proc/pid/smaps seq_put_decimal_ull_w(m, str, val, width) prints a decimal number with a specified minimal field width. It is equivalent of seq_printf(m, "%s%*d", str, width, val), but it works much faster. == test_smaps.py num = 0 with open("/proc/1/smaps") as f: for x in xrange(10000): data = f.read() f.seek(0, 0) == == Before patch == $ time python test_smaps.py real 0m4.593s user 0m0.398s sys 0m4.158s == After patch == $ time python test_smaps.py real 0m3.828s user 0m0.413s sys 0m3.408s $ perf -g record python test_smaps.py == Before patch == - 79.01% 3.36% python [kernel.kallsyms] [k] show_smap.isra.33 - 75.65% show_smap.isra.33 + 48.85% seq_printf + 15.75% __walk_page_range + 9.70% show_map_vma.isra.23 0.61% seq_puts == After patch == - 75.51% 4.62% python [kernel.kallsyms] [k] show_smap.isra.33 - 70.88% show_smap.isra.33 + 24.82% seq_put_decimal_ull_w + 19.78% __walk_page_range + 12.74% seq_printf + 11.08% show_map_vma.isra.23 + 1.68% seq_puts [akpm@linux-foundation.org: fix drivers/of/unittest.c build] Link: http://lkml.kernel.org/r/20180212074931.7227-1-avagin@openvz.org Signed-off-by: Andrei Vagin <avagin@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 56d90368ba6ee38ca96ee45ce9c0980af4982117) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:25 +05:30
Konstantin Khlebnikov	603befd8e5	fs/proc/task_mmu.c: do not show VmExe bigger than total executable virtual memory If start_code / end_code pointers are screwed then "VmExe" could be bigger than total executable virtual memory and "VmLib" becomes negative: VmExe: 294320 kB VmLib: 18446744073709327564 kB VmExe and VmLib documented as text segment and shared library code size. Now their sum will be always equal to mm->exec_vm which sums size of executable and not writable and not stack areas. I've seen this for huge (>2Gb) statically linked binary which has whole world inside. For it start_code .. end_code range also covers one of rodata sections. Probably this is bug in customized linker, elf loader or both. Anyway CONFIG_CHECKPOINT_RESTORE allows to change these pointers, thus we cannot trust them without validation. Link: http://lkml.kernel.org/r/150728955451.743749.11276392315459539583.stgit@buzz Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit cfcad2ca397928130217acc011604d530d320049) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:24 +05:30
Kirill A. Shutemov	0dc85f98cd	mm: consolidate page table accounting Currently, we account page tables separately for each page table level, but that's redundant -- we only make use of total memory allocated to page tables for oom_badness calculation. We also provide the information to userspace, but it has dubious value there too. This patch switches page table accounting to single counter. mm->pgtables_bytes is now used to account all page table levels. We use bytes, because page table size for different levels of page table tree may be different. The change has user-visible effect: we don't have VmPMD and VmPUD reported in /proc/[pid]/status. Not sure if anybody uses them. (As alternative, we can always report 0 kB for them.) OOM-killer report is also slightly changed: we now report pgtables_bytes instead of nr_ptes, nr_pmd, nr_puds. Apart from reducing number of counters per-mm, the benefit is that we now calculate oom_badness() more correctly for machines which have different size of page tables depending on level or where page tables are less than a page in size. The only downside can be debuggability because we do not know which page table level could leak. But I do not remember many bugs that would be caught by separate counters so I wouldn't lose sleep over this. [akpm@linux-foundation.org: fix mm/huge_memory.c] Link: http://lkml.kernel.org/r/20171006100651.44742-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> [kirill.shutemov@linux.intel.com: fix build] Link: http://lkml.kernel.org/r/20171016150113.ikfxy3e7zzfvsr4w@black.fi.intel.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 82bf5da35830928c04d1cb35ab89d26d4a14da71) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:24 +05:30
Kirill A. Shutemov	4ca2eb2e71	mm: introduce wrappers to access mm->nr_ptes Let's add wrappers for ->nr_ptes with the same interface as for nr_pmd and nr_pud. The patch also makes nr_ptes accounting dependent onto CONFIG_MMU. Page table accounting doesn't make sense if you don't have page tables. It's preparation for consolidation of page-table counters in mm_struct. Link: http://lkml.kernel.org/r/20171006100651.44742-1-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 7a81a0cfb6310b5453d851fece5fb573049930bd) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:23 +05:30
Kirill A. Shutemov	f6d7bac1dc	mm: account pud page tables On a machine with 5-level paging support a process can allocate significant amount of memory and stay unnoticed by oom-killer and memory cgroup. The trick is to allocate a lot of PUD page tables. We don't account PUD page tables, only PMD and PTE. We already addressed the same issue for PMD page tables, see commit dc6c9a35b66b ("mm: account pmd page tables to the process"). Introduction of 5-level paging brings the same issue for PUD page tables. The patch expands accounting to PUD level. [kirill.shutemov@linux.intel.com: s/pmd_t/pud_t/] Link: http://lkml.kernel.org/r/20171004074305.x35eh5u7ybbt5kar@black.fi.intel.com [heiko.carstens@de.ibm.com: s390/mm: fix pud table accounting] Link: http://lkml.kernel.org/r/20171103090551.18231-1-heiko.carstens@de.ibm.com Link: http://lkml.kernel.org/r/20171002080427.3320-1-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit e786257a2ff2dfbaeaad7aae13fff7c7528da54e) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:23 +05:30
Alexey Dobriyan	397d74d883	proc: account "struct pde_opener" The allocation is persistent in fact as any fool can open a file in /proc and sit on it. Link: http://lkml.kernel.org/r/20180214082409.GC17157@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit a7a0047939ded2ba4e578e510047c4dd7349c326) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:22 +05:30
Alexey Dobriyan	c477fdcf56	proc: move "struct pde_opener" to kmem cache "struct pde_opener" is fixed size and we can have more granular approach to debugging. For those who don't know, per cache SLUB poisoning and red zoning don't work if there is at least one object allocated which is hopeless in case of kmalloc-64 but not in case of standalone cache. Although systemd opens 2 files from the get go, so it is hopeless after all. Link: http://lkml.kernel.org/r/20180214082306.GB17157@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 579f1c6190bb2d5b1a31c8563a60c809510a88a8) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:22 +05:30
Alexey Dobriyan	92a448b67e	fs/proc: use __ro_after_init /proc/self inode numbers, value of proc_inode_cache and st_nlink of /proc/$TGID are fixed constants. Link: http://lkml.kernel.org/r/20180103184707.GA31849@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit f81586a6d74dd559ca21d65ee80b62eeea19e23d) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:21 +05:30
Alexey Dobriyan	8855c18227	proc: faster open/close of files without ->release hook The whole point of code in fs/proc/inode.c is to make sure ->release hook is called either at close() or at rmmod time. All of it is unnecessary if there is no ->release hook. Save allocation+list manipulations under spinlock in that case. Link: http://lkml.kernel.org/r/20180214063033.GA15579@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 7055d8fbe00803a0542947ac9ca41ea554cb5d02) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:21 +05:30
Alexey Dobriyan	fd35b002ac	proc: move /proc/sysvipc creation to where it belongs Move the proc_mkdir() call within the sysvipc subsystem such that we avoid polluting proc_root_init() with petty cpp. [dave@stgolabs.net: contributed changelog] Link: http://lkml.kernel.org/r/20180216161732.GA10297@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 650fedae68888a3b53b7b052147fb5e69af53afb) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:20 +05:30
Alexey Dobriyan	ee658e9c60	proc: do less stuff under ->pde_unload_lock Commit ca469f35a8e9ef ("deal with races between remove_proc_entry() and proc_reg_release()") moved too much stuff under ->pde_unload_lock making a problem described at series "[PATCH v5] procfs: Improve Scaling in proc" worse. While RCU is being figured out, move kfree() out of ->pde_unload_lock. On my potato, difference is only 0.5% speedup with concurrent open+read+close of /proc/cmdline, but the effect should be more noticeable on more capable machines. $ perf stat -r 16 -- ./proc-j 16 Performance counter stats for './proc-j 16' (16 runs): 130569.502377 task-clock (msec) # 15.872 CPUs utilized ( +- 0.05% ) 19,169 context-switches # 0.147 K/sec ( +- 0.18% ) 15 cpu-migrations # 0.000 K/sec ( +- 3.27% ) 437 page-faults # 0.003 K/sec ( +- 1.25% ) 300,172,097,675 cycles # 2.299 GHz ( +- 0.05% ) 96,793,267,308 instructions # 0.32 insn per cycle ( +- 0.04% ) 22,798,342,298 branches # 174.607 M/sec ( +- 0.04% ) 111,764,687 branch-misses # 0.49% of all branches ( +- 0.47% ) 8.226574400 seconds time elapsed ( +- 0.05% ) ^^^^^^^^^^^ $ perf stat -r 16 -- ./proc-j 16 Performance counter stats for './proc-j 16' (16 runs): 129866.777392 task-clock (msec) # 15.869 CPUs utilized ( +- 0.04% ) 19,154 context-switches # 0.147 K/sec ( +- 0.66% ) 14 cpu-migrations # 0.000 K/sec ( +- 1.73% ) 431 page-faults # 0.003 K/sec ( +- 1.09% ) 298,556,520,546 cycles # 2.299 GHz ( +- 0.04% ) 96,525,366,833 instructions # 0.32 insn per cycle ( +- 0.04% ) 22,730,194,043 branches # 175.027 M/sec ( +- 0.04% ) 111,506,074 branch-misses # 0.49% of all branches ( +- 0.18% ) 8.183629778 seconds time elapsed ( +- 0.04% ) ^^^^^^^^^^^ Link: http://lkml.kernel.org/r/20180213132911.GA24298@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit c67e4ec3a63b3c3935de1dabd249e859c49bdd9d) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:20 +05:30
Mateusz Guzik	2d7289323f	proc: get rid of task lock/unlock pair to read umask for the "status" file get_task_umask locks/unlocks the task on its own. The only caller does the same thing immediately after. Utilize the fact the task has to be locked anyway and just do it once. Since there are no other users and the code is short, fold it in. Link: http://lkml.kernel.org/r/1517995608-23683-1-git-send-email-mguzik@redhat.com Signed-off-by: Mateusz Guzik <mguzik@redhat.com> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 7172a169e15bd4dd022214b349d0d304ce296475) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:19 +05:30
Andrei Vagin	a5c7b9ffa5	procfs: optimize seq_pad() to speed up /proc/pid/maps seq_printf() is slow and it can be replaced by memset() in this case. == test.py num = 0 with open("/proc/1/maps") as f: while num < 10000 : data = f.read() f.seek(0, 0) num = num + 1 == == Before patch == $ time python test.py real 0m0.986s user 0m0.279s sys 0m0.707s == After patch == $ time python test.py real 0m0.932s user 0m0.261s sys 0m0.669s $ perf record -g python test.py == Before patch == - 47.35% 3.38% python [kernel.kallsyms] [k] show_map_vma.isra.23 - 43.97% show_map_vma.isra.23 + 20.84% seq_path - 15.73% show_vma_header_prefix + 6.96% seq_pad + 2.94% __GI___libc_read == After patch == - 44.01% 0.34% python [kernel.kallsyms] [k] show_pid_map - 43.67% show_pid_map - 42.91% show_map_vma.isra.23 + 21.55% seq_path - 15.68% show_vma_header_prefix + 2.08% seq_pad 0.55% seq_putc Link: http://lkml.kernel.org/r/20180112185812.7710-2-avagin@openvz.org Signed-off-by: Andrei Vagin <avagin@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit e29c242fe3af7ff38fe3a116cfcaddb6d81af2bf) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:19 +05:30
Andrei Vagin	ab9961756e	procfs: add seq_put_hex_ll to speed up /proc/pid/maps seq_put_hex_ll() prints a number in hexadecimal notation and works faster than seq_printf(). == test.py num = 0 with open("/proc/1/maps") as f: while num < 10000 : data = f.read() f.seek(0, 0) num = num + 1 == == Before patch == $ time python test.py real 0m1.561s user 0m0.257s sys 0m1.302s == After patch == $ time python test.py real 0m0.986s user 0m0.279s sys 0m0.707s $ perf -g record python test.py: == Before patch == - 67.42% 2.82% python [kernel.kallsyms] [k] show_map_vma.isra.22 - 64.60% show_map_vma.isra.22 - 44.98% seq_printf - seq_vprintf - vsnprintf + 14.85% number + 12.22% format_decode 5.56% memcpy_erms + 15.06% seq_path + 4.42% seq_pad + 2.45% __GI___libc_read == After patch == - 47.35% 3.38% python [kernel.kallsyms] [k] show_map_vma.isra.23 - 43.97% show_map_vma.isra.23 + 20.84% seq_path - 15.73% show_vma_header_prefix 10.55% seq_put_hex_ll + 2.65% seq_put_decimal_ull 0.95% seq_putc + 6.96% seq_pad + 2.94% __GI___libc_read [avagin@openvz.org: use unsigned int instead of int where it is suitable] Link: http://lkml.kernel.org/r/20180214025619.4005-1-avagin@openvz.org [avagin@openvz.org: v2] Link: http://lkml.kernel.org/r/20180117082050.25406-1-avagin@openvz.org Link: http://lkml.kernel.org/r/20180112185812.7710-1-avagin@openvz.org Signed-off-by: Andrei Vagin <avagin@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Panchajanya1999 <rsk52959@gmail.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> (cherry picked from commit 962ec0e343fc30e08568fcac42cc4ef099d30e14) Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:57:18 +05:30
Adithya R	9a0ad9edbe	Revert "mm: Micro-optimize PID map reads for arm64 while retaining output format" * causes some games to ban users weirdly This reverts commit c0eaff615fd0380759424e54dcada03f5ec2512e. Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-04-20 14:45:59 +05:30
Sultan Alsawaf	c0eaff615f	mm: Micro-optimize PID map reads for arm64 while retaining output format Android and various applications in Android need to read PID map data in order to work. Some processes can contain over 10,000 mappings, which results in lots of time wasted on simply generating strings. This wasted time adds up, especially in the case of Unity-based games, which utilize the Boehm garbage collector. A game's main process typically has well over 10,000 mappings due to the loaded textures, and the Boehm GC reads PID maps several times a second. This results in over 100,000 map entries being printed out per second, so micro-optimization here is important. Before this commit, show_vma_header_prefix() would typically take around 1000 ns to run on a Snapdragon 855; now it only takes about 50 ns to run, which is a 20x improvement. The primary micro-optimizations here assume that there are no more than 40 bits in the virtual address space, hence the CONFIG_ARM64_VA_BITS check. Arm64 uses a virtual address size of 39 bits, so this perfectly covers it. This also removes padding used to beautify PID map output to further speed up reads and reduce the amount of bytes printed, and optimizes the dentry path retrieval for file-backed mappings. Note, however, that the trailing space at the end of the line for non-file-backed mappings cannot be omitted, as it breaks some PID map parsers. This still retains insignificant leading zeros from printed hex values to maintain the current output format. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>	2021-04-09 19:32:56 +05:30
Adithya R	10a895011e	Revert "mm: Micro-optimize PID map reads for arm64" This reverts commit 9d2d6e9568b96acf0d858801707c64a2bcb058c7.	2021-04-09 19:31:16 +05:30
Park Ju Hyung	8840d7f9a9	ext4: Remove additional tracings added by CAF Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-03-29 15:36:05 +05:30
Park Ju Hyung	6ac1ece269	f2fs: Remove additional tracings added by CAF Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-03-29 15:36:05 +05:30
Park Ju Hyung	ddffb9da80	time: Move frequently used functions to headers and inline Those function are frequently used in various places and declaring them inline can reduce overheads. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com> Signed-off-by: Yaroslav Furman <yaro330@gmail.com>	2021-03-29 03:47:52 +05:30
NeilBrown	ad0b0e6fad	VFS: Use synchronize_rcu_expedited() in namespace_unlock() The synchronize_rcu() in namespace_unlock() is called every time a filesystem is unmounted. If a great many filesystems are mounted, this can cause a noticable slow-down in, for example, system shutdown. The sequence: mkdir -p /tmp/Mtest/{0..5000} time for i in /tmp/Mtest/*; do mount -t tmpfs tmpfs $i ; done time umount /tmp/Mtest/* on a 4-cpu VM can report 8 seconds to mount the tmpfs filesystems, and 100 seconds to unmount them. Boot the same VM with 1 CPU and it takes 18 seconds to mount the tmpfs filesystems, but only 36 to unmount. If we change the synchronize_rcu() to synchronize_rcu_expedited() the umount time on a 4-cpu VM drop to 0.6 seconds. I think this 200-fold speed up is worth the slightly high system impact of using synchronize_rcu_expedited(). Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> (from general rcu perspective) Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Yaroslav Furman <yaro330@gmail.com>	2021-03-29 03:43:58 +05:30
Suren Baghdasaryan	267f03a540	mm: Don't loop through tasks in __set_oom_adj when not necessary [ Upstream commit 67197a4f28d28d0b073ab0427b03cb2ee5382578 ] Currently __set_oom_adj loops through all processes in the system to keep oom_score_adj and oom_score_adj_min in sync between processes sharing their mm. This is done for any task with more that one mm_users, which includes processes with multiple threads (sharing mm and signals). However for such processes the loop is unnecessary because their signal structure is shared as well. Android updates oom_score_adj whenever a tasks changes its role (background/foreground/...) or binds to/unbinds from a service, making it more/less important. Such operation can happen frequently. We noticed that updates to oom_score_adj became more expensive and after further investigation found out that the patch mentioned in "Fixes" introduced a regression. Using Pixel 4 with a typical Android workload, write time to oom_score_adj increased from ~3.57us to ~362us. Moreover this regression linearly depends on the number of multi-threaded processes running on the system. Mark the mm with a new MMF_MULTIPROCESS flag bit when task is created with (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK). Change __set_oom_adj to use MMF_MULTIPROCESS instead of mm_users to decide whether oom_score_adj update should be synchronized between multiple processes. To prevent races between clone() and __set_oom_adj(), when oom_score_adj of the process being cloned might be modified from userspace, we use oom_adj_mutex. Its scope is changed to global. The combination of (CLONE_VM && !CLONE_THREAD) is rarely used except for the case of vfork(). To prevent performance regressions of vfork(), we skip taking oom_adj_mutex and setting MMF_MULTIPROCESS when CLONE_VFORK is specified. Clearing the MMF_MULTIPROCESS flag (when the last process sharing the mm exits) is left out of this patch to keep it simple and because it is believed that this threading model is rare. 
Should there ever be a need for optimizing that case as well, it can be done by hooking into the exit path, likely following the mm_update_next_owner pattern. With the combination of (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK) being quite rare, the regression is gone after the change is applied. [surenb@google.com: v3] Link: https://lkml.kernel.org/r/20200902012558.2335613-1-surenb@google.com Fixes: 44a70adec910 ("mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj") Reported-by: Tim Murray <timmurray@google.com> Suggested-by: Michal Hocko <mhocko@kernel.org> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Eugene Syromiatnikov <esyr@redhat.com> Cc: Christian Kellner <christian@kellner.me> Cc: Adrian Reber <areber@redhat.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Gladkov <gladkov.alexey@gmail.com> Cc: Michel Lespinasse <walken@google.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Bernd Edlinger <bernd.edlinger@hotmail.de> Cc: John Johansen <john.johansen@canonical.com> Cc: Yafang Shao <laoar.shao@gmail.com> Link: https://lkml.kernel.org/r/20200824153036.3201505-1-surenb@google.com Debugged-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-29 03:37:26 +05:30
Alexey Dobriyan	974e2cf043	fs: proc: Faster /proc/cmdline Use seq_puts() and skip format string processing. Link: http://lkml.kernel.org/r/20180309222948.GB3843@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Danny Lin <danny@kdrag0n.dev>	2021-03-28 04:34:17 +05:30
Sultan Alsawaf	fb6704a8d0	proc: cmdline: Patch SafetyNet flags These can be checked by `cat /proc/cmdline` Change-Id: I1438f14121f13ee09ae5c7577c7001544416e3c6 Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>	2021-03-28 04:33:52 +05:30
Vinayak Menon	cb1aedb06e	mm: process_reclaim: Consider compound pages Avoid passing tail pages to isolate_lru_page. In the case process reclaim, since only lru pages are considered, this is just to avoid a warning from isolate_lru_page. Change-Id: I1f54dcec15f8c2d5ba16738657e79d2793d36c77 Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> (cherry-picked from commit 2a580c4466231c19452284c9a4cca22486ae8967) Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>	2021-03-27 22:11:38 +05:30
Jeff Liu	406cbc0149	binfmt_elf: Use get_random_int() to fix entropy depleting Changes: -------- v4->v3: - s/random_stack_user()/get_atrandom_bytes()/ - Move this function to ahead of its use to avoid the predeclaration. v3->v2: - Tweak code comments of random_stack_user(). - Remove redundant bits mask and shift upon the random variable. v2->v1: - Fix random copy to check up buffer length that are not 4-byte multiples. v3 can be found at: http://www.spinics.net/lists/linux-fsdevel/msg59597.html v2 can be found at: http://www.spinics.net/lists/linux-fsdevel/msg59418.html v1 can be found at: http://www.spinics.net/lists/linux-fsdevel/msg59128.html Thanks, -Jeff Entropy is quickly depleted under normal operations like ls(1), cat(1), etc... between 2.6.30 to current mainline, for instance: $ cat /proc/sys/kernel/random/entropy_avail 3428 $ cat /proc/sys/kernel/random/entropy_avail 2911 $cat /proc/sys/kernel/random/entropy_avail 2620 We observed this problem has been occurring since 2.6.30 with fs/binfmt_elf.c: create_elf_tables()->get_random_bytes(), introduced by f06295b44c296c8f ("ELF: implement AT_RANDOM for glibc PRNG seeding"). /* * Generate 16 random bytes for userspace PRNG seeding. */ get_random_bytes(k_rand_bytes, sizeof(k_rand_bytes)); The patch introduces a wrapper around get_random_int() which has lower overhead than calling get_random_bytes() directly. With this patch applied: $ cat /proc/sys/kernel/random/entropy_avail 2731 $ cat /proc/sys/kernel/random/entropy_avail 2802 $ cat /proc/sys/kernel/random/entropy_avail 2878 Analyzed by John Sobecki. 
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <aedilger@gmail.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Arnd Bergmann <arnn@arndb.de> Cc: John Sobecki <john.sobecki@oracle.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Alex Naidis <alex.naidis@linux.com>	2021-03-27 22:11:36 +05:30
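
The test.py benchmark quoted in the two "speed up /proc/pid/maps" commit messages above (a5c7b9ffa5 and ab9961756e) is flattened onto one line by the table layout. Restored to runnable form, it might look like the sketch below. The function wrapper, the `iterations` parameter, the timing printout and the use of /proc/self/maps are assumptions added for illustration; the commit messages read /proc/1/maps directly and time the script with `time python test.py`.

```python
import os
import time


def read_repeatedly(path, iterations=10000):
    """Read `path` from the start `iterations` times; return total characters read."""
    total = 0
    num = 0
    with open(path) as f:
        while num < iterations:
            data = f.read()      # read the whole file, as in the quoted test.py
            f.seek(0, 0)         # rewind so the next read regenerates the content
            total += len(data)
            num = num + 1
    return total


if __name__ == "__main__":
    # Assumption: /proc/self/maps stands in for /proc/1/maps so the script
    # also runs without the privileges needed to read PID 1's mappings.
    path = "/proc/self/maps"
    if os.path.exists(path):
        start = time.perf_counter()
        read_repeatedly(path)
        print(f"elapsed: {time.perf_counter() - start:.3f}s")
```

Absolute timings depend on the machine and on how many mappings the target process has, so only before/after comparisons on the same system are meaningful.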

54423 Commits