"VFS: don't keep disconnected dentries on d_anon" had a non-trivial
side-effect - d_unhashed() now returns true for those dentries,
making d_find_alias() skip them altogether. For most of its callers
that's fine - we really want a connected alias there. However,
there is a codepath that relied upon picking such aliases when
nothing else could be found: SELinux's delayed initialization of
contexts for inodes on already-mounted filesystems.
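A minimal sketch of the kind of fallback involved, assuming the
SELinux inode-setup path can fall back to the VFS helper
d_find_any_alias(), which also returns disconnected aliases (sketch,
not the exact upstream diff):
    /* hypothetical: prefer a connected alias, accept any as fallback */
    dentry = d_find_alias(inode);
    if (!dentry)
        dentry = d_find_any_alias(inode);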
Cc: stable@kernel.org # f1ee616214cb "VFS: don't keep disconnected dentries on d_anon"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
The audit MAC_POLICY_LOAD record had redundant dangling keywords and was
missing information about which LSM was responsible and its completion
status. While this record is only issued on success, the parser expects
the res= field to be present.
Old record:
type=MAC_POLICY_LOAD msg=audit(1479299795.404:43): policy loaded auid=0 ses=1
Delete the redundant dangling keywords, add the lsm= field and the res=
field.
New record:
type=MAC_POLICY_LOAD msg=audit(1523293846.204:894): auid=0 ses=1 lsm=selinux res=1
See: https://github.com/linux-audit/audit-kernel/issues/47
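A rough sketch of the resulting call, assuming the record is emitted
via audit_log() from the policy-load path (the context helper and
field values are illustrative):
    audit_log(audit_context(), GFP_KERNEL, AUDIT_MAC_POLICY_LOAD,
              "auid=%u ses=%u lsm=selinux res=1",
              from_kuid(&init_user_ns, audit_get_loginuid(current)),
              audit_get_sessionid(current));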
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
There were two formats of the audit MAC_STATUS record, one of which was more
standard than the other. One listed enforcing status changes and the
other listed enabled status changes with a non-standard label. In
addition, the record was missing information about which LSM was
responsible and the operation's completion status. While this record is
only issued on success, the parser expects the res= field to be present.
old enforcing/permissive:
type=MAC_STATUS msg=audit(1523312831.378:24514): enforcing=0 old_enforcing=1 auid=0 ses=1
old enable/disable:
type=MAC_STATUS msg=audit(1523312831.378:24514): selinux=0 auid=0 ses=1
List both sets of status and old values and add the lsm= field and the
res= field.
Here is the new format:
type=MAC_STATUS msg=audit(1523293828.657:891): enforcing=0 old_enforcing=1 auid=0 ses=1 enabled=1 old-enabled=1 lsm=selinux res=1
This record already accompanied a SYSCALL record.
See: https://github.com/linux-audit/audit-kernel/issues/46
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: 80-char fixes, merge fuzz, use new SELinux state functions]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Commit 0619f0f5e36f ("selinux: wrap selinuxfs state") triggers a BUG
when SELinux is runtime-disabled (i.e. systemd or equivalent disables
SELinux before initial policy load via /sys/fs/selinux/disable based on
/etc/selinux/config SELINUX=disabled).
This does not manifest if SELinux is disabled via kernel command line
argument or if SELinux is enabled (permissive or enforcing).
Before:
SELinux: Disabled at runtime.
BUG: Dentry 000000006d77e5c7{i=17,n=null} still in use (1) [unmount of selinuxfs selinuxfs]
After:
SELinux: Disabled at runtime.
Fixes: 0619f0f5e36f ("selinux: wrap selinuxfs state")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Remove rtnl_lock() in selinux_xfrm_notify_policyload().
rt_genid_bump_all() consists of an IPv4 part and an IPv6 part.
The IPv4 part increments net::ipv4::rt_genid, and there are many
places where that is read without rtnl_lock(). The IPv6 part calls
__fib6_clean_all(), which is likewise called without rtnl_lock()
elsewhere. So rtnl_lock() here was only used to iterate
net_namespace_list, and we can remove it.
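A sketch of the resulting function, assuming net_namespace_list is
instead protected by net_rwsem, as elsewhere in this series:
    static void selinux_xfrm_notify_policyload(void)
    {
        struct net *net;
        down_read(&net_rwsem);  /* guards net_namespace_list iteration */
        for_each_net(net)
            rt_genid_bump_all(net);
        up_read(&net_rwsem);
    }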
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
For IMA purposes, we want to be able to obtain the prepared secid in the
bprm structure before the credentials are committed. Add a cred_getsecid
hook that makes this possible.
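The plumbing presumably mirrors the other cred hooks; a sketch (exact
placement and wrapper body are assumptions):
    /* LSM hook: fetch the security identifier of a set of credentials */
    void (*cred_getsecid)(const struct cred *c, u32 *secid);
    /* caller-facing wrapper usable by IMA before creds are committed */
    void security_cred_getsecid(const struct cred *c, u32 *secid)
    {
        *secid = 0;
        call_void_hook(cred_getsecid, c, secid);
    }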
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Pass kern_ipc_perm, not msg_queue, into the msg_queue security hooks.
All of the implementations of security hooks that take msg_queue only
access q_perm, the struct kern_ipc_perm member. This means the
dependencies of the msg_queue security hooks can be simplified by
passing the kern_ipc_perm member of msg_queue.
Making this change will allow struct msg_queue to become private to
ipc/msg.c.
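The signature change presumably has this shape (one hook shown for
illustration):
    /* before: the hook required the full ipc/msg.c type */
    int (*msg_queue_associate)(struct msg_queue *msq, int msqflg);
    /* after: only the embedded kern_ipc_perm is passed */
    int (*msg_queue_associate)(struct kern_ipc_perm *msq, int msqflg);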
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Pass kern_ipc_perm, not shmid_kernel, into the shm security hooks.
All of the implementations of security hooks that take shmid_kernel only
access shm_perm, the struct kern_ipc_perm member. This means the
dependencies of the shm security hooks can be simplified by passing
the kern_ipc_perm member of shmid_kernel.
Making this change will allow struct shmid_kernel to become private to
ipc/shm.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
All of the implementations of security hooks that take sem_array only
access sem_perm, the struct kern_ipc_perm member. This means the
dependencies of the sem security hooks can be simplified by passing
the kern_ipc_perm member of sem_array.
Making this change will allow struct sem and struct sem_array
to become private to ipc/sem.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Fix handling of uninitialized selinux state in get_bools/classes.
If security_get_bools/classes are called before the selinux state is
initialized (i.e. before first policy load), then they should just
return immediately with no booleans/classes.
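A sketch of the early return, assuming an initialized flag on the
selinux state (field and parameter names are illustrative):
    /* in security_get_bools(); security_get_classes() is analogous */
    if (!state->initialized) {
        *len = 0;
        *names = NULL;
        *values = NULL;
        return 0;
    }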
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
write_op[] is never modified, so make it 'const'.
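That is, roughly (the element signature shown is an assumption):
    static ssize_t (*const write_op[])(struct file *, char *, size_t) = {
        /* ... sel_write_load() and friends ... */
    };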
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Replace printk with pr_* to avoid checkpatch warnings.
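The conversion pattern, with an illustrative message:
    printk(KERN_ERR "SELinux: out of memory\n");  /* before */
    pr_err("SELinux: out of memory\n");           /* after */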
Signed-off-by: Peter Enderborg <peter.enderborg@sony.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
The flex arrays were being used for constant sized arrays, so there's no
benefit to using flex_arrays over something simpler.
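The shape of the conversion, assuming a plain kvcalloc() array
replaces the flex_array (sketch, not the exact diff):
    /* before: constant-sized data behind the flex_array API */
    fa = flex_array_alloc(sizeof(u32), nel, GFP_KERNEL | __GFP_ZERO);
    flex_array_put(fa, i, &val, GFP_KERNEL);
    p = flex_array_get(fa, i);
    /* after: an ordinary array, indexed directly */
    arr = kvcalloc(nel, sizeof(u32), GFP_KERNEL);
    arr[i] = val;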
Link: http://lkml.kernel.org/r/20181217131929.11727-4-kent.overstreet@gmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Multiple packets may come in while in the suspend state, which can
cause glink to call pm_system_wakeup() for each packet. To avoid this,
set should_wakeup to false after the first packet received in the
suspend state.
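A sketch of the intended logic (the einfo naming is hypothetical):
    /* on packet receipt while suspended */
    if (einfo->should_wakeup) {
        pm_system_wakeup();
        /* only the first packet per suspend cycle needs to wake us */
        einfo->should_wakeup = false;
    }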
Change-Id: Ifa2bd13229ec756c11d02c1891f105596697e87b
Signed-off-by: Deepak Kumar Singh <deesin@codeaurora.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Currently, wakeup IRQ logging is broken, which ends up with "Resume
cause unknown" in the logs. This makes battery stats unreadable, thus
turning power-usage analysis into divination from a crystal ball.
Fix this by logging the wakeup IRQ to the Google wakeup_stats driver.
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
[ Upstream commit e78a761 ]
Scheduling-clock interrupts can arrive late in the CPU-offline process,
after idle entry and the subsequent call to cpuhp_report_idle_dead().
Once execution passes the call to rcu_report_dead(), RCU is ignoring
the CPU, which results in lockdep complaints when the interrupt handler
uses RCU:
------------------------------------------------------------------------
=============================
WARNING: suspicious RCU usage
5.2.0-rc1+ #681 Not tainted
-----------------------------
kernel/sched/fair.c:9542 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from offline CPU!
rcu_scheduler_active = 2, debug_locks = 1
no locks held by swapper/5/0.
stack backtrace:
CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.2.0-rc1+ #681
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
Call Trace:
<IRQ>
dump_stack+0x5e/0x8b
trigger_load_balance+0xa8/0x390
? tick_sched_do_timer+0x60/0x60
update_process_times+0x3b/0x50
tick_sched_handle+0x2f/0x40
tick_sched_timer+0x32/0x70
__hrtimer_run_queues+0xd3/0x3b0
hrtimer_interrupt+0x11d/0x270
? sched_clock_local+0xc/0x74
smp_apic_timer_interrupt+0x79/0x200
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:delay_tsc+0x22/0x50
Code: ff 0f 1f 80 00 00 00 00 65 44 8b 05 18 a7 11 48 0f ae e8 0f 31 48 89 d6 48 c1 e6 20 48 09 c6 eb 0e f3 90 65 8b 05 fe a6 11 48 <41> 39 c0 75 18 0f ae e8 0f 31 48 c1 e2 20 48 09 c2 48 89 d0 48 29
RSP: 0000:ffff8f92c0157ed0 EFLAGS: 00000212 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000005 RBX: ffff8c861f356400 RCX: ffff8f92c0157e64
RDX: 000000321214c8cc RSI: 00000032120daa7f RDI: 0000000000260f15
RBP: 0000000000000005 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8c861ee18000 R15: ffff8c861ee18000
cpuhp_report_idle_dead+0x31/0x60
do_idle+0x1d5/0x200
? _raw_spin_unlock_irqrestore+0x2d/0x40
cpu_startup_entry+0x14/0x20
start_secondary+0x151/0x170
secondary_startup_64+0xa4/0xb0
------------------------------------------------------------------------
This happens rarely, but can be forced to happen more often by
placing delays in cpuhp_report_idle_dead() following the call to
rcu_report_dead(). With this in place, the following rcutorture
scenario reproduces the problem within a few minutes:
tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --duration 5 --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --configs "TREE04"
This commit uses the crude but effective expedient of moving the disabling
of interrupts within the idle loop to precede the cpu_is_offline()
check. It also invokes tick_nohz_idle_stop_tick() instead of
tick_nohz_idle_stop_tick_protected() to shut off the scheduling-clock
interrupt.
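The resulting ordering in do_idle() is presumably along these lines
(sketch based on the description above):
    /* in the idle loop: interrupts off before the offline check */
    local_irq_disable();
    if (cpu_is_offline(cpu)) {
        tick_nohz_idle_stop_tick();  /* not the _protected() variant */
        cpuhp_report_idle_dead();
        arch_cpu_idle_dead();
    }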
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
[ paulmck: Revert tick_nohz_idle_stop_tick_protected() removal, new callers. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
xNombre: Android modifies some scheduler parameters on boot.
Applying these manually resulted in better hackbench performance.
Signed-off-by: Juhyung Park <qkrwngud825@gmail.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Per-process reclaim needs to deactivate file pages from the active LRU
when "echo file > /proc/<pid>/reclaim" is used. Add deactivation of
those file pages.
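A sketch of the per-page handling, assuming the existing
deactivate_file_page() helper is used:
    /* in the per-process reclaim walk, "file" mode */
    if (page_is_file_cache(page) && PageActive(page))
        deactivate_file_page(page);  /* move it to the inactive LRU */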
Bug: 131016077
Bug: 153444106
(cherry picked from b07eab27085611203ad359b7f4eecd138d7d771a)
Change-Id: I06fed20103671e4ca6fb8663d5029736442162a5
Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Martin Liu <liumartin@google.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Mike Kravetz reports that "hugetlb allocations could stall for minutes or
hours when should_compact_retry() would return true more often than it
should. Specifically, this was in the case where compact_result was
COMPACT_DEFERRED and COMPACT_PARTIAL_SKIPPED and no progress was being
made."
The problem is that the compaction_withdrawn() test in
should_compact_retry() includes compaction outcomes that are only possible
on low compaction priority, and results in a retry without increasing the
priority. This may result in further reclaim, and more incomplete
compaction attempts.
With this patch, compaction priority is raised when possible, or
should_compact_retry() returns false.
The COMPACT_SKIPPED result doesn't really fit together with the other
outcomes in compaction_withdrawn(), as that's a result caused by
insufficient order-0 pages, not due to low compaction priority. With this
patch, it is moved to a new compaction_needs_reclaim() function, and for
that outcome we keep the current logic of retrying if it looks like
reclaim will be able to help.
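A sketch of the new helper (based on the description above; the exact
upstream code may differ):
    static inline bool compaction_needs_reclaim(enum compact_result result)
    {
        /*
         * Compaction was skipped for lack of free order-0 pages, so
         * reclaim, not a higher compaction priority, is what can help.
         */
        return result == COMPACT_SKIPPED;
    }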
Bug: 156785617
Link: http://lkml.kernel.org/r/20190806014744.15446-4-mike.kravetz@oracle.com
Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Tested-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Minchan Kim <minchan@google.com>
Change-Id: I67134003597caa963d5ecff7e2a42ef101e3aa4a
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
It's possible for a user to launch an app while the platform is
reclaiming that app's memory via per-process reclaim. In that case,
the platform should stop the reclaim and let the app launch by
releasing mmap_sem.
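A sketch of the bail-out, assuming rwsem contention is used as the
signal that something (e.g. an app launch) wants mmap_sem:
    /* in the reclaim loop, between VMAs */
    if (rwsem_is_contended(&mm->mmap_sem)) {
        up_read(&mm->mmap_sem);  /* let the app-launch path proceed */
        goto out;
    }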
Bug: 158479061
Signed-off-by: Minchan <minchan@google.com>
Change-Id: I7315031981629f32eb16a465e0e6da3cd13d6373
(cherry picked from commit 2cd57b68c235a8b5cbd7c970cddc1a0fdd643fd1)
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
If we don't have enough swap space, or there are no more anonymous
pages, it's pointless to scan anonymous VMAs. Let's skip them.
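A sketch of the skip condition in the VMA walk (helper usage is
illustrative):
    if (vma_is_anonymous(vma) && get_nr_swap_pages() <= 0)
        continue;  /* no swap left; scanning anon pages is pointless */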
Bug: 131016077
Bug: 152499875
Test: boot
(cherry picked from f57c6720030cf839e2bef149796d7e9f86c0d7d6)
Change-Id: If9d0831e9e712a2a335c3f3e771eb8ed4af94c6e
Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Martin Liu <liumartin@google.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
With testing, I found that per-process reclaim sometimes shows a
regression, with more LMKD kills, because LMKD still relies on the
file-LRU size while per-process reclaim can make the file LRU smaller.
This patch changes the policy of file-backed LRU shrinking.
Instead of discarding pages, per-process reclaim just makes them
easily reclaimable: move active file pages to the inactive list, and
clear PG_referenced and the pte access bits.
With that, we can keep the file LRU bigger, so there is more chance
to hit in the cache, while the VM can still shrink quickly when a
memory spike happens.
Bug: 131016077
Bug: 153444106
Test: boot
(cherry picked from 0f412d789861f06b14737d1e681b06e95cefda62)
Change-Id: I6b054feb223ac66977ddcf92a669f032d4030de1
Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Martin Liu <liumartin@google.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
These days, there are many platforms available in the embedded market,
and they are smarter than the kernel, which has very limited
information about the working set; so they want to be involved in
memory management more heavily, like Android's low-memory killer and
ashmem, or the many recent low-memory notifiers.
One simple scenario illustrating userspace's intelligence: the
platform can manage tasks as foreground and background, so it may be
better to reclaim a background task's pages for the sake of end-user
*responsiveness*, even if those pages are frequently referenced.
This patch adds a new knob, /proc/<pid>/reclaim, so a task manager
can reclaim the pages of any target process anytime, anywhere. It
gives the platform another method for using memory efficiently.
It can avoid killing processes to get free memory, which was a really
terrible experience: I lost the best game score I ever had after I
switched to a phone call while enjoying the game.
Reclaim file-backed pages only.
echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
echo anon > /proc/PID/reclaim
Reclaim all pages
echo all > /proc/PID/reclaim
Bug: 131016077
Bug: 153444106
Test: boot
(cherry picked from 18c2af05a553f17d354b88b3a45dadc114c8c72c)
Change-Id: I99b51544f79202c097214d3856678cac4449a743
Signed-off-by: Tim Murray <timmurray@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Martin Liu <liumartin@google.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
We don't have to use an entire 'long' for the number of elements in the
pagevec; we know it's a number between 0 and 14 (now 15). So we can
store it in a char, and then the bool packs next to it and we still have
two or six bytes of padding for more elements in the header. That gives
us space to cram in an extra page.
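The resulting layout is presumably along these lines:
    struct pagevec {
        unsigned char nr;          /* 0..PAGEVEC_SIZE fits in a char */
        bool percpu_pvec_drained;  /* packs right next to it */
        struct page *pages[PAGEVEC_SIZE];
    };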
Link: http://lkml.kernel.org/r/20171206022521.GM26021@bombadil.infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
According to Vlastimil Babka, the drained field in pagevec is
potentially misleading because it might be interpreted as draining this
pagevec instead of the percpu lru pagevecs. Rename the field for
clarity.
Link: http://lkml.kernel.org/r/20171019093346.ylahzdpzmoriyf4v@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Every pagevec_init user claims the pages being released are hot even in
cases where it is unlikely the pages are hot. As no one cares about the
hotness of pages being released to the allocator, just ditch the
parameter.
No performance impact is expected as the overhead is marginal. The
parameter is removed simply because it is a bit stupid to have a useless
parameter copied everywhere.
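Callers presumably change like this:
    pagevec_init(&pvec, 0);  /* before: useless "cold" argument */
    pagevec_init(&pvec);     /* after */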
Link: http://lkml.kernel.org/r/20171018075952.10627-6-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
When a pagevec is initialised on the stack, it is generally used
multiple times over a range of pages, looking up entries and then
releasing them. On each pagevec_release, the per-cpu deferred LRU
pagevecs are drained on the grounds the page being released may be on
those queues and the pages may be cache hot. In many cases only the
first drain is necessary as it's unlikely that the range of pages being
walked is racing against LRU addition. Even if there is such a race,
the impact is marginal, whereas constantly redraining the LRU pagevecs
has a real cost.
This patch ensures that pagevec is only drained once in a given
lifecycle without increasing the cache footprint of the pagevec
structure. Only sparsetruncate tiny is shown here, as large files have
many exceptional entries and call pagevec_release less frequently.
sparsetruncate (tiny)
4.14.0-rc4 4.14.0-rc4
batchshadow-v1r1 onedrain-v1r1
Min Time 141.00 ( 0.00%) 141.00 ( 0.00%)
1st-qrtle Time 142.00 ( 0.00%) 142.00 ( 0.00%)
2nd-qrtle Time 142.00 ( 0.00%) 142.00 ( 0.00%)
3rd-qrtle Time 143.00 ( 0.00%) 143.00 ( 0.00%)
Max-90% Time 144.00 ( 0.00%) 144.00 ( 0.00%)
Max-95% Time 146.00 ( 0.00%) 145.00 ( 0.68%)
Max-99% Time 198.00 ( 0.00%) 194.00 ( 2.02%)
Max Time 254.00 ( 0.00%) 208.00 ( 18.11%)
Amean Time 145.12 ( 0.00%) 144.30 ( 0.56%)
Stddev Time 12.74 ( 0.00%) 9.62 ( 24.49%)
Coeff Time 8.78 ( 0.00%) 6.67 ( 24.06%)
Best99%Amean Time 144.29 ( 0.00%) 143.82 ( 0.32%)
Best95%Amean Time 142.68 ( 0.00%) 142.31 ( 0.26%)
Best90%Amean Time 142.52 ( 0.00%) 142.19 ( 0.24%)
Best75%Amean Time 142.26 ( 0.00%) 141.98 ( 0.20%)
Best50%Amean Time 141.90 ( 0.00%) 141.71 ( 0.13%)
Best25%Amean Time 141.80 ( 0.00%) 141.43 ( 0.26%)
The impact on bonnie is marginal and within the noise because a
significant percentage of the file being truncated has been reclaimed
and consists of shadow entries which reduce the hotness of the
pagevec_release path.
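A sketch of the once-per-lifecycle drain (using the field name from
the rename above; the cold argument is dropped by the other patch
above, so details may differ):
    void __pagevec_release(struct pagevec *pvec)
    {
        if (!pvec->percpu_pvec_drained) {
            /* drain the per-cpu LRU pagevecs only on first release */
            lru_add_drain();
            pvec->percpu_pvec_drained = true;
        }
        release_pages(pvec->pages, pagevec_count(pvec), pvec->cold);
        pagevec_reinit(pvec);
    }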
Link: http://lkml.kernel.org/r/20171018075952.10627-5-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
The following warning occurs because we don't update the runqueue's
clock when taking rq->lock in sched_migrate_to_cpumask_end():
rq->clock_update_flags < RQCF_ACT_SKIP
WARNING: CPU: 0 PID: 991 at update_curr+0x1c8/0x2bc
[...]
Call trace:
update_curr+0x1c8/0x2bc
dequeue_task_fair+0x7c/0x1238
do_set_cpus_allowed+0x64/0x28c
sched_migrate_to_cpumask_end+0xa8/0x1b4
m_stop+0x40/0x78
seq_read+0x39c/0x4ac
__vfs_read+0x44/0x12c
vfs_read+0xf0/0x1d8
SyS_read+0x6c/0xcc
el0_svc_naked+0x34/0x38
Fix it by adding an update_rq_clock() call when taking rq->lock.
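That is, roughly:
    /* in sched_migrate_to_cpumask_end(), after taking rq->lock */
    update_rq_clock(rq);  /* satisfies the RQCF_ACT_SKIP assertion */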
Signed-off-by: celtare21 <celtare21@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Clang warns:
../drivers/staging/qcacld-3.0/core/cds/src/cds_regdomain.c:284:43: warning: suspicious concatenation of string literals in an array initialization; did you mean to separate the elements with a comma? [-Wstring-concatenation]
{CTRY_TURKS_AND_CAICOS, FCC3_WORLD, "TC" "TURKS AND CAICOS"},
^
,
../drivers/staging/qcacld-3.0/core/cds/src/cds_regdomain.c:284:38: note: place parentheses around the string literal to silence warning
{CTRY_TURKS_AND_CAICOS, FCC3_WORLD, "TC" "TURKS AND CAICOS"},
^
../drivers/staging/qcacld-3.0/core/cds/src/cds_regdomain.c:296:45: warning: suspicious concatenation of string literals in an array initialization; did you mean to separate the elements with a comma? [-Wstring-concatenation]
{CTRY_WALLIS_AND_FUTUNA, ETSI1_WORLD, "WF" "WALLIS"},
^
,
../drivers/staging/qcacld-3.0/core/cds/src/cds_regdomain.c:296:40: note: place parentheses around the string literal to silence warning
{CTRY_WALLIS_AND_FUTUNA, ETSI1_WORLD, "WF" "WALLIS"},
^
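The fix is presumably to add the missing comma, so the country code
and name remain separate initializers instead of being concatenated:
    {CTRY_TURKS_AND_CAICOS, FCC3_WORLD, "TC" "TURKS AND CAICOS"},  /* before */
    {CTRY_TURKS_AND_CAICOS, FCC3_WORLD, "TC", "TURKS AND CAICOS"}, /* after */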
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: atndko <z1281552865@gmail.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
During SSR use cases, when the AFE APR handle is NULL and AFE close is
invoked, the mutex is unlocked without having been locked. Fix it by
bailing out without unlocking the mutex in this scenario.
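A sketch of the intended control flow (the q6afe symbol names are
assumptions):
    if (!this_afe.apr) {
        pr_err("%s: AFE APR handle NULL\n", __func__);
        return -EINVAL;  /* bail out: the mutex was never taken */
    }
    mutex_lock(&this_afe.afe_cmd_lock);
    /* ... locked close/teardown work ... */
    mutex_unlock(&this_afe.afe_cmd_lock);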
Change-Id: Ia2988b56425d8c2d5c726d5860c13e655e7e4ed1
Signed-off-by: Aditya Bavanari <abavanar@codeaurora.org>
Signed-off-by: azrim <mirzaspc@gmail.com>
It can happen that load_balance() finds a busiest group and then a
busiest rq but the calculated imbalance is in fact 0.
In such a situation, detach_tasks() returns immediately and leaves the
LBF_ALL_PINNED flag set. The busiest CPU is then wrongly assumed to
have pinned tasks and is removed from the load-balance mask. Then we
redo the load balance without the busiest CPU. This creates a wrong
load-balance situation and generates wrong task migrations.
If the calculated imbalance is 0, it's useless to try to find a
busiest rq as no task will be migrated and we can return immediately.
This situation can happen on heterogeneous or SMP systems when RT
tasks are decreasing the capacity of some CPUs.
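A sketch of the early return in load_balance() (placement and the
stats bump are illustrative):
    group = find_busiest_group(&env);
    if (!group || !env.imbalance) {
        /* nothing to migrate: don't blame pinned tasks or redo */
        schedstat_inc(sd->lb_nobusyg[idle]);
        goto out_balanced;
    }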
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: jhugo@codeaurora.org
Link: http://lkml.kernel.org/r/1536306664-29827-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
When a task exits, it notifies the parent that it has exited. This is a
sync wakeup and the exiting task may pull the parent towards the wakers
CPU. For simple workloads like using a shell, it was observed that the
shell is pulled across nodes by exiting processes. This is daft as the
parent may be long-lived and properly placed. This patch special cases a
sync wakeup on exit to avoid pulling tasks across nodes. Testing on a range
of workloads and machines showed very little differences in performance
although there was a small 3% boost on some machines running a shellscript
intensive workload (git regression test suite).
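The special-casing presumably reduces to not treating the wakeup as
sync when the waker is exiting:
    /* in select_task_rq_fair() */
    int sync = (wake_flags & WF_SYNC) && !(current->flags & PF_EXITING);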
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Giovanni Gherdovich <ggherdovich@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180213133730.24064-5-mgorman@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
When period gets restarted after some idle time, start_cfs_bandwidth()
doesn't update the expiration information, expire_cfs_rq_runtime() will
see cfs_rq->runtime_expires smaller than rq clock and go to the clock
drift logic, wasting needless CPU cycles on the scheduler hot path.
Update the global expiration in start_cfs_bandwidth() to avoid frequent
expire_cfs_rq_runtime() calls once a new period begins.
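A sketch of the update (based on the description above; details may
differ from the exact patch):
    void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
    {
        u64 overrun;
        lockdep_assert_held(&cfs_b->lock);
        if (cfs_b->period_active)
            return;
        cfs_b->period_active = 1;
        overrun = hrtimer_forward_now(&cfs_b->period_timer, cfs_b->period);
        /* advance the global expiration so expire_cfs_rq_runtime()
         * does not mistake the restart for clock drift */
        cfs_b->runtime_expires += (overrun + 1) * ktime_to_ns(cfs_b->period);
        hrtimer_start_expires(&cfs_b->period_timer, HRTIMER_MODE_ABS_PINNED);
    }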
Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ben Segall <bsegall@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180620101834.24455-2-xlpang@linux.alibaba.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Mark noticed that syzkaller is able to reliably trigger the following warning:
dl_rq->running_bw > dl_rq->this_bw
WARNING: CPU: 1 PID: 153 at kernel/sched/deadline.c:124 switched_from_dl+0x454/0x608
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 153 Comm: syz-executor253 Not tainted 4.18.0-rc3+ #29
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x458
show_stack+0x20/0x30
dump_stack+0x180/0x250
panic+0x2dc/0x4ec
__warn_printk+0x0/0x150
report_bug+0x228/0x2d8
bug_handler+0xa0/0x1a0
brk_handler+0x2f0/0x568
do_debug_exception+0x1bc/0x5d0
el1_dbg+0x18/0x78
switched_from_dl+0x454/0x608
__sched_setscheduler+0x8cc/0x2018
sys_sched_setattr+0x340/0x758
el0_svc_naked+0x30/0x34
syzkaller reproducer runs a bunch of threads that constantly switch
between DEADLINE and NORMAL classes while interacting through futexes.
The splat above is caused by the fact that if a DEADLINE task is setattr
back to NORMAL while in non_contending state (blocked on a futex -
inactive timer armed), its contribution to running_bw is not removed
before sub_rq_bw() gets called (!task_on_rq_queued() branch) and the
latter sees running_bw > this_bw.
Fix it by removing a task contribution from running_bw if the task is
not queued and in non_contending state while switched to a different
class.
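A sketch of the fix in switched_from_dl(), per the description above:
    if (!task_on_rq_queued(p)) {
        /* a blocked, non-contending task still counts in running_bw;
         * drop that contribution before dropping the rq bandwidth */
        if (p->dl.dl_non_contending)
            sub_running_bw(&p->dl, &rq->dl);
        sub_rq_bw(&p->dl, &rq->dl);
    }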
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Reviewed-by: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: claudio@evidence.eu.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20180711072948.27061-1-juri.lelli@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
A missing clock update is causing the following warning:
rq->clock_update_flags < RQCF_ACT_SKIP
WARNING: CPU: 10 PID: 0 at kernel/sched/sched.h:963 inactive_task_timer+0x5d6/0x720
Call Trace:
<IRQ>
__hrtimer_run_queues+0x10f/0x530
hrtimer_interrupt+0xe5/0x240
smp_apic_timer_interrupt+0x79/0x2b0
apic_timer_interrupt+0xf/0x20
</IRQ>
do_idle+0x203/0x280
cpu_startup_entry+0x6f/0x80
start_secondary+0x1b0/0x200
secondary_startup_64+0xa5/0xb0
hardirqs last enabled at (793919): [<ffffffffa27c5f6e>] cpuidle_enter_state+0x9e/0x360
hardirqs last disabled at (793920): [<ffffffffa2a0096e>] interrupt_entry+0xce/0xe0
softirqs last enabled at (793922): [<ffffffffa20bef78>] irq_enter+0x68/0x70
softirqs last disabled at (793921): [<ffffffffa20bef5d>] irq_enter+0x4d/0x70
This happens because inactive_task_timer() calls sub_running_bw() (if
TASK_DEAD and non_contending) that might trigger a schedutil update,
which might access the clock. Clock is however currently updated only
later in inactive_task_timer() function.
Fix the problem by updating the clock right after task_rq_lock().
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Claudio Scordino <claudio@evidence.eu.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180530160809.9074-1-juri.lelli@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
rq->clock_task may be updated between the two calls of
rq_clock_task() in update_curr_dl(). Calling rq_clock_task() only
once makes it more accurate and efficient, taking update_curr() as
reference.
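That is, read the clock once and reuse it, mirroring update_curr():
    /* in update_curr_dl() */
    u64 now = rq_clock_task(rq);
    u64 delta_exec = now - curr->se.exec_start;
    /* ... runtime accounting based on delta_exec ... */
    curr->se.exec_start = now;  /* same reading, no second call */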
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: zhong.weidong@zte.com.cn
Link: http://lkml.kernel.org/r/1517882148-44599-1-git-send-email-wen.yang99@zte.com.cn
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
[ Upstream commit 59d06cea1198d665ba11f7e8c5f45b00ff2e4812 ]
If a task happens to be throttled while the CPU it was running on gets
hotplugged off, the bandwidth associated with the task is not correctly
migrated with it when the replenishment timer fires (offline_migration).
Fix things up for this_bw, running_bw and total_bw when the
replenishment timer fires and the task is migrated
(dl_task_offline_migration()).
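A sketch of the rq-level bookkeeping (these helpers exist in
kernel/sched/deadline.c; exact placement follows the description):
    /* in dl_task_offline_migration(): move p's bandwidth with it */
    sub_running_bw(&p->dl, &rq->dl);        /* leave the offline CPU's rq */
    sub_rq_bw(&p->dl, &rq->dl);
    add_rq_bw(&p->dl, &later_rq->dl);       /* arrive on the destination */
    add_running_bw(&p->dl, &later_rq->dl);
    /* total_bw is fixed up analogously on the two root domains */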
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bristot@redhat.com
Cc: claudio@evidence.eu.com
Cc: lizefan@huawei.com
Cc: longman@redhat.com
Cc: luca.abeni@santannapisa.it
Cc: mathieu.poirier@linaro.org
Cc: rostedt@goodmis.org
Cc: tj@kernel.org
Cc: tommaso.cucinotta@santannapisa.it
Link: https://lkml.kernel.org/r/20190719140000.31694-5-juri.lelli@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
irq_migrate_all_off_this_cpu() is used to migrate IRQs, and this
function checks all active IRQs in the allocated_irqs mask.
irq_migrate_all_off_this_cpu() expects the caller to take the sparse
IRQ lock to avoid race conditions while accessing the allocated_irqs
mask. Prevent a race between IRQ alloc/free and IRQ migration by
holding the sparse IRQ lock across CPU isolation.
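That is, roughly:
    /* in the CPU isolation path */
    irq_lock_sparse();   /* stabilize allocated_irqs vs. alloc/free */
    irq_migrate_all_off_this_cpu();
    irq_unlock_sparse();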
Change-Id: I9edece1ecea45297c8f6529952d88b3133046467
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Check whether the device is suspended by using the proper condition:
return true only if the usage count is zero and the runtime PM status
is suspended.
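A sketch of the condition (the wrapping helper is hypothetical):
    static bool dev_runtime_suspended(struct device *dev)
    {
        return atomic_read(&dev->power.usage_count) == 0 &&
               pm_runtime_status_suspended(dev);
    }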
Change-Id: Id7d99959966871da2a1bb405deb9d29cba1df408
Signed-off-by: Aniket Randive <arandive@codeaurora.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
This silences the following compilation warning, presumably emitted by llvm-ar in conjunction with Clang (Thin)LTO:
lib/nmi_backtrace.o: no symbols
This is a watchdog support library which is a no-op on this
architecture, hence the empty object file, so let's avoid building
it entirely until a more aesthetically pleasing solution presents
itself.
Signed-off-by: azrim <mirzaspc@gmail.com>
By following the code that derives the flags for -mcpu=native [1], I
found that the most appropriate optimization target for our SoC is
cortex-a76.
/proc/cpuinfo says:
* implementer: 0x51
* part: 0x805 (LITTLE), 0x804 (big)
[1] https://github.com/llvm/llvm-project/blob/main/llvm/lib/Support/Host.cpp
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>
Wakeup IRQs are only "armed" to cancel suspend very late into the suspend
process, meaning that they cannot stop a suspend that's ongoing. This can
be particularly painful due to how long the freezer may spend trying to
freeze processes, during which time a wakeup IRQ cannot make the freezer
abort. Wakeup IRQs should be honored throughout the entire suspend process
rather than just at the end, so tweak the IRQ PM wakeup check to allow
unarmed wakeup IRQs to cancel suspend partway through.
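One plausible shape of the tweak in irq_pm_check_wakeup(), honoring
the wakeup setting rather than the late-armed state (sketch, not the
exact diff):
    bool irq_pm_check_wakeup(struct irq_desc *desc)
    {
        /* any wakeup-enabled IRQ aborts suspend, armed or not */
        if (irqd_is_wakeup_set(&desc->irq_data)) {
            desc->istate |= IRQS_SUSPENDED | IRQS_PENDING;
            desc->depth++;
            irq_disable(desc);
            pm_system_irq_wakeup(irq_desc_get_irq(desc));
            return true;
        }
        return false;
    }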
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: azrim <mirzaspc@gmail.com>