msm-4.14

mirror of https://github.com/rd-stuffs/msm-4.14.git synced 2025-02-20 11:45:48 +08:00

Author	SHA1	Message	Date
Can Guo	3879476438	scsi: ufs-qcom: Turn off PHY only if link is not active Do not assume that AH8 always puts the link to hibern8 state before turning off PHY during suspend. Turn off PHY only if link is not active. Change-Id: Ia50fdbe95e29e825b40679ba4519b49e2806b67a Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>	2021-06-05 14:55:41 +05:30
laoyi	bb32c75f6c	fs: proc: Update perms of process_reclaim node Other userspace apps like AppCompaction would like to use this node, so update permission. Change-Id: Ied22bd6ad489bef4028cde943ac185d1354ab971 Signed-off-by: <laoyi@codeaurora.org> Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>	2021-06-05 14:55:41 +05:30
John Dias	f1033fb035	fs: Improve eventpoll logging to stop indicting timerfd timerfd doesn't create any wakelocks; eventpoll can, and is creating the wakelocks we see called "[timerfd]". eventpoll creates two kinds of wakelocks: a single top-level lock associated with the eventpoll fd itself, and one additional lock for each fd it is polling that needs such a lock (e.g. those using EPOLLWAKEUP). Current code names the per-fd locks using the undecorated names of the fds' associated files (hence "[timerfd]"), and is naming the top-level lock after the PID of the caller and the name of the file behind the first fd for which a per-fd lock is created. To make things clearer, the top-level lock is now named using the caller PID and an "epollfd" designation, while the per-fd locks are also named with the caller's PID (to associate them with the top-level lock) and their respective fds' file names. Port of fix already applied to previous 2 generations. Note that this set of changes does not fully solve the problem of eventpoll/timerfd wakelock attribution to the original process, since most activity is relayed through system_server, but it does at least ensure that different eventpoll wakelocks - and their stats - are properly disambiguated. Test: Ran on device and observed new wakelock naming in /d/wakeup_sources and (file naming in) lsof output. Bug: 116363986 Change-Id: I34bada5ddab04cf3830762c745f46bfcd1549cb8 Signed-off-by: John Dias <joaodias@google.com> Signed-off-by: Kelly Rossmoyer <krossmo@google.com> Signed-off-by: Miguel de Dios <migueldedios@google.com> Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>	2021-06-05 14:55:41 +05:30
Sultan Alsawaf	94598231de	bpf: Eliminate CONFIG_MODULES limitation from JIT for arm64 CONFIG_MODULES is only needed for its memory allocator, which is trivial to add back into arm64 when modules aren't enabled. Do so in order to take advantage of JIT compilation when CONFIG_MODULES=n. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>	2021-06-05 14:55:40 +05:30
Adithya R	1cd596b249	Merge tag 'LA.UM.9.1.r1-10200-SMxxx0.0' of https://source.codeaurora.org/quic/la/platform/vendor/opensource/audio-kernel into staging/LA.UM.9.x "LA.UM.9.1.r1-10200-SMxxx0.0"	2021-06-05 14:54:16 +05:30
Adithya R	e3f73a5a8d	Merge tag 'LA.UM.9.1.r1-10200-SMxxx0.0' of https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/wlan/qca-wifi-host-cmn into staging/LA.UM.9.x "LA.UM.9.1.r1-10200-SMxxx0.0"	2021-06-05 14:41:40 +05:30
Adithya R	bbd4ca491c	Merge tag 'LA.UM.9.1.r1-10200-SMxxx0.0' of https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/wlan/qcacld-3.0 into staging/LA.UM.9.x "LA.UM.9.1.r1-10200-SMxxx0.0"	2021-06-05 14:40:33 +05:30
Adithya R	4ad4df234e	Merge tag 'LA.UM.9.1.r1-10200-SMxxx0.0' of https://source.codeaurora.org/quic/la/kernel/msm-4.14 into staging/LA.UM.9.x "LA.UM.9.1.r1-10200-SMxxx0.0"	2021-06-05 14:39:57 +05:30
Adithya R	9ea5ae0229	drm/msm: dsi_display: Prevent GPU boosting with battery saver refer 3db0705104385b945dfa6f475c60d00b38b082db	2021-06-04 13:34:58 +05:30
Adithya R	fa86a4a095	Revert "ARM64/configs: surya: Disable RELR relocations" This reverts commit a90526691613bc755d985c3fcc32081ff496f6a2.	2021-06-04 13:01:58 +05:30
FlyFrog	3c2d7070a3	lib: int_sqrt: Improve 3x faster integer sqrt. Result on 10,000,000 call. Old: sqrt(12345689) = 3513 real 0m0.768s user 0m0.760s sys 0m0.004s New: sqrt(12345689) = 3513 real 0m0.222s user 0m0.224s sys 0m0.000s Signed-off-by: Vaisakh Murali <mvaisakh@statixos.com>	2021-06-01 17:06:34 +05:30
Mohammed Nayeem Ur Rahman	b5d4c7f447	char: adsprpc: Set QoS only to silver cluster Glink IRQ mostly is taken by the silver cluster. RPC driver vote for silver cluster prevents collapsing gold cluster. This saves significant power. Change-Id: Ic24bddceb7ae37d1182d2fca683c622b4ab71a55 Acked-by: Tadakamalla Krishnaiah <ktadakam@qti.qualcomm.com> Signed-off-by: Mohammed Nayeem Ur Rahman <mohara@codeaurora.org> Signed-off-by: celtare21 <celtare21@gmail.com>	2021-06-01 17:06:32 +05:30
Vaisakh Murali	7a6332d901	msm: ipa: Only include emulation init with CONFIG_IPA_EMULATION * Native code doesn't seem to have any use for this. Change-Id: I59b33af3a67b5b5a7f4b42dbffa6f41f21aae567 Signed-off-by: Vaisakh Murali <mvaisakh@statixos.com>	2021-06-01 17:06:32 +05:30
Linux Build Service Account	d2fd228796	Merge 516a0a6251f23be4dc7136a996b762d586423079 on remote branch Change-Id: Ie0361b7309d4b15904757569ecd7b461f12d1825	2021-05-27 10:46:33 -07:00
Linux Build Service Account	857bd2e080	Merge f8526eed1d13bee359929006ea9f3f7f0fe26438 on remote branch Change-Id: I49addd030b5fdc3586d37dda20b25956097b7d0f	2021-05-27 10:44:41 -07:00
Linux Build Service Account	c0206fbd70	Merge 407d2eafc7aca94bb28098710cbc48555c3dfb55 on remote branch Change-Id: I4f0901a343384e33bcda53a96231e7522ed18e1d	2021-05-27 10:44:34 -07:00
Linux Build Service Account	9838220d29	Merge 42140f1656cd1a6b073540039d116f7c2f6f78f5 on remote branch Change-Id: I69f41511ff63fc6651655f1bd832ab5c3cd0a665	2021-05-27 10:40:20 -07:00
Adithya R	8207e6c43c	build.sh: Switch to transfer.sh for zip upload	2021-05-25 19:26:37 +05:30
Adithya R	18b2b5faa4	ARM64/configs: surya: Enable fuse short circuit	2021-05-25 19:26:37 +05:30
Adhitya Mohan	8a4166ccd4	fs/fuse: shortcircuit: Make it compile	2021-05-25 19:26:37 +05:30
LibXZR	0f1df8b6c0	fs/fuse: shortcircuit: Disable logging	2021-05-25 19:26:37 +05:30
LibXZR	a650eef6fc	fs: fuse: Implement fuse short circuit * This significantly improves i/o performance under /sdcard * From OnePlus 8T Oxygen OS 11.0.8.11.KB05AA and OnePlus 8 Oxygen OS 11.0.5.5.IN21AA and OnePlus 8 Pro Oxygen OS 11.0.5.5.IN11AA RealJohnGalt: make proper Kconfig, add back dependencies from OnePlus source onto our CAF tree. Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-05-25 19:26:37 +05:30
Theodore Ts'o	b8328115cd	ext4: Improve smp scalability for inode generation ->s_next_generation is protected by s_next_gen_lock but its usage pattern is very primitive. We don't actually need sequentially increasing new generation numbers, so let's use prandom_u32() instead. Reported-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: kdrag0n <dragon@khronodragon.com>	2021-05-25 19:26:37 +05:30
Vincent Guittot	9f4d369621	sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list Although not exactly identical, unthrottle_cfs_rq() and enqueue_task_fair() are quite close and follow the same sequence for enqueuing an entity in the cfs hierarchy. Modify unthrottle_cfs_rq() to use the same pattern as enqueue_task_fair(). This fixes a problem already faced with the latter and add an optimization in the last for_each_sched_entity loop. Fixes: fe61468b2cb (sched/fair: Fix enqueue_task_fair warning) Reported-by Tao Zhou <zohooouoto@zoho.com.cn> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Phil Auld <pauld@redhat.com> Reviewed-by: Ben Segall <bsegall@google.com> Link: https://lkml.kernel.org/r/20200513135528.4742-1-vincent.guittot@linaro.org	2021-05-25 19:26:37 +05:30
Muchun Song	1bbdcf23b4	sched/fair: Mark sched_init_granularity __init Function sched_init_granularity() is only called from __init functions, so mark it __init as well. Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20200406074750.56533-1-songmuchun@bytedance.com	2021-05-25 19:26:37 +05:30
Huaixin Chang	0951f4d996	sched/fair: Refill bandwidth before scaling In order to prevent possible hardlockup of sched_cfs_period_timer() loop, loop count is introduced to denote whether to scale quota and period or not. However, scale is done between forwarding period timer and refilling cfs bandwidth runtime, which means that period timer is forwarded with old "period" while runtime is refilled with scaled "quota". Move do_sched_cfs_period_timer() before scaling to solve this. Fixes: 2e8e19226398 ("sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup") Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ben Segall <bsegall@google.com> Reviewed-by: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/20200420024421.22442-3-changhuaixin@linux.alibaba.com	2021-05-25 19:26:37 +05:30
Peng Wang	aa3d11385a	sched/fair: Simplify the code of should_we_balance() We only consider group_balance_cpu() after there is no idle cpu. So, just do comparison before return at these two cases. Signed-off-by: Peng Wang <rocking@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/245c792f0e580b3ca342ad61257f4c066ee0f84f.1586594833.git.rocking@linux.alibaba.com Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2021-05-25 19:26:37 +05:30
Paul Turner	24619225af	sched/fair: Eliminate bandwidth race between throttling and distribution There is a race window in which an entity begins throttling before quota is added to the pool, but does not finish throttling until after we have finished with distribute_cfs_runtime(). This entity is not observed by distribute_cfs_runtime() because it was not on the throttled list at the time that distribution was running. This race manifests as rare period-length stalls for such entities. Rather than heavy-weight the synchronization with the progress of distribution, we can fix this by aborting throttling if bandwidth has become available. Otherwise, we immediately add the entity to the throttled list so that it can be observed by a subsequent distribution. Additionally, we can remove the case of adding the throttled entity to the head of the throttled list, and simply always add to the tail. Thanks to 26a8b12747c97, distribute_cfs_runtime() no longer holds onto its own pool of runtime. This means that if we do hit the !assign and distribute_running case, we know that distribution is about to end. Signed-off-by: Paul Turner <pjt@google.com> Signed-off-by: Ben Segall <bsegall@google.com> Signed-off-by: Josh Don <joshdon@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/20200410225208.109717-2-joshdon@google.com	2021-05-25 19:26:36 +05:30
Peter Zijlstra	77c9280087	sched,rt: Use cpumask_any_distribute() Replace a bunch of cpumask_any() instances with cpumask_any*_distribute(), by injecting this little bit of random in cpu selection, we reduce the chance two competing balance operations working off the same lowest_mask pick the same CPU. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Link: https://lkml.kernel.org/r/20201023102347.190759694@infradead.org	2021-05-25 19:26:36 +05:30
Paul Turner	3155932a9e	sched/core: Distribute tasks within affinity masks Currently, when updating the affinity of tasks via either cpusets.cpus, or, sched_setaffinity(); tasks not currently running within the newly specified mask will be arbitrarily assigned to the first CPU within the mask. This (particularly in the case that we are restricting masks) can result in many tasks being assigned to the first CPUs of their new masks. This: 1) Can induce scheduling delays while the load-balancer has a chance to spread them between their new CPUs. 2) Can antagonize a poor load-balancer behavior where it has a difficult time recognizing that a cross-socket imbalance has been forced by an affinity mask. This change adds a new cpumask interface to allow iterated calls to distribute within the intersection of the provided masks. The cases that this mainly affects are: - modifying cpuset.cpus - when tasks join a cpuset - when modifying a task's affinity via sched_setaffinity(2) Signed-off-by: Paul Turner <pjt@google.com> Signed-off-by: Josh Don <joshdon@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Qais Yousef <qais.yousef@arm.com> Tested-by: Qais Yousef <qais.yousef@arm.com> Link: https://lkml.kernel.org/r/20200311010113.136465-1-joshdon@google.com	2021-05-25 19:26:36 +05:30
Peter Zijlstra	171dc61962	sched: core: Fix balance_callback() The intent of balance_callback() has always been to delay executing balancing operations until the end of the current rq->lock section. This is because balance operations must often drop rq->lock, and that isn't safe in general. However, as noted by Scott, there were a few holes in that scheme; balance_callback() was called after rq->lock was dropped, which means another CPU can interleave and touch the callback list. Rework code to call the balance callbacks before dropping rq->lock where possible, and otherwise splice the balance list onto a local stack. This guarantees that the balance list must be empty when we take rq->lock. IOW, we'll only ever run our own balance callbacks. Reported-by: Scott Wood <swood@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Link: https://lkml.kernel.org/r/20201023102346.203901269@infradead.org	2021-05-25 19:26:36 +05:30
Vincent Guittot	957309028f	sched/fair: Optimize update_blocked_averages() commit 31bc6aeaab1d1de8959b67edbed5c7a4b3cdbe7c upstream. Removing a cfs_rq from rq->leaf_cfs_rq_list can break the parent/child ordering of the list when it will be added back. In order to remove an empty and fully decayed cfs_rq, we must remove its children too, so they will be added back in the right order next time. With a normal decay of PELT, a parent will be empty and fully decayed if all children are empty and fully decayed too. In such a case, we just have to ensure that the whole branch will be added when a new task is enqueued. This is default behavior since : commit f6783319737f ("sched/fair: Fix insertion in rq->leaf_cfs_rq_list") In case of throttling, the PELT of throttled cfs_rq will not be updated whereas the parent will. This breaks the assumption made above unless we remove the children of a cfs_rq that is throttled. Then, they will be added back when unthrottled and a sched_entity will be enqueued. As throttled cfs_rq are now removed from the list, we can remove the associated test in update_blocked_averages(). Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: sargun@sargun.me Cc: tj@kernel.org Cc: xiexiuqi@huawei.com Cc: xiezhipeng1@huawei.com Link: https://lkml.kernel.org/r/1549469662-13614-2-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Vishnu Rangayyan <vishnu.rangayyan@apple.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-25 19:26:36 +05:30
Vincent Guittot	1c5a28a2b9	sched/fair: Move the rq_of() helper function Move rq_of() helper function so it can be used in pelt.c [ mingo: Improve readability while at it. ] Bug: 120440300 Change-Id: I2133979476631d68baaffcaa308f4cdab94f22b1 Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Morten.Rasmussen@arm.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bsegall@google.com Cc: dietmar.eggemann@arm.com Cc: patrick.bellasi@arm.com Cc: pjt@google.com Cc: pkondeti@codeaurora.org Cc: quentin.perret@arm.com Cc: rjw@rjwysocki.net Cc: srinivas.pandruvada@linux.intel.com Cc: thara.gopinath@linaro.org Link: https://lkml.kernel.org/r/1548257214-13745-2-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit 62478d9911fab9694c195f0ca8e4701de09be98e) Signed-off-by: Quentin Perret <quentin.perret@arm.com> Signed-off-by: DennySPB <dennyspb@gmail.com>	2021-05-25 19:26:36 +05:30
Srikar Dronamraju	c40b2cdda2	BACKPORT: sched/fair: Optimize select_idle_core() Currently we loop through all threads of a core to evaluate if the core is idle or not. This is unnecessary. If a thread of a core is not idle, skip evaluating other threads of a core. Also while clearing the cpumask, bits of all CPUs of a core can be cleared in one-shot. Collecting ticks on a Power 9 SMT 8 system around select_idle_core while running schbench shows us (units are in ticks, hence lesser is better) Without patch N Min Max Median Avg Stddev x 130 151 1083 284 322.72308 144.41494 With patch N Min Max Median Avg Stddev Improvement x 164 88 610 201 225.79268 106.78943 30.03% Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Link: https://lkml.kernel.org/r/20191206172422.6578-1-srikar@linux.vnet.ibm.com Signed-off-by: DennySPB <dennyspb@gmail.com>	2021-05-25 19:26:36 +05:30
bsegall@google.com	282c967bd3	sched/fair: Don't push cfs_bandwith slack timers forward When a cfs_rq sleeps and returns its quota, we delay for 5ms before waking any throttled cfs_rqs to coalesce with other cfs_rqs going to sleep, as this has to be done outside of the rq lock we hold. The current code waits for 5ms without any sleeps, instead of waiting for 5ms from the first sleep, which can delay the unthrottle more than we want. Switch this around so that we can't push this forward forever. This requires an extra flag rather than using hrtimer_active, since we need to start a new timer if the current one is in the process of finishing. Signed-off-by: Ben Segall <bsegall@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Xunlei Pang <xlpang@linux.alibaba.com> Acked-by: Phil Auld <pauld@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/xm26a7euy6iq.fsf_-_@bsegall-linux.svl.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: DennySPB <dennyspb@gmail.com>	2021-05-25 19:26:36 +05:30
Vincent Guittot	cc0fe39ac0	sched/fair: Fix runnable_avg for throttled cfs When a cfs_rq is throttled, its group entity is dequeued and its running tasks are removed. We must update runnable_avg with the old h_nr_running and update group_se->runnable_weight with the new h_nr_running at each level of the hierarchy. Reviewed-by: Ben Segall <bsegall@google.com> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Fixes: 9f68395333ad ("sched/pelt: Add a new runnable average signal") Link: https://lkml.kernel.org/r/20200227154115.8332-1-vincent.guittot@linaro.org	2021-05-25 19:26:36 +05:30
Vincent Guittot	fb07d11ba5	sched/fair: Don't set LBF_ALL_PINNED unnecessarily Setting LBF_ALL_PINNED during active load balance is only valid when there is only 1 running task on the rq otherwise this ends up increasing the balance interval whereas other tasks could migrate after the next interval once they become cache-cold as an example. LBF_ALL_PINNED flag is now always set by default. It is then cleared when we find one task that can be pulled when calling detach_tasks() or during active migration. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Acked-by: Mel Gorman <mgorman@suse.de> Link: https://lkml.kernel.org/r/20210107103325.30851-3-vincent.guittot@linaro.org	2021-05-25 19:26:36 +05:30
Anna-Maria Behnsen	d068c8d169	sched: Prevent raising SCHED_SOFTIRQ when CPU is !active SCHED_SOFTIRQ is raised to trigger periodic load balancing. When CPU is not active, CPU should not participate in load balancing. The scheduler uses nohz.idle_cpus_mask to keep track of the CPUs which can do idle load balancing. When bringing a CPU up the CPU is added to the mask when it reaches the active state, but on teardown the CPU stays in the mask until it goes offline and invokes sched_cpu_dying(). When SCHED_SOFTIRQ is raised on a !active CPU, there might be a pending softirq when stopping the tick which triggers a warning in NOHZ code. The SCHED_SOFTIRQ can also be raised by the scheduler tick which has the same issue. Therefore remove the CPU from nohz.idle_cpus_mask when it is marked inactive and also prevent the scheduler_tick() from raising SCHED_SOFTIRQ after this point. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20201215104400.9435-1-anna-maria@linutronix.de	2021-05-25 19:26:36 +05:30
Joel Fernandes	42b731c077	sched/fair: Remove impossible condition from find_idlest_group_cpu() find_idlest_group_cpu() goes through CPUs of a group previously selected by find_idlest_group(). find_idlest_group() returns NULL if the local group is the selected one and doesn't execute find_idlest_group_cpu if the group to which 'cpu' belongs to is chosen. So we're always guaranteed to call find_idlest_group_cpu() with a group to which 'cpu' is non-local. This makes one of the conditions in find_idlest_group_cpu() an impossible one, which we can get rid of. Signed-off-by: Joel Fernandes <joelaf@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Brendan Jackman <brendan.jackman@arm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Android Kernel <kernel-team@android.com> Cc: Atish Patra <atish.patra@oracle.com> Cc: Chris Redpath <Chris.Redpath@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: EAS Dev <eas-dev@lists.linaro.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Ramussen <morten.rasmussen@arm.com> Cc: Patrick Bellasi <patrick.bellasi@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Rohit Jain <rohit.k.jain@oracle.com> Cc: Saravana Kannan <skannan@quicinc.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vikram Mulukutla <markivx@codeaurora.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: http://lkml.kernel.org/r/20171215153944.220146-3-joelaf@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2021-05-25 19:26:36 +05:30
Wen Yang	0a6a693e3f	sched/rt: Make update_curr_rt() more accurate rq->clock_task may be updated between the two calls of rq_clock_task() in update_curr_rt(). Calling rq_clock_task() only once makes it more accurate and efficient, taking update_curr() as reference. Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: zhong.weidong@zte.com.cn Link: http://lkml.kernel.org/r/1517800721-42092-1-git-send-email-wen.yang99@zte.com.cn Signed-off-by: Ingo Molnar <mingo@kernel.org>	2021-05-25 19:26:36 +05:30
Peter Zijlstra	42a8cffcad	sched/fair: Do not migrate due to a sync wakeup on exit When a task exits, it notifies the parent that it has exited. This is a sync wakeup and the exiting task may pull the parent towards the wakers CPU. For simple workloads like using a shell, it was observed that the shell is pulled across nodes by exiting processes. This is daft as the parent may be long-lived and properly placed. This patch special cases a sync wakeup on exit to avoid pulling tasks across nodes. Testing on a range of workloads and machines showed very little differences in performance although there was a small 3% boost on some machines running a shellscript intensive workload (git regression test suite). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Giovanni Gherdovich <ggherdovich@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180213133730.24064-5-mgorman@techsingularity.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2021-05-25 19:26:36 +05:30
Paul Walmsley	b8206629a6	sched: Reinitialize rq->next_balance when a CPU is hot-added Reinitialize rq->next_balance when a CPU is hot-added. Otherwise, scheduler domain rebalancing may be skipped if rq->next_balance was set to a future time when the CPU was last active, and the newly-re-added CPU is in idle_balance(). As a result, the newly-re-added CPU will remain idle with no tasks scheduled until the softlockup watchdog runs - potentially 4 seconds later. This can waste energy and reduce performance. This behavior can be observed in some SoC kernels, which use CPU hotplug to dynamically remove and add CPUs in response to load. In one case that triggered this behavior, 0. the system started with all cores enabled, running multi-threaded CPU-bound code; 1. the system entered some single-threaded code; 2. a CPU went idle and was hot-removed; 3. the system started executing a multi-threaded CPU-bound task; 4. the CPU from event 2 was re-added, to respond to the load. The time interval between events 2 and 4 was approximately 300 milliseconds. Of course, ideally CPU hotplug would not be used in this manner, but this patch does appear to fix a real bug. Nvidia folks: this patch is submitted as at least a partial fix for bug 1243368 ("[sched] Load-balancing not happening correctly after cores brought online") Change-Id: Iabac21e110402bb581b7db40c42babc951d378d0 Signed-off-by: Paul Walmsley <pwalmsley@nvidia.com> Cc: Peter Boonstoppel <pboonstoppel@nvidia.com> Reviewed-on: http://git-master/r/206918 Reviewed-by: Automatic_Commit_Validation_User Reviewed-by: Amit Kamath <akamath@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Peter Boonstoppel <pboonstoppel@nvidia.com> Reviewed-by: Diwakar Tundlam <dtundlam@nvidia.com>	2021-05-25 19:26:36 +05:30
John Dias	cd063abd00	sched/fair: vruntime should normalize when switching from fair When rt_mutex_setprio changes a task's scheduling class to RT, we're seeing cases where the task's vruntime is not updated correctly upon return to the fair class. Specifically, the following is being observed: - task is deactivated while still in the fair class - task is boosted to RT via rt_mutex_setprio, which changes the task to RT and calls check_class_changed. - check_class_changed leads to detach_task_cfs_rq, at which point the vruntime_normalized check sees that the task's state is TASK_WAKING, which results in skipping the subtraction of the rq's min_vruntime from the task's vruntime - later, when the prio is deboosted and the task is moved back to the fair class, the fair rq's min_vruntime is added to the task's vruntime, even though it wasn't subtracted earlier. The immediate result is inflation of the task's vruntime, giving it lower priority (starving it if there's enough available work). The longer-term effect is inflation of all vruntimes because the task's vruntime becomes the rq's min_vruntime when the higher priority tasks go idle. That leads to a vicious cycle, where the vruntime inflation repeatedly doubled. The change here is to detect when vruntime_normalized is being called when the task is waking but is waking in another class, and to conclude that this is a case where vruntime has not been normalized. Bug: 80502612 Change-Id: If0bb02eb16939ca5e91ef282b7f9119ff68622c4 Signed-off-by: John Dias <joaodias@google.com> Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>	2021-05-25 19:26:36 +05:30
Linus Torvalds	cbf26b2f83	sched/fair: Fix infinite loop in update_blocked_averages() commit c40f7d74c741a907cfaeb73a7697081881c497d0 upstream. Zhipeng Xie, Xie XiuQi and Sargun Dhillon reported lockups in the scheduler under high loads, starting at around the v4.18 time frame, and Zhipeng Xie tracked it down to bugs in the rq->leaf_cfs_rq_list manipulation. Do a (manual) revert of: a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path") It turns out that the list_del_leaf_cfs_rq() introduced by this commit is a surprising property that was not considered in followup commits such as: 9c2791f936ef ("sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list") As Vincent Guittot explains: "I think that there is a bigger problem with commit a9e7f6544b9c and cfs_rq throttling: Let take the example of the following topology TG2 --> TG1 --> root: 1) The 1st time a task is enqueued, we will add TG2 cfs_rq then TG1 cfs_rq to leaf_cfs_rq_list and we are sure to do the whole branch in one path because it has never been used and can't be throttled so tmp_alone_branch will point to leaf_cfs_rq_list at the end. 2) Then TG1 is throttled 3) and we add TG3 as a new child of TG1. 4) The 1st enqueue of a task on TG3 will add TG3 cfs_rq just before TG1 cfs_rq and tmp_alone_branch will stay on rq->leaf_cfs_rq_list. With commit a9e7f6544b9c, we can del a cfs_rq from rq->leaf_cfs_rq_list. So if the load of TG1 cfs_rq becomes NULL before step 2) above, TG1 cfs_rq is removed from the list. Then at step 4), TG3 cfs_rq is added at the beginning of rq->leaf_cfs_rq_list but tmp_alone_branch still points to TG3 cfs_rq because its throttled parent can't be enqueued when the lock is released. tmp_alone_branch doesn't point to rq->leaf_cfs_rq_list whereas it should. So if TG3 cfs_rq is removed or destroyed before tmp_alone_branch points on another TG cfs_rq, the next TG cfs_rq that will be added, will be linked outside rq->leaf_cfs_rq_list - which is bad. 
In addition, we can break the ordering of the cfs_rq in rq->leaf_cfs_rq_list but this ordering is used to update and propagate the update from leaf down to root." Instead of trying to work through all these cases and trying to reproduce the very high loads that produced the lockup to begin with, simplify the code temporarily by reverting a9e7f6544b9c - which change was clearly not thought through completely. This (hopefully) gives us a kernel that doesn't lock up so people can continue to enjoy their holidays without worrying about regressions. ;-) [ mingo: Wrote changelog, fixed weird spelling in code comment while at it. ] Analyzed-by: Xie XiuQi <xiexiuqi@huawei.com> Analyzed-by: Vincent Guittot <vincent.guittot@linaro.org> Reported-by: Zhipeng Xie <xiezhipeng1@huawei.com> Reported-by: Sargun Dhillon <sargun@sargun.me> Reported-by: Xie XiuQi <xiexiuqi@huawei.com> Tested-by: Zhipeng Xie <xiezhipeng1@huawei.com> Tested-by: Sargun Dhillon <sargun@sargun.me> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: <stable@vger.kernel.org> # v4.13+ Cc: Bin Li <huawei.libin@huawei.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path") Link: http://lkml.kernel.org/r/1545879866-27809-1-git-send-email-xiexiuqi@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-25 19:26:35 +05:30
Cheng Jian	4f395300de	sched/fair: Optimize select_idle_cpu select_idle_cpu() scans the LLC domain for idle CPUs, which is always expensive, so commit 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()") introduced a way to limit how many CPUs we scan. But it consumes some of the 'nr' attempts on CPUs that are not allowed for the task and thus wastes them. The function then always returns nr_cpumask_bits, and we can't find a CPU on which our task is allowed to run. The cpumask may be too big for the stack, so, as in select_idle_core(), use the per-CPU 'select_idle_mask' to prevent stack overflow. Fixes: 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()") Signed-off-by: Cheng Jian <cj.chengjian@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20191213024530.28052-1-cj.chengjian@huawei.com Signed-off-by: DennySPB <dennyspb@gmail.com>	2021-05-25 19:26:35 +05:30
Tejun Heo	f0773485ab	sched/fair: Fix O(nr_cgroups) in load balance path Currently, rq->leaf_cfs_rq_list is a traversal ordered list of all live cfs_rqs which have ever been active on the CPU; unfortunately, this makes update_blocked_averages() O(# total cgroups) which isn't scalable at all. This shows up as a small CPU consumption and scheduling latency increase in the load balancing path in systems with CPU controller enabled across most cgroups. In an edge case where temporary cgroups were leaking, this caused the kernel to consume good several tens of percents of CPU cycles running update_blocked_averages(), each run taking multiple millisecs. This patch fixes the issue by taking empty and fully decayed cfs_rqs off the rq->leaf_cfs_rq_list. Signed-off-by: Tejun Heo <tj@kernel.org> [ Added cfs_rq_is_decayed() ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Chris Mason <clm@fb.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170426004350.GB3222@wtj.duckdns.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2021-05-25 19:26:35 +05:30
Vincent Guittot	aa2e2889cc	sched/fair: Fix load_balance redo for !imbalance It can happen that load_balance() finds a busiest group and then a busiest rq but the calculated imbalance is in fact 0. In such a situation, detach_tasks() returns immediately and leaves the flag LBF_ALL_PINNED set. The busiest CPU is then wrongly assumed to have pinned tasks and is removed from the load balance mask. Then, we redo a load balance without the busiest CPU. This creates a wrong load balance situation and generates wrong task migrations. If the calculated imbalance is 0, it's useless to try to find a busiest rq as no task will be migrated and we can return immediately. This situation can happen on a heterogeneous system or an SMP system when RT tasks are decreasing the capacity of some CPUs. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dietmar.eggemann@arm.com Cc: jhugo@codeaurora.org Link: http://lkml.kernel.org/r/1536306664-29827-1-git-send-email-vincent.guittot@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: celtare21 <celtare21@gmail.com> Signed-off-by: atndko <z1281552865@gmail.com> Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>	2021-05-25 19:26:35 +05:30
Sai Gurrappadi	8b2fae44ed	tick: Don't clear idle and iowait sums on CPU down NOHZ-related per-CPU data is cleared on CPU down. This was introduced by 4b0c0f294 "tick: Cleanup NOHZ per cpu data on cpu down", which breaks /proc/stat because the idle and iowait sums are now non-monotonic across a CPU down/up cycle. Fix this by not clearing the idle_sleeptime and iowait_sleeptime fields on CPU down. Change-Id: Ifb755e15b601c74dad81655ebb25e037dac2afd0 Signed-off-by: Sai Gurrappadi <sgurrappadi@nvidia.com> Signed-off-by: Peter De Schrijver <pdeschrijver@nvidia.com> Patch-mainline: linux-kernel @ 30 Apr 2014 13:18:34 Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> Signed-off-by: Swetha Chikkaboraiah <schikk@codeaurora.org> Signed-off-by: Pranav Vashi <neobuddy89@gmail.com> Signed-off-by: celtare21 <celtare21@gmail.com> Signed-off-by: kdrag0n <dragon@khronodragon.com>	2021-05-25 19:26:35 +05:30
Mel Gorman	a3f3189165	mm: slub: Default slub_max_order to 0 To avoid locking and per-cpu overhead, SLUB optimistically uses high-order allocations up to order-3 by default and falls back to lower-order allocations if they fail. While care is taken that the caller and kswapd take no unusual steps in response to this, there are further consequences, such as shrinkers that have to free more objects to release any memory. There is anecdotal evidence that significant time is being spent looping in shrinkers with insufficient progress being made (https://lkml.org/lkml/2011/4/28/361) and keeping kswapd awake. SLUB is now the default allocator and some bug reports have been pinned down to SLUB using high orders during operations like copying large amounts of data. SLUB's use of high orders benefits applications that are sized to memory appropriately but this does not necessarily apply to large file servers or desktops. This patch causes SLUB to use order-0 pages by default, like SLAB does. There is further evidence that this keeps kswapd's usage lower (https://lkml.org/lkml/2011/5/10/383). Signed-off-by: Mel Gorman <mgorman@suse.de>	2021-05-25 19:26:35 +05:30
Kefeng Wang	682d77d038	arm64: Add support ARCH_SUPPORTS_INT128 GCC supports __SIZEOF_INT128__ and __int128 on arm64, so enable ARCH_SUPPORTS_INT128 to make mul_u64_u32_shr() a bit more efficient in the scheduler. Change-Id: I56abffc9acb4c519acdd1be3dc2aa1a7a66c385d Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: mydongistiny <jaysonedson@gmail.com> Signed-off-by: DennySPB <dennyspb@gmail.com>	2021-05-25 19:26:35 +05:30
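
To illustrate why a 128-bit integer type helps mul_u64_u32_shr(), here is a minimal userspace sketch, not the kernel source. It assumes shift <= 32 and uses <stdint.h> types in place of the kernel's u64/u32; the name mul_u64_u32_shr_split is hypothetical and exists only to keep the fallback path visible next to the 128-bit path.

```c
#include <stdint.h>

/* Portable fallback: split the 64-bit operand into two 32-bit halves so
 * that every partial product fits in 64 bits. Assumes shift <= 32.
 * (Hypothetical name, for illustration only.) */
static inline uint64_t mul_u64_u32_shr_split(uint64_t a, uint32_t mul,
                                             unsigned int shift)
{
    uint32_t al = (uint32_t)a;          /* low 32 bits of a */
    uint32_t ah = (uint32_t)(a >> 32);  /* high 32 bits of a */
    uint64_t ret = ((uint64_t)al * mul) >> shift;

    if (ah)
        ret += ((uint64_t)ah * mul) << (32 - shift);
    return ret;
}

/* With a compiler-provided 128-bit type, the whole operation is one
 * widening multiply followed by a shift. */
static inline uint64_t mul_u64_u32_shr(uint64_t a, uint32_t mul,
                                       unsigned int shift)
{
#ifdef __SIZEOF_INT128__
    return (uint64_t)(((unsigned __int128)a * mul) >> shift);
#else
    return mul_u64_u32_shr_split(a, mul, shift);
#endif
}
```

Both paths compute (a * mul) >> shift truncated to 64 bits. The 128-bit path avoids the branch and the second multiply, which is presumably the "bit more efficient" the commit message refers to.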


793506 Commits