Android works much better with rcu_expedited, since it speeds up several
operations that are important to Android, such as process death, which is
needed to free up a process's resources quickly when memory runs low.
e7654fa177
Change-Id: I4be5d1d433888b7c359102444c051c494d9e74ce
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Disable memcg kernel and socket accounting on sm8150, since memory
cgroups are used on this target as a means of userspace task
grouping based on oom_score_adj and of controlling the userspace
memory consumed by tasks. Kernel and socket memory accounting is
not taken into account. Since a memcg is created per process on this
target, enabling kernel accounting results in the creation of numerous
kmem caches, causing significant overhead.
Change-Id: I2caad9ce6cca5846b6183a0f0753977db54a660d
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
We don't want to boost DDR for zygote forks
that are not initiated by user input, such as
services spawning in the background.
Change-Id: I24ea074c0ebfe7d27a31b3a2766f06a48bf56653
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
* This prevents the CPU or DDR bus from boosting unnecessarily during video
playback and similar use cases where a CPU boost is not needed during a frame commit.
* The default timeout of 3250 ms was empirically determined by kdrag0n by
performing multiple flings at various speeds while timing them. He observed
a maximum fling time of approximately 2850 ms, so the 3250 ms frame boost
timeout offers some headroom for more intensive applications such as games.
Change-Id: Iceb4bcb5f8c3c2d1dc231c0fb96ceeece6052054
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Change-Id: I10b0bbb2e4f65573658a9510c6cf34ea59244645
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Seems like this is the correct place for this. crtc_commit shows high
CPU usage during the UiBench invalidate test when running top:
CPU usage: 6.0% [crtc_commit:111]
Change-Id: I510d89d55e17d6c66898ee36f3bbccbdf58b6614
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
There is a slight delay in turning on the screen, even with wake_boost enabled.
This delay is caused by the CPU having to switch from its idle state to the max state in an instant.
Directly boosting on the power_key id mask helps.
Change-Id: I7dbd0106613757c97206ff2d42352e6c8d1d519a
Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
* The max boost freq should always be the maximum possible freq of the
CPU; there's no need for a separate config for it.
Change-Id: Ia0036c6fddd961d6fe0ccb96f186b1a628e2d66e
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Users like to change stuff, it's useful to tie this in with my
Performance Profiles, and it's faster to test new configurations
this way than to recompile the whole thing and flash it every time.
Change-Id: Iccebb05ea79d5a09b544d16084051cca7d606987
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
I saw a few warnings indicating that we were trying to get the policy
of invalid CPUs (8 and above). Let's try to avoid this.
Change-Id: I677218a5a4e204430d9703a76533a92c10044d99
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Change-Id: I473277897191fe4f765435f8fef1b5c3bf6523c5
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Adds the ability to disable boosting of a given cluster by
setting its boost freq to 0. Otherwise, the freq would
drop to cpuinfo.min, which isn't desirable.
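A minimal sketch of the idea, with an assumed helper name and simplified
policy handling (the real driver adjusts limits from its cpufreq policy
callback):

#include <linux/cpufreq.h>

/* Sketch only: a boost freq of zero means "don't boost this cluster",
 * so the policy is left untouched instead of being clamped down to
 * cpuinfo.min_freq. */
static void boost_adjust_policy(struct cpufreq_policy *policy,
				unsigned int boost_freq)
{
	if (!boost_freq)
		return;

	/* Raise the policy floor to the configured boost frequency. */
	cpufreq_verify_within_limits(policy, boost_freq,
				     policy->cpuinfo.max_freq);
}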
Change-Id: If0f93461f35902194cdfa9043ccd2d271ef8dedb
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Sometimes we may want to limit certain boost kicks to only take effect
within a certain timeout after receiving the last input event to save
power. For example, we may want to boost when an intensive process
forks to improve fluidity, but boosting when the fork was not initiated
by input is wasteful.
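A rough sketch of the gating check, with assumed names (last_input_jiffies
and boost_kick_allowed() are illustrative, not the driver's actual
identifiers):

#include <linux/jiffies.h>

/* Updated from the input event handler on every input event. */
static unsigned long last_input_jiffies;

/* Honor a boost kick only if the last input event arrived within
 * @timeout_ms; kicks outside this window (e.g. a background service
 * fork) are ignored to save power. */
static bool boost_kick_allowed(unsigned int timeout_ms)
{
	return time_before(jiffies,
			   last_input_jiffies + msecs_to_jiffies(timeout_ms));
}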
Change-Id: I0ccb8154277d0213b90dcc0360b83d11333fcfd7
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The boost kthread is performance critical for obvious reasons.
Change-Id: Ic81655fb950a3e14e98159ce42df96a26e61ad0b
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This is a simple CPU input boost driver that boosts all online CPUs for
a fixed amount of time. Additionally, there is an API for other drivers
to request a boost kick (or a max-boost kick), so boosting can be done
on any custom event. This API is mainly intended for the framebuffer
driver to send a boost kick whenever there is a new frame ready to be
rendered to the display.
This driver also boosts all online CPUs to their maximum frequencies
when the display is powered on (this is the wake boost).
Since this driver requires careful tuning for optimal performance, there
are no user-exposed knobs to configure it. All necessary configuration
is done via the supplied Kconfig options.
This driver is designed for heterogeneous multi-processor systems with
two CPU clusters.
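The kick API exposed to other drivers looks roughly like this (assumed
prototypes inferred from the description above):

/* include/linux/cpu_input_boost.h (sketch) */
void cpu_input_boost_kick(void);
void cpu_input_boost_kick_max(unsigned int duration_ms);

A client such as the framebuffer or DRM driver then calls the max-boost
variant from its commit path when a new frame is ready to be rendered.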
Change-Id: I4ca8e6a9233c875d07d6471af68bad64a3addbba
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Now that the worst offenders for uninterruptible scheduling delays have
been fixed (i.e., code that would keep IRQs and/or preemption disabled
for way too long), increasing the scheduler tick rate yields noticeably
better latency. Use the Android-typical 300 Hz tick rate to take
advantage of the scheduling delay fixes.
Change-Id: Ibfe34b6885598556be5c790e61d493c8c75ef354
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
There's plenty of room on the stack for a few more inlined bytes here
and there. The measured stack usage at runtime is still safe without
this, and performance is surely improved at a microscopic level, so
remove it.
Change-Id: I9521e924ba492fe01f15afe2c0dd7c4142490a17
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Since non-blocking commits are the common case, preemptively kick the
big cluster out of idle when the atomic ioctl runs so that it'll be
ready to process the DRM IRQ and display commit kthreads.
Change-Id: I2e59d03bdc78f5314d3158fb437d68074fd2f0ee
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
These can sometimes get stuck and prevent the system from sleeping.
Change-Id: Ib198cd027e7353110997bcc46d9f39989e877c16
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Calling smp_processor_id() can be expensive depending on how an arch
implements it, so avoid calling it more than necessary.
Use the raw variant too since this code is always guaranteed to run with
preemption disabled.
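The pattern is simply to read the CPU number once into a local using the
raw accessor (a generic sketch; the stub helpers are placeholders, not the
actual call sites):

#include <linux/smp.h>

static void update_stats(int cpu) { /* placeholder */ }
static void kick_cpu(int cpu) { /* placeholder */ }

static void example(void)
{
	/* One raw read, reused below. The raw variant is fine here because
	 * this code always runs with preemption disabled, so the CPU
	 * cannot change underneath us. */
	int cpu = raw_smp_processor_id();

	update_stats(cpu);
	kick_cpu(cpu);
}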
Change-Id: Ic3f46cb34eb07c4d7ad43115da341e93aa0dbf5b
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
For synchronized wakes, the waker's CPU should only be treated as idle if
there aren't any other running tasks on that CPU. This is because, for
synchronized wakes, it is assumed that the waker will immediately go to
sleep after waking the wakee; therefore, if there aren't any other tasks
running on the waker's CPU, it'll go idle and should be treated as such to
improve task placement.
This optimization only applies when there aren't any other tasks running on
the waker's CPU, however.
Fix it by ensuring that there's only the waker running on its CPU.
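The shape of the check is roughly the following (illustrative fragment; the
surrounding CASS code is simplified away):

	/* For a sync wake, only treat the waker's CPU as about to go idle
	 * if the waker is the sole running task on it; otherwise the CPU
	 * will still be busy once the waker sleeps. */
	bool waker_cpu_idle = sync && cpu_rq(this_cpu)->nr_running == 1;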
Change-Id: I844386e51b9551b7ccb8fdb6723a255f1142509c
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Move `curr` and `idle_state` to within the loop's scope for better
readability. Also, leave a comment about `curr->cpu` to make it clear that
`curr->cpu` must be initialized within the loop in order for `best->cpu` to
be valid.
Change-Id: I0983b5283b46792f987d841b7e17d00804e28b65
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
When no candidate CPUs are idle, CASS would keep `cidx` unchanged, and thus
`best == curr` would always be true. As a result, since the empty candidate
slot never changes, the current candidate `curr` always overwrites the best
candidate `best`. This causes the last valid CPU to always be selected by
CASS when no CPUs are idle (i.e., under heavy load).
Fix it by ensuring that the CPU loop in cass_best_cpu() flips the free
candidate index after the first candidate CPU is evaluated.
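An illustrative sketch of the two-slot candidate buffer and the corrected
flip (not the verbatim CASS source; cass_cpu_better() stands in for the
real comparison):

	struct cass_cpu_cand cands[2], *best = cands, *curr;
	int cidx = 0, cpu;

	for_each_cpu(cpu, cpu_active_mask) {
		/* Evaluate @cpu into the currently free slot. */
		curr = &cands[cidx];
		curr->cpu = cpu;
		/* ... fill in utilization, idle state, etc ... */

		if (curr == best || cass_cpu_better(curr, best))
			best = curr;

		/* Flip the free slot whenever it holds the best candidate,
		 * which includes the very first iteration. Later candidates
		 * then can never overwrite *best, even under heavy load
		 * when no candidate is ever idle. */
		if (best == curr)
			cidx ^= 1;
	}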
Change-Id: I1745f12ded5079532c1bc47f851d8c164acf406b
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The following pattern:
var -= min_t(typeof(var), var, val);
is used multiple times in fair.c.
The existing sub_positive() already captures that pattern, but it also
adds an explicit load-store to properly support lockless observations.
In other cases the pattern above is used to update local, and/or not
concurrently accessed, variables.
Let's add a simpler version of sub_positive(), targeted at local variables
updates, which gives the same readability benefits at calling sites,
without enforcing {READ,WRITE}_ONCE() barriers.
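Upstream names this local-variable variant lsub_positive(); its shape is
roughly:

/*
 * Remove and clamp on negative, from a local variable.
 *
 * A variant of sub_positive() which does not use explicit load-store
 * and is thus optimized for local variable updates.
 */
#define lsub_positive(_ptr, _val) do {				\
	typeof(_ptr) ptr = (_ptr);				\
	*ptr -= min_t(typeof(*ptr), *ptr, _val);		\
} while (0)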
Link: https://lore.kernel.org/lkml/20181031184527.GA3178@hirez.programming.kicks-ass.net
Change-Id: I187e2bc9e8d62db14f23d89a7342cc1bdc513760
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The _task_util_est() is mainly used to add/remove the task contribution
to/from the rq's estimated utilization at task enqueue/dequeue time.
In both cases we ensure the UTIL_AVG_UNCHANGED flag is set to keep
consistency between enqueue and dequeue time while still being
transparent to update_load_avg calls which will eventually reset the
flag.
Let's move the flag forcing within _task_util_est() itself so that we
can simplify calling code by hiding that estimated utilization
implementation detail into one of its internal functions.
This will also affect the "public" API task_util_est(), but we know that
the flag will (eventually) impact just the LSB of the estimated
utilization, thus it's certainly acceptable.
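After the move, the helper itself ORs the flag into its return value (a
sketch of the resulting shape):

static inline unsigned long _task_util_est(struct task_struct *p)
{
	struct util_est ue = READ_ONCE(p->se.avg.util_est);

	/* Force the LSB flag here so callers no longer have to. */
	return max(ue.ewma, ue.enqueued) | UTIL_AVG_UNCHANGED;
}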
Link: http://lkml.kernel.org/r/20181105145400.935-3-patrick.bellasi@arm.com
Change-Id: I8a4bbdf89dba8abb03d4757249fcc8432cf18173
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
When triggering an active load balance, sd->nr_balance_failed is set to
such a value that any further can_migrate_task() using said sd will ignore
the output of task_hot().
This behaviour makes sense, as active load balance intentionally preempts a
rq's running task to migrate it right away, but this asynchronous write is
a bit shoddy, as the stopper thread might run active_load_balance_cpu_stop
before the sd->nr_balance_failed write either becomes visible to the
stopper's CPU or even happens on the CPU that appended the stopper work.
Add a struct lb_env flag to denote active balancing, and use it in
can_migrate_task(). Remove the sd->nr_balance_failed write that served the
same purpose. Cleanup the LBF_DST_PINNED active balance special case.
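The result looks roughly like this (sketch of the upstream shape):

	/* active_load_balance_cpu_stop(): mark the env explicitly. */
	struct lb_env env = {
		/* ... */
		.flags		= LBF_ACTIVE_LB,
	};

	/* can_migrate_task(): an active balance always migrates, without
	 * relying on the racy, asynchronous sd->nr_balance_failed write. */
	if (env->flags & LBF_ACTIVE_LB)
		return 1;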
Link: https://lkml.kernel.org/r/20210407220628.3798191-3-valentin.schneider@arm.com
Change-Id: I95928eae80bd3d5d247e20715d8a1c3ea268fc0f
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
We already have task_has_rt_policy() and task_has_dl_policy() helpers,
create task_has_idle_policy() as well and update sched core to start
using it.
While at it, use task_has_dl_policy() at one more place.
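The new helper mirrors the existing ones, simply wrapping the
already-present idle_policy() check (upstream shape, in
kernel/sched/sched.h):

static inline int task_has_idle_policy(struct task_struct *p)
{
	return idle_policy(p->policy);
}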
Link: http://lkml.kernel.org/r/ce3915d5b490fc81af926a3b6bfb775e7188e005.1541416894.git.viresh.kumar@linaro.org
Change-Id: I5b55c846dc6d595dd7015bf059aa2a0c13a74a7b
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This negatively affects the scheduler performance. Revert it.
This reverts commit 0d6eeac4fba5f065b4033f59c03897d9b1203d20.
Change-Id: I5b19db2e81a877454bd40243c5ac71c4850e253b
Suggested-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Change-Id: I78911172073047e10fcac3e4bd36a720ece4672e
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
To prevent spikes in CPU frequency selection, let's make sure that
instead of immediately switching to WALT's predicted load (if > *util),
we average it out and produce a smoother, more realistic load boost.
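A sketch of the idea, assuming the governor previously adopted the
predicted load outright whenever it exceeded the current util (field names
follow the usual WALT/schedutil glue and may differ slightly here):

	unsigned long pl = sg_cpu->walt_load.pl;

	/* Average the predicted load into the current util instead of
	 * adopting it outright, smoothing one-sample frequency spikes. */
	if (pl > *util)
		*util = (*util + pl) / 2;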
Change-Id: Ifcdf12bb2daf23ba9b4ec55ac26069dd48876c42
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This reverts commit f0504b6779eed2da109c403b2f986be2fb80ebec.
Change-Id: I93328b5782682f1a3545c763dc585f73e7a43ec7
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
As a first step, this patch makes the cfs_tasks list an MRU one.
This means that when the next task is picked to run on a physical
CPU, it is moved to the front of the list.
Therefore, the cfs_tasks list is more or less sorted (except for
woken tasks), starting from tasks recently given CPU time toward
tasks with the maximum wait time in the run-queue, i.e. an MRU list.
Second, as part of the load balance operation, this approach
starts detach_tasks()/detach_one_task() from the tail of the
queue instead of the head, giving some advantages:
- it tends to pick the task with the highest wait time;
- tasks located at the tail are less likely to be cache-hot,
therefore the can_migrate_task() check is more likely to succeed.
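The two pieces of the change look roughly like this (sketch of the hunks,
not the full diff):

	/* set_next_entity(): the task being scheduled in moves to the head
	 * of cfs_tasks, keeping the list in MRU order. */
	if (entity_is_task(se))
		list_move(&se->group_node, &rq_of(cfs_rq)->cfs_tasks);

	/* detach_tasks(): scan from the tail, i.e. the tasks that have
	 * waited longest and are the least likely to be cache-hot. */
	p = list_last_entry(tasks, struct task_struct, se.group_node);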
hackbench shows slightly better performance. For example, doing
1000 samples and 40 groups on an i5-3320M CPU gives the figures
below:
default: 0.657 avg
patched: 0.646 avg
Link: http://lkml.kernel.org/r/20170913102430.8985-2-urezki@gmail.com
Change-Id: Id1b4152d769e1a1e1536c67d9369073fcfc3013c
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
sync is causing excessive latencies during binder replies, as it causes
migration of important tasks to a busy CPU. In case the CPU has a lot of
tasks running, prevent sync from happening.
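A sketch of the gate in the wakeup path (the threshold and exact placement
here are illustrative):

	/* Drop the sync hint when the waker's CPU is already loaded, so a
	 * binder reply is not migrated onto a busy CPU just to stay close
	 * to its waker. */
	if (sync && cpu_rq(this_cpu)->nr_running > 2)
		sync = 0;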
Based on: 3944217bbb
Change-Id: I2077bf11dccc9ae1f68ac1363b06a5d200e385da
Signed-off-by: Zachariah Kennedy <zkennedy87@gmail.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The estimated utilization for a task:
util_est = max(util_avg, est.enqueue, est.ewma)
is defined based on:
- util_avg: the PELT defined utilization
- est.enqueued: the util_avg at the end of the last activation
- est.ewma: an exponential moving average of the est.enqueued
samples
According to this definition, when a task suddenly changes its bandwidth
requirements from small to big, the EWMA will need to collect multiple
samples before converging up to track the new big utilization.
This slow convergence towards bigger utilization values is not
aligned with the default scheduler behavior, which is to optimize for
performance. Moreover, the est.ewma component fails to compensate for
temporary utilization drops which span just a few est.enqueued samples.
To let util_est do a better job in the scenario depicted above, change
its definition by making util_est directly follow upward motion and
only decay est.ewma on the way down.
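The upstream form of this change (gated behind a UTIL_EST_FASTUP scheduler
feature) looks roughly like:

	/*
	 * Reset the EWMA on utilization increases; the moving average is
	 * then only used to smooth utilization decreases.
	 */
	ue.enqueued = task_util(p) | UTIL_AVG_UNCHANGED;
	if (sched_feat(UTIL_EST_FASTUP)) {
		if (ue.ewma < ue.enqueued) {
			ue.ewma = ue.enqueued;
			goto done;
		}
	}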
https://lkml.org/lkml/2019/10/23/1071
Change-Id: I1837e322b02958faeaff65563127e6ab682c93e1
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@matbug.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
When a task is upmigrating via the tick path, the lower capacity CPU
that is running the task wakes up the migration task to
carry out the migration to the higher capacity CPU. The migration
task dequeues the task from the lower capacity CPU and enqueues it on
the higher capacity CPU. A reschedule IPI is then sent to the higher
capacity CPU. If the higher capacity CPU was in a deep sleep state, this
results in more waiting time before the task is upmigrated. This can
be optimized by waking up the higher capacity CPU along with waking
the migration task on the lower capacity CPU. Since we reserve the
higher capacity CPU, the is_reserved() API can be used to prevent
the CPU from entering idle again.
[clingutla@codeaurora.org: Resolved minor merge conflicts]
Change-Id: I7bda9a905a66a9326c1dc74e50fa94eb58e6b705
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The runqueue of a fair task being remotely reniced is going to get a
resched IPI in order to reassess which task should be the current
running on the CPU. However that evaluation is useless if the fair task
is running alone, in which case we can spare that IPI, preventing
nohz_full CPUs from being disturbed.
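The resulting check in prio_changed_fair() is simply an early bail-out when
the reniced task is alone on its runqueue (roughly the upstream shape):

static void
prio_changed_fair(struct rq *rq, struct task_struct *p, int oldprio)
{
	if (!task_on_rq_queued(p))
		return;

	/* Alone on the rq: nothing to preempt, so skip the resched and
	 * the IPI it would send to a remote (possibly nohz_full) CPU. */
	if (rq->cfs.nr_running == 1)
		return;

	/* ... existing preemption checks ... */
}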
Link: https://lkml.kernel.org/r/20191203160106.18806-2-frederic@kernel.org
Change-Id: I7fc11e9b48643f0a4cb73838028c1b556cd79dbb
Cc: Ingo Molnar <mingo@kernel.org>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Previously we used the pure CFS wakeup path in the overutilized case. This
is a tweaked version that activates that path only for important tasks.
Bug: 161190988
Bug: 160883639
Test: boot and systrace
Change-Id: I2a27f241b3ba32a04cf6f88deb483d6636440dcf
Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Currently, when calculating boosted util for a cpu, a fixed value
of 1024 is used in the calculation. So when top-app tasks are moved
to the LC, which has a much lower capacity than the BC, the freq
calculated will be high even when the cpu util is low. This results
in higher power consumption, especially on archs that have more
little cores than big cores. Replacing the fixed value of 1024 with
the actual cpu capacity reduces the freq calculated on the LC.
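A sketch of the margin calculation with the capacity substituted in (the
helper name and plumbing are simplified here):

static unsigned long boosted_util(int cpu, unsigned long util, int boost_pct)
{
	/* Scale the boost margin by the CPU's own capacity rather than the
	 * fixed SCHED_CAPACITY_SCALE (1024), so a little core is not
	 * over-boosted relative to what it can actually deliver. */
	unsigned long capacity = capacity_orig_of(cpu);
	long margin = 0;

	if (util < capacity)
		margin = (long)(capacity - util) * boost_pct / 100;

	return util + margin;
}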
Bug: 152925197
Test: boosted util reduced on little cores
Change-Id: I80cdd08a2c7fa5e674c43bfc132584d85c14622b
Signed-off-by: Rick Yiu <rickyiu@google.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The fair scheduler performs periodic load balance on every CPU to check
if it can pull some tasks from other busy CPUs. The duration of this
periodic load balance is set to sd->balance_interval for the idle CPUs
and is calculated by multiplying the sd->balance_interval with the
sd->busy_factor (set to 32 by default) for the busy CPUs. The
multiplication is done for busy CPUs to avoid doing load balance too
often and rather spend more time executing the actual tasks. While that is
the right thing to do for the CPUs busy with SCHED_OTHER or SCHED_BATCH
tasks, it may not be the optimal thing for CPUs running only SCHED_IDLE
tasks.
With the recent enhancements in the fair scheduler around SCHED_IDLE
CPUs, we now prefer to enqueue a newly-woken task to a SCHED_IDLE
CPU instead of other busy or idle CPUs. The same reasoning should be
applied to the load balancer as well to make it migrate tasks more
aggressively to a SCHED_IDLE CPU, as that will reduce the scheduling
latency of the migrated (SCHED_OTHER) tasks.
This patch makes minimal changes to the fair scheduler to do the next
load balance soon after the last non SCHED_IDLE task is dequeued from a
runqueue, i.e. making the CPU SCHED_IDLE. Also the sd->busy_factor is
ignored while calculating the balance_interval for such CPUs. This is
done to avoid delaying the periodic load balance by a few hundred
milliseconds for SCHED_IDLE CPUs.
This is tested on ARM64 Hikey620 platform (octa-core) with the help of
rt-app and it is verified, using kernel traces, that the newly
SCHED_IDLE CPU does load balancing shortly after it becomes SCHED_IDLE
and pulls tasks from other busy CPUs.
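Two of the hunks convey the idea (roughly the upstream shape): a
SCHED_IDLE-only CPU is no longer treated as busy when picking the balance
interval, and dequeueing the last non-SCHED_IDLE task pulls the next
balance forward.

	/* rebalance_domains(): a CPU running only SCHED_IDLE tasks keeps
	 * the short balance_interval instead of the busy_factor-scaled one. */
	int busy = idle != CPU_IDLE && !sched_idle_cpu(cpu);
	/* ... */
	interval = get_sd_balance_interval(sd, busy);

	/* dequeue_task_fair(): if the rq has just become SCHED_IDLE-only,
	 * balance right away to pull higher priority work. */
	if (unlikely(!was_sched_idle && sched_idle_rq(rq)))
		rq->next_balance = jiffies;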
Link: https://lkml.kernel.org/r/e485827eb8fe7db0943d6f3f6e0f5a4a70272781.1578471925.git.viresh.kumar@linaro.org
Change-Id: Id9b8afbb2825369716a33edc3f3ec69a2abee6b9
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>