There are two reasons why CPU idle states may be disabled: either
because the driver has disabled them or because they have been
disabled by user space via sysfs.
In the former case, the state's "disabled" flag is set once during the
initialization of the driver and it is never cleared later (it is
effectively read-only). In the latter case, the "disable" field of the
given state's cpuidle_state_usage struct is set and it may be changed
via sysfs. Thus checking whether or not an idle state has been disabled
involves reading these two flags every time.
In order to avoid the additional check of the state's "disabled" flag
(which is effectively read-only anyway), use its value at init time to
set a (new) flag in the "disable" field of that state's
cpuidle_state_usage structure and use the sysfs interface to manipulate
another (new) flag in it. This way the state is disabled whenever the
"disable" field of its cpuidle_state_usage structure is nonzero,
whatever the reason, and that field is the only place to look into to
check whether or not the state has been disabled.
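For reference, a minimal sketch of the consolidated scheme, assuming the
two flag names introduced by the patch (illustrative, not the full diff):

  #define CPUIDLE_STATE_DISABLED_BY_USER          BIT(0)
  #define CPUIDLE_STATE_DISABLED_BY_DRIVER        BIT(1)

  /* Selection paths then need a single check per state: */
  if (dev->states_usage[i].disable)
          continue;       /* disabled for whatever reason, skip it */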
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Samuel Pascua <pascua.samuel.14@gmail.com>
Signed-off-by: Dark-Matter7232 <me@const.eu.org>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
commit 159e48560f51d9c2aa02d762a18cd24f7868ab27 upstream.
The TEO governor uses idle duration "bins" defined in accordance with
the CPU idle states table provided by the driver, so that each "bin"
covers the idle duration range between the target residency of the
idle state corresponding to it and the target residency of the closest
deeper idle state. The governor collects statistics for each bin
regardless of whether or not the idle state corresponding to it is
currently enabled.
In particular, the "early hits" metric measures the likelihood of a
situation in which the idle duration measured after wakeup falls into
the given bin, but the time till the next timer (sleep length) falls
into a bin corresponding to one of the deeper idle states. It is
used when the "hits" and "misses" metrics indicate that the state
"matching" the sleep length should not be selected, so that the state
with the maximum "early hits" value is selected instead of it.
If the idle state corresponding to the given bin is disabled, it
cannot be selected and if it turns out to be the one that should be
selected, a shallower idle state needs to be used instead of it.
Nevertheless, the metrics collected for the bin corresponding to it
are still valid and need to be taken into account as though that
state had not been disabled.
As far as the "early hits" metric is concerned, teo_select() tries to
take disabled states into account, but the state index corresponding
to the maximum "early hits" value computed by it may be incorrect.
Namely, it always uses the index of the previous maximum "early hits"
state then, but there may be enabled idle states closer to the
disabled one in question. In particular, if the current candidate
state (whose index is the idx value) is closer to the disabled one
and the "early hits" value of the disabled state is greater than the
current maximum, the index of the current candidate state (idx)
should replace the "maximum early hits state" index.
Modify the code to handle that case correctly.
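For reference, a hedged sketch of the corrected bookkeeping in the
teo_select() scan (simplified, not the verbatim diff):

  if (s->disabled || su->disable) {
          /*
           * Credit the disabled state's "early hits" to the closest
           * enabled shallower state, i.e. the current candidate (idx),
           * whenever this bin's metric beats the running maximum.
           */
          if (cpu_data->states[i].early_hits > early_hits) {
                  early_hits = cpu_data->states[i].early_hits;
                  max_early_idx = idx;
          }
          continue;
  }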
Fixes: b26bf6ab716f ("cpuidle: New timer events oriented governor for tickless systems")
Reported-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Tashar02: Backport to k4.19 ]
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
commit e43dcf20215f0287ea113102617ca04daa76b70e upstream.
The TEO governor uses idle duration "bins" defined in accordance with
the CPU idle states table provided by the driver, so that each "bin"
covers the idle duration range between the target residency of the
idle state corresponding to it and the target residency of the closest
deeper idle state. The governor collects statistics for each bin
regardless of whether or not the idle state corresponding to it is
currently enabled.
In particular, the "hits" and "misses" metrics measure the likelihood
of a situation in which both the time till the next timer (sleep
length) and the idle duration measured after wakeup fall into the
given bin. Namely, if the "hits" value is greater than the "misses"
one, that situation is more likely than the one in which the sleep
length falls into the given bin, but the idle duration measured after
wakeup falls into a bin corresponding to one of the shallower idle
states.
If the idle state corresponding to the given bin is disabled, it
cannot be selected and if it turns out to be the one that should be
selected, a shallower idle state needs to be used instead of it.
Nevertheless, the metrics collected for the bin corresponding to it
are still valid and need to be taken into account as though that
state had not been disabled.
For this reason, make teo_select() always use the "hits" and "misses"
values of the idle duration range that the sleep length falls into,
even if the specific idle state corresponding to it is disabled, and if
the "hits" value is greater than the "misses" one, select the closest
enabled shallower idle state in that case.
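A hedged sketch of that fallback (illustrative loop, not the verbatim
upstream diff):

  if (hits > misses) {
          /*
           * The bin matching the sleep length says its state would
           * have been a hit, but that state is disabled: fall back
           * to the closest enabled shallower state.
           */
          for (i = idx - 1; i >= 0; i--) {
                  if (!dev->states_usage[i].disable) {
                          idx = i;
                          break;
                  }
          }
  }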
Fixes: b26bf6ab716f ("cpuidle: New timer events oriented governor for tickless systems")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Tashar02: Backport to k4.19 ]
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
commit 4f690bb8ce4cc5d3fabe3a8e9c2401de1554cdc1 upstream.
Rename a local variable in teo_select() in preparation for subsequent
code modifications; no intended change in behavior.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Tashar02: Backport to k4.19 ]
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
commit 069ce2ef1a6dd84cbd4d897b333e30f825e021f0 upstream.
Prevent disabled CPU idle states with target residencies beyond the
anticipated idle duration from being taken into account by the TEO
governor.
Fixes: b26bf6ab716f ("cpuidle: New timer events oriented governor for tickless systems")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Tashar02: Backport to k4.19 ]
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Notice that setting measured_us to UINT_MAX in teo_update() earlier
doesn't change the behavior of the following code, so do that and
eliminate a redundant check used for setting measured_us to UINT_MAX.
This change is not expected to alter functionality.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ Tashar02: Backport to k4.19 ]
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The TEO governor prevents the scheduler tick from being stopped (unless
stopped already) if there is a PM QoS latency constraint for the given
CPU and the target residency of the deepest idle state matching that
constraint is below the tick boundary.
However, that is problematic if CPUs with PM QoS latency constraints
are idle for long times, because it effectively causes the tick to
run on them all the time which is wasteful. [It is also confusing
and questionable if they are full dynticks CPUs.]
To address that issue, modify the TEO governor to carry out the
entire search for the most suitable idle state (from the target
residency perspective) even if a latency constraint is present,
to allow it to determine the expected idle duration in all cases.
Also, when using the last several measured idle duration values to
refine the idle state selection, make it compare those values with the
current expected idle duration value (instead of comparing them with
the target residency of the idle state selected so far), which should
prevent the tick from being retained when it makes sense to stop it
(especially in the presence of PM QoS latency constraints).
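A hedged sketch of the refined comparison (simplified): the most recent
measured idle durations are now weighed against the current expected
idle duration, duration_us, rather than against the candidate state's
target residency:

  for (i = 0; i < INTERVALS; i++) {
          unsigned int val = cpu_data->intervals[i];

          if (val >= duration_us)
                  continue;       /* too long to refine the prediction */

          count++;
          sum += val;
  }
  /* If enough samples are shorter, their average refines duration_us. */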
Fixes: b26bf6ab716f ("cpuidle: New timer events oriented governor for tickless systems")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ Tashar02: Backport to k4.19 ]
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The venerable menu governor does some things that are quite
questionable in my view.
First, it includes timer wakeups in the pattern detection data and
mixes them up with wakeups from other sources which in some cases
causes it to expect what essentially would be a timer wakeup in a
time frame in which no timer wakeups are possible (because it knows
the time until the next timer event and that is later than the
expected wakeup time).
Second, it uses the extra exit latency limit based on the predicted
idle duration and depending on the number of tasks waiting on I/O,
even though those tasks may run on a different CPU when they are
woken up. Moreover, the time ranges used by it for the sleep length
correction factors depend on whether or not there are tasks waiting
on I/O, which again doesn't imply anything in particular, and they
are not correlated to the list of available idle states in any way
whatever.
Also, the pattern detection code in menu may end up considering values
that are too large to matter at all, in which case running it is a
waste of time.
A major rework of the menu governor would be required to address
these issues and the performance of at least some workloads (tuned
specifically to the current behavior of the menu governor) is likely
to suffer from that. It is thus better to introduce an entirely new
governor without them and let everybody use the governor that works
better with their actual workloads.
The new governor introduced here, the timer events oriented (TEO)
governor, uses the same basic strategy as menu: it always tries to
find the deepest idle state that can be used in the given conditions.
However, it applies a different approach to that problem.
First, it doesn't use "correction factors" for the time till the
closest timer, but instead it tries to correlate the measured idle
duration values with the available idle states and use that
information to pick up the idle state that is most likely to "match"
the upcoming CPU idle interval.
Second, it doesn't take the number of "I/O waiters" into account at
all and the pattern detection code in it avoids taking timer wakeups
into account. It also only uses idle duration values less than the
current time till the closest timer (with the tick excluded) for that
purpose.
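A minimal sketch of the bin bookkeeping this implies in teo_update()
(simplified; the real code also decays the counters over time):

  if (idx_timer > idx_duration) {
          /* The CPU woke up earlier than the closest timer. */
          cpu_data->states[idx_timer].misses += PULSE;
          cpu_data->states[idx_duration].early_hits += PULSE;
  } else {
          /* The wakeup matched the anticipated timer event. */
          cpu_data->states[idx_timer].hits += PULSE;
  }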
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ Tashar02: Backport to k4.19 ]
Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Commit 83788c0caed3 ("cpuidle: remove unused exports") removed the
capability of registering cpuidle governors, which was unused at that
time. By exporting the symbol, let's allow platform specific modules to
register cpuidle governors and use cpuidle_governor_latency_req() to
get the QoS for the CPU.
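A minimal sketch of the change, assuming both symbols are exported as
the text implies:

  /* drivers/cpuidle/governor.c */
  EXPORT_SYMBOL_GPL(cpuidle_register_governor);
  EXPORT_SYMBOL_GPL(cpuidle_governor_latency_req);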
Signed-off-by: Lina Iyer <ilina@codeaurora.org>
Teo Fix.
Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
There is some code duplication related to the PM QoS handling between
the existing cpuidle governors, so move that code to a common helper
function and call that from the governors.
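The common helper, roughly as it looked in the v4.19 era (quoted from
memory, so treat it as a sketch):

  int cpuidle_governor_latency_req(unsigned int cpu)
  {
          int global_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
          struct device *device = get_cpu_device(cpu);
          int device_req = dev_pm_qos_raw_read_value(device);

          return device_req < global_req ? device_req : global_req;
  }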
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Fix For TEO (cpuidle: New timer events oriented governor for tickless systems)
Partial Commit From Kernel 4.19
Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
[ Upstream commit 5f26bdceb9c0a5e6c696aa2899d077cd3ae93413 ]
If the CPU exits the "polling" state due to the time limit in the
loop in poll_idle(), this is not a real wakeup and it just means
that the "polling" state selection was not adequate. The governor
mispredicted short idle duration, but had a more suitable state been
selected, the CPU might have spent more time in it. In fact, there
is no reason to expect that there would have been a wakeup event
earlier than the next timer in that case.
Handling such cases as regular wakeups in menu_update() may cause the
menu governor to make suboptimal decisions going forward, but ignoring
them altogether would not be correct either, because every time
menu_select() is invoked, it makes a separate new attempt to predict
the idle duration taking distinct time to the closest timer event as
input and the outcomes of all those attempts should be recorded.
For this reason, make menu_update() always assume that if the
"polling" state was exited due to the time limit, the next proper
wakeup event for the CPU would be the next timer event (not
including the tick).
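The resulting logic in menu_update() looks roughly like this (sketch
based on the upstream diff):

  if (dev->poll_time_limit) {
          /*
           * The CPU exited the "polling" state due to the time limit,
           * so assume the next proper wakeup would have been the next
           * timer event had a deeper state been selected.
           */
          measured_us = data->next_timer_us;
  } else {
          measured_us = cpuidle_get_last_residency(dev);
  }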
Fixes: a37b969a61c1 "cpuidle: poll_state: Add time limit to poll_idle()"
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
From Kernel 4.19
Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Rik reports that he sees an increase in CPU use in one benchmark
due to commit 612f1a22f067 "cpuidle: poll_state: Add time limit to
poll_idle()" that caused poll_idle() to call local_clock() in every
iteration of the loop. Utilization increase generally means more
non-idle time with respect to total CPU time (on the average) which
implies reduced CPU frequency.
Doug reports that limiting the rate of local_clock() invocations
in there causes much less power to be drawn during a CPU-intensive
parallel workload (with idle states 1 and 2 disabled to enforce more
state 0 residency).
These two reports together suggest that executing local_clock() on
multiple CPUs in parallel at a high rate may cause chips to get hot
and trigger thermal/power limits on them to kick in, so reduce the
rate of local_clock() invocations in poll_idle() to avoid that issue.
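A sketch of the rate-limited loop (upstream batches roughly 200
cpu_relax() iterations between local_clock() reads; treat the constants
as approximate):

  #define POLL_IDLE_RELAX_COUNT   200

  u64 time_start = local_clock();
  unsigned int loop_count = 0;

  while (!need_resched()) {
          cpu_relax();
          if (loop_count++ < POLL_IDLE_RELAX_COUNT)
                  continue;

          loop_count = 0;
          if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT) {
                  dev->poll_time_limit = true;
                  break;
          }
  }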
Fixes: 612f1a22f067 "cpuidle: poll_state: Add time limit to poll_idle()"
Reported-by: Rik van Riel <riel@surriel.com>
Reported-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Rik van Riel <riel@surriel.com>
From Kernel 4.19
Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
If poll_idle() is allowed to spin until need_resched() returns 'true',
it may actually spin for a much longer time than expected by the idle
governor, since set_tsk_need_resched() is not always called by the
timer interrupt handler. If that happens, the CPU may spend much
more time than anticipated in the "polling" state.
To prevent that from happening, limit the time of the spinning loop
in poll_idle().
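A sketch of the original time limit (the rate limiting above was added
on top of it later; the bound is approximate):

  #define POLL_IDLE_TIME_LIMIT    (TICK_NSEC / 16)

  u64 time_start = local_clock();

  while (!need_resched()) {
          cpu_relax();
          if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
                  break;  /* bound the spin; reselect on the next idle */
  }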
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Doug Smythies <dsmythies@telus.net>
From Kernel 4.19
Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Change-Id: Iea6540fa03bdca659562b9b47361beda8d5db4ee
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
First of all, this is a downstream kernel - always keep that in mind!
Now, this kernel is targeting new *very powerful* Qualcomm platforms
like SM8250 and the Sony Edo platform - which has a very fast UFS card.
Keep in mind that the bootloader sets the CPU at a frequency that is
slightly faster than the "in the middle" ones, which is anyway not
veeeery fast - but that's good, really. I agree.
So.. check this out: for Image.gz-dtb:
  COMP_LEVEL    SIZE
  9             20116171
  5             20220479
  2             20940223
  1             21231290
Remember again that we're loading from a UFS card and that
we are loading ~1.1MB more out of a 20MB file.
If you're smart enough you surely know already about RAM and CPU
overhead of very high compression levels.
If you still disagree with what I just did, read this commit description
another 20 times, or more, until you understand it. :)))
Change-Id: Ic28bff2011b40631fc81b582a25029ac8d12d48e
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Import the latest implementation of memcpy(), based on the
upstream code of string/aarch64/memcpy.S at commit afd6244 from
https://github.com/ARM-software/optimized-routines, and subsuming
memmove() in the process.
Note that for simplicity Arm have chosen to contribute this code
to Linux under GPLv2 rather than the original MIT license.
Note also that the needs of the usercopy routines vs. regular memcpy()
have now diverged so far that we abandon the shared template idea
and the damage which that incurred to the tuning of LDP/STP loops.
We'll be back to tackle those routines separately in future.
Link: https://lore.kernel.org/r/3c953af43506581b2422f61952261e76949ba711.1622128527.git.robin.murphy@arm.com
Change-Id: I78b4d7bf65b1a4eebf509b087d0120b0f99e51c4
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Change-Id: If9cf1b47dee3b9bd0663c88034da8edc98bd28f6
Signed-off-by: Samuel Pascua <pascua.samuel.14@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Change-Id: Ibe3ac6ab685c97874fd381482a1e0cbd60fba806
Signed-off-by: Samuel Pascua <pascua.samuel.14@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Enabling CONFIG_FAIR_GROUP_SCHED with proper tuning can help prioritize
important work in Android. The feature was previously taken off due to
an improper setting of the background tasks' share. Now that tasks in
the root group have been moved into a newly created system subgroup,
the shares can be properly set.
+----------------------------------------------------------------------------------------+
| Cold App Launch Time (* w/ prio120 8 threads running in root cpuset)                    |
+----------------------------------------+--------+------+---------+-----------+---------+
|                                        | chrome | maps | youtube | playstore | setting |
+----------------------------------------+--------+------+---------+-----------+---------+
| No CONFIG_FAIR_GROUP_SCHED support(*)  | 591    | 1314 | 887     | 1952      | 551     |
+----------------------------------------+--------+------+---------+-----------+---------+
| CONFIG_FAIR_GROUP_SCHED w/ 1% limit(*) | 567    | 637  | 668     | 1450      | 529     |
+----------------------------------------+--------+------+---------+-----------+---------+
| No stress running (best case)          | 416    | 463  | 484     | 1075      | 363     |
+----------------------------------------+--------+------+---------+-----------+---------+
xNombre: It is needed by UClamp
Bug: 171740453
Test: Build and boot
Change-Id: Ibb7e48c93136e3967da6381d7c0c94d0cdaee443
Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Change-Id: I10a70d27d1f7a4baf2b697ee560bfe39ce29e774
Signed-off-by: Samuel Pascua <pascua.samuel.14@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
[ Upstream commit 315c4f884800c45cb6bd8c90422fad554a8b9588 ]
Commit d81ae8aac85c ("sched/uclamp: Fix initialization of struct
uclamp_rq") introduced a bug where uclamp_max of the rq is not reset to
match the woken up task's uclamp_max when the rq is idle.
The code was relying on rq->uclamp_max being initialized to zero, so
on first enqueue
  static inline void uclamp_rq_inc_id(struct rq *rq, struct task_struct *p,
                                      enum uclamp_id clamp_id)
  {
          ...
          if (uc_se->value > READ_ONCE(uc_rq->value))
                  WRITE_ONCE(uc_rq->value, uc_se->value);
  }
was actually resetting it. But since commit d81ae8aac85c changed the
default to 1024, this no longer works. And since rq->uclamp_flags is
also initialized to 0, neither above code path nor uclamp_idle_reset()
update the rq->uclamp_max on first wake up from idle.
This is only visible from first wake up(s) until the first dequeue to
idle after enabling the static key. And it only matters if the
uclamp_max of this task is < 1024 since only then its uclamp_max will be
effectively ignored.
Fix it by properly initializing rq->uclamp_flags = UCLAMP_FLAG_IDLE to
ensure uclamp_idle_reset() is called which then will update the rq
uclamp_max value as expected.
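The fix itself is essentially a one-liner in the rq initialization path
(sketched from the upstream diff):

  /* in init_uclamp_rq(), roughly: */
  -       rq->uclamp_flags = 0;
  +       rq->uclamp_flags = UCLAMP_FLAG_IDLE;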
Fixes: d81ae8aac85c ("sched/uclamp: Fix initialization of struct uclamp_rq")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20211202112033.1705279-1-qais.yousef@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last
uclamp active task (that is, when buckets.tasks reaches 0 for all
buckets) to maintain the last uclamp.max and prevent blocked util from
suddenly becoming visible.
However, there is an asymmetry in how the flag is set and cleared which
can lead to having the flag set whilst there are active tasks on the rq.
Specifically, the flag is cleared in the uclamp_rq_inc() path, which is
called at enqueue time, but set in uclamp_rq_dec_id(), which is called
both when dequeueing a task _and_ in the update_uclamp_active() path. As
a result, when both uclamp_rq_{dec,inc}_id() are called from
update_uclamp_active(), the flag ends up being set but not cleared,
hence leaving the runqueue in a broken state.
Fix this by clearing the flag in update_uclamp_active() as well.
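A hedged sketch of the shape of the fix in uclamp_update_active() (not
the verbatim diff):

  if (p->uclamp[clamp_id].active) {
          uclamp_rq_inc_id(rq, p, clamp_id);

          /* Mirror the enqueue path: an active task clears the flag. */
          if (rq->uclamp_flags & UCLAMP_FLAG_IDLE)
                  rq->uclamp_flags &= ~UCLAMP_FLAG_IDLE;
  }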
Fixes: e496187da710 ("sched/uclamp: Enforce last task's UCLAMP_MAX")
Reported-by: Rick Yiu <rickyiu@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20210805102154.590709-2-qperret@google.com
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
There is currently nothing preventing tasks from changing their per-task
clamp values in any way they like. The rationale is probably that
system administrators are still able to limit those clamps thanks to the
cgroup interface. However, this causes pain in a system where both
per-task and per-cgroup clamp values are expected to be under the
control of core system components (as is the case for Android).
To fix this, let's require CAP_SYS_NICE to change per-task clamp values.
There are ongoing discussions upstream about more flexible approaches
than this using the RLIMIT API -- see [1]. But the upstream discussion
has not converged yet, and this is way too late for UAPI changes in
android12-5.10 anyway, so let's apply this change which provides the
behaviour we want without actually impacting UAPIs.
[1] https://lore.kernel.org/lkml/20210623123441.592348-4-qperret@google.com/
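A minimal sketch of the check this adds, assuming it lives in the
uclamp validation path (illustrative):

  static int uclamp_validate(struct task_struct *p,
                             const struct sched_attr *attr)
  {
          /* Only privileged tasks may change per-task clamp values. */
          if (!capable(CAP_SYS_NICE))
                  return -EPERM;
          ...
  }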
Bug: 187186685
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I749312a77306460318ac5374cf243d00b78120dd
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
[ Upstream commit 3e1493f46390618ea78607cb30c58fc19e2a5035 ]
When a task wakes up on an idle rq, uclamp_rq_util_with() would max
aggregate with the rq value. But since there is no task enqueued yet,
the values are stale, based on the last task that was running. When the
new task actually wakes up and is enqueued, the rq uclamp values should
reflect the effective uclamp values of the newly woken task.
This is a problem particularly for uclamp_max because it defaults to
1024. If a task p with uclamp_max = 512 wakes up, then max aggregation
would ignore the capping that should apply when this task is enqueued,
which is wrong.
Fix that by ignoring max aggregation if the rq is idle, since in that
case the effective uclamp value of the rq will be that of the task that
is about to wake up.
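A hedged sketch of the resulting logic in uclamp_rq_util_with()
(simplified):

  min_util = uclamp_eff_value(p, UCLAMP_MIN);
  max_util = uclamp_eff_value(p, UCLAMP_MAX);

  /*
   * If the rq is idle, its clamp values describe the previous task,
   * so skip the max aggregation with the stale rq values.
   */
  if (!(rq->uclamp_flags & UCLAMP_FLAG_IDLE)) {
          min_util = max(min_util, READ_ONCE(rq->uclamp[UCLAMP_MIN].value));
          max_util = max(max_util, READ_ONCE(rq->uclamp[UCLAMP_MAX].value));
  }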
Fixes: 9d20ad7dfc9a ("sched/uclamp: Add uclamp_util_with()")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
[qias: Changelog]
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Link: https://lore.kernel.org/r/20210630141204.8197-1-xuewen.yan94@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
[ Upstream commit 0213b7083e81f4acd69db32cb72eb4e5f220329a ]
Now that cpu.uclamp.min acts as a protection, we need to make sure that
the uclamp request of the task is within the allowed range of the
cgroup, that is, it is clamp()'ed correctly by tg->uclamp[UCLAMP_MIN]
and tg->uclamp[UCLAMP_MAX].
As reported by Xuewen [1] we can have some corner cases where there's an
inversion between the uclamp requested by the task (p) and the uclamp
values of the taskgroup it's attached to (tg). The following table
demonstrates 2 corner cases:
           |  p  |  tg  | effective
-----------+-----+------+-----------
CASE 1
-----------+-----+------+-----------
uclamp_min | 60% | 0%   | 60%
-----------+-----+------+-----------
uclamp_max | 80% | 50%  | 50%
-----------+-----+------+-----------
CASE 2
-----------+-----+------+-----------
uclamp_min | 0%  | 30%  | 30%
-----------+-----+------+-----------
uclamp_max | 20% | 50%  | 20%
-----------+-----+------+-----------
With this fix we get:
           |  p  |  tg  | effective
-----------+-----+------+-----------
CASE 1
-----------+-----+------+-----------
uclamp_min | 60% | 0%   | 50%
-----------+-----+------+-----------
uclamp_max | 80% | 50%  | 50%
-----------+-----+------+-----------
CASE 2
-----------+-----+------+-----------
uclamp_min | 0%  | 30%  | 30%
-----------+-----+------+-----------
uclamp_max | 20% | 50%  | 30%
-----------+-----+------+-----------
Additionally uclamp_update_active_tasks() must now unconditionally
update both UCLAMP_MIN/MAX because changing the tg's UCLAMP_MAX for
instance could have an impact on the effective UCLAMP_MIN of the tasks.
           |  p  |  tg  | effective
-----------+-----+------+-----------
old
-----------+-----+------+-----------
uclamp_min | 60% | 0%   | 50%
-----------+-----+------+-----------
uclamp_max | 80% | 50%  | 50%
-----------+-----+------+-----------
*new*
-----------+-----+------+-----------
uclamp_min | 60% | 0%   | *60%*
-----------+-----+------+-----------
uclamp_max | 80% | *70%*| *70%*
-----------+-----+------+-----------
[1] https://lore.kernel.org/lkml/CAB8ipk_a6VFNjiEnHRHkUMBKbA+qzPQvhtNjJ_YNzQhqV_o8Zw@mail.gmail.com/
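A hedged sketch of the resulting restriction (simplified from the
upstream change):

  static inline struct uclamp_se
  uclamp_tg_restrict(struct task_struct *p, enum uclamp_id clamp_id)
  {
          struct uclamp_se uc_req = p->uclamp_req[clamp_id];
  #ifdef CONFIG_UCLAMP_TASK_GROUP
          unsigned int tg_min = task_group(p)->uclamp[UCLAMP_MIN].value;
          unsigned int tg_max = task_group(p)->uclamp[UCLAMP_MAX].value;

          /* The effective request is the task's, clamped by the tg. */
          uclamp_se_set(&uc_req, clamp(uc_req.value, tg_min, tg_max), false);
  #endif
          return uc_req;
  }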
Fixes: 0c18f2ecfcc2 ("sched/uclamp: Fix wrong implementation of cpu.uclamp.min")
Reported-by: Xuewen Yan <xuewen.yan94@gmail.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210617165155.3774110-1-qais.yousef@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
[ Upstream commit 0c18f2ecfcc274a4bcc1d122f79ebd4001c3b445 ]
cpu.uclamp.min is a protection as described in the cgroup-v2 Resource
Distribution Model,
Documentation/admin-guide/cgroup-v2.rst,
which means we try our best to preserve the minimum performance point
of tasks in this group. See the full description of cpu.uclamp.min in
cgroup-v2.rst.
But the current implementation makes it a limit, which is not what was
intended.
For example:
  tg->cpu.uclamp.min = 20%
  p0->uclamp[UCLAMP_MIN] = 0
  p1->uclamp[UCLAMP_MIN] = 50%
Previous Behavior (limit):
  p0->effective_uclamp = 0
  p1->effective_uclamp = 20%
New Behavior (Protection):
  p0->effective_uclamp = 20%
  p1->effective_uclamp = 50%
Which is in line with how protections should work.
With this change the cgroup and per-task behaviors are the same, as
expected.
Additionally, we remove the confusing relationship between cgroup and
!user_defined flag.
We don't want for example RT tasks that are boosted by default to max to
change their boost value when they attach to a cgroup. If a cgroup wants
to limit the max performance point of tasks attached to it, then
cpu.uclamp.max must be set accordingly.
Or if they want to set different boost value based on cgroup, then
sysctl_sched_util_clamp_min_rt_default must be used to NOT boost to max
and set the right cpu.uclamp.min for each group to let the RT tasks
obtain the desired boost value when attached to that group.
As it stands the dependency on !user_defined flag adds an extra layer of
complexity that is not required now cpu.uclamp.min behaves properly as
a protection.
The propagation model of effective cpu.uclamp.min in child cgroups as
implemented by cpu_util_update_eff() is still correct. The parent
protection sets an upper limit of what the child cgroups will
effectively get.
Fixes: 3eac870a3247 (sched/uclamp: Use TG's clamps to restrict TASK's clamps)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210510145032.1934078-2-qais.yousef@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Util-clamp places tasks in different buckets based on their clamp values
for performance reasons. However, the size of buckets is currently
computed using a rounding division, which can lead to an off-by-one
error in some configurations.
For instance, with 20 buckets, the bucket size will be 1024/20=51. A
task with a clamp of 1024 will be mapped to bucket id 1024/51=20.
Sadly, correct indexes are in the range [0,19], hence leading to an
out-of-bounds memory access.
Clamp the bucket id to fix the issue.
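The fix is a one-line clamp in uclamp_bucket_id() (sketched from the
upstream diff):

  static inline unsigned int uclamp_bucket_id(unsigned int clamp_value)
  {
          /* Ensure clamp_value == 1024 maps to the last valid bucket. */
          return min_t(unsigned int,
                       clamp_value / UCLAMP_BUCKET_DELTA,
                       UCLAMP_BUCKETS - 1);
  }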
Bug: 186415778
Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
Suggested-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20210430151412.160913-1-qperret@google.com
Change-Id: Ibc28662de5554f80f97533b60e747f8a6e871c56
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
cpu_cgroup_css_online() calls cpu_util_update_eff() without holding the
uclamp_mutex or rcu_read_lock() like other call sites, which is
a mistake.
The uclamp_mutex is required to protect against concurrent reads and
writes that could update the cgroup hierarchy.
The rcu_read_lock() is required to traverse the cgroup data structures
in cpu_util_update_eff().
Surround the caller with the required locks and add some asserts to
better document the dependency in cpu_util_update_eff().
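A hedged sketch of the surrounding locks and the documenting assert
(illustrative):

  /* In cpu_cgroup_css_online(): */
  mutex_lock(&uclamp_mutex);
  rcu_read_lock();
  cpu_util_update_eff(css);
  rcu_read_unlock();
  mutex_unlock(&uclamp_mutex);

  /* In cpu_util_update_eff(): */
  lockdep_assert_held(&uclamp_mutex);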
Fixes: 7226017ad37a ("sched/uclamp: Fix a bug in propagating uclamp value in new cgroups")
Reported-by: Quentin Perret <qperret@google.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210510145032.1934078-3-qais.yousef@arm.com
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
When a new cgroup is created, the effective uclamp value wasn't updated
with a call to cpu_util_update_eff(), which looks at the hierarchy and
updates to the most restrictive values.
Fix it by ensuring cpu_util_update_eff() is called when a new cgroup
becomes online.
Without this change, the newly created cgroup uses the default
root_task_group uclamp values, which are 1024 for both uclamp_{min,max},
and that causes the rq to be clamped to max, hence causing the
system to run at max frequency.
The problem was observed on Ubuntu server and was reproduced on Debian
and Buildroot rootfs.
By default, Ubuntu and Debian create a cpu controller cgroup hierarchy
and add all tasks to it - which creates enough noise to keep the rq
uclamp value at max most of the time. Imitating this behavior makes the
problem visible in Buildroot too which otherwise looks fine since it's a
minimal userspace.
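A minimal sketch of the fix (the locking around this call is tightened
by the preceding patch in this series):

  static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
  {
          ...
  #ifdef CONFIG_UCLAMP_TASK_GROUP
          /* Propagate the effective uclamp values for the new group. */
          cpu_util_update_eff(css);
  #endif
          ...
  }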
Bug: 120440300
Fixes: 0b60ba2dd342 ("sched/uclamp: Propagate parent clamps")
Reported-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Doug Smythies <dsmythies@telus.net>
Link: https://lore.kernel.org/lkml/000701d5b965$361b6c60$a2524520$@net/
(cherry picked from commit 7226017ad37a888915628e59a84a2d1e57b40707
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I9636c60e04d58bbfc5041df1305b34a12b5a3f46
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
task_fits_capacity() drives CPU selection at wakeup time, and is also used
to detect misfit tasks. Right now it does so by comparing task_util_est()
with a CPU's capacity, but doesn't take into account uclamp restrictions.
There are a few interesting uses that can come out of doing this. For
instance, a low uclamp.max value could prevent certain tasks from being
flagged as misfit tasks, so they could merrily remain on low-capacity CPUs.
Similarly, a high uclamp.min value would steer tasks towards high capacity
CPUs at wakeup (and, should that fail, later steered via misfit balancing),
so such "boosted" tasks would favor CPUs of higher capacity.
Introduce uclamp_task_util() and make task_fits_capacity() use it.
[QP: fixed missing dependency on fits_capacity() by using the open coded
alternative]
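A sketch of the new helper and its use (close to the upstream patch;
the backport open-codes fits_capacity() as noted above):

  static inline unsigned long uclamp_task_util(struct task_struct *p)
  {
          return clamp(task_util_est(p),
                       uclamp_eff_value(p, UCLAMP_MIN),
                       uclamp_eff_value(p, UCLAMP_MAX));
  }

  /* task_fits_capacity() then compares the clamped utilization: */
  return fits_capacity(uclamp_task_util(p), capacity);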
Bug: 120440300
Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191211113851.24241-5-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit a7008c07a568278ed2763436404752a98004c7ff)
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: Iabde2eda7252c3bcc273e61260a7a12a7de991b1
While calculating the utilization of a CPU during task placement in
fbt(), the current code doesn't take uclamp into account, which can
lead to selection of an incorrect CPU for the task when uclamp
restrictions are in place for the task.
Change-Id: I8371affe3b37733d222e5c57953e53f91fc19a53
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
In case the user wants to stop controlling a uclamp constraint value
for a task, use the magic value -1 in sched_util_{min,max} with the
appropriate sched_flags (SCHED_FLAG_UTIL_CLAMP_{MIN,MAX}) to indicate
the reset.
The advantage over the 'additional flag' approach (i.e. introducing
SCHED_FLAG_UTIL_CLAMP_RESET) is that no additional flag has to be
exported via uapi. This avoids the need to document how this new flag
has to be used in conjunction with the existing uclamp related flags.
The following subtle issue is fixed as well. When a uclamp constraint
value is set on a !user_defined uclamp_se it is currently first reset
and then set.
Fix this by AND'ing !user_defined with !SCHED_FLAG_UTIL_CLAMP which
stands for the 'sched class change' case.
The related condition 'if (uc_se->user_defined)' was moved from
__setscheduler_uclamp() into uclamp_reset().
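A hedged user-space sketch of the reset (illustrative; sched_setattr()
has no glibc wrapper, so it goes through syscall(2)):

  struct sched_attr attr = {
          .size           = sizeof(attr),
          .sched_flags    = SCHED_FLAG_KEEP_ALL |
                            SCHED_FLAG_UTIL_CLAMP_MIN,
          .sched_util_min = -1,   /* magic value: stop controlling uclamp_min */
  };

  syscall(SYS_sched_setattr, pid, &attr, 0);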
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Yun Hsiang <hsiang023167@gmail.com>
Link: https://lkml.kernel.org/r/20201113113454.25868-1-dietmar.eggemann@arm.com
The uclamp_mutex lock is initialized statically via DEFINE_MUTEX(), so
it is unnecessary to initialize it at runtime via mutex_init().
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20200725085629.98292-1-miaoqinglang@huawei.com
Signed-off-by: RuRuTiaSaMa <1009087450@qq.com>
When RT Capacity Aware support was added, the logic in select_task_rq_rt
was modified to force a search for a fitting CPU if the task currently
doesn't run on one.
But if the search failed, and the search was only triggered to fulfill
the fitness request, we could end up selecting a new CPU unnecessarily.
Fix this and re-instate the original behavior by ensuring we bail out
in that case.
This behavior change only affected asymmetric systems that are using
util_clamp to implement capacity awareness. Non-asymmetric systems
weren't affected.
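A sketch of the bail-out, close to the upstream diff:

  /*
   * Bail out if we were forced to migrate to an unfitting CPU:
   * keep the task where it is instead of moving it needlessly.
   */
  if (!test && target != -1 && !rt_task_fits_capacity(p, target))
          goto out_unlock;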
Bug: 120440300
LINK: https://lore.kernel.org/lkml/20200218041620.GD28029@codeaurora.org/
Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-3-qais.yousef@arm.com
(cherry picked from commit b28bc1e002c23ff8a4999c4a2fb1d4d412bc6f5e
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I670ab7f95a3bd8b4790e1cafe89308ead524367e
In task_woken_rt() and switched_to_rt() we try to trigger push-pull if
the task is unfit.
But the logic is found lacking because if the task was the only one
running on the CPU, then rt_rq is not in an overloaded state and won't
trigger a push.
The necessity of this logic was under a debate as well, a summary of
the discussion can be found in the following thread:
https://lore.kernel.org/lkml/20200226160247.iqvdakiqbakk2llz@e107158-lin.cambridge.arm.com/
Remove the logic for now until a better approach is agreed upon.
Bug: 120440300
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-6-qais.yousef@arm.com
(cherry picked from commit d94a9df49069ba8ff7c4aaeca1229e6471a01a15
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: Id120ada4a89972b3feb8d8b022babb98db1a157f
When RT Capacity Awareness was implemented, the logic was done such
that if a task was running on a fitting CPU, then it was sticky and we
would try our best to keep it there.
But as Steve suggested, to adhere to the strict priority rules of the RT
class, allow pulling an RT task to an unfitting CPU to ensure it gets a
chance to run ASAP.
Bug: 120440300
LINK: https://lore.kernel.org/lkml/20200203111451.0d1da58f@oasis.local.home/
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-5-qais.yousef@arm.com
(cherry picked from commit 98ca645f824301bde72e0a51cdc8bdbbea6774a5
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
[Trivial merge conflict]
Change-Id: Ie25fa5a4f3b0979ed06df8d156e5586b2928479e
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
If we fail to find a fitting CPU in cpupri_find(), we only fall back
to the level we found a hit at.
But Steve suggested falling back to a second full scan instead, as this
could be a better effort.
https://lore.kernel.org/lkml/20200304135404.146c56eb@gandalf.local.home/
We trigger the 2nd search unconditionally since the argument for
triggering a full search is that the recorded fallback level might have
become empty by then, which means storing any data about what happened
would be meaningless and stale.
I had a humble try at timing it and it seemed okay for the small 6-CPU
system I was running on:
https://lore.kernel.org/lkml/20200305124324.42x6ehjxbnjkklnh@e107158-lin.cambridge.arm.com/
On large systems this second full scan could be expensive. But there
are no users outside capacity awareness for this fitness function at
the moment. Heterogeneous systems tend to be small, with 8 cores in
total.
Bug: 120440300
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20200310142219.syxzn5ljpdxqtbgx@e107158-lin.cambridge.arm.com
(cherry picked from commit e94f80f6c49020008e6fa0f3d4b806b8595d17d8
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: Ib20d400be47cd913a43a5c71fafee6a7fffb78aa
Introduce a new cpupri_find_fitness() function that takes the
fitness_fn as an argument and is only called when the asym_system
static key is enabled.
cpupri_find() is now a wrapper function that calls cpupri_find_fitness()
passing NULL as the fitness_fn, hence disabling the logic that handles
fitness by default.
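A sketch of the wrapper, close to the upstream patch:

  int cpupri_find(struct cpupri *cp, struct task_struct *p,
                  struct cpumask *lowest_mask)
  {
          /* NULL fitness_fn: plain priority search, no fitness logic. */
          return cpupri_find_fitness(cp, p, lowest_mask, NULL);
  }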
Bug: 120440300
LINK: https://lore.kernel.org/lkml/c0772fca-0a4b-c88d-fdf2-5715fcf8447b@arm.com/
Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-4-qais.yousef@arm.com
(cherry picked from commit a1bd02e1f28b1939cac8c64072a0e578c3cbc345
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I8ad4d9e391030ae499f7a1805485147de64abcdf
When searching for the best lowest_mask with a fitness_fn passed, make
sure we record the lowest_level that returns a valid lowest_mask so that
we can use that as a fallback in case we fail to find a fitting CPU at
all levels.
The intention in the original patch was not to allow a down migration
to an unfitting CPU. But this missed the case where we are already
running on an unfitting one.
With this change RT tasks can now move between unfitting CPUs when
they're already running on such a CPU.
And as Steve suggested, to adhere to the strict priority rules of RT:
if a task is already running on a fitting CPU but can't run on it due
to priority, allow it to downmigrate to an unfitting CPU so it can run.
Bug: 120440300
Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-2-qais.yousef@arm.com
Link: https://lore.kernel.org/lkml/20200203142712.a7yvlyo2y3le5cpn@e107158-lin/
(cherry picked from commit d9cb236b9429044dc694ea70a50163ddd283cea6
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
[Trivial merge conflict]
Change-Id: I3430e9624f8f7b11d3875c39c5765a51aec4a6f5
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Capacity Awareness refers to the fact that on heterogeneous systems
(like Arm big.LITTLE), the capacity of the CPUs is not uniform, hence
when placing tasks we need to be aware of this difference of CPU
capacities.
In such scenarios we want to ensure that the selected CPU has enough
capacity to meet the requirement of the running task. Enough capacity
means here that capacity_orig_of(cpu) >= task.requirement.
The definition of task.requirement is dependent on the scheduling class.
For CFS, utilization is used to select a CPU that has a capacity value
>= the cfs_task.util:
capacity_orig_of(cpu) >= cfs_task.util
DL isn't capacity aware at the moment but can make use of the bandwidth
reservation to implement that in a similar manner CFS uses utilization.
The following patchset implements that:
https://lore.kernel.org/lkml/20190506044836.2914-1-luca.abeni@santannapisa.it/
capacity_orig_of(cpu)/SCHED_CAPACITY >= dl_deadline/dl_runtime
For RT we don't have a per task utilization signal and we lack any
information in general about what performance requirement the RT task
needs. But with the introduction of uclamp, RT tasks can now control
that by setting uclamp_min to guarantee a minimum performance point.
ATM the uclamp values are only used for frequency selection; but on
heterogeneous systems this is not enough and we need to ensure that the
capacity of the CPU is >= uclamp_min, which is what is implemented here:
capacity_orig_of(cpu) >= rt_task.uclamp_min
Note that by default uclamp.min is 1024, which means that RT tasks will
always be biased towards the big CPUs, which makes for a better, more
predictable behavior for the default case.
Must stress that the bias acts as a hint rather than a definite
placement strategy. For example, if all big cores are busy executing
other RT tasks we can't guarantee that a new RT task will be placed
there.
On non-heterogeneous systems the original behavior of RT should be
retained. Similarly if uclamp is not selected in the config.
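A sketch of the fitness check, close to the upstream helper:

  static inline bool rt_task_fits_capacity(struct task_struct *p, int cpu)
  {
          unsigned int min_cap, max_cap, cpu_cap;

          /* Only heterogeneous systems can benefit from this check. */
          if (!static_branch_unlikely(&sched_asym_cpucapacity))
                  return true;

          min_cap = uclamp_eff_value(p, UCLAMP_MIN);
          max_cap = uclamp_eff_value(p, UCLAMP_MAX);

          cpu_cap = capacity_orig_of(cpu);

          return cpu_cap >= min(min_cap, max_cap);
  }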
[ mingo: Minor edits to comments. ]
Bug: 120440300
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191009104611.15363-1-qais.yousef@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 804d402fb6f6487b825aae8cf42fda6426c62867
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git)
[Qais: resolved minor conflict in kernel/sched/cpupri.c]
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: Ifc9da1c47de1aec9b4d87be2614e4c8968366900
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This is a follow-up of:
BACKPORT: sched/cpufreq, sched/uclamp: Add clamps for FAIR and RT tasks
We excluded the schedutil-related change there since the differences
between the 4.9 and 4.19 schedutil sources were very big; modify
sugov_get_util() here according to the above commit.
Since we now have uclamp_boosted() and uclamp_latency_sensitive(),
which are similar to schedtune_task_boost() and schedtune_prefer_idle()
respectively, use them.
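A hedged sketch of the shape of the change, assuming the uclamp_util()
helper of that era (illustrative, not the exact downstream diff):

  static unsigned long sugov_get_util(struct sugov_cpu *sg_cpu)
  {
          struct rq *rq = cpu_rq(sg_cpu->cpu);
          unsigned long util = cpu_util_cfs(rq) + cpu_util_rt(rq);

          /* Clamp the aggregate utilization with the rq's uclamp limits. */
          return uclamp_util(rq, util);
  }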
Change-Id: Ia88e06b7aff5ae6a6ff54cc99f944064b75fe9cb
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This is a commit that reflects:
ANDROID: sched/fair: EAS: Add uclamp support to find_energy_efficient_cpu()
3a5e1534e0
Change-Id: I4b4a6cd4fcc1b7d4db3c3d96c342a26781dba48d
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Introduce a simple helper to read the latency_sensitive flag from a
task. It is called uclamp_latency_sensitive() to match the API
proposed by Patrick.
While at it, introduce uclamp_boosted() which returns true only when a
task has a non-null min-clamp.
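A hedged sketch of the two helpers (close to the android-common
versions, but treat the details as illustrative):

  static inline bool uclamp_boosted(struct task_struct *p)
  {
          return uclamp_eff_value(p, UCLAMP_MIN) > 0;
  }

  static inline bool uclamp_latency_sensitive(struct task_struct *p)
  {
          struct cgroup_subsys_state *css = task_css(p, cpu_cgrp_id);
          struct task_group *tg;

          if (!css)
                  return false;
          tg = container_of(css, struct task_group, css);

          return tg->latency_sensitive;
  }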
Change-Id: I5fc747da8b58625257a6604a3c88487b657fbe7a
Suggested-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Add a 'latency_sensitive' flag to uclamp in order to express the need
for some tasks to find a CPU where they can wake up quickly. This is not
expected to be used without cgroup support, so add solely a cgroup
interface for it.
As this flag represents a boolean attribute and not an amount of
resources to be shared, it is not clear what the delegation logic should
be. As such, it is kept simple: every new cgroup starts with
latency_sensitive set to false, regardless of the parent.
In essence, this is similar to SchedTune's prefer-idle flag which was
used in android-4.19 and prior.
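A hedged sketch of the cgroup attribute hookup (illustrative; the
handler names are assumptions):

  static struct cftype cpu_files[] = {
          ...
          {
                  .name = "uclamp.latency_sensitive",
                  .flags = CFTYPE_NOT_ON_ROOT,
                  .read_u64 = cpu_uclamp_ls_read_u64,
                  .write_u64 = cpu_uclamp_ls_write_u64,
          },
          ...
  };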
Change-Id: I722d8ecabb428bb7b95a5b54bc70a87f182dde2a
Signed-off-by: Quentin Perret <quentin.perret@arm.com>