812110 Commits

Author SHA1 Message Date
Angelo G. Del Regno
69fb1f355d scripts: Lower kernel gzip compression to fastest
First of all, this is a downstream kernel - always keep that in mind!
Now, this kernel is targeting new *very powerful* Qualcomm platforms
like SM8250 and the Sony Edo platform - which has a very fast UFS card.

Keep in mind that the bootloader sets the CPU at a frequency that is
slightly faster than the "in the middle" ones, which is anyway not
veeeery fast - but that's good, really. I agree.

So.. check this out:   for Image.gz-dtb.....
COMP_LEVEL    SIZE
9             20116171
5	      20220479
2	      20940223
1	      21231290

Remember again that we're loading from a UFS card and that
we are loading ~1.1MB more out of a 20MB file.
If you're smart enough you surely know already about RAM and CPU
overhead of very high compression levels.

If you still disagree with what I just did, read this commit description
another 20 times, or more, until you understand it. :)))

Change-Id: Ic28bff2011b40631fc81b582a25029ac8d12d48e
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-19 03:24:15 -03:00
Jebaitedneko
5a0532fc09 subsys-pil-tz: Use memcpy_toio() for pil_init_image_trusted()
The new optimized memcpy doesn't work well on device memory, and when subsys tries to load any FW, we are met with:

[   11.111213] ueventd: firmware: loading 'cdsp.mdt' for '/devices/platform/soc/8300000.qcom,turing/firmware/cdsp.mdt'
[   11.113128] ueventd: loading /devices/platform/soc/8300000.qcom,turing/firmware/cdsp.mdt took 2ms
[   11.113170] subsys-pil-tz 8300000.qcom,turing: cdsp: loading from 0x0000000099100000 to 0x000000009a500000
[   11.117481] ueventd: firmware: loading 'adsp.mdt' for '/devices/platform/soc/17300000.qcom,lpass/firmware/adsp.mdt'
[   11.117518] Unable to handle kernel paging request at virtual address ffffff801f7c6d5c
[   11.117522] Mem abort info:
[   11.117525]   Exception class = DABT (current EL), IL = 32 bits
[   11.117527]   SET = 0, FnV = 0
[   11.117529]   EA = 0, S1PTW = 0
[   11.117530]   FSC = 33
[   11.117532] Data abort info:
[   11.117534]   ISV = 0, ISS = 0x00000061
[   11.117536]   CM = 0, WnR = 1
[   11.117539] swapper pgtable: 4k pages, 39-bit VAs, pgd = 000000003e4fd651
[   11.117541] [ffffff801f7c6d5c] *pgd=00000001f8883003, *pud=00000001f8883003, *pmd=00000001f30fe003, *pte=00680000fd475703
[   11.117547] Internal error: Oops: 96000061 [#1] PREEMPT SMP
[   11.117551] Modules linked in:
[   11.117554] Process init (pid: 568, stack limit = 0x000000005be89f40)
[   11.117558] CPU: 4 PID: 568 Comm: init Tainted: G S      W       4.14.239-MOCHI #1
[   11.117560] Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 VAYU (DT)
[   11.117562] task: 00000000e65c9d8d task.stack: 000000005be89f40
[   11.117570] pc : memcpy+0x188/0x2a0
[   11.117577] lr : pil_init_image_trusted+0x130/0x234
[   11.117579] sp : ffffff80229937c0 pstate : 80000145
[   11.117581] x29: ffffff80229937c0 x28: ffffffeaebf93a00
[   11.117584] x27: ffffff8599334000 x26: 0000000000000040
[   11.117586] x25: ffffff859a0aeaa0 x24: ffffff859a63a000
[   11.117589] x23: ffffffeaf557d880 x22: ffffff801f7c5000
[   11.117592] x21: 0000000000001d9c x20: ffffff801f65d000
[   11.117594] x19: ffffff859933cb08 x18: 0000007c2b138000
[   11.117597] x17: 0000007ebf1ef1e4 x16: ffffff8597c02274
[   11.117599] x15: ffffffffffffffff x14: ffffffffffffffff
[   11.117602] x13: ffffffffffffffff x12: ffffffffffffffff
[   11.117604] x11: ffffffffffffffff x10: ffffffffffffffff
[   11.117607] x9 : ffffffffffffffff x8 : ffffffffffffffff
[   11.117609] x7 : ffffffffffffffff x6 : ffffffffffffffff
[   11.117612] x5 : ffffff801f7c6d9c x4 : ffffff801f65ed9c
[   11.117614] x3 : ffffff801f7c6d40 x2 : ffffffffffffffcc
[   11.117617] x1 : ffffff801f65ed80 x0 : ffffff801f7c5000
[   11.117620]
[   11.117620] PC: 0xffffff8598c00e28:
[   11.117622] 0e28  a9022468 a9422428 a9032c6a a9432c2a a984346c a9c4342c f1010042 54fffee8
[   11.117628] 0e48  a97c3c8e a9011c66 a97d1c86 a9022468 a97e2488 a9032c6a a97f2c8a a904346c
[   11.117634] 0e68  a93c3cae a93d1ca6 a93e24a8 a93f2caa d65f03c0 d503201f a97f348c 92400cae
[   11.117639] 0e88  cb0e0084 cb0e0042 a97f1c86 a93f34ac a97e2488 a97d2c8a a9fc348c cb0e00a5
[   11.117645]
[   11.117645] LR: 0xffffff8597ef067c:
[   11.117647] 067c  97eeeccc f9400265 b4fffe65 52801803 910143e2 aa1503e1 910183e0 d2804004
[   11.117652] 069c  72a02803 d63f00a0 aa0003f6 17ffffe9 aa1403e1 aa1503e2 aa1603e0 9434418a
[   11.117658] 06bc  97ff9590 72001c1f b9404ae1 f9402be0 54000241 910133e4 910163e2 d2800085
[   11.117663] 06dc  d2800103 290b03e1 52800021 52800040 97ff9657 2a0003f3 f9413bf4 f9402bf7
[   11.117669]
[   11.117669] SP: 0xffffff8022993780:
[   11.117671] 3780  98c00e68 ffffff85 80000145 00000000 9a0aeaa0 ffffff85 00000040 00000000
[   11.117677] 37a0  ffffffff 0000007f 97ef0644 ffffff85 229937c0 ffffff80 98c00e68 ffffff85
[   11.117682] 37c0  22993ba0 ffffff80 97eeec4c ffffff85 f557d918 ffffffea 00000000 00000000
[   11.117688] 37e0  f6264460 ffffffea f6264400 ffffffea f55b9480 ffffffea 97b704d8 ffffff85
[   11.117693]
[   11.117695] Call trace:
[   11.117698]  memcpy+0x188/0x2a0
[   11.117701]  pil_boot+0x358/0x730
[   11.117704]  subsys_powerup+0x28/0x30
[   11.117709]  subsys_start+0x38/0x134
[   11.117711]  __subsystem_get+0xb0/0x11c
[   11.117713]  subsystem_get+0x10/0x18
[   11.117716]  cdsp_loader_do.isra.0+0xe4/0x1a8
[   11.117718]  cdsp_boot_store+0x8c/0x168
[   11.117722]  kobj_attr_store+0x14/0x24
[   11.117726]  sysfs_kf_write+0x34/0x44
[   11.117729]  kernfs_fop_write+0x118/0x184
[   11.117733]  __vfs_write+0x2c/0xd8
[   11.117735]  vfs_write+0x80/0xec
[   11.117738]  SyS_write+0x54/0xac
[   11.117741]  el0_svc_naked+0x34/0x38
[   11.117744] Code: a97e2488 a9032c6a a97f2c8a a904346c (a93c3cae)
[   11.117747] ---[ end trace fc45fc8b1fa34513 ]---

Fixes booting with the new optimized memcpy routine.

Change-Id: Ia95b740ec0ce2fd90e7eea5d5ee162d26f354179
Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-18 23:08:26 -03:00
Robin Murphy
0e51f2e288 arm64: lib: Import latest memcpy()/memmove() implementation
Import the latest implementation of memcpy(), based on the
upstream code of string/aarch64/memcpy.S at commit afd6244 from
https://github.com/ARM-software/optimized-routines, and subsuming
memmove() in the process.

Note that for simplicity Arm have chosen to contribute this code
to Linux under GPLv2 rather than the original MIT license.

Note also that the needs of the usercopy routines vs. regular memcpy()
have now diverged so far that we abandon the shared template idea
and the damage which that incurred to the tuning of LDP/STP loops.
We'll be back to tackle those routines separately in future.

Link: https://lore.kernel.org/r/3c953af43506581b2422f61952261e76949ba711.1622128527.git.robin.murphy@arm.com
Change-Id: I78b4d7bf65b1a4eebf509b087d0120b0f99e51c4
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-18 23:08:21 -03:00
EmanuelCN
6993932a60 schedutil: Implement tapered dvfs_headroom
Inspired by: LineageOS/android_kernel_google_gs201@752c5f9

Change-Id: I2426f750416cbf9a7cb6876bcd386ae4c40825ca
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-17 04:17:29 -03:00
Samuel Pascua
d9f0298279 schedutil: Use map_util_freq()
Change-Id: If9cf1b47dee3b9bd0663c88034da8edc98bd28f6
Signed-off-by: Samuel Pascua <pascua.samuel.14@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-17 04:17:21 -03:00
Samuel Pascua
a48f50b6bf cpufreq: Add map_util_freq()
Change-Id: Ibe3ac6ab685c97874fd381482a1e0cbd60fba806
Signed-off-by: Samuel Pascua <pascua.samuel.14@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-17 04:17:21 -03:00
Wei Wang
0942717376 defconfig: Enable CONFIG_FAIR_GROUP_SCHED
Enable CONFIG_FAIR_GROUP_SCHED with proper tuning can help prioritize
important work in Android. The feature was taken off due to improper
setting on the background tasks' share. Now Tasks in root group has
already moved into a newly created system subgroup, so the shares can be
properly set.

+----------------------------------------------------------------------------------------+
| Cold App Launch Time (* w/ prio120 8 threads running in root cpuset)                   |
+----------------------------------------+--------+------+---------+-----------+---------+
|                                        | chrome | maps | youtube | playstore | setting |
+----------------------------------------+--------+------+---------+-----------+---------+
| No CONFIG_FAIR_GROUP_SCHED support(*)  |    591 | 1314 |     887 |      1952 |     551 |
+----------------------------------------+--------+------+---------+-----------+---------+
| CONFIG_FAIR_GROUP_SCHED w/ 1% limit(*) |    567 |  637 |     668 |      1450 |     529 |
+----------------------------------------+--------+------+---------+-----------+---------+
| No stress runnning (best case)         |    416 |  463 |     484 |      1075 |     363 |
+----------------------------------------+--------+------+---------+-----------+---------+

xNombre: It is needed by UClamp

Bug: 171740453
Test: Build and boot
Change-Id: Ibb7e48c93136e3967da6381d7c0c94d0cdaee443
Signed-off-by: Wei Wang <wvw@google.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-17 04:17:21 -03:00
Samuel Pascua
da52819107 defconfig: Switch to UClamp
Change-Id: I10a70d27d1f7a4baf2b697ee560bfe39ce29e774
Signed-off-by: Samuel Pascua <pascua.samuel.14@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-17 02:39:52 -03:00
Qais Yousef
de8ca9bcc7 sched/uclamp: Fix rq->uclamp_max not set on first enqueue
[ Upstream commit 315c4f884800c45cb6bd8c90422fad554a8b9588 ]

Commit d81ae8aac85c ("sched/uclamp: Fix initialization of struct
uclamp_rq") introduced a bug where uclamp_max of the rq is not reset to
match the woken up task's uclamp_max when the rq is idle.

The code was relying on rq->uclamp_max initialized to zero, so on first
enqueue

	static inline void uclamp_rq_inc_id(struct rq *rq, struct task_struct *p,
					    enum uclamp_id clamp_id)
	{
		...

		if (uc_se->value > READ_ONCE(uc_rq->value))
			WRITE_ONCE(uc_rq->value, uc_se->value);
	}

was actually resetting it. But since commit d81ae8aac85c changed the
default to 1024, this no longer works. And since rq->uclamp_flags is
also initialized to 0, neither above code path nor uclamp_idle_reset()
update the rq->uclamp_max on first wake up from idle.

This is only visible from first wake up(s) until the first dequeue to
idle after enabling the static key. And it only matters if the
uclamp_max of this task is < 1024 since only then its uclamp_max will be
effectively ignored.

Fix it by properly initializing rq->uclamp_flags = UCLAMP_FLAG_IDLE to
ensure uclamp_idle_reset() is called which then will update the rq
uclamp_max value as expected.

Fixes: d81ae8aac85c ("sched/uclamp: Fix initialization of struct uclamp_rq")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20211202112033.1705279-1-qais.yousef@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:43 -03:00
Quentin Perret
56ed3a51c0 sched: Fix UCLAMP_FLAG_IDLE setting
The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last
uclamp active task (that is, when buckets.tasks reaches 0 for all
buckets) to maintain the last uclamp.max and prevent blocked util from
suddenly becoming visible.

However, there is an asymmetry in how the flag is set and cleared which
can lead to having the flag set whilst there are active tasks on the rq.
Specifically, the flag is cleared in the uclamp_rq_inc() path, which is
called at enqueue time, but set in uclamp_rq_dec_id() which is called
both when dequeueing a task _and_ in the update_uclamp_active() path. As
a result, when both uclamp_rq_{dec,ind}_id() are called from
update_uclamp_active(), the flag ends up being set but not cleared,
hence leaving the runqueue in a broken state.

Fix this by clearing the flag in update_uclamp_active() as well.

Fixes: e496187da710 ("sched/uclamp: Enforce last task's UCLAMP_MAX")
Reported-by: Rick Yiu <rickyiu@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20210805102154.590709-2-qperret@google.com
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:43 -03:00
Quentin Perret
89fd7c7fa5 ANDROID: sched: Make uclamp changes depend on CAP_SYS_NICE
There is currently nothing preventing tasks from changing their per-task
clamp values in anyway that they like. The rationale is probably that
system administrators are still able to limit those clamps thanks to the
cgroup interface. However, this causes pain in a system where both
per-task and per-cgroup clamp values are expected to be under the
control of core system components (as is the case for Android).

To fix this, let's require CAP_SYS_NICE to change per-task clamp values.
There are ongoing discussions upstream about more flexible approaches
than this using the RLIMIT API -- see [1]. But the upstream discussion
has not converged yet, and this is way too late for UAPI changes in
android12-5.10 anyway, so let's apply this change which provides the
behaviour we want without actually impacting UAPIs.

[1] https://lore.kernel.org/lkml/20210623123441.592348-4-qperret@google.com/

Bug: 187186685
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: I749312a77306460318ac5374cf243d00b78120dd
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:43 -03:00
Xuewen Yan
1d6a30daff sched/uclamp: Ignore max aggregation if rq is idle
[ Upstream commit 3e1493f46390618ea78607cb30c58fc19e2a5035 ]

When a task wakes up on an idle rq, uclamp_rq_util_with() would max
aggregate with rq value. But since there is no task enqueued yet, the
values are stale based on the last task that was running. When the new
task actually wakes up and enqueued, then the rq uclamp values should
reflect that of the newly woken up task effective uclamp values.

This is a problem particularly for uclamp_max because it default to
1024. If a task p with uclamp_max = 512 wakes up, then max aggregation
would ignore the capping that should apply when this task is enqueued,
which is wrong.

Fix that by ignoring max aggregation if the rq is idle since in that
case the effective uclamp value of the rq will be the ones of the task
that will wake up.

Fixes: 9d20ad7dfc9a ("sched/uclamp: Add uclamp_util_with()")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
[qias: Changelog]
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Link: https://lore.kernel.org/r/20210630141204.8197-1-xuewen.yan94@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:43 -03:00
Qais Yousef
4683c606d0 sched/uclamp: Fix uclamp_tg_restrict()
[ Upstream commit 0213b7083e81f4acd69db32cb72eb4e5f220329a ]

Now cpu.uclamp.min acts as a protection, we need to make sure that the
uclamp request of the task is within the allowed range of the cgroup,
that is it is clamp()'ed correctly by tg->uclamp[UCLAMP_MIN] and
tg->uclamp[UCLAMP_MAX].

As reported by Xuewen [1] we can have some corner cases where there's
inversion between uclamp requested by task (p) and the uclamp values of
the taskgroup it's attached to (tg). Following table demonstrates
2 corner cases:

	           |  p  |  tg  |  effective
	-----------+-----+------+-----------
	CASE 1
	-----------+-----+------+-----------
	uclamp_min | 60% | 0%   |  60%
	-----------+-----+------+-----------
	uclamp_max | 80% | 50%  |  50%
	-----------+-----+------+-----------
	CASE 2
	-----------+-----+------+-----------
	uclamp_min | 0%  | 30%  |  30%
	-----------+-----+------+-----------
	uclamp_max | 20% | 50%  |  20%
	-----------+-----+------+-----------

With this fix we get:

	           |  p  |  tg  |  effective
	-----------+-----+------+-----------
	CASE 1
	-----------+-----+------+-----------
	uclamp_min | 60% | 0%   |  50%
	-----------+-----+------+-----------
	uclamp_max | 80% | 50%  |  50%
	-----------+-----+------+-----------
	CASE 2
	-----------+-----+------+-----------
	uclamp_min | 0%  | 30%  |  30%
	-----------+-----+------+-----------
	uclamp_max | 20% | 50%  |  30%
	-----------+-----+------+-----------

Additionally uclamp_update_active_tasks() must now unconditionally
update both UCLAMP_MIN/MAX because changing the tg's UCLAMP_MAX for
instance could have an impact on the effective UCLAMP_MIN of the tasks.

	           |  p  |  tg  |  effective
	-----------+-----+------+-----------
	old
	-----------+-----+------+-----------
	uclamp_min | 60% | 0%   |  50%
	-----------+-----+------+-----------
	uclamp_max | 80% | 50%  |  50%
	-----------+-----+------+-----------
	*new*
	-----------+-----+------+-----------
	uclamp_min | 60% | 0%   | *60%*
	-----------+-----+------+-----------
	uclamp_max | 80% |*70%* | *70%*
	-----------+-----+------+-----------

[1] https://lore.kernel.org/lkml/CAB8ipk_a6VFNjiEnHRHkUMBKbA+qzPQvhtNjJ_YNzQhqV_o8Zw@mail.gmail.com/

Fixes: 0c18f2ecfcc2 ("sched/uclamp: Fix wrong implementation of cpu.uclamp.min")
Reported-by: Xuewen Yan <xuewen.yan94@gmail.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210617165155.3774110-1-qais.yousef@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:43 -03:00
Qais Yousef
6756e1b42a sched/uclamp: Fix wrong implementation of cpu.uclamp.min
[ Upstream commit 0c18f2ecfcc274a4bcc1d122f79ebd4001c3b445 ]

cpu.uclamp.min is a protection as described in cgroup-v2 Resource
Distribution Model

	Documentation/admin-guide/cgroup-v2.rst

which means we try our best to preserve the minimum performance point of
tasks in this group. See full description of cpu.uclamp.min in the
cgroup-v2.rst.

But the current implementation makes it a limit, which is not what was
intended.

For example:

	tg->cpu.uclamp.min = 20%

	p0->uclamp[UCLAMP_MIN] = 0
	p1->uclamp[UCLAMP_MIN] = 50%

	Previous Behavior (limit):

		p0->effective_uclamp = 0
		p1->effective_uclamp = 20%

	New Behavior (Protection):

		p0->effective_uclamp = 20%
		p1->effective_uclamp = 50%

Which is inline with how protections should work.

With this change the cgroup and per-task behaviors are the same, as
expected.

Additionally, we remove the confusing relationship between cgroup and
!user_defined flag.

We don't want for example RT tasks that are boosted by default to max to
change their boost value when they attach to a cgroup. If a cgroup wants
to limit the max performance point of tasks attached to it, then
cpu.uclamp.max must be set accordingly.

Or if they want to set different boost value based on cgroup, then
sysctl_sched_util_clamp_min_rt_default must be used to NOT boost to max
and set the right cpu.uclamp.min for each group to let the RT tasks
obtain the desired boost value when attached to that group.

As it stands the dependency on !user_defined flag adds an extra layer of
complexity that is not required now cpu.uclamp.min behaves properly as
a protection.

The propagation model of effective cpu.uclamp.min in child cgroups as
implemented by cpu_util_update_eff() is still correct. The parent
protection sets an upper limit of what the child cgroups will
effectively get.

Fixes: 3eac870a3247 (sched/uclamp: Use TG's clamps to restrict TASK's clamps)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210510145032.1934078-2-qais.yousef@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:43 -03:00
Quentin Perret
f871554be4 FROMLIST: sched: Fix out-of-bound access in uclamp
Util-clamp places tasks in different buckets based on their clamp values
for performance reasons. However, the size of buckets is currently
computed using a rounding division, which can lead to an off-by-one
error in some configurations.

For instance, with 20 buckets, the bucket size will be 1024/20=51. A
task with a clamp of 1024 will be mapped to bucket id 1024/51=20. Sadly,
correct indexes are in range [0,19], hence leading to an out of bound
memory access.

Clamp the bucket id to fix the issue.

Bug: 186415778
Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
Suggested-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20210430151412.160913-1-qperret@google.com
Change-Id: Ibc28662de5554f80f97533b60e747f8a6e871c56
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:43 -03:00
Qais Yousef
860c8f0032 sched/uclamp: Fix locking around cpu_util_update_eff()
cpu_cgroup_css_online() calls cpu_util_update_eff() without holding the
uclamp_mutex or rcu_read_lock() like other call sites, which is
a mistake.

The uclamp_mutex is required to protect against concurrent reads and
writes that could update the cgroup hierarchy.

The rcu_read_lock() is required to traverse the cgroup data structures
in cpu_util_update_eff().

Surround the caller with the required locks and add some asserts to
better document the dependency in cpu_util_update_eff().

Fixes: 7226017ad37a ("sched/uclamp: Fix a bug in propagating uclamp value in new cgroups")
Reported-by: Quentin Perret <qperret@google.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210510145032.1934078-3-qais.yousef@arm.com
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:43 -03:00
Qais Yousef
4b629cbf3a FROMGIT: sched/uclamp: Fix a bug in propagating uclamp value in new cgroups
When a new cgroup is created, the effective uclamp value wasn't updated
with a call to cpu_util_update_eff() that looks at the hierarchy and
update to the most restrictive values.

Fix it by ensuring to call cpu_util_update_eff() when a new cgroup
becomes online.

Without this change, the newly created cgroup uses the default
root_task_group uclamp values, which is 1024 for both uclamp_{min, max},
which will cause the rq to to be clamped to max, hence cause the
system to run at max frequency.

The problem was observed on Ubuntu server and was reproduced on Debian
and Buildroot rootfs.

By default, Ubuntu and Debian create a cpu controller cgroup hierarchy
and add all tasks to it - which creates enough noise to keep the rq
uclamp value at max most of the time. Imitating this behavior makes the
problem visible in Buildroot too which otherwise looks fine since it's a
minimal userspace.

Bug: 120440300
Fixes: 0b60ba2dd342 ("sched/uclamp: Propagate parent clamps")
Reported-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Doug Smythies <dsmythies@telus.net>
Link: https://lore.kernel.org/lkml/000701d5b965$361b6c60$a2524520$@net/
(cherry picked from commit 7226017ad37a888915628e59a84a2d1e57b40707
 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I9636c60e04d58bbfc5041df1305b34a12b5a3f46
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:43 -03:00
Qais Yousef
babc23d9b6 UPSTREAM: sched/uclamp: Fix incorrect condition
uclamp_update_active() should perform the update when
p->uclamp[clamp_id].active is true. But when the logic was inverted in
[1], the if condition wasn't inverted correctly too.

[1] https://lore.kernel.org/lkml/20190902073836.GO2369@hirez.programming.kicks-ass.net/

Bug: 120440300
Reported-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Patrick Bellasi <patrick.bellasi@matbug.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: babbe170e053 ("sched/uclamp: Update CPU's refcount on TG's clamp changes")
Link: https://lkml.kernel.org/r/20191114211052.15116-1-qais.yousef@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 6e1ff0773f49c7d38e8b4a9df598def6afb9f415)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I51b58a6089290277e08a0aaa72b86f852eec1512
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:42 -03:00
Valentin Schneider
862c9ce4e6 BACKPORT: sched/fair: Make task_fits_capacity() consider uclamp restrictions
task_fits_capacity() drives CPU selection at wakeup time, and is also used
to detect misfit tasks. Right now it does so by comparing task_util_est()
with a CPU's capacity, but doesn't take into account uclamp restrictions.

There's a few interesting uses that can come out of doing this. For
instance, a low uclamp.max value could prevent certain tasks from being
flagged as misfit tasks, so they could merrily remain on low-capacity CPUs.
Similarly, a high uclamp.min value would steer tasks towards high capacity
CPUs at wakeup (and, should that fail, later steered via misfit balancing),
so such "boosted" tasks would favor CPUs of higher capacity.

Introduce uclamp_task_util() and make task_fits_capacity() use it.

[QP: fixed missing dependency on fits_capacity() by using the open coded
alternative]

Bug: 120440300
Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191211113851.24241-5-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit a7008c07a568278ed2763436404752a98004c7ff)
Signed-off-by: Quentin Perret <qperret@google.com>
Change-Id: Iabde2eda7252c3bcc273e61260a7a12a7de991b1
2024-12-16 14:46:42 -03:00
Satya Durga Srinivasu Prabhala
1a15c7c1b3 sched/fair: honor uclamp restrictions in fbt()
While calculating untilization of CPU during task placement in fbt(),
current code doesn't take uclamp into account which would lead to
selection of incorrect CPU for the task when uclamp restrictions
are in place for the task.

Change-Id: I8371affe3b37733d222e5c57953e53f91fc19a53
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2024-12-16 14:46:42 -03:00
Dietmar Eggemann
6bc2a06eae sched/uclamp: Allow to reset a task uclamp constraint value
In case the user wants to stop controlling a uclamp constraint value
for a task, use the magic value -1 in sched_util_{min,max} with the
appropriate sched_flags (SCHED_FLAG_UTIL_CLAMP_{MIN,MAX}) to indicate
the reset.

The advantage over the 'additional flag' approach (i.e. introducing
SCHED_FLAG_UTIL_CLAMP_RESET) is that no additional flag has to be
exported via uapi. This avoids the need to document how this new flag
has be used in conjunction with the existing uclamp related flags.

The following subtle issue is fixed as well. When a uclamp constraint
value is set on a !user_defined uclamp_se it is currently first reset
and then set.
Fix this by AND'ing !user_defined with !SCHED_FLAG_UTIL_CLAMP which
stands for the 'sched class change' case.
The related condition 'if (uc_se->user_defined)' moved from
__setscheduler_uclamp() into uclamp_reset().

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Yun Hsiang <hsiang023167@gmail.com>
Link: https://lkml.kernel.org/r/20201113113454.25868-1-dietmar.eggemann@arm.com
2024-12-16 14:46:42 -03:00
Qinglang Miao
5e192458aa sched/uclamp: Remove unnecessary mutex_init()
The uclamp_mutex lock is initialized statically via DEFINE_MUTEX(),
it is unnecessary to initialize it runtime via mutex_init().

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20200725085629.98292-1-miaoqinglang@huawei.com
Signed-off-by: RuRuTiaSaMa <1009087450@qq.com>
2024-12-16 14:46:42 -03:00
Hridaya Prajapati
c48b14d697 sched/cpupri: Checkout changes from redbull
Branch: android-msm-redbull-4.19-u-beta5.3

Change-Id: I0283b176f3308459473973ca7df4eee2db1ca644
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-16 14:46:39 -03:00
Qais Yousef
cce1c19561 FROMGIT: sched/rt: Re-instate old behavior in select_task_rq_rt()
When RT Capacity Aware support was added, the logic in select_task_rq_rt
was modified to force a search for a fitting CPU if the task currently
doesn't run on one.

But if the search failed, and the search was only triggered to fulfill
the fitness request; we could end up selecting a new CPU unnecessarily.

Fix this and re-instate the original behavior by ensuring we bail out
in that case.

This behavior change only affected asymmetric systems that are using
util_clamp to implement capacity aware. None asymmetric systems weren't
affected.

Bug: 120440300
LINK: https://lore.kernel.org/lkml/20200218041620.GD28029@codeaurora.org/
Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-3-qais.yousef@arm.com
(cherry picked from commit b28bc1e002c23ff8a4999c4a2fb1d4d412bc6f5e
 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I670ab7f95a3bd8b4790e1cafe89308ead524367e
2024-12-16 14:46:39 -03:00
Qais Yousef
8c6e539b2e FROMGIT: sched/rt: Remove unnecessary push for unfit tasks
In task_woken_rt() and switched_to_rto() we try trigger push-pull if the
task is unfit.

But the logic is found lacking because if the task was the only one
running on the CPU, then rt_rq is not in overloaded state and won't
trigger a push.

The necessity of this logic was under a debate as well, a summary of
the discussion can be found in the following thread:

  https://lore.kernel.org/lkml/20200226160247.iqvdakiqbakk2llz@e107158-lin.cambridge.arm.com/

Remove the logic for now until a better approach is agreed upon.

Bug: 120440300
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-6-qais.yousef@arm.com
(cherry picked from commit d94a9df49069ba8ff7c4aaeca1229e6471a01a15
 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: Id120ada4a89972b3feb8d8b022babb98db1a157f
2024-12-16 14:46:39 -03:00
Qais Yousef
9750a5f2d3 BACKPORT: FROMGIT: sched/rt: Allow pulling unfitting task
When implemented RT Capacity Awareness; the logic was done such that if
a task was running on a fitting CPU, then it was sticky and we would try
our best to keep it there.

But as Steve suggested, to adhere to the strict priority rules of RT
class; allow pulling an RT task to unfitting CPU to ensure it gets a
chance to run ASAP.

Bug: 120440300
LINK: https://lore.kernel.org/lkml/20200203111451.0d1da58f@oasis.local.home/
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-5-qais.yousef@arm.com
(cherry picked from commit 98ca645f824301bde72e0a51cdc8bdbbea6774a5
 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
[Trivial merge conflict]
Change-Id: Ie25fa5a4f3b0979ed06df8d156e5586b2928479e
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-16 14:46:33 -03:00
Qais Yousef
d26b0237e0 FROMGIT: sched/rt: cpupri_find: Trigger a full search as fallback
If we failed to find a fitting CPU, in cpupri_find(), we only fallback
to the level we found a hit at.

But Steve suggested to fallback to a second full scan instead as this
could be a better effort.

	https://lore.kernel.org/lkml/20200304135404.146c56eb@gandalf.local.home/

We trigger the 2nd search unconditionally since the argument about
triggering a full search is that the recorded fall back level might have
become empty by then. Which means storing any data about what happened
would be meaningless and stale.

I had a humble try at timing it and it seemed okay for the small 6 CPUs
system I was running on

	https://lore.kernel.org/lkml/20200305124324.42x6ehjxbnjkklnh@e107158-lin.cambridge.arm.com/

On large system this second full scan could be expensive. But there are
no users outside capacity awareness for this fitness function at the
moment. Heterogeneous systems tend to be small with 8cores in total.

Bug: 120440300
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20200310142219.syxzn5ljpdxqtbgx@e107158-lin.cambridge.arm.com
(cherry picked from commit e94f80f6c49020008e6fa0f3d4b806b8595d17d8
 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: Ib20d400be47cd913a43a5c71fafee6a7fffb78aa
2024-12-16 14:46:33 -03:00
Qais Yousef
f8ef6fb43a FROMGIT: sched/rt: Optimize cpupri_find() on non-heterogenous systems
By introducing a new cpupri_find_fitness() function that takes the
fitness_fn as an argument and only called when asym_system static key is
enabled.

cpupri_find() is now a wrapper function that calls cpupri_find_fitness()
passing NULL as a fitness_fn, hence disabling the logic that handles
fitness by default.

Bug: 120440300
LINK: https://lore.kernel.org/lkml/c0772fca-0a4b-c88d-fdf2-5715fcf8447b@arm.com/
Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-4-qais.yousef@arm.com
(cherry picked from commit a1bd02e1f28b1939cac8c64072a0e578c3cbc345
 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I8ad4d9e391030ae499f7a1805485147de64abcdf
2024-12-16 14:46:33 -03:00
Qais Yousef
d524c0114f BACKPORT: FROMGIT: sched/rt: cpupri_find: Implement fallback mechanism for !fit case
When searching for the best lowest_mask with a fitness_fn passed, make
sure we record the lowest_level that returns a valid lowest_mask so that
we can use that as a fallback in case we fail to find a fitting CPU at
all levels.

The intention in the original patch was not to allow a down migration to
unfitting CPU. But this missed the case where we are already running on
unfitting one.

With this change now RT tasks can still move between unfitting CPUs when
they're already running on such CPU.

And as Steve suggested; to adhere to the strict priority rules of RT, if
a task is already running on a fitting CPU but due to priority it can't
run on it, allow it to downmigrate to unfitting CPU so it can run.

Bug: 120440300
Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
Link: https://lkml.kernel.org/r/20200302132721.8353-2-qais.yousef@arm.com
Link: https://lore.kernel.org/lkml/20200203142712.a7yvlyo2y3le5cpn@e107158-lin/
(cherry picked from commit d9cb236b9429044dc694ea70a50163ddd283cea6
 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core)
[Trivial merge conflict]
Change-Id: I3430e9624f8f7b11d3875c39c5765a51aec4a6f5
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-16 14:46:24 -03:00
Qais Yousef
d719b75cc9 BACKPORT: sched/rt: Make RT capacity-aware
Capacity Awareness refers to the fact that on heterogeneous systems
(like Arm big.LITTLE), the capacity of the CPUs is not uniform, hence
when placing tasks we need to be aware of this difference of CPU
capacities.

In such scenarios we want to ensure that the selected CPU has enough
capacity to meet the requirement of the running task. Enough capacity
means here that capacity_orig_of(cpu) >= task.requirement.

The definition of task.requirement is dependent on the scheduling class.

For CFS, utilization is used to select a CPU that has >= capacity value
than the cfs_task.util.

	capacity_orig_of(cpu) >= cfs_task.util

DL isn't capacity aware at the moment but can make use of the bandwidth
reservation to implement that in a similar manner CFS uses utilization.
The following patchset implements that:

https://lore.kernel.org/lkml/20190506044836.2914-1-luca.abeni@santannapisa.it/

	capacity_orig_of(cpu)/SCHED_CAPACITY >= dl_deadline/dl_runtime

For RT we don't have a per task utilization signal and we lack any
information in general about what performance requirement the RT task
needs. But with the introduction of uclamp, RT tasks can now control
that by setting uclamp_min to guarantee a minimum performance point.

ATM the uclamp value are only used for frequency selection; but on
heterogeneous systems this is not enough and we need to ensure that the
capacity of the CPU is >= uclamp_min. Which is what implemented here.

	capacity_orig_of(cpu) >= rt_task.uclamp_min

Note that by default uclamp.min is 1024, which means that RT tasks will
always be biased towards the big CPUs, which make for a better more
predictable behavior for the default case.

Must stress that the bias acts as a hint rather than a definite
placement strategy. For example, if all big cores are busy executing
other RT tasks we can't guarantee that a new RT task will be placed
there.

On non-heterogeneous systems the original behavior of RT should be
retained. Similarly if uclamp is not selected in the config.

[ mingo: Minor edits to comments. ]

Bug: 120440300
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191009104611.15363-1-qais.yousef@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 804d402fb6f6487b825aae8cf42fda6426c62867
 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git)
[Qais: resolved minor conflict in kernel/sched/cpupri.c]
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: Ifc9da1c47de1aec9b4d87be2614e4c8968366900
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-16 14:46:14 -03:00
Andrzej Perczak
36fccb0f06 sched: Set uclamp_util_min_rt_default to 0
Taken from oriole init script. This fixes a problem with stuck freqs on
little core.

Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
2024-12-16 14:46:14 -03:00
darkhz
e7a3c0b7a8 sched/cpufreq_schedutil: Reflect uclamp changes
This is a follow-up of:
BACKPORT: sched/cpufreq, sched/uclamp: Add clamps for FAIR and RT tasks

We excluded the schedutil-related change there since the differences
between 4.9 and 4.19 schedutil sources were very big, and proceed to
modify sugov_get_util() according to the above commit.
2024-12-16 14:46:14 -03:00
darkhz
02bd1ba153 sched/fair: Make boosted and prefer_idle tunables uclamp aware
Since we now have uclamp_boosted() and uclamp_latency_sensitive(),
which is similar to schedtune_task_boost() and schedtune_prefer_idle()
respectively, use them.

Change-Id: Ia88e06b7aff5ae6a6ff54cc99f944064b75fe9cb
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-16 14:46:11 -03:00
darkhz
ab91338a51 sched/fair: Modify boosted_task_util() to reflect uclamp changes
This is a commit that reflects:
ANDROID: sched/fair: EAS: Add uclamp support to find_energy_efficient_cpu()

3a5e1534e0

Change-Id: I4b4a6cd4fcc1b7d4db3c3d96c342a26781dba48d
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
2024-12-16 14:45:56 -03:00
Quentin Perret
02b01aa2e8 ANDROID: sched: Introduce uclamp latency and boost wrapper
Introduce a simple helper to read the latency_sensitive flag from a
task. It is called uclamp_latency_sensitive() to match the API
proposed by Patrick.

While at it, introduce uclamp_boosted() which returns true only when a
task has a non-null min-clamp.

Change-Id: I5fc747da8b58625257a6604a3c88487b657fbe7a
Suggested-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
2024-12-16 13:44:57 -03:00
Quentin Perret
a9fd2f5012 ANDROID: sched/core: Add a latency-sensitive flag to uclamp
Add a 'latency_sensitive' flag to uclamp in order to express the need
for some tasks to find a CPU where they can wake-up quickly. This is not
expected to be used without cgroup support, so add solely a cgroup
interface for it.

As this flag represents a boolean attribute and not an amount of
resources to be shared, it is not clear what the delegation logic should
be. As such, it is kept simple: every new cgroup starts with
latency_sensitive set to false, regardless of the parent.

In essence, this is similar to SchedTune's prefer-idle flag which was
used in android-4.19 and prior.

Change-Id: I722d8ecabb428bb7b95a5b54bc70a87f182dde2a
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
2024-12-16 13:44:56 -03:00
Qais Yousef
12fbc18b60 BACKPORT: sched/uclamp: Add a new sysctl to control RT default boost value
RT tasks by default run at the highest capacity/performance level. When
uclamp is selected this default behavior is retained by enforcing the
requested uclamp.min (p->uclamp_req[UCLAMP_MIN]) of the RT tasks to be
uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
value.

This is also referred to as 'the default boost value of RT tasks'.

See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").

On battery powered devices, it is desired to control this default
(currently hardcoded) behavior at runtime to reduce energy consumed by
RT tasks.

For example, a mobile device manufacturer where big.LITTLE architecture
is dominant, the performance of the little cores varies across SoCs, and
on high end ones the big cores could be too power hungry.

Given the diversity of SoCs, the new knob allows manufactures to tune
the best performance/power for RT tasks for the particular hardware they
run on.

They could opt to further tune the value when the user selects
a different power saving mode or when the device is actively charging.

The runtime aspect of it further helps in creating a single kernel image
that can be run on multiple devices that require different tuning.

Keep in mind that a lot of RT tasks in the system are created by the
kernel. On Android for instance I can see over 50 RT tasks, only
a handful of which created by the Android framework.

To control the default behavior globally by system admins and device
integrator, introduce the new sysctl_sched_uclamp_util_min_rt_default
to change the default boost value of the RT tasks.

I anticipate this to be mostly in the form of modifying the init script
of a particular device.

To avoid polluting the fast path with unnecessary code, the approach
taken is to synchronously do the update by traversing all the existing
tasks in the system. This could race with a concurrent fork(), which is
dealt with by introducing sched_post_fork() function which will ensure
the racy fork will get the right update applied.

Tested on Juno-r2 in combination with the RT capacity awareness [1].
By default an RT task will go to the highest capacity CPU and run at the
maximum frequency, which is particularly energy inefficient on high end
mobile devices because the biggest core[s] are 'huge' and power hungry.

With this patch the RT task can be controlled to run anywhere by
default, and doesn't cause the frequency to be maximum all the time.
Yet any task that really needs to be boosted can easily escape this
default behavior by modifying its requested uclamp.min value
(p->uclamp_req[UCLAMP_MIN]) via sched_setattr() syscall.

[1] 804d402fb6f6: ("sched/rt: Make RT capacity-aware")

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200716110347.19553-2-qais.yousef@arm.com

(cherry picked from commit 13685c4a08fca9dd76bf53bfcbadc044ab2a08cb)

Conflicts:
   kernel/fork.c
   kernel/sysctl.c
Upstream has commit 5a5cf5cb30d7 ("cgroup: refactor fork helpers") and
further commit ef2c41cf38a7 ("clone3: allow spawning processes into
cgroups") which affect the calls after this.  Picking the first would
be easy but the 2nd would be much bigger.  Also, my cherry-pick put my
sysctl in the wrong place in the table in sysctl.c, so I manually
moved it.  Weird.

BUG=b:160171130
TEST=With series rt tasks don't get boosted

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Change-Id: I678d8ee899ecfbe0a1f0bb94da85d54fff924a57
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2340433
Reviewed-by: Joel Fernandes <joelaf@google.com>
2024-12-16 13:44:56 -03:00
Qais Yousef
1c998840de UPSTREAM: sched/uclamp: Fix a deadlock when enabling uclamp static key
The following splat was caught when setting uclamp value of a task:

  BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:49

   cpus_read_lock+0x68/0x130
   static_key_enable+0x1c/0x38
   __sched_setscheduler+0x900/0xad8

Fix by ensuring we enable the key outside of the critical section in
__sched_setscheduler()

Fixes: 46609ce22703 ("sched/uclamp: Protect uclamp fast path code with static key")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200716110347.19553-4-qais.yousef@arm.com
(cherry picked from commit e65855a52b479f98674998cb23b21ef5a8144b04)

BUG=b:160171130
TEST=Future patch needs this one; doesn't break anything

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Change-Id: Idd1fdadedae2a7289d7c5eb7df5caebf0bf12f58
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2340431
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Joel Fernandes <joelaf@google.com>
2024-12-16 13:44:56 -03:00
Qais Yousef
07a5c7d42a BACKPORT: sched/uclamp: Protect uclamp fast path code with static key
There is a report that when uclamp is enabled, a netperf UDP test
regresses compared to a kernel compiled without uclamp.

https://lore.kernel.org/lkml/20200529100806.GA3070@suse.de/

While investigating the root cause, there were no sign that the uclamp
code is doing anything particularly expensive but could suffer from bad
cache behavior under certain circumstances that are yet to be
understood.

https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/

To reduce the pressure on the fast path anyway, add a static key that is
by default will skip executing uclamp logic in the
enqueue/dequeue_task() fast path until it's needed.

As soon as the user start using util clamp by:

	1. Changing uclamp value of a task with sched_setattr()
	2. Modifying the default sysctl_sched_util_clamp_{min, max}
	3. Modifying the default cpu.uclamp.{min, max} value in cgroup

We flip the static key now that the user has opted to use util clamp.
Effectively re-introducing uclamp logic in the enqueue/dequeue_task()
fast path. It stays on from that point forward until the next reboot.

This should help minimize the effect of util clamp on workloads that
don't need it but still allow distros to ship their kernels with uclamp
compiled in by default.

SCHED_WARN_ON() in uclamp_rq_dec_id() was removed since now we can end
up with unbalanced call to uclamp_rq_dec_id() if we flip the key while
a task is running in the rq. Since we know it is harmless we just
quietly return if we attempt a uclamp_rq_dec_id() when
rq->uclamp[].bucket[].tasks is 0.

In schedutil, we introduce a new uclamp_is_enabled() helper which takes
the static key into account to ensure RT boosting behavior is retained.

The following results demonstrates how this helps on 2 Sockets Xeon E5
2x10-Cores system.

                                   nouclamp                 uclamp      uclamp-static-key
Hmean     send-64         162.43 (   0.00%)      157.84 *  -2.82%*      163.39 *   0.59%*
Hmean     send-128        324.71 (   0.00%)      314.78 *  -3.06%*      326.18 *   0.45%*
Hmean     send-256        641.55 (   0.00%)      628.67 *  -2.01%*      648.12 *   1.02%*
Hmean     send-1024      2525.28 (   0.00%)     2448.26 *  -3.05%*     2543.73 *   0.73%*
Hmean     send-2048      4836.14 (   0.00%)     4712.08 *  -2.57%*     4867.69 *   0.65%*
Hmean     send-3312      7540.83 (   0.00%)     7425.45 *  -1.53%*     7621.06 *   1.06%*
Hmean     send-4096      9124.53 (   0.00%)     8948.82 *  -1.93%*     9276.25 *   1.66%*
Hmean     send-8192     15589.67 (   0.00%)    15486.35 *  -0.66%*    15819.98 *   1.48%*
Hmean     send-16384    26386.47 (   0.00%)    25752.25 *  -2.40%*    26773.74 *   1.47%*

The perf diff between nouclamp and uclamp-static-key when uclamp is
disabled in the fast path:

     8.73%     -1.55%  [kernel.kallsyms]        [k] try_to_wake_up
     0.07%     +0.04%  [kernel.kallsyms]        [k] deactivate_task
     0.13%     -0.02%  [kernel.kallsyms]        [k] activate_task

The diff between nouclamp and uclamp-static-key when uclamp is enabled
in the fast path:

     8.73%     -0.72%  [kernel.kallsyms]        [k] try_to_wake_up
     0.13%     +0.39%  [kernel.kallsyms]        [k] activate_task
     0.07%     +0.38%  [kernel.kallsyms]        [k] deactivate_task

Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
Reported-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lkml.kernel.org/r/20200630112123.12076-3-qais.yousef@arm.com

(cherry picked from commit 46609ce227039fd192e0ecc7d940bed587fd2c78)

Conflicts:
   kernel/sched/sched.h
We have commit f76b70375571 ("ANDROID: sched: Introduce uclamp latency
and boost wrapper").  Conflict is trivial context diff, though.

BUG=b:160171130
TEST=Future patch needs this one; doesn't break anything after fix taken

Cq-Depend: chromium:2340431
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Change-Id: Ib8a770fe948bf77082971cc6f78a20b0eec14519
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2340430
Reviewed-by: Joel Fernandes <joelaf@google.com>
2024-12-16 13:44:56 -03:00
Qais Yousef
eac5454455 UPSTREAM: sched/uclamp: Fix initialization of struct uclamp_rq
struct uclamp_rq was zeroed out entirely in assumption that in the first
call to uclamp_rq_inc() they'd be initialized correctly in accordance to
default settings.

But when next patch introduces a static key to skip
uclamp_rq_{inc,dec}() until userspace opts in to use uclamp, schedutil
will fail to perform any frequency changes because the
rq->uclamp[UCLAMP_MAX].value is zeroed at init and stays as such. Which
means all rqs are capped to 0 by default.

Fix it by making sure we do proper initialization at init without
relying on uclamp_rq_inc() doing it later.

Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lkml.kernel.org/r/20200630112123.12076-2-qais.yousef@arm.com
(cherry picked from commit d81ae8aac85ca2e307d273f6dc7863a721bf054e)

BUG=b:160171130
TEST=Future patch needs this one; doesn't break anything

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Change-Id: Iad2e5f24d469f7803d08bbaeb73eeab3c6c26521
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2340429
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Joel Fernandes <joelaf@google.com>
2024-12-16 13:44:56 -03:00
Valentin Schneider
70aec121c0 UPSTREAM: sched/uclamp: Make uclamp util helpers use and return UL values
Vincent pointed out recently that the canonical type for utilization
values is 'unsigned long'. Internally uclamp uses 'unsigned int' values for
cache optimization, but this doesn't have to be exported to its users.

Make the uclamp helpers that deal with utilization use and return unsigned
long values.

Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191211113851.24241-3-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 686516b55e98edf18c2a02d36aaaa6f4c0f6c39c)

BUG=b:160171130
TEST=Future patch picks cleaner

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Change-Id: I0cc7fdc6e6aeddd307e2d66456dbc1782f4d38be
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2340428
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Joel Fernandes <joelaf@google.com>
2024-12-16 13:44:56 -03:00
Valentin Schneider
ddb81e9813 BACKPORT: sched/uclamp: Remove uclamp_util()
The sole user of uclamp_util(), schedutil_cpu_util(), was made to use
uclamp_util_with() instead in commit:

  af24bde8df20 ("sched/uclamp: Add uclamp support to energy_compute()")

From then on, uclamp_util() has remained unused. Being a simple wrapper
around uclamp_util_with(), we can get rid of it and win back a few lines.

Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com>
Suggested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191211113851.24241-2-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

(cherry picked from commit 59fe675248ffc37d4167e9ec6920a2f3d5ec67bb)

Conflicts:
   kernel/sched/sched.h
We have commit f76b70375571 ("ANDROID: sched: Introduce uclamp latency
and boost wrapper").  Conflict is trivial context diff, though.

BUG=b:160171130
TEST=Future patch picks cleaner

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Change-Id: I5ad7eb758c863327788c098ec16eaf829094898c
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/2340427
Reviewed-by: Joel Fernandes <joelaf@google.com>
2024-12-16 13:44:56 -03:00
Quentin Perret
65938ec4e9 sched/core: Fix reset-on-fork from RT with uclamp
commit eaf5a92ebde5bca3bb2565616115bd6d579486cd upstream.

uclamp_fork() resets the uclamp values to their default when the
reset-on-fork flag is set. It also checks whether the task has a RT
policy, and sets its uclamp.min to 1024 accordingly. However, during
reset-on-fork, the task's policy is lowered to SCHED_NORMAL right after,
hence leading to an erroneous uclamp.min setting for the new task if it
was forked from RT.

Fix this by removing the unnecessary check on rt_task() in
uclamp_fork() as this doesn't make sense if the reset-on-fork flag is
set.

Fixes: 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks")
Reported-by: Chitti Babu Theegala <ctheegal@codeaurora.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Patrick Bellasi <patrick.bellasi@matbug.net>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20200416085956.217587-1-qperret@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16 13:44:56 -03:00
Li Guanglei
57465b7cfe sched/core: Fix size of rq::uclamp initialization
[ Upstream commit dcd6dffb0a75741471297724640733fa4e958d72 ]

rq::uclamp is an array of struct uclamp_rq, make sure we clear the
whole thing.

Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcountinga")
Signed-off-by: Li Guanglei <guanglei.li@unisoc.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Link: https://lkml.kernel.org/r/1577259844-12677-1-git-send-email-guangleix.li@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-16 13:44:56 -03:00
Qais Yousef
e2a638214e sched/uclamp: Reject negative values in cpu_uclamp_write()
commit b562d140649966d4daedd0483a8fe59ad3bb465a upstream.

The check to ensure that the new written value into cpu.uclamp.{min,max}
is within range, [0:100], wasn't working because of the signed
comparison

 7301                 if (req.percent > UCLAMP_PERCENT_SCALE) {
 7302                         req.ret = -ERANGE;
 7303                         return req;
 7304                 }

	# echo -1 > cpu.uclamp.min
	# cat cpu.uclamp.min
	42949671.96

Cast req.percent into u64 to force the comparison to be unsigned and
work as intended in capacity_from_percent().

	# echo -1 > cpu.uclamp.min
	sh: write error: Numerical result out of range

Fixes: 2480c093130f ("sched/uclamp: Extend CPU's cgroup controller")
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200114210947.14083-1-qais.yousef@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-16 13:44:56 -03:00
Qais Yousef
f11b12ec18 UPSTREAM: sched/core: Fix compilation error when cgroup not selected
When cgroup is disabled the following compilation error was hit

	kernel/sched/core.c: In function ‘uclamp_update_active_tasks’:
	kernel/sched/core.c:1081:23: error: storage size of ‘it’ isn’t known
	  struct css_task_iter it;
			       ^~
	kernel/sched/core.c:1084:2: error: implicit declaration of function ‘css_task_iter_start’; did you mean ‘__sg_page_iter_start’? [-Werror=implicit-function-declaration]
	  css_task_iter_start(css, 0, &it);
	  ^~~~~~~~~~~~~~~~~~~
	  __sg_page_iter_start
	kernel/sched/core.c:1085:14: error: implicit declaration of function ‘css_task_iter_next’; did you mean ‘__sg_page_iter_next’? [-Werror=implicit-function-declaration]
	  while ((p = css_task_iter_next(&it))) {
		      ^~~~~~~~~~~~~~~~~~
		      __sg_page_iter_next
	kernel/sched/core.c:1091:2: error: implicit declaration of function ‘css_task_iter_end’; did you mean ‘get_task_cred’? [-Werror=implicit-function-declaration]
	  css_task_iter_end(&it);
	  ^~~~~~~~~~~~~~~~~
	  get_task_cred
	kernel/sched/core.c:1081:23: warning: unused variable ‘it’ [-Wunused-variable]
	  struct css_task_iter it;
			       ^~
	cc1: some warnings being treated as errors
	make[2]: *** [kernel/sched/core.o] Error 1

Fix by protetion uclamp_update_active_tasks() with
CONFIG_UCLAMP_TASK_GROUP

Bug: 120440300
Fixes: babbe170e053 ("sched/uclamp: Update CPU's refcount on TG's clamp changes")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Patrick Bellasi <patrick.bellasi@matbug.net>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Ben Segall <bsegall@google.com>
Link: https://lkml.kernel.org/r/20191105112212.596-1-qais.yousef@arm.com
(cherry picked from commit e3b8b6a0d12cccf772113d6b5c1875192186fbd4)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: Ia4c0f801d68050526f9f117ec9189e448b01345a
Signed-off-by: Quentin Perret <qperret@google.com>
2024-12-16 13:44:56 -03:00
Ingo Molnar
170aee05d6 UPSTREAM: sched/core: Fix uclamp ABI bug, clean up and robustify sched_read_attr() ABI logic and code
Thadeu Lima de Souza Cascardo reported that 'chrt' broke on recent kernels:

  $ chrt -p $$
  chrt: failed to get pid 26306's policy: Argument list too long

and he has root-caused the bug to the following commit increasing sched_attr
size and breaking sched_read_attr() into returning -EFBIG:

  a509a7cd7974 ("sched/uclamp: Extend sched_setattr() to support utilization clamping")

The other, bigger bug is that the whole sched_getattr() and sched_read_attr()
logic of checking non-zero bits in new ABI components is arguably broken,
and pretty much any extension of the ABI will spuriously break the ABI.
That's way too fragile.

Instead implement the perf syscall's extensible ABI instead, which we
already implement on the sched_setattr() side:

 - if user-attributes have the same size as kernel attributes then the
   logic is unchanged.

 - if user-attributes are larger than the kernel knows about then simply
   skip the extra bits, but set attr->size to the (smaller) kernel size
   so that tooling can (in principle) handle older kernel as well.

 - if user-attributes are smaller than the kernel knows about then just
   copy whatever user-space can accept.

Also clean up the whole logic:

 - Simplify the code flow - there's no need for 'ret' for example.

 - Standardize on 'kattr/uattr' and 'ksize/usize' naming to make sure we
   always know which side we are dealing with.

 - Why is it called 'read' when what it does is to copy to user? This
   code is so far away from VFS read() semantics that the naming is
   actively confusing. Name it sched_attr_copy_to_user() instead, which
   mirrors other copy_to_user() functionality.

 - Move the attr->size assignment from the head of sched_getattr() to the
   sched_attr_copy_to_user() function. Nothing else within the kernel
   should care about the size of the structure.

With these fixes the sched_getattr() syscall now nicely supports an
extensible ABI in both a forward and backward compatible fashion, and
will also fix the chrt bug.

As an added bonus the bogus -EFBIG return is removed as well, which as
Thadeu noted should have been -E2BIG to begin with.

Bug: 120440300
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: a509a7cd7974 ("sched/uclamp: Extend sched_setattr() to support utilization clamping")
Link: https://lkml.kernel.org/r/20190904075532.GA26751@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 1251201c0d34fadf69d56efa675c2b7dd0a90eca)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I67e653c4f69db0140e9651c125b60e2b8cfd62f1
Signed-off-by: Quentin Perret <qperret@google.com>
2024-12-16 13:44:56 -03:00
Patrick Bellasi
0449deb1f3 UPSTREAM: sched/uclamp: Always use 'enum uclamp_id' for clamp_id values
The supported clamp indexes are defined in 'enum clamp_id', however, because
of the code logic in some of the first utilization clamping series version,
sometimes we needed to use 'unsigned int' to represent indices.

This is not more required since the final version of the uclamp_* APIs can
always use the proper enum uclamp_id type.

Fix it with a bulk rename now that we have all the bits merged.

Bug: 120440300
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Michal Koutny <mkoutny@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Alessio Balsini <balsini@android.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lkml.kernel.org/r/20190822132811.31294-7-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 0413d7f33e60751570fd6c179546bde2f7d82dcb)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I0be680b2489fa07244bac63b5c6fe1a79a53bef7
Signed-off-by: Quentin Perret <qperret@google.com>
2024-12-16 13:44:56 -03:00
Patrick Bellasi
9fa48d715c UPSTREAM: sched/uclamp: Update CPU's refcount on TG's clamp changes
On updates of task group (TG) clamp values, ensure that these new values
are enforced on all RUNNABLE tasks of the task group, i.e. all RUNNABLE
tasks are immediately boosted and/or capped as requested.

Do that each time we update effective clamps from cpu_util_update_eff().
Use the *cgroup_subsys_state (css) to walk the list of tasks in each
affected TG and update their RUNNABLE tasks.
Update each task by using the same mechanism used for cpu affinity masks
updates, i.e. by taking the rq lock.

Bug: 120440300
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Michal Koutny <mkoutny@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Alessio Balsini <balsini@android.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lkml.kernel.org/r/20190822132811.31294-6-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit babbe170e053c6ec2343751749995b7b9fd5fd2c)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I5e48891bd48c266dd282e1bab8f60533e4e29b48
Signed-off-by: Quentin Perret <qperret@google.com>
2024-12-16 13:44:56 -03:00
Patrick Bellasi
e60e436fce UPSTREAM: sched/uclamp: Use TG's clamps to restrict TASK's clamps
When a task specific clamp value is configured via sched_setattr(2), this
value is accounted in the corresponding clamp bucket every time the task is
{en,de}qeued. However, when cgroups are also in use, the task specific
clamp values could be restricted by the task_group (TG) clamp values.

Update uclamp_cpu_inc() to aggregate task and TG clamp values. Every time a
task is enqueued, it's accounted in the clamp bucket tracking the smaller
clamp between the task specific value and its TG effective value. This
allows to:

1. ensure cgroup clamps are always used to restrict task specific requests,
   i.e. boosted not more than its TG effective protection and capped at
   least as its TG effective limit.

2. implement a "nice-like" policy, where tasks are still allowed to request
   less than what enforced by their TG effective limits and protections

Do this by exploiting the concept of "effective" clamp, which is already
used by a TG to track parent enforced restrictions.

Apply task group clamp restrictions only to tasks belonging to a child
group. While, for tasks in the root group or in an autogroup, system
defaults are still enforced.

Bug: 120440300
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Michal Koutny <mkoutny@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Alessio Balsini <balsini@android.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lkml.kernel.org/r/20190822132811.31294-5-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 3eac870a324728e5d17118888840dad70bcd37f3)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Change-Id: I0215e0a68cc0fa7c441e33052757f8571b7c99b9
Signed-off-by: Quentin Perret <qperret@google.com>
2024-12-16 13:44:56 -03:00