wake_affine_idle() prefers to move a task to the current CPU if the
wakeup is due to an interrupt. The expectation is that the interrupt
data is cache hot and relevant to the waking task, and that choosing the
current CPU avoids a search. However, there is no way to determine
whether there is cache-hot data on the previous CPU that may outweigh
the interrupt data. Furthermore,
round-robin delivery of interrupts can migrate tasks around a socket where
each CPU is under-utilised. This can interact badly with cpufreq which
makes decisions based on per-cpu data. It has been observed on machines
with HWP that p-states are not boosted to their maximum levels even though
the workload is latency and throughput sensitive.
This patch uses the previous CPU for the task if it's idle and
cache-affine with the current CPU, even if the current CPU is idle due
to the wakeup being related to an interrupt. This reduces migrations at
the cost of the interrupt data not being cache hot when the task wakes.
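A sketch of the resulting wake_affine_idle(), following the mainline
patch this is based on (nr_cpumask_bits means "no decision", per the
preparation patch further below):

static int
wake_affine_idle(int this_cpu, int prev_cpu, int sync)
{
	/*
	 * The wakeup is from interrupt context (this_cpu is idle) and the
	 * CPUs share cache: prefer the idle prev_cpu, since there is no
	 * guarantee the interrupt data is hotter than prev_cpu's data
	 * and, from a cpufreq perspective, it is better to keep
	 * utilisation concentrated on one CPU.
	 */
	if (idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
		return idle_cpu(prev_cpu) ? prev_cpu : this_cpu;

	if (sync && cpu_rq(this_cpu)->nr_running == 1)
		return this_cpu;

	return nr_cpumask_bits;	/* no decision */
}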
A variety of workloads were tested on various machines and no adverse
impact was noticed that was outside noise. dbench on ext4 on UMA showed
roughly a 10% reduction in the number of CPU migrations, and it is a
case where interrupts are frequent due to IO completions. In most cases,
the
difference in performance is quite small but variability is often
reduced. For example, this is the result for pgbench running on a UMA
machine with different numbers of clients.
4.15.0-rc9 4.15.0-rc9
baseline waprev-v1
Hmean 1 22096.28 ( 0.00%) 22734.86 ( 2.89%)
Hmean 4 74633.42 ( 0.00%) 75496.77 ( 1.16%)
Hmean 7 115017.50 ( 0.00%) 113030.81 ( -1.73%)
Hmean 12 126209.63 ( 0.00%) 126613.40 ( 0.32%)
Hmean 16 131886.91 ( 0.00%) 130844.35 ( -0.79%)
Stddev 1 636.38 ( 0.00%) 417.11 ( 34.46%)
Stddev 4 614.64 ( 0.00%) 583.24 ( 5.11%)
Stddev 7 542.46 ( 0.00%) 435.45 ( 19.73%)
Stddev 12 173.93 ( 0.00%) 171.50 ( 1.40%)
Stddev 16 671.42 ( 0.00%) 680.30 ( -1.32%)
CoeffVar 1 2.88 ( 0.00%) 1.83 ( 36.26%)
Note that the difference in performance is marginal but, for low
utilisation, there is less variability.
Link: http://lkml.kernel.org/r/20180130104555.4125-4-mgorman@techsingularity.net
Change-Id: Iba4b2fee25a8c7b4f0218f656b0db5eab9e11691
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
If waking from an idle CPU due to an interrupt then it's possible that
the waker task will be pulled to wake on the current CPU. Unfortunately,
depending on the type of interrupt and IRQ configuration, there may not
be a strong relationship between the CPU an interrupt was delivered on
and the CPU a task was running on. For example, the interrupts could all
be delivered to CPUs on one particular node due to the machine topology
or IRQ affinity configuration. Another example is an interrupt for an IO
completion which can be delivered to any CPU where there is no guarantee
the data is either cache hot or even local.
This patch was motivated by the observation that an IO workload was
being pulled cross-node on a frequent basis when IO completed. From a
wakeup latency perspective, it's still useful to know that an idle CPU is
immediately available for use, but let's only consider an automatic
migration if the CPUs share cache, to limit damage due to NUMA
migrations. Migrations
may still occur if wake_affine_weight determines it's appropriate.
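A sketch of the check this adds (at this point in the series
wake_affine_idle() still returned a boolean and carried parameters it
did not use):

static bool
wake_affine_idle(struct sched_domain *sd, struct task_struct *p,
		 int this_cpu, int prev_cpu, int sync)
{
	/*
	 * If this_cpu is idle, it implies the wakeup is from interrupt
	 * context. Only allow the move if cache is shared, otherwise an
	 * interrupt-intensive workload could force all tasks onto one
	 * node depending on the IO topology or IRQ affinity settings.
	 */
	if (idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
		return true;

	if (sync && cpu_rq(this_cpu)->nr_running == 1)
		return true;

	return false;
}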
These are the throughput results for dbench running on ext4 comparing
4.15-rc3 and this patch on a 2-socket machine where interrupts due to IO
completions can happen on any CPU.
4.15.0-rc3 4.15.0-rc3
vanilla lessmigrate
Hmean 1 854.64 ( 0.00%) 865.01 ( 1.21%)
Hmean 2 1229.60 ( 0.00%) 1274.44 ( 3.65%)
Hmean 4 1591.81 ( 0.00%) 1628.08 ( 2.28%)
Hmean 8 1845.04 ( 0.00%) 1831.80 ( -0.72%)
Hmean 16 2038.61 ( 0.00%) 2091.44 ( 2.59%)
Hmean 32 2327.19 ( 0.00%) 2430.29 ( 4.43%)
Hmean 64 2570.61 ( 0.00%) 2568.54 ( -0.08%)
Hmean 128 2481.89 ( 0.00%) 2499.28 ( 0.70%)
Stddev 1 14.31 ( 0.00%) 5.35 ( 62.65%)
Stddev 2 21.29 ( 0.00%) 11.09 ( 47.92%)
Stddev 4 7.22 ( 0.00%) 6.80 ( 5.92%)
Stddev 8 26.70 ( 0.00%) 9.41 ( 64.76%)
Stddev 16 22.40 ( 0.00%) 20.01 ( 10.70%)
Stddev 32 45.13 ( 0.00%) 44.74 ( 0.85%)
Stddev 64 93.10 ( 0.00%) 93.18 ( -0.09%)
Stddev 128 184.28 ( 0.00%) 177.85 ( 3.49%)
Note the small increase in throughput for low thread counts but also
note that the standard deviation for each sample during the test run is
lower. The throughput figures for dbench can be misleading so the benchmark
is actually modified to time the latency of the processing of one load
file with many samples taken. The difference in latency is
4.15.0-rc3 4.15.0-rc3
vanilla lessmigrate
Amean 1 21.71 ( 0.00%) 21.47 ( 1.08%)
Amean 2 30.89 ( 0.00%) 29.58 ( 4.26%)
Amean 4 47.54 ( 0.00%) 46.61 ( 1.97%)
Amean 8 82.71 ( 0.00%) 82.81 ( -0.12%)
Amean 16 149.45 ( 0.00%) 145.01 ( 2.97%)
Amean 32 265.49 ( 0.00%) 248.43 ( 6.42%)
Amean 64 463.23 ( 0.00%) 463.55 ( -0.07%)
Amean 128 933.97 ( 0.00%) 935.50 ( -0.16%)
Stddev 1 1.58 ( 0.00%) 1.54 ( 2.26%)
Stddev 2 2.84 ( 0.00%) 2.95 ( -4.15%)
Stddev 4 6.78 ( 0.00%) 6.85 ( -0.99%)
Stddev 8 16.85 ( 0.00%) 16.37 ( 2.85%)
Stddev 16 41.59 ( 0.00%) 41.04 ( 1.32%)
Stddev 32 111.05 ( 0.00%) 105.11 ( 5.35%)
Stddev 64 285.94 ( 0.00%) 288.01 ( -0.72%)
Stddev 128 803.39 ( 0.00%) 809.73 ( -0.79%)
It's a small improvement, which is not surprising given that migrations
to a different node are not that common. However, it is noticeable in
the CPU migration statistics, which are reduced by 24%.
There was a query about NAS for v1 of this patch, so here are the
results for C-class using MPI for parallelisation on the same machine:
nas-mpi
4.15.0-rc3 4.15.0-rc3
vanilla noirq
Time cg.C 24.25 ( 0.00%) 23.17 ( 4.45%)
Time ep.C 8.22 ( 0.00%) 8.29 ( -0.85%)
Time ft.C 22.67 ( 0.00%) 20.34 ( 10.28%)
Time is.C 1.42 ( 0.00%) 1.47 ( -3.52%)
Time lu.C 55.62 ( 0.00%) 54.81 ( 1.46%)
Time mg.C 7.93 ( 0.00%) 7.91 ( 0.25%)
4.15.0-rc3 4.15.0-rc3
vanilla noirq-v1r1
User 3799.96 3748.34
System 672.10 626.15
Elapsed 91.91 79.49
lu.C sees a small gain, ft.C a large gain and ep.C and is.C see small
regressions but in terms of absolute time, the difference is small and
likely within run-to-run variance. System CPU usage is slightly reduced.
schbench from Facebook was also requested. This is a bit of a mixed bag but
it's important to note that this workload should not be heavily impacted
by wakeups from interrupt context.
4.15.0-rc3 4.15.0-rc3
vanilla noirq-v1r1
Lat 50.00th-qrtle-1 41.00 ( 0.00%) 41.00 ( 0.00%)
Lat 75.00th-qrtle-1 42.00 ( 0.00%) 42.00 ( 0.00%)
Lat 90.00th-qrtle-1 43.00 ( 0.00%) 44.00 ( -2.33%)
Lat 95.00th-qrtle-1 44.00 ( 0.00%) 46.00 ( -4.55%)
Lat 99.00th-qrtle-1 57.00 ( 0.00%) 58.00 ( -1.75%)
Lat 99.50th-qrtle-1 59.00 ( 0.00%) 59.00 ( 0.00%)
Lat 99.90th-qrtle-1 67.00 ( 0.00%) 78.00 ( -16.42%)
Lat 50.00th-qrtle-2 40.00 ( 0.00%) 51.00 ( -27.50%)
Lat 75.00th-qrtle-2 45.00 ( 0.00%) 56.00 ( -24.44%)
Lat 90.00th-qrtle-2 53.00 ( 0.00%) 59.00 ( -11.32%)
Lat 95.00th-qrtle-2 57.00 ( 0.00%) 61.00 ( -7.02%)
Lat 99.00th-qrtle-2 67.00 ( 0.00%) 71.00 ( -5.97%)
Lat 99.50th-qrtle-2 69.00 ( 0.00%) 74.00 ( -7.25%)
Lat 99.90th-qrtle-2 83.00 ( 0.00%) 77.00 ( 7.23%)
Lat 50.00th-qrtle-4 51.00 ( 0.00%) 51.00 ( 0.00%)
Lat 75.00th-qrtle-4 57.00 ( 0.00%) 56.00 ( 1.75%)
Lat 90.00th-qrtle-4 60.00 ( 0.00%) 59.00 ( 1.67%)
Lat 95.00th-qrtle-4 62.00 ( 0.00%) 62.00 ( 0.00%)
Lat 99.00th-qrtle-4 73.00 ( 0.00%) 72.00 ( 1.37%)
Lat 99.50th-qrtle-4 76.00 ( 0.00%) 74.00 ( 2.63%)
Lat 99.90th-qrtle-4 85.00 ( 0.00%) 78.00 ( 8.24%)
Lat 50.00th-qrtle-8 54.00 ( 0.00%) 58.00 ( -7.41%)
Lat 75.00th-qrtle-8 59.00 ( 0.00%) 62.00 ( -5.08%)
Lat 90.00th-qrtle-8 65.00 ( 0.00%) 66.00 ( -1.54%)
Lat 95.00th-qrtle-8 67.00 ( 0.00%) 70.00 ( -4.48%)
Lat 99.00th-qrtle-8 78.00 ( 0.00%) 79.00 ( -1.28%)
Lat 99.50th-qrtle-8 81.00 ( 0.00%) 80.00 ( 1.23%)
Lat 99.90th-qrtle-8 116.00 ( 0.00%) 83.00 ( 28.45%)
Lat 50.00th-qrtle-16 65.00 ( 0.00%) 64.00 ( 1.54%)
Lat 75.00th-qrtle-16 77.00 ( 0.00%) 71.00 ( 7.79%)
Lat 90.00th-qrtle-16 83.00 ( 0.00%) 82.00 ( 1.20%)
Lat 95.00th-qrtle-16 87.00 ( 0.00%) 87.00 ( 0.00%)
Lat 99.00th-qrtle-16 95.00 ( 0.00%) 96.00 ( -1.05%)
Lat 99.50th-qrtle-16 99.00 ( 0.00%) 103.00 ( -4.04%)
Lat 99.90th-qrtle-16 104.00 ( 0.00%) 122.00 ( -17.31%)
Lat 50.00th-qrtle-32 71.00 ( 0.00%) 73.00 ( -2.82%)
Lat 75.00th-qrtle-32 91.00 ( 0.00%) 92.00 ( -1.10%)
Lat 90.00th-qrtle-32 108.00 ( 0.00%) 107.00 ( 0.93%)
Lat 95.00th-qrtle-32 118.00 ( 0.00%) 115.00 ( 2.54%)
Lat 99.00th-qrtle-32 134.00 ( 0.00%) 129.00 ( 3.73%)
Lat 99.50th-qrtle-32 138.00 ( 0.00%) 133.00 ( 3.62%)
Lat 99.90th-qrtle-32 149.00 ( 0.00%) 146.00 ( 2.01%)
Lat 50.00th-qrtle-39 83.00 ( 0.00%) 81.00 ( 2.41%)
Lat 75.00th-qrtle-39 105.00 ( 0.00%) 102.00 ( 2.86%)
Lat 90.00th-qrtle-39 120.00 ( 0.00%) 119.00 ( 0.83%)
Lat 95.00th-qrtle-39 129.00 ( 0.00%) 128.00 ( 0.78%)
Lat 99.00th-qrtle-39 153.00 ( 0.00%) 149.00 ( 2.61%)
Lat 99.50th-qrtle-39 166.00 ( 0.00%) 156.00 ( 6.02%)
Lat 99.90th-qrtle-39 12304.00 ( 0.00%) 12848.00 ( -4.42%)
When heavily loaded (e.g. 99.50th-qrtle-39 indicates 39 threads), there
are small gains in many cases. Otherwise it depends on the quartile used
where it can be bad -- e.g. 75.00th-qrtle-2. However, even these results
are probably a coincidence. For this workload, much depends on what
node the threads get placed on and their relative locality, not on
wakeups from interrupt context. A larger influence on its behaviour
would be automatic NUMA balancing, where a fault incurred to measure
locality would be a much larger contributor to latency than the wakeup
path.
These are the results from an almost identical machine that happened to run
the same test. They only differ in terms of storage which is irrelevant
for this test.
4.15.0-rc3 4.15.0-rc3
vanilla noirq-v1r1
Lat 50.00th-qrtle-1 41.00 ( 0.00%) 41.00 ( 0.00%)
Lat 75.00th-qrtle-1 42.00 ( 0.00%) 42.00 ( 0.00%)
Lat 90.00th-qrtle-1 44.00 ( 0.00%) 43.00 ( 2.27%)
Lat 95.00th-qrtle-1 53.00 ( 0.00%) 45.00 ( 15.09%)
Lat 99.00th-qrtle-1 59.00 ( 0.00%) 58.00 ( 1.69%)
Lat 99.50th-qrtle-1 60.00 ( 0.00%) 59.00 ( 1.67%)
Lat 99.90th-qrtle-1 86.00 ( 0.00%) 61.00 ( 29.07%)
Lat 50.00th-qrtle-2 52.00 ( 0.00%) 41.00 ( 21.15%)
Lat 75.00th-qrtle-2 57.00 ( 0.00%) 46.00 ( 19.30%)
Lat 90.00th-qrtle-2 60.00 ( 0.00%) 53.00 ( 11.67%)
Lat 95.00th-qrtle-2 62.00 ( 0.00%) 57.00 ( 8.06%)
Lat 99.00th-qrtle-2 73.00 ( 0.00%) 68.00 ( 6.85%)
Lat 99.50th-qrtle-2 74.00 ( 0.00%) 71.00 ( 4.05%)
Lat 99.90th-qrtle-2 90.00 ( 0.00%) 75.00 ( 16.67%)
Lat 50.00th-qrtle-4 57.00 ( 0.00%) 52.00 ( 8.77%)
Lat 75.00th-qrtle-4 60.00 ( 0.00%) 58.00 ( 3.33%)
Lat 90.00th-qrtle-4 62.00 ( 0.00%) 62.00 ( 0.00%)
Lat 95.00th-qrtle-4 65.00 ( 0.00%) 65.00 ( 0.00%)
Lat 99.00th-qrtle-4 76.00 ( 0.00%) 75.00 ( 1.32%)
Lat 99.50th-qrtle-4 77.00 ( 0.00%) 77.00 ( 0.00%)
Lat 99.90th-qrtle-4 87.00 ( 0.00%) 81.00 ( 6.90%)
Lat 50.00th-qrtle-8 59.00 ( 0.00%) 57.00 ( 3.39%)
Lat 75.00th-qrtle-8 63.00 ( 0.00%) 62.00 ( 1.59%)
Lat 90.00th-qrtle-8 66.00 ( 0.00%) 67.00 ( -1.52%)
Lat 95.00th-qrtle-8 68.00 ( 0.00%) 70.00 ( -2.94%)
Lat 99.00th-qrtle-8 79.00 ( 0.00%) 80.00 ( -1.27%)
Lat 99.50th-qrtle-8 80.00 ( 0.00%) 84.00 ( -5.00%)
Lat 99.90th-qrtle-8 84.00 ( 0.00%) 90.00 ( -7.14%)
Lat 50.00th-qrtle-16 65.00 ( 0.00%) 65.00 ( 0.00%)
Lat 75.00th-qrtle-16 77.00 ( 0.00%) 75.00 ( 2.60%)
Lat 90.00th-qrtle-16 84.00 ( 0.00%) 83.00 ( 1.19%)
Lat 95.00th-qrtle-16 88.00 ( 0.00%) 87.00 ( 1.14%)
Lat 99.00th-qrtle-16 97.00 ( 0.00%) 96.00 ( 1.03%)
Lat 99.50th-qrtle-16 100.00 ( 0.00%) 104.00 ( -4.00%)
Lat 99.90th-qrtle-16 110.00 ( 0.00%) 126.00 ( -14.55%)
Lat 50.00th-qrtle-32 70.00 ( 0.00%) 71.00 ( -1.43%)
Lat 75.00th-qrtle-32 92.00 ( 0.00%) 94.00 ( -2.17%)
Lat 90.00th-qrtle-32 110.00 ( 0.00%) 110.00 ( 0.00%)
Lat 95.00th-qrtle-32 121.00 ( 0.00%) 118.00 ( 2.48%)
Lat 99.00th-qrtle-32 135.00 ( 0.00%) 137.00 ( -1.48%)
Lat 99.50th-qrtle-32 140.00 ( 0.00%) 146.00 ( -4.29%)
Lat 99.90th-qrtle-32 150.00 ( 0.00%) 160.00 ( -6.67%)
Lat 50.00th-qrtle-39 80.00 ( 0.00%) 71.00 ( 11.25%)
Lat 75.00th-qrtle-39 102.00 ( 0.00%) 91.00 ( 10.78%)
Lat 90.00th-qrtle-39 118.00 ( 0.00%) 108.00 ( 8.47%)
Lat 95.00th-qrtle-39 128.00 ( 0.00%) 117.00 ( 8.59%)
Lat 99.00th-qrtle-39 149.00 ( 0.00%) 133.00 ( 10.74%)
Lat 99.50th-qrtle-39 160.00 ( 0.00%) 139.00 ( 13.12%)
Lat 99.90th-qrtle-39 13808.00 ( 0.00%) 4920.00 ( 64.37%)
Despite being nearly identical, it showed a variety of major gains so
I'm not convinced that heavy emphasis should be placed on this particular
workload in terms of evaluating this particular patch. Further evidence of
this is the fact that testing on a UMA machine showed small gains/losses
even though the patch should be a no-op on UMA.
Link: http://lkml.kernel.org/r/20171219085947.13136-2-mgorman@techsingularity.net
Change-Id: I788a3c61c044b86a3c45e46010660ff0d40f3615
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This is a preparation patch that has wake_affine*() return a CPU ID instead of
a boolean. The intent is to allow the wake_affine() helpers to be avoided
if a decision is already made. This patch has no functional change.
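Sketched, the wake_affine() caller then reads roughly as follows, with
nr_cpumask_bits standing in for "no decision":

static int wake_affine(struct sched_domain *sd, struct task_struct *p,
		       int this_cpu, int prev_cpu, int sync)
{
	int target = nr_cpumask_bits;

	if (sched_feat(WA_IDLE))
		target = wake_affine_idle(this_cpu, prev_cpu, sync);

	if (sched_feat(WA_WEIGHT) && target == nr_cpumask_bits)
		target = wake_affine_weight(sd, p, this_cpu, prev_cpu, sync);

	if (target == nr_cpumask_bits)
		return prev_cpu;

	return target;
}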
Link: http://lkml.kernel.org/r/20180130104555.4125-3-mgorman@techsingularity.net
Change-Id: I74fb4723c07c1dcbc401e9f88ad83834770875f4
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
wake_affine_idle() takes parameters it never uses so clean it up.
Link: http://lkml.kernel.org/r/20180130104555.4125-2-mgorman@techsingularity.net
Change-Id: I3093bc72dac98eac3e4a6afa131f2e806493c0ee
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
If a CPU is about to idle, prevent a frequency update. With this, the
number of schedutil governor wakeups is reduced by more than half in a
test playing bluetooth audio.
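A hedged sketch of the mechanism; the SCHED_CPUFREQ_IDLE flag name,
value and hook placement are assumptions about this port:

#define SCHED_CPUFREQ_IDLE	(1U << 3)	/* assumed flag value */

static void sugov_update_single(struct update_util_data *hook, u64 time,
				unsigned int flags)
{
	/*
	 * The CPU is about to enter idle; selecting a frequency now would
	 * only wake the sugov kthread for no benefit, so bail out early.
	 */
	if (flags & SCHED_CPUFREQ_IDLE)
		return;

	/* ... normal frequency selection continues ... */
}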
Test: sugov wake ups drop by more than half when playing music with
screen off (476 / 1092)
[yaro]: Ported to 4.14
Bug: 64689959
Change-Id: I400026557b4134c0ac77f51c79610a96eb985b4a
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
We need this to implement "sched/fair: Skip frequency updates if CPU about to idle"
This reverts commit 3a123bbbb10d54dbdde6ccbbd519c74c91ba2f52.
Change-Id: I8abe4b2e0ae0c610a1477c24fb7b356fec0de409
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Change-Id: I42206524415d198c3060063c8d4182d12551c6a3
Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Based on 441d1a2 ("sysctl: Expose a few additional sched features to appease OOS when !SCHED_WALT")
This creates a set of no-op proc nodes to stand in for certain
features specific to SCHED_WALT for the purpose of appeasing
userspace and preventing runaway error spam in logcat during
interactions, e.g.:
ANDR-PERF-UTIL
Failed to read /proc/sys/kernel/sched_busy_hyst_ns
ANDR-PERF-OPTSHANDLER
Failed to read /proc/sys/kernel/sched_busy_hyst_ns
ANDR-PERF-UTIL
Failed to read /proc/sys/kernel/sched_prefer_spread
ANDR-PERF-OPTSHANDLER
Failed to read /proc/sys/kernel/sched_prefer_spread
ANDR-PERF-UTIL
Failed to read /proc/sys/kernel/sched_busy_hysteresis_enable_cpus
ANDR-PERF-OPTSHANDLER
Failed to read /proc/sys/kernel/sched_busy_hysteresis_enable_cpus
Please note that these nodes do not function properly, or at all,
and exist solely to provide CAF's performance HAL with something
to read and write meaningless numbers to.
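A hedged sketch of one such stub; the storage and handler choices here
are illustrative:

static unsigned int sched_busy_hyst_ns;	/* dummy, read/written but unused */

static struct ctl_table sched_walt_stub_table[] = {
	{
		/* No-op node: accepts reads and writes, affects nothing. */
		.procname	= "sched_busy_hyst_ns",
		.data		= &sched_busy_hyst_ns,
		.maxlen		= sizeof(unsigned int),
		.mode		= 0644,
		.proc_handler	= proc_douintvec,
	},
	{ }
};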
Change-Id: I17dcd6def7ce4dcbc1a4be2b51727e85a18c050e
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This reverts commit 2ef5980a4daedca5be699e284e43d0de1cb87139.
Change-Id: I7d597730c3022bb59e872ef030fc7efd403f2817
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
* 'linux-4.14.y' of https://github.com/openela/kernel-lts: (278 commits)
LTS: Update to 4.14.348
docs: kernel_include.py: Cope with docutils 0.21
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks()
dm: limit the number of targets and parameter size area
Revert "selftests: mm: fix map_hugetlb failure on 64K page size systems"
LTS: Update to 4.14.347
rds: Fix build regression.
RDS: IB: Use DEFINE_PER_CPU_SHARED_ALIGNED for rds_ib_stats
af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().
net: fix out-of-bounds access in ops_init
drm/vmwgfx: Fix invalid reads in fence signaled events
dyndbg: fix old BUG_ON in >control parser
tipc: fix UAF in error path
usb: gadget: f_fs: Fix a race condition when processing setup packets.
usb: gadget: composite: fix OS descriptors w_value logic
firewire: nosy: ensure user_length is taken into account when fetching packet contents
af_unix: Fix garbage collector racing against connect()
af_unix: Do not use atomic ops for unix_sk(sk)->inflight.
ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action()
...
Change-Id: If329d39dd4e95e14045bb7c58494c197d1352d60
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
commit a90afe8d020da9298c98fddb19b7a6372e2feb45 upstream.
If the perf buffer isn't large enough, provide a hint about how large it
needs to be for whatever is running.
Link: https://lkml.kernel.org/r/20210831043723.13481-1-robbat2@gentoo.org
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
(cherry picked from commit 78b92d50fe6ab79d536f4b12c5bde15f2751414d)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
This reverts commit bcf4a115a5068f3331fafb8c176c1af0da3d8b19 which is
commit 0958b33ef5a04ed91f61cef4760ac412080c4e08 upstream.
The change has an incorrect assumption about the return value because
in the current stable trees for versions 5.15 and before, the following
commit responsible for making 0 a success value is not present:
b8cc44a4d3c1 ("tracing: Remove logic for registering multiple event triggers at a time")
The return value should be 0 on failure in the current tree, because in
the functions event_trigger_callback() and event_enable_trigger_func(),
we have:
ret = cmd_ops->reg(glob, trigger_ops, trigger_data, file);
/*
* The above returns on success the # of functions enabled,
* but if it didn't find any functions it returns zero.
* Consider no functions a failure too.
*/
if (!ret) {
ret = -ENOENT;
Cc: stable@kernel.org # 5.15, 5.10, 5.4, 4.19
Signed-off-by: Siddh Raman Pant <siddh.raman.pant@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 34925d01baf3ee62ab21c21efd9e2c44c24c004a)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
commit 325f3fb551f8cd672dbbfc4cf58b14f9ee3fc9e8 upstream.
When unloading a module, its state changes MODULE_STATE_LIVE ->
MODULE_STATE_GOING -> MODULE_STATE_UNFORMED, and each transition takes
some time. `is_module_text_address()` and `__module_text_address()`
work with MODULE_STATE_LIVE and MODULE_STATE_GOING.
If we use `is_module_text_address()` and `__module_text_address()`
separately, there is a chance that the first one succeeds but the next
one fails because module->state becomes MODULE_STATE_UNFORMED between
those operations.
In `check_kprobe_address_safe()`, if the second `__module_text_address()`
fails, the failure is ignored because the address is assumed to be a
kernel text address. But it may have failed simply because module->state
has changed to MODULE_STATE_UNFORMED. In this case, arm_kprobe() will
try to modify a module text address that no longer exists
(use-after-free).
To fix this problem, do not use `is_module_text_address()` and
`__module_text_address()` separately; use only `__module_text_address()`
once and do `try_module_get(module)`, which only succeeds while the
module is MODULE_STATE_LIVE.
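A simplified sketch of the fixed check in check_kprobe_address_safe():

	preempt_disable();
	if (!core_kernel_text((unsigned long)p->addr)) {
		/* Resolve the module exactly once... */
		*probed_mod = __module_text_address((unsigned long)p->addr);
		if (!*probed_mod) {
			ret = -EINVAL;
			goto out;
		}
		/*
		 * ...and pin it: try_module_get() only succeeds while the
		 * module is MODULE_STATE_LIVE, so its text cannot turn
		 * MODULE_STATE_UNFORMED between this check and arm_kprobe().
		 */
		if (!try_module_get(*probed_mod)) {
			ret = -ENOENT;
			goto out;
		}
	}
out:
	preempt_enable();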
Link: https://lore.kernel.org/all/20240410015802.265220-1-zhengyejian1@huawei.com/
Fixes: 28f6c37a2910 ("kprobes: Forbid probing on trampoline and BPF code areas")
Cc: stable@vger.kernel.org
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
[Fix conflict due to lack of dependency commit 223a76b268c9
("kprobes: Fix coding style issues")]
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b5808d40093403334d939e2c3c417144d12a6f33)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
commit 6b959ba22d34ca793ffdb15b5715457c78e38b1a upstream.
perf_output_read_group() may respond to an IPI request from another core
and invoke __perf_install_in_context(). As a result, the hwc
configuration is modified, causing inconsistency and unexpected
consequences.
Interrupts are not disabled when perf_output_read_group() reads the PMU
counter. In this case, an IPI request may be received from another core.
As a result, the PMU configuration is modified and an error occurs when
reading the PMU counter:
CPU0 CPU1
__se_sys_perf_event_open
perf_install_in_context
perf_output_read_group smp_call_function_single
for_each_sibling_event(sub, leader) { generic_exec_single
if ((sub != event) && remote_function
(sub->state == PERF_EVENT_STATE_ACTIVE)) |
<enter IPI handler: __perf_install_in_context> <----RAISE IPI-----+
__perf_install_in_context
ctx_resched
event_sched_out
armpmu_del
...
hwc->idx = -1; // event->hwc.idx is set to -1
...
<exit IPI>
sub->pmu->read(sub);
armpmu_read
armv8pmu_read_counter
armv8pmu_read_hw_counter
int idx = event->hw.idx; // idx = -1
u64 val = armv8pmu_read_evcntr(idx);
u32 counter = ARMV8_IDX_TO_COUNTER(idx); // invalid counter = 30
read_pmevcntrn(counter) // undefined instruction
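The fix (sketched from the upstream commit) is to disable interrupts
around the read so the IPI cannot run mid-loop:

static void perf_output_read(struct perf_output_handle *handle,
			     struct perf_event *event)
{
	u64 enabled = 0, running = 0, now;
	unsigned long flags;

	/*
	 * Keep IRQs off for the whole read so an IPI such as
	 * __perf_install_in_context() cannot reschedule the events and
	 * leave hwc->idx == -1 while a sibling counter is being read.
	 */
	local_irq_save(flags);

	calc_timer_values(event, &now, &enabled, &running);

	if (event->attr.read_format & PERF_FORMAT_GROUP)
		perf_output_read_group(handle, event, enabled, running);
	else
		perf_output_read_one(handle, event, enabled, running);

	local_irq_restore(flags);
}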
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220902082918.179248-1-yangjihong1@huawei.com
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a2039c87d30177f0fd349ab000e6af25a0d48de8)
[Vegard: fix conflict in context due to missing commit
ece0857258cbaf20b9828157035999f46ca060c8 ("perf/core: Add a new read
format to get a number of lost samples").]
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
[ Upstream commit 9bc4ffd32ef8943f5c5a42c9637cfd04771d021b ]
psci_init_system_suspend() invokes suspend_set_ops() very early during
bootup, even before the mem_sleep_default kernel command line parameter
is set up. This leads to the mem_sleep_default=s2idle command line
option not working, as mem_sleep_current gets changed to deep via
suspend_set_ops() and never changes back to s2idle.
Set mem_sleep_current along with mem_sleep_default during kernel command
line setup as the default suspend mode.
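Sketch of the fix, mirroring the upstream patch: mem_sleep_current is
set alongside mem_sleep_default while parsing the command line:

static int __init mem_sleep_default_setup(char *str)
{
	suspend_state_t state;

	for (state = PM_SUSPEND_TO_IDLE; state <= PM_SUSPEND_MEM; state++)
		if (mem_sleep_labels[state] &&
		    !strcmp(str, mem_sleep_labels[state])) {
			mem_sleep_default = state;
			mem_sleep_current = state;	/* the fix */
			break;
		}

	return 1;
}
__setup("mem_sleep_default=", mem_sleep_default_setup);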
Fixes: faf7ec4a92c0 ("drivers: firmware: psci: add system suspend support")
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Maulik Shah <quic_mkshah@quicinc.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 312ead3c0e23315596560e9cc1d6ebbee1282e40)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
This reverts commit 05f9f1909535fc539e2639683f3d70f6dfc0bbc9.
Change-Id: I9d179c49a6333c7786985465edb8036636621870
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Change-Id: I1229bdaa5afcba572c24616aedcd831c93316108
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This reverts commit 13cc43b12b3d4200d5bc8d900d05689b23c58aa5.
Change-Id: I990aed89a678011004e123ada5ac07d7b1a8f0f4
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This reverts commit fba50debe41b34bc281947cecd03c67b79f97403.
Change-Id: I3fc44d88fc72afdc8c4e26dce5dca996ed415ff9
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This reverts commit 099732c94bc367ad219a1d35ff5bcd66e9ea0e1e.
Change-Id: I9e6f70c472574770352bf9185736b5c5547091e0
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This reverts commit bf2fdd35c1cab2dd1c704ad2a40b20ae4473a4fe.
Change-Id: Iea1e43e248b9f3a6b2b34c2313aef4e0eaad9381
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The effective affinity mask causes a lot of bugs by virtue of many
set_irq_affinity handlers only setting an effective affinity mask for an
IRQ's parent but not the IRQ itself. Since this is a widespread issue that
would require manual fixing on every different SoC, just disable the
effective affinity mask altogether and use the first CPU in the
configured affinity mask.
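A sketch of the resulting target selection; the helper is illustrative,
while the real change disables the effective-mask machinery itself:

/*
 * Without an effective affinity mask, deliver the interrupt to the
 * first online CPU of the affinity mask that was actually configured.
 */
static unsigned int irq_pick_target_cpu(struct irq_data *data)
{
	return cpumask_first_and(irq_data_get_affinity_mask(data),
				 cpu_online_mask);
}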
Change-Id: Ieec06c5c392250608d954dd144e1513a5bdad56f
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Keeping the compiler in the dark about which sched_feat() toggles are
set reduces its ability to optimize scheduler code, which results in a
measurably significant performance loss since the affected code is very
hot. Turning sched_feat() definitions into macros that the compiler can
resolve results in smaller and faster code.
The sched_feat() definitions are converted to macros with this sed:
sed -i -E \
-e 's/SCHED_FEAT\((.*),(.*)\)/#define SCHED_FEAT_\1\2/' \
-e 's/true/1/' \
-e 's/false/0/' \
kernel/sched/features.h
The sched_feat() macros are named SCHED_FEAT_*. With this done, we can
now optimize try-to-wake-up code when SCHED_FEAT_TTWU_QUEUE is disabled,
which results in a CPU usage reduction of a few percent inside
try_to_wake_up().
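For example (sketched), a features.h entry before and after the
conversion, and the macro that makes the toggle a compile-time constant:

/* Before: value only known at runtime, opaque to the optimizer. */
SCHED_FEAT(TTWU_QUEUE, true)

/* After the sed pass: */
#define SCHED_FEAT_TTWU_QUEUE 1

/* sched_feat() pastes the token, so disabled branches are elided: */
#define sched_feat(x) (SCHED_FEAT_##x)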
Change-Id: Id3564300537308a2db5aedebc890be6f8932019b
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
During load balance, we try at most env->loop_max times to move a task.
But it can happen that the loop_max LRU tasks (i.e. the tail of the
cfs_tasks list) can't be moved to dst_cpu because of affinity. In this
case, loop through the list until we find at least one movable task.
The maximum number of detached tasks remains the same as before.
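Sketch of the corresponding detach_tasks() change, following the
upstream patch:

	while (!list_empty(tasks)) {
		p = list_last_entry(tasks, struct task_struct, se.group_node);

		env->loop++;
		/*
		 * We've more or less seen every task there is; call it
		 * quits, unless nothing movable has been found yet (in
		 * which case LBF_ALL_PINNED is still set).
		 */
		if (env->loop > env->loop_max &&
		    !(env->flags & LBF_ALL_PINNED))
			break;

		/* ... the affinity check clears LBF_ALL_PINNED on the first
		 * movable task; the detached-task cap is unchanged ... */
	}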
Link: https://lkml.kernel.org/r/20220825122726.20819-2-vincent.guittot@linaro.org
Change-Id: Icaab04df216fd58001741d6f234a3964d1569031
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
With a default value of 500us, sysctl_sched_migration_cost is
significantly higher than the cost of load_balance. Remove the
condition and rely on sd->max_newidle_lb_cost to abort
newidle_balance().
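Sketched against the mainline code (the 4.14 port differs in detail):

	/* Before: a static 500us cutoff also gated newidle balancing. */
	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
	    !this_rq->rd->overload)
		goto out;

	/* After: only the measured cost of a newidle balance matters. */
	if (!this_rq->rd->overload ||
	    (sd && this_rq->avg_idle < sd->max_newidle_lb_cost))
		goto out;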
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lore.kernel.org/r/20211019123537.17146-5-vincent.guittot@linaro.org
Change-Id: I5731e7967b03c1c1a542f210d128d58da98fb734
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
In newidle_balance(), the scheduler skips load balancing to the new
idle cpu when, for the 1st sd of this_rq:
this_rq->avg_idle < sd->max_newidle_lb_cost
Doing a costly call to update_blocked_averages() will not be useful and
simply adds overhead when this condition is true.
Check the condition early in newidle_balance() to skip
update_blocked_averages() when possible.
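Sketch: the same gating condition moves ahead of the expensive call:

	rcu_read_lock();
	sd = rcu_dereference_check_sched_domain(this_rq->sd);

	/*
	 * Bail out before paying for update_blocked_averages() when the
	 * 1st domain could not be balanced within avg_idle anyway.
	 */
	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
	    !this_rq->rd->overload ||
	    (sd && this_rq->avg_idle < sd->max_newidle_lb_cost)) {
		rcu_read_unlock();
		goto out;
	}
	rcu_read_unlock();

	update_blocked_averages(this_cpu);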
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lore.kernel.org/r/20211019123537.17146-3-vincent.guittot@linaro.org
Change-Id: I615ac6f2147b4aa044b618957d31392927c1c283
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The time spent updating the blocked load can be significant depending
on the complexity of the cgroup hierarchy. Take this time into account
in the cost of the 1st load balance of a newly idle cpu.
Also reduce the number of calls to sched_clock_cpu() and track more of
the actual work.
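Sketch of the accounting added to newidle_balance():

	u64 t0, t1, curr_cost = 0;

	t0 = sched_clock_cpu(this_cpu);
	update_blocked_averages(this_cpu);
	t1 = sched_clock_cpu(this_cpu);

	/* Charge the blocked-load update to this newidle balance... */
	curr_cost += t1 - t0;

	/* ...so it raises the tracked cost like the balance itself does. */
	if (curr_cost > this_rq->max_idle_balance_cost)
		this_rq->max_idle_balance_cost = curr_cost;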
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lore.kernel.org/r/20211019123537.17146-2-vincent.guittot@linaro.org
Change-Id: I0921c8b7a37ee1bb764d9de41ca4902ca2bd15c6
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
When NR_CPUS fits in a long, it's possible to use compiler built-ins to
produce much faster code when operating on cpumasks compared to just using
the generic bitops APIs.
Therefore, add optimized helpers using compiler built-ins when NR_CPUS fits
in a long. This also turns nr_cpu_ids into a compile-time constant for
further optimization potential.
Note that compared to the upstream cpumask rewrite with this feature, these
optimized helpers perfectly preserve the semantics of the helpers they
replace. And this change is much smaller than the upstream version.
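A hedged sketch of the shape of these helpers; the names mirror the
generic API they replace:

#if NR_CPUS <= BITS_PER_LONG
/* The whole mask fits in one long: operate on it directly. */
static inline bool cpumask_test_cpu(int cpu, const struct cpumask *mask)
{
	return mask->bits[0] & (1UL << cpu);
}

static inline unsigned int cpumask_weight(const struct cpumask *mask)
{
	return __builtin_popcountl(mask->bits[0]);
}

static inline unsigned int cpumask_first(const struct cpumask *mask)
{
	unsigned long m = mask->bits[0];

	return m ? __builtin_ctzl(m) : NR_CPUS;
}
#endif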
Change-Id: I1ac6058a19bd3b22a491176eef9d661cca78e521
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
[ Upstream commit 93f141324d4860a1294e6899923c01ec5411d70b ]
s2idle_wait_head is used during s2idle with interrupts disabled even on
RT. There is no "custom" wake up function so swait could be used instead
which is also lower weight compared to the wait_queue.
Make s2idle_wait_head a swait_queue_head.
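The conversion itself is small; a sketch with the wake condition
simplified:

static DECLARE_SWAIT_QUEUE_HEAD(s2idle_wait_head);

	/* Sleep side: */
	swait_event(s2idle_wait_head, s2idle_state == S2IDLE_STATE_WAKE);

	/* Wake side: */
	swake_up(&s2idle_wait_head);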
Change-Id: Iac95f519c770c147ab9fa83de9616095408bc062
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
[ Upstream commit ec7ff06b919647a2fd7d2761a26f5a1d465e819c ]
This is an updated version of this patch which was merged upstream as
commit c1a957d17086d20d52d7f9c8dffaeac2ee09d6f9
Change-Id: I066f092b7bd3c5729af78c330f1049f84f0b7ad6
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Completions have no long-lasting callbacks and therefore do not need
the complex waitqueue variant. Use simple waitqueues which reduces the
contention on the waitqueue lock.
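The core of the change, sketched; callers move to the corresponding
swait_event()/swake_up() variants:

struct completion {
	unsigned int done;
	struct swait_queue_head wait;	/* was: wait_queue_head_t wait */
};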
Change-Id: I071e1408cb43f2b432179c42b01ced8df6efd127
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This reverts commit 64a6b2c346641aa6d8b08368dad2e085dd82f3d1.
Change-Id: Ifccdc2e16c44c5c67ed974ea486411b6f6bc42d7
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
There isn't a need for cpus_affine to be atomic, and reading/writing to
it outside of the global pm_qos lock is racy anyway. As such, we can
simply turn it into a primitive integer type.
Change-Id: Icd8eceb3fcf1a07f345ce0ed104b930e405900d1
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
The plist is already sorted and traversed in ascending order of PM QoS
value, so we can simply look at the lowest PM QoS values which affect
the given request's CPUs until we've looked at all of them, at which
point the traversal can be stopped early. This also lets us get rid of
the pesky qos_val array.
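A hedged sketch of the early-stop traversal; field names such as
cpus_affine and target_per_cpu are assumptions about this pm_qos
variant:

	/* cpus_left starts as the updated request's affinity mask. */
	plist_for_each_entry(req, &c->list, node) {
		for_each_cpu(cpu, &req->cpus_affine) {
			if (!cpumask_test_cpu(cpu, cpus_left))
				continue;
			/*
			 * The plist is ascending and the lowest value wins
			 * for CPU_DMA_LATENCY, so the first match is final.
			 */
			c->target_per_cpu[cpu] = req->node.prio;
			cpumask_clear_cpu(cpu, cpus_left);
		}
		/* Every affected CPU resolved: stop the traversal early. */
		if (cpumask_empty(cpus_left))
			return;
	}

	/* CPUs with no remaining request fall back to the default. */
	for_each_cpu(cpu, cpus_left)
		c->target_per_cpu[cpu] = c->default_value;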
Change-Id: I5456b976b16f3544ac9ae3289446f5a4e115e6ef
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Andrzej Perczak discovered that his CPUs would almost never enter an
idle state deeper than C0, and pinpointed the cause of the issue to be
commit "qos: Speed up pm_qos_set_value_for_cpus()". As it turns out, the
optimizations introduced in that commit contain two issues that are
responsible for this behavior: pm_qos_remove_request() fails to refresh
the affected per-CPU targets, and IRQ migrations fail to refresh their
old affinity's targets.
Removing a request fails to refresh the per-CPU targets because
`new_req->node.prio` isn't updated to the PM QoS class' default value
upon removal, and so it contains its old value from when it was active.
This causes the `changed` loop in pm_qos_set_value_for_cpus() to check
against a stale PM QoS request value and erroneously determine that the
request in question doesn't alter the current per-CPU targets.
As for IRQ migrations, only the new CPU affinity mask gets updated,
which causes the CPUs present in the old affinity mask but not the new
one to retain their targets, specifically when a migration occurs while
the associated PM QoS request is active.
To fix these issues while retaining optimal speed, update PM QoS
requests' CPU affinity inside pm_qos_set_value_for_cpus() so that the
old affinity can be known, and skip the `changed` loop when the request
in question is being removed.
Change-Id: I8b4626a253a85a19d1fafc197ba2c18a6ed73bbf
Reported-by: Andrzej Perczak <kartapolska@gmail.com>
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
A lot of unnecessary work is done in pm_qos_set_value_for_cpus(),
especially when the request being updated isn't affined to all CPUs.
We can reduce the work done here significantly by only inspecting the
CPUs which are affected by the updated request, and bailing out if the
updated request doesn't change anything.
We can make some other micro-optimizations as well knowing that this
code is only for the PM_QOS_CPU_DMA_LATENCY class.
Change-Id: I9e8e7bb09d20a97b9c5853c32ececb17f17a891d
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This reverts commit b785b90d42a353dde68edc7cd1ad2166772fdc1f.
Change-Id: I9ac0f0998104f01243b0c35e3ef1ec530bedd03a
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This reverts commit fae4a4fed88fad9f23590d9890c5e8641565ce01.
Change-Id: I4b95b19ac3fc0ab48e00136cd9cf7c67adbc0f1f
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Within a DynamIQ Shared Unit (DSU), task migration cost is optimized
through L2 and L3 cache sharing. When a task is migrated between CPUs
within the same DSU cluster, there is no loss of L2$ and L3$ locality.
Since all CPU cores are tightly interconnected within a DSU, set the task
migration cost to zero knowing that the DSU will facilitate L$ sharing via
snooping.
This leverages the DSU to improve power consumption and performance on
systems which contain a single DSU cluster.
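The change itself reduces to zeroing the existing tunable (sketch):

/*
 * Single-DSU system: L2/L3 contents remain shared across intra-cluster
 * migrations, so never consider a task too cache hot to migrate.
 */
const_debug unsigned int sysctl_sched_migration_cost = 0;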
Change-Id: Ifce37fc6c125097c1965d12c61b4ee351b9dbd94
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
Unity-based games (such as Wild Rift) like to shoot themselves in the foot
by setting a nonsense CPU affinity, restricting the game to a narrow set of
CPU cores that it thinks are the "big" cores in a heterogeneous CPU. It
assumes that CPUs only have two performance domains (clusters), and
therefore royally mucks up games' CPU affinities on CPUs which have more
than two performance domains.
Check if a setaffinity target task is part of a Unity-based game and
silently ignore the setaffinity request so that it can't sabotage itself.
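A hedged sketch of the gate; matching on the "UnityMain" thread name is
an assumption about how such games are identified:

long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
{
	struct task_struct *p;

	/* ... task lookup and permission checks ... */

	/*
	 * Accept the call but change nothing for Unity's main thread, so
	 * the game cannot restrict itself to a wrongly-guessed cluster.
	 */
	if (!strcmp(p->comm, "UnityMain"))
		return 0;

	/* ... normal affinity update continues ... */
}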
Change-Id: I47c292bd2b9f08b6299378308ba646168b341fb5
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>
This reverts commit 316fc7da752bbbf792ee8240c2303200e36e91eb.
Change-Id: I371bd4d88c5175ee1e4dabbf186aa44defeec711
Signed-off-by: Richard Raya <rdxzv.dev@gmail.com>