26917 Commits

Author SHA1 Message Date
qctecmdr Service
4402e6fdbd Merge "sched: walt: Optimize cpu_util() and cpu_util_cum()" 2018-08-22 16:27:43 -07:00
Pavankumar Kondeti
325ffb56ba sched: Improve the scheduler
This change is for general scheduler improvement.

Change-Id: I5ad0b6e4e90e7d9cb3bd5d5bd5e909ebcb9739e2
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-08-21 12:29:46 -07:00
Pavankumar Kondeti
eb35eebc5c sched: walt: Optimize cpu_util() and cpu_util_cum()
The task demand in 1024 units is readily available in task_struct, so
use it directly for cumulative_runnable_avg and cum_window_demand
accounting. The cpu_util() and cpu_util_cum() functions, which are
called multiple times during task placement, can return the scaled
values without doing any math.

Scaling the sum of the unscaled task demands is more accurate than
summing the scaled per-task demands, but the latter is good enough for
task placement decisions.

Change-Id: Iba4be93cd34f130bed1cb533ecaa52ab8bae5f3d
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-08-21 10:41:37 +05:30
Pavankumar Kondeti
289dd294f3 sched: walt: Optimize task_util()
task_util() for WALT is currently defined as

p->ravg.demand / (sched_ravg_window >> SCHED_CAPACITY_SHIFT);

This math is required to scale the task demand to the 1024 scale.
task_util() is used many times in task placement, so these calls can
be optimized by caching the scaled value when the task demand is
calculated.

Change-Id: I0c170a10704ae3e8fe4e9f271e8e65c3923075e5
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-08-21 10:41:36 +05:30
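A minimal user-space sketch of the scaling and caching described in the
commit above; the struct, field names, and window value are illustrative
stand-ins, not the exact WALT definitions:

    #include <stdio.h>

    #define SCHED_CAPACITY_SHIFT 10            /* capacity scale = 1024 */

    /* Assumed 20ms WALT window, in ns. */
    static const unsigned int sched_ravg_window = 20000000;

    struct walt_demand {
            unsigned int demand;        /* time demand within the window, in ns */
            unsigned int demand_scaled; /* cached demand scaled to 0..1024 */
    };

    /* Old scheme: divide on every task_util() call. */
    static unsigned int task_util_old(const struct walt_demand *d)
    {
            return d->demand / (sched_ravg_window >> SCHED_CAPACITY_SHIFT);
    }

    /* New scheme: scale once, when the demand itself is updated. */
    static void update_demand(struct walt_demand *d, unsigned int demand_ns)
    {
            d->demand = demand_ns;
            d->demand_scaled = demand_ns / (sched_ravg_window >> SCHED_CAPACITY_SHIFT);
    }

    int main(void)
    {
            struct walt_demand d;

            update_demand(&d, 10000000);  /* busy for 10ms of a 20ms window */
            printf("%u %u\n", task_util_old(&d), d.demand_scaled); /* 512 512 */
            return 0;
    }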
Pavankumar Kondeti
635bbd2366 sched: walt: Refactor WALT initialization routine
The global variables are initialized in walt_sched_init(). This
function is called for each rq initialization and the global
variables are initialized again unnecessarily. Refactor the code
to fix this.

Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Change-Id: I085bc9ff576f655afa12fda015f054fbab3a0c91
2018-08-21 10:41:36 +05:30
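A minimal sketch of the refactoring pattern implied by the commit above:
perform the global setup once, independent of the per-rq calls. All
names and values here are illustrative, not the actual WALT code:

    #include <stdbool.h>

    static unsigned int walt_window_ns;   /* stand-in for the WALT globals */

    static void walt_init_globals_once(void)
    {
            static bool initialized;

            if (initialized)
                    return;
            walt_window_ns = 20000000;    /* assumed 20ms default window */
            initialized = true;
    }

    /* Called once per runqueue; the global setup above no longer repeats. */
    static void walt_sched_init_rq(void)
    {
            walt_init_globals_once();
            /* ... per-rq initialization only ... */
    }

    int main(void)
    {
            for (int cpu = 0; cpu < 4; cpu++)
                    walt_sched_init_rq();
            return 0;
    }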
Pavankumar Kondeti
0f07a9533e sched: walt: Remove unused last_switch_out_ts member from task_struct
last_switch_out_ts member in task_struct is not used. So remove it.

Change-Id: I7f088f1ee86fa61d3ce88646734866dd7ed2bfe2
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-08-20 19:10:08 +05:30
qctecmdr Service
1c65a0720e Merge "sched: Improve the scheduler" 2018-08-16 08:54:20 -07:00
qctecmdr Service
e475065714 Merge "timers: Clear timer_base::must_forward_clk with timer_base::lock held" 2018-08-15 14:03:34 -07:00
qctecmdr Service
3fd2e0f21e Merge "debugfs: defer debugfs_fsdata allocation to first usage" 2018-08-15 10:39:25 -07:00
Pavankumar Kondeti
4dfac456ce sched: Improve the scheduler
This change is for general scheduler improvement.

Change-Id: Ib93a77c1d12574354213a37b1171a462b7cab685
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-08-14 14:13:51 -07:00
Gaurav Kohli
0484d2d1a0 timers: Clear timer_base::must_forward_clk with timer_base::lock held
timer_base::must_forward_clk indicates that the base clock might be
stale due to a long idle sleep.

The forwarding of the base clock takes place in the timer softirq or when a
timer is enqueued to a base which is idle. If the enqueue of timer to an
idle base happens from a remote CPU, then the following race can happen:

  CPU0					CPU1
  run_timer_softirq			mod_timer

					base = lock_timer_base(timer);
  base->must_forward_clk = false
					if (base->must_forward_clk)
				       	    forward(base); -> skipped

					enqueue_timer(base, timer, idx);
					-> idx is calculated high due to
					   stale base
					unlock_timer_base(timer);
  base = lock_timer_base(timer);
  forward(base);

The root cause is that timer_base::must_forward_clk is cleared outside the
timer_base::lock held region, so the remote queuing CPU observes it as
cleared, but the base clock is still stale. This can cause large
granularity values for timers, i.e. the accuracy of the expiry time
suffers.

Prevent this by clearing the flag with timer_base::lock held, so that the
forwarding takes place before the cleared flag is observable by a remote
CPU.

Change-Id: Ia1d8eb0807861a2a520ea2fee0df00daeeccb675
Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john.stultz@linaro.org
Cc: sboyd@kernel.org
Cc: linux-arm-msm@vger.kernel.org
Link: https://lkml.kernel.org/r/1533199863-22748-1-git-send-email-gkohli@codeaurora.org
Git-commit: 363e934d8811d799c88faffc5bfca782fd728334
Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2018-08-14 13:49:02 -07:00
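A heavily reduced sketch of the rule the fix enforces in
kernel/time/timer.c: clear the flag and forward the base while
timer_base::lock is held, so a remote CPU that takes the lock afterwards
can no longer observe a cleared flag together with a stale clock. The
struct is simplified and locking is indicated only in comments:

    struct timer_base_sketch {
            /* raw_spinlock_t lock; */
            unsigned long clk;            /* base clock */
            int must_forward_clk;         /* "clock may be stale" marker */
    };

    static void forward_timer_base(struct timer_base_sketch *base)
    {
            base->clk++;                  /* ... advance to current jiffies ... */
    }

    static void run_timers_locked(struct timer_base_sketch *base)
    {
            /* raw_spin_lock_irq(&base->lock); */
            base->must_forward_clk = 0;   /* cleared with the lock held       */
            forward_timer_base(base);     /* base->clk is fresh before unlock */
            /* ... collect and expire timers ... */
            /* raw_spin_unlock_irq(&base->lock); */
    }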
Pavankumar Kondeti
59e635fb51 sched: Improve the scheduler
This change is for general scheduler improvement.

Change-Id: Iea8e3a6f6b2d83a755b51674e08a1cadcc0247ba
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-08-14 12:48:43 -07:00
Raghavendra Rao Ananta
584987cdca trace: convert to debugfs_file_get() and -put()
Convert the deprecated calls debugfs_use_file_start() and
debugfs_use_file_finish() to debugfs_file_get() and debugfs_file_put()
respectively.

Change-Id: Ifdf214d4274a4062d7a077fdc94a9528d960eed6
Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
2018-08-13 22:12:13 -07:00
qctecmdr Service
5003c47289 Merge "sched/fair: Do load balancing of misfit task only when dst_cpu is idle" 2018-08-13 20:49:18 -07:00
qctecmdr Service
73db437ff5 Merge "sched: walt: Add BUG_ON() when wallclock goes backwards" 2018-08-13 13:36:15 -07:00
Satya Durga Srinivasu Prabhala
8832b80a65 sched/fair: Do load balancing of misfit task only when dst_cpu is idle
Commit 53adc803f95a ("FROMLIST: sched/fair: Consider misfit tasks
when load-balancing") introduces a change whereby a higher capacity
CPU could pull a misfit task from a lower capacity CPU via active load
balancing. However, it fails to take into account whether the dst cpu
is already busy; we want to avoid overcrowding the dst cpu and causing
further migrations away from it.

Add a check to ensure that misfit tasks are pulled via active
load balancing only if the dst cpu is idle.

Change-Id: Iceee1c36dba445a45aaee5e380a729dac68e56c8
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-13 10:24:01 -07:00
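The added guard reduces to something like the sketch below; in the
kernel the check presumably gates the misfit case of the load-balance
path on idle_cpu(env->dst_cpu), but the helper here is only an
illustration:

    #include <stdbool.h>

    /* Pull a misfit task via active load balance only when the destination
     * CPU currently has nothing else to run. */
    static bool misfit_pull_allowed(bool busiest_has_misfit, bool dst_cpu_idle)
    {
            return busiest_has_misfit && dst_cpu_idle;
    }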
John Dias
c82c3d2372 sched: walt: fix out-of-bounds access
A computation in update_top_tasks() is indexing
off the end of a top_tasks array. There's code
to limit the index in the computation, but it's
insufficient.

Bug: 110529282
Change-Id: Idb5ff5e5800c014394bcb04638844bf1e057a40c
Signed-off-by: John Dias <joaodias@google.com>
Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
2018-08-09 06:07:23 -07:00
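The commit gives no code-level detail beyond "the limiting is
insufficient"; the defensive pattern it points at is to clamp the
computed index before it ever reaches the array. The array name and
size below are hypothetical:

    #define TOP_TASKS_SLOTS 1000   /* hypothetical size of a top_tasks[] array */

    static unsigned int clamp_top_task_index(unsigned int index)
    {
            /* Clamp after all of the index arithmetic, so no computed value
             * can address past the end of the array. */
            return index < TOP_TASKS_SLOTS ? index : TOP_TASKS_SLOTS - 1;
    }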
qctecmdr Service
0aa2590930 Merge "nohz: Fix local_timer_softirq_pending()" 2018-08-09 01:36:46 -07:00
qctecmdr Service
3f60b75007 Merge "ANDROID: sched/events: Introduce util_est trace events" 2018-08-06 15:21:05 -07:00
qctecmdr Service
a445fba039 Merge "stop_machine: Atomically queue and wake stopper threads" 2018-08-06 10:40:41 -07:00
Anna-Maria Gleixner
218a1100c4 nohz: Fix local_timer_softirq_pending()
local_timer_softirq_pending() checks whether the timer softirq is
pending with: local_softirq_pending() & TIMER_SOFTIRQ.

This is wrong because TIMER_SOFTIRQ is the softirq number and not a
bitmask. So the test checks for the wrong bit.

Use BIT(TIMER_SOFTIRQ) instead.

Change-Id: I48cb96a585d1f60d5ad8aaceef077eac08ff37d0
Fixes: 5d62c183f9e9 ("nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()")
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Cc: bigeasy@linutronix.de
Cc: peterz@infradead.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180731161358.29472-1-anna-maria@linutronix.de
Git-commit: 80d20d35af1edd632a5e7a3b9c0ab7ceff92769e
Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2018-08-06 09:09:18 -07:00
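The bug and fix are easy to see in isolation; a tiny standalone program
with only the relevant softirq numbers (the values match the kernel's
enum, where TIMER_SOFTIRQ == 1):

    #include <stdio.h>

    #define BIT(nr) (1UL << (nr))

    enum { HI_SOFTIRQ = 0, TIMER_SOFTIRQ = 1 };   /* numbers, not bitmasks */

    int main(void)
    {
            unsigned long pending = BIT(TIMER_SOFTIRQ);   /* timer softirq raised */

            /* Buggy test: TIMER_SOFTIRQ == 1, so this really checks bit 0. */
            printf("wrong: %lu\n", pending & TIMER_SOFTIRQ);      /* prints 0 */
            /* Fixed test: check the TIMER_SOFTIRQ bit itself. */
            printf("right: %lu\n", pending & BIT(TIMER_SOFTIRQ)); /* prints 2 */
            return 0;
    }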
Patrick Bellasi
bb343c1564 ANDROID: sched/events: Introduce util_est trace events
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Change-Id: I359f7ffbd62e86a16a96d7f02da38e9ff260fd99
Git-commit: a9d8b29e5195a8ecc3b485f19b71533b07e34792
Git-repo: https://android.googlesource.com/kernel/common/
[satyap@codeaurora.org: trivial merge conflict resolution in
include/trace/events/sched.h]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 14:24:41 -07:00
Patrick Bellasi
44d3dca274 ANDROID: sched/fair: schedtune: update before schedutil
When a task is enqueued, its boosted value must be accounted on that CPU
to better support the selection of the required frequency.
However, schedutil is (implicitly) updated by update_load_avg() which
always happens before schedtune_{en,de}queue_task(), thus potentially
introducing a latency between boost value updates and frequency
selections.

Let's update schedtune at the beginning of enqueue_task_fair(),
which will ensure that all schedutil updates will see the most
updated boost value for a CPU.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Change-Id: I1038f00600dd43ca38b76b2c5681b4f438ae4036
Git-commit: 78605d7daf0b8a9ec559fc38431dd19f015eed7a
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 14:24:37 -07:00
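Schematically, the change is only about ordering inside
enqueue_task_fair(); the helper names below are placeholders for the
schedtune accounting and the load-average update, not the real kernel
functions:

    /* Placeholders for the two steps whose order matters. */
    static void account_boost_for_cpu(void)  { /* schedtune_enqueue_task()     */ }
    static void update_load_avg_sketch(void) { /* implicitly updates schedutil */ }

    static void enqueue_task_fair_sketch(void)
    {
            /* Moved to the top by this patch, so the schedutil update
             * triggered below already sees the new boost value. */
            account_boost_for_cpu();

            update_load_avg_sketch();
            /* ... rest of the enqueue path ... */
    }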
Patrick Bellasi
456f33d5cb FROMLIST: sched/fair: add support to tune PELT ramp/decay timings
The PELT half-life is the time [ms] required by the PELT signal to build
up a 50% load/utilization, starting from zero. This time is currently
hardcoded to be 32ms, a value which seems to make sense for most of the
workloads.

However, 32ms has been verified to be too long for certain classes of
workloads. For example, in the mobile space many tasks affecting the
user-experience run with a 16ms or 8ms cadence, since they need to match
the common 60Hz or 120Hz refresh rate of the graphics pipeline.
This has contributed so far to the idea that "PELT is too slow" to
properly track the utilization of interactive mobile workloads,
especially compared to alternative load tracking solutions which
provide a better representation of task demand in the range of 10-20ms.

A faster PELT ramp-up time reduces the time required for the signal to
stabilize and thus better represents task demands in the mobile space.
As a downside, it also reduces the
decay time, and thus we forget the load/utilization of sleeping tasks
(or idle CPUs) faster.

Fortunately, since the integration of the utilization estimation
support in mainline kernel:

   commit 7f65ea42eb00 ("sched/fair: Add util_est on top of PELT")

a fast decay time is no longer an issue for tasks utilization estimation.
Although estimated utilization does not slow down the decay of blocked
utilization on idle CPUs, for mobile workloads this seems not to be a
major concern compared to the benefits in interactivity responsiveness.

Let's add a compile time option to choose the PELT speed which better
fits for a specific system. By default the current 32ms half-life is
used, but we can also compile a kernel to use a faster ramp-up time of
either 16ms or 8ms. These two configurations have been verified to give
PELT a further improvement in performance, compared to other out-of-tree
load tracking solutions, when it comes to track interactive workloads
thus better supporting both tasks placements and frequencies selections.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Paul Turner <pjt@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

[
 backport from LKML:
 Message-ID: <20180409165134.707-1-patrick.bellasi@arm.com>
]
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Change-Id: I50569748918b799ac4bf4e7d2b387253080a0fd2
Git-commit: cb22d9159761cb32c35a5f9399b8011fcdae654b
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 14:24:32 -07:00
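As a worked example of what the half-life choice means: PELT decays its
running sums by a factor y every ~1ms period, with y chosen so that
y^half_life = 0.5, i.e. a steady load reaches 50% after one half-life.
A small standalone program to print the per-period decay factor for the
three configurable half-lives (compile with -lm):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
            int half_life[] = { 32, 16, 8 };    /* ms, one per config option */

            for (int i = 0; i < 3; i++) {
                    double y = pow(0.5, 1.0 / half_life[i]);

                    printf("half-life %2dms: y = %.6f, 50%% of a steady load "
                           "after %dms\n", half_life[i], y, half_life[i]);
            }
            return 0;
    }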
Patrick Bellasi
4869fc02aa BACKPORT: sched/fair: Update util_est before updating schedutil
When a task is enqueued, the estimated utilization of a CPU is updated
to better support the selection of the required frequency.

However, schedutil is (implicitly) updated by update_load_avg() which
always happens before util_est_{en,de}queue(), thus potentially
introducing a latency between estimated utilization updates and
frequency selections.

Let's update util_est at the beginning of enqueue_task_fair(),
which will ensure that all schedutil updates will see the most
updated estimated utilization value for a CPU.

Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Fixes: 7f65ea42eb00 ("sched/fair: Add util_est on top of PELT")
Link: http://lkml.kernel.org/r/20180524141023.13765-3-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

[
 backport from upstream:
 commit 2539fc82aa9b ("sched/fair: Update util_est before updating schedutil")
]
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Change-Id: I3bb0df07097d2abebe595198795c78a5cc8c2d43
Git-commit: 7fc13166181e8b70832b0491922c9c8c763d680f
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 14:24:23 -07:00
Patrick Bellasi
e4c8ea3e7f BACKPORT: sched/fair: Update util_est only on util_avg updates
The estimated utilization of a task is currently updated every time the
task is dequeued. However, to keep overheads under control, PELT signals
are effectively updated at maximum once every 1ms.

Thus, for really short running tasks, it can happen that their util_avg
value has not been updated since their last enqueue. If such tasks are
also frequently running tasks (e.g. the kind of workload generated by
hackbench) it can also happen that their util_avg is updated only every
few activations.

This means that updating util_est at every dequeue potentially
introduces unnecessary overheads, and it's also conceptually wrong if
the util_avg signal has never been updated during a task activation.

Let's introduce a throttling mechanism on task's util_est updates
to sync them with util_avg updates. To make the solution memory
efficient, both in terms of space and load/store operations, we encode a
synchronization flag into the LSB of util_est.enqueued.
This makes util_est an even-values-only metric, which is still
considered good enough for its purpose.
The synchronization bit is (re)set by __update_load_avg_se() once the
PELT signal of a task has been updated during its last activation.

Such a throttling mechanism keeps util_est overheads in the wakeup hot
path under control, making it suitable to be enabled also on
high-intensity workload systems.
Thus, this patch now switches on the utilization estimation scheduler
feature by default.

Suggested-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@android.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20180309095245.11071-5-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

[
 backport from upstream:
 commit d519329f72a6 ("sched/fair: Update util_est only on util_avg updates")
]
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Change-Id: I1309cffc11c1708c1030364facced7b25bcb49d7
Git-commit: 3356f39e9adca368bca214340c0bdc6234e35cad
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 14:24:09 -07:00
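A small illustration of the LSB encoding described above.
UTIL_AVG_UNCHANGED is the flag name used upstream for this bit; the
helpers are simplified and only show how the value and the flag share
one field:

    #include <stdbool.h>
    #include <stdio.h>

    #define UTIL_AVG_UNCHANGED 0x1   /* LSB of util_est.enqueued reused as a flag */

    /* The estimate itself only ever uses even values... */
    static unsigned int util_est_value(unsigned int enqueued)
    {
            return enqueued & ~UTIL_AVG_UNCHANGED;
    }

    /* ...so the LSB is free to carry the synchronization state. */
    static bool util_est_flag(unsigned int enqueued)
    {
            return enqueued & UTIL_AVG_UNCHANGED;
    }

    int main(void)
    {
            unsigned int enqueued = 416 | UTIL_AVG_UNCHANGED;

            printf("value=%u flag=%d\n",
                   util_est_value(enqueued), util_est_flag(enqueued));
            return 0;
    }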
Patrick Bellasi
2fe1540146 BACKPORT: sched/fair: Use util_est in LB and WU paths
When the scheduler looks at the CPU utilization, the current PELT value
for a CPU is returned straight away. In certain scenarios this can have
undesired side effects on task placement.

For example, since the task utilization is decayed at wakeup time, when
a long sleeping big task is enqueued, it does not immediately add a
significant contribution to the target CPU.
As a result we generate a race condition where other tasks can be placed
on the same CPU while it is still considered relatively empty.

In order to reduce this kind of race conditions, this patch introduces the
required support to integrate the usage of the CPU's estimated utilization
in the wakeup path, via cpu_util_wake(), as well as in the load-balance
path, via cpu_util() which is used by update_sg_lb_stats().

The estimated utilization of a CPU is defined to be the maximum between
its PELT's utilization and the sum of the estimated utilization (at
previous dequeue time) of all the tasks currently RUNNABLE on that CPU.
This allows properly representing the spare capacity of a CPU which,
for example, has just started running a big task after a long sleep
period.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@android.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20180309095245.11071-3-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

[
 backport from upstream:
 commit f9be3e5 ("sched/fair: Use util_est in LB and WU paths")

 This provides also schedutil integration, since:
    sugov_get_util()
       boosted_cpu_util()
          cpu_util_freq()
             cpu_util()

 thus, not requiring to backport:
 commit a07630b8b2c1 ("sched/cpufreq/schedutil: Use util_est for OPP selection")

 Support for energy_diff is also provided, since:
    calc_sg_energy()
       find_new_capacity()
          group_max_util()
             cpu_util_wake()
 and:
       group_norm_util()
          cpu_util_wake()

 Where both cpu_util() and cpu_util_wake() already consider the estimated
 utilization in case of PELT being in use.
]
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Change-Id: I2be201cf7bb0b1449b14b4da64844067dbdb5eb4
Git-commit: f9fb6fbeb4c2d34482cd605acfb6f406459a7246
Git-repo: https://android.googlesource.com/kernel/common/
[satyap@codeaurora.org:
1. Updated cpu_util() as needed and keep the function in
   kernel/sched/sched.h
2. Trivial merge conflicts resolution for task_util_est() call sites
]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 14:23:49 -07:00
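The CPU-level rule described above reduces to a max() capped by the
CPU's original capacity; a sketch with plain integers in place of the
real rq fields (PELT path only, WALT branches left out):

    /* Never report less than the sum of the estimates of the tasks
     * currently runnable on the CPU, and never more than its capacity. */
    static unsigned long cpu_util_est_sketch(unsigned long util_avg,
                                             unsigned long util_est_enqueued,
                                             unsigned long capacity_orig)
    {
            unsigned long util = util_avg > util_est_enqueued ?
                                 util_avg : util_est_enqueued;

            return util < capacity_orig ? util : capacity_orig;
    }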
Patrick Bellasi
8f24e60011 BACKPORT: sched/fair: Add util_est on top of PELT
The util_avg signal computed by PELT is too variable for some use-cases.
For example, a big task waking up after a long sleep period will have its
utilization almost completely decayed. This introduces some latency before
schedutil will be able to pick the best frequency to run a task.

The same issue can affect task placement. Indeed, since the task
utilization is already decayed at wakeup, when the task is enqueued in a
CPU, this can result in a CPU running a big task being temporarily
represented as almost empty. This leads to a race condition where
other tasks can be potentially allocated on a CPU which just started to run
a big task which slept for a relatively long period.

Moreover, the PELT utilization of a task can be updated every [ms], thus
making it a continuously changing value for certain longer running
tasks. This means that the instantaneous PELT utilization of a RUNNING
task is not really meaningful to properly support scheduler decisions.

For all these reasons, a more stable signal can do a better job of
representing the expected/estimated utilization of a task/cfs_rq.
Such a signal can be easily created on top of PELT by still using it as
an estimator which produces values to be aggregated on meaningful
events.

This patch adds a simple implementation of util_est, a new signal built on
top of PELT's util_avg where:

    util_est(task) = max(task::util_avg, f(task::util_avg@dequeue))

This allows remembering how big a task was reported to be by PELT in
its previous activations, via f(task::util_avg@dequeue), which is the
new
_task_util_est(struct task_struct*) function added by this patch.

If a task should change its behavior and it runs longer in a new
activation, after a certain time its util_est will just track the
original PELT signal (i.e. task::util_avg).

The estimated utilization of cfs_rq is defined only for root ones.
That's because the only sensible consumers of this signal are the
scheduler and schedutil when looking for the overall CPU utilization
due to FAIR tasks.

For this reason, the estimated utilization of a root cfs_rq is simply
defined as:

    util_est(cfs_rq) = max(cfs_rq::util_avg, cfs_rq::util_est::enqueued)

where:

    cfs_rq::util_est::enqueued = sum(_task_util_est(task))
                                 for each RUNNABLE task on that root cfs_rq

It's worth noting that the estimated utilization is tracked only for
objects of interests, specifically:

 - Tasks: to better support tasks placement decisions
 - root cfs_rqs: to better support both tasks placement decisions as
                 well as frequencies selection

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@android.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20180309095245.11071-2-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

[
 backport from upstream:
 commit 7f65ea42eb00 ("sched/fair: Add util_est on top of PELT")

 No major changes.
]
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Change-Id: Icfa41dd73bd4da674b0044cacb11f320cf39eabf
Git-commit: 700f1172f7a7cb06a037670fa1145f8cf4bdedbb
Git-repo: https://android.googlesource.com/kernel/common/
[satyap@codeaurora.org:
1. Update task_util() as needed and keep the functionality in
   kernel/sched/sched.h
2. Resolve trivial merge conflicts in kernel/sched/fair.c
]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 14:17:30 -07:00
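The two defining formulas quoted above reduce to one-line max()
helpers; a sketch with plain integers in place of the real task and
cfs_rq fields:

    /* util_est(task) = max(task::util_avg, f(task::util_avg@dequeue)) */
    static unsigned long task_util_est_sketch(unsigned long util_avg,
                                              unsigned long util_est_at_dequeue)
    {
            return util_avg > util_est_at_dequeue ? util_avg : util_est_at_dequeue;
    }

    /* util_est(cfs_rq) = max(cfs_rq::util_avg, cfs_rq::util_est::enqueued),
     * where enqueued is the sum of _task_util_est() over the runnable tasks. */
    static unsigned long cfs_rq_util_est_sketch(unsigned long cfs_util_avg,
                                                unsigned long util_est_enqueued)
    {
            return cfs_util_avg > util_est_enqueued ? cfs_util_avg : util_est_enqueued;
    }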
Patrick Bellasi
1f31b82834 ANDROID: sched/fair: Cleanup cpu_util{_wake}()
The current implementation of cpu_util and cpu_util_wake makes
backporting mainline patches more difficult, and it also has some
features no longer required by the current EAS code, e.g. delta
utilization in __cpu_util().

Let's clean up these functions definitions to:
1. get rid of the no longer required __cpu_util(cpu, delta)
   This function is now only called with delta=0 and thus we can refold
   its implementation into the original wrapper function cpu_util(cpu)
2. optimize for the WALT path on CONFIG_SCHED_WALT builds
   Currently we execute some unnecessary PELT related code even when
   WALT signals are required.
   Let's change this by assuming that on CONFIG_SCHED_WALT builds we are
   likely using WALT signals, while on !CONFIG_SCHED_WALT we still have
   just the PELT signals with a code structure which matches mainline
3. move the definitions from sched/sched.h into sched/fair.c
   This is the only module using these functions and it also aligns
   better with the mainline location for these functions
4. get rid of the walt_util macro
   That macro has a function-like signature, but it modifies a
   parameter (passed by value), which makes it a bit confusing.
   Moreover, using the min_t() macro to cap signals with
   capacity_orig_of() makes the explicit type cast that macro used to
   support both 32 and 64 bit targets no longer required.
5. remove forward declarations
   Which are not required once the definition is moved at the top of
   fair.c, since they don't have other local dependencies.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Change-Id: I61c9b7b8a0a34b494527c5aa76218c64543c16d2
Git-commit: 2bd47d3fcebfd8a79606e8eaed318d8361e635c8
Git-repo: https://android.googlesource.com/kernel/common/
[satyap@codeaurora.org:
1. Update cpu_util() and cpu_util_freq() as needed and keep the
   functionality in kernel/sched/sched.h instead of moving to
   kernel/sched/fair.c
2. Replace cpu_util_freq_pelt() with cpu_util()
]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 14:16:21 -07:00
Dietmar Eggemann
b4f436ae22 ANDROID: sched: Update max cpu capacity in case of max frequency constraints
Wakeup balancing uses cpu capacity awareness and needs to know the
system-wide maximum cpu capacity.

Patch "sched: Store system-wide maximum cpu capacity in root domain"
finds the system-wide maximum cpu capacity during scheduler domain
hierarchy setup. This is sufficient as long as maximum frequency
invariance is not enabled.

If it is enabled, the system-wide maximum cpu capacity can change
between scheduler domain hierarchy setups due to frequency capping.

The cpu capacity is changed in update_cpu_capacity() which is called in
load balance on the lowest scheduler domain hierarchy level. To be able
to know if a change in cpu capacity for a certain cpu also has an effect
on the system-wide maximum cpu capacity it is normally necessary to
iterate over all cpus. This would be way too costly. That's why this
patch follows a different approach.

The unsigned long max_cpu_capacity value in struct root_domain is
replaced with a struct max_cpu_capacity, containing value (the
max_cpu_capacity) and cpu (the cpu index of the cpu providing the
maximum cpu_capacity).

Changes to the system-wide maximum cpu capacity and the cpu index are
made if:

 1 System-wide maximum cpu capacity < cpu capacity
 2 System-wide maximum cpu capacity > cpu capacity and cpu index == cpu

There are no changes to the system-wide maximum cpu capacity in all
other cases.

Atomic read and write access to the pair (max_cpu_capacity.val,
max_cpu_capacity.cpu) is enforced by max_cpu_capacity.lock.

The access to max_cpu_capacity.val in task_fits_max() is still performed
without taking the max_cpu_capacity.lock.

The code to set max cpu capacity in build_sched_domains() has been
removed because the whole functionality is now provided by
update_cpu_capacity() instead.

This approach can temporarily introduce errors, e.g. in case the cpu
currently providing the max cpu capacity has its cpu capacity lowered
due to frequency capping and calls update_cpu_capacity() before any cpu
which might now provide the max cpu capacity does.

Change-Id: Idaa7a16723001e222e476de34df332558e48dd13
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Git-commit: 2cc3df5e1c7be786e3089c7882716a89bd8c0d1b
Git-repo: https://android.googlesource.com/kernel/common/
[satyap@codeaurora.org:
1. Replace max_cpu_capacity with max_cpu_capacity.val
2. Replace pr_info() with printk_deferred()
3. Fix below warning introduced as part of the commit to
   avoid compilation issue.
   kernel/sched/fair.c:9324:30: warning: '&&' within '||' \
				[-Wlogical-op-parentheses]
]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 14:11:49 -07:00
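A reduced user-space sketch of the bookkeeping described above; in the
kernel the pair lives in the root_domain and is protected by a raw
spinlock, for which a pthread mutex stands in here:

    #include <pthread.h>

    struct max_cpu_capacity {
            pthread_mutex_t lock;   /* raw_spinlock_t in the kernel */
            unsigned long val;      /* current system-wide max capacity */
            int cpu;                /* CPU that provided it */
    };

    static void update_max_cap(struct max_cpu_capacity *mcc,
                               int cpu, unsigned long capacity)
    {
            pthread_mutex_lock(&mcc->lock);
            if (capacity > mcc->val) {
                    /* Case 1: this CPU now exceeds the recorded maximum. */
                    mcc->val = capacity;
                    mcc->cpu = cpu;
            } else if (cpu == mcc->cpu && capacity < mcc->val) {
                    /* Case 2: the CPU providing the maximum dropped below it. */
                    mcc->val = capacity;
            }
            pthread_mutex_unlock(&mcc->lock);
    }

    int main(void)
    {
            struct max_cpu_capacity mcc = {
                    .lock = PTHREAD_MUTEX_INITIALIZER, .val = 0, .cpu = -1 };

            update_max_cap(&mcc, 0, 1024);  /* big CPU reports full capacity */
            update_max_cap(&mcc, 0, 800);   /* same CPU later gets capped    */
            return 0;
    }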
Prasad Sodagudi
739725c32a stop_machine: Atomically queue and wake stopper threads
When cpu_stop_queue_work() releases the lock for the stopper
thread that was queued into its wake queue, preemption is
enabled, which leads to the following deadlock:

CPU0                              CPU1
sched_setaffinity(0, ...)
__set_cpus_allowed_ptr()
stop_one_cpu(0, ...)              stop_two_cpus(0, 1, ...)
cpu_stop_queue_work(0, ...)       cpu_stop_queue_two_works(0, ..., 1, ...)

-grabs lock for migration/0-
                                  -spins with preemption disabled,
                                   waiting for migration/0's lock to be
                                   released-

-adds work items for migration/0
and queues migration/0 to its
wake_q-

-releases lock for migration/0
 and preemption is enabled-

-current thread is preempted,
and __set_cpus_allowed_ptr
has changed the thread's
cpu allowed mask to CPU1 only-

                                  -acquires migration/0 and migration/1's
                                   locks-

                                  -adds work for migration/0 but does not
                                   add migration/0 to wake_q, since it is
                                   already in a wake_q-

                                  -adds work for migration/1 and adds
                                   migration/1 to its wake_q-

                                  -releases migration/0 and migration/1's
                                   locks, wakes migration/1, and enables
                                   preemption-

                                  -since migration/1 is requested to run,
                                   migration/1 begins to run and waits on
                                   migration/0, but migration/0 will never
                                   be able to run, since the thread that
                                   can wake it is affine to CPU1-

Disable preemption in cpu_stop_queue_work() before queueing work for
the stopper threads and adding the stopper thread to the wake queue, to
ensure that queueing the work and waking the stopper threads happen
atomically.

Change-Id: Iac8ae8d823db2c62191cf93629876f505cb09e77
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2018-08-03 14:00:24 -07:00
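The essence of the fix, sketched with stub functions standing in for
the kernel primitives: the queueing of the work, the queueing of the
stopper into the wake_q, and the wake-up itself all happen inside one
non-preemptible section:

    /* Stubs only; in the kernel these are the real preempt/stopper helpers. */
    static void preempt_disable_stub(void)      { }
    static void preempt_enable_stub(void)       { }
    static void queue_stopper_work_locked(void) { /* enqueue + add to wake_q */ }
    static void wake_queued_stoppers(void)      { /* wake_up_q(&wakeq)       */ }

    static void cpu_stop_queue_work_sketch(void)
    {
            preempt_disable_stub();
            queue_stopper_work_locked();   /* done under the stopper lock    */
            wake_queued_stoppers();        /* lock dropped, but the caller   */
            preempt_enable_stub();         /* stays on-CPU until this point  */
    }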
Dietmar Eggemann
a7e5318117 ANDROID: sched/fair: add arch scaling function for max frequency capping
To be able to scale the cpu capacity by this factor, introduce a call to
the new arch scaling function arch_scale_max_freq_capacity() in
update_cpu_capacity() and provide a default implementation which returns
SCHED_CAPACITY_SCALE.

Another subsystem (e.g. cpufreq) or architectural or platform specific
code can override this default implementation, exactly as for frequency
and cpu invariance. It has to be enabled by the arch by defining
arch_scale_max_freq_capacity to the actual implementation.

Change-Id: I770a8b1f4f7340e9e314f71c64a765bf880f4b4d
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Git-commit: 44791a47f588394e3d755dc01c528bc072409eef
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 13:51:43 -07:00
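A sketch of the default-plus-override pattern the commit describes,
mirroring how the existing frequency/cpu invariance hooks work; the
exact parameter list in the real patch may differ:

    #define SCHED_CAPACITY_SHIFT 10
    #define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)   /* 1024 */

    /* Default: no max-frequency capping, i.e. report full scale. */
    #ifndef arch_scale_max_freq_capacity
    static unsigned long arch_scale_max_freq_capacity(int cpu)
    {
            (void)cpu;
            return SCHED_CAPACITY_SCALE;
    }
    #endif

    /* An arch or platform that wants capping provides its own version and
     * defines arch_scale_max_freq_capacity to it, so the default above is
     * not used; update_cpu_capacity() then multiplies this factor in. */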
Dietmar Eggemann
19fd21e993 ANDROID: sched, trace: Remove trace event sched_load_avg_cpu
The functionality of sched_load_avg_cpu can be provided by
sched_load_cfs_rq.
The WALT extension will be included into sched_load_cfs_rq by patch
(ANDROID: trace: Add WALT util signal to trace event sched_load_cfs_rq).

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Change-Id: I2e3944a6e60a77e3237b7c8cb29cbd840118a6ad
Git-commit: df6f68dffb40654d144b33e00acf8ba9e08d0618
Git-repo: https://android.googlesource.com/kernel/common/
[satyap@codeaurora.org: trivial merge conflict resolution in
include/trace/events/sched.h due to WALT related changes to
sched_load_cfs_rq trace point]
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 13:51:15 -07:00
Dietmar Eggemann
4daa6e6949 ANDROID: Rename and move include/linux/sched_energy.h
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Change-Id: I51401b94947c64feb598287a991eadf9de064340
Git-commit: 75473ed034fd79433198c617437a3f84aa01c57a
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-08-03 13:51:04 -07:00
qctecmdr Service
f0ea07c405 Merge "ANDROID: update_group_capacity for single cpu in cluster" 2018-08-03 08:25:11 -07:00
qctecmdr Service
beb0ab5897 Merge "sched/walt: improve the scheduler" 2018-08-03 08:25:11 -07:00
qctecmdr Service
88532b424b Merge "ANDROID: sched/fair: Don't balance misfits if it would overload local group" 2018-08-01 06:23:24 -07:00
qctecmdr Service
9a8972bae6 Merge "sched/fair: fix prefer_idle behaviour" 2018-08-01 02:12:46 -07:00
Satya Durga Srinivasu Prabhala
ff3a481216 sched/walt: improve the scheduler
This change is for general scheduler improvement.

Change-Id: Ie162a57537bb9ada66a4254d606e17d54b7a3a49
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
[pkondeti@codeaurora.org: code refactoring and implemented freq to load
calculations.]
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2018-07-31 22:30:55 -07:00
Ionela Voinescu
ce00e4cfc5 ANDROID: update_group_capacity for single cpu in cluster
If we're only left with one big CPU in a cluster (either there was only
one to begin with or the others have been hotplugged out), the MC level
seen from the perspective of that CPU, will only contain one group (that
of the CPU) and the LOAD_BALANCE flag will be cleared at that level.
The MC level is kept nonetheless as it will still contain energy
information, as per commit 06654998ed81 ("ANDROID: sched: EAS & 'single
cpu per cluster'/cpu hotplug interoperability").

This will result in update_cpu_capacity never being called for that big
CPU and its capacity will never be updated. Also, its capacity will
never be considered when setting or updating the max_cpu_capacity
structure.

Therefore, call update_group_capacity for SD levels that do not have the
SD_LOAD_BALANCE flag set in order to update the CPU capacity information
of CPUs that are alone in a cluster and the groups that they belong to.
The call to update_cpu_capacity will also result in appropriately
setting the values of the max_cpu_capacity structure.

Fixes: 06654998ed81 ("ANDROID: sched: EAS & 'single cpu per cluster'/cpu
hotplug interoperability")
Change-Id: I6074972dde1fdf586378f8a3534e3d0aa42a809a
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Git-commit: 8fa863274cb41ca169ff14de48620c1f33599e9c
Git-repo: https://android-review.googlesource.com/c/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-07-31 15:09:51 -07:00
Chris Redpath
949ae185c2 ANDROID: sched/fair: Also do misfit in overloaded groups
If we can classify the group as overloaded, that overrides
any classification as misfit but we may still have misfit
tasks present. Check the rq we're looking at to see if
this is the case.

Change-Id: Ida8eb66aa625e34de3fe2ee1b0dd8a78926273d8
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
[Removed stray reference to rq_has_misfit]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Git-commit: fcfb6e412d77e82816a93b8cd2776025f37f1396
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Puja Gupta <pujag@codeaurora.org>
2018-07-31 15:09:24 -07:00
Chris Redpath
4ca7dd5b2d ANDROID: sched/fair: Don't balance misfits if it would overload local group
When load balancing in a system with misfit tasks present, if we always
pull a misfit task to the local group, this can lead to pulling a running
task from a smaller capacity CPU to a bigger CPU which is busy. In this
situation, the pulled task is likely not to get a chance to run before
an idle balance on another small CPU pulls it back. This penalises the
pulled task as it is stopped for a short amount of time and then likely
relocated to a different CPU (since the original CPU just did a NEWLY_IDLE
balance and reset the periodic interval).

If we do this unconditionally only for NEWLY_IDLE balance, we can be
sure that any tasks and load which are present on the local group are
related to short-running tasks which we are happy to displace for a
longer running task in a system with misfit tasks present.

However, other balance types should only pull a task if we think
that the local group is underutilized - checking the number of tasks
gives us a conservative estimate here since if they were short tasks
we would have been doing NEWLY_IDLE balances instead.

Change-Id: I710add1ab1139482620b6addc8370ad194791beb
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Git-commit: 37e94467884b9ab3552a4309c97985b9ec067d76
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Puja Gupta <pujag@codeaurora.org>
2018-07-31 15:09:20 -07:00
Chris Redpath
44b893b6c7 FROMLIST: sched/fair: Don't move tasks to lower capacity cpus unless necessary
When lower capacity CPUs are load balancing and considering pulling
something from a higher capacity group, we should not pull tasks from a
cpu with only one task running as this is guaranteed to impede progress
for that task. If there is more than one task running, load balance in
the higher capacity group would have already made any possible moves to
resolve imbalance and we should make better use of system compute
capacity by moving a task if we still have more than one running.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Change-Id: Ib86570abdd453a51be885b086c8d80be2773a6f2
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
[from https://lore.kernel.org/lkml/1530699470-29808-11-git-send-email-morten.rasmussen@arm.com/]
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Git-commit: 07e7ce6c8459defc34e63ae0f0334e811d223990
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Puja Gupta <pujag@codeaurora.org>
2018-07-31 15:09:15 -07:00
Morten Rasmussen
af40367ae9 FROMLIST: sched/core: Disable SD_PREFER_SIBLING on asymmetric cpu capacity domains
The 'prefer sibling' sched_domain flag is intended to encourage
spreading tasks to sibling sched_domains to take advantage of more
caches and cores on SMT systems. It has recently been changed to be set
on all non-NUMA topology levels. However, spreading across domains with
cpu capacity asymmetry isn't desirable, e.g. spreading from high
capacity to low capacity cpus even if the high capacity cpus aren't
overutilized might give access to more cache but the cpu will be slower
and possibly lead to worse overall throughput.

To prevent this, we need to remove SD_PREFER_SIBLING on the sched_domain
level immediately below SD_ASYM_CPUCAPACITY.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
[from https://lore.kernel.org/lkml/1530699470-29808-13-git-send-email-morten.rasmussen@arm.com/]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Change-Id: I944a003c7b685132c57a90a8aaf85509196679e6
Git-commit: b45d82a276a0375cf9368dfc2bb0094130537b06
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-07-31 15:09:05 -07:00
Morten Rasmussen
2762432d6e FROMLIST: sched/core: Disable SD_ASYM_CPUCAPACITY for root_domains without asymmetry
When hotplugging cpus out or creating exclusive cpusets, systems which
were asymmetric at boot might become symmetric. In this case leaving the
flag set might lead to suboptimal scheduling decisions.

The arch code providing the flag doesn't have visibility of the cpuset
configuration so it must either be told by passing a cpumask or by
letting the generic topology code verify if the flag should still be set
when taking the actual sched_domain_span() into account. This patch
implements the latter approach.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
[from https://lore.kernel.org/lkml/1530699470-29808-12-git-send-email-morten.rasmussen@arm.com/]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Change-Id: I6a369e474743c96fe0b866ebb084e0a250e8c5d7
Git-commit: 052f19c502a41e50a95d50c06db5ba0a58358620
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-07-31 15:09:00 -07:00
Valentin Schneider
4635e58725 FROMLIST: sched/fair: Set rq->rd->overload when misfit
Idle balance is a great opportunity to pull a misfit task. However,
there are scenarios where misfit tasks are present but idle balance is
prevented by the overload flag.

A good example of this is a workload of n identical tasks. Let's suppose
we have a 2+2 Arm big.LITTLE system. We then spawn 4 fairly
CPU-intensive tasks - for the sake of simplicity let's say they are just
CPU hogs, even when running on big CPUs.

They are identical tasks, so on an SMP system they should all end at
(roughly) the same time. However, in our case the LITTLE CPUs are less
performant than the big CPUs, so tasks running on the LITTLEs will have
a longer completion time.

This means that the big CPUs will complete their work earlier, at which
point they should pull the tasks from the LITTLEs. What we want to
happen is summarized as follows:

a,b,c,d are our CPU-hogging tasks
_ signifies idling

LITTLE_0 | a a a a _ _
LITTLE_1 | b b b b _ _
---------|-------------
  big_0  | c c c c a a
  big_1  | d d d d b b
		  ^
		  ^
    Tasks end on the big CPUs, idle balance happens
    and the misfit tasks are pulled straight away

This however won't happen, because currently the overload flag is only
set when there is any CPU that has more than one runnable task - which
may very well not be the case here if our CPU-hogging workload is all
there is to run.

As such, this commit sets the overload flag in update_sg_lb_stats when
a group is flagged as having a misfit task.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Change-Id: I131f9ad90f7fb53f4946f61f2fb65ab0798f23c5
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
[from https://lore.kernel.org/lkml/1530699470-29808-10-git-send-email-morten.rasmussen@arm.com/]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Git-commit: 96a2daab1f375d9ca8c16fc5c721ea3fd60cb182
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Puja Gupta <pujag@codeaurora.org>
2018-07-31 15:08:53 -07:00
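Schematically, the change widens the condition under which
update_sg_lb_stats() raises the root-domain overload flag; the sketch
below is illustrative, not the exact kernel code:

    #include <stdbool.h>

    /* Overload is no longer only "some CPU has more than one runnable task":
     * a group carrying a misfit task also counts, so idle balance can run
     * and pull that task. */
    static void note_group_status(bool cpu_has_extra_runnable,
                                  bool group_has_misfit, bool *overload)
    {
            if (cpu_has_extra_runnable || group_has_misfit)
                    *overload = true;
    }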
Valentin Schneider
5182d80980 FROMLIST: sched: Wrap rq->rd->overload accesses with READ/WRITE_ONCE
This variable can be read and set locklessly within update_sd_lb_stats().
As such, READ/WRITE_ONCE are added to make sure nothing terribly wrong
can happen because of the compiler.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
[from https://lore.kernel.org/lkml/1530699470-29808-9-git-send-email-morten.rasmussen@arm.com/]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Change-Id: I14ce007f916bffd6a7938b7d56faf2b384d27887
Git-commit: 907139485794dd7922a9632ab3507bb6a16e562e
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-07-31 15:08:48 -07:00
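In user-space terms the accessor change looks like the sketch below;
the stand-in macros only mimic the kernel's READ_ONCE/WRITE_ONCE, which
live in <linux/compiler.h> and handle more cases:

    /* Simplified stand-ins for the kernel macros (GCC typeof extension). */
    #define READ_ONCE(x)      (*(const volatile typeof(x) *)&(x))
    #define WRITE_ONCE(x, v)  (*(volatile typeof(x) *)&(x) = (v))

    struct root_domain_sketch {
            int overload;   /* read and set locklessly in update_sd_lb_stats() */
    };

    static void mark_overloaded(struct root_domain_sketch *rd)
    {
            /* Avoid a redundant store, but keep both the load and the store
             * safe from compiler tearing or refetching. */
            if (!READ_ONCE(rd->overload))
                    WRITE_ONCE(rd->overload, 1);
    }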
Valentin Schneider
168dc6ff8e FROMLIST: sched: Change root_domain->overload type to int
sizeof(_Bool) is implementation defined, so let's just go with 'int' as
is done for other structures e.g. sched_domain_shared->has_idle_cores.

The local 'overload' variable used in update_sd_lb_stats can remain
bool, as it won't impact any struct layout and can be assigned to the
root_domain field.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Change-Id: I24c3ee7fc9a0aa76c3a7a1369714248703e73af9
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
[from https://lore.kernel.org/lkml/1530699470-29808-8-git-send-email-morten.rasmussen@arm.com/]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Git-commit: 14b1b1f5b38e9d81b26c3d22da2bfb8076cf780d
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Puja Gupta <pujag@codeaurora.org>
2018-07-31 15:08:43 -07:00
Valentin Schneider
2c2ab0de80 FROMLIST: sched/fair: Change prefer_sibling type to bool
This variable is entirely local to update_sd_lb_stats, so we can
safely change its type and slightly clean up its initialisation.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
[from https://lore.kernel.org/lkml/1530699470-29808-7-git-send-email-morten.rasmussen@arm.com/]
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Change-Id: I8d3cc862292290c952505ff78e41cfe6b04bf168
Git-commit: 16419700b5cd1338191a619d0affbdba55eb627c
Git-repo: https://android.googlesource.com/kernel/common/
Signed-off-by: Satya Durga Srinivasu Prabhala <satyap@codeaurora.org>
2018-07-31 15:08:38 -07:00
Chris Redpath
53adc803f9 FROMLIST: sched/fair: Consider misfit tasks when load-balancing
On asymmetric cpu capacity systems load intensive tasks can end up on
cpus that don't suit their compute demand. In such scenarios, 'misfit'
tasks should be migrated to cpus with higher compute capacity to ensure
better throughput. group_misfit_task indicates this scenario, but tweaks
to the load-balance code are needed to make the migrations happen.

Misfit balancing only makes sense between a source group of lower
per-cpu capacity and a destination group of higher compute capacity.
Otherwise, misfit balancing is ignored. group_misfit_task has the lowest
priority so any imbalance due to overload is dealt with first.

The modifications are:

1. Only pick a group containing misfit tasks as the busiest group if the
   destination group has higher capacity and has spare capacity.
2. When the busiest group is a 'misfit' group, skip the usual average
   load and group capacity checks.
3. Set the imbalance for 'misfit' balancing sufficiently high for a task
   to be pulled ignoring average load.
4. Pick the cpu with the highest misfit load as the source cpu.
5. If the misfit task is alone on the source cpu, go for active
   balancing.

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Change-Id: Ib9f9edd31b6c56cfbeb2a2f9d5daaa1b6824375b
Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
[from https://lore.kernel.org/lkml/1530699470-29808-5-git-send-email-morten.rasmussen@arm.com/]
[backported - some parts already in android-4.14]
Signed-off-by: Ioan Budea <ioan.budea@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Git-commit: d756b4bea368d69a1212011615ae9ed400180f9a
Git-repo: https://android.googlesource.com/kernel/common/
[pujag@codeaurora.org: trivial merge conflict resolution in
kernel/sched/fair.c]
Signed-off-by: Puja Gupta <pujag@codeaurora.org>
2018-07-31 15:08:30 -07:00