rcu: Squash backport from v5.4

This is a shameless squash of Jebaitedneko's work:
https://github.com/Jebaitedneko/android_kernel_xiaomi_vayu/tree/rcu

Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com>
Signed-off-by: azrim <mirzaspc@gmail.com>


@ -328,13 +328,13 @@
inkscape:window-height="1148" inkscape:window-height="1148"
id="namedview90" id="namedview90"
showgrid="true" showgrid="true"
inkscape:zoom="0.80021373" inkscape:zoom="0.69092787"
inkscape:cx="462.49289" inkscape:cx="476.34085"
inkscape:cy="473.6718" inkscape:cy="712.80957"
inkscape:window-x="770" inkscape:window-x="770"
inkscape:window-y="24" inkscape:window-y="24"
inkscape:window-maximized="0" inkscape:window-maximized="0"
inkscape:current-layer="g4114-9-3-9" inkscape:current-layer="g4"
inkscape:snap-grids="false" inkscape:snap-grids="false"
fit-margin-top="5" fit-margin-top="5"
fit-margin-right="5" fit-margin-right="5"
@ -813,14 +813,18 @@
<text <text
sodipodi:linespacing="125%" sodipodi:linespacing="125%"
id="text4110-5-7-6-2-4-0" id="text4110-5-7-6-2-4-0"
y="841.88086" y="670.74316"
x="1460.1007" x="1460.1007"
style="font-size:267.24359131px;font-style:normal;font-weight:normal;text-align:center;line-height:125%;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;font-family:Sans" style="font-size:267.24359131px;font-style:normal;font-weight:normal;text-align:center;line-height:125%;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;font-family:Sans"
xml:space="preserve"><tspan xml:space="preserve"><tspan
y="841.88086" y="670.74316"
x="1460.1007" x="1460.1007"
sodipodi:role="line" sodipodi:role="line"
id="tspan4925-1-2-4-5">reched_cpu()</tspan></text> id="tspan4925-1-2-4-5">Request</tspan><tspan
y="1004.7976"
x="1460.1007"
sodipodi:role="line"
id="tspan3100">context switch</tspan></text>
</g> </g>
</g> </g>
</svg> </svg>



@ -73,10 +73,10 @@ will ignore it because idle and offline CPUs are already residing
in quiescent states. in quiescent states.
Otherwise, the expedited grace period will use Otherwise, the expedited grace period will use
<tt>smp_call_function_single()</tt> to send the CPU an IPI, which <tt>smp_call_function_single()</tt> to send the CPU an IPI, which
is handled by <tt>sync_rcu_exp_handler()</tt>. is handled by <tt>rcu_exp_handler()</tt>.
<p> <p>
However, because this is preemptible RCU, <tt>sync_rcu_exp_handler()</tt> However, because this is preemptible RCU, <tt>rcu_exp_handler()</tt>
can check to see if the CPU is currently running in an RCU read-side can check to see if the CPU is currently running in an RCU read-side
critical section. critical section.
If not, the handler can immediately report a quiescent state. If not, the handler can immediately report a quiescent state.
@ -146,19 +146,18 @@ expedited grace period is shown in the following diagram:
<p><img src="ExpSchedFlow.svg" alt="ExpSchedFlow.svg" width="55%"> <p><img src="ExpSchedFlow.svg" alt="ExpSchedFlow.svg" width="55%">
<p> <p>
As with RCU-preempt's <tt>synchronize_rcu_expedited()</tt>, As with RCU-preempt, RCU-sched's
<tt>synchronize_sched_expedited()</tt> ignores offline and <tt>synchronize_sched_expedited()</tt> ignores offline and
idle CPUs, again because they are in remotely detectable idle CPUs, again because they are in remotely detectable
quiescent states. quiescent states.
However, the <tt>synchronize_rcu_expedited()</tt> handler However, because the
is <tt>sync_sched_exp_handler()</tt>, and because the
<tt>rcu_read_lock_sched()</tt> and <tt>rcu_read_unlock_sched()</tt> <tt>rcu_read_lock_sched()</tt> and <tt>rcu_read_unlock_sched()</tt>
leave no trace of their invocation, in general it is not possible to tell leave no trace of their invocation, in general it is not possible to tell
whether or not the current CPU is in an RCU read-side critical section. whether or not the current CPU is in an RCU read-side critical section.
The best that <tt>sync_sched_exp_handler()</tt> can do is to check The best that RCU-sched's <tt>rcu_exp_handler()</tt> can do is to check
for idle, on the off-chance that the CPU went idle while the IPI for idle, on the off-chance that the CPU went idle while the IPI
was in flight. was in flight.
If the CPU is idle, then <tt>sync_sched_exp_handler()</tt> reports If the CPU is idle, then <tt>rcu_exp_handler()</tt> reports
the quiescent state. the quiescent state.
<p> <p>
@ -299,19 +298,18 @@ Instead, the task pushing the grace period forward will include the
idle CPUs in the mask passed to <tt>rcu_report_exp_cpu_mult()</tt>. idle CPUs in the mask passed to <tt>rcu_report_exp_cpu_mult()</tt>.
<p> <p>
For RCU-sched, there is an additional check for idle in the IPI For RCU-sched, there is an additional check:
handler, <tt>sync_sched_exp_handler()</tt>.
If the IPI has interrupted the idle loop, then If the IPI has interrupted the idle loop, then
<tt>sync_sched_exp_handler()</tt> invokes <tt>rcu_report_exp_rdp()</tt> <tt>rcu_exp_handler()</tt> invokes <tt>rcu_report_exp_rdp()</tt>
to report the corresponding quiescent state. to report the corresponding quiescent state.
<p> <p>
For RCU-preempt, there is no specific check for idle in the For RCU-preempt, there is no specific check for idle in the
IPI handler (<tt>sync_rcu_exp_handler()</tt>), but because IPI handler (<tt>rcu_exp_handler()</tt>), but because
RCU read-side critical sections are not permitted within the RCU read-side critical sections are not permitted within the
idle loop, if <tt>sync_rcu_exp_handler()</tt> sees that the CPU is within idle loop, if <tt>rcu_exp_handler()</tt> sees that the CPU is within
RCU read-side critical section, the CPU cannot possibly be idle. RCU read-side critical section, the CPU cannot possibly be idle.
Otherwise, <tt>sync_rcu_exp_handler()</tt> invokes Otherwise, <tt>rcu_exp_handler()</tt> invokes
<tt>rcu_report_exp_rdp()</tt> to report the corresponding quiescent <tt>rcu_report_exp_rdp()</tt> to report the corresponding quiescent
state, regardless of whether or not that quiescent state was due to state, regardless of whether or not that quiescent state was due to
the CPU being idle. the CPU being idle.
@ -626,6 +624,8 @@ checks, but only during the mid-boot dead zone.
<p> <p>
With this refinement, synchronous grace periods can now be used from With this refinement, synchronous grace periods can now be used from
task context pretty much any time during the life of the kernel. task context pretty much any time during the life of the kernel.
That is, aside from some points in the suspend, hibernate, or shutdown
code path.
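
As an illustration of the updater side (a minimal sketch; the struct, list, and lock names below are hypothetical), a task-context updater can simply pair an RCU-protected removal with the expedited primitive:

    #include <linux/rculist.h>
    #include <linux/rcupdate.h>
    #include <linux/slab.h>
    #include <linux/spinlock.h>

    /* Hypothetical element type and list, for illustration only. */
    struct foo {
            struct list_head list;
            int key;
    };

    static LIST_HEAD(foo_list);
    static DEFINE_SPINLOCK(foo_lock);

    /* Remove an element, wait for pre-existing readers, then free it. */
    void remove_foo(struct foo *p)
    {
            spin_lock(&foo_lock);
            list_del_rcu(&p->list);
            spin_unlock(&foo_lock);
            synchronize_rcu_expedited();    /* low-latency, IPI-assisted grace period */
            kfree(p);
    }
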
<h3><a name="Summary"> <h3><a name="Summary">
Summary</a></h3> Summary</a></h3>


@ -2079,6 +2079,8 @@ Some of the relevant points of interest are as follows:
<li> <a href="#Hotplug CPU">Hotplug CPU</a>. <li> <a href="#Hotplug CPU">Hotplug CPU</a>.
<li> <a href="#Scheduler and RCU">Scheduler and RCU</a>. <li> <a href="#Scheduler and RCU">Scheduler and RCU</a>.
<li> <a href="#Tracing and RCU">Tracing and RCU</a>. <li> <a href="#Tracing and RCU">Tracing and RCU</a>.
<li> <a href="#Accesses to User Memory and RCU">
Accesses to User Memory and RCU</a>.
<li> <a href="#Energy Efficiency">Energy Efficiency</a>. <li> <a href="#Energy Efficiency">Energy Efficiency</a>.
<li> <a href="#Scheduling-Clock Interrupts and RCU"> <li> <a href="#Scheduling-Clock Interrupts and RCU">
Scheduling-Clock Interrupts and RCU</a>. Scheduling-Clock Interrupts and RCU</a>.
@ -2393,30 +2395,9 @@ when invoked from a CPU-hotplug notifier.
<p> <p>
RCU depends on the scheduler, and the scheduler uses RCU to RCU depends on the scheduler, and the scheduler uses RCU to
protect some of its data structures. protect some of its data structures.
This means the scheduler is forbidden from acquiring The preemptible-RCU <tt>rcu_read_unlock()</tt>
the runqueue locks and the priority-inheritance locks implementation must therefore be written carefully to avoid deadlocks
in the middle of an outermost RCU read-side critical section unless either involving the scheduler's runqueue and priority-inheritance locks.
(1)&nbsp;it releases them before exiting that same
RCU read-side critical section, or
(2)&nbsp;interrupts are disabled across
that entire RCU read-side critical section.
This same prohibition also applies (recursively!) to any lock that is acquired
while holding any lock to which this prohibition applies.
Adhering to this rule prevents preemptible RCU from invoking
<tt>rcu_read_unlock_special()</tt> while either runqueue or
priority-inheritance locks are held, thus avoiding deadlock.
<p>
Prior to v4.4, it was only necessary to disable preemption across
RCU read-side critical sections that acquired scheduler locks.
In v4.4, expedited grace periods started using IPIs, and these
IPIs could force a <tt>rcu_read_unlock()</tt> to take the slowpath.
Therefore, this expedited-grace-period change required disabling of
interrupts, not just preemption.
<p>
For RCU's part, the preemptible-RCU <tt>rcu_read_unlock()</tt>
implementation must be written carefully to avoid similar deadlocks.
In particular, <tt>rcu_read_unlock()</tt> must tolerate an In particular, <tt>rcu_read_unlock()</tt> must tolerate an
interrupt where the interrupt handler invokes both interrupt where the interrupt handler invokes both
<tt>rcu_read_lock()</tt> and <tt>rcu_read_unlock()</tt>. <tt>rcu_read_lock()</tt> and <tt>rcu_read_unlock()</tt>.
@ -2425,7 +2406,7 @@ negative nesting levels to avoid destructive recursion via
interrupt handler's use of RCU. interrupt handler's use of RCU.
<p> <p>
This pair of mutual scheduler-RCU requirements came as a This scheduler-RCU requirement came as a
<a href="https://lwn.net/Articles/453002/">complete surprise</a>. <a href="https://lwn.net/Articles/453002/">complete surprise</a>.
<p> <p>
@ -2436,9 +2417,28 @@ when running context-switch-heavy workloads when built with
<tt>CONFIG_NO_HZ_FULL=y</tt> <tt>CONFIG_NO_HZ_FULL=y</tt>
<a href="http://www.rdrop.com/users/paulmck/scalability/paper/BareMetal.2015.01.15b.pdf">did come as a surprise [PDF]</a>. <a href="http://www.rdrop.com/users/paulmck/scalability/paper/BareMetal.2015.01.15b.pdf">did come as a surprise [PDF]</a>.
RCU has made good progress towards meeting this requirement, even RCU has made good progress towards meeting this requirement, even
for context-switch-have <tt>CONFIG_NO_HZ_FULL=y</tt> workloads, for context-switch-heavy <tt>CONFIG_NO_HZ_FULL=y</tt> workloads,
but there is room for further improvement. but there is room for further improvement.
<p>
In the past, it was forbidden to disable interrupts across an
<tt>rcu_read_unlock()</tt> unless that interrupt-disabled region
of code also included the matching <tt>rcu_read_lock()</tt>.
Violating this restriction could result in deadlocks involving the
scheduler's runqueue and priority-inheritance spinlocks.
This restriction was lifted when interrupt-disabled calls to
<tt>rcu_read_unlock()</tt> started deferring the reporting of
the resulting RCU-preempt quiescent state until the end of that
interrupts-disabled region.
This deferred reporting means that the scheduler's runqueue and
priority-inheritance locks cannot be held while reporting an RCU-preempt
quiescent state, which lifts the earlier restriction, at least from
a deadlock perspective.
Unfortunately, real-time systems using RCU priority boosting may
need this restriction to remain in effect because deferred
quiescent-state reporting also defers deboosting, which in turn
degrades real-time latencies.
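
The pattern at issue looks roughly like the following sketch, in which the lock, pointer type, and helper names are hypothetical; the rcu_read_unlock() executes with interrupts disabled while its matching rcu_read_lock() does not:

    #include <linux/rcupdate.h>
    #include <linux/spinlock.h>

    struct foo { int a; };                  /* hypothetical */
    static struct foo __rcu *gp;
    static DEFINE_RAW_SPINLOCK(my_lock);

    static void do_something_locked(struct foo *p) { }

    void example_reader(void)
    {
            struct foo *p;
            unsigned long flags;

            rcu_read_lock();
            p = rcu_dereference(gp);
            raw_spin_lock_irqsave(&my_lock, flags); /* interrupts now disabled */
            if (p)
                    do_something_locked(p);
            rcu_read_unlock();      /* matching rcu_read_lock() lies outside the
                                     * irqs-disabled region; the deferred
                                     * quiescent-state reporting described above
                                     * is what permits this */
            raw_spin_unlock_irqrestore(&my_lock, flags);
    }
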
<h3><a name="Tracing and RCU">Tracing and RCU</a></h3> <h3><a name="Tracing and RCU">Tracing and RCU</a></h3>
<p> <p>
@ -2453,6 +2453,75 @@ cannot be used.
The tracing folks both located the requirement and provided the The tracing folks both located the requirement and provided the
needed fix, so this surprise requirement was relatively painless. needed fix, so this surprise requirement was relatively painless.
<h3><a name="Accesses to User Memory and RCU">
Accesses to User Memory and RCU</a></h3>
<p>
The kernel needs to access user-space memory, for example, to access
data referenced by system-call parameters.
The <tt>get_user()</tt> macro does this job.
<p>
However, user-space memory might well be paged out, which means
that <tt>get_user()</tt> might well page-fault and thus block while
waiting for the resulting I/O to complete.
It would be a very bad thing for the compiler to reorder
a <tt>get_user()</tt> invocation into an RCU read-side critical
section.
For example, suppose that the source code looked like this:
<blockquote>
<pre>
1 rcu_read_lock();
2 p = rcu_dereference(gp);
3 v = p-&gt;value;
4 rcu_read_unlock();
5 get_user(user_v, user_p);
6 do_something_with(v, user_v);
</pre>
</blockquote>
<p>
The compiler must not be permitted to transform this source code into
the following:
<blockquote>
<pre>
1 rcu_read_lock();
2 p = rcu_dereference(gp);
3 get_user(user_v, user_p); // BUG: POSSIBLE PAGE FAULT!!!
4 v = p-&gt;value;
5 rcu_read_unlock();
6 do_something_with(v, user_v);
</pre>
</blockquote>
<p>
If the compiler did make this transformation in a
<tt>CONFIG_PREEMPT=n</tt> kernel build, and if <tt>get_user()</tt> did
page fault, the result would be a quiescent state in the middle
of an RCU read-side critical section.
This misplaced quiescent state could result in line&nbsp;4 being
a use-after-free access, which could be bad for your kernel's
actuarial statistics.
Similar examples can be constructed with the call to <tt>get_user()</tt>
preceding the <tt>rcu_read_lock()</tt>.
<p>
Unfortunately, <tt>get_user()</tt> doesn't have any particular
ordering properties, and in some architectures the underlying <tt>asm</tt>
isn't even marked <tt>volatile</tt>.
And even if it was marked <tt>volatile</tt>, the above access to
<tt>p-&gt;value</tt> is not volatile, so the compiler would not have any
reason to keep those two accesses in order.
<p>
Therefore, the Linux-kernel definitions of <tt>rcu_read_lock()</tt>
and <tt>rcu_read_unlock()</tt> must act as compiler barriers,
at least for outermost instances of <tt>rcu_read_lock()</tt> and
<tt>rcu_read_unlock()</tt> within a nested set of RCU read-side critical
sections.
<h3><a name="Energy Efficiency">Energy Efficiency</a></h3> <h3><a name="Energy Efficiency">Energy Efficiency</a></h3>
<p> <p>


@ -230,15 +230,58 @@ handlers are no longer able to execute on this CPU. This can happen if
the stalled CPU is spinning with interrupts disabled, or, in -rt the stalled CPU is spinning with interrupts disabled, or, in -rt
kernels, if a high-priority process is starving RCU's softirq handler. kernels, if a high-priority process is starving RCU's softirq handler.
For CONFIG_RCU_FAST_NO_HZ kernels, the "last_accelerate:" prints the The "fqs=" shows the number of force-quiescent-state idle/offline
low-order 16 bits (in hex) of the jiffies counter when this CPU last detection passes that the grace-period kthread has made across this
invoked rcu_try_advance_all_cbs() from rcu_needs_cpu() or last invoked CPU since the last time that this CPU noted the beginning of a grace
rcu_accelerate_cbs() from rcu_prepare_for_idle(). The "nonlazy_posted:" period.
prints the number of non-lazy callbacks posted since the last call to
rcu_needs_cpu(). Finally, an "L" indicates that there are currently The "detected by" line indicates which CPU detected the stall (in this
no non-lazy callbacks ("." is printed otherwise, as shown above) and case, CPU 32), how many jiffies have elapsed since the start of the grace
"D" indicates that dyntick-idle processing is enabled ("." is printed period (in this case 2603), the grace-period sequence number (7075), and
otherwise, for example, if disabled via the "nohz=" kernel boot parameter). an estimate of the total number of RCU callbacks queued across all CPUs
(625 in this case).
In kernels with CONFIG_RCU_FAST_NO_HZ, more information is printed
for each CPU:
0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 Nonlazy posted: ..D
The "last_accelerate:" prints the low-order 16 bits (in hex) of the
jiffies counter when this CPU last invoked rcu_try_advance_all_cbs()
from rcu_needs_cpu() or last invoked rcu_accelerate_cbs() from
rcu_prepare_for_idle(). The "Nonlazy posted:" indicates lazy-callback
status, so that an "l" indicates that all callbacks were lazy at the start
of the last idle period and an "L" indicates that there are currently
no non-lazy callbacks (in both cases, "." is printed otherwise, as
shown above) and "D" indicates that dyntick-idle processing is enabled
("." is printed otherwise, for example, if disabled via the "nohz="
kernel boot parameter).
If the grace period ends just as the stall warning starts printing,
there will be a spurious stall-warning message, which will include
the following:
INFO: Stall ended before state dump start
This is rare, but does happen from time to time in real life. It is also
possible for a zero-jiffy stall to be flagged in this case, depending
on how the stall warning and the grace-period initialization happen to
interact. Please note that it is not possible to entirely eliminate this
sort of false positive without resorting to things like stop_machine(),
which is overkill for this sort of problem.
If all CPUs and tasks have passed through quiescent states, but the
grace period has nevertheless failed to end, the stall-warning splat
will include something like the following:
All QSes seen, last rcu_preempt kthread activity 23807 (4297905177-4297881370), jiffies_till_next_fqs=3, root ->qsmask 0x0
The "23807" indicates that it has been more than 23 thousand jiffies
since the grace-period kthread ran. The "jiffies_till_next_fqs"
indicates how frequently that kthread should run, giving the number
of jiffies between force-quiescent-state scans, in this case three,
which is way less than 23807. Finally, the root rcu_node structure's
->qsmask field is printed, which will normally be zero.
If the relevant grace-period kthread has been unable to run prior to If the relevant grace-period kthread has been unable to run prior to
the stall warning, the following additional line is printed: the stall warning, the following additional line is printed:


@ -210,7 +210,7 @@ synchronize_rcu()
rcu_assign_pointer() rcu_assign_pointer()
typeof(p) rcu_assign_pointer(p, typeof(p) v); void rcu_assign_pointer(p, typeof(p) v);
Yes, rcu_assign_pointer() -is- implemented as a macro, though it Yes, rcu_assign_pointer() -is- implemented as a macro, though it
would be cool to be able to declare a function in this manner. would be cool to be able to declare a function in this manner.
@ -218,9 +218,9 @@ rcu_assign_pointer()
The updater uses this function to assign a new value to an The updater uses this function to assign a new value to an
RCU-protected pointer, in order to safely communicate the change RCU-protected pointer, in order to safely communicate the change
in value from the updater to the reader. This function returns in value from the updater to the reader. This macro does not
the new value, and also executes any memory-barrier instructions evaluate to an rvalue, but it does execute any memory-barrier
required for a given CPU architecture. instructions required for a given CPU architecture.
Perhaps just as important, it serves to document (1) which Perhaps just as important, it serves to document (1) which
pointers are protected by RCU and (2) the point at which a pointers are protected by RCU and (2) the point at which a
@ -815,11 +815,13 @@ RCU list traversal:
list_next_rcu list_next_rcu
list_for_each_entry_rcu list_for_each_entry_rcu
list_for_each_entry_continue_rcu list_for_each_entry_continue_rcu
list_for_each_entry_from_rcu
hlist_first_rcu hlist_first_rcu
hlist_next_rcu hlist_next_rcu
hlist_pprev_rcu hlist_pprev_rcu
hlist_for_each_entry_rcu hlist_for_each_entry_rcu
hlist_for_each_entry_rcu_bh hlist_for_each_entry_rcu_bh
hlist_for_each_entry_from_rcu
hlist_for_each_entry_continue_rcu hlist_for_each_entry_continue_rcu
hlist_for_each_entry_continue_rcu_bh hlist_for_each_entry_continue_rcu_bh
hlist_nulls_first_rcu hlist_nulls_first_rcu
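
The rcu_assign_pointer()/rcu_dereference() pairing described above typically looks like this minimal sketch (all names hypothetical):

    #include <linux/rcupdate.h>
    #include <linux/spinlock.h>

    struct foo { int a; };                  /* hypothetical */
    static struct foo __rcu *gp;
    static DEFINE_SPINLOCK(gp_lock);

    /* Updater: initialize first, then publish with rcu_assign_pointer(). */
    void publish_foo(struct foo *newp)
    {
            spin_lock(&gp_lock);
            rcu_assign_pointer(gp, newp);   /* orders initialization before publication */
            spin_unlock(&gp_lock);
    }

    /* Reader: fetch with rcu_dereference() inside a read-side critical section. */
    int read_foo_a(void)
    {
            struct foo *p;
            int a = -1;

            rcu_read_lock();
            p = rcu_dereference(gp);
            if (p)
                    a = p->a;
            rcu_read_unlock();
            return a;
    }
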


@ -2163,9 +2163,6 @@
This tests the locking primitive's ability to This tests the locking primitive's ability to
transition abruptly to and from idle. transition abruptly to and from idle.
locktorture.torture_runnable= [BOOT]
Start locktorture running at boot time.
locktorture.torture_type= [KNL] locktorture.torture_type= [KNL]
Specify the locking implementation to test. Specify the locking implementation to test.
@ -3528,7 +3525,9 @@
see CONFIG_RAS_CEC help text. see CONFIG_RAS_CEC help text.
rcu_nocbs= [KNL] rcu_nocbs= [KNL]
The argument is a cpu list, as described above. The argument is a cpu list, as described above,
except that the string "all" can be used to
specify every CPU on the system.
In kernels built with CONFIG_RCU_NOCB_CPU=y, set In kernels built with CONFIG_RCU_NOCB_CPU=y, set
the specified list of CPUs to be no-callback CPUs. the specified list of CPUs to be no-callback CPUs.
@ -3575,6 +3574,12 @@
the propagation of recent CPU-hotplug changes up the propagation of recent CPU-hotplug changes up
the rcu_node combining tree. the rcu_node combining tree.
rcutree.use_softirq= [KNL]
If set to zero, move all RCU_SOFTIRQ processing to
per-CPU rcuc kthreads. Defaults to a non-zero
value, meaning that RCU_SOFTIRQ is used by default.
Specify rcutree.use_softirq=0 to use rcuc kthreads.
rcutree.rcu_fanout_exact= [KNL] rcutree.rcu_fanout_exact= [KNL]
Disable autobalancing of the rcu_node combining Disable autobalancing of the rcu_node combining
tree. This is used by rcutorture, and might tree. This is used by rcutorture, and might
@ -3593,7 +3598,14 @@
Set required age in jiffies for a Set required age in jiffies for a
given grace period before RCU starts given grace period before RCU starts
soliciting quiescent-state help from soliciting quiescent-state help from
rcu_note_context_switch(). rcu_note_context_switch(). If not specified, the
kernel will calculate a value based on the most
recent settings of rcutree.jiffies_till_first_fqs
and rcutree.jiffies_till_next_fqs.
This calculated value may be viewed in
rcutree.jiffies_to_sched_qs. Any attempt to
set rcutree.jiffies_to_sched_qs will be
cheerfully overwritten.
rcutree.jiffies_till_first_fqs= [KNL] rcutree.jiffies_till_first_fqs= [KNL]
Set delay from grace-period initialization to Set delay from grace-period initialization to
@ -3617,12 +3629,13 @@
RCU_BOOST is not set, valid values are 0-99 and RCU_BOOST is not set, valid values are 0-99 and
the default is zero (non-realtime operation). the default is zero (non-realtime operation).
rcutree.rcu_nocb_leader_stride= [KNL] rcutree.rcu_nocb_gp_stride= [KNL]
Set the number of NOCB kthread groups, which Set the number of NOCB callback kthreads in
defaults to the square root of the number of each group, which defaults to the square root
CPUs. Larger numbers reduces the wakeup overhead of the number of CPUs. Larger numbers reduce
on the per-CPU grace-period kthreads, but increases the wakeup overhead on the global grace-period
that same overhead on each group's leader. kthread, but increases that same overhead on
each group's NOCB grace-period kthread.
rcutree.qhimark= [KNL] rcutree.qhimark= [KNL]
Set threshold of queued RCU callbacks beyond which Set threshold of queued RCU callbacks beyond which
@ -3649,6 +3662,11 @@
This wake_up() will be accompanied by a This wake_up() will be accompanied by a
WARN_ONCE() splat and an ftrace_dump(). WARN_ONCE() splat and an ftrace_dump().
rcutree.sysrq_rcu= [KNL]
Commandeer a sysrq key to dump out Tree RCU's
rcu_node tree with an eye towards determining
why a new grace period has not yet started.
rcuperf.gp_async= [KNL] rcuperf.gp_async= [KNL]
Measure performance of asynchronous Measure performance of asynchronous
grace-period primitives such as call_rcu(). grace-period primitives such as call_rcu().
@ -3684,9 +3702,6 @@
the same as for rcuperf.nreaders. the same as for rcuperf.nreaders.
N, where N is the number of CPUs N, where N is the number of CPUs
rcuperf.perf_runnable= [BOOT]
Start rcuperf running at boot time.
rcuperf.perf_type= [KNL] rcuperf.perf_type= [KNL]
Specify the RCU implementation to test. Specify the RCU implementation to test.
@ -3703,24 +3718,6 @@
in microseconds. The default of zero says in microseconds. The default of zero says
no holdoff. no holdoff.
rcutorture.cbflood_inter_holdoff= [KNL]
Set holdoff time (jiffies) between successive
callback-flood tests.
rcutorture.cbflood_intra_holdoff= [KNL]
Set holdoff time (jiffies) between successive
bursts of callbacks within a given callback-flood
test.
rcutorture.cbflood_n_burst= [KNL]
Set the number of bursts making up a given
callback-flood test. Set this to zero to
disable callback-flood testing.
rcutorture.cbflood_n_per_burst= [KNL]
Set the number of callbacks to be registered
in a given burst of a callback-flood test.
rcutorture.fqs_duration= [KNL] rcutorture.fqs_duration= [KNL]
Set duration of force_quiescent_state bursts Set duration of force_quiescent_state bursts
in microseconds. in microseconds.
@ -3774,8 +3771,8 @@
Set time (s) after boot for CPU-hotplug testing. Set time (s) after boot for CPU-hotplug testing.
rcutorture.onoff_interval= [KNL] rcutorture.onoff_interval= [KNL]
Set time (s) between CPU-hotplug operations, or Set time (jiffies) between CPU-hotplug operations,
zero to disable CPU-hotplug testing. or zero to disable CPU-hotplug testing.
rcutorture.shuffle_interval= [KNL] rcutorture.shuffle_interval= [KNL]
Set task-shuffle interval (s). Shuffling tasks Set task-shuffle interval (s). Shuffling tasks
@ -3793,6 +3790,9 @@
rcutorture.stall_cpu_holdoff= [KNL] rcutorture.stall_cpu_holdoff= [KNL]
Time to wait (s) after boot before inducing stall. Time to wait (s) after boot before inducing stall.
rcutorture.stall_cpu_irqsoff= [KNL]
Disable interrupts while stalling if set.
rcutorture.stat_interval= [KNL] rcutorture.stat_interval= [KNL]
Time (s) between statistics printk()s. Time (s) between statistics printk()s.
@ -3817,15 +3817,16 @@
Test RCU's dyntick-idle handling. See also the Test RCU's dyntick-idle handling. See also the
rcutorture.shuffle_interval parameter. rcutorture.shuffle_interval parameter.
rcutorture.torture_runnable= [BOOT]
Start rcutorture running at boot time.
rcutorture.torture_type= [KNL] rcutorture.torture_type= [KNL]
Specify the RCU implementation to test. Specify the RCU implementation to test.
rcutorture.verbose= [KNL] rcutorture.verbose= [KNL]
Enable additional printk() statements. Enable additional printk() statements.
rcupdate.rcu_cpu_stall_ftrace_dump= [KNL]
Dump ftrace buffer after reporting RCU CPU
stall warning.
rcupdate.rcu_cpu_stall_suppress= [KNL] rcupdate.rcu_cpu_stall_suppress= [KNL]
Suppress RCU CPU stall warning messages. Suppress RCU CPU stall warning messages.
@ -3864,12 +3865,6 @@
rcupdate.rcu_self_test= [KNL] rcupdate.rcu_self_test= [KNL]
Run the RCU early boot self tests Run the RCU early boot self tests
rcupdate.rcu_self_test_bh= [KNL]
Run the RCU bh early boot self tests
rcupdate.rcu_self_test_sched= [KNL]
Run the RCU sched early boot self tests
rdinit= [KNL] rdinit= [KNL]
Format: <full_path> Format: <full_path>
Run specified binary instead of /init from the ramdisk, Run specified binary instead of /init from the ramdisk,


@ -57,11 +57,6 @@ torture_type Type of lock to torture. By default, only spinlocks will
o "rwsem_lock": read/write down() and up() semaphore pairs. o "rwsem_lock": read/write down() and up() semaphore pairs.
torture_runnable Start locktorture at boot time in the case where the
module is built into the kernel, otherwise wait for
torture_runnable to be set via sysfs before starting.
By default it will begin once the module is loaded.
** Torture-framework (RCU + locking) ** ** Torture-framework (RCU + locking) **


@ -46,6 +46,7 @@
#include <linux/pci.h> #include <linux/pci.h>
#include <linux/smp.h> #include <linux/smp.h>
#include <linux/syscore_ops.h> #include <linux/syscore_ops.h>
#include <linux/rcupdate.h>
#include <asm/cpufeature.h> #include <asm/cpufeature.h>
#include <asm/e820/api.h> #include <asm/e820/api.h>
@ -793,6 +794,9 @@ void mtrr_ap_init(void)
if (!use_intel() || mtrr_aps_delayed_init) if (!use_intel() || mtrr_aps_delayed_init)
return; return;
rcu_cpu_starting(smp_processor_id());
/* /*
* Ideally we should hold mtrr_mutex here to avoid mtrr entries * Ideally we should hold mtrr_mutex here to avoid mtrr entries
* changed, but this routine will be called in cpu boot time, * changed, but this routine will be called in cpu boot time,


@ -346,7 +346,7 @@ static inline void gov_clear_update_util(struct cpufreq_policy *policy)
for_each_cpu(i, policy->cpus) for_each_cpu(i, policy->cpus)
cpufreq_remove_update_util_hook(i); cpufreq_remove_update_util_hook(i);
synchronize_sched(); synchronize_rcu();
} }
static struct policy_dbs_info *alloc_policy_dbs_info(struct cpufreq_policy *policy, static struct policy_dbs_info *alloc_policy_dbs_info(struct cpufreq_policy *policy,


@ -163,7 +163,7 @@ static int expand_fdtable(struct files_struct *files, unsigned int nr)
* or have finished their rcu_read_lock_sched() section. * or have finished their rcu_read_lock_sched() section.
*/ */
if (atomic_read(&files->count) > 1) if (atomic_read(&files->count) > 1)
synchronize_sched(); synchronize_rcu();
spin_lock(&files->file_lock); spin_lock(&files->file_lock);
if (!new_fdt) if (!new_fdt)
@ -391,7 +391,7 @@ static struct fdtable *close_files(struct files_struct * files)
struct file * file = xchg(&fdt->fd[i], NULL); struct file * file = xchg(&fdt->fd[i], NULL);
if (file) { if (file) {
filp_close(file, files); filp_close(file, files);
cond_resched_rcu_qs(); cond_resched_tasks_rcu_qs();
} }
} }
i++; i++;


@ -297,6 +297,10 @@
KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \ KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
VMLINUX_SYMBOL(__stop___tracepoints_ptrs) = .; \ VMLINUX_SYMBOL(__stop___tracepoints_ptrs) = .; \
*(__tracepoints_strings)/* Tracepoints: strings */ \ *(__tracepoints_strings)/* Tracepoints: strings */ \
. = ALIGN(8); \
__start___srcu_struct = .; \
*(___srcu_struct_ptrs) \
__end___srcu_struct = .; \
} \ } \
\ \
.rodata1 : AT(ADDR(.rodata1) - LOAD_OFFSET) { \ .rodata1 : AT(ADDR(.rodata1) - LOAD_OFFSET) { \


@ -337,7 +337,7 @@ static inline bool inode_to_wb_is_valid(struct inode *inode)
* holding either @inode->i_lock, @inode->i_mapping->tree_lock, or the * holding either @inode->i_lock, @inode->i_mapping->tree_lock, or the
* associated wb's list_lock. * associated wb's list_lock.
*/ */
static inline struct bdi_writeback *inode_to_wb(struct inode *inode) static inline struct bdi_writeback *inode_to_wb(const struct inode *inode)
{ {
#ifdef CONFIG_LOCKDEP #ifdef CONFIG_LOCKDEP
WARN_ON_ONCE(debug_locks && WARN_ON_ONCE(debug_locks &&


@ -720,9 +720,31 @@ do { \
lock_acquire(&(lock)->dep_map, 0, 0, 1, 1, NULL, _THIS_IP_); \ lock_acquire(&(lock)->dep_map, 0, 0, 1, 1, NULL, _THIS_IP_); \
lock_release(&(lock)->dep_map, 0, _THIS_IP_); \ lock_release(&(lock)->dep_map, 0, _THIS_IP_); \
} while (0) } while (0)
#define lockdep_assert_irqs_enabled() do { \
WARN_ONCE(debug_locks && !current->lockdep_recursion && \
!current->hardirqs_enabled, \
"IRQs not enabled as expected\n"); \
} while (0)
#define lockdep_assert_irqs_disabled() do { \
WARN_ONCE(debug_locks && !current->lockdep_recursion && \
current->hardirqs_enabled, \
"IRQs not disabled as expected\n"); \
} while (0)
#define lockdep_assert_in_irq() do { \
WARN_ONCE(debug_locks && !current->lockdep_recursion && \
!current->hardirq_context, \
"Not in hardirq as expected\n"); \
} while (0)
#else #else
# define might_lock(lock) do { } while (0) # define might_lock(lock) do { } while (0)
# define might_lock_read(lock) do { } while (0) # define might_lock_read(lock) do { } while (0)
# define lockdep_assert_irqs_enabled() do { } while (0)
# define lockdep_assert_irqs_disabled() do { } while (0)
# define lockdep_assert_in_irq() do { } while (0)
#endif #endif
#ifdef CONFIG_LOCKDEP #ifdef CONFIG_LOCKDEP
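
A minimal usage sketch of the new assertions (the function and per-CPU variable are hypothetical):

    #include <linux/lockdep.h>
    #include <linux/percpu-defs.h>

    static DEFINE_PER_CPU(unsigned long, my_counter);      /* hypothetical */

    /* Caller contract: interrupts must already be disabled (for example,
     * hard-IRQ context); lockdep will WARN once if that is not the case. */
    static void update_my_counter(void)
    {
            lockdep_assert_irqs_disabled();
            __this_cpu_inc(my_counter);
    }
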


@ -20,6 +20,7 @@
#include <linux/export.h> #include <linux/export.h>
#include <linux/rbtree_latch.h> #include <linux/rbtree_latch.h>
#include <linux/cfi.h> #include <linux/cfi.h>
#include <linux/srcu.h>
#include <linux/percpu.h> #include <linux/percpu.h>
#include <asm/module.h> #include <asm/module.h>
@ -437,6 +438,10 @@ struct module {
unsigned int num_tracepoints; unsigned int num_tracepoints;
struct tracepoint * const *tracepoints_ptrs; struct tracepoint * const *tracepoints_ptrs;
#endif #endif
#ifdef CONFIG_TREE_SRCU
unsigned int num_srcu_structs;
struct srcu_struct **srcu_struct_ptrs;
#endif
#ifdef HAVE_JUMP_LABEL #ifdef HAVE_JUMP_LABEL
struct jump_entry *jump_entries; struct jump_entry *jump_entries;
unsigned int num_jump_entries; unsigned int num_jump_entries;


@ -43,9 +43,7 @@
* in srcu_notifier_call_chain(): no cache bounces and no memory barriers. * in srcu_notifier_call_chain(): no cache bounces and no memory barriers.
* As compensation, srcu_notifier_chain_unregister() is rather expensive. * As compensation, srcu_notifier_chain_unregister() is rather expensive.
* SRCU notifier chains should be used when the chain will be called very * SRCU notifier chains should be used when the chain will be called very
* often but notifier_blocks will seldom be removed. Also, SRCU notifier * often but notifier_blocks will seldom be removed.
* chains are slightly more difficult to use because they require special
* runtime initialization.
*/ */
struct notifier_block; struct notifier_block;
@ -91,7 +89,7 @@ struct srcu_notifier_head {
(name)->head = NULL; \ (name)->head = NULL; \
} while (0) } while (0)
/* srcu_notifier_heads must be initialized and cleaned up dynamically */ /* srcu_notifier_heads must be cleaned up dynamically */
extern void srcu_init_notifier_head(struct srcu_notifier_head *nh); extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
#define srcu_cleanup_notifier_head(name) \ #define srcu_cleanup_notifier_head(name) \
cleanup_srcu_struct(&(name)->srcu); cleanup_srcu_struct(&(name)->srcu);
@ -104,7 +102,13 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
.head = NULL } .head = NULL }
#define RAW_NOTIFIER_INIT(name) { \ #define RAW_NOTIFIER_INIT(name) { \
.head = NULL } .head = NULL }
/* srcu_notifier_heads cannot be initialized statically */
#define SRCU_NOTIFIER_INIT(name, pcpu) \
{ \
.mutex = __MUTEX_INITIALIZER(name.mutex), \
.head = NULL, \
.srcu = __SRCU_STRUCT_INIT(name.srcu, pcpu), \
}
#define ATOMIC_NOTIFIER_HEAD(name) \ #define ATOMIC_NOTIFIER_HEAD(name) \
struct atomic_notifier_head name = \ struct atomic_notifier_head name = \
@ -116,6 +120,26 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
struct raw_notifier_head name = \ struct raw_notifier_head name = \
RAW_NOTIFIER_INIT(name) RAW_NOTIFIER_INIT(name)
#ifdef CONFIG_TREE_SRCU
#define _SRCU_NOTIFIER_HEAD(name, mod) \
static DEFINE_PER_CPU(struct srcu_data, \
name##_head_srcu_data); \
mod struct srcu_notifier_head name = \
SRCU_NOTIFIER_INIT(name, name##_head_srcu_data)
#else
#define _SRCU_NOTIFIER_HEAD(name, mod) \
mod struct srcu_notifier_head name = \
SRCU_NOTIFIER_INIT(name, name)
#endif
#define SRCU_NOTIFIER_HEAD(name) \
_SRCU_NOTIFIER_HEAD(name, /* not static */)
#define SRCU_NOTIFIER_HEAD_STATIC(name) \
_SRCU_NOTIFIER_HEAD(name, static)
#ifdef __KERNEL__ #ifdef __KERNEL__
extern int atomic_notifier_chain_register(struct atomic_notifier_head *nh, extern int atomic_notifier_chain_register(struct atomic_notifier_head *nh,
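
With the static initializer in place, an SRCU notifier chain can be declared and used roughly as follows (chain, callback, and wrapper names are hypothetical):

    #include <linux/notifier.h>

    SRCU_NOTIFIER_HEAD_STATIC(my_chain);            /* hypothetical chain */

    static int my_event_cb(struct notifier_block *nb, unsigned long event,
                           void *data)
    {
            /* react to the event */
            return NOTIFY_OK;
    }

    static struct notifier_block my_nb = {
            .notifier_call = my_event_cb,
    };

    void my_register(void)
    {
            srcu_notifier_chain_register(&my_chain, &my_nb);
    }

    void my_notify(unsigned long event, void *data)
    {
            srcu_notifier_call_chain(&my_chain, event, data);
    }
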


@ -20,7 +20,7 @@ struct percpu_rw_semaphore {
#define DEFINE_STATIC_PERCPU_RWSEM(name) \ #define DEFINE_STATIC_PERCPU_RWSEM(name) \
static DEFINE_PER_CPU(unsigned int, __percpu_rwsem_rc_##name); \ static DEFINE_PER_CPU(unsigned int, __percpu_rwsem_rc_##name); \
static struct percpu_rw_semaphore name = { \ static struct percpu_rw_semaphore name = { \
.rss = __RCU_SYNC_INITIALIZER(name.rss, RCU_SCHED_SYNC), \ .rss = __RCU_SYNC_INITIALIZER(name.rss), \
.read_count = &__percpu_rwsem_rc_##name, \ .read_count = &__percpu_rwsem_rc_##name, \
.rw_sem = __RWSEM_INITIALIZER(name.rw_sem), \ .rw_sem = __RWSEM_INITIALIZER(name.rw_sem), \
.writer = __RCUWAIT_INITIALIZER(name.writer), \ .writer = __RCUWAIT_INITIALIZER(name.writer), \
@ -41,7 +41,7 @@ static inline void percpu_down_read_preempt_disable(struct percpu_rw_semaphore *
* cannot both change sem->state from readers_fast and start checking * cannot both change sem->state from readers_fast and start checking
* counters while we are here. So if we see !sem->state, we know that * counters while we are here. So if we see !sem->state, we know that
* the writer won't be checking until we're past the preempt_enable() * the writer won't be checking until we're past the preempt_enable()
* and that one the synchronize_sched() is done, the writer will see * and that once the synchronize_rcu() is done, the writer will see
* anything we did within this RCU-sched read-side critical section. * anything we did within this RCU-sched read-side critical section.
*/ */
__this_cpu_inc(*sem->read_count); __this_cpu_inc(*sem->read_count);
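
A minimal sketch of the resulting API usage (semaphore and function names hypothetical):

    #include <linux/percpu-rwsem.h>

    DEFINE_STATIC_PERCPU_RWSEM(my_rwsem);           /* hypothetical */

    void my_reader(void)
    {
            percpu_down_read(&my_rwsem);    /* fast path: per-CPU counter, no barrier */
            /* ... read-side work ... */
            percpu_up_read(&my_rwsem);
    }

    void my_writer(void)
    {
            percpu_down_write(&my_rwsem);   /* waits for readers via RCU */
            /* ... exclusive work ... */
            percpu_up_write(&my_rwsem);
    }
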


@ -27,6 +27,9 @@
#ifndef __INCLUDE_LINUX_RCU_SEGCBLIST_H #ifndef __INCLUDE_LINUX_RCU_SEGCBLIST_H
#define __INCLUDE_LINUX_RCU_SEGCBLIST_H #define __INCLUDE_LINUX_RCU_SEGCBLIST_H
#include <linux/types.h>
#include <linux/atomic.h>
/* Simple unsegmented callback lists. */ /* Simple unsegmented callback lists. */
struct rcu_cblist { struct rcu_cblist {
struct rcu_head *head; struct rcu_head *head;
@ -78,8 +81,14 @@ struct rcu_segcblist {
struct rcu_head *head; struct rcu_head *head;
struct rcu_head **tails[RCU_CBLIST_NSEGS]; struct rcu_head **tails[RCU_CBLIST_NSEGS];
unsigned long gp_seq[RCU_CBLIST_NSEGS]; unsigned long gp_seq[RCU_CBLIST_NSEGS];
#ifdef CONFIG_RCU_NOCB_CPU
atomic_long_t len;
#else
long len; long len;
#endif
long len_lazy; long len_lazy;
u8 enabled;
u8 offloaded;
}; };
#define RCU_SEGCBLIST_INITIALIZER(n) \ #define RCU_SEGCBLIST_INITIALIZER(n) \


@ -26,62 +26,42 @@
#include <linux/wait.h> #include <linux/wait.h>
#include <linux/rcupdate.h> #include <linux/rcupdate.h>
enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };
/* Structure to mediate between updaters and fastpath-using readers. */ /* Structure to mediate between updaters and fastpath-using readers. */
struct rcu_sync { struct rcu_sync {
int gp_state; int gp_state;
int gp_count; int gp_count;
wait_queue_head_t gp_wait; wait_queue_head_t gp_wait;
int cb_state;
struct rcu_head cb_head; struct rcu_head cb_head;
enum rcu_sync_type gp_type;
}; };
extern void rcu_sync_lockdep_assert(struct rcu_sync *);
/** /**
* rcu_sync_is_idle() - Are readers permitted to use their fastpaths? * rcu_sync_is_idle() - Are readers permitted to use their fastpaths?
* @rsp: Pointer to rcu_sync structure to use for synchronization * @rsp: Pointer to rcu_sync structure to use for synchronization
* *
* Returns true if readers are permitted to use their fastpaths. * Returns true if readers are permitted to use their fastpaths. Must be
* Must be invoked within an RCU read-side critical section whose * invoked within some flavor of RCU read-side critical section.
* flavor matches that of the rcu_sync struture.
*/ */
static inline bool rcu_sync_is_idle(struct rcu_sync *rsp) static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
{ {
#ifdef CONFIG_PROVE_RCU RCU_LOCKDEP_WARN(!rcu_read_lock_any_held(),
rcu_sync_lockdep_assert(rsp); "suspicious rcu_sync_is_idle() usage");
#endif return !READ_ONCE(rsp->gp_state); /* GP_IDLE */
return !rsp->gp_state; /* GP_IDLE */
} }
extern void rcu_sync_init(struct rcu_sync *, enum rcu_sync_type); extern void rcu_sync_init(struct rcu_sync *);
extern void rcu_sync_enter_start(struct rcu_sync *); extern void rcu_sync_enter_start(struct rcu_sync *);
extern void rcu_sync_enter(struct rcu_sync *); extern void rcu_sync_enter(struct rcu_sync *);
extern void rcu_sync_exit(struct rcu_sync *); extern void rcu_sync_exit(struct rcu_sync *);
extern void rcu_sync_dtor(struct rcu_sync *); extern void rcu_sync_dtor(struct rcu_sync *);
#define __RCU_SYNC_INITIALIZER(name, type) { \ #define __RCU_SYNC_INITIALIZER(name) { \
.gp_state = 0, \ .gp_state = 0, \
.gp_count = 0, \ .gp_count = 0, \
.gp_wait = __WAIT_QUEUE_HEAD_INITIALIZER(name.gp_wait), \ .gp_wait = __WAIT_QUEUE_HEAD_INITIALIZER(name.gp_wait), \
.cb_state = 0, \
.gp_type = type, \
} }
#define __DEFINE_RCU_SYNC(name, type) \ #define DEFINE_RCU_SYNC(name) \
struct rcu_sync_struct name = __RCU_SYNC_INITIALIZER(name, type) struct rcu_sync name = __RCU_SYNC_INITIALIZER(name)
#define DEFINE_RCU_SYNC(name) \
__DEFINE_RCU_SYNC(name, RCU_SYNC)
#define DEFINE_RCU_SCHED_SYNC(name) \
__DEFINE_RCU_SYNC(name, RCU_SCHED_SYNC)
#define DEFINE_RCU_BH_SYNC(name) \
__DEFINE_RCU_SYNC(name, RCU_BH_SYNC)
#endif /* _LINUX_RCU_SYNC_H_ */ #endif /* _LINUX_RCU_SYNC_H_ */
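
Under the new single-flavor API, typical usage looks roughly like this sketch (structure and function names hypothetical):

    #include <linux/rcu_sync.h>
    #include <linux/rcupdate.h>

    static struct rcu_sync my_rss;          /* hypothetical */

    void my_setup(void)
    {
            rcu_sync_init(&my_rss);         /* single-argument form per the new header */
    }

    void my_reader(void)
    {
            rcu_read_lock();
            if (rcu_sync_is_idle(&my_rss)) {
                    /* fast path: no writer active */
            } else {
                    /* slow path: coordinate with the writer */
            }
            rcu_read_unlock();
    }

    void my_writer_begin(void)
    {
            rcu_sync_enter(&my_rss);        /* forces readers onto the slow path */
    }

    void my_writer_end(void)
    {
            rcu_sync_exit(&my_rss);         /* lets readers return to the fast path */
    }
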


@ -40,6 +40,24 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
*/ */
#define list_next_rcu(list) (*((struct list_head __rcu **)(&(list)->next))) #define list_next_rcu(list) (*((struct list_head __rcu **)(&(list)->next)))
/*
* Check during list traversal that we are within an RCU reader
*/
#define check_arg_count_one(dummy)
#ifdef CONFIG_PROVE_RCU_LIST
#define __list_check_rcu(dummy, cond, extra...) \
({ \
check_arg_count_one(extra); \
RCU_LOCKDEP_WARN(!cond && !rcu_read_lock_any_held(), \
"RCU-list traversed in non-reader section!"); \
})
#else
#define __list_check_rcu(dummy, cond, extra...) \
({ check_arg_count_one(extra); })
#endif
/* /*
* Insert a new entry between two known consecutive entries. * Insert a new entry between two known consecutive entries.
* *
@ -182,7 +200,7 @@ static inline void list_replace_rcu(struct list_head *old,
* @list: the RCU-protected list to splice * @list: the RCU-protected list to splice
* @prev: points to the last element of the existing list * @prev: points to the last element of the existing list
* @next: points to the first element of the existing list * @next: points to the first element of the existing list
* @sync: function to sync: synchronize_rcu(), synchronize_sched(), ... * @sync: synchronize_rcu, synchronize_rcu_expedited, ...
* *
* The list pointed to by @prev and @next can be RCU-read traversed * The list pointed to by @prev and @next can be RCU-read traversed
* concurrently with this function. * concurrently with this function.
@ -240,7 +258,7 @@ static inline void __list_splice_init_rcu(struct list_head *list,
* designed for stacks. * designed for stacks.
* @list: the RCU-protected list to splice * @list: the RCU-protected list to splice
* @head: the place in the existing list to splice the first list into * @head: the place in the existing list to splice the first list into
* @sync: function to sync: synchronize_rcu(), synchronize_sched(), ... * @sync: synchronize_rcu, synchronize_rcu_expedited, ...
*/ */
static inline void list_splice_init_rcu(struct list_head *list, static inline void list_splice_init_rcu(struct list_head *list,
struct list_head *head, struct list_head *head,
@ -255,7 +273,7 @@ static inline void list_splice_init_rcu(struct list_head *list,
* list, designed for queues. * list, designed for queues.
* @list: the RCU-protected list to splice * @list: the RCU-protected list to splice
* @head: the place in the existing list to splice the first list into * @head: the place in the existing list to splice the first list into
* @sync: function to sync: synchronize_rcu(), synchronize_sched(), ... * @sync: synchronize_rcu, synchronize_rcu_expedited, ...
*/ */
static inline void list_splice_tail_init_rcu(struct list_head *list, static inline void list_splice_tail_init_rcu(struct list_head *list,
struct list_head *head, struct list_head *head,
@ -343,14 +361,16 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
* @pos: the type * to use as a loop cursor. * @pos: the type * to use as a loop cursor.
* @head: the head for your list. * @head: the head for your list.
* @member: the name of the list_head within the struct. * @member: the name of the list_head within the struct.
* @cond: optional lockdep expression if called from non-RCU protection.
* *
* This list-traversal primitive may safely run concurrently with * This list-traversal primitive may safely run concurrently with
* the _rcu list-mutation primitives such as list_add_rcu() * the _rcu list-mutation primitives such as list_add_rcu()
* as long as the traversal is guarded by rcu_read_lock(). * as long as the traversal is guarded by rcu_read_lock().
*/ */
#define list_for_each_entry_rcu(pos, head, member) \ #define list_for_each_entry_rcu(pos, head, member, cond...) \
for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \ for (__list_check_rcu(dummy, ## cond, 0), \
&pos->member != (head); \ pos = list_entry_rcu((head)->next, typeof(*pos), member); \
&pos->member != (head); \
pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
/** /**
@ -359,13 +379,12 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
* @type: the type of the struct this is embedded in. * @type: the type of the struct this is embedded in.
* @member: the name of the list_head within the struct. * @member: the name of the list_head within the struct.
* *
* This primitive may safely run concurrently with the _rcu list-mutation * This primitive may safely run concurrently with the _rcu
* primitives such as list_add_rcu(), but requires some implicit RCU * list-mutation primitives such as list_add_rcu(), but requires some
* read-side guarding. One example is running within a special * implicit RCU read-side guarding. One example is running within a special
* exception-time environment where preemption is disabled and where * exception-time environment where preemption is disabled and where lockdep
* lockdep cannot be invoked (in which case updaters must use RCU-sched, * cannot be invoked. Another example is when items are added to the list,
* as in synchronize_sched(), call_rcu_sched(), and friends). Another * but never deleted.
* example is when items are added to the list, but never deleted.
*/ */
#define list_entry_lockless(ptr, type, member) \ #define list_entry_lockless(ptr, type, member) \
container_of((typeof(ptr))READ_ONCE(ptr), type, member) container_of((typeof(ptr))READ_ONCE(ptr), type, member)
@ -376,13 +395,12 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
* @head: the head for your list. * @head: the head for your list.
* @member: the name of the list_struct within the struct. * @member: the name of the list_struct within the struct.
* *
* This primitive may safely run concurrently with the _rcu list-mutation * This primitive may safely run concurrently with the _rcu
* primitives such as list_add_rcu(), but requires some implicit RCU * list-mutation primitives such as list_add_rcu(), but requires some
* read-side guarding. One example is running within a special * implicit RCU read-side guarding. One example is running within a special
* exception-time environment where preemption is disabled and where * exception-time environment where preemption is disabled and where lockdep
* lockdep cannot be invoked (in which case updaters must use RCU-sched, * cannot be invoked. Another example is when items are added to the list,
* as in synchronize_sched(), call_rcu_sched(), and friends). Another * but never deleted.
* example is when items are added to the list, but never deleted.
*/ */
#define list_for_each_entry_lockless(pos, head, member) \ #define list_for_each_entry_lockless(pos, head, member) \
for (pos = list_entry_lockless((head)->next, typeof(*pos), member); \ for (pos = list_entry_lockless((head)->next, typeof(*pos), member); \
@ -396,13 +414,43 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
* @member: the name of the list_head within the struct. * @member: the name of the list_head within the struct.
* *
* Continue to iterate over list of given type, continuing after * Continue to iterate over list of given type, continuing after
* the current position. * the current position which must have been in the list when the RCU read
* lock was taken.
* This would typically require either that you obtained the node from a
* previous walk of the list in the same RCU read-side critical section, or
* that you held some sort of non-RCU reference (such as a reference count)
* to keep the node alive *and* in the list.
*
* This iterator is similar to list_for_each_entry_from_rcu() except
* this starts after the given position and that one starts at the given
* position.
*/ */
#define list_for_each_entry_continue_rcu(pos, head, member) \ #define list_for_each_entry_continue_rcu(pos, head, member) \
for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \ for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
&pos->member != (head); \ &pos->member != (head); \
pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
/**
* list_for_each_entry_from_rcu - iterate over a list from current point
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the list_node within the struct.
*
* Iterate over the tail of a list starting from a given position,
* which must have been in the list when the RCU read lock was taken.
* This would typically require either that you obtained the node from a
* previous walk of the list in the same RCU read-side critical section, or
* that you held some sort of non-RCU reference (such as a reference count)
* to keep the node alive *and* in the list.
*
* This iterator is similar to list_for_each_entry_continue_rcu() except
* this starts from the given position and that one starts from the position
* after the given position.
*/
#define list_for_each_entry_from_rcu(pos, head, member) \
for (; &(pos)->member != (head); \
pos = list_entry_rcu(pos->member.next, typeof(*(pos)), member))
/** /**
* hlist_del_rcu - deletes entry from hash list without re-initialization * hlist_del_rcu - deletes entry from hash list without re-initialization
* @n: the element to delete from the hash list. * @n: the element to delete from the hash list.
@ -588,13 +636,15 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
* @pos: the type * to use as a loop cursor. * @pos: the type * to use as a loop cursor.
* @head: the head for your list. * @head: the head for your list.
* @member: the name of the hlist_node within the struct. * @member: the name of the hlist_node within the struct.
* @cond: optional lockdep expression if called from non-RCU protection.
* *
* This list-traversal primitive may safely run concurrently with * This list-traversal primitive may safely run concurrently with
* the _rcu list-mutation primitives such as hlist_add_head_rcu() * the _rcu list-mutation primitives such as hlist_add_head_rcu()
* as long as the traversal is guarded by rcu_read_lock(). * as long as the traversal is guarded by rcu_read_lock().
*/ */
#define hlist_for_each_entry_rcu(pos, head, member) \ #define hlist_for_each_entry_rcu(pos, head, member, cond...) \
for (pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\ for (__list_check_rcu(dummy, ## cond, 0), \
pos = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)),\
typeof(*(pos)), member); \ typeof(*(pos)), member); \
pos; \ pos; \
pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\ pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
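
A minimal sketch of the new optional lockdep argument in action (type, list, and lock names hypothetical):

    #include <linux/rculist.h>
    #include <linux/lockdep.h>
    #include <linux/mutex.h>

    struct foo {                            /* hypothetical */
            struct list_head list;
            int key;
    };

    static LIST_HEAD(foo_list);
    static DEFINE_MUTEX(foo_mutex);

    /* Legal either under rcu_read_lock() or with foo_mutex held; the optional
     * lockdep expression keeps CONFIG_PROVE_RCU_LIST from complaining in the
     * mutex-held (update-side) case. */
    struct foo *find_foo(int key)
    {
            struct foo *p;

            list_for_each_entry_rcu(p, &foo_list, list,
                                    lockdep_is_held(&foo_mutex)) {
                    if (p->key == key)
                            return p;
            }
            return NULL;
    }
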


@ -48,24 +48,14 @@
#define ulong2long(a) (*(long *)(&(a))) #define ulong2long(a) (*(long *)(&(a)))
/* Exported common interfaces */ /* Exported common interfaces */
#ifdef CONFIG_PREEMPT_RCU
void call_rcu(struct rcu_head *head, rcu_callback_t func); void call_rcu(struct rcu_head *head, rcu_callback_t func);
#else /* #ifdef CONFIG_PREEMPT_RCU */
#define call_rcu call_rcu_sched
#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
void call_rcu_bh(struct rcu_head *head, rcu_callback_t func);
void call_rcu_sched(struct rcu_head *head, rcu_callback_t func);
void synchronize_sched(void);
void rcu_barrier_tasks(void); void rcu_barrier_tasks(void);
void synchronize_rcu(void);
#ifdef CONFIG_PREEMPT_RCU #ifdef CONFIG_PREEMPT_RCU
void __rcu_read_lock(void); void __rcu_read_lock(void);
void __rcu_read_unlock(void); void __rcu_read_unlock(void);
void rcu_read_unlock_special(struct task_struct *t);
void synchronize_rcu(void);
/* /*
* Defined as a macro as it is a very low level header included from * Defined as a macro as it is a very low level header included from
@ -87,11 +77,6 @@ static inline void __rcu_read_unlock(void)
preempt_enable(); preempt_enable();
} }
static inline void synchronize_rcu(void)
{
synchronize_sched();
}
static inline int rcu_preempt_depth(void) static inline int rcu_preempt_depth(void)
{ {
return 0; return 0;
@ -102,11 +87,8 @@ static inline int rcu_preempt_depth(void)
/* Internal to kernel */ /* Internal to kernel */
void rcu_init(void); void rcu_init(void);
extern int rcu_scheduler_active __read_mostly; extern int rcu_scheduler_active __read_mostly;
void rcu_sched_qs(void); void rcu_sched_clock_irq(int user);
void rcu_bh_qs(void);
void rcu_check_callbacks(int user);
void rcu_report_dead(unsigned int cpu); void rcu_report_dead(unsigned int cpu);
void rcu_cpu_starting(unsigned int cpu);
void rcutree_migrate_callbacks(int cpu); void rcutree_migrate_callbacks(int cpu);
#ifdef CONFIG_RCU_STALL_COMMON #ifdef CONFIG_RCU_STALL_COMMON
@ -135,11 +117,10 @@ static inline void rcu_init_nohz(void) { }
* RCU_NONIDLE - Indicate idle-loop code that needs RCU readers * RCU_NONIDLE - Indicate idle-loop code that needs RCU readers
* @a: Code that RCU needs to pay attention to. * @a: Code that RCU needs to pay attention to.
* *
* RCU, RCU-bh, and RCU-sched read-side critical sections are forbidden * RCU read-side critical sections are forbidden in the inner idle loop,
* in the inner idle loop, that is, between the rcu_idle_enter() and * that is, between the rcu_idle_enter() and the rcu_idle_exit() -- RCU
* the rcu_idle_exit() -- RCU will happily ignore any such read-side * will happily ignore any such read-side critical sections. However,
* critical sections. However, things like powertop need tracepoints * things like powertop need tracepoints in the inner idle loop.
* in the inner idle loop.
* *
* This macro provides the way out: RCU_NONIDLE(do_something_with_RCU()) * This macro provides the way out: RCU_NONIDLE(do_something_with_RCU())
* will tell RCU that it needs to pay attention, invoke its argument * will tell RCU that it needs to pay attention, invoke its argument
@ -158,44 +139,40 @@ static inline void rcu_init_nohz(void) { }
} while (0) } while (0)
/* /*
* Note a voluntary context switch for RCU-tasks benefit. This is a * Note a quasi-voluntary context switch for RCU-tasks's benefit.
* macro rather than an inline function to avoid #include hell. * This is a macro rather than an inline function to avoid #include hell.
*/ */
#ifdef CONFIG_TASKS_RCU #ifdef CONFIG_TASKS_RCU
#define rcu_note_voluntary_context_switch_lite(t) \ #define rcu_tasks_qs(t) \
do { \ do { \
if (READ_ONCE((t)->rcu_tasks_holdout)) \ if (READ_ONCE((t)->rcu_tasks_holdout)) \
WRITE_ONCE((t)->rcu_tasks_holdout, false); \ WRITE_ONCE((t)->rcu_tasks_holdout, false); \
} while (0) } while (0)
#define rcu_note_voluntary_context_switch(t) \ #define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t)
do { \
rcu_all_qs(); \
rcu_note_voluntary_context_switch_lite(t); \
} while (0)
void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func); void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
void synchronize_rcu_tasks(void); void synchronize_rcu_tasks(void);
void exit_tasks_rcu_start(void); void exit_tasks_rcu_start(void);
void exit_tasks_rcu_finish(void); void exit_tasks_rcu_finish(void);
#else /* #ifdef CONFIG_TASKS_RCU */ #else /* #ifdef CONFIG_TASKS_RCU */
#define rcu_note_voluntary_context_switch_lite(t) do { } while (0) #define rcu_tasks_qs(t) do { } while (0)
#define rcu_note_voluntary_context_switch(t) rcu_all_qs() #define rcu_note_voluntary_context_switch(t) do { } while (0)
#define call_rcu_tasks call_rcu_sched #define call_rcu_tasks call_rcu
#define synchronize_rcu_tasks synchronize_sched #define synchronize_rcu_tasks synchronize_rcu
static inline void exit_tasks_rcu_start(void) { } static inline void exit_tasks_rcu_start(void) { }
static inline void exit_tasks_rcu_finish(void) { } static inline void exit_tasks_rcu_finish(void) { }
#endif /* #else #ifdef CONFIG_TASKS_RCU */ #endif /* #else #ifdef CONFIG_TASKS_RCU */
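For orientation, a hedged sketch of the RCU-tasks API declared above; struct tramp and the helper names are invented, and the object is assumed to already be unreachable so that no task can newly start executing in it.

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct tramp {				/* hypothetical trampoline-like object */
	void *text;
	struct rcu_head rh;
};

static void tramp_free_cb(struct rcu_head *rhp)
{
	kfree(container_of(rhp, struct tramp, rh));
}

/*
 * Free @tp only once every task has passed through a voluntary context
 * switch (or run in userspace), so none can still be executing in it.
 * synchronize_rcu_tasks() is the blocking alternative to call_rcu_tasks().
 */
static void tramp_retire(struct tramp *tp)
{
	call_rcu_tasks(&tp->rh, tramp_free_cb);
}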
/** /**
* cond_resched_rcu_qs - Report potential quiescent states to RCU * cond_resched_tasks_rcu_qs - Report potential quiescent states to RCU
* *
* This macro resembles cond_resched(), except that it is defined to * This macro resembles cond_resched(), except that it is defined to
* report potential quiescent states to RCU-tasks even if the cond_resched() * report potential quiescent states to RCU-tasks even if the cond_resched()
* machinery were to be shut off, as some advocate for PREEMPT kernels. * machinery were to be shut off, as some advocate for PREEMPT kernels.
*/ */
#define cond_resched_rcu_qs() \ #define cond_resched_tasks_rcu_qs() \
do { \ do { \
if (!cond_resched()) \ rcu_tasks_qs(current); \
rcu_note_voluntary_context_switch(current); \ cond_resched(); \
} while (0) } while (0)
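A minimal sketch of how cond_resched_tasks_rcu_qs() is meant to be used from a long-running loop; scan_thread() and its batch processing are hypothetical.

#include <linux/kthread.h>
#include <linux/rcupdate.h>

static int scan_thread(void *arg)
{
	while (!kthread_should_stop()) {
		/* ... process one batch of work outside any RCU read-side section ... */

		/* Report an RCU-tasks quiescent state and reschedule if needed. */
		cond_resched_tasks_rcu_qs();
	}
	return 0;
}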
/* /*
@ -212,10 +189,12 @@ do { \
#endif #endif
/* /*
* init_rcu_head_on_stack()/destroy_rcu_head_on_stack() are needed for dynamic * The init_rcu_head_on_stack() and destroy_rcu_head_on_stack() calls
* initialization and destruction of rcu_head on the stack. rcu_head structures * are needed for dynamic initialization and destruction of rcu_head
* allocated dynamically in the heap or defined statically don't need any * on the stack, and init_rcu_head()/destroy_rcu_head() are needed for
* initialization. * dynamic initialization and destruction of statically allocated rcu_head
* structures. However, rcu_head structures allocated dynamically in the
* heap don't need any initialization.
*/ */
#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
void init_rcu_head(struct rcu_head *head); void init_rcu_head(struct rcu_head *head);
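To make the on-stack case concrete, a hedged sketch that brackets a stack-allocated rcu_head with init_rcu_head_on_stack()/destroy_rcu_head_on_stack() and waits for the callback before unwinding; struct on_stack_wait and its helpers are invented for illustration.

#include <linux/rcupdate.h>
#include <linux/completion.h>

struct on_stack_wait {
	struct rcu_head rh;
	struct completion done;
};

static void on_stack_wait_cb(struct rcu_head *rhp)
{
	complete(&container_of(rhp, struct on_stack_wait, rh)->done);
}

static void wait_one_gp_on_stack(void)
{
	struct on_stack_wait w;

	init_completion(&w.done);
	init_rcu_head_on_stack(&w.rh);	/* debug-objects: this rcu_head lives on the stack */
	call_rcu(&w.rh, on_stack_wait_cb);
	wait_for_completion(&w.done);	/* callback has run, so the frame may be unwound */
	destroy_rcu_head_on_stack(&w.rh);
}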
@ -255,6 +234,7 @@ int debug_lockdep_rcu_enabled(void);
int rcu_read_lock_held(void); int rcu_read_lock_held(void);
int rcu_read_lock_bh_held(void); int rcu_read_lock_bh_held(void);
int rcu_read_lock_sched_held(void); int rcu_read_lock_sched_held(void);
int rcu_read_lock_any_held(void);
#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
@ -275,6 +255,12 @@ static inline int rcu_read_lock_sched_held(void)
{ {
return !preemptible(); return !preemptible();
} }
static inline int rcu_read_lock_any_held(void)
{
return !preemptible();
}
#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */ #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
#ifdef CONFIG_PROVE_RCU #ifdef CONFIG_PROVE_RCU
@ -323,22 +309,21 @@ static inline void rcu_preempt_sleep_check(void) { }
* Helper functions for rcu_dereference_check(), rcu_dereference_protected() * Helper functions for rcu_dereference_check(), rcu_dereference_protected()
* and rcu_assign_pointer(). Some of these could be folded into their * and rcu_assign_pointer(). Some of these could be folded into their
* callers, but they are left separate in order to ease introduction of * callers, but they are left separate in order to ease introduction of
* multiple flavors of pointers to match the multiple flavors of RCU * multiple pointer markings to match different RCU implementations
* (e.g., __rcu_bh, * __rcu_sched, and __srcu), should this make sense in * (e.g., __srcu), should this make sense in the future.
* the future.
*/ */
#ifdef __CHECKER__ #ifdef __CHECKER__
#define rcu_dereference_sparse(p, space) \ #define rcu_check_sparse(p, space) \
((void)(((typeof(*p) space *)p) == p)) ((void)(((typeof(*p) space *)p) == p))
#else /* #ifdef __CHECKER__ */ #else /* #ifdef __CHECKER__ */
#define rcu_dereference_sparse(p, space) #define rcu_check_sparse(p, space)
#endif /* #else #ifdef __CHECKER__ */ #endif /* #else #ifdef __CHECKER__ */
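The effect of rcu_check_sparse() is easiest to see from the declaration side; struct conf and active_conf below are hypothetical.

#include <linux/rcupdate.h>

struct conf { int val; };

/* The __rcu marking is what sparse (__CHECKER__) verifies via rcu_check_sparse(). */
static struct conf __rcu *active_conf;

static int conf_val(void)
{
	struct conf *c;
	int v = -1;

	rcu_read_lock();
	c = rcu_dereference(active_conf);	/* strips __rcu, so sparse stays quiet */
	if (c)
		v = c->val;
	rcu_read_unlock();
	return v;
	/* A direct active_conf->val would draw an address-space warning from sparse. */
}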
#define __rcu_access_pointer(p, space) \ #define __rcu_access_pointer(p, space) \
({ \ ({ \
typeof(*p) *_________p1 = (typeof(*p) *__force)READ_ONCE(p); \ typeof(*p) *_________p1 = (typeof(*p) *__force)READ_ONCE(p); \
rcu_dereference_sparse(p, space); \ rcu_check_sparse(p, space); \
((typeof(*p) __force __kernel *)(_________p1)); \ ((typeof(*p) __force __kernel *)(_________p1)); \
}) })
#define __rcu_dereference_check(p, c, space) \ #define __rcu_dereference_check(p, c, space) \
@ -346,13 +331,13 @@ static inline void rcu_preempt_sleep_check(void) { }
/* Dependency order vs. p above. */ \ /* Dependency order vs. p above. */ \
typeof(*p) *________p1 = (typeof(*p) *__force)READ_ONCE(p); \ typeof(*p) *________p1 = (typeof(*p) *__force)READ_ONCE(p); \
RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_check() usage"); \ RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_check() usage"); \
rcu_dereference_sparse(p, space); \ rcu_check_sparse(p, space); \
((typeof(*p) __force __kernel *)(________p1)); \ ((typeof(*p) __force __kernel *)(________p1)); \
}) })
#define __rcu_dereference_protected(p, c, space) \ #define __rcu_dereference_protected(p, c, space) \
({ \ ({ \
RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_protected() usage"); \ RCU_LOCKDEP_WARN(!(c), "suspicious rcu_dereference_protected() usage"); \
rcu_dereference_sparse(p, space); \ rcu_check_sparse(p, space); \
((typeof(*p) __force __kernel *)(p)); \ ((typeof(*p) __force __kernel *)(p)); \
}) })
#define rcu_dereference_raw(p) \ #define rcu_dereference_raw(p) \
@ -400,15 +385,15 @@ static inline void rcu_preempt_sleep_check(void) { }
* other macros that it invokes. * other macros that it invokes.
*/ */
#define rcu_assign_pointer(p, v) \ #define rcu_assign_pointer(p, v) \
({ \ do { \
uintptr_t _r_a_p__v = (uintptr_t)(v); \ uintptr_t _r_a_p__v = (uintptr_t)(v); \
rcu_check_sparse(p, __rcu); \
\ \
if (__builtin_constant_p(v) && (_r_a_p__v) == (uintptr_t)NULL) \ if (__builtin_constant_p(v) && (_r_a_p__v) == (uintptr_t)NULL) \
WRITE_ONCE((p), (typeof(p))(_r_a_p__v)); \ WRITE_ONCE((p), (typeof(p))(_r_a_p__v)); \
else \ else \
smp_store_release(&p, RCU_INITIALIZER((typeof(p))_r_a_p__v)); \ smp_store_release(&p, RCU_INITIALIZER((typeof(p))_r_a_p__v)); \
_r_a_p__v; \ } while (0)
})
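A hedged publish/read/update sketch tying rcu_assign_pointer() to the read-side and update-side accessors documented below; struct cfg, cur_cfg and cfg_lock are made-up names. Note that the new do/while form above means rcu_assign_pointer() can no longer be used as an expression, which this usage never relied on anyway.

#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/errno.h>
#include <linux/types.h>

struct cfg {
	int threshold;
	struct rcu_head rh;
};

static struct cfg __rcu *cur_cfg;	/* hypothetical RCU-protected global */
static DEFINE_SPINLOCK(cfg_lock);	/* update-side lock */

/* Pointer-only test: no dereference, so rcu_access_pointer() suffices. */
static bool have_cfg(void)
{
	return rcu_access_pointer(cur_cfg) != NULL;
}

/* Reader: rcu_dereference() under rcu_read_lock(). */
static int read_threshold(void)
{
	struct cfg *c;
	int val = -1;

	rcu_read_lock();
	c = rcu_dereference(cur_cfg);
	if (c)
		val = c->threshold;
	rcu_read_unlock();
	return val;
}

/* Updater: fetch the old value under the lock, publish the new one. */
static int update_threshold(int t)
{
	struct cfg *newc, *oldc;

	newc = kmalloc(sizeof(*newc), GFP_KERNEL);
	if (!newc)
		return -ENOMEM;
	newc->threshold = t;

	spin_lock(&cfg_lock);
	oldc = rcu_dereference_protected(cur_cfg, lockdep_is_held(&cfg_lock));
	rcu_assign_pointer(cur_cfg, newc);
	spin_unlock(&cfg_lock);

	if (oldc)
		kfree_rcu(oldc, rh);
	return 0;
}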
/** /**
* rcu_swap_protected() - swap an RCU and a regular pointer * rcu_swap_protected() - swap an RCU and a regular pointer
@ -431,12 +416,12 @@ static inline void rcu_preempt_sleep_check(void) { }
* @p: The pointer to read * @p: The pointer to read
* *
* Return the value of the specified RCU-protected pointer, but omit the * Return the value of the specified RCU-protected pointer, but omit the
* smp_read_barrier_depends() and keep the READ_ONCE(). This is useful * lockdep checks for being in an RCU read-side critical section. This is
* when the value of this pointer is accessed, but the pointer is not * useful when the value of this pointer is accessed, but the pointer is
* dereferenced, for example, when testing an RCU-protected pointer against * not dereferenced, for example, when testing an RCU-protected pointer
* NULL. Although rcu_access_pointer() may also be used in cases where * against NULL. Although rcu_access_pointer() may also be used in cases
* update-side locks prevent the value of the pointer from changing, you * where update-side locks prevent the value of the pointer from changing,
* should instead use rcu_dereference_protected() for this use case. * you should instead use rcu_dereference_protected() for this use case.
* *
* It is also permissible to use rcu_access_pointer() when read-side * It is also permissible to use rcu_access_pointer() when read-side
* access to the pointer was removed at least one grace period ago, as * access to the pointer was removed at least one grace period ago, as
@ -519,12 +504,11 @@ static inline void rcu_preempt_sleep_check(void) { }
* @c: The conditions under which the dereference will take place * @c: The conditions under which the dereference will take place
* *
* Return the value of the specified RCU-protected pointer, but omit * Return the value of the specified RCU-protected pointer, but omit
* both the smp_read_barrier_depends() and the READ_ONCE(). This * the READ_ONCE(). This is useful in cases where update-side locks
* is useful in cases where update-side locks prevent the value of the * prevent the value of the pointer from changing. Please note that this
* pointer from changing. Please note that this primitive does *not* * primitive does *not* prevent the compiler from repeating this reference
* prevent the compiler from repeating this reference or combining it * or combining it with other references, so it should not be used without
* with other references, so it should not be used without protection * protection of appropriate locks.
* of appropriate locks.
* *
* This function is only for update-side use. Using this function * This function is only for update-side use. Using this function
* when protected only by rcu_read_lock() will result in infrequent * when protected only by rcu_read_lock() will result in infrequent
@ -688,14 +672,9 @@ static inline void rcu_read_unlock(void)
/** /**
* rcu_read_lock_bh() - mark the beginning of an RCU-bh critical section * rcu_read_lock_bh() - mark the beginning of an RCU-bh critical section
* *
* This is the equivalent of rcu_read_lock(), but to be used when updates * This is the equivalent of rcu_read_lock(), but also disables softirqs.
* are being done using call_rcu_bh() or synchronize_rcu_bh(). Since * Note that anything else that disables softirqs can also serve as
* both call_rcu_bh() and synchronize_rcu_bh() consider completion of a * an RCU read-side critical section.
* softirq handler to be a quiescent state, a process in RCU read-side
* critical section must be protected by disabling softirqs. Read-side
* critical sections in interrupt context can use just rcu_read_lock(),
* though this should at least be commented to avoid confusing people
* reading the code.
* *
* Note that rcu_read_lock_bh() and the matching rcu_read_unlock_bh() * Note that rcu_read_lock_bh() and the matching rcu_read_unlock_bh()
* must occur in the same context, for example, it is illegal to invoke * must occur in the same context, for example, it is illegal to invoke
@ -728,10 +707,9 @@ static inline void rcu_read_unlock_bh(void)
/** /**
* rcu_read_lock_sched() - mark the beginning of a RCU-sched critical section * rcu_read_lock_sched() - mark the beginning of a RCU-sched critical section
* *
* This is the equivalent of rcu_read_lock(), but to be used when updates * This is the equivalent of rcu_read_lock(), but disables preemption.
* are being done using call_rcu_sched() or synchronize_rcu_sched(). * Read-side critical sections can also be introduced by anything else
* Read-side critical sections can also be introduced by anything that * that disables preemption, including local_irq_disable() and friends.
* disables preemption, including local_irq_disable() and friends.
* *
* Note that rcu_read_lock_sched() and the matching rcu_read_unlock_sched() * Note that rcu_read_lock_sched() and the matching rcu_read_unlock_sched()
* must occur in the same context, for example, it is illegal to invoke * must occur in the same context, for example, it is illegal to invoke
@ -815,7 +793,7 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
*/ */
#define RCU_INIT_POINTER(p, v) \ #define RCU_INIT_POINTER(p, v) \
do { \ do { \
rcu_dereference_sparse(p, __rcu); \ rcu_check_sparse(p, __rcu); \
WRITE_ONCE(p, RCU_INITIALIZER(v)); \ WRITE_ONCE(p, RCU_INITIALIZER(v)); \
} while (0) } while (0)
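For completeness, a small sketch of a case where RCU_INIT_POINTER() (rather than rcu_assign_pointer()) is appropriate, namely initializing to NULL before the structure is visible to readers; struct item is invented.

#include <linux/rcupdate.h>

struct item {
	struct item __rcu *next;
	int key;
};

/* The item is not yet reachable by readers and the pointer is NULL,
 * so no ordering is needed and RCU_INIT_POINTER() suffices. */
static void item_init(struct item *it, int key)
{
	it->key = key;
	RCU_INIT_POINTER(it->next, NULL);
}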
@ -847,7 +825,7 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
/** /**
* kfree_rcu() - kfree an object after a grace period. * kfree_rcu() - kfree an object after a grace period.
* @ptr: pointer to kfree * @ptr: pointer to kfree
* @rcu_head: the name of the struct rcu_head within the type of @ptr. * @rhf: the name of the struct rcu_head within the type of @ptr.
* *
* Many rcu callback functions just call kfree() on the base structure. * Many rcu callback functions just call kfree() on the base structure.
* These functions are trivial, but their size adds up, and furthermore * These functions are trivial, but their size adds up, and furthermore
@ -870,9 +848,13 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
* The BUILD_BUG_ON check must not involve any function calls, hence the * The BUILD_BUG_ON check must not involve any function calls, hence the
* checks are done in macros here. * checks are done in macros here.
*/ */
#define kfree_rcu(ptr, rcu_head) \ #define kfree_rcu(ptr, rhf) \
__kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head)) do { \
typeof (ptr) ___p = (ptr); \
\
if (___p) \
__kfree_rcu(&((___p)->rhf), offsetof(typeof(*(ptr)), rhf)); \
} while (0)
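A short sketch of the documented usage, with an invented struct foo; note that the reworked macro above now also tolerates a NULL pointer.

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct foo {
	int data;
	struct rcu_head rh;
};

/* @old must already be unreachable by new RCU readers. */
static void foo_retire(struct foo *old)
{
	kfree_rcu(old, rh);	/* kfree(old) after a grace period */
}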
/* /*
* Place this after a lock-acquisition primitive to guarantee that * Place this after a lock-acquisition primitive to guarantee that
@ -887,4 +869,98 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
#endif /* #else #ifdef CONFIG_ARCH_WEAK_RELEASE_ACQUIRE */ #endif /* #else #ifdef CONFIG_ARCH_WEAK_RELEASE_ACQUIRE */
/* Has the specified rcu_head structure been handed to call_rcu()? */
/*
* rcu_head_init - Initialize rcu_head for rcu_head_after_call_rcu()
* @rhp: The rcu_head structure to initialize.
*
* If you intend to invoke rcu_head_after_call_rcu() to test whether a
* given rcu_head structure has already been passed to call_rcu(), then
* you must also invoke this rcu_head_init() function on it just after
* allocating that structure. Calls to this function must not race with
* calls to call_rcu(), rcu_head_after_call_rcu(), or callback invocation.
*/
static inline void rcu_head_init(struct rcu_head *rhp)
{
rhp->func = (rcu_callback_t)~0L;
}
/*
* rcu_head_after_call_rcu - Has this rcu_head been passed to call_rcu()?
* @rhp: The rcu_head structure to test.
* @func: The function passed to call_rcu() along with @rhp.
*
* Returns @true if the @rhp has been passed to call_rcu() with @func,
* and @false otherwise. Emits a warning in any other case, including
* the case where @rhp has already been invoked after a grace period.
* Calls to this function must not race with callback invocation. One way
* to avoid such races is to enclose the call to rcu_head_after_call_rcu()
* in an RCU read-side critical section that includes a read-side fetch
* of the pointer to the structure containing @rhp.
*/
static inline bool
rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f)
{
rcu_callback_t func = READ_ONCE(rhp->func);
if (func == f)
return true;
WARN_ON_ONCE(func != (rcu_callback_t)~0L);
return false;
}
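Putting the two helpers above together, a hedged sketch with an invented struct node; as the comments require, the caller is assumed to prevent races with call_rcu() and with callback invocation.

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct node {
	int val;
	struct rcu_head rh;
};

static void node_free_cb(struct rcu_head *rhp)
{
	kfree(container_of(rhp, struct node, rh));
}

static struct node *node_alloc(int val)
{
	struct node *n = kmalloc(sizeof(*n), GFP_KERNEL);

	if (n) {
		n->val = val;
		rcu_head_init(&n->rh);	/* required before rcu_head_after_call_rcu() */
	}
	return n;
}

/* Queue @n for freeing unless that has already been done. */
static void node_maybe_retire(struct node *n)
{
	if (!rcu_head_after_call_rcu(&n->rh, node_free_cb))
		call_rcu(&n->rh, node_free_cb);
}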
/* Transitional pre-consolidation compatibility definitions. */
static inline void synchronize_rcu_bh(void)
{
synchronize_rcu();
}
static inline void synchronize_rcu_bh_expedited(void)
{
synchronize_rcu_expedited();
}
static inline void call_rcu_bh(struct rcu_head *head, rcu_callback_t func)
{
call_rcu(head, func);
}
static inline void rcu_barrier_bh(void)
{
rcu_barrier();
}
static inline void synchronize_sched(void)
{
synchronize_rcu();
}
static inline void synchronize_sched_expedited(void)
{
synchronize_rcu_expedited();
}
static inline void call_rcu_sched(struct rcu_head *head, rcu_callback_t func)
{
call_rcu(head, func);
}
static inline void rcu_barrier_sched(void)
{
rcu_barrier();
}
static inline unsigned long get_state_synchronize_sched(void)
{
return get_state_synchronize_rcu();
}
static inline void cond_synchronize_sched(unsigned long oldstate)
{
cond_synchronize_rcu(oldstate);
}
#endif /* __LINUX_RCUPDATE_H */ #endif /* __LINUX_RCUPDATE_H */

@ -31,21 +31,4 @@ do { \
#define wait_rcu_gp(...) _wait_rcu_gp(false, __VA_ARGS__) #define wait_rcu_gp(...) _wait_rcu_gp(false, __VA_ARGS__)
/**
* synchronize_rcu_mult - Wait concurrently for multiple grace periods
* @...: List of call_rcu() functions for the flavors to wait on.
*
* This macro waits concurrently for multiple flavors of RCU grace periods.
* For example, synchronize_rcu_mult(call_rcu, call_rcu_bh) would wait
* on concurrent RCU and RCU-bh grace periods. Waiting on a given SRCU
* domain requires you to write a wrapper function for that SRCU domain's
* call_srcu() function, supplying the corresponding srcu_struct.
*
* If Tiny RCU, tell _wait_rcu_gp() not to bother waiting for RCU
* or RCU-bh, given that anywhere synchronize_rcu_mult() can be called
* is automatically a grace period.
*/
#define synchronize_rcu_mult(...) \
_wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), __VA_ARGS__)
#endif /* _LINUX_SCHED_RCUPDATE_WAIT_H */ #endif /* _LINUX_SCHED_RCUPDATE_WAIT_H */

@ -46,54 +46,29 @@ static inline void cond_synchronize_rcu(unsigned long oldstate)
might_sleep(); might_sleep();
} }
static inline unsigned long get_state_synchronize_sched(void) extern void rcu_barrier(void);
{
return 0;
}
static inline void cond_synchronize_sched(unsigned long oldstate)
{
might_sleep();
}
extern void rcu_barrier_bh(void);
extern void rcu_barrier_sched(void);
static inline void synchronize_rcu_expedited(void) static inline void synchronize_rcu_expedited(void)
{ {
synchronize_sched(); /* Only one CPU, so pretty fast anyway!!! */ synchronize_rcu();
} }
static inline void rcu_barrier(void) static inline void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
{
rcu_barrier_sched(); /* Only one CPU, so only one list of callbacks! */
}
static inline void synchronize_rcu_bh(void)
{
synchronize_sched();
}
static inline void synchronize_rcu_bh_expedited(void)
{
synchronize_sched();
}
static inline void synchronize_sched_expedited(void)
{
synchronize_sched();
}
static inline void kfree_call_rcu(struct rcu_head *head,
rcu_callback_t func)
{ {
call_rcu(head, func); call_rcu(head, func);
} }
void rcu_qs(void);
static inline void rcu_softirq_qs(void)
{
rcu_qs();
}
#define rcu_note_context_switch(preempt) \ #define rcu_note_context_switch(preempt) \
do { \ do { \
rcu_sched_qs(); \ rcu_qs(); \
rcu_note_voluntary_context_switch_lite(current); \ rcu_tasks_qs(current); \
} while (0) } while (0)
static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt) static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
@ -108,14 +83,19 @@ static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
*/ */
static inline void rcu_virt_note_context_switch(int cpu) { } static inline void rcu_virt_note_context_switch(int cpu) { }
static inline void rcu_cpu_stall_reset(void) { } static inline void rcu_cpu_stall_reset(void) { }
static inline int rcu_jiffies_till_stall_check(void) { return 21 * HZ; }
static inline void rcu_idle_enter(void) { } static inline void rcu_idle_enter(void) { }
static inline void rcu_idle_exit(void) { } static inline void rcu_idle_exit(void) { }
static inline void rcu_irq_enter(void) { } static inline void rcu_irq_enter(void) { }
static inline bool rcu_irq_enter_disabled(void) { return false; }
static inline void rcu_irq_exit_irqson(void) { } static inline void rcu_irq_exit_irqson(void) { }
static inline void rcu_irq_enter_irqson(void) { } static inline void rcu_irq_enter_irqson(void) { }
static inline void rcu_irq_exit(void) { } static inline void rcu_irq_exit(void) { }
static inline void exit_rcu(void) { } static inline void exit_rcu(void) { }
static inline bool rcu_preempt_need_deferred_qs(struct task_struct *t)
{
return false;
}
static inline void rcu_preempt_deferred_qs(struct task_struct *t) { }
#ifdef CONFIG_SRCU #ifdef CONFIG_SRCU
void rcu_scheduler_starting(void); void rcu_scheduler_starting(void);
#else /* #ifndef CONFIG_SRCU */ #else /* #ifndef CONFIG_SRCU */
@ -133,5 +113,6 @@ static inline void rcu_all_qs(void) { barrier(); }
#define rcutree_offline_cpu NULL #define rcutree_offline_cpu NULL
#define rcutree_dead_cpu NULL #define rcutree_dead_cpu NULL
#define rcutree_dying_cpu NULL #define rcutree_dying_cpu NULL
static inline void rcu_cpu_starting(unsigned int cpu) { }
#endif /* __LINUX_RCUTINY_H */ #endif /* __LINUX_RCUTINY_H */

@ -30,6 +30,7 @@
#ifndef __LINUX_RCUTREE_H #ifndef __LINUX_RCUTREE_H
#define __LINUX_RCUTREE_H #define __LINUX_RCUTREE_H
void rcu_softirq_qs(void);
void rcu_note_context_switch(bool preempt); void rcu_note_context_switch(bool preempt);
int rcu_needs_cpu(u64 basem, u64 *nextevt); int rcu_needs_cpu(u64 basem, u64 *nextevt);
void rcu_cpu_stall_reset(void); void rcu_cpu_stall_reset(void);
@ -44,40 +45,13 @@ static inline void rcu_virt_note_context_switch(int cpu)
rcu_note_context_switch(false); rcu_note_context_switch(false);
} }
void synchronize_rcu_bh(void);
void synchronize_sched_expedited(void);
void synchronize_rcu_expedited(void); void synchronize_rcu_expedited(void);
void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func); void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func);
/**
* synchronize_rcu_bh_expedited - Brute-force RCU-bh grace period
*
* Wait for an RCU-bh grace period to elapse, but use a "big hammer"
* approach to force the grace period to end quickly. This consumes
* significant time on all CPUs and is unfriendly to real-time workloads,
* so is thus not recommended for any sort of common-case code. In fact,
* if you are using synchronize_rcu_bh_expedited() in a loop, please
* restructure your code to batch your updates, and then use a single
* synchronize_rcu_bh() instead.
*
* Note that it is illegal to call this function while holding any lock
* that is acquired by a CPU-hotplug notifier. And yes, it is also illegal
* to call this function from a CPU-hotplug notifier. Failing to observe
* these restriction will result in deadlock.
*/
static inline void synchronize_rcu_bh_expedited(void)
{
synchronize_sched_expedited();
}
void rcu_barrier(void); void rcu_barrier(void);
void rcu_barrier_bh(void); bool rcu_eqs_special_set(int cpu);
void rcu_barrier_sched(void);
unsigned long get_state_synchronize_rcu(void); unsigned long get_state_synchronize_rcu(void);
void cond_synchronize_rcu(unsigned long oldstate); void cond_synchronize_rcu(unsigned long oldstate);
unsigned long get_state_synchronize_sched(void);
void cond_synchronize_sched(unsigned long oldstate);
void rcu_idle_enter(void); void rcu_idle_enter(void);
void rcu_idle_exit(void); void rcu_idle_exit(void);
@ -85,7 +59,6 @@ void rcu_irq_enter(void);
void rcu_irq_exit(void); void rcu_irq_exit(void);
void rcu_irq_enter_irqson(void); void rcu_irq_enter_irqson(void);
void rcu_irq_exit_irqson(void); void rcu_irq_exit_irqson(void);
bool rcu_irq_enter_disabled(void);
void exit_rcu(void); void exit_rcu(void);
@ -93,7 +66,9 @@ void rcu_scheduler_starting(void);
extern int rcu_scheduler_active __read_mostly; extern int rcu_scheduler_active __read_mostly;
void rcu_end_inkernel_boot(void); void rcu_end_inkernel_boot(void);
bool rcu_is_watching(void); bool rcu_is_watching(void);
#ifndef CONFIG_PREEMPT
void rcu_all_qs(void); void rcu_all_qs(void);
#endif
/* RCUtree hotplug events */ /* RCUtree hotplug events */
int rcutree_prepare_cpu(unsigned int cpu); int rcutree_prepare_cpu(unsigned int cpu);
@ -101,5 +76,6 @@ int rcutree_online_cpu(unsigned int cpu);
int rcutree_offline_cpu(unsigned int cpu); int rcutree_offline_cpu(unsigned int cpu);
int rcutree_dead_cpu(unsigned int cpu); int rcutree_dead_cpu(unsigned int cpu);
int rcutree_dying_cpu(unsigned int cpu); int rcutree_dying_cpu(unsigned int cpu);
void rcu_cpu_starting(unsigned int cpu);
#endif /* __LINUX_RCUTREE_H */ #endif /* __LINUX_RCUTREE_H */

@ -6,19 +6,14 @@
/* /*
* rcuwait provides a way of blocking and waking up a single * rcuwait provides a way of blocking and waking up a single
* task in an rcu-safe manner; where it is forbidden to use * task in an rcu-safe manner.
* after exit_notify(). task_struct is not properly rcu protected,
* unless dealing with rcu-aware lists, ie: find_task_by_*().
* *
* Alternatively we have task_rcu_dereference(), but the return * The only time @task is non-nil is when a user is blocked (or
* semantics have different implications which would break the * checking if it needs to) on a condition, and reset as soon as we
* wakeup side. The only time @task is non-nil is when a user is * know that the condition has succeeded and are awoken.
* blocked (or checking if it needs to) on a condition, and reset
* as soon as we know that the condition has succeeded and are
* awoken.
*/ */
struct rcuwait { struct rcuwait {
struct task_struct *task; struct task_struct __rcu *task;
}; };
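A minimal sketch of the single-waiter usage described above; the event_ready flag and the helper functions are hypothetical, and at most one task may block in wait_for_event() at a time.

#include <linux/rcuwait.h>
#include <linux/types.h>

static struct rcuwait waiter;
static bool event_ready;

static void waiter_setup(void)
{
	rcuwait_init(&waiter);
}

static void wait_for_event(void)
{
	rcuwait_wait_event(&waiter, READ_ONCE(event_ready));
}

static void signal_event(void)
{
	WRITE_ONCE(event_ready, true);
	rcuwait_wake_up(&waiter);	/* looks up the sleeping task under RCU */
}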
#define __RCUWAIT_INITIALIZER(name) \ #define __RCUWAIT_INITIALIZER(name) \
@ -37,13 +32,6 @@ extern void rcuwait_wake_up(struct rcuwait *w);
*/ */
#define rcuwait_wait_event(w, condition) \ #define rcuwait_wait_event(w, condition) \
({ \ ({ \
/* \
* Complain if we are called after do_exit()/exit_notify(), \
* as we cannot rely on the rcu critical region for the \
* wakeup side. \
*/ \
WARN_ON(current->exit_state); \
\
rcu_assign_pointer((w)->task, current); \ rcu_assign_pointer((w)->task, current); \
for (;;) { \ for (;;) { \
/* \ /* \

@ -713,10 +713,8 @@ union rcu_special {
struct { struct {
u8 blocked; u8 blocked;
u8 need_qs; u8 need_qs;
u8 exp_need_qs; u8 exp_hint; /* Hint for performance. */
u8 deferred_qs;
/* Otherwise the compiler can store garbage here: */
u8 pad;
} b; /* Bits. */ } b; /* Bits. */
u32 s; /* Set of bits. */ u32 s; /* Set of bits. */
}; };

@ -98,8 +98,6 @@ static inline void put_task_struct(struct task_struct *t)
__put_task_struct(t); __put_task_struct(t);
} }
struct task_struct *task_rcu_dereference(struct task_struct **ptask);
#ifdef CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT #ifdef CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT
extern int arch_task_struct_size __read_mostly; extern int arch_task_struct_size __read_mostly;
#else #else

@ -38,20 +38,20 @@ struct srcu_struct;
#ifdef CONFIG_DEBUG_LOCK_ALLOC #ifdef CONFIG_DEBUG_LOCK_ALLOC
int __init_srcu_struct(struct srcu_struct *sp, const char *name, int __init_srcu_struct(struct srcu_struct *ssp, const char *name,
struct lock_class_key *key); struct lock_class_key *key);
#define init_srcu_struct(sp) \ #define init_srcu_struct(ssp) \
({ \ ({ \
static struct lock_class_key __srcu_key; \ static struct lock_class_key __srcu_key; \
\ \
__init_srcu_struct((sp), #sp, &__srcu_key); \ __init_srcu_struct((ssp), #ssp, &__srcu_key); \
}) })
#define __SRCU_DEP_MAP_INIT(srcu_name) .dep_map = { .name = #srcu_name }, #define __SRCU_DEP_MAP_INIT(srcu_name) .dep_map = { .name = #srcu_name },
#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
int init_srcu_struct(struct srcu_struct *sp); int init_srcu_struct(struct srcu_struct *ssp);
#define __SRCU_DEP_MAP_INIT(srcu_name) #define __SRCU_DEP_MAP_INIT(srcu_name)
#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */ #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
@ -67,18 +67,18 @@ int init_srcu_struct(struct srcu_struct *sp);
struct srcu_struct { }; struct srcu_struct { };
#endif #endif
void call_srcu(struct srcu_struct *sp, struct rcu_head *head, void call_srcu(struct srcu_struct *ssp, struct rcu_head *head,
void (*func)(struct rcu_head *head)); void (*func)(struct rcu_head *head));
void cleanup_srcu_struct(struct srcu_struct *sp); void cleanup_srcu_struct(struct srcu_struct *ssp);
int __srcu_read_lock(struct srcu_struct *sp) __acquires(sp); int __srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp);
void __srcu_read_unlock(struct srcu_struct *sp, int idx) __releases(sp); void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
void synchronize_srcu(struct srcu_struct *sp); void synchronize_srcu(struct srcu_struct *ssp);
#ifdef CONFIG_DEBUG_LOCK_ALLOC #ifdef CONFIG_DEBUG_LOCK_ALLOC
/** /**
* srcu_read_lock_held - might we be in SRCU read-side critical section? * srcu_read_lock_held - might we be in SRCU read-side critical section?
* @sp: The srcu_struct structure to check * @ssp: The srcu_struct structure to check
* *
* If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an SRCU * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an SRCU
* read-side critical section. In absence of CONFIG_DEBUG_LOCK_ALLOC, * read-side critical section. In absence of CONFIG_DEBUG_LOCK_ALLOC,
@ -92,16 +92,16 @@ void synchronize_srcu(struct srcu_struct *sp);
* relies on normal RCU, it can be called from the CPU which * relies on normal RCU, it can be called from the CPU which
* is in the idle loop from an RCU point of view or offline. * is in the idle loop from an RCU point of view or offline.
*/ */
static inline int srcu_read_lock_held(struct srcu_struct *sp) static inline int srcu_read_lock_held(const struct srcu_struct *ssp)
{ {
if (!debug_lockdep_rcu_enabled()) if (!debug_lockdep_rcu_enabled())
return 1; return 1;
return lock_is_held(&sp->dep_map); return lock_is_held(&ssp->dep_map);
} }
#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
static inline int srcu_read_lock_held(struct srcu_struct *sp) static inline int srcu_read_lock_held(const struct srcu_struct *ssp)
{ {
return 1; return 1;
} }
@ -111,7 +111,7 @@ static inline int srcu_read_lock_held(struct srcu_struct *sp)
/** /**
* srcu_dereference_check - fetch SRCU-protected pointer for later dereferencing * srcu_dereference_check - fetch SRCU-protected pointer for later dereferencing
* @p: the pointer to fetch and protect for later dereferencing * @p: the pointer to fetch and protect for later dereferencing
* @sp: pointer to the srcu_struct, which is used to check that we * @ssp: pointer to the srcu_struct, which is used to check that we
* really are in an SRCU read-side critical section. * really are in an SRCU read-side critical section.
* @c: condition to check for update-side use * @c: condition to check for update-side use
* *
@ -120,24 +120,32 @@ static inline int srcu_read_lock_held(struct srcu_struct *sp)
* to 1. The @c argument will normally be a logical expression containing * to 1. The @c argument will normally be a logical expression containing
* lockdep_is_held() calls. * lockdep_is_held() calls.
*/ */
#define srcu_dereference_check(p, sp, c) \ #define srcu_dereference_check(p, ssp, c) \
__rcu_dereference_check((p), (c) || srcu_read_lock_held(sp), __rcu) __rcu_dereference_check((p), (c) || srcu_read_lock_held(ssp), __rcu)
/** /**
* srcu_dereference - fetch SRCU-protected pointer for later dereferencing * srcu_dereference - fetch SRCU-protected pointer for later dereferencing
* @p: the pointer to fetch and protect for later dereferencing * @p: the pointer to fetch and protect for later dereferencing
* @sp: pointer to the srcu_struct, which is used to check that we * @ssp: pointer to the srcu_struct, which is used to check that we
* really are in an SRCU read-side critical section. * really are in an SRCU read-side critical section.
* *
* Makes rcu_dereference_check() do the dirty work. If PROVE_RCU * Makes rcu_dereference_check() do the dirty work. If PROVE_RCU
* is enabled, invoking this outside of an RCU read-side critical * is enabled, invoking this outside of an RCU read-side critical
* section will result in an RCU-lockdep splat. * section will result in an RCU-lockdep splat.
*/ */
#define srcu_dereference(p, sp) srcu_dereference_check((p), (sp), 0) #define srcu_dereference(p, ssp) srcu_dereference_check((p), (ssp), 0)
/**
* srcu_dereference_notrace - no tracing and no lockdep calls from here
* @p: the pointer to fetch and protect for later dereferencing
* @ssp: pointer to the srcu_struct, which is used to check that we
* really are in an SRCU read-side critical section.
*/
#define srcu_dereference_notrace(p, ssp) srcu_dereference_check((p), (ssp), 1)
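A small sketch of srcu_dereference_check() encoding both legal contexts in its condition; policy_srcu, policy_mutex and struct policy are invented names.

#include <linux/srcu.h>
#include <linux/mutex.h>

struct policy { int limit; };

DEFINE_STATIC_SRCU(policy_srcu);	/* hypothetical SRCU domain */
static DEFINE_MUTEX(policy_mutex);	/* update-side lock */
static struct policy __rcu *cur_policy;

/*
 * Legal either inside srcu_read_lock(&policy_srcu) or with policy_mutex
 * held; srcu_dereference_check() documents and lockdep-checks both.
 */
static struct policy *policy_get(void)
{
	return srcu_dereference_check(cur_policy, &policy_srcu,
				      lockdep_is_held(&policy_mutex));
}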
/** /**
* srcu_read_lock - register a new reader for an SRCU-protected structure. * srcu_read_lock - register a new reader for an SRCU-protected structure.
* @sp: srcu_struct in which to register the new reader. * @ssp: srcu_struct in which to register the new reader.
* *
* Enter an SRCU read-side critical section. Note that SRCU read-side * Enter an SRCU read-side critical section. Note that SRCU read-side
* critical sections may be nested. However, it is illegal to * critical sections may be nested. However, it is illegal to
@ -152,27 +160,44 @@ static inline int srcu_read_lock_held(struct srcu_struct *sp)
* srcu_read_unlock() in an irq handler if the matching srcu_read_lock() * srcu_read_unlock() in an irq handler if the matching srcu_read_lock()
* was invoked in process context. * was invoked in process context.
*/ */
static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp) static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
{ {
int retval; int retval;
retval = __srcu_read_lock(sp); retval = __srcu_read_lock(ssp);
rcu_lock_acquire(&(sp)->dep_map); rcu_lock_acquire(&(ssp)->dep_map);
return retval;
}
/* Used by tracing, cannot be traced and cannot invoke lockdep. */
static inline notrace int
srcu_read_lock_notrace(struct srcu_struct *ssp) __acquires(ssp)
{
int retval;
retval = __srcu_read_lock(ssp);
return retval; return retval;
} }
/** /**
* srcu_read_unlock - unregister an old reader from an SRCU-protected structure. * srcu_read_unlock - unregister an old reader from an SRCU-protected structure.
* @sp: srcu_struct in which to unregister the old reader. * @ssp: srcu_struct in which to unregister the old reader.
* @idx: return value from corresponding srcu_read_lock(). * @idx: return value from corresponding srcu_read_lock().
* *
* Exit an SRCU read-side critical section. * Exit an SRCU read-side critical section.
*/ */
static inline void srcu_read_unlock(struct srcu_struct *sp, int idx) static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
__releases(sp) __releases(ssp)
{ {
rcu_lock_release(&(sp)->dep_map); rcu_lock_release(&(ssp)->dep_map);
__srcu_read_unlock(sp, idx); __srcu_read_unlock(ssp, idx);
}
/* Used by tracing, cannot be traced and cannot call lockdep. */
static inline notrace void
srcu_read_unlock_notrace(struct srcu_struct *ssp, int idx) __releases(ssp)
{
__srcu_read_unlock(ssp, idx);
} }
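A hedged reader/updater sketch for the srcu_read_lock()/srcu_read_unlock() pair above, with synchronize_srcu() on the update side; all names are made up and updaters are assumed to be serialized elsewhere.

#include <linux/srcu.h>
#include <linux/slab.h>

struct limits { int max; };

DEFINE_STATIC_SRCU(limits_srcu);	/* hypothetical SRCU domain */
static struct limits __rcu *cur_limits;

static int read_max(void)
{
	struct limits *l;
	int idx, max = -1;

	idx = srcu_read_lock(&limits_srcu);	/* SRCU readers may block */
	l = srcu_dereference(cur_limits, &limits_srcu);
	if (l)
		max = l->max;
	srcu_read_unlock(&limits_srcu, idx);
	return max;
}

/* Updaters serialized by the caller (e.g., a mutex, not shown). */
static void set_limits(struct limits *newl)
{
	struct limits *oldl;

	oldl = rcu_dereference_protected(cur_limits, 1);
	rcu_assign_pointer(cur_limits, newl);
	synchronize_srcu(&limits_srcu);	/* wait out pre-existing SRCU readers */
	kfree(oldl);
}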
/** /**

@ -43,7 +43,7 @@ struct srcu_struct {
void srcu_drive_gp(struct work_struct *wp); void srcu_drive_gp(struct work_struct *wp);
#define __SRCU_STRUCT_INIT(name) \ #define __SRCU_STRUCT_INIT(name, __ignored) \
{ \ { \
.srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq), \ .srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq), \
.srcu_cb_tail = &name.srcu_cb_head, \ .srcu_cb_tail = &name.srcu_cb_head, \
@ -56,11 +56,11 @@ void srcu_drive_gp(struct work_struct *wp);
* Tree SRCU, which needs some per-CPU data. * Tree SRCU, which needs some per-CPU data.
*/ */
#define DEFINE_SRCU(name) \ #define DEFINE_SRCU(name) \
struct srcu_struct name = __SRCU_STRUCT_INIT(name) struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
#define DEFINE_STATIC_SRCU(name) \ #define DEFINE_STATIC_SRCU(name) \
static struct srcu_struct name = __SRCU_STRUCT_INIT(name) static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
void synchronize_srcu(struct srcu_struct *sp); void synchronize_srcu(struct srcu_struct *ssp);
/* /*
* Counts the new reader in the appropriate per-CPU element of the * Counts the new reader in the appropriate per-CPU element of the
@ -68,36 +68,36 @@ void synchronize_srcu(struct srcu_struct *sp);
* __srcu_read_unlock() must be in the same handler instance. Returns an * __srcu_read_unlock() must be in the same handler instance. Returns an
* index that must be passed to the matching srcu_read_unlock(). * index that must be passed to the matching srcu_read_unlock().
*/ */
static inline int __srcu_read_lock(struct srcu_struct *sp) static inline int __srcu_read_lock(struct srcu_struct *ssp)
{ {
int idx; int idx;
idx = READ_ONCE(sp->srcu_idx); idx = READ_ONCE(ssp->srcu_idx);
WRITE_ONCE(sp->srcu_lock_nesting[idx], sp->srcu_lock_nesting[idx] + 1); WRITE_ONCE(ssp->srcu_lock_nesting[idx], ssp->srcu_lock_nesting[idx] + 1);
return idx; return idx;
} }
static inline void synchronize_srcu_expedited(struct srcu_struct *sp) static inline void synchronize_srcu_expedited(struct srcu_struct *ssp)
{ {
synchronize_srcu(sp); synchronize_srcu(ssp);
} }
static inline void srcu_barrier(struct srcu_struct *sp) static inline void srcu_barrier(struct srcu_struct *ssp)
{ {
synchronize_srcu(sp); synchronize_srcu(ssp);
} }
/* Defined here to avoid size increase for non-torture kernels. */ /* Defined here to avoid size increase for non-torture kernels. */
static inline void srcu_torture_stats_print(struct srcu_struct *sp, static inline void srcu_torture_stats_print(struct srcu_struct *ssp,
char *tt, char *tf) char *tt, char *tf)
{ {
int idx; int idx;
idx = READ_ONCE(sp->srcu_idx) & 0x1; idx = READ_ONCE(ssp->srcu_idx) & 0x1;
pr_alert("%s%s Tiny SRCU per-CPU(idx=%d): (%hd,%hd)\n", pr_alert("%s%s Tiny SRCU per-CPU(idx=%d): (%hd,%hd)\n",
tt, tf, idx, tt, tf, idx,
READ_ONCE(sp->srcu_lock_nesting[!idx]), READ_ONCE(ssp->srcu_lock_nesting[!idx]),
READ_ONCE(sp->srcu_lock_nesting[idx])); READ_ONCE(ssp->srcu_lock_nesting[idx]));
} }
#endif #endif

@ -40,25 +40,26 @@ struct srcu_data {
unsigned long srcu_unlock_count[2]; /* Unlocks per CPU. */ unsigned long srcu_unlock_count[2]; /* Unlocks per CPU. */
/* Update-side state. */ /* Update-side state. */
raw_spinlock_t __private lock ____cacheline_internodealigned_in_smp; spinlock_t __private lock ____cacheline_internodealigned_in_smp;
struct rcu_segcblist srcu_cblist; /* List of callbacks.*/ struct rcu_segcblist srcu_cblist; /* List of callbacks.*/
unsigned long srcu_gp_seq_needed; /* Furthest future GP needed. */ unsigned long srcu_gp_seq_needed; /* Furthest future GP needed. */
unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */ unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */
bool srcu_cblist_invoking; /* Invoking these CBs? */ bool srcu_cblist_invoking; /* Invoking these CBs? */
struct delayed_work work; /* Context for CB invoking. */ struct timer_list delay_work; /* Delay for CB invoking */
struct work_struct work; /* Context for CB invoking. */
struct rcu_head srcu_barrier_head; /* For srcu_barrier() use. */ struct rcu_head srcu_barrier_head; /* For srcu_barrier() use. */
struct srcu_node *mynode; /* Leaf srcu_node. */ struct srcu_node *mynode; /* Leaf srcu_node. */
unsigned long grpmask; /* Mask for leaf srcu_node */ unsigned long grpmask; /* Mask for leaf srcu_node */
/* ->srcu_data_have_cbs[]. */ /* ->srcu_data_have_cbs[]. */
int cpu; int cpu;
struct srcu_struct *sp; struct srcu_struct *ssp;
}; };
/* /*
* Node in SRCU combining tree, similar in function to rcu_data. * Node in SRCU combining tree, similar in function to rcu_data.
*/ */
struct srcu_node { struct srcu_node {
raw_spinlock_t __private lock; spinlock_t __private lock;
unsigned long srcu_have_cbs[4]; /* GP seq for children */ unsigned long srcu_have_cbs[4]; /* GP seq for children */
/* having CBs, but only */ /* having CBs, but only */
/* is > ->srcu_gq_seq. */ /* is > ->srcu_gq_seq. */
@ -78,7 +79,7 @@ struct srcu_struct {
struct srcu_node *level[RCU_NUM_LVLS + 1]; struct srcu_node *level[RCU_NUM_LVLS + 1];
/* First node at each level. */ /* First node at each level. */
struct mutex srcu_cb_mutex; /* Serialize CB preparation. */ struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
raw_spinlock_t __private lock; /* Protect counters */ spinlock_t __private lock; /* Protect counters */
struct mutex srcu_gp_mutex; /* Serialize GP work. */ struct mutex srcu_gp_mutex; /* Serialize GP work. */
unsigned int srcu_idx; /* Current rdr array element. */ unsigned int srcu_idx; /* Current rdr array element. */
unsigned long srcu_gp_seq; /* Grace-period seq #. */ unsigned long srcu_gp_seq; /* Grace-period seq #. */
@ -104,13 +105,14 @@ struct srcu_struct {
#define SRCU_STATE_SCAN1 1 #define SRCU_STATE_SCAN1 1
#define SRCU_STATE_SCAN2 2 #define SRCU_STATE_SCAN2 2
#define __SRCU_STRUCT_INIT(name) \ #define __SRCU_STRUCT_INIT(name, pcpu_name) \
{ \ { \
.sda = &name##_srcu_data, \ .sda = &pcpu_name, \
.lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock), \ .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = 0 - 1, \ .srcu_gp_seq_needed = -1UL, \
__SRCU_DEP_MAP_INIT(name) \ .work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
} __SRCU_DEP_MAP_INIT(name) \
}
/* /*
* Define and initialize a srcu struct at build time. * Define and initialize a srcu struct at build time.
@ -131,14 +133,22 @@ struct srcu_struct {
* *
* See include/linux/percpu-defs.h for the rules on per-CPU variables. * See include/linux/percpu-defs.h for the rules on per-CPU variables.
*/ */
#define __DEFINE_SRCU(name, is_static) \ #ifdef MODULE
static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);\ # define __DEFINE_SRCU(name, is_static) \
is_static struct srcu_struct name = __SRCU_STRUCT_INIT(name) is_static struct srcu_struct name; \
struct srcu_struct *__srcu_struct_##name \
__section("___srcu_struct_ptrs") = &name
#else
# define __DEFINE_SRCU(name, is_static) \
static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data); \
is_static struct srcu_struct name = \
__SRCU_STRUCT_INIT(name, name##_srcu_data)
#endif
#define DEFINE_SRCU(name) __DEFINE_SRCU(name, /* not static */) #define DEFINE_SRCU(name) __DEFINE_SRCU(name, /* not static */)
#define DEFINE_STATIC_SRCU(name) __DEFINE_SRCU(name, static) #define DEFINE_STATIC_SRCU(name) __DEFINE_SRCU(name, static)
void synchronize_srcu_expedited(struct srcu_struct *sp); void synchronize_srcu_expedited(struct srcu_struct *ssp);
void srcu_barrier(struct srcu_struct *sp); void srcu_barrier(struct srcu_struct *ssp);
void srcu_torture_stats_print(struct srcu_struct *sp, char *tt, char *tf); void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf);
#endif #endif

@ -208,6 +208,14 @@ static inline void timer_setup(struct timer_list *timer,
} }
#endif #endif
static inline void timer_setup_on_stack(struct timer_list *timer,
void (*callback)(struct timer_list *),
unsigned int flags)
{
__setup_timer_on_stack(timer, (TIMER_FUNC_TYPE)callback,
(TIMER_DATA_TYPE)timer, flags);
}
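A sketch of the on-stack timer pattern that timer_setup_on_stack() enables, paired with destroy_timer_on_stack(); struct stack_timeout and wait_one_second() are invented for illustration.

#include <linux/timer.h>
#include <linux/completion.h>
#include <linux/jiffies.h>

struct stack_timeout {
	struct timer_list timer;
	struct completion done;
};

static void stack_timeout_fn(struct timer_list *t)
{
	struct stack_timeout *st = from_timer(st, t, timer);

	complete(&st->done);
}

static void wait_one_second(void)
{
	struct stack_timeout st;

	init_completion(&st.done);
	timer_setup_on_stack(&st.timer, stack_timeout_fn, 0);
	mod_timer(&st.timer, jiffies + HZ);
	wait_for_completion(&st.done);
	del_timer_sync(&st.timer);		/* belt and braces: timer already expired */
	destroy_timer_on_stack(&st.timer);	/* required for on-stack timers */
}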
#define from_timer(var, callback_timer, timer_fieldname) \ #define from_timer(var, callback_timer, timer_fieldname) \
container_of(callback_timer, typeof(*var), timer_fieldname) container_of(callback_timer, typeof(*var), timer_fieldname)
@ -230,6 +238,7 @@ extern void add_timer_on(struct timer_list *timer, int cpu);
extern int del_timer(struct timer_list * timer); extern int del_timer(struct timer_list * timer);
extern int mod_timer(struct timer_list *timer, unsigned long expires); extern int mod_timer(struct timer_list *timer, unsigned long expires);
extern int mod_timer_pending(struct timer_list *timer, unsigned long expires); extern int mod_timer_pending(struct timer_list *timer, unsigned long expires);
extern int timer_reduce(struct timer_list *timer, unsigned long expires);
#ifdef CONFIG_SMP #ifdef CONFIG_SMP
extern bool check_pending_deferrable_timers(int cpu); extern bool check_pending_deferrable_timers(int cpu);
#endif #endif

@ -50,11 +50,12 @@
do { if (verbose) pr_alert("%s" TORTURE_FLAG "!!! %s\n", torture_type, s); } while (0) do { if (verbose) pr_alert("%s" TORTURE_FLAG "!!! %s\n", torture_type, s); } while (0)
/* Definitions for online/offline exerciser. */ /* Definitions for online/offline exerciser. */
typedef void torture_ofl_func(void);
bool torture_offline(int cpu, long *n_onl_attempts, long *n_onl_successes, bool torture_offline(int cpu, long *n_onl_attempts, long *n_onl_successes,
unsigned long *sum_offl, int *min_onl, int *max_onl); unsigned long *sum_offl, int *min_onl, int *max_onl);
bool torture_online(int cpu, long *n_onl_attempts, long *n_onl_successes, bool torture_online(int cpu, long *n_onl_attempts, long *n_onl_successes,
unsigned long *sum_onl, int *min_onl, int *max_onl); unsigned long *sum_onl, int *min_onl, int *max_onl);
int torture_onoff_init(long ooholdoff, long oointerval); int torture_onoff_init(long ooholdoff, long oointerval, torture_ofl_func *f);
void torture_onoff_stats(void); void torture_onoff_stats(void);
bool torture_onoff_failures(void); bool torture_onoff_failures(void);
@ -64,6 +65,8 @@ struct torture_random_state {
long trs_count; long trs_count;
}; };
#define DEFINE_TORTURE_RANDOM(name) struct torture_random_state name = { 0, 0 } #define DEFINE_TORTURE_RANDOM(name) struct torture_random_state name = { 0, 0 }
#define DEFINE_TORTURE_RANDOM_PERCPU(name) \
DEFINE_PER_CPU(struct torture_random_state, name)
unsigned long torture_random(struct torture_random_state *trsp); unsigned long torture_random(struct torture_random_state *trsp);
/* Task shuffler, which causes CPUs to occasionally go idle. */ /* Task shuffler, which causes CPUs to occasionally go idle. */
@ -75,11 +78,11 @@ void torture_shutdown_absorb(const char *title);
int torture_shutdown_init(int ssecs, void (*cleanup)(void)); int torture_shutdown_init(int ssecs, void (*cleanup)(void));
/* Task stuttering, which forces load/no-load transitions. */ /* Task stuttering, which forces load/no-load transitions. */
void stutter_wait(const char *title); bool stutter_wait(const char *title);
int torture_stutter_init(int s); int torture_stutter_init(int s, int sgap);
/* Initialization and cleanup. */ /* Initialization and cleanup. */
bool torture_init_begin(char *ttype, bool v, int *runnable); bool torture_init_begin(char *ttype, int v);
void torture_init_end(void); void torture_init_end(void);
bool torture_cleanup_begin(void); bool torture_cleanup_begin(void);
void torture_cleanup_end(void); void torture_cleanup_end(void);
@ -96,4 +99,10 @@ void _torture_stop_kthread(char *m, struct task_struct **tp);
#define torture_stop_kthread(n, tp) \ #define torture_stop_kthread(n, tp) \
_torture_stop_kthread("Stopping " #n " task", &(tp)) _torture_stop_kthread("Stopping " #n " task", &(tp))
#ifdef CONFIG_PREEMPT
#define torture_preempt_schedule() preempt_schedule()
#else
#define torture_preempt_schedule()
#endif
#endif /* __LINUX_TORTURE_H */ #endif /* __LINUX_TORTURE_H */

@ -77,7 +77,7 @@ int unregister_tracepoint_module_notifier(struct notifier_block *nb)
*/ */
static inline void tracepoint_synchronize_unregister(void) static inline void tracepoint_synchronize_unregister(void)
{ {
synchronize_sched(); synchronize_rcu();
} }
#ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
@ -137,11 +137,8 @@ extern void syscall_unregfunc(void);
\ \
if (!(cond)) \ if (!(cond)) \
return; \ return; \
if (rcucheck) { \ if (rcucheck) \
if (WARN_ON_ONCE(rcu_irq_enter_disabled())) \
return; \
rcu_irq_enter_irqson(); \ rcu_irq_enter_irqson(); \
} \
rcu_read_lock_sched_notrace(); \ rcu_read_lock_sched_notrace(); \
it_func_ptr = rcu_dereference_sched((tp)->funcs); \ it_func_ptr = rcu_dereference_sched((tp)->funcs); \
if (it_func_ptr) { \ if (it_func_ptr) { \

@ -1458,10 +1458,8 @@ do { \
} while (0) } while (0)
#ifdef CONFIG_LOCKDEP #ifdef CONFIG_LOCKDEP
static inline bool lockdep_sock_is_held(const struct sock *csk) static inline bool lockdep_sock_is_held(const struct sock *sk)
{ {
struct sock *sk = (struct sock *)csk;
return lockdep_is_held(&sk->sk_lock) || return lockdep_is_held(&sk->sk_lock) ||
lockdep_is_held(&sk->sk_lock.slock); lockdep_is_held(&sk->sk_lock.slock);
} }

@ -52,6 +52,7 @@ TRACE_EVENT(rcu_utilization,
* "cpuqs": CPU passes through a quiescent state. * "cpuqs": CPU passes through a quiescent state.
* "cpuonl": CPU comes online. * "cpuonl": CPU comes online.
* "cpuofl": CPU goes offline. * "cpuofl": CPU goes offline.
* "cpuofl-bgp": CPU goes offline while blocking a grace period.
* "reqwait": GP kthread sleeps waiting for grace-period request. * "reqwait": GP kthread sleeps waiting for grace-period request.
* "reqwaitsig": GP kthread awakened by signal from reqwait state. * "reqwaitsig": GP kthread awakened by signal from reqwait state.
* "fqswait": GP kthread waiting until time to force quiescent states. * "fqswait": GP kthread waiting until time to force quiescent states.
@ -63,55 +64,54 @@ TRACE_EVENT(rcu_utilization,
*/ */
TRACE_EVENT(rcu_grace_period, TRACE_EVENT(rcu_grace_period,
TP_PROTO(const char *rcuname, unsigned long gpnum, const char *gpevent), TP_PROTO(const char *rcuname, unsigned long gp_seq, const char *gpevent),
TP_ARGS(rcuname, gpnum, gpevent), TP_ARGS(rcuname, gp_seq, gpevent),
TP_STRUCT__entry( TP_STRUCT__entry(
__field(const char *, rcuname) __field(const char *, rcuname)
__field(unsigned long, gpnum) __field(unsigned long, gp_seq)
__field(const char *, gpevent) __field(const char *, gpevent)
), ),
TP_fast_assign( TP_fast_assign(
__entry->rcuname = rcuname; __entry->rcuname = rcuname;
__entry->gpnum = gpnum; __entry->gp_seq = gp_seq;
__entry->gpevent = gpevent; __entry->gpevent = gpevent;
), ),
TP_printk("%s %lu %s", TP_printk("%s %lu %s",
__entry->rcuname, __entry->gpnum, __entry->gpevent) __entry->rcuname, __entry->gp_seq, __entry->gpevent)
); );
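For orientation, this TRACE_EVENT() generates a trace_rcu_grace_period() call that the RCU core invokes at each of the state changes listed above; the sketch below is hypothetical (note_gp_event() is invented), and the in-tree callers pass the event string through the rcu-internal TPS()/tracepoint_string() wrapper rather than a plain literal.

#include <trace/events/rcu.h>	/* only kernel/rcu creates these tracepoints */

static void note_gp_event(const char *rcuname, unsigned long gp_seq)
{
	trace_rcu_grace_period(rcuname, gp_seq, "start");
}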
/* /*
* Tracepoint for future grace-period events, including those for no-callbacks * Tracepoint for future grace-period events. The caller should pull
* CPUs. The caller should pull the data from the rcu_node structure, * the data from the rcu_node structure, other than rcuname, which comes
* other than rcuname, which comes from the rcu_state structure, and event, * from the rcu_state structure, and event, which is one of the following:
* which is one of the following:
* *
* "Startleaf": Request a nocb grace period based on leaf-node data. * "Startleaf": Request a grace period based on leaf-node data.
* "Startedleaf": Leaf-node start proved sufficient. * "Prestarted": Someone beat us to the request
* "Startedleafroot": Leaf-node start proved sufficient after checking root. * "Startedleaf": Leaf node marked for future GP.
* "Startedleafroot": All nodes from leaf to root marked for future GP.
* "Startedroot": Requested a nocb grace period based on root-node data. * "Startedroot": Requested a nocb grace period based on root-node data.
* "NoGPkthread": The RCU grace-period kthread has not yet started.
* "StartWait": Start waiting for the requested grace period. * "StartWait": Start waiting for the requested grace period.
* "ResumeWait": Resume waiting after signal.
* "EndWait": Complete wait. * "EndWait": Complete wait.
* "Cleanup": Clean up rcu_node structure after previous GP. * "Cleanup": Clean up rcu_node structure after previous GP.
* "CleanupMore": Clean up, and another no-CB GP is needed. * "CleanupMore": Clean up, and another GP is needed.
*/ */
TRACE_EVENT(rcu_future_grace_period, TRACE_EVENT(rcu_future_grace_period,
TP_PROTO(const char *rcuname, unsigned long gpnum, unsigned long completed, TP_PROTO(const char *rcuname, unsigned long gp_seq,
unsigned long c, u8 level, int grplo, int grphi, unsigned long gp_seq_req, u8 level, int grplo, int grphi,
const char *gpevent), const char *gpevent),
TP_ARGS(rcuname, gpnum, completed, c, level, grplo, grphi, gpevent), TP_ARGS(rcuname, gp_seq, gp_seq_req, level, grplo, grphi, gpevent),
TP_STRUCT__entry( TP_STRUCT__entry(
__field(const char *, rcuname) __field(const char *, rcuname)
__field(unsigned long, gpnum) __field(unsigned long, gp_seq)
__field(unsigned long, completed) __field(unsigned long, gp_seq_req)
__field(unsigned long, c)
__field(u8, level) __field(u8, level)
__field(int, grplo) __field(int, grplo)
__field(int, grphi) __field(int, grphi)
@ -120,19 +120,17 @@ TRACE_EVENT(rcu_future_grace_period,
TP_fast_assign( TP_fast_assign(
__entry->rcuname = rcuname; __entry->rcuname = rcuname;
__entry->gpnum = gpnum; __entry->gp_seq = gp_seq;
__entry->completed = completed; __entry->gp_seq_req = gp_seq_req;
__entry->c = c;
__entry->level = level; __entry->level = level;
__entry->grplo = grplo; __entry->grplo = grplo;
__entry->grphi = grphi; __entry->grphi = grphi;
__entry->gpevent = gpevent; __entry->gpevent = gpevent;
), ),
TP_printk("%s %lu %lu %lu %u %d %d %s", TP_printk("%s %lu %lu %u %d %d %s",
__entry->rcuname, __entry->gpnum, __entry->completed, __entry->rcuname, __entry->gp_seq, __entry->gp_seq_req, __entry->level,
__entry->c, __entry->level, __entry->grplo, __entry->grphi, __entry->grplo, __entry->grphi, __entry->gpevent)
__entry->gpevent)
); );
/* /*
@ -179,6 +177,10 @@ TRACE_EVENT(rcu_grace_period_init,
* *
* "snap": Captured snapshot of expedited grace period sequence number. * "snap": Captured snapshot of expedited grace period sequence number.
* "start": Started a real expedited grace period. * "start": Started a real expedited grace period.
* "reset": Started resetting the tree
* "select": Started selecting the CPUs to wait on.
* "selectofl": Selected CPU partially offline.
* "startwait": Started waiting on selected CPUs.
* "end": Ended a real expedited grace period. * "end": Ended a real expedited grace period.
* "endwake": Woke piggybackers up. * "endwake": Woke piggybackers up.
* "done": Someone else did the expedited grace period for us. * "done": Someone else did the expedited grace period for us.
@ -259,7 +261,8 @@ TRACE_EVENT(rcu_exp_funnel_lock,
* "WakeNotPoll": Don't wake rcuo kthread because it is polling. * "WakeNotPoll": Don't wake rcuo kthread because it is polling.
* "DeferredWake": Carried out the "IsDeferred" wakeup. * "DeferredWake": Carried out the "IsDeferred" wakeup.
* "Poll": Start of new polling cycle for rcu_nocb_poll. * "Poll": Start of new polling cycle for rcu_nocb_poll.
* "Sleep": Sleep waiting for CBs for !rcu_nocb_poll. * "Sleep": Sleep waiting for GP for !rcu_nocb_poll.
* "CBSleep": Sleep waiting for CBs for !rcu_nocb_poll.
* "WokeEmpty": rcuo kthread woke to find empty list. * "WokeEmpty": rcuo kthread woke to find empty list.
* "WokeNonEmpty": rcuo kthread woke to find non-empty list. * "WokeNonEmpty": rcuo kthread woke to find non-empty list.
* "WaitQueue": Enqueue partially done, timed wait for it to complete. * "WaitQueue": Enqueue partially done, timed wait for it to complete.
@ -294,24 +297,24 @@ TRACE_EVENT(rcu_nocb_wake,
*/ */
TRACE_EVENT(rcu_preempt_task, TRACE_EVENT(rcu_preempt_task,
TP_PROTO(const char *rcuname, int pid, unsigned long gpnum), TP_PROTO(const char *rcuname, int pid, unsigned long gp_seq),
TP_ARGS(rcuname, pid, gpnum), TP_ARGS(rcuname, pid, gp_seq),
TP_STRUCT__entry( TP_STRUCT__entry(
__field(const char *, rcuname) __field(const char *, rcuname)
__field(unsigned long, gpnum) __field(unsigned long, gp_seq)
__field(int, pid) __field(int, pid)
), ),
TP_fast_assign( TP_fast_assign(
__entry->rcuname = rcuname; __entry->rcuname = rcuname;
__entry->gpnum = gpnum; __entry->gp_seq = gp_seq;
__entry->pid = pid; __entry->pid = pid;
), ),
TP_printk("%s %lu %d", TP_printk("%s %lu %d",
__entry->rcuname, __entry->gpnum, __entry->pid) __entry->rcuname, __entry->gp_seq, __entry->pid)
); );
/* /*
@ -321,23 +324,23 @@ TRACE_EVENT(rcu_preempt_task,
*/ */
TRACE_EVENT(rcu_unlock_preempted_task, TRACE_EVENT(rcu_unlock_preempted_task,
TP_PROTO(const char *rcuname, unsigned long gpnum, int pid), TP_PROTO(const char *rcuname, unsigned long gp_seq, int pid),
TP_ARGS(rcuname, gpnum, pid), TP_ARGS(rcuname, gp_seq, pid),
TP_STRUCT__entry( TP_STRUCT__entry(
__field(const char *, rcuname) __field(const char *, rcuname)
__field(unsigned long, gpnum) __field(unsigned long, gp_seq)
__field(int, pid) __field(int, pid)
), ),
TP_fast_assign( TP_fast_assign(
__entry->rcuname = rcuname; __entry->rcuname = rcuname;
__entry->gpnum = gpnum; __entry->gp_seq = gp_seq;
__entry->pid = pid; __entry->pid = pid;
), ),
TP_printk("%s %lu %d", __entry->rcuname, __entry->gpnum, __entry->pid) TP_printk("%s %lu %d", __entry->rcuname, __entry->gp_seq, __entry->pid)
); );
/* /*
@ -350,15 +353,15 @@ TRACE_EVENT(rcu_unlock_preempted_task,
*/ */
TRACE_EVENT(rcu_quiescent_state_report, TRACE_EVENT(rcu_quiescent_state_report,
TP_PROTO(const char *rcuname, unsigned long gpnum, TP_PROTO(const char *rcuname, unsigned long gp_seq,
unsigned long mask, unsigned long qsmask, unsigned long mask, unsigned long qsmask,
u8 level, int grplo, int grphi, int gp_tasks), u8 level, int grplo, int grphi, int gp_tasks),
TP_ARGS(rcuname, gpnum, mask, qsmask, level, grplo, grphi, gp_tasks), TP_ARGS(rcuname, gp_seq, mask, qsmask, level, grplo, grphi, gp_tasks),
TP_STRUCT__entry( TP_STRUCT__entry(
__field(const char *, rcuname) __field(const char *, rcuname)
__field(unsigned long, gpnum) __field(unsigned long, gp_seq)
__field(unsigned long, mask) __field(unsigned long, mask)
__field(unsigned long, qsmask) __field(unsigned long, qsmask)
__field(u8, level) __field(u8, level)
@ -369,7 +372,7 @@ TRACE_EVENT(rcu_quiescent_state_report,
TP_fast_assign( TP_fast_assign(
__entry->rcuname = rcuname; __entry->rcuname = rcuname;
__entry->gpnum = gpnum; __entry->gp_seq = gp_seq;
__entry->mask = mask; __entry->mask = mask;
__entry->qsmask = qsmask; __entry->qsmask = qsmask;
__entry->level = level; __entry->level = level;
@ -379,41 +382,40 @@ TRACE_EVENT(rcu_quiescent_state_report,
), ),
TP_printk("%s %lu %lx>%lx %u %d %d %u", TP_printk("%s %lu %lx>%lx %u %d %d %u",
__entry->rcuname, __entry->gpnum, __entry->rcuname, __entry->gp_seq,
__entry->mask, __entry->qsmask, __entry->level, __entry->mask, __entry->qsmask, __entry->level,
__entry->grplo, __entry->grphi, __entry->gp_tasks) __entry->grplo, __entry->grphi, __entry->gp_tasks)
); );
/* /*
* Tracepoint for quiescent states detected by force_quiescent_state(). * Tracepoint for quiescent states detected by force_quiescent_state().
* These trace events include the type of RCU, the grace-period number that * These trace events include the type of RCU, the grace-period number
* was blocked by the CPU, the CPU itself, and the type of quiescent state, * that was blocked by the CPU, the CPU itself, and the type of quiescent
* which can be "dti" for dyntick-idle mode, "ofl" for CPU offline, "kick" * state, which can be "dti" for dyntick-idle mode or "kick" when kicking
* when kicking a CPU that has been in dyntick-idle mode for too long, or * a CPU that has been in dyntick-idle mode for too long.
* "rqc" if the CPU got a quiescent state via its rcu_qs_ctr.
*/ */
TRACE_EVENT(rcu_fqs, TRACE_EVENT(rcu_fqs,
TP_PROTO(const char *rcuname, unsigned long gpnum, int cpu, const char *qsevent), TP_PROTO(const char *rcuname, unsigned long gp_seq, int cpu, const char *qsevent),
TP_ARGS(rcuname, gpnum, cpu, qsevent), TP_ARGS(rcuname, gp_seq, cpu, qsevent),
TP_STRUCT__entry( TP_STRUCT__entry(
__field(const char *, rcuname) __field(const char *, rcuname)
__field(unsigned long, gpnum) __field(unsigned long, gp_seq)
__field(int, cpu) __field(int, cpu)
__field(const char *, qsevent) __field(const char *, qsevent)
), ),
TP_fast_assign( TP_fast_assign(
__entry->rcuname = rcuname; __entry->rcuname = rcuname;
__entry->gpnum = gpnum; __entry->gp_seq = gp_seq;
__entry->cpu = cpu; __entry->cpu = cpu;
__entry->qsevent = qsevent; __entry->qsevent = qsevent;
), ),
TP_printk("%s %lu %d %s", TP_printk("%s %lu %d %s",
__entry->rcuname, __entry->gpnum, __entry->rcuname, __entry->gp_seq,
__entry->cpu, __entry->qsevent) __entry->cpu, __entry->qsevent)
); );
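Given the TP_printk() format above, a rendered rcu_fqs line would read roughly as follows (values invented for illustration), listing the flavor name, the gp_seq, the CPU, and the quiescent-state type:

	rcu_preempt 4312 3 dti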
@ -421,37 +423,40 @@ TRACE_EVENT(rcu_fqs,
/* /*
* Tracepoint for dyntick-idle entry/exit events. These take a string * Tracepoint for dyntick-idle entry/exit events. These take a string
* as argument: "Start" for entering dyntick-idle mode, "End" for * as argument: "Start" for entering dyntick-idle mode, "Startirq" for
* leaving it, "--=" for events moving towards idle, and "++=" for events * entering it from irq/NMI, "End" for leaving it, "Endirq" for leaving it
* moving away from idle. "Error on entry: not idle task" and "Error on * to irq/NMI, "--=" for events moving towards idle, and "++=" for events
* exit: not idle task" indicate that a non-idle task is erroneously * moving away from idle.
* toying with the idle loop.
* *
* These events also take a pair of numbers, which indicate the nesting * These events also take a pair of numbers, which indicate the nesting
* depth before and after the event of interest. Note that task-related * depth before and after the event of interest, and a third number that is
* events use the upper bits of each number, while interrupt-related * the ->dynticks counter. Note that task-related and interrupt-related
* events use the lower bits. * events use two separate counters, and that the "++=" and "--=" events
* for irq/NMI will change the counter by two, otherwise by one.
*/ */
TRACE_EVENT(rcu_dyntick, TRACE_EVENT(rcu_dyntick,
TP_PROTO(const char *polarity, long long oldnesting, long long newnesting), TP_PROTO(const char *polarity, long oldnesting, long newnesting, int dynticks),
TP_ARGS(polarity, oldnesting, newnesting), TP_ARGS(polarity, oldnesting, newnesting, dynticks),
TP_STRUCT__entry( TP_STRUCT__entry(
__field(const char *, polarity) __field(const char *, polarity)
__field(long long, oldnesting) __field(long, oldnesting)
__field(long long, newnesting) __field(long, newnesting)
__field(int, dynticks)
), ),
TP_fast_assign( TP_fast_assign(
__entry->polarity = polarity; __entry->polarity = polarity;
__entry->oldnesting = oldnesting; __entry->oldnesting = oldnesting;
__entry->newnesting = newnesting; __entry->newnesting = newnesting;
__entry->dynticks = dynticks;
), ),
TP_printk("%s %llx %llx", __entry->polarity, TP_printk("%s %lx %lx %#3x", __entry->polarity,
__entry->oldnesting, __entry->newnesting) __entry->oldnesting, __entry->newnesting,
__entry->dynticks & 0xfff)
); );
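The added ->dynticks argument means a call site now passes three numbers; a hedged sketch of an idle-entry event, with TPS() and rdp assumed from the RCU core code rather than taken from this hunk:

	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0,
			  atomic_read(&rdp->dynticks)); /* nesting before, nesting after, ->dynticks */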
/* /*
@ -736,20 +741,20 @@ TRACE_EVENT(rcu_torture_read,
); );
/* /*
* Tracepoint for _rcu_barrier() execution. The string "s" describes * Tracepoint for rcu_barrier() execution. The string "s" describes
* the _rcu_barrier phase: * the rcu_barrier phase:
* "Begin": _rcu_barrier() started. * "Begin": rcu_barrier() started.
* "EarlyExit": _rcu_barrier() piggybacked, thus early exit. * "EarlyExit": rcu_barrier() piggybacked, thus early exit.
* "Inc1": _rcu_barrier() piggyback check counter incremented. * "Inc1": rcu_barrier() piggyback check counter incremented.
* "OfflineNoCB": _rcu_barrier() found callback on never-online CPU * "OfflineNoCB": rcu_barrier() found callback on never-online CPU
* "OnlineNoCB": _rcu_barrier() found online no-CBs CPU. * "OnlineNoCB": rcu_barrier() found online no-CBs CPU.
* "OnlineQ": _rcu_barrier() found online CPU with callbacks. * "OnlineQ": rcu_barrier() found online CPU with callbacks.
* "OnlineNQ": _rcu_barrier() found online CPU, no callbacks. * "OnlineNQ": rcu_barrier() found online CPU, no callbacks.
* "IRQ": An rcu_barrier_callback() callback posted on remote CPU. * "IRQ": An rcu_barrier_callback() callback posted on remote CPU.
* "IRQNQ": An rcu_barrier_callback() callback found no callbacks. * "IRQNQ": An rcu_barrier_callback() callback found no callbacks.
* "CB": An rcu_barrier_callback() invoked a callback, not the last. * "CB": An rcu_barrier_callback() invoked a callback, not the last.
* "LastCB": An rcu_barrier_callback() invoked the last callback. * "LastCB": An rcu_barrier_callback() invoked the last callback.
* "Inc2": _rcu_barrier() piggyback check counter incremented. * "Inc2": rcu_barrier() piggyback check counter incremented.
* The "cpu" argument is the CPU or -1 if meaningless, the "cnt" argument * The "cpu" argument is the CPU or -1 if meaningless, the "cnt" argument
* is the count of remaining callbacks, and "done" is the piggybacking count. * is the count of remaining callbacks, and "done" is the piggybacking count.
*/ */
@ -782,8 +787,8 @@ TRACE_EVENT(rcu_barrier,
#else /* #ifdef CONFIG_RCU_TRACE */ #else /* #ifdef CONFIG_RCU_TRACE */
#define trace_rcu_grace_period(rcuname, gpnum, gpevent) do { } while (0) #define trace_rcu_grace_period(rcuname, gp_seq, gpevent) do { } while (0)
#define trace_rcu_future_grace_period(rcuname, gpnum, completed, c, \ #define trace_rcu_future_grace_period(rcuname, gp_seq, gp_seq_req, \
level, grplo, grphi, event) \ level, grplo, grphi, event) \
do { } while (0) do { } while (0)
#define trace_rcu_grace_period_init(rcuname, gpnum, level, grplo, grphi, \ #define trace_rcu_grace_period_init(rcuname, gpnum, level, grplo, grphi, \
@ -793,13 +798,13 @@ TRACE_EVENT(rcu_barrier,
#define trace_rcu_exp_funnel_lock(rcuname, level, grplo, grphi, gpevent) \ #define trace_rcu_exp_funnel_lock(rcuname, level, grplo, grphi, gpevent) \
do { } while (0) do { } while (0)
#define trace_rcu_nocb_wake(rcuname, cpu, reason) do { } while (0) #define trace_rcu_nocb_wake(rcuname, cpu, reason) do { } while (0)
#define trace_rcu_preempt_task(rcuname, pid, gpnum) do { } while (0) #define trace_rcu_preempt_task(rcuname, pid, gp_seq) do { } while (0)
#define trace_rcu_unlock_preempted_task(rcuname, gpnum, pid) do { } while (0) #define trace_rcu_unlock_preempted_task(rcuname, gp_seq, pid) do { } while (0)
#define trace_rcu_quiescent_state_report(rcuname, gpnum, mask, qsmask, level, \ #define trace_rcu_quiescent_state_report(rcuname, gp_seq, mask, qsmask, level, \
grplo, grphi, gp_tasks) do { } \ grplo, grphi, gp_tasks) do { } \
while (0) while (0)
#define trace_rcu_fqs(rcuname, gpnum, cpu, qsevent) do { } while (0) #define trace_rcu_fqs(rcuname, gp_seq, cpu, qsevent) do { } while (0)
#define trace_rcu_dyntick(polarity, oldnesting, newnesting) do { } while (0) #define trace_rcu_dyntick(polarity, oldnesting, newnesting, dyntick) do { } while (0)
#define trace_rcu_prep_idle(reason) do { } while (0) #define trace_rcu_prep_idle(reason) do { } while (0)
#define trace_rcu_callback(rcuname, rhp, qlen_lazy, qlen) do { } while (0) #define trace_rcu_callback(rcuname, rhp, qlen_lazy, qlen) do { } while (0)
#define trace_rcu_kfree_callback(rcuname, rhp, offset, qlen_lazy, qlen) \ #define trace_rcu_kfree_callback(rcuname, rhp, offset, qlen_lazy, qlen) \

View File

@ -5418,7 +5418,7 @@ int __init cgroup_init(void)
BUG_ON(cgroup_init_cftypes(NULL, cgroup1_base_files)); BUG_ON(cgroup_init_cftypes(NULL, cgroup1_base_files));
/* /*
* The latency of the synchronize_sched() is too high for cgroups, * The latency of the synchronize_rcu() is too high for cgroups,
* avoid it at the cost of forcing all readers into the slow path. * avoid it at the cost of forcing all readers into the slow path.
*/ */
rcu_sync_enter_start(&cgroup_threadgroup_rwsem.rss); rcu_sync_enter_start(&cgroup_threadgroup_rwsem.rss);

View File

@ -9741,7 +9741,7 @@ static void account_event(struct perf_event *event)
* call the perf scheduling hooks before proceeding to * call the perf scheduling hooks before proceeding to
* install events that need them. * install events that need them.
*/ */
synchronize_sched(); synchronize_rcu();
} }
/* /*
* Now that we have waited for the sync_sched(), allow further * Now that we have waited for the sync_sched(), allow further

View File

@ -227,69 +227,6 @@ repeat:
goto repeat; goto repeat;
} }
/*
* Note that if this function returns a valid task_struct pointer (!NULL)
* task->usage must remain >0 for the duration of the RCU critical section.
*/
struct task_struct *task_rcu_dereference(struct task_struct **ptask)
{
struct sighand_struct *sighand;
struct task_struct *task;
/*
* We need to verify that release_task() was not called and thus
* delayed_put_task_struct() can't run and drop the last reference
* before rcu_read_unlock(). We check task->sighand != NULL,
* but we can read the already freed and reused memory.
*/
retry:
task = rcu_dereference(*ptask);
if (!task)
return NULL;
probe_kernel_address(&task->sighand, sighand);
/*
* Pairs with atomic_dec_and_test() in put_task_struct(). If this task
* was already freed we can not miss the preceding update of this
* pointer.
*/
smp_rmb();
if (unlikely(task != READ_ONCE(*ptask)))
goto retry;
/*
* We've re-checked that "task == *ptask", now we have two different
* cases:
*
* 1. This is actually the same task/task_struct. In this case
* sighand != NULL tells us it is still alive.
*
* 2. This is another task which got the same memory for task_struct.
* We can't know this of course, and we can not trust
* sighand != NULL.
*
* In this case we actually return a random value, but this is
* correct.
*
* If we return NULL - we can pretend that we actually noticed that
* *ptask was updated when the previous task has exited. Or pretend
* that probe_slab_address(&sighand) reads NULL.
*
* If we return the new task (because sighand is not NULL for any
* reason) - this is fine too. This (new) task can't go away before
* another gp pass.
*
* And note: We could even eliminate the false positive if re-read
* task->sighand once again to avoid the falsely NULL. But this case
* is very unlikely so we don't care.
*/
if (!sighand)
return NULL;
return task;
}
void rcuwait_wake_up(struct rcuwait *w) void rcuwait_wake_up(struct rcuwait *w)
{ {
struct task_struct *task; struct task_struct *task;
@ -309,10 +246,6 @@ void rcuwait_wake_up(struct rcuwait *w)
*/ */
smp_mb(); /* (B) */ smp_mb(); /* (B) */
/*
* Avoid using task_rcu_dereference() magic as long as we are careful,
* see comment in rcuwait_wait_event() regarding ->exit_state.
*/
task = rcu_dereference(w->task); task = rcu_dereference(w->task);
if (task) if (task)
wake_up_process(task); wake_up_process(task);

View File

@ -229,7 +229,7 @@ static int collect_garbage_slots(struct kprobe_insn_cache *c)
struct kprobe_insn_page *kip, *next; struct kprobe_insn_page *kip, *next;
/* Ensure no-one is interrupted on the garbages */ /* Ensure no-one is interrupted on the garbages */
synchronize_sched(); synchronize_rcu();
list_for_each_entry_safe(kip, next, &c->pages, list) { list_for_each_entry_safe(kip, next, &c->pages, list) {
int i; int i;
@ -1811,7 +1811,7 @@ void unregister_kprobes(struct kprobe **kps, int num)
kps[i]->addr = NULL; kps[i]->addr = NULL;
mutex_unlock(&kprobe_mutex); mutex_unlock(&kprobe_mutex);
synchronize_sched(); synchronize_rcu();
for (i = 0; i < num; i++) for (i = 0; i < num; i++)
if (kps[i]->addr) if (kps[i]->addr)
__unregister_kprobe_bottom(kps[i]); __unregister_kprobe_bottom(kps[i]);
@ -2078,7 +2078,7 @@ void unregister_kretprobes(struct kretprobe **rps, int num)
rps[i]->kp.addr = NULL; rps[i]->kp.addr = NULL;
mutex_unlock(&kprobe_mutex); mutex_unlock(&kprobe_mutex);
synchronize_sched(); synchronize_rcu();
for (i = 0; i < num; i++) { for (i = 0; i < num; i++) {
if (rps[i]->kp.addr) { if (rps[i]->kp.addr) {
__unregister_kprobe_bottom(&rps[i]->kp); __unregister_kprobe_bottom(&rps[i]->kp);

View File

@ -60,7 +60,7 @@ static void notrace klp_ftrace_handler(unsigned long ip,
ops = container_of(fops, struct klp_ops, fops); ops = container_of(fops, struct klp_ops, fops);
/* /*
* A variant of synchronize_sched() is used to allow patching functions * A variant of synchronize_rcu() is used to allow patching functions
* where RCU is not watching, see klp_synchronize_transition(). * where RCU is not watching, see klp_synchronize_transition().
*/ */
preempt_disable_notrace(); preempt_disable_notrace();
@ -71,7 +71,7 @@ static void notrace klp_ftrace_handler(unsigned long ip,
/* /*
* func should never be NULL because preemption should be disabled here * func should never be NULL because preemption should be disabled here
* and unregister_ftrace_function() does the equivalent of a * and unregister_ftrace_function() does the equivalent of a
* synchronize_sched() before the func_stack removal. * synchronize_rcu() before the func_stack removal.
*/ */
if (WARN_ON_ONCE(!func)) if (WARN_ON_ONCE(!func))
goto unlock; goto unlock;

View File

@ -50,7 +50,7 @@ static DECLARE_DELAYED_WORK(klp_transition_work, klp_transition_work_fn);
/* /*
* This function is just a stub to implement a hard force * This function is just a stub to implement a hard force
* of synchronize_sched(). This requires synchronizing * of synchronize_rcu(). This requires synchronizing
* tasks even in userspace and idle. * tasks even in userspace and idle.
*/ */
static void klp_sync(struct work_struct *work) static void klp_sync(struct work_struct *work)
@ -159,7 +159,7 @@ void klp_cancel_transition(void)
void klp_update_patch_state(struct task_struct *task) void klp_update_patch_state(struct task_struct *task)
{ {
/* /*
* A variant of synchronize_sched() is used to allow patching functions * A variant of synchronize_rcu() is used to allow patching functions
* where RCU is not watching, see klp_synchronize_transition(). * where RCU is not watching, see klp_synchronize_transition().
*/ */
preempt_disable_notrace(); preempt_disable_notrace();

View File

@ -4351,7 +4351,7 @@ void lockdep_free_key_range(void *start, unsigned long size)
* *
* sync_sched() is sufficient because the read-side is IRQ disable. * sync_sched() is sufficient because the read-side is IRQ disable.
*/ */
synchronize_sched(); synchronize_rcu();
/* /*
* XXX at this point we could return the resources to the pool; * XXX at this point we could return the resources to the pool;

View File

@ -21,6 +21,9 @@
* Davidlohr Bueso <dave@stgolabs.net> * Davidlohr Bueso <dave@stgolabs.net>
* Based on kernel/rcu/torture.c. * Based on kernel/rcu/torture.c.
*/ */
#define pr_fmt(fmt) fmt
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/module.h> #include <linux/module.h>
#include <linux/kthread.h> #include <linux/kthread.h>
@ -57,7 +60,7 @@ torture_param(int, shutdown_secs, 0, "Shutdown time (j), <= zero to disable.");
torture_param(int, stat_interval, 60, torture_param(int, stat_interval, 60,
"Number of seconds between stats printk()s"); "Number of seconds between stats printk()s");
torture_param(int, stutter, 5, "Number of jiffies to run/halt test, 0=disable"); torture_param(int, stutter, 5, "Number of jiffies to run/halt test, 0=disable");
torture_param(bool, verbose, true, torture_param(int, verbose, 1,
"Enable verbose debugging printk()s"); "Enable verbose debugging printk()s");
static char *torture_type = "spin_lock"; static char *torture_type = "spin_lock";
@ -77,10 +80,6 @@ struct lock_stress_stats {
long n_lock_acquired; long n_lock_acquired;
}; };
int torture_runnable = IS_ENABLED(MODULE);
module_param(torture_runnable, int, 0444);
MODULE_PARM_DESC(torture_runnable, "Start locktorture at module init");
/* Forward reference. */ /* Forward reference. */
static void lock_torture_cleanup(void); static void lock_torture_cleanup(void);
@ -130,10 +129,8 @@ static void torture_lock_busted_write_delay(struct torture_random_state *trsp)
if (!(torture_random(trsp) % if (!(torture_random(trsp) %
(cxt.nrealwriters_stress * 2000 * longdelay_ms))) (cxt.nrealwriters_stress * 2000 * longdelay_ms)))
mdelay(longdelay_ms); mdelay(longdelay_ms);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000))) if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */ torture_preempt_schedule(); /* Allow test to be preempted. */
#endif
} }
static void torture_lock_busted_write_unlock(void) static void torture_lock_busted_write_unlock(void)
@ -179,10 +176,8 @@ static void torture_spin_lock_write_delay(struct torture_random_state *trsp)
if (!(torture_random(trsp) % if (!(torture_random(trsp) %
(cxt.nrealwriters_stress * 2 * shortdelay_us))) (cxt.nrealwriters_stress * 2 * shortdelay_us)))
udelay(shortdelay_us); udelay(shortdelay_us);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000))) if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */ torture_preempt_schedule(); /* Allow test to be preempted. */
#endif
} }
static void torture_spin_lock_write_unlock(void) __releases(torture_spinlock) static void torture_spin_lock_write_unlock(void) __releases(torture_spinlock)
@ -352,10 +347,8 @@ static void torture_mutex_delay(struct torture_random_state *trsp)
mdelay(longdelay_ms * 5); mdelay(longdelay_ms * 5);
else else
mdelay(longdelay_ms / 5); mdelay(longdelay_ms / 5);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000))) if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */ torture_preempt_schedule(); /* Allow test to be preempted. */
#endif
} }
static void torture_mutex_unlock(void) __releases(torture_mutex) static void torture_mutex_unlock(void) __releases(torture_mutex)
@ -507,10 +500,8 @@ static void torture_rtmutex_delay(struct torture_random_state *trsp)
if (!(torture_random(trsp) % if (!(torture_random(trsp) %
(cxt.nrealwriters_stress * 2 * shortdelay_us))) (cxt.nrealwriters_stress * 2 * shortdelay_us)))
udelay(shortdelay_us); udelay(shortdelay_us);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000))) if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */ torture_preempt_schedule(); /* Allow test to be preempted. */
#endif
} }
static void torture_rtmutex_unlock(void) __releases(torture_rtmutex) static void torture_rtmutex_unlock(void) __releases(torture_rtmutex)
@ -547,10 +538,8 @@ static void torture_rwsem_write_delay(struct torture_random_state *trsp)
mdelay(longdelay_ms * 10); mdelay(longdelay_ms * 10);
else else
mdelay(longdelay_ms / 10); mdelay(longdelay_ms / 10);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000))) if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */ torture_preempt_schedule(); /* Allow test to be preempted. */
#endif
} }
static void torture_rwsem_up_write(void) __releases(torture_rwsem) static void torture_rwsem_up_write(void) __releases(torture_rwsem)
@ -574,10 +563,8 @@ static void torture_rwsem_read_delay(struct torture_random_state *trsp)
mdelay(longdelay_ms * 2); mdelay(longdelay_ms * 2);
else else
mdelay(longdelay_ms / 2); mdelay(longdelay_ms / 2);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealreaders_stress * 20000))) if (!(torture_random(trsp) % (cxt.nrealreaders_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */ torture_preempt_schedule(); /* Allow test to be preempted. */
#endif
} }
static void torture_rwsem_up_read(void) __releases(torture_rwsem) static void torture_rwsem_up_read(void) __releases(torture_rwsem)
@ -878,7 +865,7 @@ static int __init lock_torture_init(void)
&percpu_rwsem_lock_ops, &percpu_rwsem_lock_ops,
}; };
if (!torture_init_begin(torture_type, verbose, &torture_runnable)) if (!torture_init_begin(torture_type, verbose))
return -EBUSY; return -EBUSY;
/* Process args and tell the world that the torturer is on the job. */ /* Process args and tell the world that the torturer is on the job. */
@ -979,7 +966,7 @@ static int __init lock_torture_init(void)
/* Prepare torture context. */ /* Prepare torture context. */
if (onoff_interval > 0) { if (onoff_interval > 0) {
firsterr = torture_onoff_init(onoff_holdoff * HZ, firsterr = torture_onoff_init(onoff_holdoff * HZ,
onoff_interval * HZ); onoff_interval * HZ, NULL);
if (firsterr) if (firsterr)
goto unwind; goto unwind;
} }
@ -995,7 +982,7 @@ static int __init lock_torture_init(void)
goto unwind; goto unwind;
} }
if (stutter > 0) { if (stutter > 0) {
firsterr = torture_stutter_init(stutter); firsterr = torture_stutter_init(stutter, stutter);
if (firsterr) if (firsterr)
goto unwind; goto unwind;
} }

View File

@ -15,7 +15,7 @@ int __percpu_init_rwsem(struct percpu_rw_semaphore *sem,
return -ENOMEM; return -ENOMEM;
/* ->rw_sem represents the whole percpu_rw_semaphore for lockdep */ /* ->rw_sem represents the whole percpu_rw_semaphore for lockdep */
rcu_sync_init(&sem->rss, RCU_SCHED_SYNC); rcu_sync_init(&sem->rss);
__init_rwsem(&sem->rw_sem, name, rwsem_key); __init_rwsem(&sem->rw_sem, name, rwsem_key);
rcuwait_init(&sem->writer); rcuwait_init(&sem->writer);
sem->readers_block = 0; sem->readers_block = 0;

View File

@ -2204,7 +2204,7 @@ static void free_module(struct module *mod)
/* Remove this module from bug list, this uses list_del_rcu */ /* Remove this module from bug list, this uses list_del_rcu */
module_bug_cleanup(mod); module_bug_cleanup(mod);
/* Wait for RCU-sched synchronizing before releasing mod->list and buglist. */ /* Wait for RCU-sched synchronizing before releasing mod->list and buglist. */
synchronize_sched(); synchronize_rcu();
mutex_unlock(&module_mutex); mutex_unlock(&module_mutex);
/* This may be empty, but that's OK */ /* This may be empty, but that's OK */
@ -3170,6 +3170,11 @@ static int find_module_sections(struct module *mod, struct load_info *info)
sizeof(*mod->tracepoints_ptrs), sizeof(*mod->tracepoints_ptrs),
&mod->num_tracepoints); &mod->num_tracepoints);
#endif #endif
#ifdef CONFIG_TREE_SRCU
mod->srcu_struct_ptrs = section_objs(info, "___srcu_struct_ptrs",
sizeof(*mod->srcu_struct_ptrs),
&mod->num_srcu_structs);
#endif
#ifdef HAVE_JUMP_LABEL #ifdef HAVE_JUMP_LABEL
mod->jump_entries = section_objs(info, "__jump_table", mod->jump_entries = section_objs(info, "__jump_table",
sizeof(*mod->jump_entries), sizeof(*mod->jump_entries),
@ -3577,15 +3582,15 @@ static noinline int do_init_module(struct module *mod)
/* /*
* We want to free module_init, but be aware that kallsyms may be * We want to free module_init, but be aware that kallsyms may be
* walking this with preempt disabled. In all the failure paths, we * walking this with preempt disabled. In all the failure paths, we
* call synchronize_sched(), but we don't want to slow down the success * call synchronize_rcu(), but we don't want to slow down the success
* path, so use actual RCU here. * path, so use actual RCU here.
* Note that module_alloc() on most architectures creates W+X page * Note that module_alloc() on most architectures creates W+X page
* mappings which won't be cleaned up until do_free_init() runs. Any * mappings which won't be cleaned up until do_free_init() runs. Any
* code such as mark_rodata_ro() which depends on those mappings to * code such as mark_rodata_ro() which depends on those mappings to
* be cleaned up needs to sync with the queued work - ie * be cleaned up needs to sync with the queued work - ie
* rcu_barrier_sched() * rcu_barrier()
*/ */
call_rcu_sched(&freeinit->rcu, do_free_init); call_rcu(&freeinit->rcu, do_free_init);
mutex_unlock(&module_mutex); mutex_unlock(&module_mutex);
wake_up_all(&module_wq); wake_up_all(&module_wq);
@ -3596,7 +3601,7 @@ fail_free_freeinit:
fail: fail:
/* Try to protect us from buggy refcounters. */ /* Try to protect us from buggy refcounters. */
mod->state = MODULE_STATE_GOING; mod->state = MODULE_STATE_GOING;
synchronize_sched(); synchronize_rcu();
module_put(mod); module_put(mod);
blocking_notifier_call_chain(&module_notify_list, blocking_notifier_call_chain(&module_notify_list,
MODULE_STATE_GOING, mod); MODULE_STATE_GOING, mod);
@ -3869,7 +3874,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
ddebug_cleanup: ddebug_cleanup:
dynamic_debug_remove(mod, info->debug); dynamic_debug_remove(mod, info->debug);
synchronize_sched(); synchronize_rcu();
kfree(mod->args); kfree(mod->args);
free_arch_cleanup: free_arch_cleanup:
module_arch_cleanup(mod); module_arch_cleanup(mod);
@ -3884,7 +3889,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
mod_tree_remove(mod); mod_tree_remove(mod);
wake_up_all(&module_wq); wake_up_all(&module_wq);
/* Wait for RCU-sched synchronizing before releasing mod->list. */ /* Wait for RCU-sched synchronizing before releasing mod->list. */
synchronize_sched(); synchronize_rcu();
mutex_unlock(&module_mutex); mutex_unlock(&module_mutex);
free_module: free_module:
/* /*

View File

@ -196,7 +196,7 @@ config RCU_BOOST
This option boosts the priority of preempted RCU readers that This option boosts the priority of preempted RCU readers that
block the current preemptible RCU grace period for too long. block the current preemptible RCU grace period for too long.
This option also prevents heavy loads from blocking RCU This option also prevents heavy loads from blocking RCU
callback invocation for all flavors of RCU. callback invocation.
Say Y here if you are working with real-time apps or heavy loads Say Y here if you are working with real-time apps or heavy loads
Say N here if you are unsure. Say N here if you are unsure.
@ -225,12 +225,12 @@ config RCU_NOCB_CPU
callback invocation to energy-efficient CPUs in battery-powered callback invocation to energy-efficient CPUs in battery-powered
asymmetric multiprocessors. asymmetric multiprocessors.
This option offloads callback invocation from the set of This option offloads callback invocation from the set of CPUs
CPUs specified at boot time by the rcu_nocbs parameter. specified at boot time by the rcu_nocbs parameter. For each
For each such CPU, a kthread ("rcuox/N") will be created to such CPU, a kthread ("rcuox/N") will be created to invoke
invoke callbacks, where the "N" is the CPU being offloaded, callbacks, where the "N" is the CPU being offloaded, and where
and where the "x" is "b" for RCU-bh, "p" for RCU-preempt, and the "p" for RCU-preempt (PREEMPT kernels) and "s" for RCU-sched
"s" for RCU-sched. Nothing prevents this kthread from running (!PREEMPT kernels). Nothing prevents this kthread from running
on the specified CPUs, but (1) the kthreads may be preempted on the specified CPUs, but (1) the kthreads may be preempted
between each callback, and (2) affinity or cgroups can be used between each callback, and (2) affinity or cgroups can be used
to force the kthreads to run on whatever set of CPUs is desired. to force the kthreads to run on whatever set of CPUs is desired.
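For example (illustrative, not part of this change), offloading callbacks from CPUs 1-7 would be requested on the kernel command line with:

	rcu_nocbs=1-7

after which kthreads such as rcuop/1 invoke those CPUs' callbacks on a PREEMPT kernel.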

View File

@ -7,6 +7,17 @@ menu "RCU Debugging"
config PROVE_RCU config PROVE_RCU
def_bool PROVE_LOCKING def_bool PROVE_LOCKING
config PROVE_RCU_LIST
bool "RCU list lockdep debugging"
depends on PROVE_RCU && RCU_EXPERT
default n
help
Enable RCU lockdep checking for list usages. By default it is
turned off since there are several list RCU users that still
need to be converted to pass a lockdep expression. To prevent
false-positive splats, we keep it default disabled but once all
users are converted, we can remove this config option.
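The "lockdep expression" mentioned above is the optional condition argument accepted by the RCU list iterators; a minimal sketch, where p, head, mylock and do_something() are all hypothetical:

	list_for_each_entry_rcu(p, &head, list, lockdep_is_held(&mylock))
		do_something(p);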
config TORTURE_TEST config TORTURE_TEST
tristate tristate
default n default n
@ -30,7 +41,6 @@ config RCU_PERF_TEST
config RCU_TORTURE_TEST config RCU_TORTURE_TEST
tristate "torture tests for RCU" tristate "torture tests for RCU"
depends on DEBUG_KERNEL
select TORTURE_TEST select TORTURE_TEST
select SRCU select SRCU
select TASKS_RCU select TASKS_RCU

View File

@ -30,31 +30,8 @@
#define RCU_TRACE(stmt) #define RCU_TRACE(stmt)
#endif /* #else #ifdef CONFIG_RCU_TRACE */ #endif /* #else #ifdef CONFIG_RCU_TRACE */
/* /* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
* Process-level increment to ->dynticks_nesting field. This allows for #define DYNTICK_IRQ_NONIDLE ((LONG_MAX / 2) + 1)
* architectures that use half-interrupts and half-exceptions from
* process context.
*
* DYNTICK_TASK_NEST_MASK defines a field of width DYNTICK_TASK_NEST_WIDTH
* that counts the number of process-based reasons why RCU cannot
* consider the corresponding CPU to be idle, and DYNTICK_TASK_NEST_VALUE
* is the value used to increment or decrement this field.
*
* The rest of the bits could in principle be used to count interrupts,
* but this would mean that a negative-one value in the interrupt
* field could incorrectly zero out the DYNTICK_TASK_NEST_MASK field.
* We therefore provide a two-bit guard field defined by DYNTICK_TASK_MASK
* that is set to DYNTICK_TASK_FLAG upon initial exit from idle.
* The DYNTICK_TASK_EXIT_IDLE value is thus the combined value used upon
* initial exit from idle.
*/
#define DYNTICK_TASK_NEST_WIDTH 7
#define DYNTICK_TASK_NEST_VALUE ((LLONG_MAX >> DYNTICK_TASK_NEST_WIDTH) + 1)
#define DYNTICK_TASK_NEST_MASK (LLONG_MAX - DYNTICK_TASK_NEST_VALUE + 1)
#define DYNTICK_TASK_FLAG ((DYNTICK_TASK_NEST_VALUE / 8) * 2)
#define DYNTICK_TASK_MASK ((DYNTICK_TASK_NEST_VALUE / 8) * 3)
#define DYNTICK_TASK_EXIT_IDLE (DYNTICK_TASK_NEST_VALUE + \
DYNTICK_TASK_FLAG)
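For reference, the new offset works out as follows on a 64-bit build:

	DYNTICK_IRQ_NONIDLE == (LONG_MAX / 2) + 1
	                    == (0x7fffffffffffffff / 2) + 1
	                    == 0x4000000000000000

leaving the low-order part of the counter free for irq/NMI nesting while the offset itself marks task-level non-idleness (a reading of the one-line comment above, not spelled out in this hunk).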
/* /*
@ -100,15 +77,31 @@ static inline void rcu_seq_start(unsigned long *sp)
WARN_ON_ONCE(rcu_seq_state(*sp) != 1); WARN_ON_ONCE(rcu_seq_state(*sp) != 1);
} }
/* Compute the end-of-grace-period value for the specified sequence number. */
static inline unsigned long rcu_seq_endval(unsigned long *sp)
{
return (*sp | RCU_SEQ_STATE_MASK) + 1;
}
/* Adjust sequence number for end of update-side operation. */ /* Adjust sequence number for end of update-side operation. */
static inline void rcu_seq_end(unsigned long *sp) static inline void rcu_seq_end(unsigned long *sp)
{ {
smp_mb(); /* Ensure update-side operation before counter increment. */ smp_mb(); /* Ensure update-side operation before counter increment. */
WARN_ON_ONCE(!rcu_seq_state(*sp)); WARN_ON_ONCE(!rcu_seq_state(*sp));
WRITE_ONCE(*sp, (*sp | RCU_SEQ_STATE_MASK) + 1); WRITE_ONCE(*sp, rcu_seq_endval(sp));
} }
/* Take a snapshot of the update side's sequence number. */ /*
* rcu_seq_snap - Take a snapshot of the update side's sequence number.
*
* This function returns the earliest value of the grace-period sequence number
* that will indicate that a full grace period has elapsed since the current
* time. Once the grace-period sequence number has reached this value, it will
* be safe to invoke all callbacks that have been registered prior to the
* current time. This value is the current grace-period number plus two to the
* power of the number of low-order bits reserved for state, then rounded up to
* the next value in which the state bits are all zero.
*/
static inline unsigned long rcu_seq_snap(unsigned long *sp) static inline unsigned long rcu_seq_snap(unsigned long *sp)
{ {
unsigned long s; unsigned long s;
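A worked example of the rule described in the comment above, assuming the usual two low-order state bits (RCU_SEQ_STATE_MASK == 0x3):

	gp_seq == 0x8 (no grace period in flight):  rcu_seq_snap() returns 0xc
	gp_seq == 0x9 (grace period in progress):   rcu_seq_snap() returns 0x10

In the in-flight case the snapshot must cover the end of the current grace period plus one more full grace period, hence the larger value.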
@ -124,6 +117,15 @@ static inline unsigned long rcu_seq_current(unsigned long *sp)
return READ_ONCE(*sp); return READ_ONCE(*sp);
} }
/*
* Given a snapshot from rcu_seq_snap(), determine whether or not the
* corresponding update-side operation has started.
*/
static inline bool rcu_seq_started(unsigned long *sp, unsigned long s)
{
return ULONG_CMP_LT((s - 1) & ~RCU_SEQ_STATE_MASK, READ_ONCE(*sp));
}
/* /*
* Given a snapshot from rcu_seq_snap(), determine whether or not a * Given a snapshot from rcu_seq_snap(), determine whether or not a
* full update-side operation has occurred. * full update-side operation has occurred.
@ -133,10 +135,50 @@ static inline bool rcu_seq_done(unsigned long *sp, unsigned long s)
return ULONG_CMP_GE(READ_ONCE(*sp), s); return ULONG_CMP_GE(READ_ONCE(*sp), s);
} }
/*
* Has a grace period completed since the time the old gp_seq was collected?
*/
static inline bool rcu_seq_completed_gp(unsigned long old, unsigned long new)
{
return ULONG_CMP_LT(old, new & ~RCU_SEQ_STATE_MASK);
}
/*
* Has a grace period started since the time the old gp_seq was collected?
*/
static inline bool rcu_seq_new_gp(unsigned long old, unsigned long new)
{
return ULONG_CMP_LT((old + RCU_SEQ_STATE_MASK) & ~RCU_SEQ_STATE_MASK,
new);
}
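A quick sanity check of these two helpers with the same two-state-bit layout (values invented):

	rcu_seq_completed_gp(0x8, 0x9);  /* false: the GP running at 0x9 has not ended  */
	rcu_seq_completed_gp(0x8, 0xc);  /* true:  a full GP has ended since 0x8        */
	rcu_seq_new_gp(0x8, 0x9);        /* true:  a GP has started since 0x8           */
	rcu_seq_new_gp(0x9, 0xc);        /* false: 0xc only ends the GP already running */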
/*
* Roughly how many full grace periods have elapsed between the collection
* of the two specified grace periods?
*/
static inline unsigned long rcu_seq_diff(unsigned long new, unsigned long old)
{
unsigned long rnd_diff;
if (old == new)
return 0;
/*
* Compute the number of grace periods (still shifted up), plus
* one if either of new and old is not an exact grace period.
*/
rnd_diff = (new & ~RCU_SEQ_STATE_MASK) -
((old + RCU_SEQ_STATE_MASK) & ~RCU_SEQ_STATE_MASK) +
((new & RCU_SEQ_STATE_MASK) || (old & RCU_SEQ_STATE_MASK));
if (ULONG_CMP_GE(RCU_SEQ_STATE_MASK, rnd_diff))
return 1; /* Definitely no grace period has elapsed. */
return ((rnd_diff - RCU_SEQ_STATE_MASK - 1) >> RCU_SEQ_CTR_SHIFT) + 2;
}
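Taken together, these helpers support a snapshot/poll pattern; a minimal sketch, with rcu_state.gp_seq and invoke_ready_callbacks() assumed for illustration:

	unsigned long s;

	s = rcu_seq_snap(&rcu_state.gp_seq);	/* earliest gp_seq that satisfies us */
	/* ... wait for, or help drive, grace periods ... */
	if (rcu_seq_done(&rcu_state.gp_seq, s))
		invoke_ready_callbacks();	/* all pre-snapshot readers have finished */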
/* /*
* debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally * debug_rcu_head_queue()/debug_rcu_head_unqueue() are used internally
* by call_rcu() and rcu callback execution, and are therefore not part of the * by call_rcu() and rcu callback execution, and are therefore not part
* RCU API. Leaving in rcupdate.h because they are used by all RCU flavors. * of the RCU API. These are in rcupdate.h because they are used by all
* RCU implementations.
*/ */
#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
@ -182,6 +224,7 @@ void kfree(const void *);
*/ */
static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head) static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
{ {
rcu_callback_t f;
unsigned long offset = (unsigned long)head->func; unsigned long offset = (unsigned long)head->func;
rcu_lock_acquire(&rcu_callback_map); rcu_lock_acquire(&rcu_callback_map);
@ -192,7 +235,9 @@ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
return true; return true;
} else { } else {
RCU_TRACE(trace_rcu_invoke_callback(rn, head);) RCU_TRACE(trace_rcu_invoke_callback(rn, head);)
head->func(head); f = head->func;
WRITE_ONCE(head->func, (rcu_callback_t)0L);
f(head);
rcu_lock_release(&rcu_callback_map); rcu_lock_release(&rcu_callback_map);
return false; return false;
} }
@ -200,9 +245,26 @@ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
#ifdef CONFIG_RCU_STALL_COMMON #ifdef CONFIG_RCU_STALL_COMMON
extern int rcu_cpu_stall_ftrace_dump;
extern int rcu_cpu_stall_suppress; extern int rcu_cpu_stall_suppress;
extern int rcu_cpu_stall_timeout;
int rcu_jiffies_till_stall_check(void); int rcu_jiffies_till_stall_check(void);
#define rcu_ftrace_dump_stall_suppress() \
do { \
if (!rcu_cpu_stall_suppress) \
rcu_cpu_stall_suppress = 3; \
} while (0)
#define rcu_ftrace_dump_stall_unsuppress() \
do { \
if (rcu_cpu_stall_suppress == 3) \
rcu_cpu_stall_suppress = 0; \
} while (0)
#else /* #endif #ifdef CONFIG_RCU_STALL_COMMON */
#define rcu_ftrace_dump_stall_suppress()
#define rcu_ftrace_dump_stall_unsuppress()
#endif /* #ifdef CONFIG_RCU_STALL_COMMON */ #endif /* #ifdef CONFIG_RCU_STALL_COMMON */
/* /*
@ -220,8 +282,12 @@ do { \
static atomic_t ___rfd_beenhere = ATOMIC_INIT(0); \ static atomic_t ___rfd_beenhere = ATOMIC_INIT(0); \
\ \
if (!atomic_read(&___rfd_beenhere) && \ if (!atomic_read(&___rfd_beenhere) && \
!atomic_xchg(&___rfd_beenhere, 1)) \ !atomic_xchg(&___rfd_beenhere, 1)) { \
tracing_off(); \
rcu_ftrace_dump_stall_suppress(); \
ftrace_dump(oops_dump_mode); \ ftrace_dump(oops_dump_mode); \
rcu_ftrace_dump_stall_unsuppress(); \
} \
} while (0) } while (0)
void rcu_early_boot_tests(void); void rcu_early_boot_tests(void);
@ -268,40 +334,53 @@ static inline void rcu_init_levelspread(int *levelspread, const int *levelcnt)
} }
} }
/* /* Returns a pointer to the first leaf rcu_node structure. */
* Do a full breadth-first scan of the rcu_node structures for the #define rcu_first_leaf_node() (rcu_state.level[rcu_num_lvls - 1])
* specified rcu_state structure.
*/ /* Is this rcu_node a leaf? */
#define rcu_for_each_node_breadth_first(rsp, rnp) \ #define rcu_is_leaf_node(rnp) ((rnp)->level == rcu_num_lvls - 1)
for ((rnp) = &(rsp)->node[0]; \
(rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++) /* Is this rcu_node the last leaf? */
#define rcu_is_last_leaf_node(rnp) ((rnp) == &rcu_state.node[rcu_num_nodes - 1])
/* /*
* Do a breadth-first scan of the non-leaf rcu_node structures for the * Do a full breadth-first scan of the {s,}rcu_node structures for the
* specified rcu_state structure. Note that if there is a singleton * specified state structure (for SRCU) or the only rcu_state structure
* rcu_node tree with but one rcu_node structure, this loop is a no-op. * (for RCU).
*/ */
#define rcu_for_each_nonleaf_node_breadth_first(rsp, rnp) \ #define srcu_for_each_node_breadth_first(sp, rnp) \
for ((rnp) = &(rsp)->node[0]; \ for ((rnp) = &(sp)->node[0]; \
(rnp) < (rsp)->level[rcu_num_lvls - 1]; (rnp)++) (rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++)
#define rcu_for_each_node_breadth_first(rnp) \
srcu_for_each_node_breadth_first(&rcu_state, rnp)
/* /*
* Scan the leaves of the rcu_node hierarchy for the specified rcu_state * Scan the leaves of the rcu_node hierarchy for the rcu_state structure.
* structure. Note that if there is a singleton rcu_node tree with but * Note that if there is a singleton rcu_node tree with but one rcu_node
* one rcu_node structure, this loop -will- visit the rcu_node structure. * structure, this loop -will- visit the rcu_node structure. It is still
* It is still a leaf node, even if it is also the root node. * a leaf node, even if it is also the root node.
*/ */
#define rcu_for_each_leaf_node(rsp, rnp) \ #define rcu_for_each_leaf_node(rnp) \
for ((rnp) = (rsp)->level[rcu_num_lvls - 1]; \ for ((rnp) = rcu_first_leaf_node(); \
(rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++) (rnp) < &rcu_state.node[rcu_num_nodes]; (rnp)++)
/* /*
* Iterate over all possible CPUs in a leaf RCU node. * Iterate over all possible CPUs in a leaf RCU node.
*/ */
#define for_each_leaf_node_possible_cpu(rnp, cpu) \ #define for_each_leaf_node_possible_cpu(rnp, cpu) \
for ((cpu) = cpumask_next(rnp->grplo - 1, cpu_possible_mask); \ for ((cpu) = cpumask_next((rnp)->grplo - 1, cpu_possible_mask); \
cpu <= rnp->grphi; \ (cpu) <= rnp->grphi; \
cpu = cpumask_next((cpu), cpu_possible_mask)) (cpu) = cpumask_next((cpu), cpu_possible_mask))
/*
* Iterate over all CPUs in a leaf RCU node's specified mask.
*/
#define rcu_find_next_bit(rnp, cpu, mask) \
((rnp)->grplo + find_next_bit(&(mask), BITS_PER_LONG, (cpu)))
#define for_each_leaf_node_cpu_mask(rnp, cpu, mask) \
for ((cpu) = rcu_find_next_bit((rnp), 0, (mask)); \
(cpu) <= rnp->grphi; \
(cpu) = rcu_find_next_bit((rnp), (cpu) + 1 - (rnp->grplo), (mask)))
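A sketch of how the mask-restricted iterator might be used, for example to nudge only the CPUs still blocking a given leaf node (the action is illustrative):

	for_each_leaf_node_cpu_mask(rnp, cpu, rnp->qsmask)
		resched_cpu(cpu);	/* ask each still-blocking CPU to pass through a QS */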
/* /*
* Wrappers for the rcu_node::lock acquire and release. * Wrappers for the rcu_node::lock acquire and release.
@ -341,7 +420,7 @@ do { \
} while (0) } while (0)
#define raw_spin_unlock_irqrestore_rcu_node(p, flags) \ #define raw_spin_unlock_irqrestore_rcu_node(p, flags) \
raw_spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags) \ raw_spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags)
#define raw_spin_trylock_rcu_node(p) \ #define raw_spin_trylock_rcu_node(p) \
({ \ ({ \
@ -352,45 +431,48 @@ do { \
___locked; \ ___locked; \
}) })
#define raw_lockdep_assert_held_rcu_node(p) \
lockdep_assert_held(&ACCESS_PRIVATE(p, lock))
#endif /* #if defined(SRCU) || !defined(TINY_RCU) */ #endif /* #if defined(SRCU) || !defined(TINY_RCU) */
#ifdef CONFIG_SRCU
void srcu_init(void);
#else /* #ifdef CONFIG_SRCU */
static inline void srcu_init(void) { }
#endif /* #else #ifdef CONFIG_SRCU */
#ifdef CONFIG_TINY_RCU #ifdef CONFIG_TINY_RCU
/* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */ /* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */
static inline bool rcu_gp_is_normal(void) { return true; } static inline bool rcu_gp_is_normal(void) { return true; }
static inline bool rcu_gp_is_expedited(void) { return false; } static inline bool rcu_gp_is_expedited(void) { return false; }
static inline void rcu_expedite_gp(void) { } static inline void rcu_expedite_gp(void) { }
static inline void rcu_unexpedite_gp(void) { } static inline void rcu_unexpedite_gp(void) { }
static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
#else /* #ifdef CONFIG_TINY_RCU */ #else /* #ifdef CONFIG_TINY_RCU */
bool rcu_gp_is_normal(void); /* Internal RCU use. */ bool rcu_gp_is_normal(void); /* Internal RCU use. */
bool rcu_gp_is_expedited(void); /* Internal RCU use. */ bool rcu_gp_is_expedited(void); /* Internal RCU use. */
void rcu_expedite_gp(void); void rcu_expedite_gp(void);
void rcu_unexpedite_gp(void); void rcu_unexpedite_gp(void);
void rcupdate_announce_bootup_oddness(void); void rcupdate_announce_bootup_oddness(void);
void rcu_request_urgent_qs_task(struct task_struct *t);
#endif /* #else #ifdef CONFIG_TINY_RCU */ #endif /* #else #ifdef CONFIG_TINY_RCU */
#define RCU_SCHEDULER_INACTIVE 0 #define RCU_SCHEDULER_INACTIVE 0
#define RCU_SCHEDULER_INIT 1 #define RCU_SCHEDULER_INIT 1
#define RCU_SCHEDULER_RUNNING 2 #define RCU_SCHEDULER_RUNNING 2
#ifdef CONFIG_TINY_RCU
static inline void rcu_request_urgent_qs_task(struct task_struct *t) { }
#else /* #ifdef CONFIG_TINY_RCU */
void rcu_request_urgent_qs_task(struct task_struct *t);
#endif /* #else #ifdef CONFIG_TINY_RCU */
enum rcutorture_type { enum rcutorture_type {
RCU_FLAVOR, RCU_FLAVOR,
RCU_BH_FLAVOR,
RCU_SCHED_FLAVOR,
RCU_TASKS_FLAVOR, RCU_TASKS_FLAVOR,
RCU_TRIVIAL_FLAVOR,
SRCU_FLAVOR, SRCU_FLAVOR,
INVALID_RCU_FLAVOR INVALID_RCU_FLAVOR
}; };
#if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU) #if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU)
void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags, void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
unsigned long *gpnum, unsigned long *completed); unsigned long *gp_seq);
void rcutorture_record_test_transition(void);
void rcutorture_record_progress(unsigned long vernum); void rcutorture_record_progress(unsigned long vernum);
void do_trace_rcu_torture_read(const char *rcutorturename, void do_trace_rcu_torture_read(const char *rcutorturename,
struct rcu_head *rhp, struct rcu_head *rhp,
@ -399,15 +481,11 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
unsigned long c); unsigned long c);
#else #else
static inline void rcutorture_get_gp_data(enum rcutorture_type test_type, static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
int *flags, int *flags, unsigned long *gp_seq)
unsigned long *gpnum,
unsigned long *completed)
{ {
*flags = 0; *flags = 0;
*gpnum = 0; *gp_seq = 0;
*completed = 0;
} }
static inline void rcutorture_record_test_transition(void) { }
static inline void rcutorture_record_progress(unsigned long vernum) { } static inline void rcutorture_record_progress(unsigned long vernum) { }
#ifdef CONFIG_RCU_TRACE #ifdef CONFIG_RCU_TRACE
void do_trace_rcu_torture_read(const char *rcutorturename, void do_trace_rcu_torture_read(const char *rcutorturename,
@ -421,66 +499,57 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
#endif #endif
#endif #endif
#if IS_ENABLED(CONFIG_RCU_TORTURE_TEST) || IS_MODULE(CONFIG_RCU_TORTURE_TEST)
long rcutorture_sched_setaffinity(pid_t pid, const struct cpumask *in_mask);
#endif
#ifdef CONFIG_TINY_SRCU #ifdef CONFIG_TINY_SRCU
static inline void srcutorture_get_gp_data(enum rcutorture_type test_type, static inline void srcutorture_get_gp_data(enum rcutorture_type test_type,
struct srcu_struct *sp, int *flags, struct srcu_struct *sp, int *flags,
unsigned long *gpnum, unsigned long *gp_seq)
unsigned long *completed)
{ {
if (test_type != SRCU_FLAVOR) if (test_type != SRCU_FLAVOR)
return; return;
*flags = 0; *flags = 0;
*completed = sp->srcu_idx; *gp_seq = sp->srcu_idx;
*gpnum = *completed;
} }
#elif defined(CONFIG_TREE_SRCU) #elif defined(CONFIG_TREE_SRCU)
void srcutorture_get_gp_data(enum rcutorture_type test_type, void srcutorture_get_gp_data(enum rcutorture_type test_type,
struct srcu_struct *sp, int *flags, struct srcu_struct *sp, int *flags,
unsigned long *gpnum, unsigned long *completed); unsigned long *gp_seq);
#endif #endif
#ifdef CONFIG_TINY_RCU #ifdef CONFIG_TINY_RCU
static inline unsigned long rcu_batches_started(void) { return 0; } static inline unsigned long rcu_get_gp_seq(void) { return 0; }
static inline unsigned long rcu_batches_started_bh(void) { return 0; }
static inline unsigned long rcu_batches_started_sched(void) { return 0; }
static inline unsigned long rcu_batches_completed(void) { return 0; }
static inline unsigned long rcu_batches_completed_bh(void) { return 0; }
static inline unsigned long rcu_batches_completed_sched(void) { return 0; }
static inline unsigned long rcu_exp_batches_completed(void) { return 0; } static inline unsigned long rcu_exp_batches_completed(void) { return 0; }
static inline unsigned long rcu_exp_batches_completed_sched(void) { return 0; }
static inline unsigned long static inline unsigned long
srcu_batches_completed(struct srcu_struct *sp) { return 0; } srcu_batches_completed(struct srcu_struct *sp) { return 0; }
static inline void rcu_force_quiescent_state(void) { } static inline void rcu_force_quiescent_state(void) { }
static inline void rcu_bh_force_quiescent_state(void) { }
static inline void rcu_sched_force_quiescent_state(void) { }
static inline void show_rcu_gp_kthreads(void) { } static inline void show_rcu_gp_kthreads(void) { }
static inline int rcu_get_gp_kthreads_prio(void) { return 0; }
static inline void rcu_fwd_progress_check(unsigned long j) { }
#else /* #ifdef CONFIG_TINY_RCU */ #else /* #ifdef CONFIG_TINY_RCU */
extern unsigned long rcutorture_testseq; unsigned long rcu_get_gp_seq(void);
extern unsigned long rcutorture_vernum;
unsigned long rcu_batches_started(void);
unsigned long rcu_batches_started_bh(void);
unsigned long rcu_batches_started_sched(void);
unsigned long rcu_batches_completed(void);
unsigned long rcu_batches_completed_bh(void);
unsigned long rcu_batches_completed_sched(void);
unsigned long rcu_exp_batches_completed(void); unsigned long rcu_exp_batches_completed(void);
unsigned long rcu_exp_batches_completed_sched(void);
unsigned long srcu_batches_completed(struct srcu_struct *sp); unsigned long srcu_batches_completed(struct srcu_struct *sp);
void show_rcu_gp_kthreads(void); void show_rcu_gp_kthreads(void);
int rcu_get_gp_kthreads_prio(void);
void rcu_fwd_progress_check(unsigned long j);
void rcu_force_quiescent_state(void); void rcu_force_quiescent_state(void);
void rcu_bh_force_quiescent_state(void);
void rcu_sched_force_quiescent_state(void);
extern struct workqueue_struct *rcu_gp_wq; extern struct workqueue_struct *rcu_gp_wq;
extern struct workqueue_struct *rcu_par_gp_wq;
#endif /* #else #ifdef CONFIG_TINY_RCU */ #endif /* #else #ifdef CONFIG_TINY_RCU */
#ifdef CONFIG_RCU_NOCB_CPU #ifdef CONFIG_RCU_NOCB_CPU
bool rcu_is_nocb_cpu(int cpu); bool rcu_is_nocb_cpu(int cpu);
void rcu_bind_current_to_nocb(void);
#else #else
static inline bool rcu_is_nocb_cpu(int cpu) { return false; } static inline bool rcu_is_nocb_cpu(int cpu) { return false; }
static inline void rcu_bind_current_to_nocb(void) { }
#endif #endif
#endif /* __LINUX_RCU_H */ #endif /* __LINUX_RCU_H */

View File

@ -23,6 +23,7 @@
#include <linux/types.h> #include <linux/types.h>
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/interrupt.h> #include <linux/interrupt.h>
#include <linux/rcupdate.h>
#include "rcu_segcblist.h" #include "rcu_segcblist.h"
@ -35,6 +36,49 @@ void rcu_cblist_init(struct rcu_cblist *rclp)
rclp->len_lazy = 0; rclp->len_lazy = 0;
} }
/*
* Enqueue an rcu_head structure onto the specified callback list.
* This function assumes that the callback is non-lazy because it
* is intended for use by no-CBs CPUs, which do not distinguish
* between lazy and non-lazy RCU callbacks.
*/
void rcu_cblist_enqueue(struct rcu_cblist *rclp, struct rcu_head *rhp)
{
*rclp->tail = rhp;
rclp->tail = &rhp->next;
WRITE_ONCE(rclp->len, rclp->len + 1);
}
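A minimal usage sketch for the new helper, pairing it with the existing rcu_cblist_init() and rcu_cblist_dequeue(); obj and its embedded rcu_head are hypothetical:

	struct rcu_cblist rcl;
	struct rcu_head *rhp;

	rcu_cblist_init(&rcl);
	rcu_cblist_enqueue(&rcl, &obj->rh);
	while ((rhp = rcu_cblist_dequeue(&rcl)) != NULL)
		rhp->func(rhp);		/* invoke each callback, as __rcu_reclaim() would */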
/*
* Flush the second rcu_cblist structure onto the first one, obliterating
* any contents of the first. If rhp is non-NULL, enqueue it as the sole
* element of the second rcu_cblist structure, but ensuring that the second
* rcu_cblist structure, if initially non-empty, always appears non-empty
 * throughout the process. If rhp is NULL, the second rcu_cblist structure
* is instead initialized to empty.
*/
void rcu_cblist_flush_enqueue(struct rcu_cblist *drclp,
struct rcu_cblist *srclp,
struct rcu_head *rhp)
{
drclp->head = srclp->head;
if (drclp->head)
drclp->tail = srclp->tail;
else
drclp->tail = &drclp->head;
drclp->len = srclp->len;
drclp->len_lazy = srclp->len_lazy;
if (!rhp) {
rcu_cblist_init(srclp);
} else {
rhp->next = NULL;
srclp->head = rhp;
srclp->tail = &rhp->next;
WRITE_ONCE(srclp->len, 1);
srclp->len_lazy = 0;
}
}
/* /*
* Dequeue the oldest rcu_head structure from the specified callback * Dequeue the oldest rcu_head structure from the specified callback
* list. This function assumes that the callback is non-lazy, but * list. This function assumes that the callback is non-lazy, but
@ -56,6 +100,67 @@ struct rcu_head *rcu_cblist_dequeue(struct rcu_cblist *rclp)
return rhp; return rhp;
} }
/* Set the length of an rcu_segcblist structure. */
void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
{
#ifdef CONFIG_RCU_NOCB_CPU
atomic_long_set(&rsclp->len, v);
#else
WRITE_ONCE(rsclp->len, v);
#endif
}
/*
* Increase the numeric length of an rcu_segcblist structure by the
* specified amount, which can be negative. This can cause the ->len
* field to disagree with the actual number of callbacks on the structure.
 * This increase is fully ordered with respect to the caller's accesses
* both before and after.
*/
void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
{
#ifdef CONFIG_RCU_NOCB_CPU
smp_mb__before_atomic(); /* Up to the caller! */
atomic_long_add(v, &rsclp->len);
smp_mb__after_atomic(); /* Up to the caller! */
#else
smp_mb(); /* Up to the caller! */
WRITE_ONCE(rsclp->len, rsclp->len + v);
smp_mb(); /* Up to the caller! */
#endif
}
/*
* Increase the numeric length of an rcu_segcblist structure by one.
* This can cause the ->len field to disagree with the actual number of
* callbacks on the structure. This increase is fully ordered with respect
 * to the caller's accesses both before and after.
*/
void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp)
{
rcu_segcblist_add_len(rsclp, 1);
}
/*
* Exchange the numeric length of the specified rcu_segcblist structure
* with the specified value. This can cause the ->len field to disagree
* with the actual number of callbacks on the structure. This exchange is
 * fully ordered with respect to the caller's accesses both before and after.
*/
long rcu_segcblist_xchg_len(struct rcu_segcblist *rsclp, long v)
{
#ifdef CONFIG_RCU_NOCB_CPU
return atomic_long_xchg(&rsclp->len, v);
#else
long ret = rsclp->len;
smp_mb(); /* Up to the caller! */
WRITE_ONCE(rsclp->len, v);
smp_mb(); /* Up to the caller! */
return ret;
#endif
}
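These accessors hide whether ->len is an atomic_long_t (CONFIG_RCU_NOCB_CPU, where rcuo kthreads may adjust the count concurrently) or a plain long, so lockless readers sample it through the matching helper; a sketch assuming the rcu_segcblist_n_cbs() accessor from rcu_segcblist.h:

	long n = rcu_segcblist_n_cbs(rsclp);	/* atomic_long_read() or READ_ONCE() underneath */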
/* /*
* Initialize an rcu_segcblist structure. * Initialize an rcu_segcblist structure.
*/ */
@ -68,8 +173,9 @@ void rcu_segcblist_init(struct rcu_segcblist *rsclp)
rsclp->head = NULL; rsclp->head = NULL;
for (i = 0; i < RCU_CBLIST_NSEGS; i++) for (i = 0; i < RCU_CBLIST_NSEGS; i++)
rsclp->tails[i] = &rsclp->head; rsclp->tails[i] = &rsclp->head;
rsclp->len = 0; rcu_segcblist_set_len(rsclp, 0);
rsclp->len_lazy = 0; rsclp->len_lazy = 0;
rsclp->enabled = 1;
} }
/* /*
@ -81,7 +187,16 @@ void rcu_segcblist_disable(struct rcu_segcblist *rsclp)
WARN_ON_ONCE(!rcu_segcblist_empty(rsclp)); WARN_ON_ONCE(!rcu_segcblist_empty(rsclp));
WARN_ON_ONCE(rcu_segcblist_n_cbs(rsclp)); WARN_ON_ONCE(rcu_segcblist_n_cbs(rsclp));
WARN_ON_ONCE(rcu_segcblist_n_lazy_cbs(rsclp)); WARN_ON_ONCE(rcu_segcblist_n_lazy_cbs(rsclp));
rsclp->tails[RCU_NEXT_TAIL] = NULL; rsclp->enabled = 0;
}
/*
* Mark the specified rcu_segcblist structure as offloaded. This
* structure must be empty.
*/
void rcu_segcblist_offload(struct rcu_segcblist *rsclp)
{
rsclp->offloaded = 1;
} }
/* /*
@ -129,6 +244,18 @@ struct rcu_head *rcu_segcblist_first_pend_cb(struct rcu_segcblist *rsclp)
return NULL; return NULL;
} }
/*
* Return false if there are no CBs awaiting grace periods, otherwise,
* return true and store the nearest waited-upon grace period into *lp.
*/
bool rcu_segcblist_nextgp(struct rcu_segcblist *rsclp, unsigned long *lp)
{
if (!rcu_segcblist_pend_cbs(rsclp))
return false;
*lp = rsclp->gp_seq[RCU_WAIT_TAIL];
return true;
}
/* /*
* Enqueue the specified callback onto the specified rcu_segcblist * Enqueue the specified callback onto the specified rcu_segcblist
* structure, updating accounting as needed. Note that the ->len * structure, updating accounting as needed. Note that the ->len
@ -141,13 +268,13 @@ struct rcu_head *rcu_segcblist_first_pend_cb(struct rcu_segcblist *rsclp)
void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp, void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
struct rcu_head *rhp, bool lazy) struct rcu_head *rhp, bool lazy)
{ {
WRITE_ONCE(rsclp->len, rsclp->len + 1); /* ->len sampled locklessly. */ rcu_segcblist_inc_len(rsclp);
if (lazy) if (lazy)
rsclp->len_lazy++; rsclp->len_lazy++;
smp_mb(); /* Ensure counts are updated before callback is enqueued. */ smp_mb(); /* Ensure counts are updated before callback is enqueued. */
rhp->next = NULL; rhp->next = NULL;
*rsclp->tails[RCU_NEXT_TAIL] = rhp; WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
rsclp->tails[RCU_NEXT_TAIL] = &rhp->next; WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], &rhp->next);
} }
/* /*
@ -167,7 +294,7 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
if (rcu_segcblist_n_cbs(rsclp) == 0) if (rcu_segcblist_n_cbs(rsclp) == 0)
return false; return false;
WRITE_ONCE(rsclp->len, rsclp->len + 1); rcu_segcblist_inc_len(rsclp);
if (lazy) if (lazy)
rsclp->len_lazy++; rsclp->len_lazy++;
smp_mb(); /* Ensure counts are updated before callback is entrained. */ smp_mb(); /* Ensure counts are updated before callback is entrained. */
@ -175,9 +302,9 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--) for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
if (rsclp->tails[i] != rsclp->tails[i - 1]) if (rsclp->tails[i] != rsclp->tails[i - 1])
break; break;
*rsclp->tails[i] = rhp; WRITE_ONCE(*rsclp->tails[i], rhp);
for (; i <= RCU_NEXT_TAIL; i++) for (; i <= RCU_NEXT_TAIL; i++)
rsclp->tails[i] = &rhp->next; WRITE_ONCE(rsclp->tails[i], &rhp->next);
return true; return true;
} }
@ -194,9 +321,8 @@ void rcu_segcblist_extract_count(struct rcu_segcblist *rsclp,
struct rcu_cblist *rclp) struct rcu_cblist *rclp)
{ {
rclp->len_lazy += rsclp->len_lazy; rclp->len_lazy += rsclp->len_lazy;
rclp->len += rsclp->len;
rsclp->len_lazy = 0; rsclp->len_lazy = 0;
WRITE_ONCE(rsclp->len, 0); /* ->len sampled locklessly. */ rclp->len = rcu_segcblist_xchg_len(rsclp, 0);
} }
/* /*
@ -212,12 +338,12 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
if (!rcu_segcblist_ready_cbs(rsclp)) if (!rcu_segcblist_ready_cbs(rsclp))
return; /* Nothing to do. */ return; /* Nothing to do. */
*rclp->tail = rsclp->head; *rclp->tail = rsclp->head;
rsclp->head = *rsclp->tails[RCU_DONE_TAIL]; WRITE_ONCE(rsclp->head, *rsclp->tails[RCU_DONE_TAIL]);
*rsclp->tails[RCU_DONE_TAIL] = NULL; WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
rclp->tail = rsclp->tails[RCU_DONE_TAIL]; rclp->tail = rsclp->tails[RCU_DONE_TAIL];
for (i = RCU_CBLIST_NSEGS - 1; i >= RCU_DONE_TAIL; i--) for (i = RCU_CBLIST_NSEGS - 1; i >= RCU_DONE_TAIL; i--)
if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL]) if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
rsclp->tails[i] = &rsclp->head; WRITE_ONCE(rsclp->tails[i], &rsclp->head);
} }
/* /*
@ -236,9 +362,9 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
return; /* Nothing to do. */ return; /* Nothing to do. */
*rclp->tail = *rsclp->tails[RCU_DONE_TAIL]; *rclp->tail = *rsclp->tails[RCU_DONE_TAIL];
rclp->tail = rsclp->tails[RCU_NEXT_TAIL]; rclp->tail = rsclp->tails[RCU_NEXT_TAIL];
*rsclp->tails[RCU_DONE_TAIL] = NULL; WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++) for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++)
rsclp->tails[i] = rsclp->tails[RCU_DONE_TAIL]; WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
} }
/* /*
@ -249,8 +375,7 @@ void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
struct rcu_cblist *rclp) struct rcu_cblist *rclp)
{ {
rsclp->len_lazy += rclp->len_lazy; rsclp->len_lazy += rclp->len_lazy;
/* ->len sampled locklessly. */ rcu_segcblist_add_len(rsclp, rclp->len);
WRITE_ONCE(rsclp->len, rsclp->len + rclp->len);
rclp->len_lazy = 0; rclp->len_lazy = 0;
rclp->len = 0; rclp->len = 0;
} }
@ -267,10 +392,10 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
if (!rclp->head) if (!rclp->head)
return; /* No callbacks to move. */ return; /* No callbacks to move. */
*rclp->tail = rsclp->head; *rclp->tail = rsclp->head;
rsclp->head = rclp->head; WRITE_ONCE(rsclp->head, rclp->head);
for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++) for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++)
if (&rsclp->head == rsclp->tails[i]) if (&rsclp->head == rsclp->tails[i])
rsclp->tails[i] = rclp->tail; WRITE_ONCE(rsclp->tails[i], rclp->tail);
else else
break; break;
rclp->head = NULL; rclp->head = NULL;
@ -286,8 +411,8 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
{ {
if (!rclp->head) if (!rclp->head)
return; /* Nothing to do. */ return; /* Nothing to do. */
*rsclp->tails[RCU_NEXT_TAIL] = rclp->head; WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
rsclp->tails[RCU_NEXT_TAIL] = rclp->tail; WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
rclp->head = NULL; rclp->head = NULL;
rclp->tail = &rclp->head; rclp->tail = &rclp->head;
} }
@ -311,7 +436,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
for (i = RCU_WAIT_TAIL; i < RCU_NEXT_TAIL; i++) { for (i = RCU_WAIT_TAIL; i < RCU_NEXT_TAIL; i++) {
if (ULONG_CMP_LT(seq, rsclp->gp_seq[i])) if (ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
break; break;
rsclp->tails[RCU_DONE_TAIL] = rsclp->tails[i]; WRITE_ONCE(rsclp->tails[RCU_DONE_TAIL], rsclp->tails[i]);
} }
/* If no callbacks moved, nothing more need be done. */ /* If no callbacks moved, nothing more need be done. */
@ -320,7 +445,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
/* Clean up tail pointers that might have been misordered above. */ /* Clean up tail pointers that might have been misordered above. */
for (j = RCU_WAIT_TAIL; j < i; j++) for (j = RCU_WAIT_TAIL; j < i; j++)
rsclp->tails[j] = rsclp->tails[RCU_DONE_TAIL]; WRITE_ONCE(rsclp->tails[j], rsclp->tails[RCU_DONE_TAIL]);
/* /*
* Callbacks moved, so clean up the misordered ->tails[] pointers * Callbacks moved, so clean up the misordered ->tails[] pointers
@ -331,7 +456,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
for (j = RCU_WAIT_TAIL; i < RCU_NEXT_TAIL; i++, j++) { for (j = RCU_WAIT_TAIL; i < RCU_NEXT_TAIL; i++, j++) {
if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL]) if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])
break; /* No more callbacks. */ break; /* No more callbacks. */
rsclp->tails[j] = rsclp->tails[i]; WRITE_ONCE(rsclp->tails[j], rsclp->tails[i]);
rsclp->gp_seq[j] = rsclp->gp_seq[i]; rsclp->gp_seq[j] = rsclp->gp_seq[i];
} }
} }
@ -396,30 +521,12 @@ bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
* structure other than in the RCU_NEXT_TAIL segment. * structure other than in the RCU_NEXT_TAIL segment.
*/ */
for (; i < RCU_NEXT_TAIL; i++) { for (; i < RCU_NEXT_TAIL; i++) {
rsclp->tails[i] = rsclp->tails[RCU_NEXT_TAIL]; WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_NEXT_TAIL]);
rsclp->gp_seq[i] = seq; rsclp->gp_seq[i] = seq;
} }
return true; return true;
} }
/*
* Scan the specified rcu_segcblist structure for callbacks that need
* a grace period later than the one specified by "seq". We don't look
* at the RCU_DONE_TAIL or RCU_NEXT_TAIL segments because they don't
* have a grace-period sequence number.
*/
bool rcu_segcblist_future_gp_needed(struct rcu_segcblist *rsclp,
unsigned long seq)
{
int i;
for (i = RCU_WAIT_TAIL; i < RCU_NEXT_TAIL; i++)
if (rsclp->tails[i - 1] != rsclp->tails[i] &&
ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
return true;
return false;
}
/* /*
* Merge the source rcu_segcblist structure into the destination * Merge the source rcu_segcblist structure into the destination
* rcu_segcblist structure, then initialize the source. Any pending * rcu_segcblist structure, then initialize the source. Any pending

View File

@ -22,6 +22,12 @@
#include <linux/rcu_segcblist.h> #include <linux/rcu_segcblist.h>
/* Return number of callbacks in the specified callback list. */
static inline long rcu_cblist_n_cbs(struct rcu_cblist *rclp)
{
return READ_ONCE(rclp->len);
}
/* /*
* Account for the fact that a previously dequeued callback turned out * Account for the fact that a previously dequeued callback turned out
* to be marked as lazy. * to be marked as lazy.
@ -32,6 +38,10 @@ static inline void rcu_cblist_dequeued_lazy(struct rcu_cblist *rclp)
} }
void rcu_cblist_init(struct rcu_cblist *rclp); void rcu_cblist_init(struct rcu_cblist *rclp);
void rcu_cblist_enqueue(struct rcu_cblist *rclp, struct rcu_head *rhp);
void rcu_cblist_flush_enqueue(struct rcu_cblist *drclp,
struct rcu_cblist *srclp,
struct rcu_head *rhp);
struct rcu_head *rcu_cblist_dequeue(struct rcu_cblist *rclp); struct rcu_head *rcu_cblist_dequeue(struct rcu_cblist *rclp);
/* /*
@ -49,13 +59,17 @@ struct rcu_head *rcu_cblist_dequeue(struct rcu_cblist *rclp);
*/ */
static inline bool rcu_segcblist_empty(struct rcu_segcblist *rsclp) static inline bool rcu_segcblist_empty(struct rcu_segcblist *rsclp)
{ {
return !rsclp->head; return !READ_ONCE(rsclp->head);
} }
/* Return number of callbacks in segmented callback list. */ /* Return number of callbacks in segmented callback list. */
static inline long rcu_segcblist_n_cbs(struct rcu_segcblist *rsclp) static inline long rcu_segcblist_n_cbs(struct rcu_segcblist *rsclp)
{ {
#ifdef CONFIG_RCU_NOCB_CPU
return atomic_long_read(&rsclp->len);
#else
return READ_ONCE(rsclp->len); return READ_ONCE(rsclp->len);
#endif
} }
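
The #ifdef above exists because ->len becomes an atomic_long_t when CONFIG_RCU_NOCB_CPU is set (no-CBs kthreads may update it remotely) and stays a plain long otherwise. The field layout this accessor assumes is roughly the following; treat it as an illustrative sketch of the v5.4 structure, with the exact field types being assumptions.

struct rcu_segcblist {
        struct rcu_head *head;
        struct rcu_head **tails[RCU_CBLIST_NSEGS];
        unsigned long gp_seq[RCU_CBLIST_NSEGS];
#ifdef CONFIG_RCU_NOCB_CPU
        atomic_long_t len;              /* May be updated from other CPUs. */
#else
        long len;                       /* Updated only by the owning CPU. */
#endif
        long len_lazy;
        u8 enabled;
        u8 offloaded;
};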
/* Return number of lazy callbacks in segmented callback list. */ /* Return number of lazy callbacks in segmented callback list. */
@ -67,16 +81,22 @@ static inline long rcu_segcblist_n_lazy_cbs(struct rcu_segcblist *rsclp)
/* Return number of lazy callbacks in segmented callback list. */ /* Return number of lazy callbacks in segmented callback list. */
static inline long rcu_segcblist_n_nonlazy_cbs(struct rcu_segcblist *rsclp) static inline long rcu_segcblist_n_nonlazy_cbs(struct rcu_segcblist *rsclp)
{ {
return rsclp->len - rsclp->len_lazy; return rcu_segcblist_n_cbs(rsclp) - rsclp->len_lazy;
} }
/* /*
* Is the specified rcu_segcblist enabled, for example, not corresponding * Is the specified rcu_segcblist enabled, for example, not corresponding
* to an offline or callback-offloaded CPU? * to an offline CPU?
*/ */
static inline bool rcu_segcblist_is_enabled(struct rcu_segcblist *rsclp) static inline bool rcu_segcblist_is_enabled(struct rcu_segcblist *rsclp)
{ {
return !!rsclp->tails[RCU_NEXT_TAIL]; return rsclp->enabled;
}
/* Is the specified rcu_segcblist offloaded? */
static inline bool rcu_segcblist_is_offloaded(struct rcu_segcblist *rsclp)
{
return rsclp->offloaded;
} }
/* /*
@ -86,36 +106,18 @@ static inline bool rcu_segcblist_is_enabled(struct rcu_segcblist *rsclp)
*/ */
static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp, int seg) static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp, int seg)
{ {
return !*rsclp->tails[seg]; return !READ_ONCE(*READ_ONCE(rsclp->tails[seg]));
}
/*
* Interim function to return rcu_segcblist head pointer. Longer term, the
* rcu_segcblist will be used more pervasively, removing the need for this
* function.
*/
static inline struct rcu_head *rcu_segcblist_head(struct rcu_segcblist *rsclp)
{
return rsclp->head;
}
/*
* Interim function to return rcu_segcblist head pointer. Longer term, the
* rcu_segcblist will be used more pervasively, removing the need for this
* function.
*/
static inline struct rcu_head **rcu_segcblist_tail(struct rcu_segcblist *rsclp)
{
WARN_ON_ONCE(rcu_segcblist_empty(rsclp));
return rsclp->tails[RCU_NEXT_TAIL];
} }
void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
void rcu_segcblist_init(struct rcu_segcblist *rsclp); void rcu_segcblist_init(struct rcu_segcblist *rsclp);
void rcu_segcblist_disable(struct rcu_segcblist *rsclp); void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
void rcu_segcblist_offload(struct rcu_segcblist *rsclp);
bool rcu_segcblist_ready_cbs(struct rcu_segcblist *rsclp); bool rcu_segcblist_ready_cbs(struct rcu_segcblist *rsclp);
bool rcu_segcblist_pend_cbs(struct rcu_segcblist *rsclp); bool rcu_segcblist_pend_cbs(struct rcu_segcblist *rsclp);
struct rcu_head *rcu_segcblist_first_cb(struct rcu_segcblist *rsclp); struct rcu_head *rcu_segcblist_first_cb(struct rcu_segcblist *rsclp);
struct rcu_head *rcu_segcblist_first_pend_cb(struct rcu_segcblist *rsclp); struct rcu_head *rcu_segcblist_first_pend_cb(struct rcu_segcblist *rsclp);
bool rcu_segcblist_nextgp(struct rcu_segcblist *rsclp, unsigned long *lp);
void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp, void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
struct rcu_head *rhp, bool lazy); struct rcu_head *rhp, bool lazy);
bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp, bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
@ -134,7 +136,5 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
struct rcu_cblist *rclp); struct rcu_cblist *rclp);
void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq); void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq);
bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq); bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq);
bool rcu_segcblist_future_gp_needed(struct rcu_segcblist *rsclp,
unsigned long seq);
void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp, void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
struct rcu_segcblist *src_rsclp); struct rcu_segcblist *src_rsclp);

View File

@ -19,6 +19,9 @@
* *
* Authors: Paul E. McKenney <paulmck@us.ibm.com> * Authors: Paul E. McKenney <paulmck@us.ibm.com>
*/ */
#define pr_fmt(fmt) fmt
#include <linux/types.h> #include <linux/types.h>
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/init.h> #include <linux/init.h>
@ -61,20 +64,45 @@ MODULE_AUTHOR("Paul E. McKenney <paulmck@linux.vnet.ibm.com>");
#define VERBOSE_PERFOUT_ERRSTRING(s) \ #define VERBOSE_PERFOUT_ERRSTRING(s) \
do { if (verbose) pr_alert("%s" PERF_FLAG "!!! %s\n", perf_type, s); } while (0) do { if (verbose) pr_alert("%s" PERF_FLAG "!!! %s\n", perf_type, s); } while (0)
/*
* The intended use cases for the nreaders and nwriters module parameters
* are as follows:
*
* 1. Specify only the nr_cpus kernel boot parameter. This will
* set both nreaders and nwriters to the value specified by
* nr_cpus for a mixed reader/writer test.
*
* 2. Specify the nr_cpus kernel boot parameter, but set
* rcuperf.nreaders to zero. This will set nwriters to the
* value specified by nr_cpus for an update-only test.
*
* 3. Specify the nr_cpus kernel boot parameter, but set
* rcuperf.nwriters to zero. This will set nreaders to the
* value specified by nr_cpus for a read-only test.
*
* Various other use cases may of course be specified.
*/
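
The defaulting that makes those use cases work can be summarized by a helper along the following lines; rcuperf uses a helper of roughly this shape, but the body and the sketch_compute_real() name here are illustrative rather than a copy of the patched file.

/* Sketch: map an nreaders/nwriters module parameter onto a real thread count. */
static int sketch_compute_real(int n)
{
        int nr;

        if (n >= 0)
                return n;                       /* Explicit request, e.g. nreaders=0. */
        nr = num_online_cpus() + 1 + n;         /* -1 means one thread per CPU. */
        return nr > 0 ? nr : 1;
}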
#ifdef MODULE
# define RCUPERF_SHUTDOWN 0
#else
# define RCUPERF_SHUTDOWN 1
#endif
torture_param(bool, gp_async, false, "Use asynchronous GP wait primitives"); torture_param(bool, gp_async, false, "Use asynchronous GP wait primitives");
torture_param(int, gp_async_max, 1000, "Max # outstanding waits per reader"); torture_param(int, gp_async_max, 1000, "Max # outstanding waits per reader");
torture_param(bool, gp_exp, false, "Use expedited GP wait primitives"); torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
torture_param(int, holdoff, 10, "Holdoff time before test start (s)"); torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
torture_param(int, nreaders, 0, "Number of RCU reader threads"); torture_param(int, nreaders, -1, "Number of RCU reader threads");
torture_param(int, nwriters, -1, "Number of RCU updater threads"); torture_param(int, nwriters, -1, "Number of RCU updater threads");
torture_param(bool, shutdown, !IS_ENABLED(MODULE), torture_param(bool, shutdown, RCUPERF_SHUTDOWN,
"Shutdown at end of performance tests."); "Shutdown at end of performance tests.");
torture_param(bool, verbose, true, "Enable verbose debugging printk()s"); torture_param(int, verbose, 1, "Enable verbose debugging printk()s");
torture_param(int, writer_holdoff, 0, "Holdoff (us) between GPs, zero to disable"); torture_param(int, writer_holdoff, 0, "Holdoff (us) between GPs, zero to disable");
static char *perf_type = "rcu"; static char *perf_type = "rcu";
module_param(perf_type, charp, 0444); module_param(perf_type, charp, 0444);
MODULE_PARM_DESC(perf_type, "Type of RCU to performance-test (rcu, rcu_bh, ...)"); MODULE_PARM_DESC(perf_type, "Type of RCU to performance-test (rcu, srcu, ...)");
static int nrealreaders; static int nrealreaders;
static int nrealwriters; static int nrealwriters;
@ -106,10 +134,6 @@ static int rcu_perf_writer_state;
#define MAX_MEAS 10000 #define MAX_MEAS 10000
#define MIN_MEAS 100 #define MIN_MEAS 100
static int perf_runnable = IS_ENABLED(MODULE);
module_param(perf_runnable, int, 0444);
MODULE_PARM_DESC(perf_runnable, "Start rcuperf at boot");
/* /*
* Operations vector for selecting different types of tests. * Operations vector for selecting different types of tests.
*/ */
@ -120,8 +144,8 @@ struct rcu_perf_ops {
void (*cleanup)(void); void (*cleanup)(void);
int (*readlock)(void); int (*readlock)(void);
void (*readunlock)(int idx); void (*readunlock)(int idx);
unsigned long (*started)(void); unsigned long (*get_gp_seq)(void);
unsigned long (*completed)(void); unsigned long (*gp_diff)(unsigned long new, unsigned long old);
unsigned long (*exp_completed)(void); unsigned long (*exp_completed)(void);
void (*async)(struct rcu_head *head, rcu_callback_t func); void (*async)(struct rcu_head *head, rcu_callback_t func);
void (*gp_barrier)(void); void (*gp_barrier)(void);
@ -161,8 +185,8 @@ static struct rcu_perf_ops rcu_ops = {
.init = rcu_sync_perf_init, .init = rcu_sync_perf_init,
.readlock = rcu_perf_read_lock, .readlock = rcu_perf_read_lock,
.readunlock = rcu_perf_read_unlock, .readunlock = rcu_perf_read_unlock,
.started = rcu_batches_started, .get_gp_seq = rcu_get_gp_seq,
.completed = rcu_batches_completed, .gp_diff = rcu_seq_diff,
.exp_completed = rcu_exp_batches_completed, .exp_completed = rcu_exp_batches_completed,
.async = call_rcu, .async = call_rcu,
.gp_barrier = rcu_barrier, .gp_barrier = rcu_barrier,
@ -171,36 +195,6 @@ static struct rcu_perf_ops rcu_ops = {
.name = "rcu" .name = "rcu"
}; };
/*
* Definitions for rcu_bh perf testing.
*/
static int rcu_bh_perf_read_lock(void) __acquires(RCU_BH)
{
rcu_read_lock_bh();
return 0;
}
static void rcu_bh_perf_read_unlock(int idx) __releases(RCU_BH)
{
rcu_read_unlock_bh();
}
static struct rcu_perf_ops rcu_bh_ops = {
.ptype = RCU_BH_FLAVOR,
.init = rcu_sync_perf_init,
.readlock = rcu_bh_perf_read_lock,
.readunlock = rcu_bh_perf_read_unlock,
.started = rcu_batches_started_bh,
.completed = rcu_batches_completed_bh,
.exp_completed = rcu_exp_batches_completed_sched,
.async = call_rcu_bh,
.gp_barrier = rcu_barrier_bh,
.sync = synchronize_rcu_bh,
.exp_sync = synchronize_rcu_bh_expedited,
.name = "rcu_bh"
};
/* /*
* Definitions for srcu perf testing. * Definitions for srcu perf testing.
*/ */
@ -248,8 +242,8 @@ static struct rcu_perf_ops srcu_ops = {
.init = rcu_sync_perf_init, .init = rcu_sync_perf_init,
.readlock = srcu_perf_read_lock, .readlock = srcu_perf_read_lock,
.readunlock = srcu_perf_read_unlock, .readunlock = srcu_perf_read_unlock,
.started = NULL, .get_gp_seq = srcu_perf_completed,
.completed = srcu_perf_completed, .gp_diff = rcu_seq_diff,
.exp_completed = srcu_perf_completed, .exp_completed = srcu_perf_completed,
.async = srcu_call_rcu, .async = srcu_call_rcu,
.gp_barrier = srcu_rcu_barrier, .gp_barrier = srcu_rcu_barrier,
@ -277,8 +271,8 @@ static struct rcu_perf_ops srcud_ops = {
.cleanup = srcu_sync_perf_cleanup, .cleanup = srcu_sync_perf_cleanup,
.readlock = srcu_perf_read_lock, .readlock = srcu_perf_read_lock,
.readunlock = srcu_perf_read_unlock, .readunlock = srcu_perf_read_unlock,
.started = NULL, .get_gp_seq = srcu_perf_completed,
.completed = srcu_perf_completed, .gp_diff = rcu_seq_diff,
.exp_completed = srcu_perf_completed, .exp_completed = srcu_perf_completed,
.async = srcu_call_rcu, .async = srcu_call_rcu,
.gp_barrier = srcu_rcu_barrier, .gp_barrier = srcu_rcu_barrier,
@ -287,36 +281,6 @@ static struct rcu_perf_ops srcud_ops = {
.name = "srcud" .name = "srcud"
}; };
/*
* Definitions for sched perf testing.
*/
static int sched_perf_read_lock(void)
{
preempt_disable();
return 0;
}
static void sched_perf_read_unlock(int idx)
{
preempt_enable();
}
static struct rcu_perf_ops sched_ops = {
.ptype = RCU_SCHED_FLAVOR,
.init = rcu_sync_perf_init,
.readlock = sched_perf_read_lock,
.readunlock = sched_perf_read_unlock,
.started = rcu_batches_started_sched,
.completed = rcu_batches_completed_sched,
.exp_completed = rcu_exp_batches_completed_sched,
.async = call_rcu_sched,
.gp_barrier = rcu_barrier_sched,
.sync = synchronize_sched,
.exp_sync = synchronize_sched_expedited,
.name = "sched"
};
/* /*
* Definitions for RCU-tasks perf testing. * Definitions for RCU-tasks perf testing.
*/ */
@ -335,8 +299,8 @@ static struct rcu_perf_ops tasks_ops = {
.init = rcu_sync_perf_init, .init = rcu_sync_perf_init,
.readlock = tasks_perf_read_lock, .readlock = tasks_perf_read_lock,
.readunlock = tasks_perf_read_unlock, .readunlock = tasks_perf_read_unlock,
.started = rcu_no_completed, .get_gp_seq = rcu_no_completed,
.completed = rcu_no_completed, .gp_diff = rcu_seq_diff,
.async = call_rcu_tasks, .async = call_rcu_tasks,
.gp_barrier = rcu_barrier_tasks, .gp_barrier = rcu_barrier_tasks,
.sync = synchronize_rcu_tasks, .sync = synchronize_rcu_tasks,
@ -344,9 +308,11 @@ static struct rcu_perf_ops tasks_ops = {
.name = "tasks" .name = "tasks"
}; };
static bool __maybe_unused torturing_tasks(void) static unsigned long rcuperf_seq_diff(unsigned long new, unsigned long old)
{ {
return cur_ops == &tasks_ops; if (!cur_ops->gp_diff)
return new - old;
return cur_ops->gp_diff(new, old);
} }
/* /*
@ -354,7 +320,7 @@ static bool __maybe_unused torturing_tasks(void)
*/ */
static void rcu_perf_wait_shutdown(void) static void rcu_perf_wait_shutdown(void)
{ {
cond_resched_rcu_qs(); cond_resched_tasks_rcu_qs();
if (atomic_read(&n_rcu_perf_writer_finished) < nrealwriters) if (atomic_read(&n_rcu_perf_writer_finished) < nrealwriters)
return; return;
while (!torture_must_stop()) while (!torture_must_stop())
@ -422,6 +388,14 @@ rcu_perf_writer(void *arg)
if (holdoff) if (holdoff)
schedule_timeout_uninterruptible(holdoff * HZ); schedule_timeout_uninterruptible(holdoff * HZ);
/*
	 * Wait until rcu_end_inkernel_boot() is called, so that normal GP
	 * tests are not unconditionally run as expedited grace periods.
* The system_state test is approximate, but works well in practice.
*/
while (!gp_exp && system_state != SYSTEM_RUNNING)
schedule_timeout_uninterruptible(1);
t = ktime_get_mono_fast_ns(); t = ktime_get_mono_fast_ns();
if (atomic_inc_return(&n_rcu_perf_writer_started) >= nrealwriters) { if (atomic_inc_return(&n_rcu_perf_writer_started) >= nrealwriters) {
t_rcu_perf_writer_started = t; t_rcu_perf_writer_started = t;
@ -429,8 +403,7 @@ rcu_perf_writer(void *arg)
b_rcu_perf_writer_started = b_rcu_perf_writer_started =
cur_ops->exp_completed() / 2; cur_ops->exp_completed() / 2;
} else { } else {
b_rcu_perf_writer_started = b_rcu_perf_writer_started = cur_ops->get_gp_seq();
cur_ops->completed();
} }
} }
@ -487,7 +460,7 @@ retry:
cur_ops->exp_completed() / 2; cur_ops->exp_completed() / 2;
} else { } else {
b_rcu_perf_writer_finished = b_rcu_perf_writer_finished =
cur_ops->completed(); cur_ops->get_gp_seq();
} }
if (shutdown) { if (shutdown) {
smp_mb(); /* Assign before wake. */ smp_mb(); /* Assign before wake. */
@ -512,7 +485,7 @@ retry:
return 0; return 0;
} }
static inline void static void
rcu_perf_print_module_parms(struct rcu_perf_ops *cur_ops, const char *tag) rcu_perf_print_module_parms(struct rcu_perf_ops *cur_ops, const char *tag)
{ {
pr_alert("%s" PERF_FLAG pr_alert("%s" PERF_FLAG
@ -571,8 +544,8 @@ rcu_perf_cleanup(void)
t_rcu_perf_writer_finished - t_rcu_perf_writer_finished -
t_rcu_perf_writer_started, t_rcu_perf_writer_started,
ngps, ngps,
b_rcu_perf_writer_finished - rcuperf_seq_diff(b_rcu_perf_writer_finished,
b_rcu_perf_writer_started); b_rcu_perf_writer_started));
for (i = 0; i < nrealwriters; i++) { for (i = 0; i < nrealwriters; i++) {
if (!writer_durations) if (!writer_durations)
break; break;
@ -596,7 +569,7 @@ rcu_perf_cleanup(void)
kfree(writer_n_durations); kfree(writer_n_durations);
} }
/* Do flavor-specific cleanup operations. */ /* Do torture-type-specific cleanup operations. */
if (cur_ops->cleanup != NULL) if (cur_ops->cleanup != NULL)
cur_ops->cleanup(); cur_ops->cleanup();
@ -646,11 +619,10 @@ rcu_perf_init(void)
long i; long i;
int firsterr = 0; int firsterr = 0;
static struct rcu_perf_ops *perf_ops[] = { static struct rcu_perf_ops *perf_ops[] = {
&rcu_ops, &rcu_bh_ops, &srcu_ops, &srcud_ops, &sched_ops, &rcu_ops, &srcu_ops, &srcud_ops, &tasks_ops,
&tasks_ops,
}; };
if (!torture_init_begin(perf_type, verbose, &perf_runnable)) if (!torture_init_begin(perf_type, verbose))
return -EBUSY; return -EBUSY;
/* Process args and tell the world that the perf'er is on the job. */ /* Process args and tell the world that the perf'er is on the job. */
@ -660,12 +632,12 @@ rcu_perf_init(void)
break; break;
} }
if (i == ARRAY_SIZE(perf_ops)) { if (i == ARRAY_SIZE(perf_ops)) {
pr_alert("rcu-perf: invalid perf type: \"%s\"\n", pr_alert("rcu-perf: invalid perf type: \"%s\"\n", perf_type);
perf_type);
pr_alert("rcu-perf types:"); pr_alert("rcu-perf types:");
for (i = 0; i < ARRAY_SIZE(perf_ops); i++) for (i = 0; i < ARRAY_SIZE(perf_ops); i++)
pr_alert(" %s", perf_ops[i]->name); pr_cont(" %s", perf_ops[i]->name);
pr_alert("\n"); pr_cont("\n");
WARN_ON(!IS_MODULE(CONFIG_RCU_PERF_TEST));
firsterr = -EINVAL; firsterr = -EINVAL;
cur_ops = NULL; cur_ops = NULL;
goto unwind; goto unwind;

File diff suppressed because it is too large

View File

@ -34,30 +34,33 @@
#include "rcu.h" #include "rcu.h"
int rcu_scheduler_active __read_mostly; int rcu_scheduler_active __read_mostly;
static LIST_HEAD(srcu_boot_list);
static bool srcu_init_done;
static int init_srcu_struct_fields(struct srcu_struct *sp) static int init_srcu_struct_fields(struct srcu_struct *ssp)
{ {
sp->srcu_lock_nesting[0] = 0; ssp->srcu_lock_nesting[0] = 0;
sp->srcu_lock_nesting[1] = 0; ssp->srcu_lock_nesting[1] = 0;
init_swait_queue_head(&sp->srcu_wq); init_swait_queue_head(&ssp->srcu_wq);
sp->srcu_cb_head = NULL; ssp->srcu_cb_head = NULL;
sp->srcu_cb_tail = &sp->srcu_cb_head; ssp->srcu_cb_tail = &ssp->srcu_cb_head;
sp->srcu_gp_running = false; ssp->srcu_gp_running = false;
sp->srcu_gp_waiting = false; ssp->srcu_gp_waiting = false;
sp->srcu_idx = 0; ssp->srcu_idx = 0;
INIT_WORK(&sp->srcu_work, srcu_drive_gp); INIT_WORK(&ssp->srcu_work, srcu_drive_gp);
INIT_LIST_HEAD(&ssp->srcu_work.entry);
return 0; return 0;
} }
#ifdef CONFIG_DEBUG_LOCK_ALLOC #ifdef CONFIG_DEBUG_LOCK_ALLOC
int __init_srcu_struct(struct srcu_struct *sp, const char *name, int __init_srcu_struct(struct srcu_struct *ssp, const char *name,
struct lock_class_key *key) struct lock_class_key *key)
{ {
/* Don't re-initialize a lock while it is held. */ /* Don't re-initialize a lock while it is held. */
debug_check_no_locks_freed((void *)sp, sizeof(*sp)); debug_check_no_locks_freed((void *)ssp, sizeof(*ssp));
lockdep_init_map(&sp->dep_map, name, key, 0); lockdep_init_map(&ssp->dep_map, name, key, 0);
return init_srcu_struct_fields(sp); return init_srcu_struct_fields(ssp);
} }
EXPORT_SYMBOL_GPL(__init_srcu_struct); EXPORT_SYMBOL_GPL(__init_srcu_struct);
@ -65,15 +68,15 @@ EXPORT_SYMBOL_GPL(__init_srcu_struct);
/* /*
* init_srcu_struct - initialize a sleep-RCU structure * init_srcu_struct - initialize a sleep-RCU structure
* @sp: structure to initialize. * @ssp: structure to initialize.
* *
* Must invoke this on a given srcu_struct before passing that srcu_struct * Must invoke this on a given srcu_struct before passing that srcu_struct
* to any other function. Each srcu_struct represents a separate domain * to any other function. Each srcu_struct represents a separate domain
* of SRCU protection. * of SRCU protection.
*/ */
int init_srcu_struct(struct srcu_struct *sp) int init_srcu_struct(struct srcu_struct *ssp)
{ {
return init_srcu_struct_fields(sp); return init_srcu_struct_fields(ssp);
} }
EXPORT_SYMBOL_GPL(init_srcu_struct); EXPORT_SYMBOL_GPL(init_srcu_struct);
@ -81,19 +84,19 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
/* /*
* cleanup_srcu_struct - deconstruct a sleep-RCU structure * cleanup_srcu_struct - deconstruct a sleep-RCU structure
* @sp: structure to clean up. * @ssp: structure to clean up.
* *
* Must invoke this after you are finished using a given srcu_struct that * Must invoke this after you are finished using a given srcu_struct that
* was initialized via init_srcu_struct(), else you leak memory. * was initialized via init_srcu_struct(), else you leak memory.
*/ */
void cleanup_srcu_struct(struct srcu_struct *sp) void cleanup_srcu_struct(struct srcu_struct *ssp)
{ {
WARN_ON(sp->srcu_lock_nesting[0] || sp->srcu_lock_nesting[1]); WARN_ON(ssp->srcu_lock_nesting[0] || ssp->srcu_lock_nesting[1]);
flush_work(&sp->srcu_work); flush_work(&ssp->srcu_work);
WARN_ON(sp->srcu_gp_running); WARN_ON(ssp->srcu_gp_running);
WARN_ON(sp->srcu_gp_waiting); WARN_ON(ssp->srcu_gp_waiting);
WARN_ON(sp->srcu_cb_head); WARN_ON(ssp->srcu_cb_head);
WARN_ON(&sp->srcu_cb_head != sp->srcu_cb_tail); WARN_ON(&ssp->srcu_cb_head != ssp->srcu_cb_tail);
} }
EXPORT_SYMBOL_GPL(cleanup_srcu_struct); EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
@ -101,13 +104,13 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
* Removes the count for the old reader from the appropriate element of * Removes the count for the old reader from the appropriate element of
* the srcu_struct. * the srcu_struct.
*/ */
void __srcu_read_unlock(struct srcu_struct *sp, int idx) void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
{ {
int newval = sp->srcu_lock_nesting[idx] - 1; int newval = ssp->srcu_lock_nesting[idx] - 1;
WRITE_ONCE(sp->srcu_lock_nesting[idx], newval); WRITE_ONCE(ssp->srcu_lock_nesting[idx], newval);
if (!newval && READ_ONCE(sp->srcu_gp_waiting)) if (!newval && READ_ONCE(ssp->srcu_gp_waiting))
swake_up(&sp->srcu_wq); swake_up(&ssp->srcu_wq);
} }
EXPORT_SYMBOL_GPL(__srcu_read_unlock); EXPORT_SYMBOL_GPL(__srcu_read_unlock);
@ -121,24 +124,24 @@ void srcu_drive_gp(struct work_struct *wp)
int idx; int idx;
struct rcu_head *lh; struct rcu_head *lh;
struct rcu_head *rhp; struct rcu_head *rhp;
struct srcu_struct *sp; struct srcu_struct *ssp;
sp = container_of(wp, struct srcu_struct, srcu_work); ssp = container_of(wp, struct srcu_struct, srcu_work);
if (sp->srcu_gp_running || !READ_ONCE(sp->srcu_cb_head)) if (ssp->srcu_gp_running || !READ_ONCE(ssp->srcu_cb_head))
return; /* Already running or nothing to do. */ return; /* Already running or nothing to do. */
/* Remove recently arrived callbacks and wait for readers. */ /* Remove recently arrived callbacks and wait for readers. */
WRITE_ONCE(sp->srcu_gp_running, true); WRITE_ONCE(ssp->srcu_gp_running, true);
local_irq_disable(); local_irq_disable();
lh = sp->srcu_cb_head; lh = ssp->srcu_cb_head;
sp->srcu_cb_head = NULL; ssp->srcu_cb_head = NULL;
sp->srcu_cb_tail = &sp->srcu_cb_head; ssp->srcu_cb_tail = &ssp->srcu_cb_head;
local_irq_enable(); local_irq_enable();
idx = sp->srcu_idx; idx = ssp->srcu_idx;
WRITE_ONCE(sp->srcu_idx, !sp->srcu_idx); WRITE_ONCE(ssp->srcu_idx, !ssp->srcu_idx);
WRITE_ONCE(sp->srcu_gp_waiting, true); /* srcu_read_unlock() wakes! */ WRITE_ONCE(ssp->srcu_gp_waiting, true); /* srcu_read_unlock() wakes! */
swait_event(sp->srcu_wq, !READ_ONCE(sp->srcu_lock_nesting[idx])); swait_event(ssp->srcu_wq, !READ_ONCE(ssp->srcu_lock_nesting[idx]));
WRITE_ONCE(sp->srcu_gp_waiting, false); /* srcu_read_unlock() cheap. */ WRITE_ONCE(ssp->srcu_gp_waiting, false); /* srcu_read_unlock() cheap. */
/* Invoke the callbacks we removed above. */ /* Invoke the callbacks we removed above. */
while (lh) { while (lh) {
@ -155,9 +158,9 @@ void srcu_drive_gp(struct work_struct *wp)
* at interrupt level, but the ->srcu_gp_running checks will * at interrupt level, but the ->srcu_gp_running checks will
* straighten that out. * straighten that out.
*/ */
WRITE_ONCE(sp->srcu_gp_running, false); WRITE_ONCE(ssp->srcu_gp_running, false);
if (READ_ONCE(sp->srcu_cb_head)) if (READ_ONCE(ssp->srcu_cb_head))
schedule_work(&sp->srcu_work); schedule_work(&ssp->srcu_work);
} }
EXPORT_SYMBOL_GPL(srcu_drive_gp); EXPORT_SYMBOL_GPL(srcu_drive_gp);
@ -165,7 +168,7 @@ EXPORT_SYMBOL_GPL(srcu_drive_gp);
* Enqueue an SRCU callback on the specified srcu_struct structure, * Enqueue an SRCU callback on the specified srcu_struct structure,
* initiating grace-period processing if it is not already running. * initiating grace-period processing if it is not already running.
*/ */
void call_srcu(struct srcu_struct *sp, struct rcu_head *rhp, void call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
rcu_callback_t func) rcu_callback_t func)
{ {
unsigned long flags; unsigned long flags;
@ -173,24 +176,28 @@ void call_srcu(struct srcu_struct *sp, struct rcu_head *rhp,
rhp->func = func; rhp->func = func;
rhp->next = NULL; rhp->next = NULL;
local_irq_save(flags); local_irq_save(flags);
*sp->srcu_cb_tail = rhp; *ssp->srcu_cb_tail = rhp;
sp->srcu_cb_tail = &rhp->next; ssp->srcu_cb_tail = &rhp->next;
local_irq_restore(flags); local_irq_restore(flags);
if (!READ_ONCE(sp->srcu_gp_running)) if (!READ_ONCE(ssp->srcu_gp_running)) {
schedule_work(&sp->srcu_work); if (likely(srcu_init_done))
schedule_work(&ssp->srcu_work);
else if (list_empty(&ssp->srcu_work.entry))
list_add(&ssp->srcu_work.entry, &srcu_boot_list);
}
} }
EXPORT_SYMBOL_GPL(call_srcu); EXPORT_SYMBOL_GPL(call_srcu);
/* /*
* synchronize_srcu - wait for prior SRCU read-side critical-section completion * synchronize_srcu - wait for prior SRCU read-side critical-section completion
*/ */
void synchronize_srcu(struct srcu_struct *sp) void synchronize_srcu(struct srcu_struct *ssp)
{ {
struct rcu_synchronize rs; struct rcu_synchronize rs;
init_rcu_head_on_stack(&rs.head); init_rcu_head_on_stack(&rs.head);
init_completion(&rs.completion); init_completion(&rs.completion);
call_srcu(sp, &rs.head, wakeme_after_rcu); call_srcu(ssp, &rs.head, wakeme_after_rcu);
wait_for_completion(&rs.completion); wait_for_completion(&rs.completion);
destroy_rcu_head_on_stack(&rs.head); destroy_rcu_head_on_stack(&rs.head);
} }
@ -201,3 +208,21 @@ void __init rcu_scheduler_starting(void)
{ {
rcu_scheduler_active = RCU_SCHEDULER_RUNNING; rcu_scheduler_active = RCU_SCHEDULER_RUNNING;
} }
/*
* Queue work for srcu_struct structures with early boot callbacks.
* The work won't actually execute until the workqueue initialization
* phase that takes place after the scheduler starts.
*/
void __init srcu_init(void)
{
struct srcu_struct *ssp;
srcu_init_done = true;
while (!list_empty(&srcu_boot_list)) {
ssp = list_first_entry(&srcu_boot_list,
struct srcu_struct, srcu_work.entry);
list_del_init(&ssp->srcu_work.entry);
schedule_work(&ssp->srcu_work);
}
}
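
A hedged sketch of the early-boot situation this list handles: a call_srcu() issued before srcu_init() runs cannot rely on workqueues, so its srcu_struct is parked on srcu_boot_list and the work is scheduled later. The SRCU domain, callback, and function names below are illustrative assumptions.

/* Sketch: an SRCU user assumed to run before srcu_init(). */
DEFINE_STATIC_SRCU(sketch_srcu);

static struct rcu_head sketch_rh;

static void sketch_cb(struct rcu_head *rhp)
{
        pr_info("sketch: SRCU grace period elapsed\n");
}

static void __init sketch_early_boot_user(void)
{
        /*
         * If invoked before srcu_init(), the srcu_struct's work item is
         * added to srcu_boot_list instead of a workqueue; srcu_init()
         * later drains the list and calls schedule_work() for it.
         */
        call_srcu(&sketch_srcu, &sketch_rh, sketch_cb);
}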

File diff suppressed because it is too large

View File

@ -23,65 +23,18 @@
#include <linux/rcu_sync.h> #include <linux/rcu_sync.h>
#include <linux/sched.h> #include <linux/sched.h>
#ifdef CONFIG_PROVE_RCU enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };
#define __INIT_HELD(func) .held = func,
#else
#define __INIT_HELD(func)
#endif
static const struct {
void (*sync)(void);
void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
void (*wait)(void);
#ifdef CONFIG_PROVE_RCU
int (*held)(void);
#endif
} gp_ops[] = {
[RCU_SYNC] = {
.sync = synchronize_rcu,
.call = call_rcu,
.wait = rcu_barrier,
__INIT_HELD(rcu_read_lock_held)
},
[RCU_SCHED_SYNC] = {
.sync = synchronize_sched,
.call = call_rcu_sched,
.wait = rcu_barrier_sched,
__INIT_HELD(rcu_read_lock_sched_held)
},
[RCU_BH_SYNC] = {
.sync = synchronize_rcu_bh,
.call = call_rcu_bh,
.wait = rcu_barrier_bh,
__INIT_HELD(rcu_read_lock_bh_held)
},
};
enum { GP_IDLE = 0, GP_PENDING, GP_PASSED };
enum { CB_IDLE = 0, CB_PENDING, CB_REPLAY };
#define rss_lock gp_wait.lock #define rss_lock gp_wait.lock
#ifdef CONFIG_PROVE_RCU
void rcu_sync_lockdep_assert(struct rcu_sync *rsp)
{
RCU_LOCKDEP_WARN(!gp_ops[rsp->gp_type].held(),
"suspicious rcu_sync_is_idle() usage");
}
EXPORT_SYMBOL_GPL(rcu_sync_lockdep_assert);
#endif
/** /**
* rcu_sync_init() - Initialize an rcu_sync structure * rcu_sync_init() - Initialize an rcu_sync structure
* @rsp: Pointer to rcu_sync structure to be initialized * @rsp: Pointer to rcu_sync structure to be initialized
* @type: Flavor of RCU with which to synchronize rcu_sync structure
*/ */
void rcu_sync_init(struct rcu_sync *rsp, enum rcu_sync_type type) void rcu_sync_init(struct rcu_sync *rsp)
{ {
memset(rsp, 0, sizeof(*rsp)); memset(rsp, 0, sizeof(*rsp));
init_waitqueue_head(&rsp->gp_wait); init_waitqueue_head(&rsp->gp_wait);
rsp->gp_type = type;
} }
/** /**
@ -99,6 +52,70 @@ void rcu_sync_enter_start(struct rcu_sync *rsp)
rsp->gp_state = GP_PASSED; rsp->gp_state = GP_PASSED;
} }
static void rcu_sync_func(struct rcu_head *rhp);
static void rcu_sync_call(struct rcu_sync *rsp)
{
call_rcu(&rsp->cb_head, rcu_sync_func);
}
/**
* rcu_sync_func() - Callback function managing reader access to fastpath
* @rhp: Pointer to rcu_head in rcu_sync structure to use for synchronization
*
 * This function is passed to call_rcu() by rcu_sync_enter() and
 * rcu_sync_exit(), so that it is invoked after a grace period following
 * that invocation of enter/exit.
*
 * If it is called by rcu_sync_enter(), it signals that all the readers were
 * switched onto the slow path.
*
* If it is called by rcu_sync_exit() it takes action based on events that
* have taken place in the meantime, so that closely spaced rcu_sync_enter()
* and rcu_sync_exit() pairs need not wait for a grace period.
*
* If another rcu_sync_enter() is invoked before the grace period
* ended, reset state to allow the next rcu_sync_exit() to let the
* readers back onto their fastpaths (after a grace period). If both
* another rcu_sync_enter() and its matching rcu_sync_exit() are invoked
* before the grace period ended, re-invoke call_rcu() on behalf of that
* rcu_sync_exit(). Otherwise, set all state back to idle so that readers
* can again use their fastpaths.
*/
static void rcu_sync_func(struct rcu_head *rhp)
{
struct rcu_sync *rsp = container_of(rhp, struct rcu_sync, cb_head);
unsigned long flags;
WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED);
spin_lock_irqsave(&rsp->rss_lock, flags);
if (rsp->gp_count) {
/*
* We're at least a GP after the GP_IDLE->GP_ENTER transition.
*/
WRITE_ONCE(rsp->gp_state, GP_PASSED);
wake_up_locked(&rsp->gp_wait);
} else if (rsp->gp_state == GP_REPLAY) {
/*
* A new rcu_sync_exit() has happened; requeue the callback to
* catch a later GP.
*/
WRITE_ONCE(rsp->gp_state, GP_EXIT);
rcu_sync_call(rsp);
} else {
/*
		 * We're at least a GP after the last rcu_sync_exit(); everybody
		 * will now have observed the write side critical section.
		 * Let 'em rip!
*/
WRITE_ONCE(rsp->gp_state, GP_IDLE);
}
spin_unlock_irqrestore(&rsp->rss_lock, flags);
}
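
For context, a hedged sketch of the enter/exit pairing this state machine serves, in the style of the percpu-rwsem fast path; the gate, counter, and function names are illustrative assumptions.

/* Sketch: readers take a fast path unless a writer holds the gate. */
static struct rcu_sync sketch_gate;
static atomic_t sketch_slow_count;

static void sketch_setup(void)
{
        rcu_sync_init(&sketch_gate);    /* Single-argument form after this patch. */
}

static void sketch_reader(void)
{
        rcu_read_lock();
        if (!rcu_sync_is_idle(&sketch_gate))
                atomic_inc(&sketch_slow_count); /* Writer active: slow path. */
        rcu_read_unlock();
}

static void sketch_writer(void)         /* May sleep. */
{
        rcu_sync_enter(&sketch_gate);   /* Force readers onto the slow path. */
        /* ... exclusive update ... */
        rcu_sync_exit(&sketch_gate);    /* Fast path restored after a grace period. */
}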
/** /**
* rcu_sync_enter() - Force readers onto slowpath * rcu_sync_enter() - Force readers onto slowpath
* @rsp: Pointer to rcu_sync structure to use for synchronization * @rsp: Pointer to rcu_sync structure to use for synchronization
@ -116,85 +133,43 @@ void rcu_sync_enter_start(struct rcu_sync *rsp)
*/ */
void rcu_sync_enter(struct rcu_sync *rsp) void rcu_sync_enter(struct rcu_sync *rsp)
{ {
bool need_wait, need_sync; int gp_state;
spin_lock_irq(&rsp->rss_lock); spin_lock_irq(&rsp->rss_lock);
need_wait = rsp->gp_count++; gp_state = rsp->gp_state;
need_sync = rsp->gp_state == GP_IDLE; if (gp_state == GP_IDLE) {
if (need_sync) WRITE_ONCE(rsp->gp_state, GP_ENTER);
rsp->gp_state = GP_PENDING; WARN_ON_ONCE(rsp->gp_count);
/*
* Note that we could simply do rcu_sync_call(rsp) here and
* avoid the "if (gp_state == GP_IDLE)" block below.
*
* However, synchronize_rcu() can be faster if rcu_expedited
* or rcu_blocking_is_gp() is true.
*
		 * Another reason is that we can't wait for an RCU callback if
		 * we are called at early boot time, but this shouldn't happen.
*/
}
rsp->gp_count++;
spin_unlock_irq(&rsp->rss_lock); spin_unlock_irq(&rsp->rss_lock);
BUG_ON(need_wait && need_sync); if (gp_state == GP_IDLE) {
if (need_sync) {
gp_ops[rsp->gp_type].sync();
rsp->gp_state = GP_PASSED;
wake_up_all(&rsp->gp_wait);
} else if (need_wait) {
wait_event(rsp->gp_wait, rsp->gp_state == GP_PASSED);
} else {
/* /*
* Possible when there's a pending CB from a rcu_sync_exit(). * See the comment above, this simply does the "synchronous"
* Nobody has yet been allowed the 'fast' path and thus we can * call_rcu(rcu_sync_func) which does GP_ENTER -> GP_PASSED.
* avoid doing any sync(). The callback will get 'dropped'.
*/ */
BUG_ON(rsp->gp_state != GP_PASSED); synchronize_rcu();
rcu_sync_func(&rsp->cb_head);
/* Not really needed, wait_event() would see GP_PASSED. */
return;
} }
wait_event(rsp->gp_wait, READ_ONCE(rsp->gp_state) >= GP_PASSED);
} }
/** /**
* rcu_sync_func() - Callback function managing reader access to fastpath * rcu_sync_exit() - Allow readers back onto fast path after grace period
* @rhp: Pointer to rcu_head in rcu_sync structure to use for synchronization
*
* This function is passed to one of the call_rcu() functions by
* rcu_sync_exit(), so that it is invoked after a grace period following the
* that invocation of rcu_sync_exit(). It takes action based on events that
* have taken place in the meantime, so that closely spaced rcu_sync_enter()
* and rcu_sync_exit() pairs need not wait for a grace period.
*
* If another rcu_sync_enter() is invoked before the grace period
* ended, reset state to allow the next rcu_sync_exit() to let the
* readers back onto their fastpaths (after a grace period). If both
* another rcu_sync_enter() and its matching rcu_sync_exit() are invoked
* before the grace period ended, re-invoke call_rcu() on behalf of that
* rcu_sync_exit(). Otherwise, set all state back to idle so that readers
* can again use their fastpaths.
*/
static void rcu_sync_func(struct rcu_head *rhp)
{
struct rcu_sync *rsp = container_of(rhp, struct rcu_sync, cb_head);
unsigned long flags;
BUG_ON(rsp->gp_state != GP_PASSED);
BUG_ON(rsp->cb_state == CB_IDLE);
spin_lock_irqsave(&rsp->rss_lock, flags);
if (rsp->gp_count) {
/*
* A new rcu_sync_begin() has happened; drop the callback.
*/
rsp->cb_state = CB_IDLE;
} else if (rsp->cb_state == CB_REPLAY) {
/*
* A new rcu_sync_exit() has happened; requeue the callback
* to catch a later GP.
*/
rsp->cb_state = CB_PENDING;
gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
} else {
/*
* We're at least a GP after rcu_sync_exit(); eveybody will now
* have observed the write side critical section. Let 'em rip!.
*/
rsp->cb_state = CB_IDLE;
rsp->gp_state = GP_IDLE;
}
spin_unlock_irqrestore(&rsp->rss_lock, flags);
}
/**
* rcu_sync_exit() - Allow readers back onto fast patch after grace period
* @rsp: Pointer to rcu_sync structure to use for synchronization * @rsp: Pointer to rcu_sync structure to use for synchronization
* *
* This function is used by updaters who have completed, and can therefore * This function is used by updaters who have completed, and can therefore
@ -205,13 +180,16 @@ static void rcu_sync_func(struct rcu_head *rhp)
*/ */
void rcu_sync_exit(struct rcu_sync *rsp) void rcu_sync_exit(struct rcu_sync *rsp)
{ {
WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
spin_lock_irq(&rsp->rss_lock); spin_lock_irq(&rsp->rss_lock);
if (!--rsp->gp_count) { if (!--rsp->gp_count) {
if (rsp->cb_state == CB_IDLE) { if (rsp->gp_state == GP_PASSED) {
rsp->cb_state = CB_PENDING; WRITE_ONCE(rsp->gp_state, GP_EXIT);
gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func); rcu_sync_call(rsp);
} else if (rsp->cb_state == CB_PENDING) { } else if (rsp->gp_state == GP_EXIT) {
rsp->cb_state = CB_REPLAY; WRITE_ONCE(rsp->gp_state, GP_REPLAY);
} }
} }
spin_unlock_irq(&rsp->rss_lock); spin_unlock_irq(&rsp->rss_lock);
@ -223,18 +201,19 @@ void rcu_sync_exit(struct rcu_sync *rsp)
*/ */
void rcu_sync_dtor(struct rcu_sync *rsp) void rcu_sync_dtor(struct rcu_sync *rsp)
{ {
int cb_state; int gp_state;
BUG_ON(rsp->gp_count); WARN_ON_ONCE(READ_ONCE(rsp->gp_count));
WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_PASSED);
spin_lock_irq(&rsp->rss_lock); spin_lock_irq(&rsp->rss_lock);
if (rsp->cb_state == CB_REPLAY) if (rsp->gp_state == GP_REPLAY)
rsp->cb_state = CB_PENDING; WRITE_ONCE(rsp->gp_state, GP_EXIT);
cb_state = rsp->cb_state; gp_state = rsp->gp_state;
spin_unlock_irq(&rsp->rss_lock); spin_unlock_irq(&rsp->rss_lock);
if (cb_state != CB_IDLE) { if (gp_state != GP_IDLE) {
gp_ops[rsp->gp_type].wait(); rcu_barrier();
BUG_ON(rsp->cb_state != CB_IDLE); WARN_ON_ONCE(rsp->gp_state != GP_IDLE);
} }
} }

View File

@ -46,69 +46,27 @@ struct rcu_ctrlblk {
}; };
/* Definition for rcupdate control block. */ /* Definition for rcupdate control block. */
static struct rcu_ctrlblk rcu_sched_ctrlblk = { static struct rcu_ctrlblk rcu_ctrlblk = {
.donetail = &rcu_sched_ctrlblk.rcucblist, .donetail = &rcu_ctrlblk.rcucblist,
.curtail = &rcu_sched_ctrlblk.rcucblist, .curtail = &rcu_ctrlblk.rcucblist,
}; };
static struct rcu_ctrlblk rcu_bh_ctrlblk = { void rcu_barrier(void)
.donetail = &rcu_bh_ctrlblk.rcucblist,
.curtail = &rcu_bh_ctrlblk.rcucblist,
};
void rcu_barrier_bh(void)
{ {
wait_rcu_gp(call_rcu_bh); wait_rcu_gp(call_rcu);
} }
EXPORT_SYMBOL(rcu_barrier_bh); EXPORT_SYMBOL(rcu_barrier);
void rcu_barrier_sched(void) /* Record an rcu quiescent state. */
void rcu_qs(void)
{ {
wait_rcu_gp(call_rcu_sched); unsigned long flags;
}
EXPORT_SYMBOL(rcu_barrier_sched);
/* local_irq_save(flags);
* Helper function for rcu_sched_qs() and rcu_bh_qs(). if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) {
* Also irqs are disabled to avoid confusion due to interrupt handlers rcu_ctrlblk.donetail = rcu_ctrlblk.curtail;
* invoking call_rcu(). raise_softirq_irqoff(RCU_SOFTIRQ);
*/
static int rcu_qsctr_help(struct rcu_ctrlblk *rcp)
{
if (rcp->donetail != rcp->curtail) {
rcp->donetail = rcp->curtail;
return 1;
} }
return 0;
}
/*
* Record an rcu quiescent state. And an rcu_bh quiescent state while we
* are at it, given that any rcu quiescent state is also an rcu_bh
* quiescent state. Use "+" instead of "||" to defeat short circuiting.
*/
void rcu_sched_qs(void)
{
unsigned long flags;
local_irq_save(flags);
if (rcu_qsctr_help(&rcu_sched_ctrlblk) +
rcu_qsctr_help(&rcu_bh_ctrlblk))
raise_softirq(RCU_SOFTIRQ);
local_irq_restore(flags);
}
/*
* Record an rcu_bh quiescent state.
*/
void rcu_bh_qs(void)
{
unsigned long flags;
local_irq_save(flags);
if (rcu_qsctr_help(&rcu_bh_ctrlblk))
raise_softirq(RCU_SOFTIRQ);
local_irq_restore(flags); local_irq_restore(flags);
} }
@ -118,38 +76,35 @@ void rcu_bh_qs(void)
* be called from hardirq context. It is normally called from the * be called from hardirq context. It is normally called from the
* scheduling-clock interrupt. * scheduling-clock interrupt.
*/ */
void rcu_check_callbacks(int user) void rcu_sched_clock_irq(int user)
{ {
if (user) if (user) {
rcu_sched_qs(); rcu_qs();
else if (!in_softirq()) } else if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) {
rcu_bh_qs(); set_tsk_need_resched(current);
if (user) set_preempt_need_resched();
rcu_note_voluntary_context_switch(current); }
} }
/* /* Invoke the RCU callbacks whose grace period has elapsed. */
* Invoke the RCU callbacks on the specified rcu_ctrlkblk structure static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused)
* whose grace period has elapsed.
*/
static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
{ {
struct rcu_head *next, *list; struct rcu_head *next, *list;
unsigned long flags; unsigned long flags;
/* Move the ready-to-invoke callbacks to a local list. */ /* Move the ready-to-invoke callbacks to a local list. */
local_irq_save(flags); local_irq_save(flags);
if (rcp->donetail == &rcp->rcucblist) { if (rcu_ctrlblk.donetail == &rcu_ctrlblk.rcucblist) {
/* No callbacks ready, so just leave. */ /* No callbacks ready, so just leave. */
local_irq_restore(flags); local_irq_restore(flags);
return; return;
} }
list = rcp->rcucblist; list = rcu_ctrlblk.rcucblist;
rcp->rcucblist = *rcp->donetail; rcu_ctrlblk.rcucblist = *rcu_ctrlblk.donetail;
*rcp->donetail = NULL; *rcu_ctrlblk.donetail = NULL;
if (rcp->curtail == rcp->donetail) if (rcu_ctrlblk.curtail == rcu_ctrlblk.donetail)
rcp->curtail = &rcp->rcucblist; rcu_ctrlblk.curtail = &rcu_ctrlblk.rcucblist;
rcp->donetail = &rcp->rcucblist; rcu_ctrlblk.donetail = &rcu_ctrlblk.rcucblist;
local_irq_restore(flags); local_irq_restore(flags);
/* Invoke the callbacks on the local list. */ /* Invoke the callbacks on the local list. */
@ -164,37 +119,31 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
} }
} }
static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused)
{
__rcu_process_callbacks(&rcu_sched_ctrlblk);
__rcu_process_callbacks(&rcu_bh_ctrlblk);
}
/* /*
* Wait for a grace period to elapse. But it is illegal to invoke * Wait for a grace period to elapse. But it is illegal to invoke
* synchronize_sched() from within an RCU read-side critical section. * synchronize_rcu() from within an RCU read-side critical section.
* Therefore, any legal call to synchronize_sched() is a quiescent * Therefore, any legal call to synchronize_rcu() is a quiescent
* state, and so on a UP system, synchronize_sched() need do nothing. * state, and so on a UP system, synchronize_rcu() need do nothing.
* Ditto for synchronize_rcu_bh(). (But Lai Jiangshan points out the * (But Lai Jiangshan points out the benefits of doing might_sleep()
* benefits of doing might_sleep() to reduce latency.) * to reduce latency.)
* *
* Cool, huh? (Due to Josh Triplett.) * Cool, huh? (Due to Josh Triplett.)
*/ */
void synchronize_sched(void) void synchronize_rcu(void)
{ {
RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) || RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) ||
lock_is_held(&rcu_lock_map) || lock_is_held(&rcu_lock_map) ||
lock_is_held(&rcu_sched_lock_map), lock_is_held(&rcu_sched_lock_map),
"Illegal synchronize_sched() in RCU read-side critical section"); "Illegal synchronize_rcu() in RCU read-side critical section");
} }
EXPORT_SYMBOL_GPL(synchronize_sched); EXPORT_SYMBOL_GPL(synchronize_rcu);
/* /*
* Helper function for call_rcu() and call_rcu_bh(). * Post an RCU callback to be invoked after the end of an RCU grace
* period. But since we have but one CPU, that would be after any
* quiescent state.
*/ */
static void __call_rcu(struct rcu_head *head, void call_rcu(struct rcu_head *head, rcu_callback_t func)
rcu_callback_t func,
struct rcu_ctrlblk *rcp)
{ {
unsigned long flags; unsigned long flags;
@ -203,39 +152,20 @@ static void __call_rcu(struct rcu_head *head,
head->next = NULL; head->next = NULL;
local_irq_save(flags); local_irq_save(flags);
*rcp->curtail = head; *rcu_ctrlblk.curtail = head;
rcp->curtail = &head->next; rcu_ctrlblk.curtail = &head->next;
local_irq_restore(flags); local_irq_restore(flags);
if (unlikely(is_idle_task(current))) { if (unlikely(is_idle_task(current))) {
/* force scheduling for rcu_sched_qs() */ /* force scheduling for rcu_qs() */
resched_cpu(0); resched_cpu(0);
} }
} }
EXPORT_SYMBOL_GPL(call_rcu);
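
For reference, the canonical call_rcu() usage that this Tiny implementation and Tree RCU both serve; struct sketch_obj and its helpers are illustrative assumptions.

/* Sketch: the free-after-grace-period pattern. */
struct sketch_obj {
        int data;
        struct rcu_head rh;
};

static void sketch_free_cb(struct rcu_head *rhp)
{
        struct sketch_obj *p = container_of(rhp, struct sketch_obj, rh);

        kfree(p);
}

static void sketch_retire(struct sketch_obj *p)
{
        /* The callback runs only after all pre-existing readers finish. */
        call_rcu(&p->rh, sketch_free_cb);
}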
/*
* Post an RCU callback to be invoked after the end of an RCU-sched grace
* period. But since we have but one CPU, that would be after any
* quiescent state.
*/
void call_rcu_sched(struct rcu_head *head, rcu_callback_t func)
{
__call_rcu(head, func, &rcu_sched_ctrlblk);
}
EXPORT_SYMBOL_GPL(call_rcu_sched);
/*
* Post an RCU bottom-half callback to be invoked after any subsequent
* quiescent state.
*/
void call_rcu_bh(struct rcu_head *head, rcu_callback_t func)
{
__call_rcu(head, func, &rcu_bh_ctrlblk);
}
EXPORT_SYMBOL_GPL(call_rcu_bh);
void __init rcu_init(void) void __init rcu_init(void)
{ {
open_softirq(RCU_SOFTIRQ, rcu_process_callbacks); open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
rcu_early_boot_tests(); rcu_early_boot_tests();
srcu_init();
} }

File diff suppressed because it is too large

View File

@ -34,29 +34,10 @@
#include "rcu_segcblist.h" #include "rcu_segcblist.h"
/* /* Communicate arguments to a workqueue handler. */
* Dynticks per-CPU state. struct rcu_exp_work {
*/ unsigned long rew_s;
struct rcu_dynticks { struct work_struct rew_work;
long long dynticks_nesting; /* Track irq/process nesting level. */
/* Process level is worth LLONG_MAX/2. */
int dynticks_nmi_nesting; /* Track NMI nesting level. */
atomic_t dynticks; /* Even value for idle, else odd. */
bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */
unsigned long rcu_qs_ctr; /* Light universal quiescent state ctr. */
bool rcu_urgent_qs; /* GP old need light quiescent state. */
#ifdef CONFIG_RCU_FAST_NO_HZ
bool all_lazy; /* Are all CPU's CBs lazy? */
unsigned long nonlazy_posted;
/* # times non-lazy CBs posted to CPU. */
unsigned long nonlazy_posted_snap;
/* idle-period nonlazy_posted snapshot. */
unsigned long last_accelerate;
/* Last jiffy CBs were accelerated. */
unsigned long last_advance_all;
/* Last jiffy CBs were all advanced. */
int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */
#endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */
}; };
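
A minimal sketch of how a workqueue handler recovers the arguments carried by rcu_exp_work; the handler name is an illustrative assumption.

/* Sketch: recover the expedited-GP arguments inside a work handler. */
static void sketch_exp_work_handler(struct work_struct *wp)
{
        struct rcu_exp_work *rewp =
                container_of(wp, struct rcu_exp_work, rew_work);

        /* rewp->rew_s carries the expedited grace-period sequence number. */
        pr_debug("expedited GP sequence: %lu\n", rewp->rew_s);
}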
/* RCU's kthread states for tracing. */ /* RCU's kthread states for tracing. */
@ -74,18 +55,16 @@ struct rcu_node {
raw_spinlock_t __private lock; /* Root rcu_node's lock protects */ raw_spinlock_t __private lock; /* Root rcu_node's lock protects */
/* some rcu_state fields as well as */ /* some rcu_state fields as well as */
/* following. */ /* following. */
unsigned long gpnum; /* Current grace period for this node. */ unsigned long gp_seq; /* Track rsp->rcu_gp_seq. */
/* This will either be equal to or one */ unsigned long gp_seq_needed; /* Track furthest future GP request. */
/* behind the root rcu_node's gpnum. */ unsigned long completedqs; /* All QSes done for this node. */
unsigned long completed; /* Last GP completed for this node. */
/* This will either be equal to or one */
/* behind the root rcu_node's gpnum. */
unsigned long qsmask; /* CPUs or groups that need to switch in */ unsigned long qsmask; /* CPUs or groups that need to switch in */
/* order for current grace period to proceed.*/ /* order for current grace period to proceed.*/
/* In leaf rcu_node, each bit corresponds to */ /* In leaf rcu_node, each bit corresponds to */
/* an rcu_data structure, otherwise, each */ /* an rcu_data structure, otherwise, each */
/* bit corresponds to a child rcu_node */ /* bit corresponds to a child rcu_node */
/* structure. */ /* structure. */
unsigned long rcu_gp_init_mask; /* Mask of offline CPUs at GP init. */
unsigned long qsmaskinit; unsigned long qsmaskinit;
/* Per-GP initial value for qsmask. */ /* Per-GP initial value for qsmask. */
/* Initialized from ->qsmaskinitnext at the */ /* Initialized from ->qsmaskinitnext at the */
@ -103,6 +82,7 @@ struct rcu_node {
/* Online CPUs for next expedited GP. */ /* Online CPUs for next expedited GP. */
/* Any CPU that has ever been online will */ /* Any CPU that has ever been online will */
/* have its bit set. */ /* have its bit set. */
unsigned long ffmask; /* Fully functional CPUs. */
unsigned long grpmask; /* Mask to apply to parent qsmask. */ unsigned long grpmask; /* Mask to apply to parent qsmask. */
/* Only one bit will be set in this mask. */ /* Only one bit will be set in this mask. */
int grplo; /* lowest-numbered CPU or group here. */ int grplo; /* lowest-numbered CPU or group here. */
@ -146,23 +126,17 @@ struct rcu_node {
/* boosting for this rcu_node structure. */ /* boosting for this rcu_node structure. */
unsigned int boost_kthread_status; unsigned int boost_kthread_status;
/* State of boost_kthread_task for tracing. */ /* State of boost_kthread_task for tracing. */
unsigned long n_tasks_boosted;
/* Total number of tasks boosted. */
unsigned long n_exp_boosts;
/* Number of tasks boosted for expedited GP. */
unsigned long n_normal_boosts;
/* Number of tasks boosted for normal GP. */
#ifdef CONFIG_RCU_NOCB_CPU #ifdef CONFIG_RCU_NOCB_CPU
struct swait_queue_head nocb_gp_wq[2]; struct swait_queue_head nocb_gp_wq[2];
/* Place for rcu_nocb_kthread() to wait GP. */ /* Place for rcu_nocb_kthread() to wait GP. */
#endif /* #ifdef CONFIG_RCU_NOCB_CPU */ #endif /* #ifdef CONFIG_RCU_NOCB_CPU */
int need_future_gp[2];
/* Counts of upcoming no-CB GP requests. */
raw_spinlock_t fqslock ____cacheline_internodealigned_in_smp; raw_spinlock_t fqslock ____cacheline_internodealigned_in_smp;
spinlock_t exp_lock ____cacheline_internodealigned_in_smp; spinlock_t exp_lock ____cacheline_internodealigned_in_smp;
unsigned long exp_seq_rq; unsigned long exp_seq_rq;
wait_queue_head_t exp_wq[4]; wait_queue_head_t exp_wq[4];
struct rcu_exp_work rew;
bool exp_need_flush; /* Need to flush workitem? */
} ____cacheline_internodealigned_in_smp; } ____cacheline_internodealigned_in_smp;
/* /*
@ -170,7 +144,7 @@ struct rcu_node {
* are indexed relative to this interval rather than the global CPU ID space. * are indexed relative to this interval rather than the global CPU ID space.
* This generates the bit for a CPU in node-local masks. * This generates the bit for a CPU in node-local masks.
*/ */
#define leaf_node_cpu_bit(rnp, cpu) (1UL << ((cpu) - (rnp)->grplo)) #define leaf_node_cpu_bit(rnp, cpu) (BIT((cpu) - (rnp)->grplo))
/* /*
* Union to allow "aggregate OR" operation on the need for a quiescent * Union to allow "aggregate OR" operation on the need for a quiescent
@ -184,32 +158,24 @@ union rcu_noqs {
u16 s; /* Set of bits, aggregate OR here. */ u16 s; /* Set of bits, aggregate OR here. */
}; };
/* Index values for nxttail array in struct rcu_data. */
#define RCU_DONE_TAIL 0 /* Also RCU_WAIT head. */
#define RCU_WAIT_TAIL 1 /* Also RCU_NEXT_READY head. */
#define RCU_NEXT_READY_TAIL 2 /* Also RCU_NEXT head. */
#define RCU_NEXT_TAIL 3
#define RCU_NEXT_SIZE 4
/* Per-CPU data for read-copy update. */ /* Per-CPU data for read-copy update. */
struct rcu_data { struct rcu_data {
/* 1) quiescent-state and grace-period handling : */ /* 1) quiescent-state and grace-period handling : */
unsigned long completed; /* Track rsp->completed gp number */ unsigned long gp_seq; /* Track rsp->rcu_gp_seq counter. */
/* in order to detect GP end. */ unsigned long gp_seq_needed; /* Track furthest future GP request. */
unsigned long gpnum; /* Highest gp number that this CPU */
/* is aware of having started. */
unsigned long rcu_qs_ctr_snap;/* Snapshot of rcu_qs_ctr to check */
/* for rcu_all_qs() invocations. */
union rcu_noqs cpu_no_qs; /* No QSes yet for this CPU. */ union rcu_noqs cpu_no_qs; /* No QSes yet for this CPU. */
bool core_needs_qs; /* Core waits for quiesc state. */ bool core_needs_qs; /* Core waits for quiesc state. */
bool beenonline; /* CPU online at least once. */ bool beenonline; /* CPU online at least once. */
bool gpwrap; /* Possible gpnum/completed wrap. */ bool gpwrap; /* Possible ->gp_seq wrap. */
bool exp_deferred_qs; /* This CPU awaiting a deferred QS? */
struct rcu_node *mynode; /* This CPU's leaf of hierarchy */ struct rcu_node *mynode; /* This CPU's leaf of hierarchy */
unsigned long grpmask; /* Mask to apply to leaf qsmask. */ unsigned long grpmask; /* Mask to apply to leaf qsmask. */
unsigned long ticks_this_gp; /* The number of scheduling-clock */ unsigned long ticks_this_gp; /* The number of scheduling-clock */
/* ticks this CPU has handled */ /* ticks this CPU has handled */
/* during and after the last grace */ /* during and after the last grace */
/* period it is aware of. */ /* period it is aware of. */
struct irq_work defer_qs_iw; /* Obtain later scheduler attention. */
bool defer_qs_iw_pending; /* Scheduler attention pending? */
/* 2) batch handling */ /* 2) batch handling */
struct rcu_segcblist cblist; /* Segmented callback list, with */ struct rcu_segcblist cblist; /* Segmented callback list, with */
@ -217,77 +183,83 @@ struct rcu_data {
/* different grace periods. */ /* different grace periods. */
long qlen_last_fqs_check; long qlen_last_fqs_check;
/* qlen at last check for QS forcing */ /* qlen at last check for QS forcing */
unsigned long n_cbs_invoked; /* count of RCU cbs invoked. */
unsigned long n_nocbs_invoked; /* count of no-CBs RCU cbs invoked. */
unsigned long n_force_qs_snap; unsigned long n_force_qs_snap;
/* did other CPU force QS recently? */ /* did other CPU force QS recently? */
long blimit; /* Upper limit on a processed batch */ long blimit; /* Upper limit on a processed batch */
/* 3) dynticks interface. */ /* 3) dynticks interface. */
struct rcu_dynticks *dynticks; /* Shared per-CPU dynticks state. */
int dynticks_snap; /* Per-GP tracking for dynticks. */ int dynticks_snap; /* Per-GP tracking for dynticks. */
long dynticks_nesting; /* Track process nesting level. */
/* 4) reasons this CPU needed to be kicked by force_quiescent_state */ long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */
unsigned long dynticks_fqs; /* Kicked due to dynticks idle. */ atomic_t dynticks; /* Even value for idle, else odd. */
unsigned long offline_fqs; /* Kicked due to being offline. */ bool rcu_need_heavy_qs; /* GP old, so heavy quiescent state! */
unsigned long cond_resched_completed; bool rcu_urgent_qs; /* GP old need light quiescent state. */
/* Grace period that needs help */
/* from cond_resched(). */
/* 5) __rcu_pending() statistics. */
unsigned long n_rcu_pending; /* rcu_pending() calls since boot. */
unsigned long n_rp_core_needs_qs;
unsigned long n_rp_report_qs;
unsigned long n_rp_cb_ready;
unsigned long n_rp_cpu_needs_gp;
unsigned long n_rp_gp_completed;
unsigned long n_rp_gp_started;
unsigned long n_rp_nocb_defer_wakeup;
unsigned long n_rp_need_nothing;
/* 6) _rcu_barrier(), OOM callbacks, and expediting. */
struct rcu_head barrier_head;
#ifdef CONFIG_RCU_FAST_NO_HZ #ifdef CONFIG_RCU_FAST_NO_HZ
struct rcu_head oom_head; bool all_lazy; /* All CPU's CBs lazy at idle start? */
unsigned long last_accelerate; /* Last jiffy CBs were accelerated. */
unsigned long last_advance_all; /* Last jiffy CBs were all advanced. */
int tick_nohz_enabled_snap; /* Previously seen value from sysfs. */
#endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */ #endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */
atomic_long_t exp_workdone0; /* # done by workqueue. */
atomic_long_t exp_workdone1; /* # done by others #1. */ /* 4) rcu_barrier(), OOM callbacks, and expediting. */
atomic_long_t exp_workdone2; /* # done by others #2. */ struct rcu_head barrier_head;
atomic_long_t exp_workdone3; /* # done by others #3. */
int exp_dynticks_snap; /* Double-check need for IPI. */ int exp_dynticks_snap; /* Double-check need for IPI. */
/* 7) Callback offloading. */ /* 5) Callback offloading. */
#ifdef CONFIG_RCU_NOCB_CPU #ifdef CONFIG_RCU_NOCB_CPU
struct rcu_head *nocb_head; /* CBs waiting for kthread. */ struct swait_queue_head nocb_cb_wq; /* For nocb kthreads to sleep on. */
struct rcu_head **nocb_tail; struct task_struct *nocb_gp_kthread;
atomic_long_t nocb_q_count; /* # CBs waiting for nocb */
atomic_long_t nocb_q_count_lazy; /* invocation (all stages). */
struct rcu_head *nocb_follower_head; /* CBs ready to invoke. */
struct rcu_head **nocb_follower_tail;
struct swait_queue_head nocb_wq; /* For nocb kthreads to sleep on. */
struct task_struct *nocb_kthread;
raw_spinlock_t nocb_lock; /* Guard following pair of fields. */ raw_spinlock_t nocb_lock; /* Guard following pair of fields. */
atomic_t nocb_lock_contended; /* Contention experienced. */
int nocb_defer_wakeup; /* Defer wakeup of nocb_kthread. */ int nocb_defer_wakeup; /* Defer wakeup of nocb_kthread. */
struct timer_list nocb_timer; /* Enforce finite deferral. */ struct timer_list nocb_timer; /* Enforce finite deferral. */
unsigned long nocb_gp_adv_time; /* Last call_rcu() CB adv (jiffies). */
/* The following fields are used by the leader, hence own cacheline. */ /* The following fields are used by call_rcu, hence own cacheline. */
struct rcu_head *nocb_gp_head ____cacheline_internodealigned_in_smp; raw_spinlock_t nocb_bypass_lock ____cacheline_internodealigned_in_smp;
/* CBs waiting for GP. */ struct rcu_cblist nocb_bypass; /* Lock-contention-bypass CB list. */
struct rcu_head **nocb_gp_tail; unsigned long nocb_bypass_first; /* Time (jiffies) of first enqueue. */
bool nocb_leader_sleep; /* Is the nocb leader thread asleep? */ unsigned long nocb_nobypass_last; /* Last ->cblist enqueue (jiffies). */
struct rcu_data *nocb_next_follower; int nocb_nobypass_count; /* # ->cblist enqueues at ^^^ time. */
/* Next follower in wakeup chain. */
/* The following fields are used by the follower, hence new cacheline. */ /* The following fields are used by GP kthread, hence own cacheline. */
struct rcu_data *nocb_leader ____cacheline_internodealigned_in_smp; raw_spinlock_t nocb_gp_lock ____cacheline_internodealigned_in_smp;
/* Leader CPU takes GP-end wakeups. */ struct timer_list nocb_bypass_timer; /* Force nocb_bypass flush. */
u8 nocb_gp_sleep; /* Is the nocb GP thread asleep? */
u8 nocb_gp_bypass; /* Found a bypass on last scan? */
u8 nocb_gp_gp; /* GP to wait for on last scan? */
unsigned long nocb_gp_seq; /* If so, ->gp_seq to wait for. */
unsigned long nocb_gp_loops; /* # passes through wait code. */
struct swait_queue_head nocb_gp_wq; /* For nocb kthreads to sleep on. */
bool nocb_cb_sleep; /* Is the nocb CB thread asleep? */
struct task_struct *nocb_cb_kthread;
struct rcu_data *nocb_next_cb_rdp;
/* Next rcu_data in wakeup chain. */
/* The following fields are used by CB kthread, hence new cacheline. */
struct rcu_data *nocb_gp_rdp ____cacheline_internodealigned_in_smp;
/* GP rdp takes GP-end wakeups. */
#endif /* #ifdef CONFIG_RCU_NOCB_CPU */ #endif /* #ifdef CONFIG_RCU_NOCB_CPU */
/* 8) RCU CPU stall data. */ /* 6) RCU priority boosting. */
struct task_struct *rcu_cpu_kthread_task;
/* rcuc per-CPU kthread or NULL. */
unsigned int rcu_cpu_kthread_status;
char rcu_cpu_has_work;
/* 7) Diagnostic data, including RCU CPU stall warnings. */
unsigned int softirq_snap; /* Snapshot of softirq activity. */ unsigned int softirq_snap; /* Snapshot of softirq activity. */
/* ->rcu_iw* fields protected by leaf rcu_node ->lock. */
struct irq_work rcu_iw; /* Check for non-irq activity. */
bool rcu_iw_pending; /* Is ->rcu_iw pending? */
unsigned long rcu_iw_gp_seq; /* ->gp_seq associated with ->rcu_iw. */
unsigned long rcu_ofl_gp_seq; /* ->gp_seq at last offline. */
short rcu_ofl_gp_flags; /* ->gp_flags at last offline. */
unsigned long rcu_onl_gp_seq; /* ->gp_seq at last online. */
short rcu_onl_gp_flags; /* ->gp_flags at last online. */
unsigned long last_fqs_resched; /* Time of last rcu_resched(). */
int cpu; int cpu;
struct rcu_state *rsp;
}; };
/* Values for nocb_defer_wakeup field in struct rcu_data. */ /* Values for nocb_defer_wakeup field in struct rcu_data. */
@ -333,20 +305,19 @@ struct rcu_state {
struct rcu_node *level[RCU_NUM_LVLS + 1]; struct rcu_node *level[RCU_NUM_LVLS + 1];
/* Hierarchy levels (+1 to */ /* Hierarchy levels (+1 to */
/* shut bogus gcc warning) */ /* shut bogus gcc warning) */
struct rcu_data __percpu *rda; /* pointer to percpu rcu_data. */
call_rcu_func_t call; /* call_rcu() flavor. */
int ncpus; /* # CPUs seen so far. */ int ncpus; /* # CPUs seen so far. */
/* The following fields are guarded by the root rcu_node's lock. */ /* The following fields are guarded by the root rcu_node's lock. */
u8 boost ____cacheline_internodealigned_in_smp; u8 boost ____cacheline_internodealigned_in_smp;
/* Subject to priority boost. */ /* Subject to priority boost. */
unsigned long gpnum; /* Current gp number. */ unsigned long gp_seq; /* Grace-period sequence #. */
unsigned long completed; /* # of last completed gp. */
struct task_struct *gp_kthread; /* Task for grace periods. */ struct task_struct *gp_kthread; /* Task for grace periods. */
struct swait_queue_head gp_wq; /* Where GP task waits. */ struct swait_queue_head gp_wq; /* Where GP task waits. */
short gp_flags; /* Commands for GP task. */ short gp_flags; /* Commands for GP task. */
short gp_state; /* GP kthread sleep state. */ short gp_state; /* GP kthread sleep state. */
unsigned long gp_wake_time; /* Last GP kthread wake. */
unsigned long gp_wake_seq; /* ->gp_seq at ^^^. */
/* End of fields guarded by root rcu_node's lock. */ /* End of fields guarded by root rcu_node's lock. */
@ -354,7 +325,7 @@ struct rcu_state {
atomic_t barrier_cpu_count; /* # CPUs waiting on. */ atomic_t barrier_cpu_count; /* # CPUs waiting on. */
struct completion barrier_completion; /* Wake at barrier end. */ struct completion barrier_completion; /* Wake at barrier end. */
unsigned long barrier_sequence; /* ++ at start and end of */ unsigned long barrier_sequence; /* ++ at start and end of */
/* _rcu_barrier(). */ /* rcu_barrier(). */
/* End of fields guarded by barrier_mutex. */ /* End of fields guarded by barrier_mutex. */
struct mutex exp_mutex; /* Serialize expedited GP. */ struct mutex exp_mutex; /* Serialize expedited GP. */
@ -370,14 +341,14 @@ struct rcu_state {
/* kthreads, if configured. */ /* kthreads, if configured. */
unsigned long n_force_qs; /* Number of calls to */ unsigned long n_force_qs; /* Number of calls to */
/* force_quiescent_state(). */ /* force_quiescent_state(). */
unsigned long n_force_qs_lh; /* ~Number of calls leaving */
/* due to lock unavailable. */
unsigned long n_force_qs_ngp; /* Number of calls leaving */
/* due to no GP active. */
unsigned long gp_start; /* Time at which GP started, */ unsigned long gp_start; /* Time at which GP started, */
/* but in jiffies. */ /* but in jiffies. */
unsigned long gp_end; /* Time last GP ended, again */
/* in jiffies. */
unsigned long gp_activity; /* Time of last GP kthread */ unsigned long gp_activity; /* Time of last GP kthread */
/* activity in jiffies. */ /* activity in jiffies. */
unsigned long gp_req_activity; /* Time of last GP request */
/* in jiffies. */
unsigned long jiffies_stall; /* Time at which to check */ unsigned long jiffies_stall; /* Time at which to check */
/* for CPU stalls. */ /* for CPU stalls. */
unsigned long jiffies_resched; /* Time at which to resched */ unsigned long jiffies_resched; /* Time at which to resched */
@ -388,7 +359,10 @@ struct rcu_state {
/* jiffies. */ /* jiffies. */
const char *name; /* Name of structure. */ const char *name; /* Name of structure. */
char abbr; /* Abbreviated name. */ char abbr; /* Abbreviated name. */
struct list_head flavors; /* List of RCU flavors. */
raw_spinlock_t ofl_lock ____cacheline_internodealigned_in_smp;
/* Synchronize offline with */
/* GP pre-initialization. */
}; };
/* Values for rcu_state structure's gp_flags field. */ /* Values for rcu_state structure's gp_flags field. */
@ -399,117 +373,115 @@ struct rcu_state {
#define RCU_GP_IDLE 0 /* Initial state and no GP in progress. */ #define RCU_GP_IDLE 0 /* Initial state and no GP in progress. */
#define RCU_GP_WAIT_GPS 1 /* Wait for grace-period start. */ #define RCU_GP_WAIT_GPS 1 /* Wait for grace-period start. */
#define RCU_GP_DONE_GPS 2 /* Wait done for grace-period start. */ #define RCU_GP_DONE_GPS 2 /* Wait done for grace-period start. */
#define RCU_GP_WAIT_FQS 3 /* Wait for force-quiescent-state time. */ #define RCU_GP_ONOFF 3 /* Grace-period initialization hotplug. */
#define RCU_GP_DOING_FQS 4 /* Wait done for force-quiescent-state time. */ #define RCU_GP_INIT 4 /* Grace-period initialization. */
#define RCU_GP_CLEANUP 5 /* Grace-period cleanup started. */ #define RCU_GP_WAIT_FQS 5 /* Wait for force-quiescent-state time. */
#define RCU_GP_CLEANED 6 /* Grace-period cleanup complete. */ #define RCU_GP_DOING_FQS 6 /* Wait done for force-quiescent-state time. */
#define RCU_GP_CLEANUP 7 /* Grace-period cleanup started. */
#define RCU_GP_CLEANED 8 /* Grace-period cleanup complete. */
#ifndef RCU_TREE_NONCORE
static const char * const gp_state_names[] = { static const char * const gp_state_names[] = {
"RCU_GP_IDLE", "RCU_GP_IDLE",
"RCU_GP_WAIT_GPS", "RCU_GP_WAIT_GPS",
"RCU_GP_DONE_GPS", "RCU_GP_DONE_GPS",
"RCU_GP_ONOFF",
"RCU_GP_INIT",
"RCU_GP_WAIT_FQS", "RCU_GP_WAIT_FQS",
"RCU_GP_DOING_FQS", "RCU_GP_DOING_FQS",
"RCU_GP_CLEANUP", "RCU_GP_CLEANUP",
"RCU_GP_CLEANED", "RCU_GP_CLEANED",
}; };
#endif /* #ifndef RCU_TREE_NONCORE */
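The new RCU_GP_ONOFF and RCU_GP_INIT values are inserted into both the #define list and gp_state_names[] because the state value is used directly as an index into the name table (gp_state_getname(), not shown in this hunk, is presumably a bounds-checked lookup of this kind). A standalone sketch of that correspondence:

#include <stdio.h>

enum demo_gp_state {	/* mirrors the RCU_GP_* values above */
	DEMO_GP_IDLE, DEMO_GP_WAIT_GPS, DEMO_GP_DONE_GPS, DEMO_GP_ONOFF,
	DEMO_GP_INIT, DEMO_GP_WAIT_FQS, DEMO_GP_DOING_FQS, DEMO_GP_CLEANUP,
	DEMO_GP_CLEANED,
};

static const char * const demo_gp_state_names[] = {
	"RCU_GP_IDLE", "RCU_GP_WAIT_GPS", "RCU_GP_DONE_GPS", "RCU_GP_ONOFF",
	"RCU_GP_INIT", "RCU_GP_WAIT_FQS", "RCU_GP_DOING_FQS", "RCU_GP_CLEANUP",
	"RCU_GP_CLEANED",
};

static const char *demo_gp_state_getname(int gs)
{
	if (gs < 0 || gs >= (int)(sizeof(demo_gp_state_names) /
				  sizeof(demo_gp_state_names[0])))
		return "???";
	return demo_gp_state_names[gs];	/* the state value doubles as the index */
}

int main(void)
{
	printf("%d -> %s\n", DEMO_GP_WAIT_FQS,
	       demo_gp_state_getname(DEMO_GP_WAIT_FQS));	/* 5 -> RCU_GP_WAIT_FQS */
	return 0;
}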
extern struct list_head rcu_struct_flavors;
/* Sequence through rcu_state structures for each RCU flavor. */
#define for_each_rcu_flavor(rsp) \
list_for_each_entry((rsp), &rcu_struct_flavors, flavors)
/* /*
* RCU implementation internal declarations: * In order to export the rcu_state name to the tracing tools, it
* needs to be added in the __tracepoint_string section.
* This requires defining a separate variable tp_<sname>_varname
* that points to the string being used, and this will allow
* the tracing userspace tools to be able to decipher the string
* address to the matching string.
*/ */
extern struct rcu_state rcu_sched_state;
extern struct rcu_state rcu_bh_state;
#ifdef CONFIG_PREEMPT_RCU #ifdef CONFIG_PREEMPT_RCU
extern struct rcu_state rcu_preempt_state; #define RCU_ABBR 'p'
#endif /* #ifdef CONFIG_PREEMPT_RCU */ #define RCU_NAME_RAW "rcu_preempt"
#else /* #ifdef CONFIG_PREEMPT_RCU */
#define RCU_ABBR 's'
#define RCU_NAME_RAW "rcu_sched"
#endif /* #else #ifdef CONFIG_PREEMPT_RCU */
#ifndef CONFIG_TRACING
#define RCU_NAME RCU_NAME_RAW
#else /* #ifdef CONFIG_TRACING */
static char rcu_name[] = RCU_NAME_RAW;
static const char *tp_rcu_varname __used __tracepoint_string = rcu_name;
#define RCU_NAME rcu_name
#endif /* #else #ifdef CONFIG_TRACING */
int rcu_dynticks_snap(struct rcu_dynticks *rdtp); int rcu_dynticks_snap(struct rcu_data *rdp);
bool rcu_eqs_special_set(int cpu);
#ifdef CONFIG_RCU_BOOST /* Forward declarations for tree_plugin.h */
DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
DECLARE_PER_CPU(int, rcu_cpu_kthread_cpu);
DECLARE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
DECLARE_PER_CPU(char, rcu_cpu_has_work);
#endif /* #ifdef CONFIG_RCU_BOOST */
#ifndef RCU_TREE_NONCORE
/* Forward declarations for rcutree_plugin.h */
static void rcu_bootup_announce(void); static void rcu_bootup_announce(void);
static void rcu_preempt_note_context_switch(bool preempt); static void rcu_qs(void);
static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp); static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp);
#ifdef CONFIG_HOTPLUG_CPU #ifdef CONFIG_HOTPLUG_CPU
static bool rcu_preempt_has_tasks(struct rcu_node *rnp); static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
#endif /* #ifdef CONFIG_HOTPLUG_CPU */ #endif /* #ifdef CONFIG_HOTPLUG_CPU */
static void rcu_print_detail_task_stall(struct rcu_state *rsp);
static int rcu_print_task_stall(struct rcu_node *rnp);
static int rcu_print_task_exp_stall(struct rcu_node *rnp); static int rcu_print_task_exp_stall(struct rcu_node *rnp);
static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp); static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp);
static void rcu_preempt_check_callbacks(void); static void rcu_flavor_sched_clock_irq(int user);
void call_rcu(struct rcu_head *head, rcu_callback_t func); void call_rcu(struct rcu_head *head, rcu_callback_t func);
static void __init __rcu_init_preempt(void); static void dump_blkd_tasks(struct rcu_node *rnp, int ncheck);
static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags); static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags);
static void rcu_preempt_boost_start_gp(struct rcu_node *rnp); static void rcu_preempt_boost_start_gp(struct rcu_node *rnp);
static void invoke_rcu_callbacks_kthread(void);
static bool rcu_is_callbacks_kthread(void); static bool rcu_is_callbacks_kthread(void);
#ifdef CONFIG_RCU_BOOST static void rcu_cpu_kthread_setup(unsigned int cpu);
static void rcu_preempt_do_callbacks(void);
static int rcu_spawn_one_boost_kthread(struct rcu_state *rsp,
struct rcu_node *rnp);
#endif /* #ifdef CONFIG_RCU_BOOST */
static void __init rcu_spawn_boost_kthreads(void); static void __init rcu_spawn_boost_kthreads(void);
static void rcu_prepare_kthreads(int cpu); static void rcu_prepare_kthreads(int cpu);
static void rcu_cleanup_after_idle(void); static void rcu_cleanup_after_idle(void);
static void rcu_prepare_for_idle(void); static void rcu_prepare_for_idle(void);
static void rcu_idle_count_callbacks_posted(void);
static bool rcu_preempt_has_tasks(struct rcu_node *rnp); static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
static void print_cpu_stall_info_begin(void); static bool rcu_preempt_need_deferred_qs(struct task_struct *t);
static void print_cpu_stall_info(struct rcu_state *rsp, int cpu); static void rcu_preempt_deferred_qs(struct task_struct *t);
static void print_cpu_stall_info_end(void);
static void zero_cpu_stall_ticks(struct rcu_data *rdp); static void zero_cpu_stall_ticks(struct rcu_data *rdp);
static void increment_cpu_stall_ticks(void);
static bool rcu_nocb_cpu_needs_barrier(struct rcu_state *rsp, int cpu);
static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq);
static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp); static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq); static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
static void rcu_init_one_nocb(struct rcu_node *rnp); static void rcu_init_one_nocb(struct rcu_node *rnp);
static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp, static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
bool lazy, unsigned long flags); unsigned long j);
static bool rcu_nocb_adopt_orphan_cbs(struct rcu_data *my_rdp, static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
struct rcu_data *rdp, bool *was_alldone, unsigned long flags);
unsigned long flags); static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_empty,
unsigned long flags);
static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp); static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp);
static void do_nocb_deferred_wakeup(struct rcu_data *rdp); static void do_nocb_deferred_wakeup(struct rcu_data *rdp);
static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp); static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp);
static void rcu_spawn_all_nocb_kthreads(int cpu); static void rcu_spawn_cpu_nocb_kthread(int cpu);
static void __init rcu_spawn_nocb_kthreads(void); static void __init rcu_spawn_nocb_kthreads(void);
static void show_rcu_nocb_state(struct rcu_data *rdp);
static void rcu_nocb_lock(struct rcu_data *rdp);
static void rcu_nocb_unlock(struct rcu_data *rdp);
static void rcu_nocb_unlock_irqrestore(struct rcu_data *rdp,
unsigned long flags);
static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp);
#ifdef CONFIG_RCU_NOCB_CPU #ifdef CONFIG_RCU_NOCB_CPU
static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp); static void __init rcu_organize_nocb_kthreads(void);
#endif /* #ifdef CONFIG_RCU_NOCB_CPU */ #define rcu_nocb_lock_irqsave(rdp, flags) \
static void __maybe_unused rcu_kick_nohz_cpu(int cpu); do { \
static bool init_nocb_callback_list(struct rcu_data *rdp); if (!rcu_segcblist_is_offloaded(&(rdp)->cblist)) \
local_irq_save(flags); \
else \
raw_spin_lock_irqsave(&(rdp)->nocb_lock, (flags)); \
} while (0)
#else /* #ifdef CONFIG_RCU_NOCB_CPU */
#define rcu_nocb_lock_irqsave(rdp, flags) local_irq_save(flags)
#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
static void rcu_bind_gp_kthread(void); static void rcu_bind_gp_kthread(void);
static bool rcu_nohz_full_cpu(struct rcu_state *rsp); static bool rcu_nohz_full_cpu(void);
static void rcu_dynticks_task_enter(void); static void rcu_dynticks_task_enter(void);
static void rcu_dynticks_task_exit(void); static void rcu_dynticks_task_exit(void);
#ifdef CONFIG_SRCU /* Forward declarations for tree_stall.h */
void srcu_online_cpu(unsigned int cpu); static void record_gp_stall_check_time(void);
void srcu_offline_cpu(unsigned int cpu); static void rcu_iw_handler(struct irq_work *iwp);
#else /* #ifdef CONFIG_SRCU */ static void check_cpu_stall(struct rcu_data *rdp);
void srcu_online_cpu(unsigned int cpu) { } static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp,
void srcu_offline_cpu(unsigned int cpu) { } const unsigned long gpssdelay);
#endif /* #else #ifdef CONFIG_SRCU */
#endif /* #ifndef RCU_TREE_NONCORE */

File diff suppressed because it is too large

File diff suppressed because it is too large

720
kernel/rcu/tree_stall.h Normal file
View File

@ -0,0 +1,720 @@
// SPDX-License-Identifier: GPL-2.0+
/*
* RCU CPU stall warnings for normal RCU grace periods
*
* Copyright IBM Corporation, 2019
*
* Author: Paul E. McKenney <paulmck@linux.ibm.com>
*/
//////////////////////////////////////////////////////////////////////////////
//
// Controlling CPU stall warnings, including delay calculation.
/* panic() on RCU Stall sysctl. */
int sysctl_panic_on_rcu_stall __read_mostly = CONFIG_RCU_PANIC_ON_STALL;
#ifdef CONFIG_PROVE_RCU
#define RCU_STALL_DELAY_DELTA (5 * HZ)
#else
#define RCU_STALL_DELAY_DELTA 0
#endif
/* Limit-check stall timeouts specified at boottime and runtime. */
int rcu_jiffies_till_stall_check(void)
{
int till_stall_check = READ_ONCE(rcu_cpu_stall_timeout);
/*
* Limit check must be consistent with the Kconfig limits
* for CONFIG_RCU_CPU_STALL_TIMEOUT.
*/
if (till_stall_check < 3) {
WRITE_ONCE(rcu_cpu_stall_timeout, 3);
till_stall_check = 3;
} else if (till_stall_check > 300) {
WRITE_ONCE(rcu_cpu_stall_timeout, 300);
till_stall_check = 300;
}
return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
}
EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check);
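The Kconfig-consistency comment inside rcu_jiffies_till_stall_check() is the one behavioral subtlety here: out-of-range sysctl values are both clamped and written back. A standalone sketch of the clamp-and-scale arithmetic (HZ and the PROVE_RCU slack are made-up stand-ins):

#include <stdio.h>

#define DEMO_HZ			100	/* stand-in for the kernel's HZ */
#define DEMO_STALL_DELAY_DELTA	0	/* (5 * HZ) under CONFIG_PROVE_RCU */

static int demo_jiffies_till_stall_check(int timeout_seconds)
{
	/* Keep the value inside the 3..300 range the Kconfig help promises. */
	if (timeout_seconds < 3)
		timeout_seconds = 3;	/* the kernel also rewrites the sysctl */
	else if (timeout_seconds > 300)
		timeout_seconds = 300;
	return timeout_seconds * DEMO_HZ + DEMO_STALL_DELAY_DELTA;
}

int main(void)
{
	printf("%d\n", demo_jiffies_till_stall_check(0));	/* 300 */
	printf("%d\n", demo_jiffies_till_stall_check(21));	/* 2100 */
	printf("%d\n", demo_jiffies_till_stall_check(1000));	/* 30000 */
	return 0;
}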
/* Don't do RCU CPU stall warnings during long sysrq printouts. */
void rcu_sysrq_start(void)
{
if (!rcu_cpu_stall_suppress)
rcu_cpu_stall_suppress = 2;
}
void rcu_sysrq_end(void)
{
if (rcu_cpu_stall_suppress == 2)
rcu_cpu_stall_suppress = 0;
}
/* Don't print RCU CPU stall warnings during a kernel panic. */
static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
{
rcu_cpu_stall_suppress = 1;
return NOTIFY_DONE;
}
static struct notifier_block rcu_panic_block = {
.notifier_call = rcu_panic,
};
static int __init check_cpu_stall_init(void)
{
atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block);
return 0;
}
early_initcall(check_cpu_stall_init);
/* If so specified via sysctl, panic, yielding cleaner stall-warning output. */
static void panic_on_rcu_stall(void)
{
if (sysctl_panic_on_rcu_stall)
panic("RCU Stall\n");
}
/**
* rcu_cpu_stall_reset - prevent further stall warnings in current grace period
*
* Set the stall-warning timeout way off into the future, thus preventing
* any RCU CPU stall-warning messages from appearing in the current set of
* RCU grace periods.
*
* The caller must disable hard irqs.
*/
void rcu_cpu_stall_reset(void)
{
WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2);
}
//////////////////////////////////////////////////////////////////////////////
//
// Interaction with RCU grace periods
/* Start of new grace period, so record stall time (and forcing times). */
static void record_gp_stall_check_time(void)
{
unsigned long j = jiffies;
unsigned long j1;
rcu_state.gp_start = j;
j1 = rcu_jiffies_till_stall_check();
/* Record ->gp_start before ->jiffies_stall. */
smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */
rcu_state.jiffies_resched = j + j1 / 2;
rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
}
/* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */
static void zero_cpu_stall_ticks(struct rcu_data *rdp)
{
rdp->ticks_this_gp = 0;
rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id());
WRITE_ONCE(rdp->last_fqs_resched, jiffies);
}
/*
* If too much time has passed in the current grace period, and if
* so configured, go kick the relevant kthreads.
*/
static void rcu_stall_kick_kthreads(void)
{
unsigned long j;
if (!rcu_kick_kthreads)
return;
j = READ_ONCE(rcu_state.jiffies_kick_kthreads);
if (time_after(jiffies, j) && rcu_state.gp_kthread &&
(rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) {
WARN_ONCE(1, "Kicking %s grace-period kthread\n",
rcu_state.name);
rcu_ftrace_dump(DUMP_ALL);
wake_up_process(rcu_state.gp_kthread);
WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ);
}
}
/*
* Handler for the irq_work request posted about halfway into the RCU CPU
* stall timeout, and used to detect excessive irq disabling. Set state
* appropriately, but just complain if there is unexpected state on entry.
*/
static void rcu_iw_handler(struct irq_work *iwp)
{
struct rcu_data *rdp;
struct rcu_node *rnp;
rdp = container_of(iwp, struct rcu_data, rcu_iw);
rnp = rdp->mynode;
raw_spin_lock_rcu_node(rnp);
if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) {
rdp->rcu_iw_gp_seq = rnp->gp_seq;
rdp->rcu_iw_pending = false;
}
raw_spin_unlock_rcu_node(rnp);
}
//////////////////////////////////////////////////////////////////////////////
//
// Printing RCU CPU stall warnings
#ifdef CONFIG_PREEMPT
/*
* Dump detailed information for all tasks blocking the current RCU
* grace period on the specified rcu_node structure.
*/
static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
{
unsigned long flags;
struct task_struct *t;
raw_spin_lock_irqsave_rcu_node(rnp, flags);
if (!rcu_preempt_blocked_readers_cgp(rnp)) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
t = list_entry(rnp->gp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
/*
* We could be printing a lot while holding a spinlock.
* Avoid triggering hard lockup.
*/
touch_nmi_watchdog();
sched_show_task(t);
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
/*
* Scan the current list of tasks blocked within RCU read-side critical
* sections, printing out the tid of each.
*/
static int rcu_print_task_stall(struct rcu_node *rnp)
{
struct task_struct *t;
int ndetected = 0;
if (!rcu_preempt_blocked_readers_cgp(rnp))
return 0;
pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
rnp->level, rnp->grplo, rnp->grphi);
t = list_entry(rnp->gp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
pr_cont(" P%d", t->pid);
ndetected++;
}
pr_cont("\n");
return ndetected;
}
#else /* #ifdef CONFIG_PREEMPT */
/*
* Because preemptible RCU does not exist, we never have to check for
* tasks blocked within RCU read-side critical sections.
*/
static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
{
}
/*
* Because preemptible RCU does not exist, we never have to check for
* tasks blocked within RCU read-side critical sections.
*/
static int rcu_print_task_stall(struct rcu_node *rnp)
{
return 0;
}
#endif /* #else #ifdef CONFIG_PREEMPT */
/*
* Dump stacks of all tasks running on stalled CPUs. First try using
* NMIs, but fall back to manual remote stack tracing on architectures
* that don't support NMI-based stack dumps. The NMI-triggered stack
* traces are more accurate because they are printed by the target CPU.
*/
static void rcu_dump_cpu_stacks(void)
{
int cpu;
unsigned long flags;
struct rcu_node *rnp;
rcu_for_each_leaf_node(rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
for_each_leaf_node_possible_cpu(rnp, cpu)
if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu))
if (!trigger_single_cpu_backtrace(cpu))
dump_cpu_task(cpu);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
}
#ifdef CONFIG_RCU_FAST_NO_HZ
static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
{
struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
sprintf(cp, "last_accelerate: %04lx/%04lx, Nonlazy posted: %c%c%c",
rdp->last_accelerate & 0xffff, jiffies & 0xffff,
".l"[rdp->all_lazy],
".L"[!rcu_segcblist_n_nonlazy_cbs(&rdp->cblist)],
".D"[!!rdp->tick_nohz_enabled_snap]);
}
#else /* #ifdef CONFIG_RCU_FAST_NO_HZ */
static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
{
*cp = '\0';
}
#endif /* #else #ifdef CONFIG_RCU_FAST_NO_HZ */
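The stall printouts lean heavily on the "xy"[cond] idiom (for example ".l"[rdp->all_lazy] above and "O."[!!cpu_online(cpu)] further down): a two-character string literal indexed by a 0/1 condition yields one flag character per field. A standalone demo with invented values:

#include <stdio.h>

int main(void)
{
	int cpu_online = 1;
	int all_lazy = 0;

	/* Index 0 picks the first character, index 1 the second. */
	putchar("O."[!!cpu_online]);	/* online  -> '.' (offline would print 'O') */
	putchar(".l"[!!all_lazy]);	/* not lazy -> '.' (lazy would print 'l') */
	putchar('\n');
	return 0;
}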
/*
* Print out diagnostic information for the specified stalled CPU.
*
* If the specified CPU is aware of the current RCU grace period, then
* print the number of scheduling clock interrupts the CPU has taken
* during the time that it has been aware. Otherwise, print the number
* of RCU grace periods that this CPU is ignorant of, for example, "1"
* if the CPU was aware of the previous grace period.
*
* Also print out idle and (if CONFIG_RCU_FAST_NO_HZ) idle-entry info.
*/
static void print_cpu_stall_info(int cpu)
{
unsigned long delta;
char fast_no_hz[72];
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
char *ticks_title;
unsigned long ticks_value;
/*
* We could be printing a lot while holding a spinlock. Avoid
* triggering hard lockup.
*/
touch_nmi_watchdog();
ticks_value = rcu_seq_ctr(rcu_state.gp_seq - rdp->gp_seq);
if (ticks_value) {
ticks_title = "GPs behind";
} else {
ticks_title = "ticks this GP";
ticks_value = rdp->ticks_this_gp;
}
print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq);
pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n",
cpu,
"O."[!!cpu_online(cpu)],
"o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)],
"N."[!!(rdp->grpmask & rdp->mynode->qsmaskinitnext)],
!IS_ENABLED(CONFIG_IRQ_WORK) ? '?' :
rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
"!."[!delta],
ticks_value, ticks_title,
rcu_dynticks_snap(rdp) & 0xfff,
rdp->dynticks_nesting, rdp->dynticks_nmi_nesting,
rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
fast_no_hz);
}
/* Complain about starvation of grace-period kthread. */
static void rcu_check_gp_kthread_starvation(void)
{
struct task_struct *gpk = rcu_state.gp_kthread;
unsigned long j;
j = jiffies - READ_ONCE(rcu_state.gp_activity);
if (j > 2 * HZ) {
pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
rcu_state.name, j,
(long)rcu_seq_current(&rcu_state.gp_seq),
READ_ONCE(rcu_state.gp_flags),
gp_state_getname(rcu_state.gp_state), rcu_state.gp_state,
gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1);
if (gpk) {
pr_err("RCU grace-period kthread stack dump:\n");
sched_show_task(gpk);
wake_up_process(gpk);
}
}
}
static void print_other_cpu_stall(unsigned long gp_seq)
{
int cpu;
unsigned long flags;
unsigned long gpa;
unsigned long j;
int ndetected = 0;
struct rcu_node *rnp;
long totqlen = 0;
/* Kick and suppress, if so configured. */
rcu_stall_kick_kthreads();
if (rcu_cpu_stall_suppress)
return;
/*
* OK, time to rat on our buddy...
* See Documentation/RCU/stallwarn.txt for info on how to debug
* RCU CPU stall warnings.
*/
pr_err("INFO: %s detected stalls on CPUs/tasks:\n", rcu_state.name);
rcu_for_each_leaf_node(rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
ndetected += rcu_print_task_stall(rnp);
if (rnp->qsmask != 0) {
for_each_leaf_node_possible_cpu(rnp, cpu)
if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
print_cpu_stall_info(cpu);
ndetected++;
}
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
for_each_possible_cpu(cpu)
totqlen += rcu_get_n_cbs_cpu(cpu);
pr_cont("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n",
smp_processor_id(), (long)(jiffies - rcu_state.gp_start),
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
if (ndetected) {
rcu_dump_cpu_stacks();
/* Complain about tasks blocking the grace period. */
rcu_for_each_leaf_node(rnp)
rcu_print_detail_task_stall_rnp(rnp);
} else {
if (rcu_seq_current(&rcu_state.gp_seq) != gp_seq) {
pr_err("INFO: Stall ended before state dump start\n");
} else {
j = jiffies;
gpa = READ_ONCE(rcu_state.gp_activity);
pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n",
rcu_state.name, j - gpa, j, gpa,
READ_ONCE(jiffies_till_next_fqs),
rcu_get_root()->qsmask);
/* In this case, the current CPU might be at fault. */
sched_show_task(current);
}
}
/* Rewrite if needed in case of slow consoles. */
if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall)))
WRITE_ONCE(rcu_state.jiffies_stall,
jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
rcu_check_gp_kthread_starvation();
panic_on_rcu_stall();
rcu_force_quiescent_state(); /* Kick them all. */
}
static void print_cpu_stall(void)
{
int cpu;
unsigned long flags;
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
struct rcu_node *rnp = rcu_get_root();
long totqlen = 0;
/* Kick and suppress, if so configured. */
rcu_stall_kick_kthreads();
if (rcu_cpu_stall_suppress)
return;
/*
* OK, time to rat on ourselves...
* See Documentation/RCU/stallwarn.txt for info on how to debug
* RCU CPU stall warnings.
*/
pr_err("INFO: %s self-detected stall on CPU\n", rcu_state.name);
raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags);
print_cpu_stall_info(smp_processor_id());
raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags);
for_each_possible_cpu(cpu)
totqlen += rcu_get_n_cbs_cpu(cpu);
pr_cont("\t(t=%lu jiffies g=%ld q=%lu)\n",
jiffies - rcu_state.gp_start,
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
rcu_check_gp_kthread_starvation();
rcu_dump_cpu_stacks();
raw_spin_lock_irqsave_rcu_node(rnp, flags);
/* Rewrite if needed in case of slow consoles. */
if (ULONG_CMP_GE(jiffies, READ_ONCE(rcu_state.jiffies_stall)))
WRITE_ONCE(rcu_state.jiffies_stall,
jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
panic_on_rcu_stall();
/*
* Attempt to revive the RCU machinery by forcing a context switch.
*
* A context switch would normally allow the RCU state machine to make
* progress and it could be we're stuck in kernel space without context
* switches for an entirely unreasonable amount of time.
*/
set_tsk_need_resched(current);
set_preempt_need_resched();
}
static void check_cpu_stall(struct rcu_data *rdp)
{
unsigned long gs1;
unsigned long gs2;
unsigned long gps;
unsigned long j;
unsigned long jn;
unsigned long js;
struct rcu_node *rnp;
if ((rcu_cpu_stall_suppress && !rcu_kick_kthreads) ||
!rcu_gp_in_progress())
return;
rcu_stall_kick_kthreads();
j = jiffies;
/*
* Lots of memory barriers to reject false positives.
*
* The idea is to pick up rcu_state.gp_seq, then
* rcu_state.jiffies_stall, then rcu_state.gp_start, and finally
* another copy of rcu_state.gp_seq. These values are updated in
* the opposite order with memory barriers (or equivalent) during
* grace-period initialization and cleanup. Now, a false positive
* can occur if we get a new value of rcu_state.gp_start and an old
* value of rcu_state.jiffies_stall. But given the memory barriers,
* the only way that this can happen is if one grace period ends
* and another starts between these two fetches. This is detected
* by comparing the second fetch of rcu_state.gp_seq with the
* previous fetch from rcu_state.gp_seq.
*
* Given this check, comparisons of jiffies, rcu_state.jiffies_stall,
* and rcu_state.gp_start suffice to forestall false positives.
*/
gs1 = READ_ONCE(rcu_state.gp_seq);
smp_rmb(); /* Pick up ->gp_seq first... */
js = READ_ONCE(rcu_state.jiffies_stall);
smp_rmb(); /* ...then ->jiffies_stall before the rest... */
gps = READ_ONCE(rcu_state.gp_start);
smp_rmb(); /* ...and finally ->gp_start before ->gp_seq again. */
gs2 = READ_ONCE(rcu_state.gp_seq);
if (gs1 != gs2 ||
ULONG_CMP_LT(j, js) ||
ULONG_CMP_GE(gps, js))
return; /* No stall or GP completed since entering function. */
rnp = rdp->mynode;
jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3;
if (rcu_gp_in_progress() &&
(READ_ONCE(rnp->qsmask) & rdp->grpmask) &&
cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
/* We haven't checked in, so go dump stack. */
print_cpu_stall();
if (rcu_cpu_stall_ftrace_dump)
rcu_ftrace_dump(DUMP_ALL);
} else if (rcu_gp_in_progress() &&
ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) &&
cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
/* They had a few time units to dump stack, so complain. */
print_other_cpu_stall(gs2);
if (rcu_cpu_stall_ftrace_dump)
rcu_ftrace_dump(DUMP_ALL);
}
}
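The long ordering comment in check_cpu_stall() is the subtlest part of this file. A standalone C11 sketch (not part of the commit; reduced to two shared variables, with acquire fences standing in for smp_rmb()) of the read, re-read, and bail-out pattern it describes; the kernel version additionally snapshots rcu_state.gp_start and uses wraparound-safe ULONG_CMP comparisons:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static _Atomic unsigned long gp_seq;		/* grace-period sequence number */
static _Atomic unsigned long jiffies_stall;	/* stall deadline for the current GP */

static bool demo_stall_check_allowed(unsigned long now)
{
	unsigned long gs1, gs2, js;

	gs1 = atomic_load_explicit(&gp_seq, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);	/* ~ smp_rmb() */
	js = atomic_load_explicit(&jiffies_stall, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);	/* ~ smp_rmb() */
	gs2 = atomic_load_explicit(&gp_seq, memory_order_relaxed);

	if (gs1 != gs2)		/* a GP ended/started mid-read: not a stall */
		return false;
	return now >= js;	/* deadline passed for this grace period */
}

int main(void)
{
	atomic_store(&gp_seq, 8);
	atomic_store(&jiffies_stall, 1000);
	printf("%d\n", demo_stall_check_allowed(1500));	/* 1: would warn */
	return 0;
}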
//////////////////////////////////////////////////////////////////////////////
//
// RCU forward-progress mechanisms, including of callback invocation.
/*
* Show the state of the grace-period kthreads.
*/
void show_rcu_gp_kthreads(void)
{
int cpu;
unsigned long j;
unsigned long ja;
unsigned long jr;
unsigned long jw;
struct rcu_data *rdp;
struct rcu_node *rnp;
j = jiffies;
ja = j - READ_ONCE(rcu_state.gp_activity);
jr = j - READ_ONCE(rcu_state.gp_req_activity);
jw = j - READ_ONCE(rcu_state.gp_wake_time);
pr_info("%s: wait state: %s(%d) ->state: %#lx delta ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n",
rcu_state.name, gp_state_getname(rcu_state.gp_state),
rcu_state.gp_state,
rcu_state.gp_kthread ? rcu_state.gp_kthread->state : 0x1ffffL,
ja, jr, jw, (long)READ_ONCE(rcu_state.gp_wake_seq),
(long)READ_ONCE(rcu_state.gp_seq),
(long)READ_ONCE(rcu_get_root()->gp_seq_needed),
READ_ONCE(rcu_state.gp_flags));
rcu_for_each_node_breadth_first(rnp) {
if (ULONG_CMP_GE(rcu_state.gp_seq, rnp->gp_seq_needed))
continue;
pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld\n",
rnp->grplo, rnp->grphi, (long)rnp->gp_seq,
(long)rnp->gp_seq_needed);
if (!rcu_is_leaf_node(rnp))
continue;
for_each_leaf_node_possible_cpu(rnp, cpu) {
rdp = per_cpu_ptr(&rcu_data, cpu);
if (rdp->gpwrap ||
ULONG_CMP_GE(rcu_state.gp_seq,
rdp->gp_seq_needed))
continue;
pr_info("\tcpu %d ->gp_seq_needed %ld\n",
cpu, (long)rdp->gp_seq_needed);
}
}
for_each_possible_cpu(cpu) {
rdp = per_cpu_ptr(&rcu_data, cpu);
if (rcu_segcblist_is_offloaded(&rdp->cblist))
show_rcu_nocb_state(rdp);
}
/* sched_show_task(rcu_state.gp_kthread); */
}
EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads);
/*
* This function checks for grace-period requests that fail to motivate
* RCU to come out of its idle mode.
*/
static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp,
const unsigned long gpssdelay)
{
unsigned long flags;
unsigned long j;
struct rcu_node *rnp_root = rcu_get_root();
static atomic_t warned = ATOMIC_INIT(0);
if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() ||
ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed))
return;
j = jiffies; /* Expensive access, and in common case don't get here. */
if (time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) ||
time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) ||
atomic_read(&warned))
return;
raw_spin_lock_irqsave_rcu_node(rnp, flags);
j = jiffies;
if (rcu_gp_in_progress() ||
ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) ||
time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) ||
time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) ||
atomic_read(&warned)) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
/* Hold onto the leaf lock to make others see warned==1. */
if (rnp_root != rnp)
raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. */
j = jiffies;
if (rcu_gp_in_progress() ||
ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed) ||
time_before(j, rcu_state.gp_req_activity + gpssdelay) ||
time_before(j, rcu_state.gp_activity + gpssdelay) ||
atomic_xchg(&warned, 1)) {
if (rnp_root != rnp)
/* irqs remain disabled. */
raw_spin_unlock_rcu_node(rnp_root);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
}
WARN_ON(1);
if (rnp_root != rnp)
raw_spin_unlock_rcu_node(rnp_root);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
show_rcu_gp_kthreads();
}
/*
* Do a forward-progress check for rcutorture. This is normally invoked
* due to an OOM event. The argument "j" gives the time period during
* which rcutorture would like progress to have been made.
*/
void rcu_fwd_progress_check(unsigned long j)
{
unsigned long cbs;
int cpu;
unsigned long max_cbs = 0;
int max_cpu = -1;
struct rcu_data *rdp;
if (rcu_gp_in_progress()) {
pr_info("%s: GP age %lu jiffies\n",
__func__, jiffies - rcu_state.gp_start);
show_rcu_gp_kthreads();
} else {
pr_info("%s: Last GP end %lu jiffies ago\n",
__func__, jiffies - rcu_state.gp_end);
preempt_disable();
rdp = this_cpu_ptr(&rcu_data);
rcu_check_gp_start_stall(rdp->mynode, rdp, j);
preempt_enable();
}
for_each_possible_cpu(cpu) {
cbs = rcu_get_n_cbs_cpu(cpu);
if (!cbs)
continue;
if (max_cpu < 0)
pr_info("%s: callbacks", __func__);
pr_cont(" %d: %lu", cpu, cbs);
if (cbs <= max_cbs)
continue;
max_cbs = cbs;
max_cpu = cpu;
}
if (max_cpu >= 0)
pr_cont("\n");
}
EXPORT_SYMBOL_GPL(rcu_fwd_progress_check);
/* Commandeer a sysrq key to dump RCU's tree. */
static bool sysrq_rcu;
module_param(sysrq_rcu, bool, 0444);
/* Dump grace-period-request information due to commandeered sysrq. */
static void sysrq_show_rcu(int key)
{
show_rcu_gp_kthreads();
}
static struct sysrq_key_op sysrq_rcudump_op = {
.handler = sysrq_show_rcu,
.help_msg = "show-rcu(y)",
.action_msg = "Show RCU tree",
.enable_mask = SYSRQ_ENABLE_DUMP,
};
static int __init rcu_sysrq_init(void)
{
if (sysrq_rcu)
return register_sysrq_key('y', &sysrq_rcudump_op);
return 0;
}
early_initcall(rcu_sysrq_init);

View File

@ -51,6 +51,7 @@
#include <linux/kthread.h> #include <linux/kthread.h>
#include <linux/tick.h> #include <linux/tick.h>
#include <linux/rcupdate_wait.h> #include <linux/rcupdate_wait.h>
#include <linux/kprobes.h>
#define CREATE_TRACE_POINTS #define CREATE_TRACE_POINTS
@ -72,9 +73,15 @@ module_param(rcu_normal_after_boot, int, 0);
#ifdef CONFIG_DEBUG_LOCK_ALLOC #ifdef CONFIG_DEBUG_LOCK_ALLOC
/** /**
* rcu_read_lock_sched_held() - might we be in RCU-sched read-side critical section? * rcu_read_lock_held_common() - might we be in RCU-sched read-side critical section?
* @ret: Best guess answer if lockdep cannot be relied on
* *
* If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an * Returns true if lockdep must be ignored, in which case *ret contains
* the best guess described below. Otherwise returns false, in which
* case *ret tells the caller nothing and the caller should instead
* consult lockdep.
*
* If CONFIG_DEBUG_LOCK_ALLOC is selected, set *ret to nonzero iff in an
* RCU-sched read-side critical section. In absence of * RCU-sched read-side critical section. In absence of
* CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side
* critical section unless it can prove otherwise. Note that disabling * critical section unless it can prove otherwise. Note that disabling
@ -86,35 +93,45 @@ module_param(rcu_normal_after_boot, int, 0);
* Check debug_lockdep_rcu_enabled() to prevent false positives during boot * Check debug_lockdep_rcu_enabled() to prevent false positives during boot
* and while lockdep is disabled. * and while lockdep is disabled.
* *
* Note that if the CPU is in the idle loop from an RCU point of * Note that if the CPU is in the idle loop from an RCU point of view (ie:
* view (ie: that we are in the section between rcu_idle_enter() and * that we are in the section between rcu_idle_enter() and rcu_idle_exit())
* rcu_idle_exit()) then rcu_read_lock_held() returns false even if the CPU * then rcu_read_lock_held() sets *ret to false even if the CPU did an
* did an rcu_read_lock(). The reason for this is that RCU ignores CPUs * rcu_read_lock(). The reason for this is that RCU ignores CPUs that are
* that are in such a section, considering these as in extended quiescent * in such a section, considering these as in extended quiescent state,
* state, so such a CPU is effectively never in an RCU read-side critical * so such a CPU is effectively never in an RCU read-side critical section
* section regardless of what RCU primitives it invokes. This state of * regardless of what RCU primitives it invokes. This state of affairs is
* affairs is required --- we need to keep an RCU-free window in idle * required --- we need to keep an RCU-free window in idle where the CPU may
* where the CPU may possibly enter into low power mode. This way we can * possibly enter into low power mode. This way we can notice an extended
* notice an extended quiescent state to other CPUs that started a grace * quiescent state to other CPUs that started a grace period. Otherwise
* period. Otherwise we would delay any grace period as long as we run in * we would delay any grace period as long as we run in the idle task.
* the idle task.
* *
* Similarly, we avoid claiming an SRCU read lock held if the current * Similarly, we avoid claiming an RCU read lock held if the current
* CPU is offline. * CPU is offline.
*/ */
static bool rcu_read_lock_held_common(bool *ret)
{
if (!debug_lockdep_rcu_enabled()) {
*ret = 1;
return true;
}
if (!rcu_is_watching()) {
*ret = 0;
return true;
}
if (!rcu_lockdep_current_cpu_online()) {
*ret = 0;
return true;
}
return false;
}
int rcu_read_lock_sched_held(void) int rcu_read_lock_sched_held(void)
{ {
int lockdep_opinion = 0; bool ret;
if (!debug_lockdep_rcu_enabled()) if (rcu_read_lock_held_common(&ret))
return 1; return ret;
if (!rcu_is_watching()) return lock_is_held(&rcu_sched_lock_map) || !preemptible();
return 0;
if (!rcu_lockdep_current_cpu_online())
return 0;
if (debug_locks)
lockdep_opinion = lock_is_held(&rcu_sched_lock_map);
return lockdep_opinion || !preemptible();
} }
EXPORT_SYMBOL(rcu_read_lock_sched_held); EXPORT_SYMBOL(rcu_read_lock_sched_held);
#endif #endif
@ -147,8 +164,7 @@ static atomic_t rcu_expedited_nesting = ATOMIC_INIT(1);
*/ */
bool rcu_gp_is_expedited(void) bool rcu_gp_is_expedited(void)
{ {
return rcu_expedited || atomic_read(&rcu_expedited_nesting) || return rcu_expedited || atomic_read(&rcu_expedited_nesting);
rcu_scheduler_active == RCU_SCHEDULER_INIT;
} }
EXPORT_SYMBOL_GPL(rcu_gp_is_expedited); EXPORT_SYMBOL_GPL(rcu_gp_is_expedited);
@ -202,11 +218,7 @@ void rcu_test_sync_prims(void)
if (!IS_ENABLED(CONFIG_PROVE_RCU)) if (!IS_ENABLED(CONFIG_PROVE_RCU))
return; return;
synchronize_rcu(); synchronize_rcu();
synchronize_rcu_bh();
synchronize_sched();
synchronize_rcu_expedited(); synchronize_rcu_expedited();
synchronize_rcu_bh_expedited();
synchronize_sched_expedited();
} }
#if !defined(CONFIG_TINY_RCU) || defined(CONFIG_SRCU) #if !defined(CONFIG_TINY_RCU) || defined(CONFIG_SRCU)
@ -225,54 +237,6 @@ core_initcall(rcu_set_runtime_mode);
#endif /* #if !defined(CONFIG_TINY_RCU) || defined(CONFIG_SRCU) */ #endif /* #if !defined(CONFIG_TINY_RCU) || defined(CONFIG_SRCU) */
#ifdef CONFIG_PREEMPT_RCU
/*
* Preemptible RCU implementation for rcu_read_lock().
* Just increment ->rcu_read_lock_nesting, shared state will be updated
* if we block.
*/
void __rcu_read_lock(void)
{
current->rcu_read_lock_nesting++;
barrier(); /* critical section after entry code. */
}
EXPORT_SYMBOL_GPL(__rcu_read_lock);
/*
* Preemptible RCU implementation for rcu_read_unlock().
* Decrement ->rcu_read_lock_nesting. If the result is zero (outermost
* rcu_read_unlock()) and ->rcu_read_unlock_special is non-zero, then
* invoke rcu_read_unlock_special() to clean up after a context switch
* in an RCU read-side critical section and other special cases.
*/
void __rcu_read_unlock(void)
{
struct task_struct *t = current;
if (t->rcu_read_lock_nesting != 1) {
--t->rcu_read_lock_nesting;
} else {
barrier(); /* critical section before exit code. */
t->rcu_read_lock_nesting = INT_MIN;
barrier(); /* assign before ->rcu_read_unlock_special load */
if (unlikely(READ_ONCE(t->rcu_read_unlock_special.s)))
rcu_read_unlock_special(t);
barrier(); /* ->rcu_read_unlock_special load before assign */
t->rcu_read_lock_nesting = 0;
}
#ifdef CONFIG_PROVE_LOCKING
{
int rrln = READ_ONCE(t->rcu_read_lock_nesting);
WARN_ON_ONCE(rrln < 0 && rrln > INT_MIN / 2);
}
#endif /* #ifdef CONFIG_PROVE_LOCKING */
}
EXPORT_SYMBOL_GPL(__rcu_read_unlock);
#endif /* #ifdef CONFIG_PREEMPT_RCU */
#ifdef CONFIG_DEBUG_LOCK_ALLOC #ifdef CONFIG_DEBUG_LOCK_ALLOC
static struct lock_class_key rcu_lock_key; static struct lock_class_key rcu_lock_key;
struct lockdep_map rcu_lock_map = struct lockdep_map rcu_lock_map =
@ -300,6 +264,7 @@ int notrace debug_lockdep_rcu_enabled(void)
current->lockdep_recursion == 0; current->lockdep_recursion == 0;
} }
EXPORT_SYMBOL_GPL(debug_lockdep_rcu_enabled); EXPORT_SYMBOL_GPL(debug_lockdep_rcu_enabled);
NOKPROBE_SYMBOL(debug_lockdep_rcu_enabled);
/** /**
* rcu_read_lock_held() - might we be in RCU read-side critical section? * rcu_read_lock_held() - might we be in RCU read-side critical section?
@ -323,12 +288,10 @@ EXPORT_SYMBOL_GPL(debug_lockdep_rcu_enabled);
*/ */
int rcu_read_lock_held(void) int rcu_read_lock_held(void)
{ {
if (!debug_lockdep_rcu_enabled()) bool ret;
return 1;
if (!rcu_is_watching()) if (rcu_read_lock_held_common(&ret))
return 0; return ret;
if (!rcu_lockdep_current_cpu_online())
return 0;
return lock_is_held(&rcu_lock_map); return lock_is_held(&rcu_lock_map);
} }
EXPORT_SYMBOL_GPL(rcu_read_lock_held); EXPORT_SYMBOL_GPL(rcu_read_lock_held);
@ -345,21 +308,33 @@ EXPORT_SYMBOL_GPL(rcu_read_lock_held);
* *
* Check debug_lockdep_rcu_enabled() to prevent false positives during boot. * Check debug_lockdep_rcu_enabled() to prevent false positives during boot.
* *
* Note that rcu_read_lock() is disallowed if the CPU is either idle or * Note that rcu_read_lock_bh() is disallowed if the CPU is either idle or
* offline from an RCU perspective, so check for those as well. * offline from an RCU perspective, so check for those as well.
*/ */
int rcu_read_lock_bh_held(void) int rcu_read_lock_bh_held(void)
{ {
if (!debug_lockdep_rcu_enabled()) bool ret;
return 1;
if (!rcu_is_watching()) if (rcu_read_lock_held_common(&ret))
return 0; return ret;
if (!rcu_lockdep_current_cpu_online())
return 0;
return in_softirq() || irqs_disabled(); return in_softirq() || irqs_disabled();
} }
EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held); EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held);
int rcu_read_lock_any_held(void)
{
bool ret;
if (rcu_read_lock_held_common(&ret))
return ret;
if (lock_is_held(&rcu_lock_map) ||
lock_is_held(&rcu_bh_lock_map) ||
lock_is_held(&rcu_sched_lock_map))
return 1;
return !preemptible();
}
EXPORT_SYMBOL_GPL(rcu_read_lock_any_held);
#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
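rcu_read_lock_any_held() carries no kernel-doc in this hunk; its body shows the intent, which is letting a lockdep assertion accept any flavor of RCU reader (or a preemption-disabled region). A hedged kernel-context sketch of a possible caller, not part of the commit, with the structure and accessor names invented:

#include <linux/rcupdate.h>

struct demo_cfg {
	int val;
};

static struct demo_cfg __rcu *demo_cfg_ptr;

static int demo_cfg_val(void)
{
	struct demo_cfg *p;

	/*
	 * Complain (under lockdep) unless some RCU reader or a
	 * preemption-disabled region protects this access; any flavor
	 * satisfies rcu_read_lock_any_held().
	 */
	RCU_LOCKDEP_WARN(!rcu_read_lock_any_held(),
			 "demo_cfg_val() needs rcu_read_lock*() or preemption disabled");
	p = rcu_dereference_check(demo_cfg_ptr, rcu_read_lock_any_held());
	return p ? p->val : 0;
}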
/** /**
@ -383,11 +358,10 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
int i; int i;
int j; int j;
/* Initialize and register callbacks for each flavor specified. */ /* Initialize and register callbacks for each crcu_array element. */
for (i = 0; i < n; i++) { for (i = 0; i < n; i++) {
if (checktiny && if (checktiny &&
(crcu_array[i] == call_rcu || (crcu_array[i] == call_rcu)) {
crcu_array[i] == call_rcu_bh)) {
might_sleep(); might_sleep();
continue; continue;
} }
@ -403,8 +377,7 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
/* Wait for all callbacks to be invoked. */ /* Wait for all callbacks to be invoked. */
for (i = 0; i < n; i++) { for (i = 0; i < n; i++) {
if (checktiny && if (checktiny &&
(crcu_array[i] == call_rcu || (crcu_array[i] == call_rcu))
crcu_array[i] == call_rcu_bh))
continue; continue;
for (j = 0; j < i; j++) for (j = 0; j < i; j++)
if (crcu_array[j] == crcu_array[i]) if (crcu_array[j] == crcu_array[i])
@ -487,80 +460,41 @@ EXPORT_SYMBOL_GPL(do_trace_rcu_torture_read);
do { } while (0) do { } while (0)
#endif #endif
#ifdef CONFIG_RCU_STALL_COMMON #if IS_ENABLED(CONFIG_RCU_TORTURE_TEST) || IS_MODULE(CONFIG_RCU_TORTURE_TEST)
/* Get rcutorture access to sched_setaffinity(). */
long rcutorture_sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
{
int ret;
#ifdef CONFIG_PROVE_RCU ret = sched_setaffinity(pid, in_mask);
#define RCU_STALL_DELAY_DELTA (5 * HZ) WARN_ONCE(ret, "%s: sched_setaffinity() returned %d\n", __func__, ret);
#else return ret;
#define RCU_STALL_DELAY_DELTA 0 }
EXPORT_SYMBOL_GPL(rcutorture_sched_setaffinity);
#endif #endif
#ifdef CONFIG_RCU_STALL_COMMON
int rcu_cpu_stall_ftrace_dump __read_mostly;
module_param(rcu_cpu_stall_ftrace_dump, int, 0644);
int rcu_cpu_stall_suppress __read_mostly; /* 1 = suppress stall warnings. */ int rcu_cpu_stall_suppress __read_mostly; /* 1 = suppress stall warnings. */
static int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT; EXPORT_SYMBOL_GPL(rcu_cpu_stall_suppress);
module_param(rcu_cpu_stall_suppress, int, 0644); module_param(rcu_cpu_stall_suppress, int, 0644);
int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT;
module_param(rcu_cpu_stall_timeout, int, 0644); module_param(rcu_cpu_stall_timeout, int, 0644);
int rcu_jiffies_till_stall_check(void)
{
int till_stall_check = READ_ONCE(rcu_cpu_stall_timeout);
/*
* Limit check must be consistent with the Kconfig limits
* for CONFIG_RCU_CPU_STALL_TIMEOUT.
*/
if (till_stall_check < 3) {
WRITE_ONCE(rcu_cpu_stall_timeout, 3);
till_stall_check = 3;
} else if (till_stall_check > 300) {
WRITE_ONCE(rcu_cpu_stall_timeout, 300);
till_stall_check = 300;
}
return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
}
void rcu_sysrq_start(void)
{
if (!rcu_cpu_stall_suppress)
rcu_cpu_stall_suppress = 2;
}
void rcu_sysrq_end(void)
{
if (rcu_cpu_stall_suppress == 2)
rcu_cpu_stall_suppress = 0;
}
static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
{
rcu_cpu_stall_suppress = 1;
return NOTIFY_DONE;
}
static struct notifier_block rcu_panic_block = {
.notifier_call = rcu_panic,
};
static int __init check_cpu_stall_init(void)
{
atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block);
return 0;
}
early_initcall(check_cpu_stall_init);
#endif /* #ifdef CONFIG_RCU_STALL_COMMON */ #endif /* #ifdef CONFIG_RCU_STALL_COMMON */
#ifdef CONFIG_TASKS_RCU #ifdef CONFIG_TASKS_RCU
/* /*
* Simple variant of RCU whose quiescent states are voluntary context switch, * Simple variant of RCU whose quiescent states are voluntary context
* user-space execution, and idle. As such, grace periods can take one good * switch, cond_resched_rcu_qs(), user-space execution, and idle.
* long time. There are no read-side primitives similar to rcu_read_lock() * As such, grace periods can take one good long time. There are no
* and rcu_read_unlock() because this implementation is intended to get * read-side primitives similar to rcu_read_lock() and rcu_read_unlock()
* the system into a safe state for some of the manipulations involved in * because this implementation is intended to get the system into a safe
* tracing and the like. Finally, this implementation does not support * state for some of the manipulations involved in tracing and the like.
* high call_rcu_tasks() rates from multiple CPUs. If this is required, * Finally, this implementation does not support high call_rcu_tasks()
* per-CPU callback lists will be needed. * rates from multiple CPUs. If this is required, per-CPU callback lists
* will be needed.
*/ */
/* Global list of callbacks and associated lock. */ /* Global list of callbacks and associated lock. */
@ -577,7 +511,6 @@ DEFINE_STATIC_SRCU(tasks_rcu_exit_srcu);
static int rcu_task_stall_timeout __read_mostly = RCU_TASK_STALL_TIMEOUT; static int rcu_task_stall_timeout __read_mostly = RCU_TASK_STALL_TIMEOUT;
module_param(rcu_task_stall_timeout, int, 0644); module_param(rcu_task_stall_timeout, int, 0644);
static void rcu_spawn_tasks_kthread(void);
static struct task_struct *rcu_tasks_kthread_ptr; static struct task_struct *rcu_tasks_kthread_ptr;
/** /**
@ -589,11 +522,11 @@ static struct task_struct *rcu_tasks_kthread_ptr;
* period elapses, in other words after all currently executing RCU * period elapses, in other words after all currently executing RCU
* read-side critical sections have completed. call_rcu_tasks() assumes * read-side critical sections have completed. call_rcu_tasks() assumes
* that the read-side critical sections end at a voluntary context * that the read-side critical sections end at a voluntary context
* switch (not a preemption!), entry into idle, or transition to usermode * switch (not a preemption!), cond_resched_rcu_qs(), entry into idle,
* execution. As such, there are no read-side primitives analogous to * or transition to usermode execution. As such, there are no read-side
* rcu_read_lock() and rcu_read_unlock() because this primitive is intended * primitives analogous to rcu_read_lock() and rcu_read_unlock() because
* to determine that all tasks have passed through a safe state, not so * this primitive is intended to determine that all tasks have passed
* much for data-structure synchronization. * through a safe state, not so much for data-structure synchronization.
* *
* See the description of call_rcu() for more detailed information on * See the description of call_rcu() for more detailed information on
* memory ordering guarantees. * memory ordering guarantees.
@ -602,7 +535,6 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
{ {
unsigned long flags; unsigned long flags;
bool needwake; bool needwake;
bool havetask = READ_ONCE(rcu_tasks_kthread_ptr);
rhp->next = NULL; rhp->next = NULL;
rhp->func = func; rhp->func = func;
@ -612,11 +544,8 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
rcu_tasks_cbs_tail = &rhp->next; rcu_tasks_cbs_tail = &rhp->next;
raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags); raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
/* We can't create the thread unless interrupts are enabled. */ /* We can't create the thread unless interrupts are enabled. */
if ((needwake && havetask) || if (needwake && READ_ONCE(rcu_tasks_kthread_ptr))
(!havetask && !irqs_disabled_flags(flags))) {
rcu_spawn_tasks_kthread();
wake_up(&rcu_tasks_cbs_wq); wake_up(&rcu_tasks_cbs_wq);
}
} }
EXPORT_SYMBOL_GPL(call_rcu_tasks); EXPORT_SYMBOL_GPL(call_rcu_tasks);
@ -627,7 +556,7 @@ EXPORT_SYMBOL_GPL(call_rcu_tasks);
* grace period has elapsed, in other words after all currently * grace period has elapsed, in other words after all currently
* executing rcu-tasks read-side critical sections have elapsed. These * executing rcu-tasks read-side critical sections have elapsed. These
* read-side critical sections are delimited by calls to schedule(), * read-side critical sections are delimited by calls to schedule(),
* cond_resched_rcu_qs(), idle execution, userspace execution, calls * cond_resched_tasks_rcu_qs(), idle execution, userspace execution, calls
* to synchronize_rcu_tasks(), and (in theory, anyway) cond_resched(). * to synchronize_rcu_tasks(), and (in theory, anyway) cond_resched().
* *
* This is a very specialized primitive, intended only for a few uses in * This is a very specialized primitive, intended only for a few uses in
@ -750,19 +679,19 @@ static int __noreturn rcu_tasks_kthread(void *arg)
/* /*
* Wait for all pre-existing t->on_rq and t->nvcsw * Wait for all pre-existing t->on_rq and t->nvcsw
* transitions to complete. Invoking synchronize_sched() * transitions to complete. Invoking synchronize_rcu()
* suffices because all these transitions occur with * suffices because all these transitions occur with
* interrupts disabled. Without this synchronize_sched(), * interrupts disabled. Without this synchronize_rcu(),
* a read-side critical section that started before the * a read-side critical section that started before the
* grace period might be incorrectly seen as having started * grace period might be incorrectly seen as having started
* after the grace period. * after the grace period.
* *
* This synchronize_sched() also dispenses with the * This synchronize_rcu() also dispenses with the
* need for a memory barrier on the first store to * need for a memory barrier on the first store to
* ->rcu_tasks_holdout, as it forces the store to happen * ->rcu_tasks_holdout, as it forces the store to happen
* after the beginning of the grace period. * after the beginning of the grace period.
*/ */
synchronize_sched(); synchronize_rcu();
/* /*
* There were callbacks, so we need to wait for an * There were callbacks, so we need to wait for an
@ -789,7 +718,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
* This does only part of the job, ensuring that all * This does only part of the job, ensuring that all
* tasks that were previously exiting reach the point * tasks that were previously exiting reach the point
* where they have disabled preemption, allowing the * where they have disabled preemption, allowing the
* later synchronize_sched() to finish the job. * later synchronize_rcu() to finish the job.
*/ */
synchronize_srcu(&tasks_rcu_exit_srcu); synchronize_srcu(&tasks_rcu_exit_srcu);
@ -827,20 +756,20 @@ static int __noreturn rcu_tasks_kthread(void *arg)
* cause their RCU-tasks read-side critical sections to * cause their RCU-tasks read-side critical sections to
* extend past the end of the grace period. However, * extend past the end of the grace period. However,
* because these ->nvcsw updates are carried out with * because these ->nvcsw updates are carried out with
* interrupts disabled, we can use synchronize_sched() * interrupts disabled, we can use synchronize_rcu()
* to force the needed ordering on all such CPUs. * to force the needed ordering on all such CPUs.
* *
* This synchronize_sched() also confines all * This synchronize_rcu() also confines all
* ->rcu_tasks_holdout accesses to be within the grace * ->rcu_tasks_holdout accesses to be within the grace
* period, avoiding the need for memory barriers for * period, avoiding the need for memory barriers for
* ->rcu_tasks_holdout accesses. * ->rcu_tasks_holdout accesses.
* *
* In addition, this synchronize_sched() waits for exiting * In addition, this synchronize_rcu() waits for exiting
* tasks to complete their final preempt_disable() region * tasks to complete their final preempt_disable() region
* of execution, cleaning up after the synchronize_srcu() * of execution, cleaning up after the synchronize_srcu()
* above. * above.
*/ */
synchronize_sched(); synchronize_rcu();
/* Invoke the callbacks. */ /* Invoke the callbacks. */
while (list) { while (list) {
@ -851,31 +780,24 @@ static int __noreturn rcu_tasks_kthread(void *arg)
list = next; list = next;
cond_resched(); cond_resched();
} }
/* Paranoid sleep to keep this from entering a tight loop */
schedule_timeout_uninterruptible(HZ/10); schedule_timeout_uninterruptible(HZ/10);
} }
} }
/* Spawn rcu_tasks_kthread() at first call to call_rcu_tasks(). */ /* Spawn rcu_tasks_kthread() at core_initcall() time. */
static void rcu_spawn_tasks_kthread(void) static int __init rcu_spawn_tasks_kthread(void)
{ {
static DEFINE_MUTEX(rcu_tasks_kthread_mutex);
struct task_struct *t; struct task_struct *t;
if (READ_ONCE(rcu_tasks_kthread_ptr)) {
smp_mb(); /* Ensure caller sees full kthread. */
return;
}
mutex_lock(&rcu_tasks_kthread_mutex);
if (rcu_tasks_kthread_ptr) {
mutex_unlock(&rcu_tasks_kthread_mutex);
return;
}
t = kthread_run(rcu_tasks_kthread, NULL, "rcu_tasks_kthread"); t = kthread_run(rcu_tasks_kthread, NULL, "rcu_tasks_kthread");
BUG_ON(IS_ERR(t)); if (WARN_ONCE(IS_ERR(t), "%s: Could not start Tasks-RCU grace-period kthread, OOM is now expected behavior\n", __func__))
return 0;
smp_mb(); /* Ensure others see full kthread. */ smp_mb(); /* Ensure others see full kthread. */
WRITE_ONCE(rcu_tasks_kthread_ptr, t); WRITE_ONCE(rcu_tasks_kthread_ptr, t);
mutex_unlock(&rcu_tasks_kthread_mutex); return 0;
} }
core_initcall(rcu_spawn_tasks_kthread);
/* Do the srcu_read_lock() for the above synchronize_srcu(). */ /* Do the srcu_read_lock() for the above synchronize_srcu(). */
void exit_tasks_rcu_start(void) void exit_tasks_rcu_start(void)
@ -915,15 +837,10 @@ static void __init rcu_tasks_bootup_oddness(void)
#ifdef CONFIG_PROVE_RCU #ifdef CONFIG_PROVE_RCU
/* /*
* Early boot self test parameters, one for each flavor * Early boot self test parameters.
*/ */
static bool rcu_self_test; static bool rcu_self_test;
static bool rcu_self_test_bh;
static bool rcu_self_test_sched;
module_param(rcu_self_test, bool, 0444); module_param(rcu_self_test, bool, 0444);
module_param(rcu_self_test_bh, bool, 0444);
module_param(rcu_self_test_sched, bool, 0444);
static int rcu_self_test_counter; static int rcu_self_test_counter;
@ -933,25 +850,16 @@ static void test_callback(struct rcu_head *r)
pr_info("RCU test callback executed %d\n", rcu_self_test_counter); pr_info("RCU test callback executed %d\n", rcu_self_test_counter);
} }
DEFINE_STATIC_SRCU(early_srcu);
static void early_boot_test_call_rcu(void) static void early_boot_test_call_rcu(void)
{ {
static struct rcu_head head; static struct rcu_head head;
static struct rcu_head shead;
call_rcu(&head, test_callback); call_rcu(&head, test_callback);
} if (IS_ENABLED(CONFIG_SRCU))
call_srcu(&early_srcu, &shead, test_callback);
static void early_boot_test_call_rcu_bh(void)
{
static struct rcu_head head;
call_rcu_bh(&head, test_callback);
}
static void early_boot_test_call_rcu_sched(void)
{
static struct rcu_head head;
call_rcu_sched(&head, test_callback);
} }
void rcu_early_boot_tests(void) void rcu_early_boot_tests(void)
@ -960,10 +868,6 @@ void rcu_early_boot_tests(void)
if (rcu_self_test) if (rcu_self_test)
early_boot_test_call_rcu(); early_boot_test_call_rcu();
if (rcu_self_test_bh)
early_boot_test_call_rcu_bh();
if (rcu_self_test_sched)
early_boot_test_call_rcu_sched();
rcu_test_sync_prims(); rcu_test_sync_prims();
} }
@ -975,16 +879,11 @@ static int rcu_verify_early_boot_tests(void)
if (rcu_self_test) { if (rcu_self_test) {
early_boot_test_counter++; early_boot_test_counter++;
rcu_barrier(); rcu_barrier();
if (IS_ENABLED(CONFIG_SRCU)) {
early_boot_test_counter++;
srcu_barrier(&early_srcu);
}
} }
if (rcu_self_test_bh) {
early_boot_test_counter++;
rcu_barrier_bh();
}
if (rcu_self_test_sched) {
early_boot_test_counter++;
rcu_barrier_sched();
}
if (rcu_self_test_counter != early_boot_test_counter) { if (rcu_self_test_counter != early_boot_test_counter) {
WARN_ON(1); WARN_ON(1);
ret = -1; ret = -1;



@ -5220,6 +5220,7 @@ int __sched _cond_resched(void)
preempt_schedule_common(); preempt_schedule_common();
return 1; return 1;
} }
rcu_all_qs();
return 0; return 0;
} }
EXPORT_SYMBOL(_cond_resched); EXPORT_SYMBOL(_cond_resched);
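The hunk above folds an RCU quiescent-state report (rcu_all_qs()) into _cond_resched(). A hedged sketch of the caller-side effect -- struct item and process_item() are hypothetical placeholders:

struct item;                            /* hypothetical payload */
void process_item(struct item *it);     /* hypothetical worker */

/* A long-running loop no longer needs the special cond_resched_rcu_qs()
 * helper; plain cond_resched() both yields the CPU when needed and
 * reports a quiescent state, avoiding RCU CPU stall warnings. */
static void process_all_items(struct item **items, int nr)
{
        int i;

        for (i = 0; i < nr; i++) {
                process_item(items[i]);
                cond_resched();
        }
}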
@ -5543,6 +5544,7 @@ void sched_show_task(struct task_struct *p)
show_stack(p, NULL); show_stack(p, NULL);
put_task_stack(p); put_task_stack(p);
} }
EXPORT_SYMBOL_GPL(sched_show_task);
static inline bool static inline bool
state_filter_match(unsigned long state_filter, struct task_struct *p) state_filter_match(unsigned long state_filter, struct task_struct *p)
@ -6326,10 +6328,6 @@ int sched_cpu_deactivate(unsigned int cpu)
* *
* Do sync before park smpboot threads to take care the rcu boost case. * Do sync before park smpboot threads to take care the rcu boost case.
*/ */
#ifdef CONFIG_PREEMPT
synchronize_sched();
#endif
synchronize_rcu(); synchronize_rcu();
#ifdef CONFIG_SCHED_SMT #ifdef CONFIG_SCHED_SMT


@ -52,8 +52,8 @@ EXPORT_SYMBOL_GPL(cpufreq_add_update_util_hook);
* *
* Clear the update_util_data pointer for the given CPU. * Clear the update_util_data pointer for the given CPU.
* *
* Callers must use RCU-sched callbacks to free any memory that might be * Callers must use RCU callbacks to free any memory that might be
* accessed via the old update_util_data pointer or invoke synchronize_sched() * accessed via the old update_util_data pointer or invoke synchronize_rcu()
* right after this function to avoid use-after-free. * right after this function to avoid use-after-free.
*/ */
void cpufreq_remove_update_util_hook(int cpu) void cpufreq_remove_update_util_hook(int cpu)
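A sketch of the usage rule stated in the updated comment: remove the hooks, wait one (consolidated) RCU grace period, then free the data the callbacks might still be touching. The per-CPU variable and governor-stop function names are hypothetical; the real schedutil change a few hunks below (sugov_stop()) follows exactly this shape.

struct my_hook;                                 /* hypothetical per-CPU payload */
static DEFINE_PER_CPU(struct my_hook *, my_hook_data);

static void my_governor_stop(struct cpufreq_policy *policy)
{
        unsigned int cpu;

        for_each_cpu(cpu, policy->cpus)
                cpufreq_remove_update_util_hook(cpu);

        /* Wait out in-flight update_util callbacks before freeing. */
        synchronize_rcu();

        for_each_cpu(cpu, policy->cpus)
                kfree(per_cpu(my_hook_data, cpu));
}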


@ -1075,7 +1075,7 @@ static void sugov_stop(struct cpufreq_policy *policy)
for_each_cpu(cpu, policy->cpus) for_each_cpu(cpu, policy->cpus)
cpufreq_remove_update_util_hook(cpu); cpufreq_remove_update_util_hook(cpu);
synchronize_sched(); synchronize_rcu();
if (!policy->fast_switch_enabled) { if (!policy->fast_switch_enabled) {
irq_work_sync(&sg_policy->irq_work); irq_work_sync(&sg_policy->irq_work);


@ -1648,7 +1648,7 @@ static void task_numa_compare(struct task_numa_env *env,
int dist = env->dist; int dist = env->dist;
rcu_read_lock(); rcu_read_lock();
cur = task_rcu_dereference(&dst_rq->curr); cur = rcu_dereference(dst_rq->curr);
if (cur && ((cur->flags & PF_EXITING) || is_idle_task(cur))) if (cur && ((cur->flags & PF_EXITING) || is_idle_task(cur)))
cur = NULL; cur = NULL;


@ -79,7 +79,7 @@ static int membarrier_private_expedited(void)
if (cpu == raw_smp_processor_id()) if (cpu == raw_smp_processor_id())
continue; continue;
rcu_read_lock(); rcu_read_lock();
p = task_rcu_dereference(&cpu_rq(cpu)->curr); p = rcu_dereference(cpu_rq(cpu)->curr);
if (p && p->mm == current->mm) { if (p && p->mm == current->mm) {
if (!fallback) if (!fallback)
__cpumask_set_cpu(cpu, tmpmask); __cpumask_set_cpu(cpu, tmpmask);
@ -167,7 +167,7 @@ SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
if (tick_nohz_full_enabled()) if (tick_nohz_full_enabled())
return -EINVAL; return -EINVAL;
if (num_online_cpus() > 1) if (num_online_cpus() > 1)
synchronize_sched(); synchronize_rcu();
return 0; return 0;
case MEMBARRIER_CMD_PRIVATE_EXPEDITED: case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
return membarrier_private_expedited(); return membarrier_private_expedited();


@ -1248,7 +1248,7 @@ extern void sched_ttwu_pending(void);
/* /*
* The domain tree (rq->sd) is protected by RCU's quiescent state transition. * The domain tree (rq->sd) is protected by RCU's quiescent state transition.
* See detach_destroy_domains: synchronize_sched for details. * See destroy_sched_domains: call_rcu for details.
* *
* The domain tree of any CPU may only be accessed from within * The domain tree of any CPU may only be accessed from within
* preempt-disabled sections. * preempt-disabled sections.
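A sketch of what the updated comment permits inside kernel/sched: with the RCU flavors consolidated, either rcu_read_lock() or a preempt-disabled region protects a walk of the domain tree that destroy_sched_domains() frees via call_rcu(). The counting helper is illustrative only; for_each_domain() is the existing rcu_dereference()-based iterator.

static int nr_sched_domains(int cpu)
{
        struct sched_domain *sd;
        int n = 0;

        rcu_read_lock();
        for_each_domain(cpu, sd)        /* walks rq->sd via rcu_dereference() */
                n++;
        rcu_read_unlock();

        return n;
}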


@ -262,7 +262,7 @@ void rq_attach_root(struct rq *rq, struct root_domain *rd)
raw_spin_unlock_irqrestore(&rq->lock, flags); raw_spin_unlock_irqrestore(&rq->lock, flags);
if (old_rd) if (old_rd)
call_rcu_sched(&old_rd->rcu, free_rootdomain); call_rcu(&old_rd->rcu, free_rootdomain);
} }
void sched_get_rd(struct root_domain *rd) void sched_get_rd(struct root_domain *rd)
@ -275,7 +275,7 @@ void sched_put_rd(struct root_domain *rd)
if (!atomic_dec_and_test(&rd->refcount)) if (!atomic_dec_and_test(&rd->refcount))
return; return;
call_rcu_sched(&rd->rcu, free_rootdomain); call_rcu(&rd->rcu, free_rootdomain);
} }
static int init_rootdomain(struct root_domain *rd) static int init_rootdomain(struct root_domain *rd)


@ -302,7 +302,8 @@ restart:
} }
__this_cpu_write(active_softirqs, 0); __this_cpu_write(active_softirqs, 0);
rcu_bh_qs(); if (__this_cpu_read(ksoftirqd) == current)
rcu_softirq_qs();
local_irq_disable(); local_irq_disable();
pending = local_softirq_pending(); pending = local_softirq_pending();
@ -677,7 +678,7 @@ static void run_ksoftirqd(unsigned int cpu)
*/ */
__do_softirq(); __do_softirq();
local_irq_enable(); local_irq_enable();
cond_resched_rcu_qs(); cond_resched_tasks_rcu_qs();
return; return;
} }
local_irq_enable(); local_irq_enable();


@ -949,8 +949,11 @@ static struct timer_base *lock_timer_base(struct timer_list *timer,
} }
} }
#define MOD_TIMER_PENDING_ONLY 0x01
#define MOD_TIMER_REDUCE 0x02
static inline int static inline int
__mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only) __mod_timer(struct timer_list *timer, unsigned long expires, unsigned int options)
{ {
struct timer_base *base, *new_base; struct timer_base *base, *new_base;
unsigned int idx = UINT_MAX; unsigned int idx = UINT_MAX;
@ -970,7 +973,11 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
* larger granularity than you would get from adding a new * larger granularity than you would get from adding a new
* timer with this expiry. * timer with this expiry.
*/ */
if (timer->expires == expires) long diff = timer->expires - expires;
if (!diff)
return 1;
if (options & MOD_TIMER_REDUCE && diff <= 0)
return 1; return 1;
/* /*
@ -982,6 +989,12 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
base = lock_timer_base(timer, &flags); base = lock_timer_base(timer, &flags);
forward_timer_base(base); forward_timer_base(base);
if (timer_pending(timer) && (options & MOD_TIMER_REDUCE) &&
time_before_eq(timer->expires, expires)) {
ret = 1;
goto out_unlock;
}
clk = base->clk; clk = base->clk;
idx = calc_wheel_index(expires, clk); idx = calc_wheel_index(expires, clk);
@ -991,7 +1004,10 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
* subsequent call will exit in the expires check above. * subsequent call will exit in the expires check above.
*/ */
if (idx == timer_get_idx(timer)) { if (idx == timer_get_idx(timer)) {
timer->expires = expires; if (!(options & MOD_TIMER_REDUCE))
timer->expires = expires;
else if (time_after(timer->expires, expires))
timer->expires = expires;
ret = 1; ret = 1;
goto out_unlock; goto out_unlock;
} }
@ -1001,7 +1017,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
} }
ret = detach_if_pending(timer, base, false); ret = detach_if_pending(timer, base, false);
if (!ret && pending_only) if (!ret && (options & MOD_TIMER_PENDING_ONLY))
goto out_unlock; goto out_unlock;
new_base = get_target_base(base, timer->flags); new_base = get_target_base(base, timer->flags);
@ -1062,7 +1078,7 @@ out_unlock:
*/ */
int mod_timer_pending(struct timer_list *timer, unsigned long expires) int mod_timer_pending(struct timer_list *timer, unsigned long expires)
{ {
return __mod_timer(timer, expires, true); return __mod_timer(timer, expires, MOD_TIMER_PENDING_ONLY);
} }
EXPORT_SYMBOL(mod_timer_pending); EXPORT_SYMBOL(mod_timer_pending);
@ -1088,10 +1104,25 @@ EXPORT_SYMBOL(mod_timer_pending);
*/ */
int mod_timer(struct timer_list *timer, unsigned long expires) int mod_timer(struct timer_list *timer, unsigned long expires)
{ {
return __mod_timer(timer, expires, false); return __mod_timer(timer, expires, 0);
} }
EXPORT_SYMBOL(mod_timer); EXPORT_SYMBOL(mod_timer);
/**
* timer_reduce - Modify a timer's timeout if it would reduce the timeout
* @timer: The timer to be modified
* @expires: New timeout in jiffies
*
* timer_reduce() is very similar to mod_timer(), except that it will only
* modify a running timer if that would reduce the expiration time (it will
* start a timer that isn't running).
*/
int timer_reduce(struct timer_list *timer, unsigned long expires)
{
return __mod_timer(timer, expires, MOD_TIMER_REDUCE);
}
EXPORT_SYMBOL(timer_reduce);
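A usage sketch for the new timer_reduce() helper added above. The deadline timer and its caller are hypothetical, and the timer is assumed to have been initialised with timer_setup() elsewhere.

static struct timer_list deadline_timer;

static void note_deadline(unsigned long expires)
{
        /* Unlike mod_timer(), this never pushes an already-armed timer
         * further out; it only pulls the expiry in, or starts the timer
         * if it was not pending. */
        timer_reduce(&deadline_timer, expires);
}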
/** /**
* add_timer - start a timer * add_timer - start a timer
* @timer: the timer to be added * @timer: the timer to be added
@ -1641,7 +1672,7 @@ void update_process_times(int user_tick)
/* Note: this timer irq context must be accounted for as well. */ /* Note: this timer irq context must be accounted for as well. */
account_process_tick(p, user_tick); account_process_tick(p, user_tick);
run_local_timers(); run_local_timers();
rcu_check_callbacks(user_tick); rcu_sched_clock_irq(user_tick);
#ifdef CONFIG_IRQ_WORK #ifdef CONFIG_IRQ_WORK
if (in_irq()) if (in_irq())
irq_work_tick(); irq_work_tick();
@ -1731,9 +1762,20 @@ void run_local_timers(void)
raise_softirq(TIMER_SOFTIRQ); raise_softirq(TIMER_SOFTIRQ);
} }
static void process_timeout(unsigned long __data) /*
* Since schedule_timeout()'s timer is defined on the stack, it must store
* the target task on the stack as well.
*/
struct process_timer {
struct timer_list timer;
struct task_struct *task;
};
static void process_timeout(struct timer_list *t)
{ {
wake_up_process((struct task_struct *)__data); struct process_timer *timeout = from_timer(timeout, t, timer);
wake_up_process(timeout->task);
} }
/** /**
@ -1767,7 +1809,7 @@ static void process_timeout(unsigned long __data)
*/ */
signed long __sched schedule_timeout(signed long timeout) signed long __sched schedule_timeout(signed long timeout)
{ {
struct timer_list timer; struct process_timer timer;
unsigned long expire; unsigned long expire;
switch (timeout) switch (timeout)
@ -1801,13 +1843,14 @@ signed long __sched schedule_timeout(signed long timeout)
expire = timeout + jiffies; expire = timeout + jiffies;
setup_timer_on_stack(&timer, process_timeout, (unsigned long)current); timer.task = current;
__mod_timer(&timer, expire, false); timer_setup_on_stack(&timer.timer, process_timeout, 0);
__mod_timer(&timer.timer, expire, 0);
schedule(); schedule();
del_singleshot_timer_sync(&timer); del_singleshot_timer_sync(&timer.timer);
/* Remove the timer from the object tracker */ /* Remove the timer from the object tracker */
destroy_timer_on_stack(&timer); destroy_timer_on_stack(&timer.timer);
timeout = expire - jiffies; timeout = expire - jiffies;
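The hunk above moves schedule_timeout() to the timer_setup()/from_timer() convention, in which the callback receives the timer_list pointer and recovers its container. A hedged sketch of that convention with a hypothetical device structure:

struct my_dev {
        struct timer_list watchdog;
        bool expired;
};

static void my_watchdog_fn(struct timer_list *t)
{
        struct my_dev *dev = from_timer(dev, t, watchdog);

        dev->expired = true;
}

static void my_dev_start(struct my_dev *dev)
{
        dev->expired = false;
        timer_setup(&dev->watchdog, my_watchdog_fn, 0);
        mod_timer(&dev->watchdog, jiffies + HZ);
}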


@ -20,6 +20,9 @@
* Author: Paul E. McKenney <paulmck@us.ibm.com> * Author: Paul E. McKenney <paulmck@us.ibm.com>
* Based on kernel/rcu/torture.c. * Based on kernel/rcu/torture.c.
*/ */
#define pr_fmt(fmt) fmt
#include <linux/types.h> #include <linux/types.h>
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/init.h> #include <linux/init.h>
@ -47,12 +50,13 @@
#include <linux/ktime.h> #include <linux/ktime.h>
#include <asm/byteorder.h> #include <asm/byteorder.h>
#include <linux/torture.h> #include <linux/torture.h>
#include "rcu/rcu.h"
MODULE_LICENSE("GPL"); MODULE_LICENSE("GPL");
MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com>"); MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com>");
static char *torture_type; static char *torture_type;
static bool verbose; static int verbose;
/* Mediate rmmod and system shutdown. Concurrent rmmod & shutdown illegal! */ /* Mediate rmmod and system shutdown. Concurrent rmmod & shutdown illegal! */
#define FULLSTOP_DONTSTOP 0 /* Normal operation. */ #define FULLSTOP_DONTSTOP 0 /* Normal operation. */
@ -60,7 +64,6 @@ static bool verbose;
#define FULLSTOP_RMMOD 2 /* Normal rmmod of torture. */ #define FULLSTOP_RMMOD 2 /* Normal rmmod of torture. */
static int fullstop = FULLSTOP_RMMOD; static int fullstop = FULLSTOP_RMMOD;
static DEFINE_MUTEX(fullstop_mutex); static DEFINE_MUTEX(fullstop_mutex);
static int *torture_runnable;
#ifdef CONFIG_HOTPLUG_CPU #ifdef CONFIG_HOTPLUG_CPU
@ -72,6 +75,7 @@ static int *torture_runnable;
static struct task_struct *onoff_task; static struct task_struct *onoff_task;
static long onoff_holdoff; static long onoff_holdoff;
static long onoff_interval; static long onoff_interval;
static torture_ofl_func *onoff_f;
static long n_offline_attempts; static long n_offline_attempts;
static long n_offline_successes; static long n_offline_successes;
static unsigned long sum_offline; static unsigned long sum_offline;
@ -97,8 +101,10 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
if (!cpu_online(cpu) || !cpu_is_hotpluggable(cpu)) if (!cpu_online(cpu) || !cpu_is_hotpluggable(cpu))
return false; return false;
if (num_online_cpus() <= 1)
return false; /* Can't offline the last CPU. */
if (verbose) if (verbose > 1)
pr_alert("%s" TORTURE_FLAG pr_alert("%s" TORTURE_FLAG
"torture_onoff task: offlining %d\n", "torture_onoff task: offlining %d\n",
torture_type, cpu); torture_type, cpu);
@ -111,10 +117,12 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
"torture_onoff task: offline %d failed: errno %d\n", "torture_onoff task: offline %d failed: errno %d\n",
torture_type, cpu, ret); torture_type, cpu, ret);
} else { } else {
if (verbose) if (verbose > 1)
pr_alert("%s" TORTURE_FLAG pr_alert("%s" TORTURE_FLAG
"torture_onoff task: offlined %d\n", "torture_onoff task: offlined %d\n",
torture_type, cpu); torture_type, cpu);
if (onoff_f)
onoff_f();
(*n_offl_successes)++; (*n_offl_successes)++;
delta = jiffies - starttime; delta = jiffies - starttime;
*sum_offl += delta; *sum_offl += delta;
@ -147,7 +155,7 @@ bool torture_online(int cpu, long *n_onl_attempts, long *n_onl_successes,
if (cpu_online(cpu) || !cpu_is_hotpluggable(cpu)) if (cpu_online(cpu) || !cpu_is_hotpluggable(cpu))
return false; return false;
if (verbose) if (verbose > 1)
pr_alert("%s" TORTURE_FLAG pr_alert("%s" TORTURE_FLAG
"torture_onoff task: onlining %d\n", "torture_onoff task: onlining %d\n",
torture_type, cpu); torture_type, cpu);
@ -160,7 +168,7 @@ bool torture_online(int cpu, long *n_onl_attempts, long *n_onl_successes,
"torture_onoff task: online %d failed: errno %d\n", "torture_onoff task: online %d failed: errno %d\n",
torture_type, cpu, ret); torture_type, cpu, ret);
} else { } else {
if (verbose) if (verbose > 1)
pr_alert("%s" TORTURE_FLAG pr_alert("%s" TORTURE_FLAG
"torture_onoff task: onlined %d\n", "torture_onoff task: onlined %d\n",
torture_type, cpu); torture_type, cpu);
@ -191,11 +199,23 @@ torture_onoff(void *arg)
int cpu; int cpu;
int maxcpu = -1; int maxcpu = -1;
DEFINE_TORTURE_RANDOM(rand); DEFINE_TORTURE_RANDOM(rand);
int ret;
VERBOSE_TOROUT_STRING("torture_onoff task started"); VERBOSE_TOROUT_STRING("torture_onoff task started");
for_each_online_cpu(cpu) for_each_online_cpu(cpu)
maxcpu = cpu; maxcpu = cpu;
WARN_ON(maxcpu < 0); WARN_ON(maxcpu < 0);
if (!IS_MODULE(CONFIG_TORTURE_TEST))
for_each_possible_cpu(cpu) {
if (cpu_online(cpu))
continue;
ret = cpu_up(cpu);
if (ret && verbose) {
pr_alert("%s" TORTURE_FLAG
"%s: Initial online %d: errno %d\n",
__func__, torture_type, cpu, ret);
}
}
if (maxcpu == 0) { if (maxcpu == 0) {
VERBOSE_TOROUT_STRING("Only one CPU, so CPU-hotplug testing is disabled"); VERBOSE_TOROUT_STRING("Only one CPU, so CPU-hotplug testing is disabled");
@ -228,18 +248,18 @@ stop:
/* /*
* Initiate online-offline handling. * Initiate online-offline handling.
*/ */
int torture_onoff_init(long ooholdoff, long oointerval) int torture_onoff_init(long ooholdoff, long oointerval, torture_ofl_func *f)
{ {
int ret = 0;
#ifdef CONFIG_HOTPLUG_CPU #ifdef CONFIG_HOTPLUG_CPU
onoff_holdoff = ooholdoff; onoff_holdoff = ooholdoff;
onoff_interval = oointerval; onoff_interval = oointerval;
onoff_f = f;
if (onoff_interval <= 0) if (onoff_interval <= 0)
return 0; return 0;
ret = torture_create_kthread(torture_onoff, NULL, onoff_task); return torture_create_kthread(torture_onoff, NULL, onoff_task);
#endif /* #ifdef CONFIG_HOTPLUG_CPU */ #else /* #ifdef CONFIG_HOTPLUG_CPU */
return ret; return 0;
#endif /* #else #ifdef CONFIG_HOTPLUG_CPU */
} }
EXPORT_SYMBOL_GPL(torture_onoff_init); EXPORT_SYMBOL_GPL(torture_onoff_init);
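A sketch of how a torture module adapts to the extra torture_ofl_func parameter introduced above. The hook and the holdoff/interval values are illustrative (units assumed to be jiffies for the interval); passing NULL keeps the old behaviour.

static void my_ofl_hook(void)
{
        pr_info("my_torture: a CPU was taken offline\n");
}

static int __init my_torture_onoff_setup(void)
{
        /* 30-second holdoff, then an offline/online cycle roughly
         * every 200 jiffies, calling the hook after each offline. */
        return torture_onoff_init(30 * HZ, 200, my_ofl_hook);
}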
@ -500,7 +520,7 @@ static int torture_shutdown(void *arg)
torture_shutdown_hook(); torture_shutdown_hook();
else else
VERBOSE_TOROUT_STRING("No torture_shutdown_hook(), skipping."); VERBOSE_TOROUT_STRING("No torture_shutdown_hook(), skipping.");
ftrace_dump(DUMP_ALL); rcu_ftrace_dump(DUMP_ALL);
kernel_power_off(); /* Shut down the system. */ kernel_power_off(); /* Shut down the system. */
return 0; return 0;
} }
@ -510,15 +530,13 @@ static int torture_shutdown(void *arg)
*/ */
int torture_shutdown_init(int ssecs, void (*cleanup)(void)) int torture_shutdown_init(int ssecs, void (*cleanup)(void))
{ {
int ret = 0;
torture_shutdown_hook = cleanup; torture_shutdown_hook = cleanup;
if (ssecs > 0) { if (ssecs > 0) {
shutdown_time = ktime_add(ktime_get(), ktime_set(ssecs, 0)); shutdown_time = ktime_add(ktime_get(), ktime_set(ssecs, 0));
ret = torture_create_kthread(torture_shutdown, NULL, return torture_create_kthread(torture_shutdown, NULL,
shutdown_task); shutdown_task);
} }
return ret; return 0;
} }
EXPORT_SYMBOL_GPL(torture_shutdown_init); EXPORT_SYMBOL_GPL(torture_shutdown_init);
@ -565,26 +583,32 @@ static void torture_shutdown_cleanup(void)
static struct task_struct *stutter_task; static struct task_struct *stutter_task;
static int stutter_pause_test; static int stutter_pause_test;
static int stutter; static int stutter;
static int stutter_gap;
/* /*
* Block until the stutter interval ends. This must be called periodically * Block until the stutter interval ends. This must be called periodically
* by all running kthreads that need to be subject to stuttering. * by all running kthreads that need to be subject to stuttering.
*/ */
void stutter_wait(const char *title) bool stutter_wait(const char *title)
{ {
cond_resched_rcu_qs(); int spt;
while (READ_ONCE(stutter_pause_test) || bool ret = false;
(torture_runnable && !READ_ONCE(*torture_runnable))) {
if (stutter_pause_test) cond_resched_tasks_rcu_qs();
if (READ_ONCE(stutter_pause_test) == 1) spt = READ_ONCE(stutter_pause_test);
schedule_timeout_interruptible(1); for (; spt; spt = READ_ONCE(stutter_pause_test)) {
else ret = true;
while (READ_ONCE(stutter_pause_test)) if (spt == 1) {
cond_resched(); schedule_timeout_interruptible(1);
else } else if (spt == 2) {
while (READ_ONCE(stutter_pause_test))
cond_resched();
} else {
schedule_timeout_interruptible(round_jiffies_relative(HZ)); schedule_timeout_interruptible(round_jiffies_relative(HZ));
}
torture_shutdown_absorb(title); torture_shutdown_absorb(title);
} }
return ret;
} }
EXPORT_SYMBOL_GPL(stutter_wait); EXPORT_SYMBOL_GPL(stutter_wait);
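stutter_wait() now returns true when it actually paused, which lets callers reset per-interval state. A hedged sketch of a torture kthread using it; do_one_op() and reset_interval_stats() are hypothetical, the torture_* helpers are real.

static int my_torture_reader(void *arg)
{
        do {
                do_one_op();
                if (stutter_wait("my_torture_reader"))
                        reset_interval_stats();         /* we really paused */
        } while (!torture_must_stop());
        torture_kthread_stopping("my_torture_reader");
        return 0;
}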
@ -594,19 +618,24 @@ EXPORT_SYMBOL_GPL(stutter_wait);
*/ */
static int torture_stutter(void *arg) static int torture_stutter(void *arg)
{ {
int wtime;
VERBOSE_TOROUT_STRING("torture_stutter task started"); VERBOSE_TOROUT_STRING("torture_stutter task started");
do { do {
if (!torture_must_stop()) { if (!torture_must_stop() && stutter > 1) {
if (stutter > 1) { wtime = stutter;
schedule_timeout_interruptible(stutter - 1); if (stutter > HZ + 1) {
WRITE_ONCE(stutter_pause_test, 2); WRITE_ONCE(stutter_pause_test, 1);
wtime = stutter - HZ - 1;
schedule_timeout_interruptible(wtime);
wtime = HZ + 1;
} }
schedule_timeout_interruptible(1); WRITE_ONCE(stutter_pause_test, 2);
WRITE_ONCE(stutter_pause_test, 1); schedule_timeout_interruptible(wtime);
} }
if (!torture_must_stop())
schedule_timeout_interruptible(stutter);
WRITE_ONCE(stutter_pause_test, 0); WRITE_ONCE(stutter_pause_test, 0);
if (!torture_must_stop())
schedule_timeout_interruptible(stutter_gap);
torture_shutdown_absorb("torture_stutter"); torture_shutdown_absorb("torture_stutter");
} while (!torture_must_stop()); } while (!torture_must_stop());
torture_kthread_stopping("torture_stutter"); torture_kthread_stopping("torture_stutter");
@ -616,13 +645,11 @@ static int torture_stutter(void *arg)
/* /*
* Initialize and kick off the torture_stutter kthread. * Initialize and kick off the torture_stutter kthread.
*/ */
int torture_stutter_init(int s) int torture_stutter_init(const int s, const int sgap)
{ {
int ret;
stutter = s; stutter = s;
ret = torture_create_kthread(torture_stutter, NULL, stutter_task); stutter_gap = sgap;
return ret; return torture_create_kthread(torture_stutter, NULL, stutter_task);
} }
EXPORT_SYMBOL_GPL(torture_stutter_init); EXPORT_SYMBOL_GPL(torture_stutter_init);
@ -647,7 +674,7 @@ static void torture_stutter_cleanup(void)
* The runnable parameter points to a flag that controls whether or not * The runnable parameter points to a flag that controls whether or not
* the test is currently runnable. If there is no such flag, pass in NULL. * the test is currently runnable. If there is no such flag, pass in NULL.
*/ */
bool torture_init_begin(char *ttype, bool v, int *runnable) bool torture_init_begin(char *ttype, int v)
{ {
mutex_lock(&fullstop_mutex); mutex_lock(&fullstop_mutex);
if (torture_type != NULL) { if (torture_type != NULL) {
@ -659,7 +686,6 @@ bool torture_init_begin(char *ttype, bool v, int *runnable)
} }
torture_type = ttype; torture_type = ttype;
verbose = v; verbose = v;
torture_runnable = runnable;
fullstop = FULLSTOP_DONTSTOP; fullstop = FULLSTOP_DONTSTOP;
return true; return true;
} }


@ -233,7 +233,7 @@ static void ftrace_sync(struct work_struct *work)
{ {
/* /*
* This function is just a stub to implement a hard force * This function is just a stub to implement a hard force
* of synchronize_sched(). This requires synchronizing * of synchronize_rcu(). This requires synchronizing
* tasks even in userspace and idle. * tasks even in userspace and idle.
* *
* Yes, function tracing is rude. * Yes, function tracing is rude.
@ -1004,7 +1004,7 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
ftrace_profile_enabled = 0; ftrace_profile_enabled = 0;
/* /*
* unregister_ftrace_profiler calls stop_machine * unregister_ftrace_profiler calls stop_machine
* so this acts like a synchronize_sched. * so this acts like a synchronize_rcu.
*/ */
unregister_ftrace_profiler(); unregister_ftrace_profiler();
} }
@ -1162,7 +1162,7 @@ bool is_ftrace_trampoline(unsigned long addr)
/* /*
* Some of the ops may be dynamically allocated, * Some of the ops may be dynamically allocated,
* they are freed after a synchronize_sched(). * they are freed after a synchronize_rcu().
*/ */
preempt_disable_notrace(); preempt_disable_notrace();
@ -1353,7 +1353,7 @@ static void free_ftrace_hash_rcu(struct ftrace_hash *hash)
{ {
if (!hash || hash == EMPTY_HASH) if (!hash || hash == EMPTY_HASH)
return; return;
call_rcu_sched(&hash->rcu, __free_ftrace_hash_rcu); call_rcu(&hash->rcu, __free_ftrace_hash_rcu);
} }
void ftrace_free_filter(struct ftrace_ops *ops) void ftrace_free_filter(struct ftrace_ops *ops)
@ -1568,7 +1568,7 @@ static bool hash_contains_ip(unsigned long ip,
* the ip is not in the ops->notrace_hash. * the ip is not in the ops->notrace_hash.
* *
* This needs to be called with preemption disabled as * This needs to be called with preemption disabled as
* the hashes are freed with call_rcu_sched(). * the hashes are freed with call_rcu().
*/ */
static int static int
ftrace_ops_test(struct ftrace_ops *ops, unsigned long ip, void *regs) ftrace_ops_test(struct ftrace_ops *ops, unsigned long ip, void *regs)
@ -4625,7 +4625,7 @@ unregister_ftrace_function_probe_func(char *glob, struct trace_array *tr,
if (ftrace_enabled && !ftrace_hash_empty(hash)) if (ftrace_enabled && !ftrace_hash_empty(hash))
ftrace_run_modify_code(&probe->ops, FTRACE_UPDATE_CALLS, ftrace_run_modify_code(&probe->ops, FTRACE_UPDATE_CALLS,
&old_hash_ops); &old_hash_ops);
synchronize_sched(); synchronize_rcu();
hlist_for_each_entry_safe(entry, tmp, &hhd, hlist) { hlist_for_each_entry_safe(entry, tmp, &hhd, hlist) {
hlist_del(&entry->hlist); hlist_del(&entry->hlist);
@ -6120,7 +6120,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
/* /*
* Some of the ops may be dynamically allocated, * Some of the ops may be dynamically allocated,
* they must be freed after a synchronize_sched(). * they must be freed after a synchronize_rcu().
*/ */
preempt_disable_notrace(); preempt_disable_notrace();
@ -6299,7 +6299,7 @@ static void clear_ftrace_pids(struct trace_array *tr)
rcu_assign_pointer(tr->function_pids, NULL); rcu_assign_pointer(tr->function_pids, NULL);
/* Wait till all users are no longer using pid filtering */ /* Wait till all users are no longer using pid filtering */
synchronize_sched(); synchronize_rcu();
trace_free_pid_list(pid_list); trace_free_pid_list(pid_list);
} }
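The hunk above is an instance of the standard unpublish/wait/free pattern, now relying on the consolidated synchronize_rcu(). A generic hedged sketch of that pattern; the list type, update lock, and helper names are hypothetical.

struct my_list { int nr_items; };      /* hypothetical RCU-protected data */

static struct my_list __rcu *active_list;
static DEFINE_MUTEX(update_lock);

static void replace_list(struct my_list *newl)
{
        struct my_list *old;

        mutex_lock(&update_lock);
        old = rcu_dereference_protected(active_list,
                                        lockdep_is_held(&update_lock));
        rcu_assign_pointer(active_list, newl);
        mutex_unlock(&update_lock);

        /* Readers that could still see "old" -- including ones that only
         * disabled preemption or interrupts -- are done after one grace
         * period, so the old list can be freed. */
        synchronize_rcu();
        kfree(old);
}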
@ -6447,7 +6447,7 @@ ftrace_pid_write(struct file *filp, const char __user *ubuf,
rcu_assign_pointer(tr->function_pids, pid_list); rcu_assign_pointer(tr->function_pids, pid_list);
if (filtered_pids) { if (filtered_pids) {
synchronize_sched(); synchronize_rcu();
trace_free_pid_list(filtered_pids); trace_free_pid_list(filtered_pids);
} else if (pid_list) { } else if (pid_list) {
/* Register a probe to set whether to ignore the tracing of a task */ /* Register a probe to set whether to ignore the tracing of a task */


@ -1769,7 +1769,7 @@ int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size,
* There could have been a race between checking * There could have been a race between checking
* record_disable and incrementing it. * record_disable and incrementing it.
*/ */
synchronize_sched(); synchronize_rcu();
for_each_buffer_cpu(buffer, cpu) { for_each_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu]; cpu_buffer = buffer->buffers[cpu];
rb_check_pages(cpu_buffer); rb_check_pages(cpu_buffer);
@ -3087,7 +3087,7 @@ static bool rb_per_cpu_empty(struct ring_buffer_per_cpu *cpu_buffer)
* This prevents all writes to the buffer. Any attempt to write * This prevents all writes to the buffer. Any attempt to write
* to the buffer after this will fail and return NULL. * to the buffer after this will fail and return NULL.
* *
* The caller should call synchronize_sched() after this. * The caller should call synchronize_rcu() after this.
*/ */
void ring_buffer_record_disable(struct ring_buffer *buffer) void ring_buffer_record_disable(struct ring_buffer *buffer)
{ {
@ -3189,7 +3189,7 @@ int ring_buffer_record_is_set_on(struct ring_buffer *buffer)
* This prevents all writes to the buffer. Any attempt to write * This prevents all writes to the buffer. Any attempt to write
* to the buffer after this will fail and return NULL. * to the buffer after this will fail and return NULL.
* *
* The caller should call synchronize_sched() after this. * The caller should call synchronize_rcu() after this.
*/ */
void ring_buffer_record_disable_cpu(struct ring_buffer *buffer, int cpu) void ring_buffer_record_disable_cpu(struct ring_buffer *buffer, int cpu)
{ {
@ -4115,7 +4115,7 @@ EXPORT_SYMBOL_GPL(ring_buffer_read_prepare);
void void
ring_buffer_read_prepare_sync(void) ring_buffer_read_prepare_sync(void)
{ {
synchronize_sched(); synchronize_rcu();
} }
EXPORT_SYMBOL_GPL(ring_buffer_read_prepare_sync); EXPORT_SYMBOL_GPL(ring_buffer_read_prepare_sync);
@ -4289,7 +4289,7 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
atomic_inc(&cpu_buffer->record_disabled); atomic_inc(&cpu_buffer->record_disabled);
/* Make sure all commits have finished */ /* Make sure all commits have finished */
synchronize_sched(); synchronize_rcu();
raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags); raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
@ -4424,7 +4424,7 @@ int ring_buffer_swap_cpu(struct ring_buffer *buffer_a,
goto out; goto out;
/* /*
* We can't do a synchronize_sched here because this * We can't do a synchronize_rcu here because this
* function can be called in atomic context. * function can be called in atomic context.
* Normally this will be called from the same CPU as cpu. * Normally this will be called from the same CPU as cpu.
* If not it's up to the caller to protect this. * If not it's up to the caller to protect this.


@ -1685,7 +1685,7 @@ void tracing_reset(struct trace_buffer *buf, int cpu)
ring_buffer_record_disable(buffer); ring_buffer_record_disable(buffer);
/* Make sure all commits have finished */ /* Make sure all commits have finished */
synchronize_sched(); synchronize_rcu();
ring_buffer_reset_cpu(buffer, cpu); ring_buffer_reset_cpu(buffer, cpu);
ring_buffer_record_enable(buffer); ring_buffer_record_enable(buffer);
@ -1702,7 +1702,7 @@ void tracing_reset_online_cpus(struct trace_buffer *buf)
ring_buffer_record_disable(buffer); ring_buffer_record_disable(buffer);
/* Make sure all commits have finished */ /* Make sure all commits have finished */
synchronize_sched(); synchronize_rcu();
buf->time_start = buffer_ftrace_now(buf, buf->cpu); buf->time_start = buffer_ftrace_now(buf, buf->cpu);
@ -2261,7 +2261,7 @@ void trace_buffered_event_disable(void)
preempt_enable(); preempt_enable();
/* Wait for all current users to finish */ /* Wait for all current users to finish */
synchronize_sched(); synchronize_rcu();
for_each_tracing_cpu(cpu) { for_each_tracing_cpu(cpu) {
free_page((unsigned long)per_cpu(trace_buffered_event, cpu)); free_page((unsigned long)per_cpu(trace_buffered_event, cpu));
@ -2708,17 +2708,6 @@ void __trace_stack(struct trace_array *tr, unsigned long flags, int skip,
if (unlikely(in_nmi())) if (unlikely(in_nmi()))
return; return;
/*
* It is possible that a function is being traced in a
* location that RCU is not watching. A call to
* rcu_irq_enter() will make sure that it is, but there's
* a few internal rcu functions that could be traced
* where that wont work either. In those cases, we just
* do nothing.
*/
if (unlikely(rcu_irq_enter_disabled()))
return;
rcu_irq_enter_irqson(); rcu_irq_enter_irqson();
__ftrace_trace_stack(buffer, flags, skip, pc, NULL); __ftrace_trace_stack(buffer, flags, skip, pc, NULL);
rcu_irq_exit_irqson(); rcu_irq_exit_irqson();
@ -5431,7 +5420,7 @@ static int tracing_set_tracer(struct trace_array *tr, const char *buf)
if (tr->current_trace->reset) if (tr->current_trace->reset)
tr->current_trace->reset(tr); tr->current_trace->reset(tr);
/* Current trace needs to be nop_trace before synchronize_sched */ /* Current trace needs to be nop_trace before synchronize_rcu */
tr->current_trace = &nop_trace; tr->current_trace = &nop_trace;
#ifdef CONFIG_TRACER_MAX_TRACE #ifdef CONFIG_TRACER_MAX_TRACE
@ -5445,7 +5434,7 @@ static int tracing_set_tracer(struct trace_array *tr, const char *buf)
* The update_max_tr is called from interrupts disabled * The update_max_tr is called from interrupts disabled
* so a synchronized_sched() is sufficient. * so a synchronized_sched() is sufficient.
*/ */
synchronize_sched(); synchronize_rcu();
free_snapshot(tr); free_snapshot(tr);
} }
#endif #endif


@ -159,13 +159,13 @@ static int benchmark_event_kthread(void *arg)
* wants to run, schedule in, but if the CPU is idle, * wants to run, schedule in, but if the CPU is idle,
* we'll keep burning cycles. * we'll keep burning cycles.
* *
* Note the _rcu_qs() version of cond_resched() will * Note the tasks_rcu_qs() version of cond_resched() will
* notify synchronize_rcu_tasks() that this thread has * notify synchronize_rcu_tasks() that this thread has
* passed a quiescent state for rcu_tasks. Otherwise * passed a quiescent state for rcu_tasks. Otherwise
* this thread will never voluntarily schedule which would * this thread will never voluntarily schedule which would
* block synchronize_rcu_tasks() indefinitely. * block synchronize_rcu_tasks() indefinitely.
*/ */
cond_resched_rcu_qs(); cond_resched_tasks_rcu_qs();
} }
return 0; return 0;


@ -1858,7 +1858,7 @@ static int replace_system_preds(struct trace_subsystem_dir *dir,
/* /*
* The calls can still be using the old filters. * The calls can still be using the old filters.
* Do a synchronize_sched() to ensure all calls are * Do a synchronize_rcu() to ensure all calls are
* done with them before we free them. * done with them before we free them.
*/ */
synchronize_sched(); synchronize_sched();
@ -2096,7 +2096,7 @@ int apply_subsystem_event_filter(struct trace_subsystem_dir *dir,
if (filter) { if (filter) {
/* /*
* No event actually uses the system filter * No event actually uses the system filter
* we can free it without synchronize_sched(). * we can free it without synchronize_rcu().
*/ */
__free_filter(system->filter); __free_filter(system->filter);
system->filter = filter; system->filter = filter;


@ -459,7 +459,7 @@ disable_trace_kprobe(struct trace_kprobe *tk, struct trace_event_file *file)
* event_call related objects, which will be accessed in * event_call related objects, which will be accessed in
* the kprobe_trace_func/kretprobe_trace_func. * the kprobe_trace_func/kretprobe_trace_func.
*/ */
synchronize_sched(); synchronize_rcu();
kfree(link); /* Ignored if link == NULL */ kfree(link); /* Ignored if link == NULL */
} }


@ -83,7 +83,7 @@ static inline void release_probes(struct tracepoint_func *old)
if (old) { if (old) {
struct tp_probes *tp_probes = container_of(old, struct tp_probes *tp_probes = container_of(old,
struct tp_probes, probes[0]); struct tp_probes, probes[0]);
call_rcu_sched(&tp_probes->rcu, rcu_free_old_probes); call_rcu(&tp_probes->rcu, rcu_free_old_probes);
} }
} }


@ -2170,7 +2170,7 @@ __acquires(&pool->lock)
* stop_machine. At the same time, report a quiescent RCU state so * stop_machine. At the same time, report a quiescent RCU state so
* the same condition doesn't freeze RCU. * the same condition doesn't freeze RCU.
*/ */
cond_resched_rcu_qs(); cond_resched_tasks_rcu_qs();
spin_lock_irq(&pool->lock); spin_lock_irq(&pool->lock);
@ -3381,7 +3381,7 @@ static void put_unbound_pool(struct worker_pool *pool)
del_timer_sync(&pool->mayday_timer); del_timer_sync(&pool->mayday_timer);
/* sched-RCU protected to allow dereferences from get_work_pool() */ /* sched-RCU protected to allow dereferences from get_work_pool() */
call_rcu_sched(&pool->rcu, rcu_free_pool); call_rcu(&pool->rcu, rcu_free_pool);
} }
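Why plain call_rcu() now suffices in the hunk above: after flavor consolidation, the preempt-disabled ("sched-RCU") readers the comment refers to are ordinary RCU readers, so one grace period covers them. A hedged reader-side sketch as it might look inside workqueue.c; the debug helper is invented, get_work_pool() is the accessor the comment mentions.

static void peek_work_pool(struct work_struct *work)
{
        struct worker_pool *pool;

        rcu_read_lock();        /* a preempt-disabled region would also do now */
        pool = get_work_pool(work);
        if (pool)
                pr_debug("work %p runs in pool %d\n", work, pool->id);
        rcu_read_unlock();
}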
/** /**
@ -3494,14 +3494,14 @@ static void pwq_unbound_release_workfn(struct work_struct *work)
put_unbound_pool(pool); put_unbound_pool(pool);
mutex_unlock(&wq_pool_mutex); mutex_unlock(&wq_pool_mutex);
call_rcu_sched(&pwq->rcu, rcu_free_pwq); call_rcu(&pwq->rcu, rcu_free_pwq);
/* /*
* If we're the last pwq going away, @wq is already dead and no one * If we're the last pwq going away, @wq is already dead and no one
* is gonna access it anymore. Schedule RCU free. * is gonna access it anymore. Schedule RCU free.
*/ */
if (is_last) if (is_last)
call_rcu_sched(&wq->rcu, rcu_free_wq); call_rcu(&wq->rcu, rcu_free_wq);
} }
/** /**
@ -4208,7 +4208,7 @@ void destroy_workqueue(struct workqueue_struct *wq)
* The base ref is never dropped on per-cpu pwqs. Directly * The base ref is never dropped on per-cpu pwqs. Directly
* schedule RCU free. * schedule RCU free.
*/ */
call_rcu_sched(&wq->rcu, rcu_free_wq); call_rcu(&wq->rcu, rcu_free_wq);
} else { } else {
/* /*
* We're the sole accessor of @wq at this point. Directly * We're the sole accessor of @wq at this point. Directly


@ -1328,7 +1328,6 @@ config DEBUG_LOCKING_API_SELFTESTS
config LOCK_TORTURE_TEST config LOCK_TORTURE_TEST
tristate "torture tests for locking" tristate "torture tests for locking"
depends on DEBUG_KERNEL
select TORTURE_TEST select TORTURE_TEST
default n default n
help help


@ -181,7 +181,7 @@ static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
ref->confirm_switch = confirm_switch ?: percpu_ref_noop_confirm_switch; ref->confirm_switch = confirm_switch ?: percpu_ref_noop_confirm_switch;
percpu_ref_get(ref); /* put after confirmation */ percpu_ref_get(ref); /* put after confirmation */
call_rcu_sched(&ref->rcu, percpu_ref_switch_to_atomic_rcu); call_rcu(&ref->rcu, percpu_ref_switch_to_atomic_rcu);
} }
static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref) static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)


@ -788,7 +788,7 @@ static int apply_mlockall_flags(int flags)
/* Ignore errors */ /* Ignore errors */
mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags); mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags);
cond_resched_rcu_qs(); cond_resched_tasks_rcu_qs();
} }
out: out:
return 0; return 0;


@ -952,10 +952,10 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep,
* To protect lockless access to n->shared during irq disabled context. * To protect lockless access to n->shared during irq disabled context.
* If n->shared isn't NULL in irq disabled context, accessing to it is * If n->shared isn't NULL in irq disabled context, accessing to it is
* guaranteed to be valid until irq is re-enabled, because it will be * guaranteed to be valid until irq is re-enabled, because it will be
* freed after synchronize_sched(). * freed after synchronize_rcu().
*/ */
if (old_shared && force_change) if (old_shared && force_change)
synchronize_sched(); synchronize_rcu();
fail: fail:
kfree(old_shared); kfree(old_shared);


@ -699,7 +699,7 @@ void slab_deactivate_memcg_cache_rcu_sched(struct kmem_cache *s,
css_get(&s->memcg_params.memcg->css); css_get(&s->memcg_params.memcg->css);
s->memcg_params.deact_fn = deact_fn; s->memcg_params.deact_fn = deact_fn;
call_rcu_sched(&s->memcg_params.deact_rcu_head, kmemcg_deactivate_rcufn); call_rcu(&s->memcg_params.deact_rcu_head, kmemcg_deactivate_rcufn);
} }
void memcg_deactivate_kmem_caches(struct mem_cgroup *memcg) void memcg_deactivate_kmem_caches(struct mem_cgroup *memcg)


@ -1162,7 +1162,7 @@ static void gc_worker(struct work_struct *work)
* we will just continue with next hash slot. * we will just continue with next hash slot.
*/ */
rcu_read_unlock(); rcu_read_unlock();
cond_resched_rcu_qs(); cond_resched_tasks_rcu_qs();
} while (++buckets < goal); } while (++buckets < goal);
if (gc_work->exiting) if (gc_work->exiting)


@ -512,7 +512,7 @@ void qdisc_put_stab(struct qdisc_size_table *tab)
if (--tab->refcnt == 0) { if (--tab->refcnt == 0) {
list_del(&tab->list); list_del(&tab->list);
call_rcu_bh(&tab->rcu, stab_kfree_rcu); call_rcu(&tab->rcu, stab_kfree_rcu);
} }
} }
EXPORT_SYMBOL(qdisc_put_stab); EXPORT_SYMBOL(qdisc_put_stab);


@ -115,6 +115,6 @@ int scnprintf(char * buf, size_t size, const char * fmt, ...);
#define round_down(x, y) ((x) & ~__round_mask(x, y)) #define round_down(x, y) ((x) & ~__round_mask(x, y))
#define current_gfp_context(k) 0 #define current_gfp_context(k) 0
#define synchronize_sched() #define synchronize_rcu()
#endif #endif


@ -19,7 +19,7 @@ static inline bool rcu_is_watching(void)
return false; return false;
} }
#define rcu_assign_pointer(p, v) ((p) = (v)) #define rcu_assign_pointer(p, v) do { (p) = (v); } while (0)
#define RCU_INIT_POINTER(p, v) p=(v) #define RCU_INIT_POINTER(p, v) do { (p) = (v); } while (0)
#endif #endif


@ -38,6 +38,5 @@ per_version_boot_params () {
echo $1 `locktorture_param_onoff "$1" "$2"` \ echo $1 `locktorture_param_onoff "$1" "$2"` \
locktorture.stat_interval=15 \ locktorture.stat_interval=15 \
locktorture.shutdown_secs=$3 \ locktorture.shutdown_secs=$3 \
locktorture.torture_runnable=1 \
locktorture.verbose=1 locktorture.verbose=1
} }


@ -1,3 +1 @@
rcupdate.rcu_self_test=1 rcupdate.rcu_self_test=1
rcupdate.rcu_self_test_bh=1
rcutorture.torture_type=rcu_bh


@ -1,4 +1,4 @@
rcutorture.torture_type=rcu_bh maxcpus=8 nr_cpus=43 maxcpus=8 nr_cpus=43
rcutree.gp_preinit_delay=3 rcutree.gp_preinit_delay=3
rcutree.gp_init_delay=3 rcutree.gp_init_delay=3
rcutree.gp_cleanup_delay=3 rcutree.gp_cleanup_delay=3


@ -1,5 +1,5 @@
rcutorture.onoff_interval=1 rcutorture.onoff_holdoff=30 rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30
rcutree.gp_preinit_delay=3 rcutree.gp_preinit_delay=12
rcutree.gp_init_delay=3 rcutree.gp_init_delay=3
rcutree.gp_cleanup_delay=3 rcutree.gp_cleanup_delay=3
rcutree.kthread_prio=2 rcutree.kthread_prio=2


@ -1 +1 @@
rcutorture.torture_type=rcu_bh rcutree.rcu_fanout_leaf=4 rcutree.rcu_fanout_leaf=4


@ -1,5 +1,3 @@
rcutorture.torture_type=sched
rcupdate.rcu_self_test_sched=1
rcutree.gp_preinit_delay=3 rcutree.gp_preinit_delay=3
rcutree.gp_init_delay=3 rcutree.gp_init_delay=3
rcutree.gp_cleanup_delay=3 rcutree.gp_cleanup_delay=3


@ -1,6 +1,4 @@
rcupdate.rcu_self_test=1 rcupdate.rcu_self_test=1
rcupdate.rcu_self_test_bh=1
rcupdate.rcu_self_test_sched=1
rcutree.rcu_fanout_exact=1 rcutree.rcu_fanout_exact=1
rcutree.gp_preinit_delay=3 rcutree.gp_preinit_delay=3
rcutree.gp_init_delay=3 rcutree.gp_init_delay=3


@ -1,5 +1,3 @@
rcutorture.torture_type=sched
rcupdate.rcu_self_test=1 rcupdate.rcu_self_test=1
rcupdate.rcu_self_test_sched=1
rcutree.rcu_fanout_exact=1 rcutree.rcu_fanout_exact=1
rcu_nocbs=0-7 rcu_nocbs=0-7


@ -0,0 +1,14 @@
CONFIG_SMP=y
CONFIG_NR_CPUS=8
CONFIG_PREEMPT_NONE=y
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=n
CONFIG_HZ_PERIODIC=n
CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ_FULL=n
CONFIG_HOTPLUG_CPU=n
CONFIG_SUSPEND=n
CONFIG_HIBERNATION=n
CONFIG_DEBUG_LOCK_ALLOC=n
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
CONFIG_RCU_EXPERT=y


@ -0,0 +1,3 @@
rcutorture.torture_type=trivial
rcutorture.onoff_interval=0
rcutorture.shuffle_interval=0


@ -39,7 +39,7 @@ rcutorture_param_onoff () {
if ! bootparam_hotplug_cpu "$1" && configfrag_hotplug_cpu "$2" if ! bootparam_hotplug_cpu "$1" && configfrag_hotplug_cpu "$2"
then then
echo CPU-hotplug kernel, adding rcutorture onoff. 1>&2 echo CPU-hotplug kernel, adding rcutorture onoff. 1>&2
echo rcutorture.onoff_interval=3 rcutorture.onoff_holdoff=30 echo rcutorture.onoff_interval=1000 rcutorture.onoff_holdoff=30
fi fi
} }
@ -51,7 +51,6 @@ per_version_boot_params () {
`rcutorture_param_n_barrier_cbs "$1"` \ `rcutorture_param_n_barrier_cbs "$1"` \
rcutorture.stat_interval=15 \ rcutorture.stat_interval=15 \
rcutorture.shutdown_secs=$3 \ rcutorture.shutdown_secs=$3 \
rcutorture.torture_runnable=1 \
rcutorture.test_no_idle_hz=1 \ rcutorture.test_no_idle_hz=1 \
rcutorture.verbose=1 rcutorture.verbose=1
} }


@ -20,33 +20,10 @@
# #
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com> # Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
# rcuperf_param_nreaders bootparam-string
#
# Adds nreaders rcuperf module parameter if not already specified.
rcuperf_param_nreaders () {
if ! echo "$1" | grep -q "rcuperf.nreaders"
then
echo rcuperf.nreaders=-1
fi
}
# rcuperf_param_nwriters bootparam-string
#
# Adds nwriters rcuperf module parameter if not already specified.
rcuperf_param_nwriters () {
if ! echo "$1" | grep -q "rcuperf.nwriters"
then
echo rcuperf.nwriters=-1
fi
}
# per_version_boot_params bootparam-string config-file seconds # per_version_boot_params bootparam-string config-file seconds
# #
# Adds per-version torture-module parameters to kernels supporting them. # Adds per-version torture-module parameters to kernels supporting them.
per_version_boot_params () { per_version_boot_params () {
echo $1 `rcuperf_param_nreaders "$1"` \ echo $1 rcuperf.shutdown=1 \
`rcuperf_param_nwriters "$1"` \
rcuperf.perf_runnable=1 \
rcuperf.shutdown=1 \
rcuperf.verbose=1 rcuperf.verbose=1
} }
