1 .. SPDX-License-Identifier: GPL-2.0
9 options that can be used to fine-tune the detector's operation. Finally,
20 - A CPU looping in an RCU read-side critical section.
22 - A CPU looping with interrupts disabled.
24 - A CPU looping with preemption disabled.
26 - A CPU looping with bottom halves disabled.
28 - For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the
33 - Booting Linux using a console connection that is too slow to
34 keep up with the boot-time console-message rate. For example,
36 with boot-time message rates, and will frequently result in
40 - Anything that prevents RCU's grace-period kthreads from running.
41 This can result in the "All QSes seen" console-log message.
44 result in the ``rcu_.*kthread starved for`` console-log message,
47 - A CPU-bound real-time task in a CONFIG_PREEMPTION kernel, which might
48 happen to preempt a low-priority task in the middle of an RCU
49 read-side critical section. This is especially damaging if
50 that low-priority task is not permitted to run on any other CPU,
54 memory, you might see stall-warning messages.
56 - A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
62 CONFIG_PREEMPT_RCU case, you might see stall-warning
68 can increase your system's context-switch rate and thus degrade
71 - A periodic interrupt whose handler takes longer than the time
74 Note that certain high-overhead debugging options, for example
79 - Testing a workload on a fast system, tuning the stall-warning
81 running the same workload with the same stall-warning timeout on a
82 slow system. Note that thermal throttling and on-demand governors
85 - A hardware or software issue shuts off the scheduler-clock
86 interrupt on a CPU that is not in dyntick-idle mode. This
90 - A hardware or software issue that prevents time-based wakeups
96 the ``rcu_.*timer wakeup didn't happen for`` console-log message,
99 - A low-level kernel issue that either fails to invoke one of the
107 of issues, which sometimes arise in architecture-specific code.
109 - A bug in the RCU implementation.
111 - A hardware failure. This is quite unlikely, but is not at all
118 The RCU, RCU-sched, RCU-tasks, and RCU-tasks-trace implementations have
136 Fine-Tuning the RCU CPU Stall Detector
142 but may be overridden via boot-time parameter or at runtime via sysfs.
147 ----------------------------
157 So if you are 10 seconds into a 40-second stall, setting this
163 Stall-warning messages may be enabled and disabled completely via
167 --------------------------------
181 the timeout for the -next- stall.
183 Stall-warning messages may be enabled and disabled completely via
187 ---------------------
196 -------------------
199 own warnings, as this often gives better-quality stack traces.
207 -------------------------------
209 This boot/sysfs parameter controls the RCU-tasks and
210 RCU-tasks-trace stall warning intervals. A value of zero or less
211 suppresses RCU-tasks stall warnings. A positive value sets the
212 stall-warning interval in seconds. An RCU-tasks stall warning
218 task stalling the current RCU-tasks grace period.
220 An RCU-tasks-trace stall warning starts (and continues) similarly:
225 Interpreting RCU's CPU Stall-Detector "Splats"
228 For non-RCU-tasks flavors of RCU, when a CPU detects that some other
232 2-...: (3 GPs behind) idle=06c/0/0 softirq=1453/1455 fqs=0
233 16-...: (0 ticks this GP) idle=81c/0/0 softirq=764/764 fqs=0
237 causing stalls, and that the stall was affecting RCU-sched. This message
244 in a self-detected stall.
247 the RCU core for the past three grace periods. In contrast, CPU 16's "(0
248 ticks this GP)" indicates that this CPU has not taken any scheduling-clock
251 The "idle=" portion of the message prints the dyntick-idle state.
252 The hex number before the first "/" is the low-order 12 bits of the
253 dynticks counter, which will have an even-numbered value if the CPU
254 is in dyntick-idle mode and an odd-numbered value otherwise. The hex
256 a small non-negative number if in the idle loop (as shown above) and a
258 "/" is the NMI nesting, which will be a small non-negative number.
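As a rough illustration of the "idle=" field described above, the following
Python sketch splits the field into its three slash-separated parts and
applies the even/odd rule to the first one. The helper name ``decode_idle``
and the dictionary keys are hypothetical, chosen only for this example, and
the second and third parts are kept as raw strings rather than interpreted.

```python
import re


def decode_idle(field):
    """Split an 'idle=a/b/c' field taken from a stall-warning line.

    Hypothetical helper for illustration; not part of any kernel tooling.
    """
    m = re.fullmatch(r"idle=([0-9a-fA-F]+)/([^/]+)/([^/]+)", field)
    if m is None:
        raise ValueError(f"unrecognized idle field: {field!r}")
    # The first part is hex: the low-order 12 bits of the dynticks counter.
    dynticks_low = int(m.group(1), 16)
    return {
        "dynticks_low": dynticks_low,
        # Per the text above: even means dyntick-idle, odd means not idle.
        "in_dyntick_idle": dynticks_low % 2 == 0,
        # Left uninterpreted: nesting and NMI nesting, as printed.
        "nesting": m.group(2),
        "nmi_nesting": m.group(3),
    }


if __name__ == "__main__":
    for sample in ("idle=06c/0/0", "idle=81c/0/0"):
        print(sample, decode_idle(sample))
```

Both sample fields come from the per-CPU lines shown earlier, so the sketch
can be checked against the prose by eye.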
265 example, if the CPU might have been in dyntick-idle mode for an extended
268 across repeated stall-warning messages, it is possible that RCU's softirq
270 the stalled CPU is spinning with interrupts disabled, or, in -rt
271 kernels, if a high-priority process is starving RCU's softirq handler.
273 The "fqs=" shows the number of force-quiescent-state idle/offline
274 detection passes that the grace-period kthread has made across this
280 period (in this case 2603), the grace-period sequence number (7075), and
285 there will be a spurious stall-warning message, which will include
288 INFO: Stall ended before state dump start
291 possible for a zero-jiffy stall to be flagged in this case, depending
292 on how the stall warning and the grace-period initialization happen to
298 grace period has nevertheless failed to end, the stall-warning splat
301 …n, last rcu_preempt kthread activity 23807 (4297905177-4297881370), jiffies_till_next_fqs=3, root …
304 since the grace-period kthread ran. The "jiffies_till_next_fqs"
306 of jiffies between force-quiescent-state scans, in this case three,
308 ->qsmask field is printed, which will normally be zero.
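The relationship between the numbers in the "kthread activity" portion of
the message can be checked with a short Python sketch. It assumes, and this
is an assumption rather than something stated above, that the parenthesized
pair is the current jiffies value followed by the jiffies value at the
kthread's last activity, so that their difference is the reported count.
The helper name ``kthread_activity`` is hypothetical.

```python
import re


def kthread_activity(message):
    """Return (reported, computed) jiffies since the kthread last ran.

    Assumes the parenthesized pair is "now-last" in jiffies.
    Returns None if the message has no "kthread activity" portion.
    """
    m = re.search(r"kthread activity (\d+) \((\d+)-(\d+)\)", message)
    if m is None:
        return None
    reported, now, last = (int(g) for g in m.groups())
    return reported, now - last


if __name__ == "__main__":
    line = ("last rcu_preempt kthread activity 23807 "
            "(4297905177-4297881370), jiffies_till_next_fqs=3")
    reported, computed = kthread_activity(line)
    print("reported:", reported, "computed:", computed)
```

If the two values disagree for a message from a real log, the assumption
about the meaning of the parenthesized pair does not hold for that kernel.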
310 If the relevant grace-period kthread has been unable to run prior to
314 rcu_sched kthread starved for 23807 jiffies! g7075 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 ->cpu=5
317 Starving the grace-period kthreads of CPU time can of course result
320 grace-period sequence number, the "f" precedes the ->gp_flags command
321 to the grace-period kthread, the "RCU_GP_WAIT_FQS" indicates that the
322 kthread is waiting for a short timeout, the "state" precedes the value of the
323 task_struct ->state field, and the "cpu" indicates that the grace-period
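When scanning console logs for starved grace-period kthreads, the fields
described above can be pulled out with a regular expression. The following
Python sketch is written against the single sample line shown above; the
pattern, the helper name ``parse_starved``, and the dictionary keys are
assumptions made for this example, and other kernel versions may print
additional or differently formatted fields.

```python
import re

# Pattern derived only from the sample "kthread starved" line above.
STARVED_RE = re.compile(
    r"(?P<name>\S+) kthread starved for (?P<jiffies>\d+) jiffies! "
    r"g(?P<gp_seq>\d+) f(?P<gp_flags>0x[0-9a-fA-F]+) "
    r"(?P<gp_state>\w+)\((?P<gp_state_num>\d+)\) "
    r"->state=(?P<task_state>0x[0-9a-fA-F]+) ->cpu=(?P<cpu>\d+)"
)


def parse_starved(line):
    """Parse a 'kthread starved' message into a dict, or return None."""
    m = STARVED_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "name": d["name"],
        "jiffies": int(d["jiffies"]),
        "gp_seq": int(d["gp_seq"]),            # number following "g"
        "gp_flags": int(d["gp_flags"], 16),    # ->gp_flags, following "f"
        "gp_state": d["gp_state"],             # e.g. RCU_GP_WAIT_FQS
        "gp_state_num": int(d["gp_state_num"]),
        "task_state": int(d["task_state"], 16),  # task_struct ->state
        "cpu": int(d["cpu"]),
    }


if __name__ == "__main__":
    sample = ("rcu_sched kthread starved for 23807 jiffies! g7075 f0x0 "
              "RCU_GP_WAIT_FQS(3) ->state=0x1 ->cpu=5")
    print(parse_starved(sample))
```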
326 If the relevant grace-period kthread does not wake from FQS wait in a
329 kthread timer wakeup didn't happen for 23804 jiffies! g7076 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
337 Possible timer handling issue on cpu=4 timer-softirq=11142
339 Here "cpu" indicates that the grace-period kthread last ran on CPU 4,
340 where it queued the fqs timer. The number following the "timer-softirq"
354 If a stall lasts long enough, multiple stall-warning messages will
357 message will be about three times the interval between the beginning
368 INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 7-... } 21119 jiffies s: 73 root: 0x2/.
371 The three periods (".") following the CPU number indicate that the CPU
379 grace-period sequence counter is 73. The fact that this last value is
408 first three values in row "cputime:" indicate the CPU time in
411 in milliseconds. Because user-mode tasks normally do not cause RCU CPU
417 |<------------first timeout---------->|<-----second timeout----->|
418 |<--half timeout-->|<--half timeout-->| |
419 | |<--first period-->| |
420 | |<-----------second sampling period---------->|
422 snapshot time point 1st-stall 2nd-stall
443 This is similar to the previous example, but with a non-zero number of
444 hard interrupts and non-zero CPU time consumed by them, along with non-zero CPU
445 time consumed by in-kernel execution::
476 Here, the number and CPU time of hard interrupts are all non-zero,
477 but the number of context switches and the in-kernel CPU time consumed
479 non-zero, but could be zero, for example, if the CPU was spinning