Lines Matching +full:up +full:- +full:samples

1 .. SPDX-License-Identifier: GPL-2.0
15 2.2.2 Per-thread mode
16 2.2.3 Per-CPU mode
19 2.3.1 Producer-consumer model
21 2.3.3 Writing samples into buffer
22 2.3.4 Reading samples from buffer
55 -------------------
63 +---------------------------+
65 +---------------------------+
66 `-> Tail `-> Head
86 read-only mapping, which is to be addressed in the section
92 +---------+---------+ +---------------------------------------+
94 +---------+---------+ +---------------------------------------+
95 ` `----------------^ ^
96 `----------------------------------------------|
103 with option ``-m`` or ``--mmap-pages=``, the given size will be rounded up
114 -------------------------------------------
135 evsel::cpus::map[] = { 0 .. _SC_NPROCESSORS_ONLN-1 }
144 threads in the system. The perf samples are exclusively collected for
151 +----+ +-----------+ +----+
153 +----+--------------+-----------+----------+----+-------->
156 +-----------------------------------------------------+
158 +-----------------------------------------------------+
161 +-----+
163 -----+-----+--------------------------------------------->
166 +-----------------------------------------------------+
168 +-----------------------------------------------------+
171 +----+ +-------+
173 --------------------------+----+--------+-------+-------->
176 +-----------------------------------------------------+
178 +-----------------------------------------------------+
181 +--------------+
183 -----------+--------------+------------------------------>
186 +-----------------------------------------------------+
188 +-----------------------------------------------------+
195 2.2.2 Per-thread mode
198 By specifying option ``--per-thread`` in perf command, e.g.
202 perf record --per-thread test_program
207 evsel::cpus::map[0] = { -1 }
221 +----+ +-----------+ +----+
223 +----+--------------+-----------+----------+----+-------->
226 | +-----+ |
228 --|--+-----+----------------------------------|---------->
231 | | +----+ +---+ |
233 --|-----|-----------------+----+--------+---+-|---------->
236 | | +--------------+ | |
238 --|-----|--+--------------+-|-----------------|---------->
241 +-----------------------------------------------------+
243 +-----------------------------------------------------+
248 Figure 4. Ring buffer for per-thread mode
250 When perf runs in per-thread mode, a ring buffer is allocated for the
256 2.2.3 Per-CPU mode
259 The option ``-C`` is used to collect samples on the list of CPUs, for
260 example the below perf command receives option ``-C 0,2``::
262 perf record -C 0,2 test_program
268 evsel::threads::map[] = { -1 }
275 options for per-thread mode and per-CPU mode, e.g. the options ``-C 0,2`` and
276 ``--per-thread`` are specified together, the samples are recorded only when
282 +----+ +-----------+ +----+
284 +----+--------------+-----------+----------+----+-------->
287 +-----------------------------------------------------+
289 +-----------------------------------------------------+
292 +-----+
294 -----+-----+--------------------------------------------->
297 +----+ +-------+
299 --------------------------+----+--------+-------+-------->
302 +-----------------------------------------------------+
304 +-----------------------------------------------------+
307 +--------------+
309 -----------+--------------+------------------------------>
314 Figure 5. Ring buffer for per-CPU mode
319 By using option ``-a`` or ``--all-cpus``, perf collects samples on all CPUs
322 perf record -a test_program
324 Similar to the per-CPU mode, the perf event doesn't bind to any PID, and
327 evsel::cpus::map[] = { 0 .. _SC_NPROCESSORS_ONLN-1 }
328 evsel::threads::map[] = { -1 }
332 are monitored during the running state and the samples are recorded into
338 +----+ +-----------+ +----+
340 +----+--------------+-----------+----------+----+-------->
343 +-----------------------------------------------------+
345 +-----------------------------------------------------+
348 +-----+
350 -----+-----+--------------------------------------------->
353 +-----------------------------------------------------+
355 +-----------------------------------------------------+
358 +----+ +-------+
360 --------------------------+----+--------+-------+-------->
363 +-----------------------------------------------------+
365 +-----------------------------------------------------+
368 +--------------+
370 -----------+--------------+------------------------------>
373 +-----------------------------------------------------+
375 +-----------------------------------------------------+
383 --------------------
388 2.3.1 Producer-consumer model
391 In the Linux kernel, the PMU events can produce samples which are stored
393 samples by reading out data from the ring buffer and finally saves the
394 data into the file for post analysis. It’s a typical producer-consumer
402 kernel wakes up the perf process to read samples from the ring buffer.
407 / | Read samples
408 Polling / `--------------| Ring buffer
409 v v ;---------------------v
410 +----------------+ +---------+---------+ +-------------------+
412 +----------------+ +---------+---------+ +-------------------+
413 ^ ^ `------------------------^
414 | Wake up tasks | Store samples
415 +-----------------------------+
417 +-----------------------------+
424 multiple events might share the same ring buffer for recording samples,
426 wakes up tasks waiting on the event. This is fulfilled by the kernel
429 After the perf process is woken up, it starts to check the ring buffers
430 one by one, if it finds any ring buffer containing samples it will read
431 out the samples for statistics or saving into the data file. Given the
441 backward. The forward writing saves samples from the beginning of the ring
445 Additionally, the tool can map buffers in either read-write mode or read-only
448 The ring buffer in the read-write mode is mapped with the property
455 Alternatively, in the read-only mode, only the kernel keeps updating
461 combinations to support buffer types: the non-overwrite buffer and the
464 .. list-table::
466 :header-rows: 1
468 * - Mapping mode
469 - Forward
470 - Backward
471 * - read-write
472 - Non-overwrite ring buffer
473 - Not used
474 * - read-only
475 - Not used
476 - Overwritable ring buffer
478 The non-overwrite ring buffer uses the read-write mapping with forward
480 and wraps around on overflow, which is used with the read-write mode in
481 the normal ring buffer. When the consumer doesn't keep up with the
487 read-only mode. It saves the data from the end of the ring buffer and
495 2.3.3 Writing samples into buffer
514 2.3.4 Reading samples from buffer
542 if (LOAD ->data_tail) { LOAD ->data_head
546 STORE ->data_head STORE ->data_tail
558 writing the pointer ``data_tail``, perf tool first consumes samples and then
568 pointer is fetched before reading samples.
575 Some architectures support one-way permeable barrier with load-acquire
576 and store-release operations, these barriers are more relaxed with less
580 If an architecture doesn’t support load-acquire and store-release in its
593 examine how the AUX ring buffer co-works with the regular ring buffer,
598 ---------------------------------------------------------
602 samples and every event format complies with the definition in the
608 regular profile samples that write to the regular ring buffer cause an
609 interrupt. Tracing execution requires a high number of samples and
619 During the initialisation phase, besides the mmap()-ed regular ring
622 non-zero file offset; ``rb_alloc_aux()`` in the kernel allocates pages
630 perf record -a -e cycles -e cs_etm/@tmc_etr0/ -- sleep 2
639 ring buffer and the AUX ring buffer per CPU-wise, which is the same as
640 the system wide mode, however, the default mode records samples only for
642 in the system. For per-thread mode, the perf tool allocates only one
644 the per-CPU mode, the perf tool allocates two kinds of ring buffers for
645 selected CPUs specified by the option ``-C``.
648 mode; if there are any activities on one CPU, the AUX event samples and
655 +----+ +-----------+ +----+
657 +----+--------------+-----------+----------+----+-------->
660 +-----------------------------------------------------+
662 +-----------------------------------------------------+
665 +-----------------------------------------------------+
667 +-----------------------------------------------------+
670 +-----+
672 -----+-----+--------------------------------------------->
675 +-----------------------------------------------------+
677 +-----------------------------------------------------+
680 +-----------------------------------------------------+
682 +-----------------------------------------------------+
685 +----+ +-------+
687 --------------------------+----+--------+-------+-------->
690 +-----------------------------------------------------+
692 +-----------------------------------------------------+
695 +-----------------------------------------------------+
697 +-----------------------------------------------------+
700 +--------------+
702 -----------+--------------+------------------------------>
705 +-----------------------------------------------------+
707 +-----------------------------------------------------+
710 +-----------------------------------------------------+
712 +-----------------------------------------------------+
720 --------------
738 - It fills an event ``PERF_RECORD_AUX`` into the regular ring buffer, this
742 - Since the hardware trace driver has stored new trace data into the AUX
762 -----------------
769 perf record -e cs_etm/@tmc_etr0/u -S -a program &
772 kill -USR2 $PERFPID
778 - Before a snapshot is taken, the AUX ring buffer acts in free run mode.
782 - Once the perf tool receives the *USR2* signal, it triggers the callback
788 - Then perf tool takes a snapshot, ``record__read_auxtrace_snapshot()``
792 - After the snapshot is finished, ``auxtrace_record::snapshot_finish()``
801 end it fixes up the AUX buffer's head, which is used to calculate the
804 As we know, the buffers' deployment can be per-thread mode, per-CPU
814 +------------------------+
815 | AUX Ring buffer 0 | <- aux_head
816 +------------------------+
818 +--------------------------------+
819 | AUX Ring buffer 1 | <- aux_head
820 +--------------------------------+
822 +--------------------------------------------+
823 | AUX Ring buffer 2 | <- aux_head
824 +--------------------------------------------+
826 +---------------------------------------+
827 | AUX Ring buffer 3 | <- aux_head
828 +---------------------------------------+