perf/Documentation/perf-record.txt

1 perf-record(1)
5 ----
6 perf-record - Run a command and record its profile into perf.data
9 --------
11 'perf record' [-e <EVENT> | --event=EVENT] [-a] <command>
12 'perf record' [-e <EVENT> | --event=EVENT] [-a] \-- <command> [<options>]
15 -----------
17 from it, into perf.data - without displaying anything.
23 -------
27 -e::
28 --event=::
31         - a symbolic event name	(use 'perf list' to list all events)
33         - a raw PMU event in the form of rN where N is a hexadecimal value
38         - a symbolic or raw PMU event followed by an optional colon
39 	  and a list of event modifiers, e.g., cpu-cycles:p.  See the
40 	  linkperf:perf-list[1] man page for details on event modifiers.
42 	- a symbolically formed PMU event like 'pmu/param1=0x3,param2/' where
46 	- a symbolically formed event like 'pmu/config=M,config1=N,config3=K/'
57 	  - 'period': Set event sampling period
58 	  - 'freq': Set event sampling frequency
59 	  - 'time': Disable/enable time stamping. Acceptable values are 1 for
60 		    enabling time stamping and 0 for disabling it.
62 	  - 'call-graph': Disable/enable callgraph. Acceptable strings are "fp" for
65 	  - 'stack-size': user stack size for dwarf mode
66 	  - 'name' : User defined event name. Single quotes (') may be used to
69 	  - 'aux-output': Generate AUX records instead of events. This requires
71 	  - 'aux-sample-size': Set sample size for AUX area sampling. If the
72 	  '--aux-sample' option has been used, set aux-sample-size=0 to disable
75           See the linkperf:perf-list[1] man page for more parameters.
85 	  perf record -e some_event/@cfg1,@cfg2=config/ ...
92         - a hardware breakpoint event in the form of '\mem:addr[/len][:access]'
97           If you want to profile read-write accesses in 0x1000, just set
102 	- a group of events surrounded by a pair of braces ("{event1,event2,...}").
104 	  prevent the shell interpretation.  You also need to use --group on
107 --filter=<filter>::
108 	Event filter.  This option should follow an event selector (-e).
115 	- tracepoint filters
117 	In the case of tracepoints, multiple '--filter' options are combined
120 	- address filters
123 	address filters	by specifying a non-zero value in
131 	- 'filter': defines a region that will be traced.
132 	- 'start': defines an address at which tracing will begin.
133 	- 'stop': defines an address at which tracing will stop.
134 	- 'tracestop': defines a region in which tracing will stop.
160 	To see the filter that is passed, use the -v option.
168 	- bpf filters
170 	A BPF filter can access the sample data and make a decision based on the
171 	data.  Users need to set an appropriate sample type to use the BPF
174 	The sample data field can be specified in lower case letter.  Multiple
177 	  --filter 'period > 1000, cpu == 1'
179 	  --filter 'mem_op == load || mem_op == store, mem_lvl > l1'
187 	Also user should request to collect that information (with -d option in
190 	  $ sudo perf record -e cycles --filter 'mem_op == load'
192 	   Hint: please add -d option to perf record.
200 	  ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr,
219 --exclude-perf::
221 	an event selector (-e) which selects tracepoint event(s). It adds a
223 	'--filter' exists, the new filter expression will be combined with
226 -a::
227 --all-cpus::
228         System-wide collection from all CPUs (default if no target is specified).
230 -p::
231 --pid=::
234 -t::
235 --tid=::
238         --inherit.
240 -u::
241 --uid=::
244 -r::
245 --realtime=::
248 --no-buffering::
251 -c::
252 --count=::
253 	Event period to sample.
255 -o::
256 --output=::
259 -i::
260 --no-inherit::
263 -F::
264 --freq=::
268 	See --strict-freq.
270 --strict-freq::
273 -m::
274 --mmap-pages=::
276 	specification in bytes with appended unit character - B/K/M/G.
277 	The size is rounded up to the nearest power-of-two page value.
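
The rounding rule above can be sketched in shell. This is only an illustration of the documented behaviour, not perf's own code; the function name `mmap_pages_for_bytes` is hypothetical and the 4096-byte default page size is an assumption (check `getconf PAGESIZE` on the target system).

```shell
# Convert a byte count to a page count rounded up to a power of two.
# Hypothetical helper; the 4096-byte default page size is an assumption.
mmap_pages_for_bytes() {
    bytes=$1
    page_size=${2:-4096}
    # Whole pages needed to hold the requested bytes (ceiling division).
    pages=$(( (bytes + page_size - 1) / page_size ))
    # Smallest power of two that is greater than or equal to pages.
    pow=1
    while [ "$pow" -lt "$pages" ]; do
        pow=$((pow * 2))
    done
    echo "$pow"
}

mmap_pages_for_bytes 1048576   # 1M with 4K pages: 256 pages
mmap_pages_for_bytes 12289     # just over 3 pages: rounded up to 4
```

Under these assumptions, a request of 12289 bytes needs 4 whole pages, which is already a power of two, while a request of 5 whole pages would be rounded up to 8.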
282 -g::
283 	Enables call-graph (stack chain/backtrace) recording for both
286 --call-graph::
287 	Setup and enable call-graph (stack chain/backtrace) recording,
288 	implies -g.  Default is "fp" (for user space).
296 	Valid options are "fp" (frame pointer), "dwarf" (DWARF's CFI -
301 	--fomit-frame-pointer, using the "fp" method will produce bogus
308 	doesn't work with branch stack sampling at the same time.
313 	"--call-graph dwarf,4096".
318 	like "--call-graph fp,32".
320 -q::
321 --quiet::
324 -v::
325 --verbose::
328 -s::
329 --stat::
330 	Record per-thread event counts.  Use it with 'perf report -T' to see
333 -d::
334 --data::
335 	Record the sample virtual addresses.
337 --phys-data::
338 	Record the sample physical addresses.
340 --data-page-size::
343 --code-page-size::
346 -T::
347 --timestamp::
348 	Record the sample timestamps. Use it with 'perf report -D' to see the
351 -P::
352 --period::
353 	Record the sample period.
355 --sample-cpu::
356 	Record the sample cpu.
358 --sample-identifier::
359 	Record the sample identifier i.e. PERF_SAMPLE_IDENTIFIER bit set in
363 -n::
364 --no-samples::
365 	Don't sample.
367 -R::
368 --raw-samples::
369 Collect raw sample records from all opened counters (default for tracepoint counters).
371 -C::
372 --cpu::
374 comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
375 In per-thread mode with inheritance mode on (default), samples are captured only when
381 -B::
382 --no-buildid::
385 the recording process to take a long time, as it needs to process all
389 pathname. You can also set the "record.build-id" config variable to
392 -N::
393 --no-buildid-cache::
396 is sufficient.  You can also set the "record.build-id" config variable to
397 'no-cache' to have the same effect.
399 -G name,...::
400 --cgroup name,...::
402 in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to
406 an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
409 use '-e e1 -e e2 -G foo,foo' or just use '-e e1 -e e2 -G foo'.
412 command line can be used: 'perf stat -e cycles -G cgroup_name -a -e cycles'.
414 -b::
415 --branch-any::
417 This is a shortcut for --branch-filter any. See --branch-filter for more information.
419 -j::
420 --branch-filter::
421 Enable taken branch stack sampling. Each sample captures a series of consecutive
422 taken branches. The number of branches captured with each sample depends on the
427         - any:  any type of branches
428         - any_call: any function call or system call
429         - any_ret: any function return or system call return
430         - ind_call: any indirect call
431         - ind_jmp: any indirect jump
432         - call: direct calls, including far (to/from kernel) calls
433         - u:  only when the branch target is at the user level
434         - k: only when the branch target is in the kernel
435         - hv: only when the target is at the hypervisor level
436 	- in_tx: only when the target is in a hardware transaction
437 	- no_tx: only when the target is not in a hardware transaction
438 	- abort_tx: only when the target is a hardware transaction abort
439 	- cond: conditional branches
440 	- call_stack: save call stack
441 	- no_flags: don't save branch flags, e.g. prediction, misprediction, etc.
442 	- no_cycles: don't save branch cycles
443 	- hw_index: save branch hardware index
444 	- save_type: save branch type during sampling in case binary is not available later
445 		     For the platforms with Intel Arch LBR support (12th-Gen+ client or
446 		     4th-Gen Xeon+ server), the save branch type is unconditionally enabled
448 	- priv: save privilege state during sampling in case binary is not available later
449 	- counter: save occurrences of the event since the last branch entry. Currently, the
460 The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
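
As a small sketch of building that comma-separated list, the helper below joins filter names and rejects any name that is not in the list above. The function `branch_filter_arg` and the variable `BRANCH_FILTERS` are hypothetical names used for illustration; they are not part of perf.

```shell
# Filter names copied from the list above.
BRANCH_FILTERS="any any_call any_ret ind_call ind_jmp call u k hv in_tx no_tx abort_tx cond call_stack no_flags no_cycles hw_index save_type priv counter"

# Join the given filter names with commas; fail on an unknown name.
branch_filter_arg() {
    out=""
    for f in "$@"; do
        case " $BRANCH_FILTERS " in
            *" $f "*) ;;
            *) echo "unknown branch filter: $f" >&2; return 1 ;;
        esac
        # Prefix a comma only when something has already been collected.
        out="${out:+$out,}$f"
    done
    printf '%s\n' "$out"
}

branch_filter_arg any_ret u k
# Possible use: perf record -j "$(branch_filter_arg any_ret u k)" -- <command>
```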
463 -W::
464 --weight::
465 Enable weighted sampling. An additional weight is recorded per sample and can be
469 --namespaces::
472 --all-cgroups::
475 --transaction::
478 --per-thread::
479 Use per-thread mmaps.  By default per-cpu mmaps are created.  This option
480 overrides that and uses per-thread mmaps.  A side-effect of that is that
481 inheritance is automatically disabled.  --per-thread is ignored with a warning
482 if combined with -a or -C options.
484 -D::
485 --delay=::
486 After starting the program, wait msecs before measuring (-1: start with events
488 -D 10-20,30-40 means wait 10 msecs, enable for 10 msecs, wait 10 msecs, enable
492 -I::
493 --intr-regs::
495 each sample. List of captured registers depends on the architecture. This option
496 is off by default. It is possible to select the registers to sample using their
498 --intr-regs=\?. To name registers, pass a comma separated list such as
499 --intr-regs=ax,bx. The list of registers is architecture dependent.
501 --user-regs::
502 Similar to -I, but capture user registers at sample time. To list the available
503 user registers use --user-regs=\?.
505 --running-time::
506 Record running and enabled time for read events (:S)
508 -k::
509 --clockid::
510 Sets the clock id to use for the various time fields in the perf_event_type
515 -S::
516 --snapshot::
521   - 'e': take one last snapshot on exit; guarantees that there is at least one
523   - <size>: if the PMU supports this, specify the desired snapshot size.
528 --aux-sample[=OPTIONS]::
529 Select AUX area sampling. At least one of the events selected by the -e option
531 data from the AUX area. Optionally sample size may be specified, otherwise it
534 --proc-map-timeout::
535 When processing /proc/XXX/maps for pre-existing threads, it may take a long time,
536 because the file may be huge. A time out is needed in such cases.
537 This option sets the time out limit. The default value is 500 ms.
539 --switch-events::
543 by the option --no-switch-events.
545 --vmlinux=PATH::
549 --buildid-all::
550 Record the build-id of all DSOs, regardless of whether they are actually hit.
552 --buildid-mmap::
553 Record build ids in mmap2 events, disables build id cache (implies --no-buildid).
555 --aio[=n]::
560 --affinity=mode::
563   - node - thread affinity mask is set to NUMA node cpu mask of the processed mmap buffer
564   - cpu  - thread affinity mask is set to cpu of the processed mmap buffer
566 --mmap-flush=number::
573 The default option value is 1 byte which means that every time that the output
575 possibly compressed (-z) and written to the output, perf.data or pipe.
582 can take less time than executing more output write syscalls with smaller data
585 -z::
586 --compression-level[=n]::
587 Produce compressed trace using specified level n (default: 1 - fastest compression,
588 22 - smallest trace)
590 --all-kernel::
593 --all-user::
596 --kernel-callchains::
600 --user-callchains::
604 Don't use both --kernel-callchains and --user-callchains at the same time or no
607 --timestamp-filename::
610 --timestamp-boundary::
611 Record timestamp boundary (time of first/last samples).
613 --switch-output[=mode]::
617   - "signal" - when receiving a SIGUSR2 (default value) or
618   - <size>   - when reaching the size threshold, size is expected to
619                be a number with appended unit character - B/K/M/G
620   - <time>   - when reaching the time threshold, time is expected to
621                be a number with appended unit character - s/m/h/d
624                on your configuration  - the number and size of  your  ring
625                buffers (-m). It is generally more precise for higher sizes
632 Implies --timestamp-filename, --no-buildid and --no-buildid-cache.
636   --switch-output --no-no-buildid  --no-no-buildid-cache
638 --switch-output-event::
639 Events that will cause the switch of the perf.data file, auto-selecting
640 --switch-output=signal, the results are similar as internally the side band
643 Uses the same syntax as --event, it will just not be recorded, serving only to
644 switch the perf.data file as soon as the --switch-output event is processed by
651 --switch-max-files=N::
653 When rotating perf.data with --switch-output, only keep N files.
655 --dry-run::
656 Parse options then exit. --dry-run can be used to detect errors in cmdline
659 'perf record --dry-run -e' can act as a BPF script compiler if llvm.dump-obj
662 --synth=TYPE::
665 task status for pre-existing threads.
668 choice in this option.  For example, --synth=no would have MMAP events for
673   - 'task'    - synthesize FORK and COMM events for each task
674   - 'mmap'    - synthesize MMAP events for each process (implies 'task')
675   - 'cgroup'  - synthesize CGROUP events for each cgroup
676   - 'all'     - synthesize all events (default)
677   - 'no'      - do not synthesize any of the above events
679 --tail-synthesize::
680 Instead of collecting non-sample events (for example, fork, comm, mmap) at
682 The collected non-sample events reflect the status of the system when
685 --overwrite::
691 When '--overwrite' and '--switch-output' are used perf records and drops
697 config terms. For example: 'cycles/overwrite/' and 'instructions/no-overwrite/'.
699 Implies --tail-synthesize.
701 --kcore::
704 --max-size=<size>::
705 Limit the sample data max size, <size> is expected to be a number with
706 appended unit character - B/K/M/G
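
To illustrate the size syntax, the sketch below converts a `<size>` string to bytes. It assumes the unit characters are binary multiples (K = 1024 bytes, and so on), which the text above does not state; the helper name `size_to_bytes` is hypothetical.

```shell
# Convert "<number><unit>" (unit is one of B, K, M, G) to a byte count.
# Assumes binary multiples; this is an assumption, not taken from the text.
size_to_bytes() {
    value=${1%[BKMG]}      # strip one trailing unit character, if present
    unit=${1#"$value"}     # whatever was stripped is the unit
    case $value in
        ''|*[!0-9]*) echo "expected <number>[B|K|M|G], got: $1" >&2; return 1 ;;
    esac
    case $unit in
        B) mult=1 ;;
        K) mult=1024 ;;
        M) mult=$((1024 * 1024)) ;;
        G) mult=$((1024 * 1024 * 1024)) ;;
        *) echo "expected <number>[B|K|M|G], got: $1" >&2; return 1 ;;
    esac
    echo $((value * mult))
}

size_to_bytes 10M
# Possible use: perf record --max-size=10M -- <command>
```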
708 --num-thread-synthesize::
713 --pfm-events events::
715 including support for event filters. For example '--pfm-events
718 events cannot be mixed together. The latter must be used with the -e
719 option. The -e option and this one can be mixed and matched.  Events
723 --control=fifo:ctl-fifo[,ack-fifo]::
724 --control=fd:ctl-fd[,ack-fd]::
725 ctl-fifo / ack-fifo are opened and used as ctl-fd / ack-fd as follows.
726 Listen on ctl-fd descriptor for command to control measurement.
730   - 'enable'           : enable events
731   - 'disable'          : disable events
732   - 'enable name'      : enable event 'name'
733   - 'disable name'     : disable event 'name'
734   - 'snapshot'         : AUX area tracing snapshot
735   - 'stop'             : stop perf record
736   - 'ping'             : ping
737   - 'evlist [-v|-g|-F]' : display all events
739                          -F  Show just the sample frequency used for each event.
740                          -v  Show all fields.
741                          -g  Show event group information.
743 Measurements can be started with events disabled using --delay=-1 option. Optionally
744 send control command completion ('ack\n') to ack-fd descriptor to synchronize with the
753  test -p ${ctl_fifo} && unlink ${ctl_fifo}
758  test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
762  perf record -D -1 -e cpu-cycles -a               \
763              --control fd:${ctl_fd},${ctl_fd_ack} \
764              -- sleep 30 &
767  sleep 5  && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
768  sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"
770  exec {ctl_fd_ack}>&-
773  exec {ctl_fd}>&-
776  wait -n ${perf_pid}
779 --threads=<spec>::
793     0,2-4/2-4:1,5-7/5-7
796 the first thread monitors CPUs 0 and 2-4 with the affinity mask 2-4,
797 the second monitors CPUs 1 and 5-7 with the affinity mask 5-7.
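
The layout of such a spec can be made explicit with a small shell sketch that splits it into one line per thread. It assumes, based on the example above, that threads are separated by ':' and that each entry has the form 'cpus/affinity'; the function name `describe_threads_spec` is hypothetical and the code does not call perf.

```shell
# Print one line per thread entry of a --threads spec.
# Assumed format: <cpus>/<affinity>[:<cpus>/<affinity>...]
describe_threads_spec() {
    spec=$1
    i=1
    old_ifs=$IFS
    IFS=:                       # split the spec into per-thread entries
    for entry in $spec; do
        cpus=${entry%%/*}       # part before the first '/'
        affinity=${entry#*/}    # part after the first '/'
        printf 'thread %d: cpus=%s affinity=%s\n' "$i" "$cpus" "$affinity"
        i=$((i + 1))
    done
    IFS=$old_ifs
}

describe_threads_spec '0,2-4/2-4:1,5-7/5-7'
```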
802     - cpu    - create new data streaming thread for every monitored cpu
803     - core   - create new thread to monitor CPUs grouped by a core
804     - package - create new thread to monitor CPUs grouped by a package
805     - numa   - create new thread to monitor CPUs grouped by a NUMA domain
808 order not to spawn multiple per-cpu streaming threads but still avoid LOST
811 filtered through the mask provided by -C option.
813 --debuginfod[=URLs]::
822 --off-cpu::
823 	Enable off-cpu profiling with BPF.  The BPF program will collect
825 	as sample data of a software event named "offcpu-time".  The
826 	sample period will have the time the task slept in nanoseconds.
832 --setup-filter=<action>::
837 include::intel-hybrid.txt[]
840 --------
841 linkperf:perf-stat[1], linkperf:perf-list[1], linkperf:perf-intel-pt[1]