1 perf-arm-spe(1)
5 ----
6 perf-arm-spe - Support for Arm Statistical Profiling Extension within Perf tools
9 --------
11 'perf record' -e arm_spe//
14 -----------
17 events down to individual instructions. Rather than being interrupt-driven, it picks an
33 architectural instructions or all micro-ops. Sampling happens at a programmable interval. The
35 sample. This minimum interval is used by the driver if no interval is specified. A pseudo-random
62 ----------------
72 -------------
74 - Sampling, rather than tracing, cuts down the profiling problem to something more manageable for
77 - Allows precise attribution data, including: Full PC of instruction, data virtual and physical
80 - Allows correlation between an instruction and events, such as TLB and cache misses. (Data source
84 However, SPE does not provide any call-graph information, and relies on statistical methods.
87 ----------
99 -----------------------------------------
101 If an implementation samples micro-operations instead of instructions, the results of sampling must
104 For example, if a given instruction A is always converted into two micro-operations, A0 and A1, it
111 -------------------
115 Depending on CPU model, the kernel may need to be booted with page table isolation disabled
128 Capturing SPE with perf command-line tools
129 ------------------------------------------
133 perf record -e arm_spe// -- ./mybench
135 The sample period is set from the -c option, and because the minimum interval is used by default
141 These are placed between the // in the event and comma separated. For example '-e
144 branch_filter=1 - collect branches only (PMSFCR.B)
145 event_filter=<mask> - filter on specific events (PMSEVFR) - see bitfield description below
146 jitter=1 - use jitter to avoid resonance when sampling (PMSIRR.RND)
147 load_filter=1 - collect loads only (PMSFCR.LD)
148 min_latency=<n> - collect only samples with this latency or higher* (PMSLATFR)
149 …pa_enable=1 - collect physical address (as well as VA) of loads/stores (PMSCR.PA) - requir…
150 …pct_enable=1 - collect physical timestamp instead of virtual timestamp (PMSCR.PCT) - requir…
151 store_filter=1 - collect stores only (PMSFCR.ST)
152 ts_enable=1 - enable timestamping with value of generic timer (PMSCR.TS)
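As a rough sketch of how the options listed above combine, the helper below joins a set of `name=value` settings with commas and places them between the slashes of the `arm_spe` event. The function name `spe_event` and the chosen option values are illustrative assumptions, not part of perf; the snippet only prints a command line and does not run perf itself.

```shell
# Hypothetical helper (not part of perf): build an arm_spe event string
# from any number of name=value options.
spe_event() {
    local IFS=','
    # "$*" joins all arguments using the first character of IFS (a comma).
    printf 'arm_spe/%s/\n' "$*"
}

# The option values below are assumptions chosen for illustration only.
event=$(spe_event load_filter=1 min_latency=100 ts_enable=1)

# Print the resulting command instead of executing it.
echo "perf record -e $event -- ./mybench"
```

With no arguments the helper yields `arm_spe//`, which matches the plain form of the event used elsewhere in this listing.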
159 bit 1 - instruction retired (i.e. omit speculative instructions)
160 bit 3 - L1D refill
161 bit 5 - TLB refill
162 bit 7 - mispredict
163 bit 11 - misaligned access
167 perf record -e arm_spe/event_filter=2/ -- ./mybench
171 perf record -e arm_spe/event_filter=0x80/ -- ./mybench
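The two masks used above follow directly from the bit numbers in the event list: bit 1 gives 2 and bit 7 gives 0x80. A minimal sketch of that arithmetic is shown below; `spe_event_mask` is a hypothetical helper name, not a perf feature. How the hardware treats a mask with several bits set is defined by PMSEVFR and is worth confirming against the Arm architecture documentation before relying on it.

```shell
# Hypothetical helper (not part of perf): convert a list of event bit
# numbers into a hexadecimal mask suitable for event_filter=<mask>.
spe_event_mask() {
    local mask=0 bit
    for bit in "$@"; do
        # Set the bit that corresponds to each requested event.
        mask=$(( mask | (1 << bit) ))
    done
    printf '0x%x\n' "$mask"
}

spe_event_mask 1      # bit 1, instruction retired -> 0x2
spe_event_mask 7      # bit 7, mispredict          -> 0x80
spe_event_mask 3 5    # bits 3 and 5 combined      -> 0x28
```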
184 21 l1d-miss
185 897 l1d-access
186 5 llc-miss
187 7 llc-access
188 2 tlb-miss
189 1K tlb-access
190 36 branch-miss
191 0 remote-access
200 perf report --itrace=i1i
202 Memory access details are also stored in the samples and can be viewed with:
204 perf report --mem-mode
209 - "Cannot find PMU `arm_spe'. Missing kernel support?"
214 - "Arm SPE CONTEXT packets not found in the traces."
219 - Excessively large perf.data file size
225 --------
227 linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1],
228 linkperf:perf-inject[1]