perf/Documentation/topdown.txt

2 ---------------------
11 perf stat --topdown implements this using available metrics that vary
14 % perf stat -a --topdown -I1000
28 fixed counters and do not require generic counters. This allows
53 metric event, and allow user programs to read the performance counters.
84 int slots_fd = perf_event_open(&slots, 0, -1, -1, 0);
104 int metrics_fd = perf_event_open(&metrics, 0, -1, slots_fd, 0);
123 #define RDPMC_FIXED	(1 << 30)	/* return fixed counters */
124 #define RDPMC_METRIC	(1 << 29)	/* return metric counters */
147 _rdpmc calls should not be mixed with reading the metrics and slots counters
148 through system calls, as the kernel will reset these counters after each system
205 	retiring_slots = GET_METRIC(metric_b, 0) * slots_b - retiring_slots_a
206 	bad_spec_slots = GET_METRIC(metric_b, 1) * slots_b - bad_spec_slots_a
207 	fe_bound_slots = GET_METRIC(metric_b, 2) * slots_b - fe_bound_slots_a
208 	be_bound_slots = GET_METRIC(metric_b, 3) * slots_b - be_bound_slots_a
213 	slots_delta = slots_b - slots_a
226 recreated from L1 and L2 metric counters. (Available on Sapphire Rapids and
236 	heavy_ops_slots = GET_METRIC(metric_b, 4) * slots_b - heavy_ops_slots_a
237 	br_mispredict_slots = GET_METRIC(metric_b, 5) * slots_b - br_mispredict_slots_a
238 	fetch_lat_slots = GET_METRIC(metric_b, 6) * slots_b - fetch_lat_slots_a
239 	mem_bound_slots = GET_METRIC(metric_b, 7) * slots_b - mem_bound_slots_a
241 	slots_delta = slots_b - slots_a
243 	light_ops_ratio = retiring_ratio - heavy_ops_ratio;
246 	machine_clears_ratio = bad_spec_ratio - br_mispredict_ratio;
249 	fetch_bw_ratio = fe_bound_ratio - fetch_lat_ratio;
252 	core_bound_ratio = be_bound_ratio - mem_bound_ratio;
267 Resetting metrics counters
272 fraction bit shrinks. So the counters need to be reset regularly.
278 When using perf stat it is recommended to always use the -I option,
281 	perf stat -I 1000 --topdown ...
296 Four pseudo TopDown metric events are exposed for the end-users,
297 topdown-retiring, topdown-bad-spec, topdown-fe-bound and topdown-be-bound.
300 - All the TopDown metric events must be in a group with the SLOTS event.
301 - The SLOTS event must be the leader of the group.
302 - The PERF_FORMAT_GROUP flag must be applied for each TopDown metric
308 For example, perf record -e '{slots, $sampling_event, topdown-retiring}:S'
314 The upper half is also divided into four 8-bit fields for the new level 2
315 metrics. Four more TopDown metric events are exposed for the end-users,
316 topdown-heavy-ops, topdown-br-mispredict, topdown-fetch-lat and
317 topdown-mem-bound.
323     Light_Operations = Retiring - Heavy_Operations
324     Machine_Clears = Bad_Speculation - Branch_Mispredicts
325     Fetch_Bandwidth = Frontend_Bound - Fetch_Latency
326     Core_Bound = Backend_Bound - Memory_Bound
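
The four derivations above can be sketched in C as follows. This is an
illustration under stated assumptions, not the kernel's or perf's code: it
assumes each metric is an 8-bit field of the metrics word in which 0xff
represents 100% of the slots, and it uses the field indices 0-7 shown in the
GET_METRIC() lines earlier. The names struct l2_derived, field_ratio() and
derive_l2() are hypothetical helpers introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: each metric is an 8-bit field of the 64-bit metrics word,
 * and a field value of 0xff stands for 100% of the slots.
 */
#define GET_METRIC(m, i)	(((m) >> ((i) * 8)) & 0xff)

/* Hypothetical container for the four derived level 2 ratios. */
struct l2_derived {
	double light_ops;
	double machine_clears;
	double fetch_bw;
	double core_bound;
};

/*
 * Fraction of the slots elapsed between snapshots a and b that metric
 * field i accounts for: scale each snapshot's field by its slots count,
 * take the difference, and divide by the slots delta.
 */
static double field_ratio(uint64_t metric_a, uint64_t slots_a,
			  uint64_t metric_b, uint64_t slots_b, int i)
{
	double scaled_a = (double)GET_METRIC(metric_a, i) / 0xff * (double)slots_a;
	double scaled_b = (double)GET_METRIC(metric_b, i) / 0xff * (double)slots_b;

	return (scaled_b - scaled_a) / (double)(slots_b - slots_a);
}

/* Derive the four level 2 metrics that have no field of their own. */
static struct l2_derived derive_l2(uint64_t metric_a, uint64_t slots_a,
				   uint64_t metric_b, uint64_t slots_b)
{
	struct l2_derived d;

	/* The slots delta is the divisor, so snapshot b must be later. */
	assert(slots_b > slots_a);

	/* Light_Operations = Retiring - Heavy_Operations */
	d.light_ops = field_ratio(metric_a, slots_a, metric_b, slots_b, 0) -
		      field_ratio(metric_a, slots_a, metric_b, slots_b, 4);
	/* Machine_Clears = Bad_Speculation - Branch_Mispredicts */
	d.machine_clears = field_ratio(metric_a, slots_a, metric_b, slots_b, 1) -
			   field_ratio(metric_a, slots_a, metric_b, slots_b, 5);
	/* Fetch_Bandwidth = Frontend_Bound - Fetch_Latency */
	d.fetch_bw = field_ratio(metric_a, slots_a, metric_b, slots_b, 2) -
		     field_ratio(metric_a, slots_a, metric_b, slots_b, 6);
	/* Core_Bound = Backend_Bound - Memory_Bound */
	d.core_bound = field_ratio(metric_a, slots_a, metric_b, slots_b, 3) -
		       field_ratio(metric_a, slots_a, metric_b, slots_b, 7);

	return d;
}
```

A caller would pass two (metrics, slots) snapshots taken around the region of
interest, for example derive_l2(metric_a, slots_a, metric_b, slots_b). How the
snapshots are obtained (rdpmc or read()) is outside the scope of this sketch.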
340 	perf record -e event_name -W ...
355 	perf stat -M metric_name --record-tpebs ...
359 [1] https://software.intel.com/en-us/top-down-microarchitecture-analysis-method-win
360 [2] https://sites.google.com/site/analysismethods/yasin-pubs
361 [3] https://perf.wiki.kernel.org/index.php/Top-Down_Analysis
362 [4] https://github.com/andikleen/pmu-tools/tree/master/jevents