Lines Matching +full:average +full:- +full:on
4 "MetricExpr": "cstate_pkg@c2\\-residency@ / TSC",
11 "MetricExpr": "cstate_core@c3\\-residency@ / TSC",
18 "MetricExpr": "cstate_pkg@c3\\-residency@ / TSC",
25 "MetricExpr": "cstate_core@c6\\-residency@ / TSC",
32 "MetricExpr": "cstate_pkg@c6\\-residency@ / TSC",
39 "MetricExpr": "cstate_core@c7\\-residency@ / TSC",
46 "MetricExpr": "cstate_pkg@c7\\-residency@ / TSC",
164 …"BriefDescription": "Average latency of a last level cache (LLC) demand and prefetch data read mis…
170 …"BriefDescription": "Average latency of a last level cache (LLC) demand and prefetch data read mis…
176 …"BriefDescription": "Average latency of a last level cache (LLC) demand and prefetch data read mis…
230 …"BriefDescription": "Uops delivered from legacy decode pipeline (Micro-instruction Translation Eng…
237 … "MetricExpr": "(UOPS_ISSUED.ANY - IDQ.MITE_UOPS - IDQ.MS_UOPS - IDQ.DSB_UOPS) / UOPS_ISSUED.ANY",
255 "MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)",
280 …sible; which incur a few cycles load re-issue. However; the short re-issue duration is often hidde…
284 …"BriefDescription": "This metric represents Core fraction of cycles CPU dispatched uops on executi…
298 …-cases for operations that cannot be handled natively by the execution pipeline. For example; when…
304 "MetricExpr": "1 - (tma_frontend_bound + tma_bad_speculation + tma_retiring)",
309 …-of-order scheduler dispatches ready uops into their respective execution units; and once complete…
314 …"MetricExpr": "(UOPS_ISSUED.ANY - UOPS_RETIRED.RETIRE_SLOTS + 4 * (INT_MISC.RECOVERY_CYCLES_ANY / …
319 …s for which the issue-pipeline was blocked due to recovery from earlier incorrect speculation. For…
330 …etched from an incorrectly speculated program path; or stalls when the out-of-order part of the ma…
339 … corrected path; following all sorts of mispredicted branches. For example; branchy code with lo…
345 "MetricExpr": "max(0, tma_microcode_sequencer - tma_assists)",
349 … as in the case of read-modify-write as an example. Since these instructions require multiple uops…
359 …ata written by one Logical Processor are read by another Logical Processor on a different Physical…
363 …"BriefDescription": "This metric represents fraction of slots where Core non-memory issues were of…
365 "MetricExpr": "tma_backend_bound - tma_memory_bound",
370 …-memory issues were of a bottleneck. Shortage in hardware compute resources; or dependencies in s…
374 …n of cycles while the memory subsystem was handling synchronizations due to data-sharing accesses",
380 … cycles while the memory subsystem was handling synchronizations due to data-sharing accesses. Dat…
393 …"BriefDescription": "This metric estimates how often the CPU was stalled on accesses to external m…
395 …"MetricExpr": "(1 - MEM_LOAD_UOPS_RETIRED.L3_HIT / (MEM_LOAD_UOPS_RETIRED.L3_HIT + 7 * MEM_LOAD_UO…
399 …"PublicDescription": "This metric estimates how often the CPU was stalled on accesses to external …
404 …"MetricExpr": "(IDQ.ALL_DSB_CYCLES_ANY_UOPS - IDQ.ALL_DSB_CYCLES_4_UOPS) / tma_info_core_core_clks…
417 …o switches from DSB to MITE pipelines. The DSB (decoded i-cache) is a Uop Cache where the front-en…
426 …-aside Buffers) are processor caches for recently used entries out of the Page Tables that are use…
430 …: "This metric roughly estimates the fraction of cycles spent handling first-level data TLB store …
435 …-level data TLB store misses. As with ordinary data caching; focus on improving data locality and…
444 …a multithreading hiccup; where multiple Logical Processors contend on different data-elements mapp…
454 …the misses are satisfied from (metric values >1 are valid). Often it hints on approaching bandwidt…
459 "MetricExpr": "tma_frontend_bound - tma_fetch_latency",
474 …he CPU was stalled due to Frontend latency issues. For example; instruction-cache misses; iTLB mi…
484 …on by the Backend part. Within the Frontend; a branch predictor predicts the next address to fetch…
488 … slots where the CPU was retiring heavy-weight operations -- instructions that require two or more…
494 …he CPU was retiring heavy-weight operations -- instructions that require two or more uops or micro…
513 …"BriefDescription": "Number of Instructions per non-speculative Branch Misprediction (JEClear) (lo…
520 …"BriefDescription": "Core actual clocks when any Logical Processor is active on the Physical Core",
526 "BriefDescription": "Instructions Per Cycle across hyper-threads (per physical core)",
532 …efDescription": "Instruction-Level-Parallelism (average number of uops executed when there is exec…
601 "BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
607 "BriefDescription": "Average per-core data fill bandwidth to the L2 cache [GB / sec]",
613 "BriefDescription": "Average per-core data fill bandwidth to the L3 cache [GB / sec]",
619 … "BriefDescription": "Average per-thread data fill bandwidth to the L1 data cache [GB / sec]",
631 "BriefDescription": "Average per-thread data fill bandwidth to the L2 cache [GB / sec]",
649 "BriefDescription": "Average per-thread data fill bandwidth to the L3 cache [GB / sec]",
661 "BriefDescription": "Average Parallel L2 cache miss data reads",
667 "BriefDescription": "Average Latency for L2 cache miss demand Loads",
673 "BriefDescription": "Average Parallel L2 cache miss demand Loads",
679 …"BriefDescription": "Actual Average Latency for L1 data-cache miss demand load operations (in core…
686 …"BriefDescription": "Memory-Level-Parallelism (average number of L1 miss demand loads when there is…
691 …ublicDescription": "Memory-Level-Parallelism (average number of L1 miss demand loads when there is …
701 …"BriefDescription": "Average number of Uops retired in cycles where at least one uop has retired.",
707 "BriefDescription": "Measured Average Core Frequency for unhalted processors [GHz]",
713 "BriefDescription": "Average CPU Utilization (percentage)",
719 "BriefDescription": "Average number of utilized CPUs",
725 "BriefDescription": "Average external Memory Bandwidth Use for reads and writes [GB / sec]",
729 …"PublicDescription": "Average external Memory Bandwidth Use for reads and writes [GB / sec]. Relat…
752 "BriefDescription": "Average number of parallel data read requests to external memory",
756 …"PublicDescription": "Average number of parallel data read requests to external memory. Accounts f…
759 … "BriefDescription": "Average latency of data read request to external memory (in nanoseconds)",
763 …tion": "Average latency of data read request to external memory (in nanoseconds). Accounts for dem…
767 …"MetricExpr": "(1 - CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / (CPU_CLK_UNHALTED.REF_XCLK_ANY / 2) if #S…
772 "BriefDescription": "Socket actual clocks when any core is active on that socket",
778 "BriefDescription": "Average Frequency Utilization relative to nominal frequency",
784 "BriefDescription": "Measured Average Uncore Frequency for the SoC [GHz]",
790 … "BriefDescription": "Per-Logical Processor actual clocks when the Logical Processor is active.",
808 …"BriefDescription": "Total issue-pipeline slots (per-Physical Core till ICL; per-Logical Processor…
838 …"MetricExpr": "max((min(CPU_CLK_UNHALTED.THREAD, CYCLE_ACTIVITY.STALLS_LDM_PENDING) - CYCLE_ACTIVI…
842 …on older stores; a load might suffer due to high latency even though it is being satisfied by the …
847 …"MetricExpr": "(CYCLE_ACTIVITY.STALLS_L1D_PENDING - CYCLE_ACTIVITY.STALLS_L2_PENDING) / tma_info_t…
884 …slots where the CPU was retiring light-weight operations -- instructions that require no more than…
885 "MetricExpr": "tma_retiring - tma_heavy_operations",
890 …-weight operations -- instructions that require no more than one uop (micro-operation). This corre…
894 …"BriefDescription": "This metric represents Core fraction of cycles CPU dispatched uops on executi…
896 …HED_PORT.PORT_2 + UOPS_DISPATCHED_PORT.PORT_3 + UOPS_DISPATCHED_PORT.PORT_7 - UOPS_DISPATCHED_PORT…
900 …"PublicDescription": "This metric represents Core fraction of cycles CPU dispatched uops on execut…
925 "MetricExpr": "tma_bad_speculation - tma_branch_mispredicts",
930 …-of-order portion of the machine needs to recover its state after the clear. For example; this can…
934 …as likely hurt due to approaching bandwidth limits of external memory - DRAM ([SPR-HBM] and/or HBM…
939 …- DRAM ([SPR-HBM] and/or HBM). The underlying heuristic assumes that a similar off-core traffic i…
943 …e the performance was likely hurt due to latency from external memory - DRAM ([SPR-HBM] and/or HBM…
944 …EAD, OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD) / tma_info_thread_clks - tma_mem_bandwidth",
948 …e the performance was likely hurt due to latency from external memory - DRAM ([SPR-HBM] and/or HBM…
954 …- (cpu@UOPS_EXECUTED.CORE\\,cmask\\=3@ if tma_info_thread_ipc > 1.8 else cpu@UOPS_EXECUTED.CORE\\,…
959 …-completed in-flight memory demand loads which coincides with execution units starvation; in addit…
973 …"MetricExpr": "(IDQ.ALL_MITE_CYCLES_ANY_UOPS - IDQ.ALL_MITE_CYCLES_4_UOPS) / tma_info_core_core_cl…
977 …the legacy decode pipeline). This pipeline is used for code that was not pre-cached in the DSB or …
986 … Commonly used instructions are optimized for delivery by the DSB (decoded i-cache) or MITE (legac…
990 …"BriefDescription": "This metric represents Core fraction of cycles CPU dispatched uops on executi…
995 …"PublicDescription": "This metric represents Core fraction of cycles CPU dispatched uops on execut…
999 …"BriefDescription": "This metric represents Core fraction of cycles CPU dispatched uops on executi…
1004 …"PublicDescription": "This metric represents Core fraction of cycles CPU dispatched uops on execut…
1008 …represents Core fraction of cycles CPU dispatched uops on execution port 2 ([SNB+]Loads and Store-…
1013 …represents Core fraction of cycles CPU dispatched uops on execution port 2 ([SNB+]Loads and Store-…
1017 …represents Core fraction of cycles CPU dispatched uops on execution port 3 ([SNB+]Loads and Store-…
1022 …represents Core fraction of cycles CPU dispatched uops on execution port 3 ([SNB+]Loads and Store-…
1026 …is metric represents Core fraction of cycles CPU dispatched uops on execution port 4 (Store-data)",
1031 … metric represents Core fraction of cycles CPU dispatched uops on execution port 4 (Store-data). S…
1035 …"BriefDescription": "This metric represents Core fraction of cycles CPU dispatched uops on executi…
1040 …"PublicDescription": "This metric represents Core fraction of cycles CPU dispatched uops on execut…
1044 …"BriefDescription": "This metric represents Core fraction of cycles CPU dispatched uops on executi…
1049 …"PublicDescription": "This metric represents Core fraction of cycles CPU dispatched uops on execut…
1053 … represents Core fraction of cycles CPU dispatched uops on execution port 7 ([HSW+]simple Store-ad…
1058 … represents Core fraction of cycles CPU dispatched uops on execution port 7 ([HSW+]simple Store-ad…
1062 … the CPU performance was potentially limited due to Core computation issues (non divider-related)",
1064 …- (cpu@UOPS_EXECUTED.CORE\\,cmask\\=3@ if tma_info_thread_ipc > 1.8 else cpu@UOPS_EXECUTED.CORE\\,…
1068 …-related). Two distinct categories can be attributed to this metric: (1) heavy data-dependency …
1072 …"BriefDescription": "This metric represents fraction of cycles CPU executed no uops on any executi…
1073 …SMT_on else (min(CPU_CLK_UNHALTED.THREAD, CYCLE_ACTIVITY.CYCLES_NO_EXECUTE) - (RS_EVENTS.EMPTY_CYC…
1077 …cycles CPU executed no uops on any execution port (Logical Processor cycles since ICL, Physical Co…
1081 …resents fraction of cycles where the CPU executed total of 1 uop per cycle on all execution ports …
1082 …_EXECUTED.CORE\\,cmask\\=1@ - cpu@UOPS_EXECUTED.CORE\\,cmask\\=2@) / 2 if #SMT_on else (cpu@UOPS_E…
1086 …on all execution ports (Logical Processor cycles since ICL, Physical Core cycles otherwise). This …
1090 …etric represents fraction of cycles CPU executed total of 2 uops per cycle on all execution ports …
1091 …_EXECUTED.CORE\\,cmask\\=2@ - cpu@UOPS_EXECUTED.CORE\\,cmask\\=3@) / 2 if #SMT_on else (cpu@UOPS_E…
1095 …on all execution ports (Logical Processor cycles since ICL, Physical Core cycles otherwise). Loop…
1099 …presents fraction of cycles CPU executed total of 3 or more uops per cycle on all execution ports …
1113 …r sockets including synchronization issues. This is often caused by non-optimal NUMA allocati…
1122 …ystem was handling loads from remote memory. This is often caused by non-optimal NUMA allocati…
1132 …ions-per-cycle (see IPC metric). Note that a high Retiring value does not necessarily mean there is …
1136 … estimates fraction of cycles handling memory load split accesses - loads that cross 64-byte cache …
1142 … estimates fraction of cycles handling memory load split accesses - loads that cross 64-byte cache …
1151 …resents rate of split store accesses. Consider aligning your data to the 64-byte cache line granu…
1155 …f cycles where the Super Queue (SQ) was full taking into account all request-types and both hardwa…
1160 …f cycles where the Super Queue (SQ) was full taking into account all request-types and both hardwa…
1164 … CPU was stalled due to RFO store memory accesses; RFO stores issue a read-for-ownership request b…
1169 …ses; RFO stores issue a read-for-ownership request before the write. Even though store accesses do …
1178 …perations in the pipeline; a load can avoid waiting for memory if a prior in-flight store is writi…
1184 …"MetricExpr": "(L2_RQSTS.RFO_HIT * 9 * (1 - MEM_UOPS_RETIRED.LOCK_LOADS / MEM_UOPS_RETIRED.ALL_STO…
1188 …-of-order core performance; however; holding resources for a longer time can lead to undesired imp…
1192 …"BriefDescription": "This metric represents Core fraction of cycles CPU dispatched uops on executi…
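
Several of the MetricExpr hits above are shown in full and are plain differences of other tma_* fractions (file lines 304, 345, 365, 459, 885, 925), and the C-state hits divide a residency counter by TSC. The following is a minimal Python sketch of that arithmetic only. It assumes the input fractions have already been computed elsewhere; the function names and the result keys (core_bound, fetch_bandwidth, and so on) are hypothetical labels chosen for illustration, since the listing does not show the MetricName fields, and this is not how perf itself evaluates the expressions.

```python
def derived_tma_metrics(m):
    """Combine already-computed tma_* fractions the way the fully shown
    MetricExpr lines do. `m` maps metric names to fractions in [0, 1].
    Result keys are hypothetical labels, not names taken from the file."""
    # File line 304: 1 - (tma_frontend_bound + tma_bad_speculation + tma_retiring)
    backend_bound = 1 - (
        m["tma_frontend_bound"] + m["tma_bad_speculation"] + m["tma_retiring"]
    )
    return {
        "backend_bound": backend_bound,
        # File line 365: tma_backend_bound - tma_memory_bound
        "core_bound": backend_bound - m["tma_memory_bound"],
        # File line 459: tma_frontend_bound - tma_fetch_latency
        "fetch_bandwidth": m["tma_frontend_bound"] - m["tma_fetch_latency"],
        # File line 885: tma_retiring - tma_heavy_operations
        "light_operations": m["tma_retiring"] - m["tma_heavy_operations"],
        # File line 925: tma_bad_speculation - tma_branch_mispredicts
        "machine_clears": m["tma_bad_speculation"] - m["tma_branch_mispredicts"],
        # File line 345: max(0, tma_microcode_sequencer - tma_assists)
        "microcode_minus_assists": max(
            0, m["tma_microcode_sequencer"] - m["tma_assists"]
        ),
    }


def cstate_residency(residency_count, tsc):
    """C-state residency as a fraction of elapsed TSC ticks, following the
    shape of the cstate_*@...residency@ / TSC expressions."""
    return residency_count / tsc


sample = {
    "tma_frontend_bound": 0.2,
    "tma_bad_speculation": 0.1,
    "tma_retiring": 0.4,
    "tma_memory_bound": 0.2,
    "tma_fetch_latency": 0.15,
    "tma_heavy_operations": 0.05,
    "tma_branch_mispredicts": 0.08,
    "tma_microcode_sequencer": 0.01,
    "tma_assists": 0.02,
}
result = derived_tma_metrics(sample)
```

The max(0, ...) clamp mirrors file line 345: when the subtracted term exceeds the sequencer fraction, as in the sample values, the result is 0 rather than negative.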