x86/alderlake/adl-metrics.json

4         "MetricExpr": "cstate_pkg@c10\\-residency@ / TSC",
11         "MetricExpr": "cstate_core@c1\\-residency@ / TSC",
18         "MetricExpr": "cstate_pkg@c2\\-residency@ / TSC",
25         "MetricExpr": "cstate_pkg@c3\\-residency@ / TSC",
32         "MetricExpr": "cstate_core@c6\\-residency@ / TSC",
39         "MetricExpr": "cstate_pkg@c6\\-residency@ / TSC",
46         "MetricExpr": "cstate_core@c7\\-residency@ / TSC",
53         "MetricExpr": "cstate_pkg@c7\\-residency@ / TSC",
60         "MetricExpr": "cstate_pkg@c8\\-residency@ / TSC",
67         "MetricExpr": "cstate_pkg@c9\\-residency@ / TSC",
74         "MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)",
89         "MetricExpr": "(max(cycles\\-t - cycles\\-ct, 0) / cycles if has_event(cycles\\-t) else 0)",
96         "MetricExpr": "(cycles\\-t / el\\-start if has_event(el\\-start) else 0)",
103         "MetricExpr": "(cycles\\-t / tx\\-start if has_event(cycles\\-t) else 0)",
110         "MetricExpr": "(cycles\\-t / cycles if has_event(cycles\\-t) else 0)",
139 …"MetricExpr": "(5 * cpu_atom@CPU_CLK_UNHALTED.CORE@ - (cpu_atom@TOPDOWN_FE_BOUND.ALL@ + cpu_atom@T…
304 …"MetricExpr": "cpu_atom@INST_RETIRED.ANY@ / (cpu_atom@BR_MISP_RETIRED.COND@ - cpu_atom@BR_MISP_RET…
471 …umber of machine clears relative to thousands of instructions retired, due to self-modifying code",
477 …    "BriefDescription": "Percentage of total non-speculative loads with an address aliasing block",
483 …"BriefDescription": "Percentage of total non-speculative loads with a store forward or unknown sto…
531 …    "BriefDescription": "Percentage of total non-speculative loads that perform one or more locks",
537         "BriefDescription": "Percentage of total non-speculative loads that are splits",
682         "MetricExpr": "tma_backend_bound - tma_core_bound",
713         "MetricGroup": "SoC",
732 …er-cases for operations that cannot be handled natively by the execution pipeline. For example; wh…
748 …opdown\\-be\\-bound@ / (cpu_core@topdown\\-fe\\-bound@ + cpu_core@topdown\\-bad\\-spec@ + cpu_core…
753 …-of-order scheduler dispatches ready uops into their respective execution units; and once complete…
760         "MetricExpr": "max(1 - (tma_frontend_bound + tma_backend_bound + tma_retiring), 0)",
765 …s for which the issue-pipeline was blocked due to recovery from earlier incorrect speculation. For…
771 …own\\-br\\-mispredict@ / (cpu_core@topdown\\-fe\\-bound@ + cpu_core@topdown\\-bad\\-spec@ + cpu_co…
776 …etched from an incorrectly speculated program path; or stalls when the out-of-order part of the ma…
786 … corrected path; following all sorts of mispredicted branches. For example; branchy code with lo…
791 … represents fraction of cycles the CPU was stalled due to staying in C0.1 power-performance optimized…
800 … represents fraction of cycles the CPU was stalled due to staying in C0.2 power-performance optimized…
810         "MetricExpr": "max(0, tma_microcode_sequencer - tma_assists)",
814 … as in the case of read-modify-write as an example. Since these instructions require multiple uops…
820 …"MetricExpr": "(1 - tma_branch_mispredicts / tma_bad_speculation) * cpu_core@INT_MISC.CLEAR_RESTEE…
839 …"BriefDescription": "This metric represents fraction of slots where Core non-memory issues were of…
840         "MetricExpr": "max(0, tma_backend_bound - tma_memory_bound)",
845 …-memory issues were a bottleneck.  Shortage in hardware compute resources; or dependencies in s…
850 …n of cycles while the memory subsystem was handling synchronizations due to data-sharing accesses",
851 …_HIT_RETIRED.XSNP_NO_FWD@ + cpu_core@MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD@ * (1 - cpu_core@OCR.DEMAND_…
855 … cycles while the memory subsystem was handling synchronizations due to data-sharing accesses. Dat…
860 …"BriefDescription": "This metric represents fraction of cycles where decoder-0 was the only active…
861 …"MetricExpr": "(cpu_core@INST_DECODED.DECODERS\\,cmask\\=1@ - cpu_core@INST_DECODED.DECODERS\\,cma…
865 …"PublicDescription": "This metric represents fraction of cycles where decoder-0 was the only activ…
891 …"MetricExpr": "(cpu_core@IDQ.DSB_CYCLES_ANY@ - cpu_core@IDQ.DSB_CYCLES_OK@) / tma_info_core_core_c…
905 …o switches from DSB to MITE pipelines. The DSB (decoded i-cache) is a Uop Cache where the front-en…
911 …@DTLB_LOAD_MISSES.WALK_ACTIVE@, max(cpu_core@CYCLE_ACTIVITY.CYCLES_MEM_ANY@ - cpu_core@MEMORY_ACTI…
915 …-aside Buffers) are processor caches for recently used entries out of the Page Tables that are use…
920 …: "This metric roughly estimates the fraction of cycles spent handling first-level data TLB store …
925 …-level data TLB store misses.  As with ordinary data caching; focus on improving data locality and…
935 …hreading hiccup; where multiple Logical Processors contend on different data-elements mapped into …
951         "MetricExpr": "max(0, tma_frontend_bound - tma_fetch_latency)",
962 …n\\-fetch\\-lat@ / (cpu_core@topdown\\-fe\\-bound@ + cpu_core@topdown\\-bad\\-spec@ + cpu_core@top…
967 …he CPU was stalled due to Frontend latency issues.  For example; instruction-cache misses; iTLB mi…
973         "MetricExpr": "max(0, tma_heavy_operations - tma_microcode_sequencer)",
977 …t are decoded into two or up to ([SNB+] four; [ADL+] five) uops. This highly-correlates with the n…
982 …"BriefDescription": "This metric represents overall arithmetic floating-point (FP) operations frac…
987 …-point (FP) operations fraction the CPU has executed (retired). Note this metric's value may excee…
997 …ts. FP Assist may apply when working with very small floating point values (so-called Denormals).",
1002 …"BriefDescription": "This metric approximates arithmetic floating-point (FP) scalar uops fraction …
1007 …"PublicDescription": "This metric approximates arithmetic floating-point (FP) scalar uops fraction…
1012 …"BriefDescription": "This metric approximates arithmetic floating-point (FP) vector uops fraction …
1017 …"PublicDescription": "This metric approximates arithmetic floating-point (FP) vector uops fraction…
1022 …tric approximates arithmetic FP vector uops fraction the CPU has retired for 128-bit wide vectors",
1027 … approximates arithmetic FP vector uops fraction the CPU has retired for 128-bit wide vectors. May…
1032 …tric approximates arithmetic FP vector uops fraction the CPU has retired for 256-bit wide vectors",
1037 … approximates arithmetic FP vector uops fraction the CPU has retired for 256-bit wide vectors. May…
1044 …n\\-fe\\-bound@ / (cpu_core@topdown\\-fe\\-bound@ + cpu_core@topdown\\-bad\\-spec@ + cpu_core@topd…
1049 …-lines are fetched from the memory subsystem; parsed into instructions; and lastly decoded into mi…
1054 …represents fraction of slots where the CPU was retiring fused instructions -- where one uop can re…
1059 …represents fraction of slots where the CPU was retiring fused instructions -- where one uop can re…
1064 … slots where the CPU was retiring heavy-weight operations -- instructions that require two or more…
1065 …pdown\\-heavy\\-ops@ / (cpu_core@topdown\\-fe\\-bound@ + cpu_core@topdown\\-bad\\-spec@ + cpu_core…
1070 …he CPU was retiring heavy-weight operations -- instructions that require two or more uops or micro…
1085 …"BriefDescription": "Branch Misprediction Cost: Fraction of TMA slots wasted per non-speculative b…
1089 …"PublicDescription": "Branch Misprediction Cost: Fraction of TMA slots wasted per non-speculative …
1093 …"BriefDescription": "Instructions per retired mispredicts for conditional non-taken branches (lowe…
1125 …"BriefDescription": "Number of Instructions per non-speculative Branch Misprediction (JEClear) (lo…
1140 …      "BriefDescription": "Probability of Core Bound bottleneck hidden by SMT-profiling artifacts",
1141 …"MetricExpr": "(100 * (1 - tma_core_bound / tma_ports_utilization if tma_core_bound < tma_ports_ut…
1148 …"BriefDescription": "Total pipeline cost of DSB (uop cache) hits - subset of the Instruction_Fetch…
1153 …"PublicDescription": "Total pipeline cost of DSB (uop cache) hits - subset of the Instruction_Fetc…
1157 …"BriefDescription": "Total pipeline cost of DSB (uop cache) misses - subset of the Instruction_Fet…
1162 …"PublicDescription": "Total pipeline cost of DSB (uop cache) misses - subset of the Instruction_Fe…
1166 …"BriefDescription": "Total pipeline cost of Instruction Cache misses - subset of the Big_Code Bott…
1171 …"PublicDescription": "Total pipeline cost of Instruction Cache misses - subset of the Big_Code Bot…
1175 …of instruction fetch related bottlenecks by large code footprint programs (i-side cache; TLB and B…
1183 …"BriefDescription": "Total pipeline cost of instructions used for program control-flow - a subset …
1188 …"PublicDescription": "Total pipeline cost of instructions used for program control-flow - a subset…
1192 …"BriefDescription": "Total pipeline cost of external Memory- or Cache-Bandwidth related bottleneck…
1197 …"PublicDescription": "Total pipeline cost of external Memory- or Cache-Bandwidth related bottlenec…
1201 …"BriefDescription": "Total pipeline cost of external Memory- or Cache-Latency related bottlenecks",
1206 …"PublicDescription": "Total pipeline cost of external Memory- or Cache-Latency related bottlenecks…
1210 …     "BriefDescription": "Total pipeline cost when the execution is compute-bound - an estimation",
1215 …ine cost when the execution is compute-bound - an estimation. Covers Core Bound when High ILP as w…
1219 …tch bandwidth related bottlenecks (when the front-end could not sustain operations delivery to the…
1220 …- (1 - 10 * tma_microcode_sequencer * tma_other_mispredicts / tma_branch_mispredicts) * tma_fetch_…
1228 …"MetricExpr": "100 * ((1 - cpu_core@INST_RETIRED.REP_ITERATION@ / cpu_core@UOPS_RETIRED.MS\\,cmask…
1232 …"PublicDescription": "Total pipeline cost of irregular execution (e.g. FP-assists in HPC, Wait tim…
1236 …ription": "Total pipeline cost of Memory Address Translation related bottlenecks (data-side TLBs)",
1241 …"Total pipeline cost of Memory Address Translation related bottlenecks (data-side TLBs). Related m…
1246 …t_stores + tma_store_latency + tma_streaming_stores - tma_store_latency)) + tma_machine_clears * (…
1255 …"MetricExpr": "100 * (1 - 10 * tma_microcode_sequencer * tma_other_mispredicts / tma_branch_mispre…
1263         "BriefDescription": "Total pipeline cost of remaining bottlenecks in the back-end",
1264 …"MetricExpr": "100 - (tma_info_bottleneck_big_code + tma_info_bottleneck_instruction_fetch_bw + tm…
1268 …aining bottlenecks in the back-end. Examples include data-dependencies (Core Bound when Low ILP) a…
1272 …"BriefDescription": "Total pipeline cost of \"useful operations\" - the portion of Retiring catego…
1273 …tiring - (cpu_core@BR_INST_RETIRED.ALL_BRANCHES@ + 2 * cpu_core@BR_INST_RETIRED.NEAR_CALL@ + cpu_c…
1287         "BriefDescription": "Fraction of branches that are non-taken conditionals",
1302 …"MetricExpr": "(cpu_core@BR_INST_RETIRED.NEAR_TAKEN@ - cpu_core@BR_INST_RETIRED.COND_TAKEN@ - 2 * …
1309 …"MetricExpr": "1 - (tma_info_branches_cond_nt + tma_info_branches_cond_tk + tma_info_branches_call…
1322         "BriefDescription": "Instructions Per Cycle across hyper-threads (per physical core)",
1343 …BriefDescription": "Actual per-core usage of the Floating Point non-X87 execution units (regardles…
1347 …-core usage of the Floating Point non-X87 execution units (regardless of precision or vector-width…
1351 …efDescription": "Instruction-Level-Parallelism (average number of uops executed when there is exec…
1367 …tion": "Average number of cycles of a switch from the DSB fetch-unit to MITE fetch unit - see DSB_…
1374         "BriefDescription": "Average number of Uops issued by front-end when it issued something",
1388 …"BriefDescription": "Instructions per non-speculative DSB miss (lower number means higher occurren…
1424 …"BriefDescription": "Average number of cycles the front-end was delayed due to an Unknown Branch d…
1428 …"PublicDescription": "Average number of cycles the front-end was delayed due to an Unknown Branch …
1456 …"BriefDescription": "Instructions per FP Arithmetic AVX/SSE 128-bit instruction (lower number mean…
1461 …"PublicDescription": "Instructions per FP Arithmetic AVX/SSE 128-bit instruction (lower number mea…
1465 …"BriefDescription": "Instructions per FP Arithmetic AVX* 256-bit instruction (lower number means h…
1470 …"PublicDescription": "Instructions per FP Arithmetic AVX* 256-bit instruction (lower number means …
1474 …"BriefDescription": "Instructions per FP Arithmetic Scalar Double-Precision instruction (lower num…
1479 …"PublicDescription": "Instructions per FP Arithmetic Scalar Double-Precision instruction (lower nu…
1483 …"BriefDescription": "Instructions per FP Arithmetic Scalar Single-Precision instruction (lower num…
1488 …"PublicDescription": "Instructions per FP Arithmetic Scalar Single-Precision instruction (lower nu…
1556         "BriefDescription": "Average per-core data fill bandwidth to the L1 data cache [GB / sec]",
1563         "BriefDescription": "Average per-core data fill bandwidth to the L2 cache [GB / sec]",
1570         "BriefDescription": "Average per-core data access bandwidth to the L3 cache [GB / sec]",
1577         "BriefDescription": "Average per-core data fill bandwidth to the L3 cache [GB / sec]",
1584 … instructions for retired demand loads (L1D misses that merge into ongoing miss-handling entries)",
1591 …      "BriefDescription": "Average per-thread data fill bandwidth to the L1 data cache [GB / sec]",
1612         "BriefDescription": "Average per-thread data fill bandwidth to the L2 cache [GB / sec]",
1620 …"MetricExpr": "1e3 * (cpu_core@L2_RQSTS.REFERENCES@ - cpu_core@L2_RQSTS.MISS@) / cpu_core@INST_RET…
1661         "BriefDescription": "Average per-thread data access bandwidth to the L3 cache [GB / sec]",
1668         "BriefDescription": "Average per-thread data fill bandwidth to the L3 cache [GB / sec]",
1710 …"BriefDescription": "Actual Average Latency for L1 data-cache miss demand load operations (in core…
1724         "BriefDescription": "Un-cacheable retired load per kilo instruction",
1731 …"BriefDescription": "Memory-Level-Parallelism (average number of L1 miss demand load when there is…
1735 …ublicDescription": "Memory-Level-Parallelism (average number of L1 miss demand load when there is …
1739 … level TLB) code speculative misses per kilo instruction (misses of any page-size that complete th…
1746 …l TLB) data load speculative misses per kilo instruction (misses of any page-size that complete th…
1761 … TLB) data store speculative misses per kilo instruction (misses of any page-size that complete th…
1768 …"BriefDescription": "Instruction-Level-Parallelism (average number of uops executed when there is …
1812 …    "BriefDescription": "Estimated fraction of retirement-cycles dealing with repeat instructions",
1820 …et unhalted; covering legacy PAUSE instruction, as well as C0.1 / C0.2 power-performance optimized…
1851         "MetricGroup": "HPC;MemOffcore;MemoryBW;SoC;tma_issueBW",
1861 …gate across all supported options of: FP precisions, scalar and vector instructions, vector-width",
1890         "MetricGroup": "Mem;MemoryBW;SoC",
1899         "MetricGroup": "Mem;MemoryLat;SoC",
1901 … (in nanoseconds). Accounts for demand loads and L1/L2 prefetches. ([RKL+]memory-controller only)",
1906 …"MetricExpr": "(1 - cpu_core@CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE@ / cpu_core@CPU_CLK_UNHALTED.REF_D…
1914         "MetricGroup": "SoC",
1926 …   "BriefDescription": "Per-Logical Processor actual clocks when the Logical Processor is active.",
1940         "BriefDescription": "The ratio of Executed- by Issued-Uops",
1944 …"PublicDescription": "The ratio of Executed- by Issued-Uops. Ratio > 1 suggests high rate of uop m…
1955 …"BriefDescription": "Total issue-pipeline slots (per-Physical Core till ICL; per-Logical Processor…
1962 …    "BriefDescription": "Fraction of Physical Core issue-slots utilized by this Logical Processor",
1995 …"BriefDescription": "This metric represents 128-bit vector Integer ADD/SUB/SAD or VNNI (Vector Neu…
2000 …"PublicDescription": "This metric represents 128-bit vector Integer ADD/SUB/SAD or VNNI (Vector Ne…
2005 …"BriefDescription": "This metric represents 256-bit vector Integer ADD/SUB/SAD/MUL or VNNI (Vector…
2010 …"PublicDescription": "This metric represents 256-bit vector Integer ADD/SUB/SAD/MUL or VNNI (Vecto…
2026 …"MetricExpr": "max((cpu_core@EXE_ACTIVITY.BOUND_ON_LOADS@ - cpu_core@MEMORY_ACTIVITY.STALLS_L1D_MI…
2030 … TLB. These cases are characterized by execution unit stalls; while some non-completed demand load…
2036 …ALL_LOADS@ - cpu_core@MEM_LOAD_RETIRED.FB_HIT@ - cpu_core@MEM_LOAD_RETIRED.L1_MISS@) * 20 / 100, m…
2040 …e L1 cache. The short latency of the L1 data cache may be exposed in pointer-chasing memory access…
2046 …"MetricExpr": "(cpu_core@MEMORY_ACTIVITY.STALLS_L1D_MISS@ - cpu_core@MEMORY_ACTIVITY.STALLS_L2_MIS…
2056 …"MetricExpr": "(cpu_core@MEMORY_ACTIVITY.STALLS_L2_MISS@ - cpu_core@MEMORY_ACTIVITY.STALLS_L3_MISS…
2085 …slots where the CPU was retiring light-weight operations -- instructions that require no more than…
2086         "MetricExpr": "max(0, tma_retiring - tma_heavy_operations)",
2091 …-weight operations -- instructions that require no more than one uop (micro-operation). This corre…
2106 … the (first level) DTLB was missed by load accesses, that later on hit in second-level TLB (STLB)",
2107         "MetricExpr": "tma_dtlb_load - tma_load_stlb_miss",
2115 …"BriefDescription": "This metric estimates the fraction of cycles where the Second-level TLB (STLB…
2125 …"MetricExpr": "(16 * max(0, cpu_core@MEM_INST_RETIRED.LOCK_LOADS@ - cpu_core@L2_RQSTS.ALL_RFO@) + …
2135 …"MetricExpr": "(cpu_core@LSD.CYCLES_ACTIVE@ - cpu_core@LSD.CYCLES_OK@) / tma_info_core_core_clks /…
2139 …ly does well sustaining Uop supply. However; in some rare cases; optimal uop-delivery could not be…
2145         "MetricExpr": "max(0, tma_bad_speculation - tma_branch_mispredicts)",
2150 …-of-order portion of the machine needs to recover its state after the clear. For example; this can…
2155 …as likely hurt due to approaching bandwidth limits of external memory - DRAM ([SPR-HBM] and/or HBM…
2160 …- DRAM ([SPR-HBM] and/or HBM).  The underlying heuristic assumes that a similar off-core traffic i…
2165 …e the performance was likely hurt due to latency from external memory - DRAM ([SPR-HBM] and/or HBM…
2166 …ore@OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD@) / tma_info_thread_clks - tma_mem_bandwidth",
2170 …e the performance was likely hurt due to latency from external memory - DRAM ([SPR-HBM] and/or HBM…
2176 …pdown\\-mem\\-bound@ / (cpu_core@topdown\\-fe\\-bound@ + cpu_core@topdown\\-bad\\-spec@ + cpu_core…
2181 …o demand load or store instructions. This accounts mainly for (1) non-completed in-flight memory d…
2196 … represents fraction of slots where the CPU was retiring memory operations -- uops for memory load…
2226 …"MetricExpr": "(cpu_core@IDQ.MITE_CYCLES_ANY@ - cpu_core@IDQ.MITE_CYCLES_OK@) / tma_info_core_core…
2230 …the legacy decode pipeline). This pipeline is used for code that was not pre-cached in the DSB or …
2235 …n terms of percentage of ([SKL+] injected blend uops out of all Uops Issued -- the Count Domain; [A…
2240 …n terms of percentage of ([SKL+] injected blend uops out of all Uops Issued -- the Count Domain; [A…
2250 … Commonly used instructions are optimized for delivery by the DSB (decoded i-cache) or MITE (legac…
2256 …"MetricExpr": "tma_light_operations * (cpu_core@BR_INST_RETIRED.ALL_BRANCHES@ - cpu_core@INST_RETI…
2260 …lots where the CPU was retiring branch instructions that were not fused. Non-conditional branches …
2270 …o op) instructions. Compilers often use NOPs for certain address alignments - e.g. start address o…
2275 …is metric represents the remaining light uops fraction the CPU has executed - remaining means not …
2276 …"MetricExpr": "max(0, tma_light_operations - (tma_fp_arith + tma_int_operations + tma_memory_opera…
2280 …is metric represents the remaining light uops fraction the CPU has executed - remaining means not …
2285 …action of slots the CPU was stalled due to other cases of misprediction (non-retired x86 branches …
2286 …pr": "max(tma_branch_mispredicts * (1 - cpu_core@BR_MISP_RETIRED.ALL_BRANCHES@ / (cpu_core@INT_MIS…
2295 …"MetricExpr": "max(tma_machine_clears * (1 - cpu_core@MACHINE_CLEARS.MEMORY_ORDERING@ / cpu_core@M…
2343 … the CPU performance was potentially limited due to Core computation issues (non divider-related)",
2344 …_clks if cpu_core@ARITH.DIV_ACTIVE@ < cpu_core@CYCLE_ACTIVITY.STALLS_TOTAL@ - cpu_core@EXE_ACTIVIT…
2348 …-related).  Two distinct categories can be attributed into this metric: (1) heavy data-dependency …
2354 …RS.EMPTY\\,umask\\=1@ - cpu_core@RESOURCE_STALLS.SCOREBOARD@, 0)) / tma_info_thread_clks * (cpu_co…
2358 …t (Logical Processor cycles since ICL, Physical Core cycles otherwise). Long-latency instructions …
2368 …-dependency among software instructions; or oversubscribing a particular hardware resource. I…
2379 …cal Core cycles otherwise).  Loop Vectorization - most compilers feature auto-Vectorization options…
2397 …topdown\\-retiring@ / (cpu_core@topdown\\-fe\\-bound@ + cpu_core@topdown\\-bad\\-spec@ + cpu_core@…
2402 …ions-per-cycle (see IPC metric). Note that a high Retiring value does not necessarily mean there is …
2407 …"BriefDescription": "This metric represents fraction of cycles the CPU issue-pipeline was stalled …
2412 …ycles the CPU issue-pipeline was stalled due to serializing operations. Instructions like CPUID; W…
2417 …sents fraction of slots where the CPU was retiring Shuffle operations of 256-bit vector size (FP o…
2422 …sents fraction of slots where the CPU was retiring Shuffle operations of 256-bit vector size (FP o…
2438 … estimates fraction of cycles handling memory load split accesses - loads that cross 64-byte cache …
2443 … estimates fraction of cycles handling memory load split accesses - loads that cross 64-byte cache …
2453 …resents rate of split store accesses.  Consider aligning your data to the 64-byte cache line granu…
2458 …f cycles where the Super Queue (SQ) was full taking into account all request-types and both hardwa…
2463 …f cycles where the Super Queue (SQ) was full taking into account all request-types and both hardwa…
2468 … CPU was stalled due to RFO store memory accesses; RFO stores issue a read-for-ownership request b…
2473 …ses; RFO stores issue a read-for-ownership request before the write. Even though store accesses do …
2483 …perations in the pipeline; a load can avoid waiting for memory if a prior in-flight store is writi…
2489 …_STORE_RETIRED.L2_HIT@ * 10 * (1 - cpu_core@MEM_INST_RETIRED.LOCK_LOADS@ / cpu_core@MEM_INST_RETIR…
2493 …-of-order core performance; however; holding resources for a longer time can lead to undesired imp…
2508 …tion of cycles where the TLB was missed by store accesses, hitting in the second-level TLB (STLB)",
2509         "MetricExpr": "tma_dtlb_store - tma_store_stlb_miss",
2531 …uired by RFO stores. Even though store accesses do not typically stall out-of-order CPUs; there ar…