SHORT Double Precision MFLOP/s

EVENTSET
FIXC1 ACTUAL_CPU_CLOCK
FIXC2 MAX_CPU_CLOCK
PMC0  RETIRED_INSTRUCTIONS
PMC1  RETIRED_SSE_AVX_FLOPS_DOUBLE_FMA
PMC2  RETIRED_MMX_FP_INSTR_ALL
PMC3  RETIRED_SSE_AVX_FLOPS_DOUBLE_ADD_MULT_DIV

METRICS
Runtime (RDTSC) [s] time
Runtime unhalted [s]   FIXC1*inverseClock
Clock [MHz]  1.E-06*(FIXC1/FIXC2)/inverseClock
CPI   FIXC1/PMC0
DP [MFLOP/s] (scalar assumed)   1.0E-06*(PMC3+(PMC2/2)+(PMC1*2))/time
DP [MFLOP/s] (SSE assumed)   1.0E-06*(PMC3+PMC2+(PMC1*2))/time
DP [MFLOP/s] (AVX assumed)   1.0E-06*(PMC3+(PMC2*2)+(PMC1*2))/time

LONG
Formulas:
CPI = ACTUAL_CPU_CLOCK/RETIRED_INSTRUCTIONS
DP [MFLOP/s] (scalar assumed) = 1.0E-06*(RETIRED_SSE_AVX_FLOPS_DOUBLE_ADD_MULT_DIV + (RETIRED_MMX_FP_INSTR_ALL/2)+(RETIRED_SSE_AVX_FLOPS_DOUBLE_FMA*2))/time
DP [MFLOP/s] (SSE assumed) = 1.0E-06*(RETIRED_SSE_AVX_FLOPS_DOUBLE_ADD_MULT_DIV + RETIRED_MMX_FP_INSTR_ALL+(RETIRED_SSE_AVX_FLOPS_DOUBLE_FMA*2))/time
DP [MFLOP/s] (AVX assumed) = 1.0E-06*(RETIRED_SSE_AVX_FLOPS_DOUBLE_ADD_MULT_DIV + (RETIRED_MMX_FP_INSTR_ALL*2)+(RETIRED_SSE_AVX_FLOPS_DOUBLE_FMA*2))/time
-
Profiling group to measure the double-precision FLOP rate. The Zen architecture
does not provide distinct events for SSE and AVX FLOPs. Moreover, all FP
instructions are counted in RETIRED_MMX_FP_INSTR_ALL.
Therefore, you have to select the DP MFLOP/s metric (scalar, SSE or AVX) that
matches the measured code.
Both the SSE and AVX metrics overcount, but the SSE metric is the more accurate
of the two.


