tlfalcon
Contributor

Replace "TSC" with "msr/tsc" or "msr/tsc,cpu=<cpu type>" to specify global or cpu type specific TSC metric.

Replace "TSC" with "msr/tsc" or "msr/tsc,cpu=<cpu type>"
to specify global or cpu type specific TSC metric.

for m in metrics:
    form = m['MetricExpr']
    if "TSC" in form:
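
As a rough illustration of the substitution described above (not the code from this PR), the sketch below rewrites a standalone `TSC` token in a metric expression. The helper name `replace_tsc` and the regular expression are hypothetical, and the replacement spelling simply copies the event names shown in the `perf stat` output in this thread; the exact string the script emits may differ. A plain substring check such as `"TSC" in form` also matches event names like `CPU_CLK_UNHALTED.REF_TSC`, so the sketch only rewrites `TSC` when it is not part of a longer name.

```python
import re
from typing import Optional

# Hypothetical pattern: match "TSC" only when it is not embedded in a longer
# identifier or event name (for example CPU_CLK_UNHALTED.REF_TSC).
_TSC_TOKEN = re.compile(r"(?<![A-Za-z0-9_.@/])TSC(?![A-Za-z0-9_.@/])")


def replace_tsc(expr: str, cpu_type: Optional[str] = None) -> str:
    """Rewrite standalone TSC tokens in a metric expression.

    With no cpu_type the global event is used; with a cpu_type such as
    "cpu_core" the cpu type specific form is used. The spelling of the
    replacement is an assumption taken from the perf stat output.
    """
    replacement = f"msr/tsc,cpu={cpu_type}/" if cpu_type else "msr/tsc/"
    # A function replacement avoids any backslash interpretation in the string.
    return _TSC_TOKEN.sub(lambda _match: replacement, expr)
```

For example, `replace_tsc("CPU_CLK_UNHALTED.REF_TSC / TSC", "cpu_core")` leaves the `REF_TSC` event untouched and rewrites only the trailing `TSC`.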
Contributor

This looks good to me, but we've normally done similar fix-ups here:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py#L1405
With the patch series:
https://lore.kernel.org/lkml/[email protected]/
Has this resolved the metric validation test issues on hybrid?

Contributor

So I tested the generated alderlake json with this change and:
https://lore.kernel.org/lkml/[email protected]/
and:
https://lore.kernel.org/lkml/[email protected]/
plus this fix:

--- a/tools/perf/arch/x86/util/topdown.c
+++ b/tools/perf/arch/x86/util/topdown.c
@@ -92,7 +92,7 @@ int topdown_insert_slots_event(struct list_head *list, int idx, struct evsel *me
 
        evsel->core.attr.config = TOPDOWN_SLOTS;
        evsel->core.cpus = perf_cpu_map__get(metric_event->core.cpus);
-       evsel->core.own_cpus = perf_cpu_map__get(metric_event->core.own_cpus);
+       evsel->core.pmu_cpus = perf_cpu_map__get(metric_event->core.pmu_cpus);
        evsel->core.is_pmu_core = true;
        evsel->pmu = metric_event->pmu;
        evsel->name = strdup("slots");

and tma_l2_hit_latency is fixed if I add --no-scale:

$ sudo /tmp/perf/perf stat --no-scale -M tma_l2_hit_latency -- /tmp/perf/perf bench futex hash -r 2 -s
# Running 'futex/hash' benchmark:
Run summary [PID 1599693]: 28 threads, each operating on 1024 [private] futexes for 2 secs.

Averaged 2247423 operations/sec (+- 0.79%), total secs = 2
Futex hashing: global hash

 Performance counter stats for '/tmp/perf/perf bench futex hash -r 2 -s':

     7,505,544,206      cpu_core/MEMORY_ACTIVITY.STALLS_L1D_MISS/ #     41.4 %  tma_l2_hit_latency       (42.83%)
   331,067,557,224      cpu_core/slots/                                                         (49.94%)
   177,936,737,645      cpu_core/topdown-retiring/                                              (49.94%)
    15,671,596,981      cpu_core/topdown-mem-bound/                                             (49.94%)
     5,360,676,349      cpu_core/topdown-bad-spec/                                              (49.94%)
   118,705,368,744      cpu_core/topdown-fe-bound/                                              (49.94%)
    28,820,278,709      cpu_core/topdown-be-bound/                                              (49.94%)
     8,073,335,211      cpu_core/MEMORY_ACTIVITY.STALLS_L2_MISS/                                        (49.88%)
    67,113,967,663      msr/tsc,cpu=cpu_core/                                                   (3.57%)
       154,135,358      cpu_core/MEM_LOAD_RETIRED.L1_MISS/                                        (42.79%)
   110,196,035,974      cpu_core/CPU_CLK_UNHALTED.THREAD/                                        (49.92%)
       177,643,036      cpu_core/MEM_LOAD_RETIRED.L2_HIT/                                        (49.94%)
       116,204,655      cpu_core/MEM_LOAD_RETIRED.FB_HIT/                                        (42.82%)
    58,612,990,535      cpu_core/CPU_CLK_UNHALTED.REF_TSC/                                        (49.93%)
     2,031,139,007      duration_time                                                         

       2.030949462 seconds time elapsed

      10.485634000 seconds user
      45.198350000 seconds sys
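
For context on `--no-scale`: by default `perf stat` scales a multiplexed counter by the ratio of the time it was enabled to the time it was actually running, and the percentage in parentheses reports that running share. The sketch below only illustrates that arithmetic using the `msr/tsc,cpu=cpu_core/` line above; the helper name `scale_count` is hypothetical, and whether this scaling is what distorted the metric is an assumption, not something the thread states.

```python
def scale_count(raw: int, time_enabled: float, time_running: float) -> float:
    """Scale a raw count by time_enabled / time_running.

    This mirrors the usual multiplexing correction; it is a sketch,
    not perf's implementation.
    """
    if time_running <= 0:
        raise ValueError("event never ran, cannot scale")
    return raw * time_enabled / time_running


# Values copied from the msr/tsc,cpu=cpu_core/ line in the output above.
raw_tsc = 67_113_967_663
running_fraction = 0.0357  # shown as (3.57%)

# An event that ran for only a small share of the enabled time would be
# scaled up by a large factor.
scaled_tsc = scale_count(raw_tsc, 1.0, running_fraction)
print(f"scale factor: {scaled_tsc / raw_tsc:.1f}x")
```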

It is also possible to constrain the events to just the P-cores, like:

$ sudo /tmp/perf/perf stat --no-scale -M tma_l2_hit_latency -- taskset -c `cat /sys/bus/event_source/devices/cpu_core/cpus` /tmp/perf/perf bench futex hash -r 2 -s
# Running 'futex/hash' benchmark:
Run summary [PID 1600141]: 28 threads, each operating on 1024 [private] futexes for 2 secs.

Averaged 2253275 operations/sec (+- 0.83%), total secs = 2
Futex hashing: global hash

 Performance counter stats for 'taskset -c 0-15 /tmp/perf/perf bench futex hash -r 2 -s':

     7,548,932,450      cpu_core/MEMORY_ACTIVITY.STALLS_L1D_MISS/ #     45.7 %  tma_l2_hit_latency       (42.80%)
   331,704,741,522      cpu_core/slots/                                                         (49.96%)
   176,789,039,556      cpu_core/topdown-retiring/                                              (49.96%)
    16,859,212,234      cpu_core/topdown-mem-bound/                                             (49.96%)
     6,041,092,894      cpu_core/topdown-bad-spec/                                              (49.96%)
   117,700,385,076      cpu_core/topdown-fe-bound/                                              (49.96%)
    30,614,367,237      cpu_core/topdown-be-bound/                                              (49.96%)
     8,090,931,352      cpu_core/MEMORY_ACTIVITY.STALLS_L2_MISS/                                        (49.98%)
    67,257,502,068      msr/tsc,cpu=cpu_core/                                                   (3.57%)
       172,221,109      cpu_core/MEM_LOAD_RETIRED.L1_MISS/                                        (42.90%)
   110,525,548,606      cpu_core/CPU_CLK_UNHALTED.THREAD/                                        (49.99%)
       198,217,890      cpu_core/MEM_LOAD_RETIRED.L2_HIT/                                        (49.97%)
       124,717,379      cpu_core/MEM_LOAD_RETIRED.FB_HIT/                                        (42.75%)
    58,695,684,575      cpu_core/CPU_CLK_UNHALTED.REF_TSC/                                        (49.91%)
     2,030,443,163      duration_time                                                         

       2.030268760 seconds time elapsed

      10.501135000 seconds user
      45.270940000 seconds sys
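
The `taskset -c` invocation above takes the CPU list from `/sys/bus/event_source/devices/cpu_core/cpus`, which expanded to `0-15` on this machine. As a small sketch of how that list format can be expanded in a script, the hypothetical helper below parses comma-separated CPU numbers and ranges; it assumes well-formed input and is not part of perf or of this PR.

```python
from typing import List


def parse_cpu_list(text: str) -> List[int]:
    """Expand a CPU list such as "0-15" or "0-3,8,10-11" into CPU numbers."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or a trailing comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


# Usage sketch (Linux only, path taken from the command above):
# with open("/sys/bus/event_source/devices/cpu_core/cpus") as f:
#     p_cores = parse_cpu_list(f.read())
```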

Contributor

Just a heads up. This commit:
8476bd8
is in:
#324
I think that PR can be merged, as I resolved the test issue (by deleting the no-longer-relevant test).
