Commit 6b55dd8 (1 parent: 5d947df)

add missing scripts; improve scripts documentation

File tree: 7 files changed, +987 −35 lines

scripts/README.md

Lines changed: 128 additions & 13 deletions
## Scripts for Benchmark Analysis

This directory contains scripts for processing and visualizing benchmark results.

### Prerequisites

- Python 3.6+
- Required Python packages: `pandas`, `numpy`, `matplotlib`, `seaborn`

You can install the required packages with:

```bash
pip install pandas numpy matplotlib seaborn
```
### Creating LaTeX Tables

#### Basic Table Generation

Run your benchmark and convert the output to a LaTeX table:

```bash
# Run benchmark
cmake -B build .
cmake --build build
./build/benchmarks/benchmark -f data/canada.txt > myresults.txt

# Convert to LaTeX table
./scripts/latex_table.py myresults.txt
```

This will print a LaTeX table to stdout with numbers rounded to two significant digits, ready to be included in a scientific manuscript.
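The two-significant-digit rounding applied to the table values can be sketched as follows (a minimal illustration of the idea; the actual `latex_table.py` implementation may differ):

```python
import math

def round_sig(x, sig=2):
    """Round x to `sig` significant digits (0 stays 0)."""
    if x == 0:
        return 0.0
    # The magnitude of x determines how many decimal places to keep.
    return round(x, sig - int(math.floor(math.log10(abs(x)))) - 1)

print(round_sig(123.456))   # 120.0
print(round_sig(0.012345))  # 0.012
```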
#### Automated Multiple Table Generation

Instead of manually running benchmarks and generating tables, you can use the `generate_multiple_tables.py` script to automate the entire process:

```bash
# Basic usage with g++ compiler
./scripts/generate_multiple_tables.py g++
```

This script:

- Automatically compiles the benchmark code with the specified compiler
- Runs multiple benchmarks with different configurations
- Generates LaTeX tables for each benchmark result
- Saves all tables to the output directory

Options:

- First argument: Compiler to use (g++, clang++)
- `--build-dir`: Build directory (default: build)
- `--output-dir`: Output directory for tables (default: ./outputs)
- `--clean`: Clean build directory before compilation
- `--march`: Architecture target for the -march flag (default: native)

The script also has several configurable variables at the top of the file:

- Benchmark datasets (canada, mesh, uniform_01)
- Algorithm filters
- Number of runs
- Volume size

This is the recommended approach for generating comprehensive benchmark results.
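The configurable variables at the top of the script might look like the following sketch (names and values here are purely illustrative, not the actual defaults):

```python
# Hypothetical configuration block (illustrative names and values).
DATASETS = ["canada", "mesh", "uniform_01"]   # benchmark input datasets
ALGORITHM_FILTERS = ["fast_float", "strtod"]  # algorithms to include
NUM_RUNS = 5                                  # repetitions per configuration
VOLUME = 10000                                # number of floats per run
```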
### Combining Tables

The `concat_tables.py` script combines separate benchmark tables (mesh, canada, uniform_01) into comprehensive tables:

```bash
# Basic usage, using tables in ./outputs
./scripts/concat_tables.py
```

Options:

- `--input-dir`, `-i`: Directory containing benchmark .tex files (default: ./outputs)
- `--output-dir`, `-o`: Output directory for combined tables (default: same as input)
- `--exclude`, `-e`: Algorithms to exclude from the output tables
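At its core, the combination step is a pandas merge on the algorithm name, with suffixes distinguishing the datasets. A minimal sketch (with made-up numbers, not real benchmark results):

```python
import pandas as pd

# Hypothetical per-dataset results keyed by algorithm name.
mesh = pd.DataFrame({'algorithm': ['fast_float', 'strtod'], 'ns/f': [5.0, 60.0]})
canada = pd.DataFrame({'algorithm': ['fast_float', 'strtod'], 'ns/f': [7.0, 80.0]})

# Overlapping columns get the dataset suffixes; 'algorithm' is the join key.
combined = mesh.merge(canada, on='algorithm', suffixes=('_mesh', '_canada'))
print(combined.columns.tolist())  # ['algorithm', 'ns/f_mesh', 'ns/f_canada']
```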
### Generating Visualization Figures

The `generate_figures.py` script creates heatmaps and relative performance plots:

```bash
# Generate figures for the nanoseconds-per-float metric
./scripts/generate_figures.py nsf ./outputs
```

Options:

- First argument: Metric to visualize (`nsf`, `insf`, or `insc`)
- Second argument: Directory containing benchmark result .tex files
- `--output-dir`, `-o`: Directory to save generated figures (default: same as input directory)
- `--exclude`, `-e`: Algorithms to exclude from visualization
- `--cpus`, `-c`: CPUs to include in relative performance plots
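A heatmap of this kind is essentially a pivot of long-form (CPU, algorithm, metric) rows into a 2-D grid. A sketch of the data preparation, with illustrative values only:

```python
import pandas as pd

# Hypothetical long-form results: one row per (cpu, algorithm) measurement.
results = pd.DataFrame({
    'cpu':       ['A', 'A', 'B', 'B'],
    'algorithm': ['fast_float', 'strtod', 'fast_float', 'strtod'],
    'ns/f':      [5.0, 60.0, 6.0, 70.0],
})

# Rows = algorithms, columns = CPUs; a grid like this could feed a heatmap.
grid = results.pivot(index='algorithm', columns='cpu', values='ns/f')
print(grid.loc['fast_float', 'B'])  # 6.0
```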
### Extracting Summary Metrics

The `get_summary_metrics.py` script analyzes raw benchmark files to extract performance metrics:

```bash
# Analyze all CPUs
./scripts/get_summary_metrics.py
```

Options:

- `--cpu`: CPU folder name to restrict analysis
- `--input-dir`, `-i`: Directory containing benchmark .raw files (default: ./outputs)
- `--outlier-threshold`, `-t`: Threshold for reporting outliers (default: 5.0%)
- `--dedicated-cpus`, `-d`: CPU folder names considered dedicated (non-cloud)
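The outlier threshold works on relative deviation. One plausible reading of the semantics, as a sketch (the script's exact definition may differ):

```python
def find_outliers(runs, threshold_pct=5.0):
    """Return runs deviating from the best (minimum) run by more than threshold_pct percent."""
    best = min(runs)
    return [r for r in runs if (r - best) / best * 100.0 > threshold_pct]

print(find_outliers([10.0, 10.2, 11.0]))  # [11.0]
```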
### Running Tests on Amazon AWS

It is possible to generate tests on Amazon AWS:

```bash
./scripts/aws_tests.bash
```

This script will create new EC2 instances, run the `./scripts/generate_multiple_tables.py` script on both g++ and clang++ builds, save each output to a separate folder, and then terminate the instances.

Prerequisites and some user-configurable variables are in the script itself.
### Workflow Example

A typical complete workflow might look like:

1. **Generate benchmark results and tables automatically**:
   ```bash
   # For g++ compiler (compiles and runs benchmarks)
   ./scripts/generate_multiple_tables.py g++ --clean

   # For clang++ compiler (compiles and runs benchmarks)
   ./scripts/generate_multiple_tables.py clang++ --clean
   ```
2. **Combine tables for better comparison**:
   ```bash
   ./scripts/concat_tables.py
   ```
3. **Generate visualization figures**:
   ```bash
   ./scripts/generate_figures.py nsf ./outputs
   ```
4. **Extract summary metrics**:
   ```bash
   ./scripts/get_summary_metrics.py
   ```

This automated workflow handles the entire process from compilation to visualization with minimal manual intervention.

scripts/concat_tables.py

Lines changed: 177 additions & 0 deletions
```python
#!/usr/bin/env python3
"""
Concatenate multiple benchmark result tables into a single comprehensive table.

This script finds and combines related benchmark results from different datasets
(mesh, canada, uniform_01) into a single LaTeX table for easier comparison.
"""
import os
import re
import argparse
import pandas as pd


def parse_tex_table(filepath):
    """Parse a LaTeX table file into a pandas DataFrame."""
    with open(filepath, 'r') as file:
        lines = file.readlines()
    data_start = False
    parsed = []
    for line in lines:
        if "\\midrule" in line:
            data_start = True
            continue
        if "\\bottomrule" in line:
            break
        if data_start and '&' in line:
            row = [x.strip().strip('\\') for x in line.split('&')]
            if len(row) == 4:
                parsed.append({
                    'algorithm': row[0],
                    'ns/f': row[1],
                    'ins/f': row[2],
                    'ins/c': row[3]
                })
    return pd.DataFrame(parsed)


def clean_cpu_name(cpu_name):
    """Clean CPU name for better display in tables."""
    cpu_cleaned = cpu_name.replace("Ryzen9900x", "Ryzen 9900X")
    cpu_cleaned = cpu_cleaned.replace("_Platinum", "")
    cpu_cleaned = re.sub(r"_\d+-Core_Processor", "", cpu_cleaned)
    cpu_cleaned = re.sub(r"_CPU__\d+\.\d+GHz", "", cpu_cleaned)
    cpu_cleaned = re.sub(r"\(R\)", "", cpu_cleaned)
    # Collapse double spaces left over from the underscore replacement.
    return cpu_cleaned.replace("_", " ").replace("  ", " ").strip()


def format_latex_table(df, cpu_name, compiler, float_bits, microarch=None,
                       exclude_algos=None):
    """Format the combined data as a LaTeX table."""
    if exclude_algos is None:
        exclude_algos = set()

    cpu_cleaned = clean_cpu_name(cpu_name)
    caption = f"{cpu_cleaned} results ({compiler}, {float_bits}-bit floats"
    if microarch:
        caption += f", {microarch}"
    caption += ")"
    label = f"tab:{re.sub(r'[^a-zA-Z0-9]+', '', cpu_name.lower())}results"
    header = (
        "\\begin{table}\n"
        "  \\centering\n"
        f"  \\caption{{{caption}}}%\n"
        f"  \\label{{{label}}}\n"
        "  \\begin{tabular}{lccccccccc}\n"
        "    \\toprule\n"
        "    \\multirow{1}{*}{Name} & \\multicolumn{3}{c|}{mesh} & "
        "\\multicolumn{3}{c|}{canada} & \\multicolumn{3}{c}{unit} \\\\\n"
        "    & {ns/f} & {ins/f} & {ins/c} & "
        "{ns/f} & {ins/f} & {ins/c} & {ns/f} & {ins/f} & {ins/c} \\\\ "
        "\\midrule\n"
    )
    body = ""
    for _, row in df.iterrows():
        if row['algorithm'] in exclude_algos:
            continue
        line = (
            f"    {row['algorithm']} & {row['ns/f_mesh']} & "
            f"{row['ins/f_mesh']} & {row['ins/c_mesh']} & "
            f"{row['ns/f_canada']} & {row['ins/f_canada']} & "
            f"{row['ins/c_canada']} & "
            f"{row['ns/f_unit']} & {row['ins/f_unit']} & "
            f"{row['ins/c_unit']} \\\\\n"
        )
        body += line
    footer = (
        "    \\bottomrule\n"
        "  \\end{tabular}\\restartrowcolors\n"
        "\\end{table}\n"
    )
    return header + body + footer


def find_combinations(root, pattern=None):
    """Find all combinations of benchmark result files that can be combined."""
    if pattern is None:
        pattern = re.compile(
            r"(.*?)_(g\+\+|clang\+\+)_(mesh|canada|uniform_01)_(none|s)"
            r"(?:_(x86-64|x86-64-v2|x86-64-v3|x86-64-v4|native))?\.tex"
        )
    # group(1)=cpu, 2=compiler, 3=dataset, 4=variant, 5=microarch (optional)

    combos = []
    for dirpath, _, filenames in os.walk(root):
        tex_files = [f for f in filenames if f.endswith('.tex')]
        table = {}
        for f in tex_files:
            m = pattern.match(f)
            if m:
                cpu, compiler, dataset, variant, microarch = m.groups()
                key = (dirpath, cpu, compiler, variant, microarch)
                if key not in table:
                    table[key] = {}
                table[key][dataset] = os.path.join(dirpath, f)
        for (dirpath, cpu, compiler, variant, microarch), files in table.items():
            if {"mesh", "canada", "uniform_01"}.issubset(files.keys()):
                combos.append((dirpath, cpu, compiler, variant, microarch, files))
    return combos


def main():
    parser = argparse.ArgumentParser(
        description="Concatenate benchmark tables into comprehensive tables")
    parser.add_argument(
        "--input-dir", "-i", default="./outputs",
        help="Directory containing benchmark .tex files")
    parser.add_argument(
        "--output-dir", "-o",
        help="Output directory for combined tables (defaults to input directory)")
    parser.add_argument(
        "--exclude", "-e", nargs="+",
        default=["netlib", "teju\\_jagua", "yy\\_double", "snprintf", "abseil"],
        help="Algorithms to exclude from the output tables")
    args = parser.parse_args()

    input_dir = args.input_dir
    output_dir = args.output_dir if args.output_dir else input_dir
    exclude_algos = set(args.exclude)

    # Create output directory if it doesn't exist
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    combos = find_combinations(input_dir)
    if not combos:
        print(f"No matching benchmark files found in {input_dir}")
        return

    print(f"Found {len(combos)} combinations to process")

    for dirpath, cpu, compiler, variant, microarch, paths in combos:
        df_mesh = parse_tex_table(paths['mesh'])
        df_canada = parse_tex_table(paths['canada'])
        df_unit = parse_tex_table(paths['uniform_01'])
        df_merged = df_mesh.merge(
            df_canada, on='algorithm', suffixes=('_mesh', '_canada'))
        df_merged = df_merged.merge(df_unit, on='algorithm')
        df_merged.rename(columns={
            'ns/f': 'ns/f_unit',
            'ins/f': 'ins/f_unit',
            'ins/c': 'ins/c_unit'
        }, inplace=True)

        float_bits = "32" if variant == "s" else "64"
        tex_code = format_latex_table(
            df_merged, cpu, compiler, float_bits, microarch, exclude_algos)

        suffix = f"_{microarch}" if microarch else ""
        out_path = os.path.join(
            output_dir, f"{cpu}_{compiler}_all_{variant}{suffix}.tex")
        with open(out_path, "w") as f:
            f.write(tex_code)
        print(f"[OK] {out_path}")


if __name__ == "__main__":
    main()
```
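The filename pattern used by `find_combinations` can be exercised directly. For example, with hypothetical filenames (the pattern itself is copied from the script above):

```python
import re

pattern = re.compile(
    r"(.*?)_(g\+\+|clang\+\+)_(mesh|canada|uniform_01)_(none|s)"
    r"(?:_(x86-64|x86-64-v2|x86-64-v3|x86-64-v4|native))?\.tex"
)

# With a microarchitecture suffix, all five groups are captured.
m = pattern.match("Ryzen9900x_g++_canada_none_native.tex")
print(m.groups())  # ('Ryzen9900x', 'g++', 'canada', 'none', 'native')

# Without the optional suffix, the microarch group is None.
m = pattern.match("Xeon_clang++_mesh_s.tex")
print(m.groups())  # ('Xeon', 'clang++', 'mesh', 's', None)
```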
