test(mem/dram): merge differential validation into one scenario set by syifan · Pull Request #391 · sarchlab/akita

syifan · 2026-06-15T23:42:41Z

Summary

Collapses the two trace-driven differential tiers (command counts vs. read latency) into one differential suite over a single scenario set. Each scenario is now run once per oracle and once through Akita, then compared on both metrics. This replaces the separate count-only and latency-only scenario lists.

The in-process unit tests (formerly "Tiers 1–4") are unchanged.

What changed

run_oracles.py — runs both oracles for every scenario:

DRAMSim3 → command counts + read latency.
Ramulator2 → command counts via tail-subtraction (now honoring each scenario's page policy, not hard-coded close-page).
reference.csv carries both oracles for all 8 scenarios.
Scenario schema simplified to a per-scenario read_latency flag (enforced / known_gap / off) + gap_reason.
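To make the simplified schema concrete, here is a minimal sketch of a scenario record carrying the read_latency flag and gap_reason. The `Scenario` class, its field defaults, and the rule that a known_gap scenario must state a gap_reason are assumptions for illustration; the actual layout in run_oracles.py may differ.

```python
from dataclasses import dataclass

# The three flag values named in the PR description.
READ_LATENCY_MODES = ("enforced", "known_gap", "off")


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario record; not the project's actual type."""
    name: str
    read_latency: str = "off"
    gap_reason: str = ""

    def __post_init__(self):
        if self.read_latency not in READ_LATENCY_MODES:
            raise ValueError(
                f"{self.name}: read_latency must be one of {READ_LATENCY_MODES}"
            )
        # Assumption: a known gap is only useful if it says why it exists.
        if self.read_latency == "known_gap" and not self.gap_reason:
            raise ValueError(f"{self.name}: known_gap requires a gap_reason")
```

A single flag with three states keeps the "is latency asserted here?" decision next to the scenario itself instead of in a separate latency-only list.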

validation_tier5_test.go — one Differential validation suite:

Command counts are asserted exactly wherever the two oracles agree on a count. Reads and writes always agree (one column command per access); activates agree wherever the count is map-independent, which holds on every current scenario. Where the oracles disagree, the difference is a documented reference divergence and is not asserted. This replaces the earlier per-scenario activates_exact guess with a data-driven oracle-consensus rule.
Read latency compared vs DRAMSim3 (15% tolerance); known_gap scenarios characterized (currently 54–63%, the address-mapping gap).
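The two comparison rules above can be sketched as follows. This is an illustration in Python rather than the Go test itself; the function names, the dictionary layout, and the choice to measure the 15% tolerance relative to the DRAMSim3 value are assumptions.

```python
def counts_to_assert(dramsim3, ramulator2,
                     metrics=("reads", "writes", "activates")):
    """Return only the counts on which both oracles agree.

    Metrics where the oracles disagree are left out, so they are
    reported as a reference divergence rather than asserted.
    """
    return {m: dramsim3[m] for m in metrics if dramsim3[m] == ramulator2[m]}


def latency_within_tolerance(akita_latency, reference_latency, tol=0.15):
    """Check a relative error against the reference (assumed: DRAMSim3)."""
    return abs(akita_latency - reference_latency) <= tol * reference_latency
```

With illustrative (made-up) numbers, oracles that report the same reads and writes but different activates would yield assertions on reads and writes only.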

README — coverage table reflects the merged suite; adds the read-coalescing feature gap to Findings (DRAMSim3 coalesces pending same-address reads; Akita doesn't — so saturated latency is only fair on distinct-address traces until modeled).
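Because saturated latency is only a fair comparison on distinct-address traces, a trace could be screened for repeated read addresses before its latency is enforced. The sketch below assumes a hypothetical trace format of `(op, address)` tuples with `"R"` marking reads; it is deliberately conservative, flagging any repeated read address in the trace, not only reads that would be pending at the same time.

```python
from collections import Counter


def repeated_read_addresses(trace):
    """Return the sorted read addresses that occur more than once.

    `trace` is an iterable of (op, address) tuples, a hypothetical
    format chosen for this example.
    """
    counts = Counter(addr for op, addr in trace if op == "R")
    return sorted(addr for addr, n in counts.items() if n > 1)
```

An empty result means the trace cannot trigger same-address read coalescing in the reference simulator.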

Testing

Full mem/dram suite, go vet, golangci-lint (0 issues) clean.
Both oracles agree on all counts for all 8 scenarios; Akita matches; enforced latency within tolerance; the two known gaps stay characterized.

Follow-ups (separate)

Ramulator2 latency (needs a documented drain patch) → symmetric latency comparison.
Read-coalescing feature + its sequential/random × small/large-range scenarios.
The PER_BANK queue work (feat(mem/dram): PER_BANK command-queue structure (P1.2) #389) is independent and still open.

Generated by Claude Code

Collapse the two trace-driven differential tiers (counts vs latency) into a single differential suite over one scenario set. Each scenario is now run once per oracle and once through Akita and compared on BOTH command counts and read latency, instead of having separate count-only and latency-only scenario lists.

- run_oracles.py: run BOTH oracles for every scenario (DRAMSim3 gives counts + latency; Ramulator2 gives counts via tail-subtraction, now honoring each scenario's page policy). reference.csv carries both oracles for all 8 scenarios. Scenario schema simplified to a per-scenario read_latency flag (enforced/known_gap/off) + gap_reason.
- validation_tier5_test.go: one "Differential validation" suite. Command counts are asserted exact wherever the TWO oracles AGREE on a count (reads/writes always; activates where map-independent; they agree on all current scenarios); read latency is compared vs DRAMSim3 (15% tol, known-gaps characterized). This removes the activates_exact guesswork in favor of a data-driven oracle-consensus rule.
- README: coverage table reflects the merged suite; adds the read-coalescing feature gap to Findings.

Tiers 1-4 (the in-process unit tests) are unchanged. Full mem/dram suite, vet, golangci-lint clean; both oracles agree on all counts and Akita matches.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 27e1e04f36


chatgpt-codex-connector · 2026-06-15T23:48:38Z

+            for k in ("reads", "writes"):
+                if got[k] != exp[k]:
+                    print(f"WARN {sim}/{scn['name']}: {k} {got[k]} != expected "
+                          f"{exp[k]}", file=sys.stderr)


Fail on read/write count mismatches

When regenerating references, if either oracle reports a read/write count different from the trace count (for example because Ramulator2 tail-subtraction fails on an open-page scenario), this only prints a warning and still writes reference.csv. The Go test added here asserts counts only when the two oracle values agree, so a mismatched read/write row silently disables that metric even though reads/writes are supposed to be trace-determined; return non-zero before writing the CSV instead.
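One way to act on this suggestion is to collect mismatches first and exit non-zero before reference.csv is written. The sketch below is an assumption about how that could look: `check_trace_counts`, `finish`, and the `write_csv` callback are hypothetical names, not functions from run_oracles.py.

```python
import sys


def check_trace_counts(sim, scenario_name, got, exp):
    """Return error messages for read/write counts that differ from the trace."""
    errors = []
    for k in ("reads", "writes"):
        if got[k] != exp[k]:
            errors.append(
                f"{sim}/{scenario_name}: {k} {got[k]} != expected {exp[k]}"
            )
    return errors


def finish(errors, write_csv):
    """Write the CSV only when no mismatch was found; return an exit code."""
    if errors:
        for msg in errors:
            print(f"ERROR {msg}", file=sys.stderr)
        return 1  # non-zero: reference.csv is left untouched
    write_csv()
    return 0
```

The script's entry point would then call `sys.exit(finish(errors, write_csv))`, so a mismatched row can never reach the CSV and silently disable the consensus check.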

syifan wants to merge 1 commit into main from claude/dram-unified-differential
