Merge v4_Researcher → GEAK_v4: Deep Research Agent (DRA) for kernel_workflow #293
Open
Umangatamd wants to merge 9 commits into
Add the `researcher` persona (kernel_workflow/roles/researcher.md) — a v4-native
Deep Research Agent mirroring v3's Stage 0-7 pipeline (fact extraction → ranked
research questions → per-question native web research → optional blindspot
critique → ranked-directions portfolio) with phase contracts for research_plan /
research_question / research_blindspot / research_synthesize.
Wire a new opt-in phase('Research') into kernel_workflow.js AFTER Profile and
BEFORE the optimize loop, gated behind args.dra_enabled (default off → existing
runs byte-identical). The phase fans research questions out in PARALLEL via
parallel(), wraps every research agent in the agentT() hang-guard so a hung
research agent resolves null instead of wedging the round-barrier, and writes
deep_search.md / deep_search_brief.md / deep_search.json into EVAL_DIR. Adds
RESEARCH_PLAN/QUESTION/BLINDSPOT/RESEARCH schemas and threads the brief path into
tech_lead plan_round.
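The fan-out and hang-guard described above can be sketched roughly as follows. This is a minimal illustration, not the actual `parallel()` / `agentT()` code: `withHangGuard`, `researchAll`, `runAgent`, and `timeoutMs` are hypothetical names, and treating a rejected agent the same as a hung one is an assumption of this sketch.

```javascript
// Hypothetical stand-in for the agentT() hang-guard: the returned promise
// always settles, resolving to null if the task hangs past timeoutMs.
function withHangGuard(task, timeoutMs) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(null), timeoutMs);
  });
  const guarded = Promise.resolve()
    .then(task)
    .catch(() => null); // assumption: a failed agent also degrades to null
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([guarded, timeout]).finally(() => clearTimeout(timer));
}

// Fan out one guarded agent per research question, all started together.
// Because every guarded promise settles, the barrier cannot wedge.
async function researchAll(questions, runAgent, timeoutMs) {
  const results = await Promise.all(
    questions.map((q) => withHangGuard(() => runAgent(q), timeoutMs))
  );
  // Drop the nulls so downstream synthesis only sees completed research.
  return results.filter((r) => r !== null);
}
```

With this shape, one stuck question costs at most `timeoutMs` of wall-clock time and the remaining answers still reach synthesis.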
… brief
plan_round now reads EVAL_DIR/deep_search_brief.md (when DEEP_SEARCH_BRIEF is set) and seeds directions[] from the ranked DRA directions, carrying over v3's hard-won lessons:
- DIVERSIFY: spread different ranked directions across parallel engineers, always keep >=1 free explorer slot, never anchor all engineers on one theme
- treat HIGH-CEILING rewrites (raw-HIP/load_inline, HIP/CUDA graph capture, algorithmic reformulation) as FIRST-CLASS, not secondary
- don't over-prescribe (idea/mechanism only)
The brief is a prior, never a cage: profile/per-case data and measurement still rule. No-op when the brief path is empty.
Add WebSearch + WebFetch to interface/run_e2e.py ALLOWED_TOOLS so the Deep Research Agent's per-question research agents can do native web research. Harmless when dra_enabled is off (nothing opts into them).
Document the opt-in Research phase (Stage 0-7 flow, parallel fan-out + hang-guard, brief->plan_round handoff with diversity + de-conservatism), the dra_enabled / dra_max_questions / dra_blindspot / dra_max_blindspots args, the deep_search.* artifacts + research/ trail, the researcher role, and the WebSearch/WebFetch allowlist requirement.
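As a rough illustration of how the four args could gate the phase, here is a small sketch. Only the names of the args and the `dra_enabled` default (off) come from this PR; the other default values and the `resolveDraArgs` helper are assumptions made up for the example.

```javascript
// Only dra_enabled=false is stated in the PR; the other defaults are
// illustrative assumptions, not the workflow's real values.
const DRA_DEFAULTS = Object.freeze({
  dra_enabled: false,
  dra_max_questions: 4,
  dra_blindspot: false,
  dra_max_blindspots: 2,
});

// Hypothetical helper: merge user args over defaults and normalize them.
function resolveDraArgs(args = {}) {
  const merged = { ...DRA_DEFAULTS, ...args };
  if (!merged.dra_enabled) {
    // Phase is skipped entirely, so existing runs are unaffected.
    return { dra_enabled: false };
  }
  const blindspot = Boolean(merged.dra_blindspot);
  return {
    dra_enabled: true,
    dra_max_questions: Math.max(1, Math.floor(merged.dra_max_questions)),
    dra_blindspot: blindspot,
    // The blindspot budget only matters when the critique stage is on.
    dra_max_blindspots: blindspot
      ? Math.max(0, Math.floor(merged.dra_max_blindspots))
      : 0,
  };
}
```

Returning a bare `{ dra_enabled: false }` when the flag is off keeps the disabled path trivially inspectable.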
CONCERN 1 (fusion): a single-kernel DRA could overlook fusion entirely. Add a "Fusion & kernel scope" section + a fusion angle to research_plan question generation + a synthesis rule so fusion is never buried:
- intra-kernel fusion (collapse dispatches / fold epilogue) is surfaced as a first-class executable direction
- cross-kernel fusion (merge with an adjacent op) is recorded as an e2e-level ESCALATION in open_measurements (the single-kernel layer can't extract a neighbor against its immutable single-op oracle) rather than lost
The researcher must not propose keeping an op standalone against an upstream fusion.
CONCERN 2 (advisory-not-dominant): add an explicit "You are ADVISORY, not the decision-maker" section and reframe the Stage 7 portfolio as suggestions to be vetted against the profile, never mandates.
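The fusion routing rule can be pictured with a short sketch. The record shape (`scope`, `idea`, `executable`, `escalation`) and the function name `routeFusionFindings` are hypothetical; only the split between executable directions and `open_measurements` escalations comes from the text above.

```javascript
// Hypothetical routing of fusion findings produced during synthesis.
function routeFusionFindings(findings) {
  const directions = [];
  const openMeasurements = [];
  for (const f of findings) {
    if (f.scope === "intra-kernel") {
      // Executable at the single-kernel layer: surface as a normal direction.
      directions.push({ idea: f.idea, kind: "fusion", executable: true });
    } else if (f.scope === "cross-kernel") {
      // Not executable against a single-op oracle: record it, don't drop it.
      openMeasurements.push({ idea: f.idea, escalation: "e2e", executable: false });
    } else {
      // Fail loudly so a finding is never silently lost.
      throw new Error(`unknown fusion scope: ${f.scope}`);
    }
  }
  return { directions, open_measurements: openMeasurements };
}
```

Throwing on an unrecognized scope is a design choice of this sketch, made so that no fusion finding disappears without a trace.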
…nant
Rewrite plan_round rule 2b so the TechLead remains THE decision-maker and the DRA brief cannot regress into v3-style anchoring:
- brief is ADVISORY/OPTIONAL, not a plan to execute; critically evaluate each Dk against THIS kernel's profile/per-case data and reject/ignore ones that don't fit
- the DRA NEVER fills 100% of the round: always generate >=1 of the TechLead's own profile-driven directions, keep >=1 free explorer slot, brief seeds at most BUDGET_REMAINING-1 directions
- DIVERSIFY (spread different Dk across engineers, never converge on one theme)
- HIGH-CEILING directions first-class WHEN they fit the profile
- FUSION: intra-kernel fusion is a normal direction; a cross-kernel-fusion escalation is NOT executable here, leave it as the researcher's note
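A minimal sketch of the budget rule above, under stated assumptions: `planRound`, `fitsProfile`, `theme`, and `source` are hypothetical names, the real plan_round is a prompt-driven step rather than this function, and the free explorer slot is not modeled separately here (the non-brief slots simply go to the TechLead's own directions).

```javascript
// Hypothetical illustration of rule 2b: the brief seeds at most
// budgetRemaining - 1 directions, and only ones that fit the profile.
function planRound(ownDirections, briefDirections, budgetRemaining) {
  if (budgetRemaining < 1) return [];
  if (ownDirections.length < 1) {
    throw new Error("TechLead must supply at least one own direction");
  }
  const maxFromBrief = Math.max(0, budgetRemaining - 1);
  const seenThemes = new Set();
  const fromBrief = [];
  for (const d of briefDirections) {
    if (fromBrief.length >= maxFromBrief) break;
    // Reject suggestions that do not fit, and never repeat a theme (DIVERSIFY).
    if (!d.fitsProfile || seenThemes.has(d.theme)) continue;
    seenThemes.add(d.theme);
    fromBrief.push({ ...d, source: "brief" });
  }
  // Whatever the brief did not take stays with the TechLead (always >= 1 slot).
  const ownCount = budgetRemaining - fromBrief.length;
  const own = ownDirections
    .slice(0, ownCount)
    .map((d) => ({ ...d, source: "techlead" }));
  return [...own, ...fromBrief];
}
```

An empty or entirely rejected brief yields a plan made only of the TechLead's own directions, which matches the intent that adopting nothing from the brief is valid.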
… first
Strengthen the advisory framing so the DRA brief is unambiguously a set of SUGGESTIONS to consider, not directives. plan_round now mandates an explicit order: the TechLead does its OWN independent profile/code analysis and forms its own candidate directions FIRST, then consults the brief and decides by its own judgment which (if any) suggestions to adopt. It is free to adapt/ignore/reject all (adopting none is valid). researcher.md synthesis tone reworded to "consider/one option is" rather than imperative. Existing diversity + free-explorer + high-ceiling-first + fusion rules preserved. node --check passes.
feat(dra): Deep Research Agent (DRA) for kernel_workflow
Collaborator
Could you also add the runtime overhead introduced by DRA? In addition, DRA does not appear to provide much benefit for the LLM head kernels based on the current results. Could you include more head-kernel benchmark results to better evaluate its effectiveness in that scenario?
Collaborator
Author
Runtime overhead: Right now the DRA research phase adds about 20% to the run's wall-clock time. It is a one-time, opt-in cost paid before the optimize loop, and it is tunable via the question/blindspot budget.
Head kernels: Agreed, it's worth more coverage there. Could you recommend a few head kernels you'd like benchmarked? Happy to run the no-DRA vs DRA A/B on them.
Promotes the Deep Research Agent (DRA) work from `v4_Researcher` into `GEAK_v4` (via #291).
What DRA does
An opt-in `Research` phase in `kernel_workflow` (gated by `dra_enabled`, default off). After profiling, DRA researches the kernel via `WebSearch`/`WebFetch` and writes `deep_search_brief.md` (full evidence in `deep_search.md`).
The brief is handed to the TechLead planner as advisory suggestions: the optimizer does its own profile/code analysis first, then decides which (if any) to adopt.
Results (A/B: no-DRA vs DRA, budget=3)
On KNN, the DRA gain traces to adopted brief directions: warp-cooperative WarpSelect (wave64), `Template<K>` scratch-spill elimination into VGPRs, and wrapper/output-layout fixes.