AgentHub by karpathy · Pull Request #92 · karpathy/autoresearch

karpathy · 2026-03-09T19:30:40Z

Call for help/discussion on autoresearch integration into AgentHub. I have an early version deployed on autoresearchhub.com. The new program.md I am using for my first agent is below.

After some iteration I might push to master of autoresearch. I just want to iterate on it a bit more first and think it through.

dumko2001 · 2026-03-11T09:29:44Z

@karpathy One question that came to mind while reading the experiment loop. The system is elegantly optimized for incremental hill-climbing: agents propose a change, it improves val_bpb, gets pushed, becomes part of the lineage. Clean and effective.
But some of the biggest jumps in ML historically required temporarily worse performance before unlocking a better regime: architectural overhauls, different scaling tradeoffs, training dynamics that look broken before they stabilize. The path to a higher peak sometimes means stepping into the valley first. An agent that hard-discards anything below the current best has no way to take that step; it can only ever optimize the hill it's already standing on.
Humans handle this partly through intuition: we can look across the landscape and sense that a taller mountain exists somewhere, even before the numbers confirm it. Agents here don't have that mechanism yet.
Have you thought about how to encourage that kind of exploration? A few directions that come to mind:

Letting agents maintain side branches that are temporarily worse but flagged as speculative
A separate exploration budget for more radical architectural changes
Agents occasionally sampling outside the current frontier rather than always building on the current best lineage

Otherwise I wonder if the system converges strongly to local optima over time — not because agents are bad, but because the incentive structure only rewards the next incremental step.
Curious how you're thinking about the exploration/exploitation tradeoff. Happy to dig into this more if it's something you're actively considering.
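
A minimal sketch of the third direction (occasionally sampling outside the current frontier), assuming a flat list of candidate branches scored by val_bpb where lower is better. The `Candidate` class, the `pick_parent` function, and the `explore_prob` parameter are hypothetical names for illustration, not part of autoresearch or AgentHub:

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    val_bpb: float             # validation bits per byte; lower is better
    speculative: bool = False  # hypothetical flag for "temporarily worse" branches


def pick_parent(candidates, explore_prob=0.1, rng=random):
    """Epsilon-greedy parent selection.

    With probability explore_prob, build on a randomly chosen non-best
    candidate; otherwise build on the current best.
    """
    best = min(candidates, key=lambda c: c.val_bpb)
    others = [c for c in candidates if c is not best]
    if others and rng.random() < explore_prob:
        return rng.choice(others)
    return best


pool = [
    Candidate("baseline", 1.02),
    Candidate("new-arch", 1.10, speculative=True),
    Candidate("tuned-lr", 1.00),
]
parent = pick_parent(pool, explore_prob=0.1)
```

Here `explore_prob` plays the role of the separate exploration budget: at 0 the loop is pure hill-climbing, and raising it spends more runs on branches that are currently behind.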

bigsnarfdude · 2026-03-11T15:09:50Z

repo was deleted and forks point to https://github.com/ygivenx/agenthub

autonull · 2026-03-11T23:44:07Z

multiobjective model optimization: max accuracy, min parameters, min iteration time
https://github.com/autonull/bioplausible/blob/83bfde7bf4469a97d4fb890569d9233db3d0577d/bioplausible/hyperopt/optuna_bridge.py#L212
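
For readers unfamiliar with the idea, here is a small self-contained sketch of multi-objective selection via a Pareto front. It is independent of the linked Optuna bridge and does not reproduce it; the objective tuple layout (error, parameter count, iteration time, all minimized) and the function names are assumptions for illustration:

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one. All objectives are minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical trials: (1 - accuracy, parameters, seconds per iteration)
trials = [
    (0.10, 5_000_000, 1.2),
    (0.12, 3_000_000, 1.0),
    (0.15, 6_000_000, 1.5),  # worse than the first trial on every objective
]
front = pareto_front(trials)
```

Instead of a single "current best", the hub would keep every trial on the front, which also leaves room for branches that trade a little accuracy for a much smaller or faster model.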

dhanaway · 2026-03-12T12:06:39Z

Autoresearch and AgentHub made me think a lot about evolution and a gene pool; it feels like there are shared characteristics between the two. In a gene pool there is no single 'main' branch: lots of tracks are going at once in different directions, each trying to find some new, better path. Similar to the vision of AgentHub, there is sharing and swapping of genes between these tracks (analogous to the sharing and swapping of commits on these branches). There is also no notion of a 'merge back into main': each track goes on independently; some will fail and end, others will continue and become the de facto 'best track' for a time.

I wonder if there is some design that builds on this proven strategy? Or maybe this is just an indication that the current design is a good one.

morozow · 2026-03-13T06:59:14Z

@karpathy I propose considering stdio Bus (NDJSON over stdio, MCP/ACP-compatible) as the primary inter-agent routing base, instead of coordinating agent-to-agent traffic via AgentHub HTTP API calls.

Right now AgentHub coordinates work by explicit API endpoints – agents poll/post/claim through the hub. That hard-codes "where to send/how to coordinate" into the hub's API surface.

With stdio Bus, we can move coordination one level down: routing is done by the agents themselves over a deterministic stdio transport that is protocol-semantics-agnostic – MCP/ACP messages, or any custom JSON-RPC/NDJSON frames. In this model, the hub stops being a message router and becomes mainly:

agents registry/discovery – who exists, capabilities
durable storage for results, artifacts, experiment graph if desired
optional policy/accounting

But the actual message passing and task handoff becomes: "agent/user/etc. > stdio Bus > agent".

Net effect:

agents can dynamically choose peers, build their own topology – centralized, mesh, hybrid – and evolve routing strategy during runtime;
transport stays unified NDJSON/MCP/ACP across local processes and remote workers;
no need to encode routing decisions into REST endpoints.
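
To make the transport idea concrete, here is a rough sketch of NDJSON framing (one JSON object per line) with a toy router that delivers frames by recipient. This is not the stdio Bus API: the `to`/`from` envelope fields and the helper names are hypothetical, the `method`/`params` fields merely follow JSON-RPC 2.0 conventions, and an in-memory stream stands in for a real stdio pipe:

```python
import io
import json


def write_frame(stream, message):
    """Serialize one message as a single NDJSON line."""
    stream.write(json.dumps(message, separators=(",", ":")) + "\n")


def read_frames(stream):
    """Yield one parsed message per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def route(frames, inboxes=None):
    """Toy router: append each frame to the inbox named in its 'to' field."""
    inboxes = {} if inboxes is None else inboxes
    for frame in frames:
        inboxes.setdefault(frame["to"], []).append(frame)
    return inboxes


pipe = io.StringIO()  # stands in for a process's stdin/stdout
write_frame(pipe, {"jsonrpc": "2.0", "id": 1, "from": "agent-a", "to": "agent-b",
                   "method": "claim_experiment", "params": {"gpu": 0}})
write_frame(pipe, {"jsonrpc": "2.0", "id": 2, "from": "agent-b", "to": "agent-a",
                   "method": "report_result", "params": {"val_bpb": 1.01}})
pipe.seek(0)
inboxes = route(read_frames(pipe))
```

The point of the sketch is that the routing decision lives in the message envelope rather than in a REST endpoint, so agents can change who they talk to without any change to the hub.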

I have already started implementing this direction in #223, where the goal was to replace the HTTP-based hub coordination, for the common case of a local multi-GPU/local cluster setup, with stdio_bus as the transport/router, including multi-GPU scheduling, parallel execution, and sync across agents.

If this direction fits the goals of AgentHub, I can sketch how the hub would expose only minimal "bootstrap + registry", while all inter-agent coordination runs through stdio Bus as the router/data-plane.

References:

Website & Documentation: https://stdiobus.com/
stdio Bus Kernel Repository: https://github.com/stdiobus/stdiobus
Worker Registry: https://github.com/stdiobus/workers-registry

hmmm

7004de0

karpathy wants to merge 1 commit into master from agenthub
