Proposal: AdaptiveModel #3335
Conversation
```python
    self,
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
```
@DouweM is there a reason that run_context is available in request_stream but not request?
Hmm, I added it to request_stream pre-v1 because we needed it for Temporal integration, and I really should've added it to request as well at the time; we can't do so anymore now, as it'd be a breaking change for users that have custom Model implementations 😬
Of course if we only support deps for request_stream, this AdaptiveModel will feel half-baked. Could we make it not depend on RunContext, and perhaps require the deps (or another context object) to be passed into the model explicitly?
> we can't do so anymore now, as it'd be a breaking change for users that have custom Model implementations

Does Python not support optional params or signature overloads in a way that's backwards compatible? That's painful 😬

> and perhaps require the deps (or another context object) to be passed into the model explicitly?

I'm trying to wrap my brain around the different use cases where context matters and what we give up by not having the RunContext. I guess if the Model and Agent need to share dependencies, then those objects can just be set up on each request and passed into both.
Honestly, there are probably more use cases for global AdaptiveModel state than for run-isolated state.
Looking at the RunContext, the only other property we're potentially giving up is the usage, which could maybe be useful for upgrading/downgrading the model based on input/output tokens. But maybe there's a simple workaround.
DouweM left a comment
@ChuckJonas Thanks Charlie, I like it!
```python
class AdaptiveModel(Model, Generic[AgentDepsT]):
    """A model that uses custom logic to select which model to try next.

    Unlike FallbackModel which tries models sequentially, AdaptiveModel gives
```
Can you try refactoring FallbackModel as a subclass of AdaptiveModel, to prove this is flexible enough (and to simplify and deduplicate FallbackModel)?
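A rough sketch of that direction, assuming the constructor and selector API proposed in this PR (`models=`, `selector=`, a context exposing `attempts`); the import path for `AdaptiveModel` is hypothetical:

```python
from collections.abc import Sequence

from pydantic_ai.models import Model
from pydantic_ai.models.adaptive import AdaptiveModel  # hypothetical path; class proposed in this PR


class FallbackModel(AdaptiveModel):
    """Sequential fallback expressed as an AdaptiveModel selector."""

    def __init__(self, default_model: Model, *fallback_models: Model):
        models: Sequence[Model] = [default_model, *fallback_models]

        def next_in_order(ctx) -> Model | None:
            # Each failed attempt advances one model down the list; returning None
            # tells AdaptiveModel to stop and raise the collected failures.
            index = len(ctx.attempts)
            return models[index] if index < len(models) else None

        super().__init__(models=models, selector=next_in_order)
```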
```python
attempt_number += 1

# Check max attempts
if self._max_attempts is not None and attempt_number > self._max_attempts:
```
We should reduce the duplication between request and request_stream
Note that the issues linked in #3303 will also apply to this model, so I will fix it in FallbackModel before we merge this, and then we should make sure this also works properly with output modes.
```python
@dataclass
class AttemptResult:
```
I think just Attempt will be clear enough
```python
    exception: Exception | None
    """The exception raised by the model, if any."""

    timestamp: float
```
Let's store a full datetime
```python
    timestamp: float
    """Unix timestamp when the attempt was made."""

    duration: float
```
Maybe store a timedelta? Don't feel strongly about this one.
| """History of attempts in this request.""" | ||
|
|
||
| attempt_number: int | ||
| """Current attempt number (1-indexed).""" |
Wouldn't this always be len(attempts) + 1? In that case it could be a property, or be omitted entirely
```python
# Call selector to get next model
model = await self._call_selector(context)

if model is None:
```
Why do we allow the selector to return None if that will always raise an error? Couldn't the selector itself be required to return a Model, or itself raise an error if it can't?
```python
# Try the selected model
start_time = time.time()
customized_params = model.customize_request_parameters(model_request_parameters)
```
Note that we need to use prepare_request as we do in FallbackModel
@DouweM sorry I wasn't super clear, but this was really just a quick POC for validating that this abstraction supported different use cases. I hardly reviewed the internals at all 😅. Wanted to make sure you'd be open to the approach/contribution before putting too much effort into it. One thought I had was to allow the …
I'll take a proper pass on it as soon as I have some free cycles. If there are any other issues you want to make sure this approach covers, send them over and I'll make sure they are covered!
@ChuckJonas I realized that, I just couldn't help myself 😄
It seems like people sometimes forget that they can just subclass our public classes, so generic state makes sense to me.
gnaryak left a comment
Great to see this PR. You made my day.
```python
    - Inspect previous attempts via ctx.attempts
    """

    models: Sequence[Model]
```
Why does the AdaptiveModel class have a models member? It seems to imply that the _selector function is choosing one from this sequence. What advantage does that give us? Could we make it optional? Why not just let the _selector function provide a model however it wants?
Ya, I've been debating this internally. It was mainly here because the design evolved organically from FallbackModel.
The biggest issue is having to know the models at initialization. Maybe there are instances where you actually want to init a model on the fly? (Although I'm struggling to actually come up with one in practice.)

> What advantage does that give us?

The main things we lose by removing it:

- It's potentially less "discoverable" (being able to introspect/reflect the underlying models). Not sure where this would actually be needed in practice.
- Some of the model properties would be less specific (i.e. model_name). This probably only really impacts observability.

> Could we make it optional?

If it's not critical, I'd lean more toward just removing it for overall consistency.
Basically this becomes a ModelProvider / Factory with some internal state tracking (arguably still too opinionated) and lifecycle hooks (I've since added on_failure/success hooks).
I do think there is a fine line between being "general purpose" and having "no purpose" 😅
Yea, when I first started exploring our use case, I was initially hoping that agent might accept either a model factory or a model instance. So to me the idea of this being a model factory sounds just fine. A model factory in model's clothing.
WIP: submitted code in POC stage
Based on discussions from #3023
`AdaptiveModel` is a new model type that provides full control over model selection at runtime. Unlike `FallbackModel`, which tries models sequentially, `AdaptiveModel` allows custom logic to select the next model based on rich context including attempts, exceptions, and agent dependencies.

Core API
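A minimal sketch of the proposed shape, based on the behaviour described below; the `AdaptiveModel` import path and constructor kwargs (`models`, `selector`, `max_attempts`) are assumptions and may not match the final API:

```python
from pydantic_ai import Agent
from pydantic_ai.models import Model
from pydantic_ai.models.adaptive import AdaptiveModel  # hypothetical path; class proposed in this PR
from pydantic_ai.models.openai import OpenAIModel

primary = OpenAIModel('gpt-4o')
backup = OpenAIModel('gpt-4o-mini')


def select_model(ctx) -> Model | None:
    """Called before every attempt; return the model to try, or None to stop."""
    if not ctx.attempts:
        return primary  # first attempt
    if len(ctx.attempts) == 1:
        return backup  # one failure so far: fall back once
    return None  # give up; AdaptiveModel raises the collected failures


model = AdaptiveModel(models=[primary, backup], selector=select_model, max_attempts=3)
agent = Agent(model)
result = agent.run_sync('Hello!')
```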
Execution Flow
The `AdaptiveModel` automatically handles fallback and retry:

1. Call `selector(context)` to get the next model to try
2. Try that model; on failure, record the attempt in `context.attempts`
3. Call `selector(context)` again with updated context
4. If it returns a `Model`, go to step 2 (retry/fallback)
5. If it returns `None`, raise an exception group with all failures
6. If `max_attempts` is reached, stop and raise an exception group

The selector has full control over retry/fallback logic:

- Return `None` → stop trying
- Inspect `ctx.attempts[-1].exception` to make decisions
- Use `time.sleep()` or `await asyncio.sleep()` for backoff/waiting (the selector can be sync or async)

Context
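The diff hunks quoted in the review suggest the selector context looks roughly like this (a sketch only: the wrapper class name is assumed, and the review above suggests renaming `AttemptResult` to `Attempt` and using `datetime`/`timedelta`):

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

from pydantic_ai import RunContext

AgentDepsT = TypeVar('AgentDepsT')


@dataclass
class AttemptResult:
    exception: Exception | None
    """The exception raised by the model, if any."""

    timestamp: float
    """Unix timestamp when the attempt was made."""

    duration: float
    """How long the attempt took, in seconds."""


@dataclass
class AdaptiveContext(Generic[AgentDepsT]):  # name assumed, not shown in the quoted diff
    attempts: list[AttemptResult]
    """History of attempts in this request."""

    attempt_number: int
    """Current attempt number (1-indexed)."""

    run_context: RunContext[AgentDepsT] | None
    """The agent run context; only populated in streaming mode (see notes below)."""
```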
Important Implementation Notes
Agent Dependencies Availability:

- `run_context` (and thus `ctx.run_context.deps`) is only available in streaming mode (`run_stream`)
- In non-streaming mode (`run`, `run_sync`), `ctx.run_context` will be `None`
- This is due to the `Model.request()` API not accepting a `run_context` parameter

Use Cases
1. Throttling with Timeout
Handle rate limiting by timing out models for 30 seconds after throttling errors.
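A sketch of a selector for this, against the proposed API; detecting throttling via `ModelHTTPError` with status 429 and the bench-list bookkeeping are illustrative choices:

```python
import time

from pydantic_ai.exceptions import ModelHTTPError
from pydantic_ai.models import Model
from pydantic_ai.models.openai import OpenAIModel

models: list[Model] = [OpenAIModel('gpt-4o'), OpenAIModel('gpt-4o-mini')]
throttled_until: dict[int, float] = {}  # model index -> unix time it becomes usable again
last_index = -1


def throttle_aware_selector(ctx) -> Model | None:
    global last_index
    # If the previous attempt was rate limited, bench that model for 30 seconds.
    last = ctx.attempts[-1].exception if ctx.attempts else None
    if isinstance(last, ModelHTTPError) and last.status_code == 429:
        throttled_until[last_index] = time.time() + 30
    # Pick the first model that isn't currently benched.
    for i, model in enumerate(models):
        if throttled_until.get(i, 0) <= time.time():
            last_index = i
            return model
    return None  # everything is throttled right now; stop and raise the failures
```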
2. Load Balancing
Distribute requests evenly across multiple accounts/instances.
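Sketch: round-robin across models configured against different accounts; the provider/API-key wiring shown is illustrative:

```python
from itertools import cycle

from pydantic_ai.models import Model
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

accounts: list[Model] = [
    OpenAIModel('gpt-4o', provider=OpenAIProvider(api_key='key-for-account-a')),
    OpenAIModel('gpt-4o', provider=OpenAIProvider(api_key='key-for-account-b')),
]
rotation = cycle(accounts)


def round_robin_selector(ctx) -> Model | None:
    if len(ctx.attempts) >= len(accounts):
        return None  # every account has already failed for this request
    return next(rotation)
```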
3. User Tier-Based Selection
Route to different models based on user subscription level.
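Sketch assuming the agent's deps carry the user's tier; as noted above, `ctx.run_context` is only populated in streaming mode:

```python
from dataclasses import dataclass

from pydantic_ai.models import Model
from pydantic_ai.models.openai import OpenAIModel

premium_model = OpenAIModel('gpt-4o')
standard_model = OpenAIModel('gpt-4o-mini')


@dataclass
class UserDeps:
    tier: str  # e.g. 'free' or 'pro'


def tier_selector(ctx) -> Model | None:
    if ctx.attempts:
        return None  # no retries in this example
    deps: UserDeps | None = ctx.run_context.deps if ctx.run_context else None
    return premium_model if deps is not None and deps.tier == 'pro' else standard_model
```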
4. Extended Context Model Upgrade
Automatically upgrade to a long-context model when conversation exceeds a threshold.
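Sketch: the quoted diff doesn't show the selector receiving the request messages directly, so this reads them off `ctx.run_context` (streaming mode only); the threshold and model choice are illustrative:

```python
from pydantic_ai.models import Model
from pydantic_ai.models.openai import OpenAIModel

standard_model = OpenAIModel('gpt-4o')
long_context_model = OpenAIModel('gpt-4.1')  # stand-in for a longer-context model

MESSAGE_THRESHOLD = 50


def long_context_selector(ctx) -> Model | None:
    if ctx.attempts:
        return None
    messages = ctx.run_context.messages if ctx.run_context else []
    return long_context_model if len(messages) > MESSAGE_THRESHOLD else standard_model
```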
5. Cost-Optimized with Quality Fallback
Start with cheap models, upgrade if quality is insufficient.
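Sketch: escalate one tier per failed attempt. How "insufficient quality" actually surfaces as a failure depends on the setup (e.g. the cheap model erroring on a strict output schema):

```python
from pydantic_ai.models import Model
from pydantic_ai.models.openai import OpenAIModel

tiers: list[Model] = [
    OpenAIModel('gpt-4o-mini'),  # cheapest first
    OpenAIModel('gpt-4o'),
    OpenAIModel('o3'),           # premium last resort
]


def escalating_selector(ctx) -> Model | None:
    index = len(ctx.attempts)  # one step up the tier list per failure
    return tiers[index] if index < len(tiers) else None
```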
6. Smart Retry with Exponential Backoff
Retry same model with backoff for transient errors, fallback for permanent errors.
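Sketch of an async selector (selectors can be sync or async): transient HTTP errors retry the same model with exponential backoff, anything else falls through to a backup; the status-code classification is illustrative:

```python
import asyncio

from pydantic_ai.exceptions import ModelHTTPError
from pydantic_ai.models import Model
from pydantic_ai.models.openai import OpenAIModel

primary = OpenAIModel('gpt-4o')
backup = OpenAIModel('gpt-4o-mini')


async def backoff_selector(ctx) -> Model | None:
    if not ctx.attempts:
        return primary
    last = ctx.attempts[-1].exception
    transient = isinstance(last, ModelHTTPError) and last.status_code in (429, 500, 502, 503)
    if transient and len(ctx.attempts) < 4:
        await asyncio.sleep(2 ** len(ctx.attempts))  # 2s, 4s, 8s
        return primary
    # Permanent error or retries exhausted: move to the backup (up to 5 attempts total).
    return backup if len(ctx.attempts) < 5 else None
```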
7. Account Quota Management
Rotate across accounts based on remaining quota.
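Sketch: route each request to the account with the most remaining quota; where the quota numbers come from (a billing API, local counters, etc.) is up to the caller:

```python
from pydantic_ai.models import Model
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

account_models: dict[str, Model] = {
    'account-a': OpenAIModel('gpt-4o', provider=OpenAIProvider(api_key='key-a')),
    'account-b': OpenAIModel('gpt-4o', provider=OpenAIProvider(api_key='key-b')),
}
remaining_quota: dict[str, int] = {'account-a': 10_000, 'account-b': 250_000}


def quota_selector(ctx) -> Model | None:
    if len(ctx.attempts) >= len(account_models):
        return None  # every account has already failed for this request
    usable = {name: quota for name, quota in remaining_quota.items() if quota > 0}
    if not usable:
        return None
    best = max(usable, key=usable.get)
    return account_models[best]
```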
Key Benefits
Comparison with FallbackModel
`AdaptiveModel` complements `FallbackModel` by supporting sophisticated routing scenarios, while `FallbackModel` remains the simpler choice for basic sequential fallback.