Make chat and embed interfaces provider-agnostic using pydantic_ai #200
gvanrossum wants to merge 11 commits into main
Conversation
pyproject.toml or lockfile was not touched, so the dependency on pydantic_ai is missing.
src/typeagent/aitools/embeddings.py
Outdated
    return input, len(tokens)
...
def create_embedding_model(
There are now two different create_embedding_model functions with different signatures:
embeddings.py:355: create_embedding_model(embedding_size, model_name, **kwargs) — creates AsyncEmbeddingModel (OpenAI/Azure)
model_adapters.py:184: create_embedding_model(model_spec, *, embedding_size) — creates PydanticAIEmbeddingModel
This is confusing. Callers importing from different modules get entirely different behaviors.
IMO we only need PydanticAIEmbeddingModel, since it can also be configured to call OpenAI/Azure models.
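One way to resolve the collision is a single provider-agnostic factory keyed by the "provider:model" spec convention pydantic_ai uses. A minimal sketch; the dispatch logic and the dataclass stand-in are hypothetical, not the PR's actual code:

```python
from dataclasses import dataclass


@dataclass
class PydanticAIEmbeddingModel:
    """Stand-in for the real adapter; holds configuration only."""
    model_spec: str
    embedding_size: int


def create_embedding_model(model_spec: str, *, embedding_size: int) -> PydanticAIEmbeddingModel:
    """One factory for all providers, keyed by the 'provider:model' prefix."""
    provider, sep, _name = model_spec.partition(":")
    if not sep or provider not in {"openai", "azure"}:
        raise ValueError(f"unsupported model spec: {model_spec!r}")
    return PydanticAIEmbeddingModel(model_spec, embedding_size)


model = create_embedding_model("openai:text-embedding-3-small", embedding_size=1536)
```

With one factory in one module, callers cannot accidentally pick up the wrong signature by importing from the wrong place.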
# ---------------------------------------------------------------------------
...
class PydanticAIEmbeddingModel(IEmbeddingModel):
AsyncEmbeddingModel (in aitools/embeddings.py) does NOT inherit from IEmbeddingModel, which seems asymmetric.
Furthermore: is subclassing a Protocol directly really the intended pattern?
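For reference, a runtime_checkable Protocol is satisfied structurally, so neither class needs to subclass it. A minimal sketch; the member names here are illustrative, not the PR's exact interface:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class IEmbeddingModel(Protocol):
    model_name: str

    async def get_embedding_nocache(self, text: str) -> list[float]: ...


class StructuralModel:
    """Satisfies the Protocol by shape alone; no inheritance needed."""
    model_name = "demo"

    async def get_embedding_nocache(self, text: str) -> list[float]:
        return [0.0]


# isinstance() checks member presence, not the class hierarchy
assert isinstance(StructuralModel(), IEmbeddingModel)
```

That would let both AsyncEmbeddingModel and PydanticAIEmbeddingModel conform symmetrically, without either one inheriting from the Protocol.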
def test_embedding_model_is_iembedding_model() -> None:
    """PydanticAIEmbeddingModel inherits from IEmbeddingModel."""
    assert IEmbeddingModel in PydanticAIEmbeddingModel.__mro__
__mro__? Again, the inheritance from IEmbeddingModel seems asymmetric.
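If the goal is conformance rather than inheritance, the test could assert the structural relationship instead of inspecting __mro__. A hedged sketch using stand-in classes (the real adapter has more members):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class IEmbeddingModel(Protocol):
    async def get_embedding_nocache(self, text: str) -> list[float]: ...


class PydanticAIEmbeddingModel:
    """Hypothetical stand-in for the real adapter; note: no subclassing."""

    async def get_embedding_nocache(self, text: str) -> list[float]:
        return []


def test_embedding_model_satisfies_protocol() -> None:
    """Checks conformance without requiring IEmbeddingModel in __mro__."""
    assert isinstance(PydanticAIEmbeddingModel(), IEmbeddingModel)


test_embedding_model_satisfies_protocol()
```

This keeps the test valid whether or not the implementation ever subclasses the Protocol.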
tools/ingest_vtt.py
Outdated
print("Setting up conversation settings...")
try:
-   embedding_model = AsyncEmbeddingModel(model_name=embedding_name)
+   embedding_model = create_embedding_model(model_name=embedding_name)
the pydantic_ai API uses a constructor call to instantiate models or agents, not a factory method, so this looks kind of strange. For example:
model = os.getenv('PYDANTIC_AI_MODEL', 'openai:gpt-5.2')
print(f'Using model: {model}')
agent = Agent(model, output_type=MyModel)
env: dict[str, str | None] = dict(os.environ)
key_name = "AZURE_OPENAI_API_KEY"
env[key_name] = api_key
self.base_model = typechat.create_language_model(env)
- why do we set the environment again? Better to pass the API key via an argument than via the environment.
- why do we create a new language model on every refresh (line 112)? That might be an expensive operation.
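One pattern that addresses both points: build the client once, pass the key as a plain argument, and rebuild only when the credential expires. A hypothetical sketch; RefreshingModel, make_client, and get_token are illustrative names, not the PR's API:

```python
import time
from typing import Any, Callable


class RefreshingModel:
    """Caches the underlying client; rebuilds it only when the token expires."""

    def __init__(
        self,
        make_client: Callable[[str], Any],  # receives the api key directly
        get_token: Callable[[], str],
        ttl_seconds: float = 1500.0,
    ) -> None:
        self._make_client = make_client
        self._get_token = get_token
        self._ttl = ttl_seconds
        self._expires_at = 0.0
        self._client: Any = None

    def client(self) -> Any:
        now = time.monotonic()
        if self._client is None or now >= self._expires_at:
            # no os.environ mutation: the key travels as a plain argument
            self._client = self._make_client(self._get_token())
            self._expires_at = now + self._ttl
        return self._client
```

Passing the key explicitly also avoids the concurrency hazards of mutating os.environ in-process.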
from .embeddings import AsyncEmbeddingModel, NormalizedEmbedding, NormalizedEmbeddings

DEFAULT_MAX_RETRIES = 2
Before: from openai import DEFAULT_MAX_RETRIES (whatever OpenAI sets).
Now: DEFAULT_MAX_RETRIES = 2.
Why the change?
def add_embedding(self, key: str, embedding: NormalizedEmbedding) -> None:
    self._cache[key] = embedding

async def _probe_embedding_size(self) -> None:
why probe the embedding size? We should specify it, or reject the call.
if self.embedding_size == 0:
    await self._probe_embedding_size()
same here: we should make the embedding size a mandatory parameter.
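Making the size mandatory could look like the following sketch (the constructor signature is hypothetical):

```python
class PydanticAIEmbeddingModel:
    """Sketch: embedding_size is required and validated, so there is no
    zero-size sentinel and no runtime probe call."""

    def __init__(self, model_spec: str, *, embedding_size: int) -> None:
        if embedding_size <= 0:
            raise ValueError("embedding_size must be a positive integer")
        self.model_spec = model_spec
        self.embedding_size = embedding_size
```

Failing fast at construction time is cheaper and clearer than issuing a probe request on the first embedding call.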
@runtime_checkable
class IEmbeddingModel(Protocol):
The Protocol bundles caching semantics (add_embedding, get_embedding with cache, _nocache variants) into the provider interface itself. This means every new provider implementation (e.g., Anthropic, Cohere) must re-implement the same caching boilerplate; PydanticAIEmbeddingModel, for example, duplicates nearly identical caching logic from AsyncEmbeddingModel.
Consider splitting the interface:
- a minimal IEmbedder Protocol with only get_embedding_nocache / get_embeddings_nocache + model_name + embedding_size
- a shared CachingEmbeddingModel base class (or a decorator/wrapper) that adds the cache layer on top
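A sketch of that split, using a wrapper rather than a base class (IEmbedder and CachingEmbedder are hypothetical names; the real interface has more members, and FakeEmbedder is a test double):

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class IEmbedder(Protocol):
    """Minimal provider surface: no caching semantics."""
    model_name: str
    embedding_size: int

    async def get_embeddings_nocache(self, texts: list[str]) -> list[list[float]]: ...


class CachingEmbedder:
    """Shared cache layer composed around any IEmbedder."""

    def __init__(self, inner: IEmbedder) -> None:
        self.inner = inner
        self._cache: dict[str, list[float]] = {}

    async def get_embedding(self, text: str) -> list[float]:
        if text not in self._cache:
            [embedding] = await self.inner.get_embeddings_nocache([text])
            self._cache[text] = embedding
        return self._cache[text]


class FakeEmbedder:
    """Test double: counts how often the provider is actually hit."""
    model_name = "fake"
    embedding_size = 2

    def __init__(self) -> None:
        self.calls = 0

    async def get_embeddings_nocache(self, texts: list[str]) -> list[list[float]]:
        self.calls += 1
        return [[0.0, 1.0] for _ in texts]


fake = FakeEmbedder()
cached = CachingEmbedder(fake)
asyncio.run(cached.get_embedding("hi"))
asyncio.run(cached.get_embedding("hi"))
assert fake.calls == 1  # second lookup served from the cache
```

Each new provider then implements only the nocache methods, and the cache logic lives in exactly one place.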
DEFAULT_TIMEOUT_SECONDS = 25

class ModelWrapper(typechat.TypeChatLanguageModel):
ModelWrapper is a concrete class wrapping typechat.TypeChatLanguageModel with Azure-specific token refresh logic. It now lives in utils.py — a catch-all module. With the new PydanticAIChatModel abstraction, there are now two different abstractions for wrapping a TypeChat model: ModelWrapper (Azure, in utils.py) and PydanticAIChatModel (pydantic_ai, in model_adapters.py). There's no clear guidance on when to use which.