llm.get_async_model(), llm.AsyncModel base class and OpenAI async models (#613)

- #507 (comment)

* register_model is now async aware

Refs #507 (comment)

* Refactor Chat and AsyncChat to use _Shared base class

Refs #507 (comment)

* fixed function name

* Fix for infinite loop

* Applied Black

* Ran cog

* Applied Black

* Add Response.from_row() classmethod back again

It does not matter that this is a blocking call, since it is a classmethod

* Made mypy happy with llm/models.py

* mypy fixes for openai_models.py

I am unhappy with this, had to duplicate some code.

* First test for AsyncModel

* Still have not quite got this working

* Fix for not loading plugins during tests, refs #626

* audio/wav not audio/wave, refs #603

* Black and mypy and ruff all happy

* Refactor to avoid generics

* Removed obsolete response() method

* Support text = await async_mock_model.prompt("hello")

* Initial docs for llm.get_async_model() and await model.prompt()

Refs #507

* Initial async model plugin creation docs

* duration_ms ANY to pass test

* llm models --async option

Refs #613 (comment)

* Removed obsolete TypeVars

* Expanded register_models() docs for async

* await model.prompt() now returns AsyncResponse

Refs #613 (comment)

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
simonw and github-actions[bot] authored Nov 14, 2024
1 parent 5a984d0 commit ba75c67
Showing 14 changed files with 688 additions and 219 deletions.
2 changes: 2 additions & 0 deletions docs/help.md
@@ -121,6 +121,7 @@ Options:
--cid, --conversation TEXT Continue the conversation with the given ID.
--key TEXT API key to use
--save TEXT Save prompt with this template name
--async Run prompt asynchronously
--help Show this message and exit.
```

@@ -322,6 +323,7 @@ Usage: llm models list [OPTIONS]
Options:
--options Show options for each model, if available
--async List async models
--help Show this message and exit.
```

51 changes: 51 additions & 0 deletions docs/plugins/advanced-model-plugins.md
@@ -5,13 +5,64 @@ The {ref}`model plugin tutorial <tutorial-model-plugin>` covers the basics of de

This document covers more advanced topics.

(advanced-model-plugins-async)=

## Async models

Plugins can optionally provide an asynchronous version of their model, suitable for use with Python [asyncio](https://docs.python.org/3/library/asyncio.html). This is particularly useful for remote models accessible by an HTTP API.

The async version of a model subclasses `llm.AsyncModel` instead of `llm.Model`. It must implement `execute()` as an `async def` asynchronous generator method, in place of the synchronous `def execute()`.

This example shows a subset of the OpenAI default plugin illustrating how this method might work:


```python
from typing import AsyncGenerator
import llm

class MyAsyncModel(llm.AsyncModel):
    # This can duplicate the model_id of the sync model:
    model_id = "my-model-id"

    async def execute(
        self, prompt, stream, response, conversation=None
    ) -> AsyncGenerator[str, None]:
        if stream:
            completion = await client.chat.completions.create(
                model=self.model_id,
                messages=messages,
                stream=True,
            )
            async for chunk in completion:
                yield chunk.choices[0].delta.content
        else:
            completion = await client.chat.completions.create(
                model=self.model_name or self.model_id,
                messages=messages,
                stream=False,
            )
            yield completion.choices[0].message.content
```
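Note that `client` and `messages` are assumed to already exist in this snippet: in the full plugin, `client` would be an async API client (for OpenAI, an `openai.AsyncOpenAI` instance) and `messages` the message list built from the prompt and conversation history.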
This async model instance should then be passed to the `register()` method in the `register_models()` plugin hook:

```python
@hookimpl
def register_models(register):
register(
MyModel(), MyAsyncModel(), aliases=("my-model-aliases",)
)
```
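
Once registered, the async variant can be looked up with `llm.get_async_model()` and awaited from an event loop. A minimal usage sketch, assuming the `my-model-id` registration above:

```python
import asyncio

import llm


async def main():
    # Resolves the async model registered under this ID (or one of its aliases)
    model = llm.get_async_model("my-model-id")
    response = await model.prompt("Say hello")
    print(await response.text())


asyncio.run(main())
```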

(advanced-model-plugins-attachments)=

## Attachments for multi-modal models

Models such as GPT-4o, Claude 3.5 Sonnet and Google's Gemini 1.5 are multi-modal: they accept input in the form of images, and in some cases audio, video and other formats as well.

LLM calls these **attachments**. Models can specify the types of attachments they accept and then implement special code in the `.execute()` method to handle them.

See {ref}`the Python attachments documentation <python-api-attachments>` for details on using attachments in the Python API.

### Specifying attachment types

A `Model` subclass can list the types of attachments it accepts by defining an `attachment_types` class attribute:
17 changes: 16 additions & 1 deletion docs/plugins/plugin-hooks.md
@@ -42,5 +42,20 @@ class HelloWorld(llm.Model):
    def execute(self, prompt, stream, response):
        return ["hello world"]
```
If your model includes an async version, you can register that too:

```python
class AsyncHelloWorld(llm.AsyncModel):
    model_id = "helloworld"

    async def execute(self, prompt, stream, response):
        return ["hello world"]

@llm.hookimpl
def register_models(register):
    register(HelloWorld(), AsyncHelloWorld(), aliases=("hw",))
```
This demonstrates how to register a model with both sync and async versions, and how to specify an alias for that model.

The {ref}`model plugin tutorial <tutorial-model-plugin>` describes how to use this hook in detail. Asynchronous models {ref}`are described here <advanced-model-plugins-async>`.

30 changes: 29 additions & 1 deletion docs/python-api.md
@@ -99,7 +99,7 @@ print(response.text())
```
Some models do not use API keys at all.

### Streaming responses

For models that support it you can stream responses as they are generated, like this:

@@ -112,6 +112,34 @@ The `response.text()` method described earlier does this for you - it runs throu

If a response has been evaluated, `response.text()` will continue to return the same string.
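
For example, in a rough sketch (assuming an OpenAI model such as `gpt-4o-mini` is configured with a key):

```python
import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("Tell me a joke")

for chunk in response:  # consumes the stream as it arrives
    print(chunk, end="")
print()

# The response is now fully evaluated, so .text() returns the
# accumulated string without running the prompt again:
print(response.text())
```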

(python-api-async)=

## Async models

Some plugins provide async versions of their supported models, suitable for use with Python [asyncio](https://docs.python.org/3/library/asyncio.html).

To use an async model, use the `llm.get_async_model()` function instead of `llm.get_model()`:

```python
import llm
model = llm.get_async_model("gpt-4o")
```
You can then run a prompt using `await model.prompt(...)`:

```python
response = await model.prompt(
"Five surprising names for a pet pelican"
)
print(await response.text())
```
Or use `async for chunk in ...` to stream the response as it is generated:
```python
async for chunk in model.prompt(
    "Five surprising names for a pet pelican"
):
    print(chunk, end="", flush=True)
```
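
Since `await` is only valid inside a coroutine, a self-contained script would wrap these calls in `asyncio.run()`. A minimal sketch using the `gpt-4o` async model from above:

```python
import asyncio

import llm


async def main():
    model = llm.get_async_model("gpt-4o")
    # Stream the response to stdout as it is generated
    async for chunk in model.prompt(
        "Five surprising names for a pet pelican"
    ):
        print(chunk, end="", flush=True)
    print()


asyncio.run(main())
```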

## Conversations

LLM supports *conversations*, where you ask follow-up questions of a model as part of an ongoing conversation.
60 changes: 53 additions & 7 deletions llm/__init__.py
@@ -4,6 +4,8 @@
NeedsKeyException,
)
from .models import (
AsyncModel,
AsyncResponse,
Attachment,
Conversation,
Model,
@@ -26,9 +28,11 @@

__all__ = [
"hookimpl",
"get_async_model",
"get_model",
"get_key",
"user_dir",
"AsyncResponse",
"Attachment",
"Collection",
"Conversation",
@@ -74,11 +78,11 @@ def get_models_with_aliases() -> List["ModelWithAliases"]:
for alias, model_id in configured_aliases.items():
extra_model_aliases.setdefault(model_id, []).append(alias)

    def register(model, async_model=None, aliases=None):
alias_list = list(aliases or [])
if model.model_id in extra_model_aliases:
alias_list.extend(extra_model_aliases[model.model_id])
        model_aliases.append(ModelWithAliases(model, async_model, alias_list))

load_plugins()
pm.hook.register_models(register=register)
@@ -137,26 +141,68 @@ def get_embedding_model_aliases() -> Dict[str, EmbeddingModel]:
return model_aliases


def get_async_model_aliases() -> Dict[str, AsyncModel]:
async_model_aliases = {}
for model_with_aliases in get_models_with_aliases():
if model_with_aliases.async_model:
for alias in model_with_aliases.aliases:
async_model_aliases[alias] = model_with_aliases.async_model
async_model_aliases[model_with_aliases.model.model_id] = (
model_with_aliases.async_model
)
return async_model_aliases


def get_model_aliases() -> Dict[str, Model]:
model_aliases = {}
for model_with_aliases in get_models_with_aliases():
        if model_with_aliases.model:
            for alias in model_with_aliases.aliases:
                model_aliases[alias] = model_with_aliases.model
            model_aliases[model_with_aliases.model.model_id] = model_with_aliases.model
return model_aliases


class UnknownModelError(KeyError):
pass


def get_async_model(name: Optional[str] = None) -> AsyncModel:
aliases = get_async_model_aliases()
name = name or get_default_model()
try:
return aliases[name]
except KeyError:
# Does a sync model exist?
sync_model = None
try:
sync_model = get_model(name, _skip_async=True)
except UnknownModelError:
pass
if sync_model:
raise UnknownModelError("Unknown async model (sync model exists): " + name)
else:
raise UnknownModelError("Unknown model: " + name)


def get_model(name: Optional[str] = None, _skip_async: bool = False) -> Model:
aliases = get_model_aliases()
name = name or get_default_model()
try:
return aliases[name]
    except KeyError:
        # Does an async model exist?
        if _skip_async:
            raise UnknownModelError("Unknown model: " + name)
async_model = None
try:
async_model = get_async_model(name)
except UnknownModelError:
pass
if async_model:
raise UnknownModelError("Unknown model (async model exists): " + name)
else:
raise UnknownModelError("Unknown model: " + name)


def get_key(
69 changes: 54 additions & 15 deletions llm/cli.py
@@ -1,3 +1,4 @@
import asyncio
import click
from click_default_group import DefaultGroup
from dataclasses import asdict
@@ -11,6 +12,7 @@
Template,
UnknownModelError,
encode,
get_async_model,
get_default_model,
get_default_embedding_model,
get_embedding_models_with_aliases,
@@ -199,6 +201,7 @@ def cli():
)
@click.option("--key", help="API key to use")
@click.option("--save", help="Save prompt with this template name")
@click.option("async_", "--async", is_flag=True, help="Run prompt asynchronously")
def prompt(
prompt,
system,
@@ -215,6 +218,7 @@ def prompt(
conversation_id,
key,
save,
async_,
):
"""
Execute a prompt
@@ -337,9 +341,12 @@ def read_prompt():

# Now resolve the model
try:
        if async_:
            model = get_async_model(model_id)
        else:
            model = get_model(model_id)
    except UnknownModelError as ex:
        raise click.ClickException(ex)

# Provide the API key, if one is needed and has been provided
if model.needs_key:
@@ -375,21 +382,48 @@ def read_prompt():
prompt_method = conversation.prompt

try:
if async_:

async def inner():
if should_stream:
async for chunk in prompt_method(
prompt,
attachments=resolved_attachments,
system=system,
**validated_options,
):
print(chunk, end="")
sys.stdout.flush()
print("")
else:
response = prompt_method(
prompt,
attachments=resolved_attachments,
system=system,
**validated_options,
)
print(await response.text())

asyncio.run(inner())
else:
response = prompt_method(
prompt,
attachments=resolved_attachments,
system=system,
**validated_options,
)
if should_stream:
for chunk in response:
print(chunk, end="")
sys.stdout.flush()
print("")
else:
print(response.text())
except Exception as ex:
raise click.ClickException(str(ex))

# Log to the database
    if (logs_on() or log) and not no_log and not async_:
log_path = logs_db_path()
(log_path.parent).mkdir(parents=True, exist_ok=True)
db = sqlite_utils.Database(log_path)
@@ -981,14 +1015,19 @@ def models():
@click.option(
"--options", is_flag=True, help="Show options for each model, if available"
)
@click.option("async_", "--async", is_flag=True, help="List async models")
def models_list(options, async_):
"List available models"
models_that_have_shown_options = set()
for model_with_aliases in get_models_with_aliases():
if async_ and not model_with_aliases.async_model:
continue
extra = ""
if model_with_aliases.aliases:
extra = " (aliases: {})".format(", ".join(model_with_aliases.aliases))
        model = (
            model_with_aliases.model if not async_ else model_with_aliases.async_model
        )
output = str(model) + extra
if options and model.Options.schema()["properties"]:
output += "\n Options:"