
Commit c851e77

Add new events + stat (#142)
1 parent 4ae1fa3 commit c851e77

File tree

2 files changed: +160 −0 lines changed

1_developer/_2_rest/chat.md

Lines changed: 4 additions & 0 deletions
@@ -236,6 +236,10 @@ variants:
 - name: time_to_first_token_seconds
   type: number
   description: Time in seconds to generate the first token.
+- name: model_load_time_seconds
+  type: number
+  optional: true
+  description: Time taken to load the model for this request, in seconds. Present only if the model was not already loaded.
 - name: thread_id
   type: string
   optional: true
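Because `model_load_time_seconds` is optional, its absence is itself informative: the model was already resident. A minimal sketch of reading the stat from a response's stats object (the `summarize_stats` helper and the sample payloads are illustrative, not part of the API):

```python
import json

def summarize_stats(stats_json: str) -> str:
    """Summarize timing stats from a chat response.

    model_load_time_seconds is optional: it is present only when the
    model had to be loaded for this request, so its absence means the
    model was already loaded.
    """
    stats = json.loads(stats_json)
    ttft = stats["time_to_first_token_seconds"]
    load = stats.get("model_load_time_seconds")  # None for warm requests
    if load is None:
        return f"warm request: first token in {ttft:.2f}s"
    return f"cold request: loaded model in {load:.2f}s, first token in {ttft:.2f}s"

# Hypothetical stats payloads for illustration:
print(summarize_stats('{"time_to_first_token_seconds": 0.42}'))
print(summarize_stats('{"time_to_first_token_seconds": 0.42, "model_load_time_seconds": 12.34}'))
```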

1_developer/_2_rest/streaming-events.md

Lines changed: 156 additions & 0 deletions
@@ -8,6 +8,12 @@ Streaming events let you render chat responses incrementally over Server‑Sent
 
 List of event types that can be sent in an `api/v1/chat` response stream:
 - `chat.start`
+- `model_load.start`
+- `model_load.progress`
+- `model_load.end`
+- `prompt_processing.start`
+- `prompt_processing.progress`
+- `prompt_processing.end`
 - `reasoning.start`
 - `reasoning.delta`
 - `reasoning.end`
@@ -51,6 +57,156 @@ variants:
 ```
 ````
 
+### `model_load.start`
+````lms_hstack
+Signals that a model load has started in order to fulfill the chat request. Not emitted if the requested model is already loaded.
+```lms_params
+- name: model_instance_id
+  type: string
+  description: Unique identifier for the model instance being loaded.
+- name: type
+  type: '"model_load.start"'
+  description: The type of the event. Always `model_load.start`.
+```
+:::split:::
+```lms_code_snippet
+  title: Example Event Data
+  variants:
+    json:
+      language: json
+      code: |
+        {
+          "type": "model_load.start",
+          "model_instance_id": "openai/gpt-oss-20b"
+        }
+```
+````
+
+### `model_load.progress`
+````lms_hstack
+Reports progress of the model load.
+```lms_params
+- name: model_instance_id
+  type: string
+  description: Unique identifier for the model instance being loaded.
+- name: progress
+  type: number
+  description: Progress of the model load as a float between `0` and `1`.
+- name: type
+  type: '"model_load.progress"'
+  description: The type of the event. Always `model_load.progress`.
+```
+:::split:::
+```lms_code_snippet
+  title: Example Event Data
+  variants:
+    json:
+      language: json
+      code: |
+        {
+          "type": "model_load.progress",
+          "model_instance_id": "openai/gpt-oss-20b",
+          "progress": 0.65
+        }
+```
+````
+
+### `model_load.end`
+````lms_hstack
+Signals a successfully completed model load.
+```lms_params
+- name: model_instance_id
+  type: string
+  description: Unique identifier for the model instance that was loaded.
+- name: load_time_seconds
+  type: number
+  description: Time taken to load the model, in seconds.
+- name: type
+  type: '"model_load.end"'
+  description: The type of the event. Always `model_load.end`.
+```
+:::split:::
+```lms_code_snippet
+  title: Example Event Data
+  variants:
+    json:
+      language: json
+      code: |
+        {
+          "type": "model_load.end",
+          "model_instance_id": "openai/gpt-oss-20b",
+          "load_time_seconds": 12.34
+        }
+```
+````
+
+### `prompt_processing.start`
+````lms_hstack
+Signals the start of prompt processing by the model.
+```lms_params
+- name: type
+  type: '"prompt_processing.start"'
+  description: The type of the event. Always `prompt_processing.start`.
+```
+:::split:::
+```lms_code_snippet
+  title: Example Event Data
+  variants:
+    json:
+      language: json
+      code: |
+        {
+          "type": "prompt_processing.start"
+        }
+```
+````
+
+### `prompt_processing.progress`
+````lms_hstack
+Reports progress of prompt processing.
+```lms_params
+- name: progress
+  type: number
+  description: Progress of the prompt processing as a float between `0` and `1`.
+- name: type
+  type: '"prompt_processing.progress"'
+  description: The type of the event. Always `prompt_processing.progress`.
+```
+:::split:::
+```lms_code_snippet
+  title: Example Event Data
+  variants:
+    json:
+      language: json
+      code: |
+        {
+          "type": "prompt_processing.progress",
+          "progress": 0.5
+        }
+```
+````
+
+### `prompt_processing.end`
+````lms_hstack
+Signals the end of prompt processing.
+```lms_params
+- name: type
+  type: '"prompt_processing.end"'
+  description: The type of the event. Always `prompt_processing.end`.
+```
+:::split:::
+```lms_code_snippet
+  title: Example Event Data
+  variants:
+    json:
+      language: json
+      code: |
+        {
+          "type": "prompt_processing.end"
+        }
+```
+````
+
 ### `reasoning.start`
 ````lms_hstack
 Signals the model is starting to stream reasoning content.
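The new event payloads could be consumed with a small client-side dispatcher. A minimal sketch, assuming each SSE `data:` field carries one event's JSON as documented above (the `handle_event` helper and its return strings are illustrative, not part of the API):

```python
import json

def handle_event(data: str) -> str:
    """Dispatch one streamed event's JSON payload by its `type` field."""
    event = json.loads(data)
    etype = event["type"]
    if etype == "model_load.start":
        return f"loading {event['model_instance_id']}..."
    if etype in ("model_load.progress", "prompt_processing.progress"):
        # Both progress events carry a float in [0, 1].
        return f"{etype}: {event['progress'] * 100:.0f}%"
    if etype == "model_load.end":
        return f"loaded in {event['load_time_seconds']:.2f}s"
    if etype in ("prompt_processing.start", "prompt_processing.end"):
        return etype
    return f"unhandled: {etype}"

print(handle_event('{"type": "model_load.progress", "model_instance_id": "openai/gpt-oss-20b", "progress": 0.65}'))
# → model_load.progress: 65%
```

A real client would wire this into whatever SSE transport it uses; the sketch only covers the per-event dispatch.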
