We need integration tests that verify batching does not change generation results. Previously, we've had issues like the one fixed by #873, where improper masking caused:

- token corruption when a batch contained more than one request
- token corruption when a batch was not full
Our current shortfin LLM server integration tests can't reliably reproduce these issues, because our concurrency tests go through the HTTP endpoint, where the timing of incoming requests determines how they are batched.
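A minimal sketch of what such a test could look like, bypassing the HTTP endpoint so batches are constructed deterministically. Here `load_model` and `generate_batch` are hypothetical placeholders for whatever offline entry point the server exposes, and greedy (temperature-0) decoding is assumed so that outputs are comparable:

```python
import pytest

# NOTE: `load_model` and `generate_batch` are hypothetical helpers standing in
# for a direct (non-HTTP) invocation path; the key idea is that the test, not
# request timing, decides how prompts are grouped into batches.
PROMPTS = [
    "The quick brown fox",
    "In a hole in the ground there lived",
    "def fibonacci(n):",
    "Once upon a time",
]


@pytest.fixture(scope="module")
def model():
    # Assumes deterministic (greedy) decoding so results can be compared exactly.
    return load_model()


def reference_outputs(model):
    """Generate each prompt alone (batch size 1) as the ground truth."""
    return [generate_batch(model, [p])[0] for p in PROMPTS]


@pytest.mark.parametrize("batch_size", [2, 3, 4])
def test_batched_matches_unbatched(model, batch_size):
    expected = reference_outputs(model)
    # Group prompts deterministically, including a trailing partial batch,
    # so both the "more than 1 request" and "batch not full" cases from
    # #873 are exercised.
    actual = []
    for i in range(0, len(PROMPTS), batch_size):
        actual.extend(generate_batch(model, PROMPTS[i : i + batch_size]))
    assert actual == expected
```

Because batch composition is fixed by the test parameters rather than HTTP arrival timing, a masking regression would fail this test on every run instead of only intermittently.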