Hi,
Is there a way to create a mid-layer to serve functionary on a vanilla sglang server?
The unit I work in runs an sglang server where I can easily serve any Hugging Face-hosted LLM, but I cannot install the functionary library on the server side or run the server_sglang.py script.
I attempted to create a middleware "server" on the client side without much success (see the sketch below).
Thanks in advance for any tips.
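For reference, this is roughly the shape of the middleware I tried: a thin OpenAI-style proxy that renders the chat + tools into a raw prompt with the model's own chat template and forwards it to the vanilla sglang server. This is a minimal sketch, not a working solution; it assumes the sglang server exposes its native /generate endpoint, that the functionary checkpoint (e.g. meetkai/functionary-small-v3.2) ships a chat template that accepts tools, and the host/port and endpoint names are placeholders.

```python
"""Minimal client-side middleware sketch (assumptions noted in comments)."""
import httpx
from fastapi import FastAPI
from transformers import AutoTokenizer

SGLANG_URL = "http://my-sglang-host:30000"   # placeholder: the vanilla sglang server
MODEL_ID = "meetkai/functionary-small-v3.2"  # any functionary checkpoint

app = FastAPI()
# Functionary checkpoints ship a chat template in tokenizer_config.json,
# so apply_chat_template can render messages + tools into the raw prompt
# without importing the functionary library itself.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)


@app.post("/v1/chat/completions")
async def chat_completions(body: dict):
    prompt = tokenizer.apply_chat_template(
        body["messages"],
        tools=body.get("tools"),
        tokenize=False,
        add_generation_prompt=True,
    )
    # Forward the rendered prompt to sglang's native /generate endpoint.
    async with httpx.AsyncClient(timeout=120) as client:
        resp = await client.post(
            f"{SGLANG_URL}/generate",
            json={
                "text": prompt,
                "sampling_params": {
                    "temperature": body.get("temperature", 0.0),
                    "max_new_tokens": body.get("max_tokens", 512),
                },
            },
        )
    raw = resp.json()["text"]
    # NOTE: the hard part is parsing the raw completion back into
    # OpenAI-style tool_calls; functionary's prompt-template classes do
    # this server-side, so without the library you would have to
    # re-implement the parser for your model version here.
    return {"choices": [{"message": {"role": "assistant", "content": raw}}]}
```

This is where I got stuck: the prompt rendering and forwarding work, but reconstructing structured tool calls from the raw text is exactly what server_sglang.py does and what I cannot run on the server.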
The same for vLLM would be nice.
I've found that server_vllm.py exposes only the /chat/completions endpoint.
Also, it seems that 3.1 medium generates a run of backslashes (\\\) when prompted with a long input sequence. Is that because of the function-calling tune or the FP8 quantization? I haven't tried the non-quantized 3.1 medium yet.