Ollama UX Needs Improvement #54

Closed
LumiWasTaken opened this issue Dec 3, 2024 · 13 comments · Fixed by #57
Labels
enhancement New feature or request

Comments

@LumiWasTaken

Using:
https://github.com/Kiln-AI/Kiln/actions/runs/12130985759

Backend:
Both normal Ollama and KoboldCPP Ollama Compatible API

Issue:
[Screenshot: the model selector dropdown shows "No model found."]

@LumiWasTaken
Author

I dug a bit deeper and it seems to be a fundamental issue with:

Ollama running, but no models available

It does not properly fetch the /api/tags model list. Sadly, despite my efforts, I could not properly debug the "why".

@scosman
Collaborator

scosman commented Dec 3, 2024

Interesting. What does a curl http://localhost:11434/api/tags return?

Do you have one of the supported Ollama models installed? llama3.1:8b, llama3.1:70b, llama3.1:405b, mistral-large, llama3.2:1b, llama3.2, llama3.2-vision, llama3.2-vision:90b, phi3.5, gemma2:2b, gemma2:9b, gemma2:27b

Also: note the (i) info link in the screenshot - fewer models are supported for structured output. Smaller models can't produce it reliably, so they are filtered out. It might be working for tasks with unstructured output, but filtered for tasks with structured output.

UI improvements I should make here:

  • UI could be better at giving you a list of supported Ollama models.
  • Dropdown should be better at showing installed but unsupported models for structured tasks.

And potentially investigate the root cause of why /api/tags is failing to parse.
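If it helps with debugging, here's a minimal sketch (assuming the default local Ollama endpoint and the requests library; this is not Kiln's actual parsing code) of listing what /api/tags reports:

```python
# Minimal debugging sketch: list the models Ollama reports on /api/tags.
# Assumes the default local endpoint; not Kiln's parsing code.
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])  # e.g. "llama3.1:8b"
```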

@LumiWasTaken
Author

Oh, I did not know there is a specific list of "supported" models, as I use custom models loaded via GGUF since mine have a slightly altered finetune.

May I ask why you added a list of "supported" models, given that people who want to create synthetic data generally know what kind of model they should choose?

@LumiWasTaken
Author

The /api/tags endpoint returns the expected response:
{"models":[{"name":"dagbs/qwen2.5-coder-14b-instruct-abliterated:q4_k_m","model":"dagbs/qwen2.5-coder-14b-instruct-abliterated:q4_k_m","modified_at":"2024-12-03T14:08:10.388678503+01:00","size":8988113261,"digest":"c729c6635cd6ed6d14ee6d5d272d7c3930cb0cb62303f2bd1c4cba65f8f47fa1","details":{"parent_model":"","format":"gguf","family":"qwen2","families":["qwen2"],"parameter_size":"14.8B","quantization_level":"Q4_K_M"}}]}

@scosman
Collaborator

scosman commented Dec 3, 2024

Yeah - that's by design (whether the design is good is another question). I want the out-of-box experience to be great, so I test every feature with every model in a big matrix. But you should be able to override it.

Try one of the supported models for now.

I'll keep this issue for improving the UX around Ollama. Issues exist already for the other two:

@scosman scosman changed the title Bug: Ollama API - Model Selector Box Empty Ollama UX Needs Improvement Dec 3, 2024
@scosman scosman added the enhancement New feature or request label Dec 4, 2024
@scosman
Collaborator

scosman commented Dec 4, 2024

@LumiWasTaken fixed this. Grab an updated version from a main build here: https://github.com/Kiln-AI/Kiln/actions/workflows/build_desktop.yml

If you have time to confirm it works for you that would be great!

@LumiWasTaken
Author

@scosman

I've tested the new version.
Ollama STILL does not connect UNLESS you have at least one supported model installed.

While this is a valid approach, consider making the error message clearer: explicitly state that unsupported models were found and prompt the user to install at least one supported model, or allow connecting directly anyway but keep the "Supported" tab (instead of leaving it empty) with a single placeholder that says "No supported models installed" to highlight that all of them are untested.

While I was writing this message, I installed llama3.2 to debug and saw that it's unsupported / structured output fails...
Maybe don't use that one as the example in your Ollama prompt then >.<

Additional bug I found:
If the model name contains a / (models from the Ollama user repo are named username/modelname, HF-style), it will only use the part before the slash. So Lumi/llama-something results in a failure: "Lumi is not installed".

Why do the models require "tools" support? Could you clarify that for me? If it's about the JSON output, their API supports "format": "json".
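For reference, a rough sketch of the JSON mode I mean, via the documented /api/generate endpoint (the model name and prompt here are just placeholders):

```python
# Sketch of Ollama's JSON mode via /api/generate; model and prompt are placeholders.
import json
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1:8b",
        "prompt": "Tell a short joke as JSON with keys 'setup' and 'punchline'.",
        "format": "json",   # constrains the output to valid JSON
        "stream": False,
    },
    timeout=120,
)
resp.raise_for_status()
print(json.loads(resp.json()["response"]))
```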

Additional Note:
I saw that the "joke" example has
{
"punchline": "",
"setup": " "
}

The "Raw Output" Represents that too.
Why would the Punchline come before the setup? Or am i missing a critical thing here...
The order of the JSON output determines the path it takes e.g.
If you ask a model to respind in a {thoughts: string, response: string} order, it will first think, then speak.
Swap that around then it's thinking will rely on what response it gave before.

Wouldn't it make more sense to swap it around as the generation is Left to right / top to bottom?
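To make the ordering concrete (illustrative only, not Kiln's schema):

```python
# Illustrative only: key order shapes the generation path, since tokens are
# produced left to right.
reason_first = {"thoughts": "model reasons here first", "response": "then answers"}
answer_first = {"response": "answer committed first", "thoughts": "reasoning becomes post-hoc"}
```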

Sorry for this long-winded response! I hope this gave some insight into my thoughts as I'm eating my dinner.

@LumiWasTaken
Author

Another extension to my previous post.

I have also noticed that if you pull a specific quant, e.g.
llama3.2:1b-instruct-q3_K_M
or even
llama3.2:3b-instruct-q3_K_M

Kiln will not recognize it UNLESS you specifically pull plain llama3.2 with no tag appended.

That's pretty bad if you do not support quants.

The API responds with:
"models":[{"name":"llama3.2:3b-instruct-q3_K_M","model":"llama3.2:3b-instruct-q3_K_M","modified_at":"2024-12-04T20:23:00.519908513+01:00","size":1687174821,"digest":"fea4b467793058703bce1eaa2ffa09cb236f530e01476c9b0c2a486f6a2e2920","details":{"parent_model":"","format":"gguf","family":"llama","families":["llama"],"parameter_size":"3.2B","quantization_level":"Q3_K_M"}}

So one way to check it (I am not too familiar with the code, but somewhere around

provider.provider_options["model"]

) is to split the ID by ":" and use the [0] value of that array, so
llama3.2:3b-instruct-q3_K_M
is deconstructed into
llama3.2 and 3b-instruct-q3_K_M.

ADDITIONALLY, you could compare the model name (before the ":") with "startswith" or, even better, "contains" rather than a full-text comparison.
This would allow models like
dagbs-qwen2.5-coder-14b-instruct-abliterated:q4_k_m (I replaced the / with a - here, given the previously mentioned / issue)
to be counted as qwen2.5-coder-14b based, while ignoring that it's an abliterated modification.
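A rough sketch of the idea (hypothetical helper names, not Kiln's actual code):

```python
# Hypothetical sketch of the suggestion above, not Kiln's actual code.
def base_model_name(ollama_id: str) -> str:
    # "llama3.2:3b-instruct-q3_K_M" -> "llama3.2"
    return ollama_id.split(":", 1)[0]

def matches_supported(ollama_id: str, supported_bases: set[str]) -> bool:
    # Compare only the part before the ":" against the supported base names.
    return base_model_name(ollama_id) in supported_bases

print(matches_supported("llama3.2:3b-instruct-q3_K_M", {"llama3.2", "llama3.1"}))  # True
```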

@scosman
Collaborator

scosman commented Dec 4, 2024

Fixed the issue requiring at least one supported model just now. Good call. The messaging/UI should reflect what you're asking for now. It was set up like that, I just missed the zero-supported-model case in tests.

Also fixed the issue with slashes. I literally had a TODO in the code to fix that, just missed it 🤦

Great bug reports, keep them coming!

re: tools. The prompt should be pointing you to llama 3.1 now (which supports tools). Although the docs say it should work on 3.2 as well, so I'm not sure why it's erroring. I'll look at JSON mode + Ollama soon, which might let me expand structured output support. Trying to make a bunch of providers play nice is no fun. For now we test, and flag models as unlikely to work when there are known compatibility issues. You should see a big warning in the UI under the model selector.

re: joke example - JSON dictionaries don't have order, so they might render in any order once parsed. The rendering you're seeing is after parsing (including "raw"). In general, I've seen most models are smart enough to write the setup first and the punchline second, but if they aren't, that's just a model problem. 1B-param models might not get it right. Let me know if I missed something.

re: string matching - I really hate Ollama IDs. I can't just match the prefix, because "llama3.2" != "llama3.2:1b" (the first is 3b). However, sometimes two IDs map to the same model: "llama3.2" == "llama3.2:latest". Still, I think the current behavior is working as I want it to. I only want a match if that exact version (including quant) has passed the tests for structured output and data generation. I don't want to treat q4 like fp16, or even fp16 like bf16. Your specific quants should appear in the untested section and work fine if selected (let me know if they don't, that would be a bug). They shouldn't appear in the "tested" list since those quants really aren't tested.
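Roughly, the intent as a sketch (hypothetical names and example IDs, not the actual implementation):

```python
# Sketch of the exact-match behavior described above; hypothetical code.
TESTED_IDS = {"llama3.2", "llama3.2:1b", "llama3.1:8b"}  # example entries only

def normalize(ollama_id: str) -> str:
    # "llama3.2:latest" and "llama3.2" refer to the same model.
    return ollama_id[: -len(":latest")] if ollama_id.endswith(":latest") else ollama_id

def is_tested(ollama_id: str) -> bool:
    # Only an exact (normalized) ID that passed testing counts as "tested";
    # other tags/quants still run, but land in the untested section.
    return normalize(ollama_id) in TESTED_IDS

print(is_tested("llama3.2:latest"))              # True
print(is_tested("llama3.2:3b-instruct-q3_K_M"))  # False
```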

@LumiWasTaken
Author

The quants actually did not appear in the tested section, so your approach was working as intended.
BUT I downloaded a llama 3.2 version that was a specific quant, ergo I hit the issue where it kept triggering the "you need at least one supported model" message.

Regarding the string matching, maybe you are right... maybe there is also a combination of things to check, like generally matching if the IDs match, OR if the name contains "llama3.2" AND "3b", or some weird chains.

Of course this might be an issue for a later "polishing" stage where it could slowly work into the "convenience" features. I don't know if you plan on refactoring the code some day. As I was exploring these issues I slowly got an overview of the structure and found that some files, e.g. the one with the list of AI models, might not be very long-term scalable.

Regarding the JSON / tools, I am not too sure why the models still need tool support, but maybe that's for JSON compatibility, which would be logical.
The JSON "order" itself is somewhat important: as in the "thinking" example above, it severely impacts how one field or the other is produced. Another example would be a "criteria evaluation" where the JSON has criteria 1, 2, 3, 4, 5, etc., each one explaining whether it is met or not, with the "summary" and "verdict" at the end.
If the verdict were produced at the start, the criteria would be generated under the bias that the verdict has already been made, or could even contradict it.
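For example, an illustrative schema (not from Kiln) that keeps the verdict last:

```python
# Illustrative schema only: per-criterion judgments first, verdict last, so the
# verdict is generated after (and conditioned on) the criteria.
evaluation = {
    "criterion_1": "met or not, with a short justification",
    "criterion_2": "met or not, with a short justification",
    "criterion_3": "met or not, with a short justification",
    "summary": "recap of the findings above",
    "verdict": "pass or fail, decided last",
}
```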

Other than that, thank you for the feedback. If I see more issues or things that bug me, I'll make sure to open a separate issue to keep things on-topic.

@LumiWasTaken
Author

If you'd like a bit more in-depth theorizing or experimenting, feel free to let me know; we could chat on Discord or something!

@scosman
Collaborator

scosman commented Dec 5, 2024

Just to wrap this up, I think everything is fixed? Please confirm I'm not missing anything.

  • Quants now work fine. They appear in the untested section, but they are untested, so that's correct.
  • Fixed several bugs, including the zero-supported-model bug and the forward-slash bug (if you can confirm, that would be great).
  • The ml_model_list file is cleaned up. I pulled the code out into two files. It is basically a long config file now, but that's needed for a "just works" UX that isn't a thousand options.

JSON vs tools is going to be a forever problem. Some models are better at one, and others at the other. We'll bake in guidance for the built-in models. When I add the ability to add any model, we'll have an option for this. But there isn't a universal solution, and there will probably be more modes in time. For the Kiln default install I want to focus on ease of use, with lots of options in config files under the covers for power users (not documented yet, but I'll get there).

@LumiWasTaken
Author

Yup works!
