Agent from guided tour runs unrelated sample tasks on execution #106

psschwei · 2025-01-07T19:57:08Z

I was working through the guided tour docs and ran the first example, which asks for a Fibonacci number:

from smolagents import CodeAgent, LiteLLMModel

model = LiteLLMModel(
    model_id="ollama_chat/llama3.2",
    api_base="http://localhost:11434",
    api_key="YOUR_API_KEY"
)

agent = CodeAgent(tools=[], model=model, add_base_tools=True)

agent.run(
    "Could you give me the 118th number in the Fibonacci sequence?",
)

It usually gets the right answer, but instead of returning it, the agent will often go on to answer completely unrelated questions, such as how old the pope is or the population of various Chinese cities. On further review, both of these come from the task examples passed in as part of the system prompt (see here and here), so the agent appears to be running those examples in addition to the user prompt.

I've seen this happen with llama3.1, llama3.2, mistral and granite3.1-dense, so it doesn't seem to be model-specific.

Here's the output of one such run, where the agent returned the pope's age instead of the Fibonacci number (this was using llama3.2):

venv/lib64/python3.11/site-packages/pydantic/_internal/_config.py:345: UserWarning: Valid config keys have changed in V2:
* 'fields' has been removed
  warnings.warn(message, UserWarning)
╭────────────────────────────────────────────────────────────────────────────────────────────── New run ───────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                                                                      │
│ Could you give me the 118th number in the Fibonacci sequence?                                                                                                                                        │
│                                                                                                                                                                                                      │
╰─ LiteLLMModel - ollama_chat/llama3.2 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
╭─ Executing this code: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│    1 def fibonacci(n):                                                                                                                                                                               │
│    2     if n <= 0:                                                                                                                                                                                  │
│    3         return "Input should be positive integer."                                                                                                                                              │
│    4     elif n == 1:                                                                                                                                                                                │
│    5         return 0                                                                                                                                                                                │
│    6     elif n == 2:                                                                                                                                                                                │
│    7         return 1                                                                                                                                                                                │
│    8                                                                                                                                                                                                 │
│    9     fib_sequence = [0, 1]                                                                                                                                                                       │
│   10     while len(fib_sequence) < n:                                                                                                                                                                │
│   11         next_fib = fib_sequence[-1] + fib_sequence[-2]                                                                                                                                          │
│   12         fib_sequence.append(next_fib)                                                                                                                                                           │
│   13                                                                                                                                                                                                 │
│   14     return fib_sequence[-1]                                                                                                                                                                     │
│   15                                                                                                                                                                                                 │
│   16 print(fibonacci(118))                                                                                                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Execution logs:
1264937032042997393488322

Out: None
[Step 0: Duration 13.96 seconds| Input tokens: 42 | Output tokens: 449]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
╭─ Executing this code: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│   1 pope_current_age = 88 ** 0.36                                                                                                                                                                    │
│   2 final_answer(pope_current_age)                                                                                                                                                                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Out - Final answer: 5.012031155582636
[Step 1: Duration 6.26 seconds| Input tokens: 2,090 | Output tokens: 578]
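
For reference, the two numbers in the log above can be checked independently of the agent. The snippet below is a minimal sketch, not part of smolagents: it reimplements the Fibonacci function using the same 1-indexed convention as the generated code in Step 0 (so `fibonacci(1) == 0`, which is an assumption carried over from that code, not the only possible indexing), and confirms that the Step 1 "final answer" is simply `88 ** 0.36`, the arithmetic from the pope example in the system prompt.

```python
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number, 1-indexed so that fibonacci(1) == 0.

    This mirrors the convention of the agent-generated code in Step 0.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    a, b = 0, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


# Values copied from the log above.
step0_printed = 1264937032042997393488322
step1_final_answer = 5.012031155582636

# Step 0 computed the requested number correctly...
print(fibonacci(118) == step0_printed)
# ...but Step 1 returned the pope-example arithmetic instead.
print(abs(88 ** 0.36 - step1_final_answer) < 1e-9)
```

Both checks agree with the log: the correct value was printed in Step 0 (with `Out: None`, since it was never passed to `final_answer`), and the value actually returned is unrelated to the task.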


aymeric-roucher · 2025-01-09T22:39:32Z

Thank you for reporting, @psschwei: I often have this kind of problem with less powerful LLMs.
What we're asking them to do is conceptually quite complicated, so it's no wonder they go astray and just replicate the prompt instructions.

You can check that this is an LLM-related issue and not a framework-related one by switching to a stronger LLM.

IMO anything under 7B at the moment is not really capable of agentic workflows. (That said, being over 7B is no guarantee that a model will work either; it depends on the individual LLM!)
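
One way to run that check is to send the same task to several models and compare each final answer against a known value. The following is a rough sketch under stated assumptions: `compare_models` and `run_with_smolagents` are hypothetical helper names, the second model ID is a placeholder for whatever stronger model is available, and the smolagents calls are the same ones used in the original report. The expected value assumes the 1-indexed convention from the log (first number is 0); a model using a different indexing would return a different, equally defensible number.

```python
from typing import Callable, Dict, Iterable

TASK = "Could you give me the 118th number in the Fibonacci sequence?"
# Value printed in Step 0 of the log in the report.
EXPECTED = 1264937032042997393488322


def compare_models(
    run_task: Callable[[str], object],
    model_ids: Iterable[str],
    expected: object = EXPECTED,
) -> Dict[str, bool]:
    """Run the task once per model and record whether the answer matches.

    Answers are compared as stripped strings, since an agent may return
    an int, a float, or text.
    """
    results = {}
    for model_id in model_ids:
        answer = run_task(model_id)
        results[model_id] = str(answer).strip() == str(expected)
    return results


def run_with_smolagents(model_id: str) -> object:
    """Hypothetical runner reusing the setup from the report."""
    # Imported lazily so the helper above works without smolagents installed.
    from smolagents import CodeAgent, LiteLLMModel

    model = LiteLLMModel(
        model_id=model_id,
        api_base="http://localhost:11434",
        api_key="YOUR_API_KEY",
    )
    agent = CodeAgent(tools=[], model=model, add_base_tools=True)
    return agent.run(TASK)
```

Usage would look like `compare_models(run_with_smolagents, ["ollama_chat/llama3.2", "ollama_chat/<larger-model>"])`. Since the failure is intermittent, a single run per model is only weak evidence; repeating each run a few times gives a better picture.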

psschwei changed the title ~~Agent from guided tour runs additional sample tasks on execution~~ Agent from guided tour runs unrelated sample tasks on execution Jan 7, 2025

aymeric-roucher closed this as completed Jan 9, 2025
