fix(coding-lint): validate generated skill calls correctly#733

Open
Wuodan wants to merge 1 commit into mindcraft-bots:develop from Wuodan:fix/skill-validation-in-generated-code

@Wuodan Wuodan commented Mar 11, 2026

This fixes two problems in validation of generated code. The two problems
canceled each other out for a while, which made the bug hard to spot.

  1. Skill-library method names vs. parsed method names

    • SkillLibrary exposes skills under qualified names such as skills.placeBlock
    • coder.js validated only the bare parsed method name, such as placeBlock
    • so the comparison was made between two incompatible formats
  2. Validation result double-negated in coder.js

    • const missingSkills = skills.filter(skill => !!allDocs[skill]);

With the old code, generated skill calls were therefore not validated
correctly. After correcting only the double-negation, valid calls could then
show up as missing because the underlying comparison still used incompatible
formats.
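
The way the two problems mask each other can be sketched in a few lines. This is a minimal illustration, not the actual coder.js code: the shape of allDocs (doc strings keyed by qualified name), the doc texts, and the name notARealSkill are assumptions made for the example.

```javascript
// Hypothetical stand-in for the skill docs: strings keyed by qualified name.
const allDocs = {
    'skills.placeBlock': 'skills.placeBlock(bot, blockType, x, y, z): places a block',
    'world.getPosition': 'world.getPosition(bot): returns the bot position',
};

// Bare method names as parsed from generated code; the second is not a real skill.
const parsed = ['placeBlock', 'notARealSkill'];

// Old check: every lookup is undefined and !!undefined is false,
// so the filter keeps nothing and no skill is ever reported.
const oldMissing = parsed.filter(skill => !!allDocs[skill]);

// Negation fixed, lookup unchanged: !undefined is true,
// so every call is reported, including the valid one.
const negationOnly = parsed.filter(skill => !allDocs[skill]);

console.log(oldMissing);   // []
console.log(negationOnly); // [ 'placeBlock', 'notARealSkill' ]
```

Under these assumptions neither variant distinguishes the valid call from the invalid one, which is why fixing the negation alone is not enough.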

This change fixes that by validating generated calls against a structured,
namespace-aware function index instead of against raw skill-doc strings.

To reproduce the problem:

  1. Start from the old implementation and first remove the double negation in
    src/agent/coder.js:
         // check function exists
-        const missingSkills = skills.filter(skill => !!allDocs[skill]);
+        const missingSkills = skills.filter(skill => !allDocs[skill]);
         if (missingSkills.length > 0) {
  2. Run mindcraft with allow_insecure_coding=true and a bot profile that has
    a code_model and an embedding model.

  3. Use a task that is likely to trigger code generation. One example is the
    small_church task from
    tasks-demo/construction_tasks/custom/tasks.json, adapted to one bot and
    with explicit wording to force !newAction and avoid resource gathering:

"goal": "Build the structure from the blueprint below by using !newAction. Use only the resources already in your inventory. Do not search for, mine, or craft additional resources unless the blueprint truly requires something missing.",
"conversation": "Use !newAction to build the structure from the blueprint below using only the resources already in your inventory. Do not search for, mine, or craft additional resources unless something is actually missing.",
"timeout": 9007199254740991

Then the logs can show valid generated code like:

async function buildStructure() {
    const blueprint = [
        ['oak_planks', 'oak_planks', 'oak_planks'],
        ['oak_planks', 'oak_planks', 'oak_planks'],
        ['oak_planks', 'oak_planks', 'oak_planks']
    ];

    const position = bot.entity.position;

    for (let i = 0; i < blueprint.length; i++) {
        for (let j = 0; j < blueprint[i].length; j++) {
            const block = blueprint[i][j];
            const x = position.x + j;
            const y = position.y + i + 1;
            const z = position.z;

            await skills.placeBlock(bot, block, x, y, z);
        }
    }
}

await buildStructure();

but still report a validation error like:

{
  "role": "user",
  "content": "SYSTEM: Error: Code lint error:\n#### CODE ERROR INFO ###\nThese functions do not exist.\n### FUNCTIONS NOT FOUND ###\nplaceBlock\nPlease try again."
}

This PR fixes that by keeping namespace information during validation and by
checking generated calls against a structured skill index instead of a doc
string array.
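
One way to keep namespace information while parsing is sketched below. The helper name extractQualifiedCalls and the regex approach are assumptions for illustration; the PR may parse generated code differently, and a regex like this would also match calls that appear inside strings or comments.

```javascript
// Sketch: collect namespace-qualified calls (skills.* / world.*) from generated source.
// The namespace list and helper name are illustrative, not taken from the PR.
function extractQualifiedCalls(source, namespaces = ['skills', 'world']) {
    const pattern = new RegExp(
        `\\b(${namespaces.join('|')})\\.([A-Za-z_$][\\w$]*)\\s*\\(`,
        'g'
    );
    const found = new Set(); // de-duplicates repeated calls, keeps first-seen order
    for (const match of source.matchAll(pattern)) {
        found.add(`${match[1]}.${match[2]}`);
    }
    return [...found];
}

const generated = `
    const position = world.getPosition(bot);
    await skills.placeBlock(bot, 'oak_planks', position.x, position.y + 1, position.z);
    await skills.placeBlock(bot, 'oak_planks', position.x + 1, position.y + 1, position.z);
`;

console.log(extractQualifiedCalls(generated));
// [ 'world.getPosition', 'skills.placeBlock' ]
```

Returning skills.placeBlock rather than placeBlock means the later lookup and any error message both use the same format as the skill library.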

… index

Generated-code lint did not validate referenced `skills.*` / `world.*`
calls correctly.

`coder.js` extracted only method names such as `placeBlock`, while
`SkillLibrary` exposed skill docs as strings keyed by fully qualified names
such as `skills.placeBlock`. That meant lint validation depended on the
shape of the prompt-doc data instead of on a real function index, so the
check was unreliable and could misreport valid generated calls.

Replace that with a structured namespace-aware function index in
`SkillLibrary` and validate parsed calls against it in `coder.js`.
Missing calls are now reported as fully qualified names, prompt doc
selection stays unchanged, and the now-unused `getAllSkillDocs()`
accessor is removed.
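
A structured, namespace-aware index of this kind could look roughly like the sketch below. The function names buildFunctionIndex and findMissingCalls, the Map-of-Sets layout, and the stub skills and world objects are all hypothetical; the actual SkillLibrary implementation in this PR may be organized differently.

```javascript
// Hypothetical stand-ins for the real skills / world modules.
const skills = {
    async placeBlock(bot, blockType, x, y, z) {},
    async goToPosition(bot, x, y, z) {},
};
const world = {
    getPosition(bot) {},
};

// Build an index: namespace name -> Set of function names in that namespace.
function buildFunctionIndex(namespaces) {
    const index = new Map();
    for (const [name, mod] of Object.entries(namespaces)) {
        const fns = Object.keys(mod).filter(key => typeof mod[key] === 'function');
        index.set(name, new Set(fns));
    }
    return index;
}

// Return the qualified calls that the index does not contain.
// Unknown namespaces are treated as missing as well.
function findMissingCalls(calls, index) {
    return calls.filter(call => {
        const [namespace, fn] = call.split('.');
        return !(index.get(namespace)?.has(fn));
    });
}

const index = buildFunctionIndex({ skills, world });
const calls = ['skills.placeBlock', 'world.getPosition', 'skills.flyToMoon'];
console.log(findMissingCalls(calls, index));
// [ 'skills.flyToMoon' ]
```

Because the index is derived from the functions themselves rather than from prompt-doc strings, the lint result no longer depends on how the docs are formatted, and missing calls come back already fully qualified.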