feat: add append_diagram tool and improve truncation handling (#252)

* feat: add append_diagram tool for truncation continuation

When LLM output hits maxOutputTokens mid-generation, instead of
failing with an error loop, the system now:

1. Detects truncation (missing </root> in XML)
2. Stores partial XML and tells LLM to use new append_diagram tool
3. LLM continues generating from where it stopped
4. Fragments are accumulated until XML is complete
5. Server limits to 5 steps via stepCountIs(5)

Key changes:
- Add append_diagram tool definition in route.ts
- Add append_diagram handler in chat-panel.tsx
- Track continuation mode separately from error mode
- Continuation mode has unlimited retries (not counted against limit)
- Error mode still limited to MAX_AUTO_RETRY_COUNT (1)
- Update system prompts to document append_diagram tool

* fix: show friendly message and yellow badge for truncated output

- Add yellow 'Truncated' badge in UI instead of red 'Error' when XML is incomplete
- Show friendly error message for toolUse.input is invalid errors
- Built on top of append_diagram continuation feature

* refactor: remove debug logs and simplify truncation state

- Remove all debug console.log statements
- Remove isContinuationModeRef, derive from partialXmlRef.current.length > 0

* docs: fix append_diagram instructions for consistency

- Change 'Do NOT include' to 'Do NOT start with' (clearer intent)
- Add <mxCell id="0"> to prohibited start patterns
- Change 'closing tags </root></mxGraphModel>' to just '</root>' (wrapWithMxFile handles the rest)
This commit is contained in:
Dayuan Jiang
2025-12-14 12:34:34 +09:00
committed by GitHub
parent b33e09be05
commit 66bd0e5493
6 changed files with 262 additions and 65 deletions

View File

@@ -3,10 +3,12 @@ import {
convertToModelMessages,
createUIMessageStream,
createUIMessageStreamResponse,
InvalidToolInputError,
LoadAPIKeyError,
stepCountIs,
streamText,
} from "ai"
import { jsonrepair } from "jsonrepair"
import { z } from "zod"
import { getAIModel, supportsPromptCaching } from "@/lib/ai-providers"
import { findCachedResponse } from "@/lib/cached-responses"
@@ -320,6 +322,31 @@ ${userInputText}
maxOutputTokens: parseInt(process.env.MAX_OUTPUT_TOKENS, 10),
}),
stopWhen: stepCountIs(5),
// Repair truncated tool calls when maxOutputTokens is reached mid-JSON
experimental_repairToolCall: async ({ toolCall, error }) => {
// Only attempt repair for invalid tool input (broken JSON from truncation)
if (
error instanceof InvalidToolInputError ||
error.name === "AI_InvalidToolInputError"
) {
try {
// Use jsonrepair to fix truncated JSON
const repairedInput = jsonrepair(toolCall.input)
console.log(
`[repairToolCall] Repaired truncated JSON for tool: ${toolCall.toolName}`,
)
return { ...toolCall, input: repairedInput }
} catch (repairError) {
console.warn(
`[repairToolCall] Failed to repair JSON for tool: ${toolCall.toolName}`,
repairError,
)
return null
}
}
// Don't attempt to repair other errors (like NoSuchToolError)
return null
},
messages: allMessages,
...(providerOptions && { providerOptions }), // This now includes all reasoning configs
...(headers && { headers }),
@@ -411,6 +438,26 @@ IMPORTANT: Keep edits concise:
),
}),
},
append_diagram: {
description: `Continue generating diagram XML when previous display_diagram output was truncated due to length limits.
WHEN TO USE: Only call this tool after display_diagram was truncated (you'll see an error message about truncation).
CRITICAL INSTRUCTIONS:
1. Do NOT start with <mxGraphModel>, <root>, or <mxCell id="0"> - they already exist in the partial
2. Continue from EXACTLY where your previous output stopped
3. End with the closing </root> tag to complete the diagram
4. If still truncated, call append_diagram again with the next fragment
Example: If previous output ended with '<mxCell id="x" style="rounded=1', continue with ';" vertex="1">...' and complete the remaining elements.`,
inputSchema: z.object({
xml: z
.string()
.describe(
"Continuation XML fragment to append (NO wrapper tags)",
),
}),
},
},
...(process.env.TEMPERATURE !== undefined && {
temperature: parseFloat(process.env.TEMPERATURE),
@@ -435,6 +482,7 @@ IMPORTANT: Keep edits concise:
return {
inputTokens: totalInputTokens,
outputTokens: usage.outputTokens ?? 0,
finishReason: (part as any).finishReason,
}
}
return undefined