feat: add append_diagram tool for truncation continuation

When LLM output hits maxOutputTokens mid-generation, instead of failing with an error loop, the system now:

1. Detects truncation (missing </root> in XML)
2. Stores partial XML and tells LLM to use new append_diagram tool
3. LLM continues generating from where it stopped
4. Fragments are accumulated until XML is complete
5. Server limits to 5 steps via stepCountIs(5)

Key changes:
- Add append_diagram tool definition in route.ts
- Add append_diagram handler in chat-panel.tsx
- Track continuation mode separately from error mode
- Continuation mode has unlimited retries (not counted against limit)
- Error mode still limited to MAX_AUTO_RETRY_COUNT (1)
- Update system prompts to document append_diagram tool
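
A minimal sketch of the detect/store/append cycle described in this message may help reviewers. It is not the code from chat-panel.tsx: the names `isTruncated`, `ContinuationState`, `startContinuation`, and `appendFragment` are hypothetical, and the only behavior taken from the commit is the "missing </root>" truncation check and the idea of concatenating fragments until the XML is complete.

```typescript
// Hypothetical helpers illustrating the continuation flow; not the actual handler.

/** Truncation heuristic from the commit message: the XML never reached </root>. */
function isTruncated(xml: string): boolean {
    return !xml.includes("</root>")
}

/** Accumulated diagram XML plus a flag saying whether it is now complete. */
interface ContinuationState {
    partialXml: string
    complete: boolean
}

/** Store the truncated display_diagram output as the starting point. */
function startContinuation(truncatedXml: string): ContinuationState {
    return { partialXml: truncatedXml, complete: !isTruncated(truncatedXml) }
}

/** Append one append_diagram fragment and re-check for completeness. */
function appendFragment(
    state: ContinuationState,
    fragment: string,
): ContinuationState {
    const partialXml = state.partialXml + fragment
    return { partialXml, complete: !isTruncated(partialXml) }
}

// Usage: the first output stopped mid-attribute, as in the tool description's example.
let state = startContinuation(
    '<mxGraphModel><root><mxCell id="x" style="rounded=1',
)
state = appendFragment(state, ';" vertex="1"/>')
state = appendFragment(state, "</root></mxGraphModel>")
console.log(state.complete, state.partialXml.length)
```

Keeping the state immutable (each call returns a new object) is a choice made for the sketch only; the real handler may well track the partial XML in a ref or component state instead.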
2026-01-02 22:32:27 +08:00 · 2025-12-14 09:38:47 +09:00
parent b33e09be05
commit 62e07f5f9c
5 changed files with 346 additions and 39 deletions
--- a/app/api/chat/route.ts
+++ b/app/api/chat/route.ts
@@ -3,10 +3,12 @@ import {
    convertToModelMessages,
    createUIMessageStream,
    createUIMessageStreamResponse,
+    InvalidToolInputError,
    LoadAPIKeyError,
    stepCountIs,
    streamText,
 } from "ai"
+import { jsonrepair } from "jsonrepair"
 import { z } from "zod"
 import { getAIModel, supportsPromptCaching } from "@/lib/ai-providers"
 import { findCachedResponse } from "@/lib/cached-responses"
@@ -320,6 +322,31 @@ ${userInputText}
            maxOutputTokens: parseInt(process.env.MAX_OUTPUT_TOKENS, 10),
        }),
        stopWhen: stepCountIs(5),
+        // Repair truncated tool calls when maxOutputTokens is reached mid-JSON
+        experimental_repairToolCall: async ({ toolCall, error }) => {
+            // Only attempt repair for invalid tool input (broken JSON from truncation)
+            if (
+                error instanceof InvalidToolInputError ||
+                error.name === "AI_InvalidToolInputError"
+            ) {
+                try {
+                    // Use jsonrepair to fix truncated JSON
+                    const repairedInput = jsonrepair(toolCall.input)
+                    console.log(
+                        `[repairToolCall] Repaired truncated JSON for tool: ${toolCall.toolName}`,
+                    )
+                    return { ...toolCall, input: repairedInput }
+                } catch (repairError) {
+                    console.warn(
+                        `[repairToolCall] Failed to repair JSON for tool: ${toolCall.toolName}`,
+                        repairError,
+                    )
+                    return null
+                }
+            }
+            // Don't attempt to repair other errors (like NoSuchToolError)
+            return null
+        },
        messages: allMessages,
        ...(providerOptions && { providerOptions }), // This now includes all reasoning configs
        ...(headers && { headers }),
@@ -411,6 +438,26 @@ IMPORTANT: Keep edits concise:
                        ),
                }),
            },
+            append_diagram: {
+                description: `Continue generating diagram XML when previous display_diagram output was truncated due to length limits.
+
+WHEN TO USE: Only call this tool after display_diagram was truncated (you'll see an error message about truncation).
+
+CRITICAL INSTRUCTIONS:
+1. Do NOT include <mxGraphModel> or <root> tags - they already exist in the partial
+2. Continue from EXACTLY where your previous output stopped
+3. Generate the remaining XML including closing tags </root></mxGraphModel>
+4. If still truncated, call append_diagram again with the next fragment
+
+Example: If previous output ended with '<mxCell id="x" style="rounded=1', continue with ';" vertex="1">...' and complete the remaining elements.`,
+                inputSchema: z.object({
+                    xml: z
+                        .string()
+                        .describe(
+                            "Continuation XML fragment to append (NO wrapper tags)",
+                        ),
+                }),
+            },
        },
        ...(process.env.TEMPERATURE !== undefined && {
            temperature: parseFloat(process.env.TEMPERATURE),
@@ -435,6 +482,7 @@ IMPORTANT: Keep edits concise:
                return {
                    inputTokens: totalInputTokens,
                    outputTokens: usage.outputTokens ?? 0,
+                    finishReason: (part as any).finishReason,
                }
            }
            return undefined