refactor: simplify LLM XML format to output bare mxCells only (#254)

* refactor: simplify LLM XML format to output bare mxCells only

- Update wrapWithMxFile() to always add root cells (id=0, id=1) automatically
- LLM now generates only mxCell elements starting from id=2 (no wrapper tags)
- Update system prompts and tool descriptions with new format instructions
- Update cached responses to remove root cells and wrapper tags
- Update truncation detection to check for complete mxCell endings
- Update documentation in xml_guide.md

* fix: address PR review issues for XML format refactor

- Fix critical bug: inconsistent truncation check using old </root> pattern
- Fix stale error message referencing </root> tag
- Add isMxCellXmlComplete() helper for consistent truncation detection
- Improve regex patterns to handle any attribute order in root cells
- Update wrapWithMxFile JSDoc to document root cell removal behavior

* fix: handle non-self-closing root cells in wrapWithMxFile regex
This commit is contained in:
Dayuan Jiang
2025-12-14 14:04:44 +09:00
committed by GitHub
parent 2e24071539
commit 0851b32b67
7 changed files with 143 additions and 144 deletions

View File

@@ -367,20 +367,17 @@ ${userInputText}
tools: {
// Client-side tool that will be executed on the client
display_diagram: {
description: `Display a diagram on draw.io. Pass the XML content inside <root> tags.
description: `Display a diagram on draw.io. Pass ONLY the mxCell elements - wrapper tags and root cells are added automatically.
VALIDATION RULES (XML will be rejected if violated):
1. All mxCell elements must be DIRECT children of <root> - never nested
2. Every mxCell needs a unique id
3. Every mxCell (except id="0") needs a valid parent attribute
4. Edge source/target must reference existing cell IDs
5. Escape special chars in values: &lt; &gt; &amp; &quot;
6. Always start with: <mxCell id="0"/><mxCell id="1" parent="0"/>
1. Generate ONLY mxCell elements - NO wrapper tags (<mxfile>, <mxGraphModel>, <root>)
2. Do NOT include root cells (id="0" or id="1") - they are added automatically
3. All mxCell elements must be siblings - never nested
4. Every mxCell needs a unique id (start from "2")
5. Every mxCell needs a valid parent attribute (use "1" for top-level)
6. Escape special chars in values: &lt; &gt; &amp; &quot;
Example with swimlanes and edges (note: all mxCells are siblings):
<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
Example (generate ONLY this - no wrapper tags):
<mxCell id="lane1" value="Frontend" style="swimlane;" vertex="1" parent="1">
<mxGeometry x="40" y="40" width="200" height="200" as="geometry"/>
</mxCell>
@@ -396,7 +393,6 @@ Example with swimlanes and edges (note: all mxCells are siblings):
<mxCell id="edge1" style="edgeStyle=orthogonalEdgeStyle;endArrow=classic;" edge="1" parent="1" source="step1" target="step2">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
</root>
Notes:
- For AWS diagrams, use **AWS 2025 icons**.
@@ -444,9 +440,9 @@ IMPORTANT: Keep edits concise:
WHEN TO USE: Only call this tool after display_diagram was truncated (you'll see an error message about truncation).
CRITICAL INSTRUCTIONS:
1. Do NOT start with <mxGraphModel>, <root>, or <mxCell id="0"> - they already exist in the partial
1. Do NOT include any wrapper tags - just continue the mxCell elements
2. Continue from EXACTLY where your previous output stopped
3. End with the closing </root> tag to complete the diagram
3. Complete the remaining mxCell elements
4. If still truncated, call append_diagram again with the next fragment
Example: If previous output ended with '<mxCell id="x" style="rounded=1', continue with ';" vertex="1">...' and complete the remaining elements.`,

View File

@@ -81,16 +81,15 @@ Contains the actual diagram data.
## Root Cell Container: `<root>`
Contains all the cells in the diagram.
Contains all the cells in the diagram. **Note:** When generating diagrams, you only need to provide the mxCell elements - the root container and root cells (id="0", id="1") are added automatically.
**Example:**
**Internal structure (auto-generated):**
```xml
<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
<!-- Other cells go here -->
<mxCell id="0"/> <!-- Auto-added -->
<mxCell id="1" parent="0"/> <!-- Auto-added -->
<!-- Your mxCell elements go here (start from id="2") -->
</root>
```
@@ -203,15 +202,15 @@ Draw.io files contain two special cells that are always present:
1. **Root Cell** (id = "0"): The parent of all cells
2. **Default Parent Cell** (id = "1", parent = "0"): The default layer and parent for most cells
## Tips for Manually Creating Draw.io XML
## Tips for Creating Draw.io XML
1. Start with the basic structure (`mxfile`, `diagram`, `mxGraphModel`, `root`)
2. Always include the two special cells (id = "0" and id = "1")
1. **Generate ONLY mxCell elements** - wrapper tags and root cells (id="0", id="1") are added automatically
2. Start IDs from "2" (id="0" and id="1" are reserved for root cells)
3. Assign unique and sequential IDs to all cells
4. Define parent relationships correctly
4. Define parent relationships correctly (use parent="1" for top-level shapes)
5. Use `mxGeometry` elements to position shapes
6. For connectors, specify `source` and `target` attributes
7. **CRITICAL: All mxCell elements must be DIRECT children of `<root>`. NEVER nest mxCell inside another mxCell.**
7. **CRITICAL: All mxCell elements must be siblings. NEVER nest mxCell inside another mxCell.**
## Common Patterns

View File

@@ -29,7 +29,12 @@ import {
ReasoningTrigger,
} from "@/components/ai-elements/reasoning"
import { ScrollArea } from "@/components/ui/scroll-area"
import { convertToLegalXml, replaceNodes, validateAndFixXml } from "@/lib/utils"
import {
convertToLegalXml,
isMxCellXmlComplete,
replaceNodes,
validateAndFixXml,
} from "@/lib/utils"
import ExamplePanel from "./chat-example-panel"
import { CodeBlock } from "./code-block"
@@ -457,8 +462,7 @@ export function ChatMessageDisplay({
const isTruncated =
(toolName === "display_diagram" ||
toolName === "append_diagram") &&
(!input?.xml ||
!input.xml.includes("</root>"))
!isMxCellXmlComplete(input?.xml)
return isTruncated ? (
<span className="text-xs font-medium text-yellow-600 bg-yellow-50 px-2 py-0.5 rounded-full">
Truncated
@@ -507,7 +511,7 @@ export function ChatMessageDisplay({
const isTruncated =
(toolName === "display_diagram" ||
toolName === "append_diagram") &&
(!input?.xml || !input.xml.includes("</root>"))
!isMxCellXmlComplete(input?.xml)
return (
<div
className={`px-4 py-3 border-t border-border/40 text-sm ${isTruncated ? "text-yellow-600" : "text-red-600"}`}

View File

@@ -26,7 +26,7 @@ import { findCachedResponse } from "@/lib/cached-responses"
import { isPdfFile, isTextFile } from "@/lib/pdf-utils"
import { type FileData, useFileProcessor } from "@/lib/use-file-processor"
import { useQuotaManager } from "@/lib/use-quota-manager"
import { formatXML, wrapWithMxFile } from "@/lib/utils"
import { formatXML, isMxCellXmlComplete, wrapWithMxFile } from "@/lib/utils"
import { ChatMessageDisplay } from "./chat-message-display"
// localStorage keys for persistence
@@ -222,9 +222,8 @@ export default function ChatPanel({
if (toolCall.toolName === "display_diagram") {
const { xml } = toolCall.input as { xml: string }
// Check if XML is truncated (missing </root> indicates incomplete output)
const isTruncated =
!xml.includes("</root>") && !xml.trim().endsWith("/>")
// Check if XML is truncated (incomplete mxCell indicates truncated output)
const isTruncated = !isMxCellXmlComplete(xml)
if (isTruncated) {
// Store the partial XML for continuation via append_diagram
@@ -244,9 +243,9 @@ ${partialEnding}
\`\`\`
NEXT STEP: Call append_diagram with the continuation XML.
- Do NOT start with <mxGraphModel>, <root>, or <mxCell id="0"> (they already exist)
- Do NOT include wrapper tags or root cells (id="0", id="1")
- Start from EXACTLY where you stopped
- End with the closing </root> tag to complete the diagram`,
- Complete all remaining mxCell elements`,
})
return
}
@@ -376,17 +375,21 @@ Please retry with an adjusted search pattern or use display_diagram if retries a
const { xml } = toolCall.input as { xml: string }
// Detect if LLM incorrectly started fresh instead of continuing
// LLM should only output bare mxCells now, so wrapper tags indicate error
const trimmed = xml.trim()
const isFreshStart =
xml.trim().startsWith("<mxGraphModel") ||
xml.trim().startsWith("<root") ||
xml.trim().startsWith('<mxCell id="0"')
trimmed.startsWith("<mxGraphModel") ||
trimmed.startsWith("<root") ||
trimmed.startsWith("<mxfile") ||
trimmed.startsWith('<mxCell id="0"') ||
trimmed.startsWith('<mxCell id="1"')
if (isFreshStart) {
addToolOutput({
tool: "append_diagram",
toolCallId: toolCall.toolCallId,
state: "output-error",
errorText: `ERROR: You started fresh with wrapper tags. Do NOT include <mxGraphModel>, <root>, or <mxCell id="0">.
errorText: `ERROR: You started fresh with wrapper tags. Do NOT include wrapper tags or root cells (id="0", id="1").
Continue from EXACTLY where the partial ended:
\`\`\`
@@ -401,8 +404,8 @@ Start your continuation with the NEXT character after where it stopped.`,
// Append to accumulated XML
partialXmlRef.current += xml
// Check if XML is now complete
const isComplete = partialXmlRef.current.includes("</root>")
// Check if XML is now complete (last mxCell is complete)
const isComplete = isMxCellXmlComplete(partialXmlRef.current)
if (isComplete) {
// Wrap and display the complete diagram
@@ -439,7 +442,7 @@ Please use display_diagram with corrected XML.`,
tool: "append_diagram",
toolCallId: toolCall.toolCallId,
state: "output-error",
errorText: `XML still incomplete (missing </root>). Call append_diagram again to continue.
errorText: `XML still incomplete (mxCell not closed). Call append_diagram again to continue.
Current ending:
\`\`\`

View File

@@ -9,12 +9,7 @@ export const CACHED_EXAMPLE_RESPONSES: CachedResponse[] = [
promptText:
"Give me a **animated connector** diagram of transformer's architecture",
hasImage: false,
xml: `<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
<mxCell id="title" value="Transformer Architecture" style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=20;fontStyle=1;" vertex="1" parent="1">
xml: `<mxCell id="title" value="Transformer Architecture" style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=20;fontStyle=1;" vertex="1" parent="1">
<mxGeometry x="300" y="20" width="250" height="30" as="geometry"/>
</mxCell>
@@ -254,18 +249,12 @@ export const CACHED_EXAMPLE_RESPONSES: CachedResponse[] = [
<mxCell id="output_label" value="Outputs&#xa;(shifted right)" style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=12;fontStyle=1;" vertex="1" parent="1">
<mxGeometry x="660" y="530" width="100" height="30" as="geometry"/>
</mxCell>
</root>`,
</mxCell>`,
},
{
promptText: "Replicate this in aws style",
hasImage: true,
xml: `<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
<mxCell id="2" value="AWS" style="sketch=0;outlineConnect=0;gradientColor=none;html=1;whiteSpace=wrap;fontSize=12;fontStyle=0;container=1;pointerEvents=0;collapsible=0;recursiveResize=0;shape=mxgraph.aws4.group;grIcon=mxgraph.aws4.group_aws_cloud;strokeColor=#232F3E;fillColor=none;verticalAlign=top;align=left;spacingLeft=30;fontColor=#232F3E;dashed=0;rounded=1;arcSize=5;" vertex="1" parent="1">
xml: `<mxCell id="2" value="AWS" style="sketch=0;outlineConnect=0;gradientColor=none;html=1;whiteSpace=wrap;fontSize=12;fontStyle=0;container=1;pointerEvents=0;collapsible=0;recursiveResize=0;shape=mxgraph.aws4.group;grIcon=mxgraph.aws4.group_aws_cloud;strokeColor=#232F3E;fillColor=none;verticalAlign=top;align=left;spacingLeft=30;fontColor=#232F3E;dashed=0;rounded=1;arcSize=5;" vertex="1" parent="1">
<mxGeometry x="340" y="40" width="880" height="520" as="geometry"/>
</mxCell>
@@ -324,18 +313,12 @@ export const CACHED_EXAMPLE_RESPONSES: CachedResponse[] = [
<mxPoint x="700" y="350" as="sourcePoint"/>
<mxPoint x="750" y="300" as="targetPoint"/>
</mxGeometry>
</mxCell>
</root>`,
</mxCell>`,
},
{
promptText: "Replicate this flowchart.",
hasImage: true,
xml: `<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
<mxCell id="2" value="Lamp doesn't work" style="rounded=1;whiteSpace=wrap;html=1;fillColor=#ffcccc;strokeColor=#000000;strokeWidth=2;fontSize=18;fontStyle=0;" vertex="1" parent="1">
xml: `<mxCell id="2" value="Lamp doesn't work" style="rounded=1;whiteSpace=wrap;html=1;fillColor=#ffcccc;strokeColor=#000000;strokeWidth=2;fontSize=18;fontStyle=0;" vertex="1" parent="1">
<mxGeometry x="140" y="40" width="180" height="60" as="geometry"/>
</mxCell>
@@ -391,16 +374,12 @@ export const CACHED_EXAMPLE_RESPONSES: CachedResponse[] = [
<mxCell id="12" value="Repair lamp" style="rounded=1;whiteSpace=wrap;html=1;fillColor=#99ff99;strokeColor=#000000;strokeWidth=2;fontSize=18;fontStyle=0;" vertex="1" parent="1">
<mxGeometry x="130" y="650" width="200" height="60" as="geometry"/>
</mxCell>
</root>`,
</mxCell>`,
},
{
promptText: "Summarize this paper as a diagram",
hasImage: true,
xml: ` <root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="title_bg" parent="1"
xml: `<mxCell id="title_bg" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#1a237e;strokeColor=none;arcSize=8;"
value="" vertex="1">
<mxGeometry height="80" width="720" x="40" y="20" as="geometry" />
@@ -751,18 +730,12 @@ export const CACHED_EXAMPLE_RESPONSES: CachedResponse[] = [
value="Foundational technique for modern LLM reasoning - inspired many follow-up works including Self-Consistency, Tree-of-Thought, etc."
vertex="1">
<mxGeometry height="55" width="230" x="530" y="600" as="geometry" />
</mxCell>
</root>`,
</mxCell>`,
},
{
promptText: "Draw a cat for me",
hasImage: false,
xml: `<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
<mxCell id="2" value="" style="ellipse;whiteSpace=wrap;html=1;aspect=fixed;fillColor=#FFE6CC;strokeColor=#000000;strokeWidth=2;" vertex="1" parent="1">
xml: `<mxCell id="2" value="" style="ellipse;whiteSpace=wrap;html=1;aspect=fixed;fillColor=#FFE6CC;strokeColor=#000000;strokeWidth=2;" vertex="1" parent="1">
<mxGeometry x="300" y="150" width="120" height="120" as="geometry"/>
</mxCell>
@@ -902,9 +875,7 @@ export const CACHED_EXAMPLE_RESPONSES: CachedResponse[] = [
<mxPoint x="235" y="290"/>
</Array>
</mxGeometry>
</mxCell>
</root>`,
</mxCell>`,
},
]

View File

@@ -104,22 +104,21 @@ When using edit_diagram tool:
## Draw.io XML Structure Reference
Basic structure:
**IMPORTANT:** You only generate the mxCell elements. The wrapper structure and root cells (id="0", id="1") are added automatically.
Example - generate ONLY this:
\`\`\`xml
<mxGraphModel>
<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
</root>
</mxGraphModel>
<mxCell id="2" value="Label" style="rounded=1;" vertex="1" parent="1">
<mxGeometry x="100" y="100" width="120" height="60" as="geometry"/>
</mxCell>
\`\`\`
Note: All other mxCell elements go as siblings after id="1".
CRITICAL RULES:
1. Always include the two root cells: <mxCell id="0"/> and <mxCell id="1" parent="0"/>
2. ALL mxCell elements must be DIRECT children of <root> - NEVER nest mxCell inside another mxCell
3. Use unique sequential IDs for all cells (start from "2" for user content)
4. Set parent="1" for top-level shapes, or parent="<container-id>" for grouped elements
1. Generate ONLY mxCell elements - NO wrapper tags (<mxfile>, <mxGraphModel>, <root>)
2. Do NOT include root cells (id="0" or id="1") - they are added automatically
3. ALL mxCell elements must be siblings - NEVER nest mxCell inside another mxCell
4. Use unique sequential IDs starting from "2"
5. Set parent="1" for top-level shapes, or parent="<container-id>" for grouped elements
Shape (vertex) example:
\`\`\`xml
@@ -151,18 +150,15 @@ const EXTENDED_ADDITIONS = `
### display_diagram Details
**VALIDATION RULES** (XML will be rejected if violated):
1. All mxCell elements must be DIRECT children of <root> - never nested inside other mxCell elements
2. Every mxCell needs a unique id attribute
3. Every mxCell (except id="0") needs a valid parent attribute referencing an existing cell
4. Edge source/target attributes must reference existing cell IDs
5. Escape special characters in values: &lt; for <, &gt; for >, &amp; for &, &quot; for "
6. Always start with the two root cells: <mxCell id="0"/><mxCell id="1" parent="0"/>
1. Generate ONLY mxCell elements - wrapper tags and root cells are added automatically
2. All mxCell elements must be siblings - never nested inside other mxCell elements
3. Every mxCell needs a unique id attribute (start from "2")
4. Every mxCell needs a valid parent attribute (use "1" for top-level, or container-id for grouped)
5. Edge source/target attributes must reference existing cell IDs
6. Escape special characters in values: &lt; for <, &gt; for >, &amp; for &, &quot; for "
**Example with swimlanes and edges** (note: all mxCells are siblings under <root>):
**Example with swimlanes and edges** (generate ONLY this - no wrapper tags):
\`\`\`xml
<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
<mxCell id="lane1" value="Frontend" style="swimlane;" vertex="1" parent="1">
<mxGeometry x="40" y="40" width="200" height="200" as="geometry"/>
</mxCell>
@@ -178,7 +174,6 @@ const EXTENDED_ADDITIONS = `
<mxCell id="edge1" style="edgeStyle=orthogonalEdgeStyle;endArrow=classic;" edge="1" parent="1" source="step1" target="step2">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
</root>
\`\`\`
### append_diagram Details
@@ -186,9 +181,9 @@ const EXTENDED_ADDITIONS = `
**WHEN TO USE:** Only call this tool when display_diagram output was truncated (you'll see an error message about truncation).
**CRITICAL RULES:**
1. Do NOT start with <mxGraphModel>, <root>, or <mxCell id="0"> - they already exist in the partial
1. Do NOT include any wrapper tags - just continue the mxCell elements
2. Continue from EXACTLY where your previous output stopped
3. End with the closing </root> tag to complete the diagram
3. Complete the remaining mxCell elements
4. If still truncated, call append_diagram again with the next fragment
**Example:** If previous output ended with \`<mxCell id="x" style="rounded=1\`, continue with \`;" vertex="1">...\` and complete the remaining elements.

View File

@@ -29,6 +29,22 @@ const STRUCTURAL_ATTRS = [
/** Valid XML entity names */
const VALID_ENTITIES = new Set(["lt", "gt", "amp", "quot", "apos"])
// ============================================================================
// mxCell XML Helpers
// ============================================================================
/**
* Check if mxCell XML output is complete (not truncated).
* Complete XML ends with a self-closing tag (/>) or closing mxCell tag.
* @param xml - The XML string to check (can be undefined/null)
* @returns true if XML appears complete, false if truncated or empty
*/
export function isMxCellXmlComplete(xml: string | undefined | null): boolean {
const trimmed = xml?.trim() || ""
if (!trimmed) return false
return trimmed.endsWith("/>") || trimmed.endsWith("</mxCell>")
}
// ============================================================================
// XML Parsing Helpers
// ============================================================================
@@ -197,13 +213,17 @@ export function convertToLegalXml(xmlString: string): string {
/**
* Wrap XML content with the full mxfile structure required by draw.io.
* Handles cases where XML is just <root>, <mxGraphModel>, or already has <mxfile>.
* @param xml - The XML string (may be partial or complete)
* @returns Full mxfile-wrapped XML string
* Always adds root cells (id="0" and id="1") automatically.
* If input already contains root cells, they are removed to avoid duplication.
* LLM should only generate mxCell elements starting from id="2".
* @param xml - The XML string (bare mxCells, <root>, <mxGraphModel>, or full <mxfile>)
* @returns Full mxfile-wrapped XML string with root cells included
*/
export function wrapWithMxFile(xml: string): string {
if (!xml) {
return `<mxfile><diagram name="Page-1" id="page-1"><mxGraphModel><root><mxCell id="0"/><mxCell id="1" parent="0"/></root></mxGraphModel></diagram></mxfile>`
const ROOT_CELLS = '<mxCell id="0"/><mxCell id="1" parent="0"/>'
if (!xml || !xml.trim()) {
return `<mxfile><diagram name="Page-1" id="page-1"><mxGraphModel><root>${ROOT_CELLS}</root></mxGraphModel></diagram></mxfile>`
}
// Already has full structure
@@ -216,9 +236,20 @@ export function wrapWithMxFile(xml: string): string {
return `<mxfile><diagram name="Page-1" id="page-1">${xml}</diagram></mxfile>`
}
// Just <root> content - extract inner content and wrap fully
const rootContent = xml.replace(/<\/?root>/g, "").trim()
return `<mxfile><diagram name="Page-1" id="page-1"><mxGraphModel><root>${rootContent}</root></mxGraphModel></diagram></mxfile>`
// Has <root> wrapper - extract inner content
let content = xml
if (xml.includes("<root>")) {
content = xml.replace(/<\/?root>/g, "").trim()
}
// Remove any existing root cells from content (LLM shouldn't include them, but handle it gracefully)
// Use flexible patterns that match both self-closing (/>) and non-self-closing (></mxCell>) formats
content = content
.replace(/<mxCell[^>]*\bid=["']0["'][^>]*(?:\/>|><\/mxCell>)/g, "")
.replace(/<mxCell[^>]*\bid=["']1["'][^>]*(?:\/>|><\/mxCell>)/g, "")
.trim()
return `<mxfile><diagram name="Page-1" id="page-1"><mxGraphModel><root>${ROOT_CELLS}${content}</root></mxGraphModel></diagram></mxfile>`
}
/**