Mirror of https://github.com/DayuanJiang/next-ai-draw-io.git (synced 2026-01-02 22:32:27 +08:00)

Compare commits: v0.4.6 ... fix/quota-
12 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 29121f5e78 | |
| | 7de192e1fa | |
| | 97ae9395cd | |
| | 5ec05eb100 | |
| | 9aec7eda79 | |
| | a0fbc0ad33 | |
| | 0385c45a10 | |
| | 5262b7bfb2 | |
| | 8cb7494d16 | |
| | 98625dd72a | |
| | b5734aa5e1 | |
| | 87cdc53665 | |
@@ -117,9 +117,9 @@ export default function AboutCN() {
        (TPS/TPM)。一旦超限,系统就会暂停,导致请求失败。
    </p>
    <p>
-        由于使用量过高,我已将模型从 Claude 更换为{" "}
+        由于使用量过高,我已将模型从 Opus 4.5 更换为{" "}
        <span className="font-semibold text-amber-700">
-            minimax-m2
+            Haiku 4.5
        </span>
        ,以降低成本。
    </p>
@@ -126,9 +126,9 @@ export default function AboutJA() {
    </p>
    <p>
        利用量の増加に伴い、コスト削減のためモデルを
-        Claude から{" "}
+        Opus 4.5 から{" "}
        <span className="font-semibold text-amber-700">
-            minimax-m2
+            Haiku 4.5
        </span>{" "}
        に変更しました。
    </p>
@@ -129,9 +129,9 @@ export default function About() {
    </p>
    <p>
        Due to the high usage, I have changed the
-        model from Claude to{" "}
+        model from Opus 4.5 to{" "}
        <span className="font-semibold text-amber-700">
-            minimax-m2
+            Haiku 4.5
        </span>
        , which is more cost-effective.
    </p>
@@ -14,6 +14,11 @@ import path from "path"
import { z } from "zod"
import { getAIModel, supportsPromptCaching } from "@/lib/ai-providers"
import { findCachedResponse } from "@/lib/cached-responses"
+import {
+    checkAndIncrementRequest,
+    isQuotaEnabled,
+    recordTokenUsage,
+} from "@/lib/dynamo-quota-manager"
import {
    getTelemetryConfig,
    setTraceInput,
@@ -162,9 +167,13 @@ async function handleChatRequest(req: Request): Promise<Response> {

    const { messages, xml, previousXml, sessionId } = await req.json()

-    // Get user IP for Langfuse tracking
+    // Get user IP for Langfuse tracking (hashed for privacy)
    const forwardedFor = req.headers.get("x-forwarded-for")
-    const userId = forwardedFor?.split(",")[0]?.trim() || "anonymous"
+    const rawIp = forwardedFor?.split(",")[0]?.trim() || "anonymous"
+    const userId =
+        rawIp === "anonymous"
+            ? rawIp
+            : `user-${Buffer.from(rawIp).toString("base64url").slice(0, 8)}`

    // Validate sessionId for Langfuse (must be string, max 200 chars)
    const validSessionId =
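One note on the derivation above: `Buffer.from(rawIp).toString("base64url").slice(0, 8)` is a truncated base64url encoding rather than a cryptographic hash, so "hashed for privacy" is closer to "pseudonymized". A minimal keyed-hash alternative, as a sketch only, using Node's built-in `crypto`; `IP_HASH_SALT` is a hypothetical env var this repo does not define:

```ts
import { createHash } from "crypto"

// Hedged alternative sketch: a salted SHA-256 digest instead of truncated base64url.
// IP_HASH_SALT is a hypothetical env var (not defined by this repo).
function pseudonymizeIp(rawIp: string): string {
    if (rawIp === "anonymous") return rawIp
    const salt = process.env.IP_HASH_SALT || ""
    const digest = createHash("sha256").update(`${salt}${rawIp}`).digest("base64url")
    return `user-${digest.slice(0, 8)}`
}
```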
@@ -173,9 +182,12 @@ async function handleChatRequest(req: Request): Promise<Response> {
            : undefined

    // Extract user input text for Langfuse trace
-    const lastMessage = messages[messages.length - 1]
+    // Find the last USER message, not just the last message (which could be assistant in multi-step tool flows)
+    const lastUserMessage = [...messages]
+        .reverse()
+        .find((m: any) => m.role === "user")
    const userInputText =
-        lastMessage?.parts?.find((p: any) => p.type === "text")?.text || ""
+        lastUserMessage?.parts?.find((p: any) => p.type === "text")?.text || ""

    // Update Langfuse trace with input, session, and user
    setTraceInput({
@@ -184,6 +196,33 @@ async function handleChatRequest(req: Request): Promise<Response> {
        userId: userId,
    })

+    // === SERVER-SIDE QUOTA CHECK START ===
+    // Quota is opt-in: only enabled when DYNAMODB_QUOTA_TABLE env var is set
+    const hasOwnApiKey = !!(
+        req.headers.get("x-ai-provider") && req.headers.get("x-ai-api-key")
+    )
+
+    // Skip quota check if: quota disabled, user has own API key, or is anonymous
+    if (isQuotaEnabled() && !hasOwnApiKey && userId !== "anonymous") {
+        const quotaCheck = await checkAndIncrementRequest(userId, {
+            requests: Number(process.env.DAILY_REQUEST_LIMIT) || 10,
+            tokens: Number(process.env.DAILY_TOKEN_LIMIT) || 200000,
+            tpm: Number(process.env.TPM_LIMIT) || 20000,
+        })
+        if (!quotaCheck.allowed) {
+            return Response.json(
+                {
+                    error: quotaCheck.error,
+                    type: quotaCheck.type,
+                    used: quotaCheck.used,
+                    limit: quotaCheck.limit,
+                },
+                { status: 429 },
+            )
+        }
+    }
+    // === SERVER-SIDE QUOTA CHECK END ===
+
    // === FILE VALIDATION START ===
    const fileValidation = validateFileParts(messages)
    if (!fileValidation.valid) {
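For reference, a caller that hits one of these limits receives a 429 whose JSON body carries the fields built above. The sketch below only illustrates the shape; the "/api/chat" path is an assumption about this app's routing (the real client handles these errors in ChatPanel's onError, further down in this diff).

```ts
// Shape of the 429 body produced by the quota check above (field names from this diff).
interface QuotaErrorBody {
    error?: string // e.g. "Daily request limit exceeded"
    type?: "request" | "token" | "tpm"
    used?: number
    limit?: number
}

// Hedged usage sketch; the "/api/chat" path is an assumption, not confirmed by this diff.
async function sendChat(payload: unknown): Promise<Response> {
    const res = await fetch("/api/chat", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(payload),
    })
    if (res.status === 429) {
        const body = (await res.json()) as QuotaErrorBody
        console.warn(`Quota exceeded (${body.type}): ${body.used}/${body.limit}`)
    }
    return res
}
```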
@@ -237,9 +276,10 @@ async function handleChatRequest(req: Request): Promise<Response> {
    // Get the appropriate system prompt based on model (extended for Opus/Haiku 4.5)
    const systemMessage = getSystemPrompt(modelId, minimalStyle)

-    // Extract file parts (images) from the last message
+    // Extract file parts (images) from the last user message
    const fileParts =
-        lastMessage.parts?.filter((part: any) => part.type === "file") || []
+        lastUserMessage?.parts?.filter((part: any) => part.type === "file") ||
+        []

    // User input only - XML is now in a separate cached system message
    const formattedUserInput = `User input:
@@ -248,7 +288,7 @@ ${userInputText}
    """`

    // Convert UIMessages to ModelMessages and add system message
-    const modelMessages = convertToModelMessages(messages)
+    const modelMessages = await convertToModelMessages(messages)

    // DEBUG: Log incoming messages structure
    console.log("[route.ts] Incoming messages count:", messages.length)
@@ -502,12 +542,26 @@ ${userInputText}
                userId,
            }),
        }),
-        onFinish: ({ text, usage }) => {
-            // Pass usage to Langfuse (Bedrock streaming doesn't auto-report tokens to telemetry)
-            setTraceOutput(text, {
-                promptTokens: usage?.inputTokens,
-                completionTokens: usage?.outputTokens,
-            })
+        onFinish: ({ text, totalUsage }) => {
+            // AI SDK 6 telemetry auto-reports token usage on its spans
+            setTraceOutput(text)
+
+            // Record token usage for server-side quota tracking (if enabled)
+            // Use totalUsage (cumulative across all steps) instead of usage (final step only)
+            // Include all 4 token types: input, output, cache read, cache write
+            if (
+                isQuotaEnabled() &&
+                !hasOwnApiKey &&
+                userId !== "anonymous" &&
+                totalUsage
+            ) {
+                const totalTokens =
+                    (totalUsage.inputTokens || 0) +
+                    (totalUsage.outputTokens || 0) +
+                    (totalUsage.cachedInputTokens || 0) +
+                    (totalUsage.inputTokenDetails?.cacheWriteTokens || 0)
+                recordTokenUsage(userId, totalTokens)
+            }
        },
        tools: {
            // Client-side tool that will be executed on the client
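A small sketch of the token accounting used in the onFinish handler above. The property names follow the `totalUsage` object as referenced in this diff (AI SDK 6); treat them as assumptions if your SDK version differs.

```ts
// Hedged sketch of the four-way token sum recorded for quota tracking above.
interface TotalUsageLike {
    inputTokens?: number
    outputTokens?: number
    cachedInputTokens?: number
    inputTokenDetails?: { cacheWriteTokens?: number }
}

function sumBillableTokens(u: TotalUsageLike): number {
    return (
        (u.inputTokens || 0) +
        (u.outputTokens || 0) +
        (u.cachedInputTokens || 0) +
        (u.inputTokenDetails?.cacheWriteTokens || 0)
    )
}

// Example: 1200 input + 300 output + 4000 cache-read + 800 cache-write = 6300 tokens recorded.
```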
@@ -677,19 +731,9 @@ Call this tool to get shape names and usage syntax for a specific library.`,
        messageMetadata: ({ part }) => {
            if (part.type === "finish") {
                const usage = (part as any).totalUsage
-                if (!usage) {
-                    console.warn(
-                        "[messageMetadata] No usage data in finish part",
-                    )
-                    return undefined
-                }
-                // Total input = non-cached + cached (these are separate counts)
-                // Note: cacheWriteInputTokens is not available on finish part
-                const totalInputTokens =
-                    (usage.inputTokens ?? 0) + (usage.cachedInputTokens ?? 0)
+                // AI SDK 6 provides totalTokens directly
                return {
-                    inputTokens: totalInputTokens,
-                    outputTokens: usage.outputTokens ?? 0,
+                    totalTokens: usage?.totalTokens ?? 0,
                    finishReason: (part as any).finishReason,
                }
            }
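On the client, these fields surface as per-message metadata. A hedged sketch of reading them follows; the `metadata` property name mirrors how ChatPanel's onFinish handler uses it later in this diff.

```ts
// Hedged sketch: consuming the metadata emitted by messageMetadata above.
interface FinishMetadata {
    totalTokens?: number
    finishReason?: string
}

function readFinishMetadata(message: { metadata?: FinishMetadata }) {
    // Number.isFinite guards against NaN, matching the pattern used elsewhere in this diff.
    const totalTokens = Number.isFinite(message.metadata?.totalTokens)
        ? (message.metadata?.totalTokens as number)
        : 0
    return { totalTokens, finishReason: message.metadata?.finishReason }
}
```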
@@ -27,9 +27,18 @@ export async function POST(req: Request) {

    const { messageId, feedback, sessionId } = data

-    // Get user IP for tracking
+    // Skip logging if no sessionId - prevents attaching to wrong user's trace
+    if (!sessionId) {
+        return Response.json({ success: true, logged: false })
+    }
+
+    // Get user IP for tracking (hashed for privacy)
    const forwardedFor = req.headers.get("x-forwarded-for")
-    const userId = forwardedFor?.split(",")[0]?.trim() || "anonymous"
+    const rawIp = forwardedFor?.split(",")[0]?.trim() || "anonymous"
+    const userId =
+        rawIp === "anonymous"
+            ? rawIp
+            : `user-${Buffer.from(rawIp).toString("base64url").slice(0, 8)}`

    try {
        // Find the most recent chat trace for this session to attach the score to
@@ -27,6 +27,11 @@ export async function POST(req: Request) {

    const { filename, format, sessionId } = data

+    // Skip logging if no sessionId - prevents attaching to wrong user's trace
+    if (!sessionId) {
+        return Response.json({ success: true, logged: false })
+    }
+
    try {
        const timestamp = new Date().toISOString()
@@ -31,6 +31,7 @@ import { getApiEndpoint } from "@/lib/base-path"
import {
    applyDiagramOperations,
    convertToLegalXml,
+    extractCompleteMxCells,
    isMxCellXmlComplete,
    replaceNodes,
    validateAndFixXml,
@@ -315,12 +316,28 @@ export function ChatMessageDisplay({

    const handleDisplayChart = useCallback(
        (xml: string, showToast = false) => {
-            const currentXml = xml || ""
+            let currentXml = xml || ""
            const startTime = performance.now()

+            // During streaming (showToast=false), extract only complete mxCell elements
+            // This allows progressive rendering even with partial/incomplete trailing XML
+            if (!showToast) {
+                const completeCells = extractCompleteMxCells(currentXml)
+                if (!completeCells) {
+                    return
+                }
+                currentXml = completeCells
+            }
+
            const convertedXml = convertToLegalXml(currentXml)
            if (convertedXml !== previousXML.current) {
                // Parse and validate XML BEFORE calling replaceNodes
                const parser = new DOMParser()
-                const testDoc = parser.parseFromString(convertedXml, "text/xml")
+                // Wrap in root element for parsing multiple mxCell elements
+                const testDoc = parser.parseFromString(
+                    `<root>${convertedXml}</root>`,
+                    "text/xml",
+                )
                const parseError = testDoc.querySelector("parsererror")

                if (parseError) {
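The body of `extractCompleteMxCells` is not part of this diff, so the following is only a plausible sketch of the behavior described in the comments above: keep whole `<mxCell>` elements from the streamed prefix and drop a truncated tail.

```ts
// Assumption-labeled sketch, not the repo's actual implementation of extractCompleteMxCells.
function extractCompleteMxCellsSketch(partialXml: string): string | null {
    // Match self-closing <mxCell .../> or paired <mxCell ...>...</mxCell> elements only.
    const matches = partialXml.match(/<mxCell\b[^>]*?(?:\/>|>[\s\S]*?<\/mxCell>)/g)
    return matches && matches.length > 0 ? matches.join("\n") : null
}
```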
@@ -347,7 +364,22 @@ export function ChatMessageDisplay({
|
||||
`<mxfile><diagram name="Page-1" id="page-1"><mxGraphModel><root><mxCell id="0"/><mxCell id="1" parent="0"/></root></mxGraphModel></diagram></mxfile>`
|
||||
const replacedXML = replaceNodes(baseXML, convertedXml)
|
||||
|
||||
// Validate and auto-fix the XML
|
||||
const xmlProcessTime = performance.now() - startTime
|
||||
|
||||
// During streaming (showToast=false), skip heavy validation for lower latency
|
||||
// The quick DOM parse check above catches malformed XML
|
||||
// Full validation runs on final output (showToast=true)
|
||||
if (!showToast) {
|
||||
previousXML.current = convertedXml
|
||||
const loadStartTime = performance.now()
|
||||
onDisplayChart(replacedXML, true)
|
||||
console.log(
|
||||
`[Streaming] XML processing: ${xmlProcessTime.toFixed(1)}ms, drawio load: ${(performance.now() - loadStartTime).toFixed(1)}ms`,
|
||||
)
|
||||
return
|
||||
}
|
||||
|
||||
// Final output: run full validation and auto-fix
|
||||
const validation = validateAndFixXml(replacedXML)
|
||||
if (validation.valid) {
|
||||
previousXML.current = convertedXml
|
||||
@@ -360,18 +392,19 @@ export function ChatMessageDisplay({
|
||||
)
|
||||
}
|
||||
// Skip validation in loadDiagram since we already validated above
|
||||
const loadStartTime = performance.now()
|
||||
onDisplayChart(xmlToLoad, true)
|
||||
console.log(
|
||||
`[Final] XML processing: ${xmlProcessTime.toFixed(1)}ms, validation+load: ${(performance.now() - loadStartTime).toFixed(1)}ms`,
|
||||
)
|
||||
} else {
|
||||
console.error(
|
||||
"[ChatMessageDisplay] XML validation failed:",
|
||||
validation.error,
|
||||
)
|
||||
// Only show toast if this is the final XML (not during streaming)
|
||||
if (showToast) {
|
||||
toast.error(
|
||||
"Diagram validation failed. Please try regenerating.",
|
||||
)
|
||||
}
|
||||
toast.error(
|
||||
"Diagram validation failed. Please try regenerating.",
|
||||
)
|
||||
}
|
||||
} catch (error) {
|
||||
console.error(
|
||||
@@ -603,17 +636,10 @@ export function ChatMessageDisplay({
|
||||
}
|
||||
})
|
||||
|
||||
// Cleanup: clear any pending debounce timeout on unmount
|
||||
return () => {
|
||||
if (debounceTimeoutRef.current) {
|
||||
clearTimeout(debounceTimeoutRef.current)
|
||||
debounceTimeoutRef.current = null
|
||||
}
|
||||
if (editDebounceTimeoutRef.current) {
|
||||
clearTimeout(editDebounceTimeoutRef.current)
|
||||
editDebounceTimeoutRef.current = null
|
||||
}
|
||||
}
|
||||
// NOTE: Don't cleanup debounce timeouts here!
|
||||
// The cleanup runs on every re-render (when messages changes),
|
||||
// which would cancel the timeout before it fires.
|
||||
// Let the timeouts complete naturally - they're harmless if component unmounts.
|
||||
}, [messages, handleDisplayChart, chartXML])
|
||||
|
||||
const renderToolPart = (part: ToolPartLike) => {
|
||||
|
||||
@@ -76,6 +76,7 @@ interface ChatPanelProps {
|
||||
const TOOL_ERROR_STATE = "output-error" as const
|
||||
const DEBUG = process.env.NODE_ENV === "development"
|
||||
const MAX_AUTO_RETRY_COUNT = 1
|
||||
const MAX_CONTINUATION_RETRY_COUNT = 2 // Limit for truncation continuation retries
|
||||
|
||||
/**
|
||||
* Check if auto-resubmit should happen based on tool errors.
|
||||
@@ -216,6 +217,8 @@ export default function ChatPanel({
|
||||
|
||||
// Ref to track consecutive auto-retry count (reset on user action)
|
||||
const autoRetryCountRef = useRef(0)
|
||||
// Ref to track continuation retry count (for truncation handling)
|
||||
const continuationRetryCountRef = useRef(0)
|
||||
|
||||
// Ref to accumulate partial XML when output is truncated due to maxOutputTokens
|
||||
// When partialXmlRef.current.length > 0, we're in continuation mode
|
||||
@@ -553,6 +556,23 @@ Continue from EXACTLY where you stopped.`,
|
||||
}
|
||||
},
|
||||
onError: (error) => {
|
||||
// Handle server-side quota limit (429 response)
|
||||
if (error.message.includes("Daily request limit")) {
|
||||
quotaManager.showQuotaLimitToast()
|
||||
return
|
||||
}
|
||||
if (error.message.includes("Daily token limit")) {
|
||||
quotaManager.showTokenLimitToast(dailyTokenLimit)
|
||||
return
|
||||
}
|
||||
if (
|
||||
error.message.includes("Rate limit exceeded") ||
|
||||
error.message.includes("tokens per minute")
|
||||
) {
|
||||
quotaManager.showTPMLimitToast()
|
||||
return
|
||||
}
|
||||
|
||||
// Silence access code error in console since it's handled by UI
|
||||
if (!error.message.includes("Invalid or missing access code")) {
|
||||
console.error("Chat error:", error)
|
||||
@@ -629,22 +649,6 @@ Continue from EXACTLY where you stopped.`,
|
||||
|
||||
// DEBUG: Log finish reason to diagnose truncation
|
||||
console.log("[onFinish] finishReason:", metadata?.finishReason)
|
||||
console.log("[onFinish] metadata:", metadata)
|
||||
|
||||
if (metadata) {
|
||||
// Use Number.isFinite to guard against NaN (typeof NaN === 'number' is true)
|
||||
const inputTokens = Number.isFinite(metadata.inputTokens)
|
||||
? (metadata.inputTokens as number)
|
||||
: 0
|
||||
const outputTokens = Number.isFinite(metadata.outputTokens)
|
||||
? (metadata.outputTokens as number)
|
||||
: 0
|
||||
const actualTokens = inputTokens + outputTokens
|
||||
if (actualTokens > 0) {
|
||||
quotaManager.incrementTokenCount(actualTokens)
|
||||
quotaManager.incrementTPMCount(actualTokens)
|
||||
}
|
||||
}
|
||||
},
|
||||
sendAutomaticallyWhen: ({ messages }) => {
|
||||
const isInContinuationMode = partialXmlRef.current.length > 0
|
||||
@@ -656,15 +660,25 @@ Continue from EXACTLY where you stopped.`,
|
||||
if (!shouldRetry) {
|
||||
// No error, reset retry count and clear state
|
||||
autoRetryCountRef.current = 0
|
||||
continuationRetryCountRef.current = 0
|
||||
partialXmlRef.current = ""
|
||||
return false
|
||||
}
|
||||
|
||||
// Continuation mode: unlimited retries (truncation continuation, not real errors)
|
||||
// Server limits to 5 steps via stepCountIs(5)
|
||||
// Continuation mode: limited retries for truncation handling
|
||||
if (isInContinuationMode) {
|
||||
// Don't count against retry limit for continuation
|
||||
// Quota checks still apply below
|
||||
if (
|
||||
continuationRetryCountRef.current >=
|
||||
MAX_CONTINUATION_RETRY_COUNT
|
||||
) {
|
||||
toast.error(
|
||||
`Continuation retry limit reached (${MAX_CONTINUATION_RETRY_COUNT}). The diagram may be too complex.`,
|
||||
)
|
||||
continuationRetryCountRef.current = 0
|
||||
partialXmlRef.current = ""
|
||||
return false
|
||||
}
|
||||
continuationRetryCountRef.current++
|
||||
} else {
|
||||
// Regular error: check retry count limit
|
||||
if (autoRetryCountRef.current >= MAX_AUTO_RETRY_COUNT) {
|
||||
@@ -679,23 +693,6 @@ Continue from EXACTLY where you stopped.`,
|
||||
autoRetryCountRef.current++
|
||||
}
|
||||
|
||||
// Check quota limits before auto-retry
|
||||
const tokenLimitCheck = quotaManager.checkTokenLimit()
|
||||
if (!tokenLimitCheck.allowed) {
|
||||
quotaManager.showTokenLimitToast(tokenLimitCheck.used)
|
||||
autoRetryCountRef.current = 0
|
||||
partialXmlRef.current = ""
|
||||
return false
|
||||
}
|
||||
|
||||
const tpmCheck = quotaManager.checkTPMLimit()
|
||||
if (!tpmCheck.allowed) {
|
||||
quotaManager.showTPMLimitToast()
|
||||
autoRetryCountRef.current = 0
|
||||
partialXmlRef.current = ""
|
||||
return false
|
||||
}
|
||||
|
||||
return true
|
||||
},
|
||||
})
|
||||
@@ -912,9 +909,6 @@ Continue from EXACTLY where you stopped.`,
|
||||
xmlSnapshotsRef.current.set(messageIndex, chartXml)
|
||||
saveXmlSnapshots()
|
||||
|
||||
// Check all quota limits
|
||||
if (!checkAllQuotaLimits()) return
|
||||
|
||||
sendChatMessage(parts, chartXml, previousXml, sessionId)
|
||||
|
||||
// Token count is tracked in onFinish with actual server usage
|
||||
@@ -992,30 +986,7 @@ Continue from EXACTLY where you stopped.`,
|
||||
saveXmlSnapshots()
|
||||
}
|
||||
|
||||
// Check all quota limits (daily requests, tokens, TPM)
|
||||
const checkAllQuotaLimits = (): boolean => {
|
||||
const limitCheck = quotaManager.checkDailyLimit()
|
||||
if (!limitCheck.allowed) {
|
||||
quotaManager.showQuotaLimitToast()
|
||||
return false
|
||||
}
|
||||
|
||||
const tokenLimitCheck = quotaManager.checkTokenLimit()
|
||||
if (!tokenLimitCheck.allowed) {
|
||||
quotaManager.showTokenLimitToast(tokenLimitCheck.used)
|
||||
return false
|
||||
}
|
||||
|
||||
const tpmCheck = quotaManager.checkTPMLimit()
|
||||
if (!tpmCheck.allowed) {
|
||||
quotaManager.showTPMLimitToast()
|
||||
return false
|
||||
}
|
||||
|
||||
return true
|
||||
}
|
||||
|
||||
// Send chat message with headers and increment quota
|
||||
// Send chat message with headers
|
||||
const sendChatMessage = (
|
||||
parts: any,
|
||||
xml: string,
|
||||
@@ -1024,6 +995,7 @@ Continue from EXACTLY where you stopped.`,
|
||||
) => {
|
||||
// Reset all retry/continuation state on user-initiated message
|
||||
autoRetryCountRef.current = 0
|
||||
continuationRetryCountRef.current = 0
|
||||
partialXmlRef.current = ""
|
||||
|
||||
const config = getSelectedAIConfig()
|
||||
@@ -1064,7 +1036,6 @@ Continue from EXACTLY where you stopped.`,
|
||||
},
|
||||
},
|
||||
)
|
||||
quotaManager.incrementRequestCount()
|
||||
}
|
||||
|
||||
// Process files and append content to user text (handles PDF, text, and optionally images)
|
||||
@@ -1152,13 +1123,8 @@ Continue from EXACTLY where you stopped.`,
|
||||
setMessages(newMessages)
|
||||
})
|
||||
|
||||
// Check all quota limits
|
||||
if (!checkAllQuotaLimits()) return
|
||||
|
||||
// Now send the message after state is guaranteed to be updated
|
||||
sendChatMessage(userParts, savedXml, previousXml, sessionId)
|
||||
|
||||
// Token count is tracked in onFinish with actual server usage
|
||||
}
|
||||
|
||||
const handleEditMessage = async (messageIndex: number, newText: string) => {
|
||||
@@ -1200,12 +1166,8 @@ Continue from EXACTLY where you stopped.`,
|
||||
setMessages(newMessages)
|
||||
})
|
||||
|
||||
// Check all quota limits
|
||||
if (!checkAllQuotaLimits()) return
|
||||
|
||||
// Now send the edited message after state is guaranteed to be updated
|
||||
sendChatMessage(newParts, savedXml, previousXml, sessionId)
|
||||
// Token count is tracked in onFinish with actual server usage
|
||||
}
|
||||
|
||||
// Collapsed view (desktop only)
|
||||
@@ -1281,24 +1243,6 @@ Continue from EXACTLY where you stopped.`,
|
||||
About
|
||||
</Link>
|
||||
)}
|
||||
{!isMobile &&
|
||||
process.env.NEXT_PUBLIC_SHOW_ABOUT_AND_NOTICE ===
|
||||
"true" && (
|
||||
<Link
|
||||
href="/about"
|
||||
target="_blank"
|
||||
rel="noopener noreferrer"
|
||||
>
|
||||
<ButtonWithTooltip
|
||||
tooltipContent="Due to high usage, I have changed the model to minimax-m2 and added some usage limits. See About page for details."
|
||||
variant="ghost"
|
||||
size="icon"
|
||||
className="h-6 w-6 text-amber-500 hover:text-amber-600"
|
||||
>
|
||||
<AlertTriangle className="h-4 w-4" />
|
||||
</ButtonWithTooltip>
|
||||
</Link>
|
||||
)}
|
||||
</div>
|
||||
<div className="flex items-center gap-1 justify-end overflow-visible">
|
||||
<ButtonWithTooltip
|
||||
|
||||
@@ -52,6 +52,7 @@ import {
|
||||
} from "@/components/ui/select"
|
||||
import { useDictionary } from "@/hooks/use-dictionary"
|
||||
import type { UseModelConfigReturn } from "@/hooks/use-model-config"
|
||||
import { formatMessage } from "@/lib/i18n/utils"
|
||||
import type { ProviderConfig, ProviderName } from "@/lib/types/model-config"
|
||||
import { PROVIDER_INFO, SUGGESTED_MODELS } from "@/lib/types/model-config"
|
||||
import { cn } from "@/lib/utils"
|
||||
@@ -107,10 +108,12 @@ function ValidationButton({
|
||||
status,
|
||||
onClick,
|
||||
disabled,
|
||||
dict,
|
||||
}: {
|
||||
status: ValidationStatus
|
||||
onClick: () => void
|
||||
disabled: boolean
|
||||
dict: ReturnType<typeof useDictionary>
|
||||
}) {
|
||||
return (
|
||||
<Button
|
||||
@@ -129,10 +132,10 @@ function ValidationButton({
|
||||
) : status === "success" ? (
|
||||
<>
|
||||
<Check className="h-4 w-4 mr-1.5" />
|
||||
Verified
|
||||
{dict.modelConfig.verified}
|
||||
</>
|
||||
) : (
|
||||
"Test"
|
||||
dict.modelConfig.test
|
||||
)}
|
||||
</Button>
|
||||
)
|
||||
@@ -406,7 +409,7 @@ export function ModelConfigDialog({
|
||||
<div className="w-56 flex-shrink-0 flex flex-col border-r bg-muted/20">
|
||||
<div className="px-4 py-3 border-b">
|
||||
<span className="text-xs font-medium text-muted-foreground uppercase tracking-wider">
|
||||
Providers
|
||||
{dict.modelConfig.providers}
|
||||
</span>
|
||||
</div>
|
||||
|
||||
@@ -418,7 +421,7 @@ export function ModelConfigDialog({
|
||||
<Plus className="h-5 w-5 text-muted-foreground" />
|
||||
</div>
|
||||
<p className="text-xs text-muted-foreground">
|
||||
Add a provider to get started
|
||||
{dict.modelConfig.addProviderHint}
|
||||
</p>
|
||||
</div>
|
||||
) : (
|
||||
@@ -484,7 +487,11 @@ export function ModelConfigDialog({
|
||||
>
|
||||
<SelectTrigger className="h-9 bg-background hover:bg-accent">
|
||||
<Plus className="h-4 w-4 mr-2 text-muted-foreground" />
|
||||
<SelectValue placeholder="Add Provider" />
|
||||
<SelectValue
|
||||
placeholder={
|
||||
dict.modelConfig.addProvider
|
||||
}
|
||||
/>
|
||||
</SelectTrigger>
|
||||
<SelectContent>
|
||||
{availableProviders.map((p) => (
|
||||
@@ -552,15 +559,27 @@ export function ModelConfigDialog({
|
||||
<p className="text-xs text-muted-foreground">
|
||||
{selectedProvider.models
|
||||
.length === 0
|
||||
? "No models configured"
|
||||
: `${selectedProvider.models.length} model${selectedProvider.models.length > 1 ? "s" : ""} configured`}
|
||||
? dict.modelConfig
|
||||
.noModelsConfigured
|
||||
: formatMessage(
|
||||
dict.modelConfig
|
||||
.modelsConfiguredCount,
|
||||
{
|
||||
count: selectedProvider
|
||||
.models
|
||||
.length,
|
||||
},
|
||||
)}
|
||||
</p>
|
||||
</div>
|
||||
{selectedProvider.validated && (
|
||||
<div className="flex items-center gap-1.5 px-2.5 py-1 rounded-full bg-emerald-500/10 text-emerald-600 dark:text-emerald-400">
|
||||
<Check className="h-3.5 w-3.5" />
|
||||
<span className="text-xs font-medium">
|
||||
Verified
|
||||
{
|
||||
dict.modelConfig
|
||||
.verified
|
||||
}
|
||||
</span>
|
||||
</div>
|
||||
)}
|
||||
@@ -570,7 +589,12 @@ export function ModelConfigDialog({
|
||||
<div className="space-y-4">
|
||||
<div className="flex items-center gap-2 text-sm font-medium text-muted-foreground">
|
||||
<Settings2 className="h-4 w-4" />
|
||||
<span>Configuration</span>
|
||||
<span>
|
||||
{
|
||||
dict.modelConfig
|
||||
.configuration
|
||||
}
|
||||
</span>
|
||||
</div>
|
||||
|
||||
<div className="rounded-xl border bg-card p-4 space-y-4">
|
||||
@@ -581,7 +605,10 @@ export function ModelConfigDialog({
|
||||
className="text-xs font-medium flex items-center gap-1.5"
|
||||
>
|
||||
<Tag className="h-3.5 w-3.5 text-muted-foreground" />
|
||||
Display Name
|
||||
{
|
||||
dict.modelConfig
|
||||
.displayName
|
||||
}
|
||||
</Label>
|
||||
<Input
|
||||
id="provider-name"
|
||||
@@ -616,8 +643,11 @@ export function ModelConfigDialog({
|
||||
className="text-xs font-medium flex items-center gap-1.5"
|
||||
>
|
||||
<Key className="h-3.5 w-3.5 text-muted-foreground" />
|
||||
AWS Access Key
|
||||
ID
|
||||
{
|
||||
dict
|
||||
.modelConfig
|
||||
.awsAccessKeyId
|
||||
}
|
||||
</Label>
|
||||
<Input
|
||||
id="aws-access-key-id"
|
||||
@@ -649,8 +679,11 @@ export function ModelConfigDialog({
|
||||
className="text-xs font-medium flex items-center gap-1.5"
|
||||
>
|
||||
<Key className="h-3.5 w-3.5 text-muted-foreground" />
|
||||
AWS Secret
|
||||
Access Key
|
||||
{
|
||||
dict
|
||||
.modelConfig
|
||||
.awsSecretAccessKey
|
||||
}
|
||||
</Label>
|
||||
<div className="relative">
|
||||
<Input
|
||||
@@ -674,7 +707,11 @@ export function ModelConfigDialog({
|
||||
.value,
|
||||
)
|
||||
}
|
||||
placeholder="Enter your secret access key"
|
||||
placeholder={
|
||||
dict
|
||||
.modelConfig
|
||||
.enterSecretKey
|
||||
}
|
||||
className="h-9 pr-10 font-mono text-xs"
|
||||
/>
|
||||
<button
|
||||
@@ -707,7 +744,11 @@ export function ModelConfigDialog({
|
||||
className="text-xs font-medium flex items-center gap-1.5"
|
||||
>
|
||||
<Link2 className="h-3.5 w-3.5 text-muted-foreground" />
|
||||
AWS Region
|
||||
{
|
||||
dict
|
||||
.modelConfig
|
||||
.awsRegion
|
||||
}
|
||||
</Label>
|
||||
<Select
|
||||
value={
|
||||
@@ -724,7 +765,13 @@ export function ModelConfigDialog({
|
||||
}
|
||||
>
|
||||
<SelectTrigger className="h-9 font-mono text-xs hover:bg-accent">
|
||||
<SelectValue placeholder="Select region" />
|
||||
<SelectValue
|
||||
placeholder={
|
||||
dict
|
||||
.modelConfig
|
||||
.selectRegion
|
||||
}
|
||||
/>
|
||||
</SelectTrigger>
|
||||
<SelectContent className="max-h-64">
|
||||
<SelectItem value="us-east-1">
|
||||
@@ -819,10 +866,16 @@ export function ModelConfigDialog({
|
||||
"success" ? (
|
||||
<>
|
||||
<Check className="h-4 w-4 mr-1.5" />
|
||||
Verified
|
||||
{
|
||||
dict
|
||||
.modelConfig
|
||||
.verified
|
||||
}
|
||||
</>
|
||||
) : (
|
||||
"Test"
|
||||
dict
|
||||
.modelConfig
|
||||
.test
|
||||
)}
|
||||
</Button>
|
||||
{validationStatus ===
|
||||
@@ -846,7 +899,11 @@ export function ModelConfigDialog({
|
||||
className="text-xs font-medium flex items-center gap-1.5"
|
||||
>
|
||||
<Key className="h-3.5 w-3.5 text-muted-foreground" />
|
||||
API Key
|
||||
{
|
||||
dict
|
||||
.modelConfig
|
||||
.apiKey
|
||||
}
|
||||
</Label>
|
||||
<div className="flex gap-2">
|
||||
<div className="relative flex-1">
|
||||
@@ -870,7 +927,11 @@ export function ModelConfigDialog({
|
||||
.value,
|
||||
)
|
||||
}
|
||||
placeholder="Enter your API key"
|
||||
placeholder={
|
||||
dict
|
||||
.modelConfig
|
||||
.enterApiKey
|
||||
}
|
||||
className="h-9 pr-10 font-mono text-xs"
|
||||
/>
|
||||
<button
|
||||
@@ -924,10 +985,16 @@ export function ModelConfigDialog({
|
||||
"success" ? (
|
||||
<>
|
||||
<Check className="h-4 w-4 mr-1.5" />
|
||||
Verified
|
||||
{
|
||||
dict
|
||||
.modelConfig
|
||||
.verified
|
||||
}
|
||||
</>
|
||||
) : (
|
||||
"Test"
|
||||
dict
|
||||
.modelConfig
|
||||
.test
|
||||
)}
|
||||
</Button>
|
||||
</div>
|
||||
@@ -950,9 +1017,17 @@ export function ModelConfigDialog({
|
||||
className="text-xs font-medium flex items-center gap-1.5"
|
||||
>
|
||||
<Link2 className="h-3.5 w-3.5 text-muted-foreground" />
|
||||
Base URL
|
||||
{
|
||||
dict
|
||||
.modelConfig
|
||||
.baseUrl
|
||||
}
|
||||
<span className="text-muted-foreground font-normal">
|
||||
(optional)
|
||||
{
|
||||
dict
|
||||
.modelConfig
|
||||
.optional
|
||||
}
|
||||
</span>
|
||||
</Label>
|
||||
<Input
|
||||
@@ -974,7 +1049,9 @@ export function ModelConfigDialog({
|
||||
.provider
|
||||
]
|
||||
.defaultBaseUrl ||
|
||||
"Custom endpoint URL"
|
||||
dict
|
||||
.modelConfig
|
||||
.customEndpoint
|
||||
}
|
||||
className="h-9 font-mono text-xs"
|
||||
/>
|
||||
@@ -989,12 +1066,20 @@ export function ModelConfigDialog({
|
||||
<div className="flex items-center justify-between">
|
||||
<div className="flex items-center gap-2 text-sm font-medium text-muted-foreground">
|
||||
<Sparkles className="h-4 w-4" />
|
||||
<span>Models</span>
|
||||
<span>
|
||||
{
|
||||
dict.modelConfig
|
||||
.models
|
||||
}
|
||||
</span>
|
||||
</div>
|
||||
<div className="flex items-center gap-2">
|
||||
<div className="relative">
|
||||
<Input
|
||||
placeholder="Custom model ID..."
|
||||
placeholder={
|
||||
dict.modelConfig
|
||||
.customModelId
|
||||
}
|
||||
value={
|
||||
customModelInput
|
||||
}
|
||||
@@ -1088,8 +1173,12 @@ export function ModelConfigDialog({
|
||||
<span className="text-xs">
|
||||
{availableSuggestions.length ===
|
||||
0
|
||||
? "All added"
|
||||
: "Suggested"}
|
||||
? dict
|
||||
.modelConfig
|
||||
.allAdded
|
||||
: dict
|
||||
.modelConfig
|
||||
.suggested}
|
||||
</span>
|
||||
</SelectTrigger>
|
||||
<SelectContent className="max-h-72">
|
||||
@@ -1124,7 +1213,10 @@ export function ModelConfigDialog({
|
||||
<Sparkles className="h-5 w-5 text-muted-foreground" />
|
||||
</div>
|
||||
<p className="text-sm text-muted-foreground">
|
||||
No models configured
|
||||
{
|
||||
dict.modelConfig
|
||||
.noModelsConfigured
|
||||
}
|
||||
</p>
|
||||
</div>
|
||||
) : (
|
||||
@@ -1291,7 +1383,9 @@ export function ModelConfigDialog({
|
||||
!newModelId
|
||||
) {
|
||||
showError(
|
||||
"Model ID cannot be empty",
|
||||
dict
|
||||
.modelConfig
|
||||
.modelIdEmpty,
|
||||
)
|
||||
return
|
||||
}
|
||||
@@ -1319,7 +1413,9 @@ export function ModelConfigDialog({
|
||||
)
|
||||
) {
|
||||
showError(
|
||||
"This model ID already exists",
|
||||
dict
|
||||
.modelConfig
|
||||
.modelIdExists,
|
||||
)
|
||||
return
|
||||
}
|
||||
@@ -1383,7 +1479,10 @@ export function ModelConfigDialog({
|
||||
className="text-muted-foreground hover:text-destructive hover:bg-destructive/10"
|
||||
>
|
||||
<Trash2 className="h-4 w-4 mr-2" />
|
||||
Delete Provider
|
||||
{
|
||||
dict.modelConfig
|
||||
.deleteProvider
|
||||
}
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
@@ -1395,11 +1494,10 @@ export function ModelConfigDialog({
|
||||
<Server className="h-8 w-8 text-primary/60" />
|
||||
</div>
|
||||
<h3 className="font-semibold mb-1">
|
||||
Configure AI Providers
|
||||
{dict.modelConfig.configureProviders}
|
||||
</h3>
|
||||
<p className="text-sm text-muted-foreground max-w-xs">
|
||||
Select a provider from the list or add a new
|
||||
one to configure API keys and models
|
||||
{dict.modelConfig.selectProviderHint}
|
||||
</p>
|
||||
</div>
|
||||
)}
|
||||
@@ -1410,7 +1508,7 @@ export function ModelConfigDialog({
|
||||
<div className="px-6 py-3 border-t bg-muted/20">
|
||||
<p className="text-xs text-muted-foreground text-center flex items-center justify-center gap-1.5">
|
||||
<Key className="h-3 w-3" />
|
||||
API keys are stored locally in your browser
|
||||
{dict.modelConfig.apiKeyStored}
|
||||
</p>
|
||||
</div>
|
||||
</DialogContent>
|
||||
@@ -1429,19 +1527,16 @@ export function ModelConfigDialog({
|
||||
<AlertCircle className="h-6 w-6 text-destructive" />
|
||||
</div>
|
||||
<AlertDialogTitle className="text-center">
|
||||
Delete Provider
|
||||
{dict.modelConfig.deleteProvider}
|
||||
</AlertDialogTitle>
|
||||
<AlertDialogDescription className="text-center">
|
||||
Are you sure you want to delete{" "}
|
||||
<span className="font-medium text-foreground">
|
||||
{selectedProvider
|
||||
{formatMessage(dict.modelConfig.deleteConfirmDesc, {
|
||||
name: selectedProvider
|
||||
? selectedProvider.name ||
|
||||
PROVIDER_INFO[selectedProvider.provider]
|
||||
.label
|
||||
: "this provider"}
|
||||
</span>
|
||||
? This will remove all configured models and cannot
|
||||
be undone.
|
||||
: "this provider",
|
||||
})}
|
||||
</AlertDialogDescription>
|
||||
</AlertDialogHeader>
|
||||
{selectedProvider &&
|
||||
@@ -1451,11 +1546,16 @@ export function ModelConfigDialog({
|
||||
htmlFor="delete-confirm"
|
||||
className="text-sm text-muted-foreground"
|
||||
>
|
||||
Type "
|
||||
{selectedProvider.name ||
|
||||
PROVIDER_INFO[selectedProvider.provider]
|
||||
.label}
|
||||
" to confirm
|
||||
{formatMessage(
|
||||
dict.modelConfig.typeToConfirm,
|
||||
{
|
||||
name:
|
||||
selectedProvider.name ||
|
||||
PROVIDER_INFO[
|
||||
selectedProvider.provider
|
||||
].label,
|
||||
},
|
||||
)}
|
||||
</Label>
|
||||
<Input
|
||||
id="delete-confirm"
|
||||
@@ -1463,13 +1563,17 @@ export function ModelConfigDialog({
|
||||
onChange={(e) =>
|
||||
setDeleteConfirmText(e.target.value)
|
||||
}
|
||||
placeholder="Type provider name..."
|
||||
placeholder={
|
||||
dict.modelConfig.typeProviderName
|
||||
}
|
||||
className="h-9"
|
||||
/>
|
||||
</div>
|
||||
)}
|
||||
<AlertDialogFooter>
|
||||
<AlertDialogCancel>Cancel</AlertDialogCancel>
|
||||
<AlertDialogCancel>
|
||||
{dict.modelConfig.cancel}
|
||||
</AlertDialogCancel>
|
||||
<AlertDialogAction
|
||||
onClick={handleDeleteProvider}
|
||||
disabled={
|
||||
@@ -1482,7 +1586,7 @@ export function ModelConfigDialog({
|
||||
}
|
||||
className="bg-destructive text-destructive-foreground hover:bg-destructive/90 disabled:opacity-50"
|
||||
>
|
||||
Delete
|
||||
{dict.modelConfig.delete}
|
||||
</AlertDialogAction>
|
||||
</AlertDialogFooter>
|
||||
</AlertDialogContent>
|
||||
|
||||
@@ -16,6 +16,7 @@ import {
|
||||
ModelSelectorTrigger,
|
||||
} from "@/components/ai-elements/model-selector"
|
||||
import { ButtonWithTooltip } from "@/components/button-with-tooltip"
|
||||
import { useDictionary } from "@/hooks/use-dictionary"
|
||||
import type { FlattenedModel } from "@/lib/types/model-config"
|
||||
import { cn } from "@/lib/utils"
|
||||
|
||||
@@ -67,6 +68,7 @@ export function ModelSelector({
|
||||
onConfigure,
|
||||
disabled = false,
|
||||
}: ModelSelectorProps) {
|
||||
const dict = useDictionary()
|
||||
const [open, setOpen] = useState(false)
|
||||
// Only show validated models in the selector
|
||||
const validatedModels = useMemo(
|
||||
@@ -96,8 +98,8 @@ export function ModelSelector({
|
||||
}
|
||||
|
||||
const tooltipContent = selectedModel
|
||||
? `${selectedModel.modelId} (click to change)`
|
||||
: "Using server default model (click to change)"
|
||||
? `${selectedModel.modelId} ${dict.modelConfig.clickToChange}`
|
||||
: `${dict.modelConfig.usingServerDefault} ${dict.modelConfig.clickToChange}`
|
||||
|
||||
return (
|
||||
<ModelSelectorRoot open={open} onOpenChange={setOpen}>
|
||||
@@ -111,22 +113,26 @@ export function ModelSelector({
|
||||
>
|
||||
<Bot className="h-4 w-4 flex-shrink-0 text-muted-foreground" />
|
||||
<span className="text-xs truncate">
|
||||
{selectedModel ? selectedModel.modelId : "Default"}
|
||||
{selectedModel
|
||||
? selectedModel.modelId
|
||||
: dict.modelConfig.default}
|
||||
</span>
|
||||
<ChevronDown className="h-3 w-3 flex-shrink-0 text-muted-foreground" />
|
||||
</ButtonWithTooltip>
|
||||
</ModelSelectorTrigger>
|
||||
<ModelSelectorContent title="Select Model">
|
||||
<ModelSelectorInput placeholder="Search models..." />
|
||||
<ModelSelectorContent title={dict.modelConfig.selectModel}>
|
||||
<ModelSelectorInput
|
||||
placeholder={dict.modelConfig.searchModels}
|
||||
/>
|
||||
<ModelSelectorList>
|
||||
<ModelSelectorEmpty>
|
||||
{validatedModels.length === 0 && models.length > 0
|
||||
? "No verified models. Test your models first."
|
||||
: "No models found."}
|
||||
? dict.modelConfig.noVerifiedModels
|
||||
: dict.modelConfig.noModelsFound}
|
||||
</ModelSelectorEmpty>
|
||||
|
||||
{/* Server Default Option */}
|
||||
<ModelSelectorGroup heading="Default">
|
||||
<ModelSelectorGroup heading={dict.modelConfig.default}>
|
||||
<ModelSelectorItem
|
||||
value="__server_default__"
|
||||
onSelect={handleSelect}
|
||||
@@ -145,7 +151,7 @@ export function ModelSelector({
|
||||
/>
|
||||
<Server className="mr-2 h-4 w-4 text-muted-foreground" />
|
||||
<ModelSelectorName>
|
||||
Server Default
|
||||
{dict.modelConfig.serverDefault}
|
||||
</ModelSelectorName>
|
||||
</ModelSelectorItem>
|
||||
</ModelSelectorGroup>
|
||||
@@ -201,13 +207,13 @@ export function ModelSelector({
|
||||
>
|
||||
<Settings2 className="mr-2 h-4 w-4" />
|
||||
<ModelSelectorName>
|
||||
Configure Models...
|
||||
{dict.modelConfig.configureModels}
|
||||
</ModelSelectorName>
|
||||
</ModelSelectorItem>
|
||||
</ModelSelectorGroup>
|
||||
{/* Info text */}
|
||||
<div className="px-3 py-2 text-xs text-muted-foreground border-t">
|
||||
Only verified models are shown
|
||||
{dict.modelConfig.onlyVerifiedShown}
|
||||
</div>
|
||||
</ModelSelectorList>
|
||||
</ModelSelectorContent>
|
||||
|
||||
@@ -19,10 +19,13 @@ export function register() {
            const spanName = otelSpan.name
            // Skip Next.js HTTP infrastructure spans
            if (
-                spanName.startsWith("POST /") ||
-                spanName.startsWith("GET /") ||
+                spanName.startsWith("POST") ||
+                spanName.startsWith("GET") ||
+                spanName.startsWith("RSC") ||
                spanName.includes("BaseServer") ||
-                spanName.includes("handleRequest")
+                spanName.includes("handleRequest") ||
+                spanName.includes("resolve page") ||
+                spanName.includes("start response")
            ) {
                return false
            }
@@ -36,4 +39,5 @@ export function register() {

    // Register globally so AI SDK's telemetry also uses this processor
    tracerProvider.register()
+    console.log("[Langfuse] Instrumentation initialized successfully")
}
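Illustratively, the widened filter above now rejects span names like the ones below; the AI SDK span names listed as kept are assumptions for illustration, not taken from this diff.

```ts
// Names the predicate above would reject (return false) vs. let through.
const droppedSpanNames = [
    "POST /api/chat",
    "GET /about",
    "RSC /favicon.ico",
    "BaseServer.handleRequest",
    "NextNodeServer.handleRequest",
    "resolve page components",
    "start response",
]
const keptSpanNames = ["ai.streamText", "ai.toolCall"] // assumed AI SDK telemetry span names
```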
@@ -95,8 +95,8 @@ function parseIntSafe(
 * Supports various AI SDK providers with their unique configuration options
 *
 * Environment variables:
- * - OPENAI_REASONING_EFFORT: OpenAI reasoning effort level (minimal/low/medium/high) - for o1/o3/gpt-5
- * - OPENAI_REASONING_SUMMARY: OpenAI reasoning summary (none/brief/detailed) - auto-enabled for o1/o3/gpt-5
+ * - OPENAI_REASONING_EFFORT: OpenAI reasoning effort level (minimal/low/medium/high) - for o1/o3/o4/gpt-5
+ * - OPENAI_REASONING_SUMMARY: OpenAI reasoning summary (auto/detailed) - auto-enabled for o1/o3/o4/gpt-5
 * - ANTHROPIC_THINKING_BUDGET_TOKENS: Anthropic thinking budget in tokens (1024-64000)
 * - ANTHROPIC_THINKING_TYPE: Anthropic thinking type (enabled)
 * - GOOGLE_THINKING_BUDGET: Google Gemini 2.5 thinking budget in tokens (1024-100000)
@@ -118,18 +118,19 @@ function buildProviderOptions(
    const reasoningEffort = process.env.OPENAI_REASONING_EFFORT
    const reasoningSummary = process.env.OPENAI_REASONING_SUMMARY

-    // OpenAI reasoning models (o1, o3, gpt-5) need reasoningSummary to return thoughts
+    // OpenAI reasoning models (o1, o3, o4, gpt-5) need reasoningSummary to return thoughts
    if (
        modelId &&
        (modelId.includes("o1") ||
            modelId.includes("o3") ||
+            modelId.includes("o4") ||
            modelId.includes("gpt-5"))
    ) {
        options.openai = {
-            // Auto-enable reasoning summary for reasoning models (default: detailed)
+            // Auto-enable reasoning summary for reasoning models
+            // Use 'auto' as default since not all models support 'detailed'
            reasoningSummary:
-                (reasoningSummary as "none" | "brief" | "detailed") ||
-                "detailed",
+                (reasoningSummary as "auto" | "detailed") || "auto",
        }

        // Optionally configure reasoning effort
@@ -152,8 +153,7 @@ function buildProviderOptions(
        }
        if (reasoningSummary) {
            options.openai.reasoningSummary = reasoningSummary as
-                | "none"
-                | "brief"
+                | "auto"
                | "detailed"
        }
    }
@@ -593,7 +593,9 @@ export function getAIModel(overrides?: ClientOverrides): ModelConfig {
            apiKey,
            ...(baseURL && { baseURL }),
        })
-        model = customOpenAI.chat(modelId)
+        // Use Responses API (default) instead of .chat() to support reasoning
+        // for gpt-5, o1, o3, o4 models. Chat Completions API does not emit reasoning events.
+        model = customOpenAI(modelId)
    } else {
        model = openai(modelId)
    }
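For illustration only, this is the options shape the reasoning branch above produces for an OpenAI reasoning model with no env overrides; the exact downstream wiring into the model call is not shown in this diff.

```ts
// Hedged illustration: providerOptions for a reasoning model (e.g. modelId "gpt-5")
// when OPENAI_REASONING_SUMMARY is unset, so reasoningSummary falls back to "auto".
const exampleProviderOptions = {
    openai: {
        reasoningSummary: "auto" as const,
        // With OPENAI_REASONING_EFFORT=low this would additionally carry:
        // reasoningEffort: "low",
    },
}
```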
lib/dynamo-quota-manager.ts (new file, 238 lines)
@@ -0,0 +1,238 @@
|
||||
import {
|
||||
ConditionalCheckFailedException,
|
||||
DynamoDBClient,
|
||||
GetItemCommand,
|
||||
UpdateItemCommand,
|
||||
} from "@aws-sdk/client-dynamodb"
|
||||
|
||||
// Quota tracking is OPT-IN: only enabled if DYNAMODB_QUOTA_TABLE is explicitly set
|
||||
// OSS users who don't need quota tracking can simply not set this env var
|
||||
const TABLE = process.env.DYNAMODB_QUOTA_TABLE
|
||||
const DYNAMODB_REGION = process.env.DYNAMODB_REGION || "ap-northeast-1"
|
||||
|
||||
// Only create client if quota is enabled
|
||||
const client = TABLE ? new DynamoDBClient({ region: DYNAMODB_REGION }) : null
|
||||
|
||||
/**
|
||||
* Check if server-side quota tracking is enabled.
|
||||
* Quota is opt-in: only enabled when DYNAMODB_QUOTA_TABLE env var is set.
|
||||
*/
|
||||
export function isQuotaEnabled(): boolean {
|
||||
return !!TABLE
|
||||
}
|
||||
|
||||
interface QuotaLimits {
|
||||
requests: number // Daily request limit
|
||||
tokens: number // Daily token limit
|
||||
tpm: number // Tokens per minute
|
||||
}
|
||||
|
||||
interface QuotaCheckResult {
|
||||
allowed: boolean
|
||||
error?: string
|
||||
type?: "request" | "token" | "tpm"
|
||||
used?: number
|
||||
limit?: number
|
||||
}
|
||||
|
||||
/**
|
||||
* Check all quotas and increment request count atomically.
|
||||
* Uses ConditionExpression to prevent race conditions.
|
||||
* Returns which limit was exceeded if any.
|
||||
*/
|
||||
export async function checkAndIncrementRequest(
|
||||
ip: string,
|
||||
limits: QuotaLimits,
|
||||
): Promise<QuotaCheckResult> {
|
||||
// Skip if quota tracking not enabled
|
||||
if (!client || !TABLE) {
|
||||
return { allowed: true }
|
||||
}
|
||||
|
||||
const today = new Date().toISOString().split("T")[0]
|
||||
const currentMinute = Math.floor(Date.now() / 60000).toString()
|
||||
const ttl = Math.floor(Date.now() / 1000) + 7 * 24 * 60 * 60
|
||||
|
||||
try {
|
||||
// Atomic check-and-increment with ConditionExpression
|
||||
// This prevents race conditions by failing if limits are exceeded
|
||||
await client.send(
|
||||
new UpdateItemCommand({
|
||||
TableName: TABLE,
|
||||
Key: { PK: { S: `IP#${ip}` } },
|
||||
// Reset counts if new day/minute, then increment request count
|
||||
UpdateExpression: `
|
||||
SET lastResetDate = :today,
|
||||
dailyReqCount = if_not_exists(dailyReqCount, :zero) + :one,
|
||||
dailyTokenCount = if_not_exists(dailyTokenCount, :zero),
|
||||
lastMinute = :minute,
|
||||
tpmCount = if_not_exists(tpmCount, :zero),
|
||||
#ttl = :ttl
|
||||
`,
|
||||
// Atomic condition: only succeed if ALL limits pass
|
||||
// Uses attribute_not_exists for new items, then checks limits for existing items
|
||||
ConditionExpression: `
|
||||
(attribute_not_exists(lastResetDate) OR lastResetDate < :today OR
|
||||
((attribute_not_exists(dailyReqCount) OR dailyReqCount < :reqLimit) AND
|
||||
(attribute_not_exists(dailyTokenCount) OR dailyTokenCount < :tokenLimit))) AND
|
||||
(attribute_not_exists(lastMinute) OR lastMinute <> :minute OR
|
||||
attribute_not_exists(tpmCount) OR tpmCount < :tpmLimit)
|
||||
`,
|
||||
ExpressionAttributeNames: { "#ttl": "ttl" },
|
||||
ExpressionAttributeValues: {
|
||||
":today": { S: today },
|
||||
":zero": { N: "0" },
|
||||
":one": { N: "1" },
|
||||
":minute": { S: currentMinute },
|
||||
":ttl": { N: String(ttl) },
|
||||
":reqLimit": { N: String(limits.requests || 999999) },
|
||||
":tokenLimit": { N: String(limits.tokens || 999999) },
|
||||
":tpmLimit": { N: String(limits.tpm || 999999) },
|
||||
},
|
||||
}),
|
||||
)
|
||||
|
||||
return { allowed: true }
|
||||
} catch (e: any) {
|
||||
// Condition failed - need to determine which limit was exceeded
|
||||
if (e instanceof ConditionalCheckFailedException) {
|
||||
// Get current counts to determine which limit was hit
|
||||
try {
|
||||
const getResult = await client.send(
|
||||
new GetItemCommand({
|
||||
TableName: TABLE,
|
||||
Key: { PK: { S: `IP#${ip}` } },
|
||||
}),
|
||||
)
|
||||
|
||||
const item = getResult.Item
|
||||
const storedDate = item?.lastResetDate?.S
|
||||
const storedMinute = item?.lastMinute?.S
|
||||
const isNewDay = !storedDate || storedDate < today
|
||||
|
||||
const dailyReqCount = isNewDay
|
||||
? 0
|
||||
: Number(item?.dailyReqCount?.N || 0)
|
||||
const dailyTokenCount = isNewDay
|
||||
? 0
|
||||
: Number(item?.dailyTokenCount?.N || 0)
|
||||
const tpmCount =
|
||||
storedMinute !== currentMinute
|
||||
? 0
|
||||
: Number(item?.tpmCount?.N || 0)
|
||||
|
||||
// Determine which limit was exceeded
|
||||
if (limits.requests > 0 && dailyReqCount >= limits.requests) {
|
||||
return {
|
||||
allowed: false,
|
||||
type: "request",
|
||||
error: "Daily request limit exceeded",
|
||||
used: dailyReqCount,
|
||||
limit: limits.requests,
|
||||
}
|
||||
}
|
||||
if (limits.tokens > 0 && dailyTokenCount >= limits.tokens) {
|
||||
return {
|
||||
allowed: false,
|
||||
type: "token",
|
||||
error: "Daily token limit exceeded",
|
||||
used: dailyTokenCount,
|
||||
limit: limits.tokens,
|
||||
}
|
||||
}
|
||||
if (limits.tpm > 0 && tpmCount >= limits.tpm) {
|
||||
return {
|
||||
allowed: false,
|
||||
type: "tpm",
|
||||
error: "Rate limit exceeded (tokens per minute)",
|
||||
used: tpmCount,
|
||||
limit: limits.tpm,
|
||||
}
|
||||
}
|
||||
|
||||
// Condition failed but no limit clearly exceeded - race condition edge case
|
||||
// Fail safe by allowing (could be a reset race)
|
||||
console.warn(
|
||||
`[quota] Condition failed but no limit exceeded for IP prefix: ${ip.slice(0, 8)}...`,
|
||||
)
|
||||
return { allowed: true }
|
||||
} catch (getError: any) {
|
||||
console.error(
|
||||
`[quota] Failed to get quota details after condition failure, IP prefix: ${ip.slice(0, 8)}..., error: ${getError.message}`,
|
||||
)
|
||||
return { allowed: true } // Fail open
|
||||
}
|
||||
}
|
||||
|
||||
// Other DynamoDB errors - fail open
|
||||
console.error(
|
||||
`[quota] DynamoDB error (fail-open), IP prefix: ${ip.slice(0, 8)}..., error: ${e.message}`,
|
||||
)
|
||||
return { allowed: true }
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Record token usage after response completes.
|
||||
* Uses atomic operations to update both daily token count and TPM count.
|
||||
* Handles minute boundaries atomically to prevent race conditions.
|
||||
*/
|
||||
export async function recordTokenUsage(
|
||||
ip: string,
|
||||
tokens: number,
|
||||
): Promise<void> {
|
||||
// Skip if quota tracking not enabled
|
||||
if (!client || !TABLE) return
|
||||
if (!Number.isFinite(tokens) || tokens <= 0) return
|
||||
|
||||
const currentMinute = Math.floor(Date.now() / 60000).toString()
|
||||
const ttl = Math.floor(Date.now() / 1000) + 7 * 24 * 60 * 60
|
||||
|
||||
try {
|
||||
// Try to update assuming same minute (most common case)
|
||||
// Uses condition to ensure we're in the same minute
|
||||
await client.send(
|
||||
new UpdateItemCommand({
|
||||
TableName: TABLE,
|
||||
Key: { PK: { S: `IP#${ip}` } },
|
||||
UpdateExpression:
|
||||
"SET #ttl = :ttl ADD dailyTokenCount :tokens, tpmCount :tokens",
|
||||
ConditionExpression: "lastMinute = :minute",
|
||||
ExpressionAttributeNames: { "#ttl": "ttl" },
|
||||
ExpressionAttributeValues: {
|
||||
":minute": { S: currentMinute },
|
||||
":tokens": { N: String(tokens) },
|
||||
":ttl": { N: String(ttl) },
|
||||
},
|
||||
}),
|
||||
)
|
||||
} catch (e: any) {
|
||||
if (e instanceof ConditionalCheckFailedException) {
|
||||
// Different minute - reset TPM count and set new minute
|
||||
try {
|
||||
await client.send(
|
||||
new UpdateItemCommand({
|
||||
TableName: TABLE,
|
||||
Key: { PK: { S: `IP#${ip}` } },
|
||||
UpdateExpression:
|
||||
"SET lastMinute = :minute, tpmCount = :tokens, #ttl = :ttl ADD dailyTokenCount :tokens",
|
||||
ExpressionAttributeNames: { "#ttl": "ttl" },
|
||||
ExpressionAttributeValues: {
|
||||
":minute": { S: currentMinute },
|
||||
":tokens": { N: String(tokens) },
|
||||
":ttl": { N: String(ttl) },
|
||||
},
|
||||
}),
|
||||
)
|
||||
} catch (retryError: any) {
|
||||
console.error(
|
||||
`[quota] Failed to record tokens (retry), IP prefix: ${ip.slice(0, 8)}..., tokens: ${tokens}, error: ${retryError.message}`,
|
||||
)
|
||||
}
|
||||
} else {
|
||||
console.error(
|
||||
`[quota] Failed to record tokens, IP prefix: ${ip.slice(0, 8)}..., tokens: ${tokens}, error: ${e.message}`,
|
||||
)
|
||||
}
|
||||
}
|
||||
}
|
||||
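A hedged usage sketch for the helpers defined in lib/dynamo-quota-manager.ts above. The limit values mirror the route's env-var defaults, and the "@/lib" import alias is assumed from the other imports in this diff; the wrapper function itself is illustrative, not part of the repo.

```ts
import {
    checkAndIncrementRequest,
    isQuotaEnabled,
    recordTokenUsage,
} from "@/lib/dynamo-quota-manager"

// Illustrative wrapper: guard a request against the quota, run it, then record tokens.
export async function guardAndTrack(ip: string, runRequest: () => Promise<number>) {
    if (!isQuotaEnabled()) return runRequest() // opt-in: no DYNAMODB_QUOTA_TABLE, no tracking

    const check = await checkAndIncrementRequest(ip, {
        requests: 10,
        tokens: 200000,
        tpm: 20000,
    })
    if (!check.allowed) {
        throw new Error(`${check.error} (${check.used}/${check.limit})`)
    }

    const tokensUsed = await runRequest()
    await recordTokenUsage(ip, tokensUsed) // errors are logged inside and never thrown
    return tokensUsed
}
```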
@@ -202,6 +202,47 @@
|
||||
"apiKeyStored": "API keys are stored locally in your browser",
|
||||
"test": "Test",
|
||||
"validationError": "Validation failed",
|
||||
"addModelFirst": "Add at least one model to validate"
|
||||
"addModelFirst": "Add at least one model to validate",
|
||||
"providers": "Providers",
|
||||
"addProviderHint": "Add a provider to get started",
|
||||
"verified": "Verified",
|
||||
"configuration": "Configuration",
|
||||
"displayName": "Display Name",
|
||||
"awsAccessKeyId": "AWS Access Key ID",
|
||||
"awsSecretAccessKey": "AWS Secret Access Key",
|
||||
"awsRegion": "AWS Region",
|
||||
"selectRegion": "Select region",
|
||||
"apiKey": "API Key",
|
||||
"enterApiKey": "Enter your API key",
|
||||
"enterSecretKey": "Enter your secret access key",
|
||||
"baseUrl": "Base URL",
|
||||
"optional": "(optional)",
|
||||
"customEndpoint": "Custom endpoint URL",
|
||||
"models": "Models",
|
||||
"customModelId": "Custom model ID...",
|
||||
"allAdded": "All added",
|
||||
"suggested": "Suggested",
|
||||
"noModelsConfigured": "No models configured",
|
||||
"modelIdEmpty": "Model ID cannot be empty",
|
||||
"modelIdExists": "This model ID already exists",
|
||||
"configureProviders": "Configure AI Providers",
|
||||
"selectProviderHint": "Select a provider from the list or add a new one to configure API keys and models",
|
||||
"deleteConfirmDesc": "Are you sure you want to delete {name}? This will remove all configured models and cannot be undone.",
|
||||
"typeToConfirm": "Type \"{name}\" to confirm",
|
||||
"typeProviderName": "Type provider name...",
|
||||
"modelsConfiguredCount": "{count} model(s) configured",
|
||||
"validationFailedCount": "{count} model(s) failed validation",
|
||||
"cancel": "Cancel",
|
||||
"delete": "Delete",
|
||||
"clickToChange": "(click to change)",
|
||||
"usingServerDefault": "Using server default model",
|
||||
"selectModel": "Select Model",
|
||||
"searchModels": "Search models...",
|
||||
"noVerifiedModels": "No verified models. Test your models first.",
|
||||
"noModelsFound": "No models found.",
|
||||
"default": "Default",
|
||||
"serverDefault": "Server Default",
|
||||
"configureModels": "Configure Models...",
|
||||
"onlyVerifiedShown": "Only verified models are shown"
|
||||
}
|
||||
}
|
||||
|
||||
@@ -202,6 +202,47 @@
|
||||
"apiKeyStored": "APIキーはブラウザにローカル保存されます",
|
||||
"test": "テスト",
|
||||
"validationError": "検証に失敗しました",
|
||||
"addModelFirst": "検証するには少なくとも1つのモデルを追加してください"
|
||||
"addModelFirst": "検証するには少なくとも1つのモデルを追加してください",
|
||||
"providers": "プロバイダー",
|
||||
"addProviderHint": "プロバイダーを追加して開始",
|
||||
"verified": "検証済み",
|
||||
"configuration": "設定",
|
||||
"displayName": "表示名",
|
||||
"awsAccessKeyId": "AWS アクセスキー ID",
|
||||
"awsSecretAccessKey": "AWS シークレットアクセスキー",
|
||||
"awsRegion": "AWS リージョン",
|
||||
"selectRegion": "リージョンを選択",
|
||||
"apiKey": "API キー",
|
||||
"enterApiKey": "API キーを入力",
|
||||
"enterSecretKey": "シークレットアクセスキーを入力",
|
||||
"baseUrl": "ベース URL",
|
||||
"optional": "(オプション)",
|
||||
"customEndpoint": "カスタムエンドポイント URL",
|
||||
"models": "モデル",
|
||||
"customModelId": "カスタムモデル ID...",
|
||||
"allAdded": "すべて追加済み",
|
||||
"suggested": "おすすめ",
|
||||
"noModelsConfigured": "モデルが設定されていません",
|
||||
"modelIdEmpty": "モデル ID は空にできません",
|
||||
"modelIdExists": "このモデル ID は既に存在します",
|
||||
"configureProviders": "AI プロバイダーを設定",
|
||||
"selectProviderHint": "リストからプロバイダーを選択するか、新規追加して API キーとモデルを設定",
|
||||
"deleteConfirmDesc": "{name} を削除してもよろしいですか?設定されたすべてのモデルが削除され、元に戻せません。",
|
||||
"typeToConfirm": "確認のため「{name}」と入力",
|
||||
"typeProviderName": "プロバイダー名を入力...",
|
||||
"modelsConfiguredCount": "{count} 個のモデルを設定済み",
|
||||
"validationFailedCount": "{count} 個のモデルの検証に失敗",
|
||||
"cancel": "キャンセル",
|
||||
"delete": "削除",
|
||||
"clickToChange": "(クリックして変更)",
|
||||
"usingServerDefault": "サーバーデフォルトモデルを使用中",
|
||||
"selectModel": "モデルを選択",
|
||||
"searchModels": "モデルを検索...",
|
||||
"noVerifiedModels": "検証済みのモデルがありません。先にモデルをテストしてください。",
|
||||
"noModelsFound": "モデルが見つかりません。",
|
||||
"default": "デフォルト",
|
||||
"serverDefault": "サーバーデフォルト",
|
||||
"configureModels": "モデルを設定...",
|
||||
"onlyVerifiedShown": "検証済みのモデルのみ表示"
|
||||
}
|
||||
}
|
||||
|
||||
@@ -202,6 +202,47 @@
|
||||
"apiKeyStored": "API 密钥存储在您的浏览器本地",
|
||||
"test": "测试",
|
||||
"validationError": "验证失败",
|
||||
"addModelFirst": "请先添加至少一个模型以进行验证"
|
||||
"addModelFirst": "请先添加至少一个模型以进行验证",
|
||||
"providers": "提供商",
|
||||
"addProviderHint": "添加提供商即可开始使用",
|
||||
"verified": "已验证",
|
||||
"configuration": "配置",
|
||||
"displayName": "显示名称",
|
||||
"awsAccessKeyId": "AWS 访问密钥 ID",
|
||||
"awsSecretAccessKey": "AWS Secret Access Key",
|
||||
"awsRegion": "AWS 区域",
|
||||
"selectRegion": "选择区域",
|
||||
"apiKey": "API 密钥",
|
||||
"enterApiKey": "输入您的 API 密钥",
|
||||
"enterSecretKey": "输入您的 Secret Key",
|
||||
"baseUrl": "基础 URL",
|
||||
"optional": "(可选)",
|
||||
"customEndpoint": "自定义端点 URL",
|
||||
"models": "模型",
|
||||
"customModelId": "自定义模型 ID...",
|
||||
"allAdded": "已全部添加",
|
||||
"suggested": "推荐",
|
||||
"noModelsConfigured": "尚未配置模型",
|
||||
"modelIdEmpty": "模型 ID 不能为空",
|
||||
"modelIdExists": "此模型 ID 已存在",
|
||||
"configureProviders": "配置 AI 提供商",
|
||||
"selectProviderHint": "从列表中选择提供商或添加新的以配置 API 密钥和模型",
|
||||
"deleteConfirmDesc": "确定要删除 {name} 吗?这将移除所有配置的模型且无法撤销。",
|
||||
"typeToConfirm": "输入 \"{name}\" 以确认",
|
||||
"typeProviderName": "输入提供商名称...",
|
||||
"modelsConfiguredCount": "已配置 {count} 个模型",
|
||||
"validationFailedCount": "{count} 个模型验证失败",
|
||||
"cancel": "取消",
|
||||
"delete": "删除",
|
||||
"clickToChange": "(点击更改)",
|
||||
"usingServerDefault": "使用服务器默认模型",
|
||||
"selectModel": "选择模型",
|
||||
"searchModels": "搜索模型...",
|
||||
"noVerifiedModels": "没有已验证的模型。请先测试您的模型。",
|
||||
"noModelsFound": "未找到模型。",
|
||||
"default": "默认",
|
||||
"serverDefault": "服务器默认",
|
||||
"configureModels": "配置模型...",
|
||||
"onlyVerifiedShown": "仅显示已验证的模型"
|
||||
}
|
||||
}
|
||||
|
||||
@@ -21,9 +21,11 @@ export function getLangfuseClient(): LangfuseClient | null {
    return langfuseClient
}

// Check if Langfuse is configured
// Check if Langfuse is configured (both keys required)
export function isLangfuseEnabled(): boolean {
    return !!process.env.LANGFUSE_PUBLIC_KEY
    return !!(
        process.env.LANGFUSE_PUBLIC_KEY && process.env.LANGFUSE_SECRET_KEY
    )
}

// Update trace with input data at the start of request
@@ -43,34 +45,16 @@ export function setTraceInput(params: {
}

// Update trace with output and end the span
export function setTraceOutput(
    output: string,
    usage?: { promptTokens?: number; completionTokens?: number },
) {
// Note: AI SDK 6 telemetry automatically reports token usage on its spans,
// so we only need to set the output text and close our wrapper span
export function setTraceOutput(output: string) {
    if (!isLangfuseEnabled()) return

    updateActiveTrace({ output })

    // End the observe() wrapper span (AI SDK creates its own child spans with usage)
    const activeSpan = api.trace.getActiveSpan()
    if (activeSpan) {
        // Manually set usage attributes since AI SDK Bedrock streaming doesn't provide them
        if (usage?.promptTokens) {
            activeSpan.setAttribute("ai.usage.promptTokens", usage.promptTokens)
            activeSpan.setAttribute(
                "gen_ai.usage.input_tokens",
                usage.promptTokens,
            )
        }
        if (usage?.completionTokens) {
            activeSpan.setAttribute(
                "ai.usage.completionTokens",
                usage.completionTokens,
            )
            activeSpan.setAttribute(
                "gen_ai.usage.output_tokens",
                usage.completionTokens,
            )
        }
        activeSpan.end()
    }
}

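With this change, setTraceOutput no longer forwards token counts; AI SDK 6 telemetry attaches usage to its own spans, so the caller only reports the final output text. A minimal sketch of the new call site, assuming the helpers are exported from a "@/lib/telemetry" module (the import path and surrounding handler are illustrative, not part of this diff):

    // Hypothetical usage of the simplified telemetry helpers (TypeScript).
    // The import path and the way the final text is produced are assumptions.
    import { setTraceInput, setTraceOutput } from "@/lib/telemetry"

    export async function traceOneRequest(
        prompt: string,
        runModel: (p: string) => Promise<string>,
    ): Promise<string> {
        // Record the user input on the active Langfuse trace (no-op when disabled).
        setTraceInput({ input: prompt, userId: "anonymous" })

        const finalText = await runModel(prompt)

        // Previously this call also passed { promptTokens, completionTokens };
        // token usage is now reported automatically by the AI SDK's own spans.
        setTraceOutput(finalText)
        return finalText
    }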
@@ -1,11 +1,10 @@
"use client"

import { useCallback, useMemo } from "react"
import { useCallback } from "react"
import { toast } from "sonner"
import { QuotaLimitToast } from "@/components/quota-limit-toast"
import { useDictionary } from "@/hooks/use-dictionary"
import { formatMessage } from "@/lib/i18n/utils"
import { STORAGE_KEYS } from "@/lib/storage"

export interface QuotaConfig {
    dailyRequestLimit: number
@@ -13,134 +12,19 @@ export interface QuotaConfig {
    tpmLimit: number
}

export interface QuotaCheckResult {
    allowed: boolean
    remaining: number
    used: number
}

/**
 * Hook for managing request/token quotas and rate limiting.
 * Handles three types of limits:
 * - Daily request limit
 * - Daily token limit
 * - Tokens per minute (TPM) rate limit
 *
 * Users with their own API key bypass all limits.
 * Hook for displaying quota limit toasts.
 * Server-side handles actual quota enforcement via DynamoDB.
 * This hook only provides UI feedback when limits are exceeded.
 */
export function useQuotaManager(config: QuotaConfig): {
    hasOwnApiKey: () => boolean
    checkDailyLimit: () => QuotaCheckResult
    checkTokenLimit: () => QuotaCheckResult
    checkTPMLimit: () => QuotaCheckResult
    incrementRequestCount: () => void
    incrementTokenCount: (tokens: number) => void
    incrementTPMCount: (tokens: number) => void
    showQuotaLimitToast: () => void
    showTokenLimitToast: (used: number) => void
    showTPMLimitToast: () => void
} {
    const { dailyRequestLimit, dailyTokenLimit, tpmLimit } = config

    const dict = useDictionary()

    // Check if user has their own API key configured (bypass limits)
    const hasOwnApiKey = useCallback((): boolean => {
        const provider = localStorage.getItem(STORAGE_KEYS.aiProvider)
        const apiKey = localStorage.getItem(STORAGE_KEYS.aiApiKey)
        return !!(provider && apiKey)
    }, [])

    // Generic helper: Parse count from localStorage with NaN guard
    const parseStorageCount = (key: string): number => {
        const count = parseInt(localStorage.getItem(key) || "0", 10)
        return Number.isNaN(count) ? 0 : count
    }

    // Generic helper: Create quota checker factory
    const createQuotaChecker = useCallback(
        (
            getTimeKey: () => string,
            timeStorageKey: string,
            countStorageKey: string,
            limit: number,
        ) => {
            return (): QuotaCheckResult => {
                if (hasOwnApiKey())
                    return { allowed: true, remaining: -1, used: 0 }
                if (limit <= 0) return { allowed: true, remaining: -1, used: 0 }

                const currentTime = getTimeKey()
                const storedTime = localStorage.getItem(timeStorageKey)
                let count = parseStorageCount(countStorageKey)

                if (storedTime !== currentTime) {
                    count = 0
                    localStorage.setItem(timeStorageKey, currentTime)
                    localStorage.setItem(countStorageKey, "0")
                }

                return {
                    allowed: count < limit,
                    remaining: limit - count,
                    used: count,
                }
            }
        },
        [hasOwnApiKey],
    )

    // Generic helper: Create quota incrementer factory
    const createQuotaIncrementer = useCallback(
        (
            getTimeKey: () => string,
            timeStorageKey: string,
            countStorageKey: string,
            validateInput: boolean = false,
        ) => {
            return (tokens: number = 1): void => {
                if (validateInput && (!Number.isFinite(tokens) || tokens <= 0))
                    return

                const currentTime = getTimeKey()
                const storedTime = localStorage.getItem(timeStorageKey)
                let count = parseStorageCount(countStorageKey)

                if (storedTime !== currentTime) {
                    count = 0
                    localStorage.setItem(timeStorageKey, currentTime)
                }

                localStorage.setItem(countStorageKey, String(count + tokens))
            }
        },
        [],
    )

    // Check daily request limit
    const checkDailyLimit = useMemo(
        () =>
            createQuotaChecker(
                () => new Date().toDateString(),
                STORAGE_KEYS.requestDate,
                STORAGE_KEYS.requestCount,
                dailyRequestLimit,
            ),
        [createQuotaChecker, dailyRequestLimit],
    )

    // Increment request count
    const incrementRequestCount = useMemo(
        () =>
            createQuotaIncrementer(
                () => new Date().toDateString(),
                STORAGE_KEYS.requestDate,
                STORAGE_KEYS.requestCount,
                false,
            ),
        [createQuotaIncrementer],
    )

    // Show quota limit toast (request-based)
    const showQuotaLimitToast = useCallback(() => {
        toast.custom(
@@ -155,30 +39,6 @@ export function useQuotaManager(config: QuotaConfig): {
        )
    }, [dailyRequestLimit])

    // Check daily token limit
    const checkTokenLimit = useMemo(
        () =>
            createQuotaChecker(
                () => new Date().toDateString(),
                STORAGE_KEYS.tokenDate,
                STORAGE_KEYS.tokenCount,
                dailyTokenLimit,
            ),
        [createQuotaChecker, dailyTokenLimit],
    )

    // Increment token count
    const incrementTokenCount = useMemo(
        () =>
            createQuotaIncrementer(
                () => new Date().toDateString(),
                STORAGE_KEYS.tokenDate,
                STORAGE_KEYS.tokenCount,
                true, // Validate input tokens
            ),
        [createQuotaIncrementer],
    )

    // Show token limit toast
    const showTokenLimitToast = useCallback(
        (used: number) => {
@@ -197,30 +57,6 @@ export function useQuotaManager(config: QuotaConfig): {
        [dailyTokenLimit],
    )

    // Check TPM (tokens per minute) limit
    const checkTPMLimit = useMemo(
        () =>
            createQuotaChecker(
                () => Math.floor(Date.now() / 60000).toString(),
                STORAGE_KEYS.tpmMinute,
                STORAGE_KEYS.tpmCount,
                tpmLimit,
            ),
        [createQuotaChecker, tpmLimit],
    )

    // Increment TPM count
    const incrementTPMCount = useMemo(
        () =>
            createQuotaIncrementer(
                () => Math.floor(Date.now() / 60000).toString(),
                STORAGE_KEYS.tpmMinute,
                STORAGE_KEYS.tpmCount,
                true, // Validate input tokens
            ),
        [createQuotaIncrementer],
    )

    // Show TPM limit toast
    const showTPMLimitToast = useCallback(() => {
        const limitDisplay =
@@ -233,18 +69,6 @@ export function useQuotaManager(config: QuotaConfig): {
    }, [tpmLimit, dict])

    return {
        // Check functions
        hasOwnApiKey,
        checkDailyLimit,
        checkTokenLimit,
        checkTPMLimit,

        // Increment functions
        incrementRequestCount,
        incrementTokenCount,
        incrementTPMCount,

        // Toast functions
        showQuotaLimitToast,
        showTokenLimitToast,
        showTPMLimitToast,

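After this rewrite the hook exposes only the three toast helpers; request and token counting now happens server-side via DynamoDB (see the quota check in the chat route). A sketch of how a client component might surface the server's decision, assuming purely for illustration that the chat API answers HTTP 429 with a { reason, used } JSON body when a quota is exhausted:

    // Hypothetical consumer of the reduced hook; the 429 response contract
    // below is an assumption, not something this diff defines.
    function useQuotaFeedback() {
        const { showQuotaLimitToast, showTokenLimitToast, showTPMLimitToast } =
            useQuotaManager({
                dailyRequestLimit: 50, // example limits, used only for toast text
                dailyTokenLimit: 200_000,
                tpmLimit: 20_000,
            })

        // Returns true when the response was a quota rejection and a toast was shown.
        return async (res: Response): Promise<boolean> => {
            if (res.status !== 429) return false
            const { reason, used } = await res.json()
            if (reason === "tokens") showTokenLimitToast(used)
            else if (reason === "tpm") showTPMLimitToast()
            else showQuotaLimitToast()
            return true
        }
    }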
41 lib/utils.ts
@@ -61,6 +61,47 @@ export function isMxCellXmlComplete(xml: string | undefined | null): boolean {
    return trimmed.endsWith("/>") || trimmed.endsWith("</mxCell>")
}

/**
 * Extract only complete mxCell elements from partial/streaming XML.
 * This allows progressive rendering during streaming by ignoring incomplete trailing elements.
 * @param xml - The partial XML string (may contain incomplete trailing mxCell)
 * @returns XML string containing only complete mxCell elements
 */
export function extractCompleteMxCells(xml: string | undefined | null): string {
    if (!xml) return ""

    const completeCells: Array<{ index: number; text: string }> = []

    // Match self-closing mxCell tags: <mxCell ... />
    // Also match mxCell with nested mxGeometry: <mxCell ...>...<mxGeometry .../></mxCell>
    const selfClosingPattern = /<mxCell\s+[^>]*\/>/g
    const nestedPattern = /<mxCell\s+[^>]*>[\s\S]*?<\/mxCell>/g

    // Find all self-closing mxCell elements
    let match: RegExpExecArray | null
    while ((match = selfClosingPattern.exec(xml)) !== null) {
        completeCells.push({ index: match.index, text: match[0] })
    }

    // Find all mxCell elements with nested content (like mxGeometry)
    while ((match = nestedPattern.exec(xml)) !== null) {
        completeCells.push({ index: match.index, text: match[0] })
    }

    // Sort by position to maintain order
    completeCells.sort((a, b) => a.index - b.index)

    // Remove duplicates (a self-closing match might overlap with nested match)
    const seen = new Set<number>()
    const uniqueCells = completeCells.filter((cell) => {
        if (seen.has(cell.index)) return false
        seen.add(cell.index)
        return true
    })

    return uniqueCells.map((c) => c.text).join("\n")
}

// ============================================================================
// XML Parsing Helpers
// ============================================================================

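Given the two regexes above, only fully closed cells are emitted while a truncated trailing cell is silently skipped until later chunks complete it. A small illustration — the XML fragment is invented, and the import path assumes the project's usual "@/lib" alias:

    import { extractCompleteMxCells } from "@/lib/utils"

    // Mid-stream chunk: one nested cell, one self-closing cell, and one cell
    // that the model has not finished writing yet.
    const partial = `
    <mxCell id="a" value="Step" vertex="1" parent="1">
        <mxGeometry x="40" y="80" width="120" height="60" as="geometry" />
    </mxCell>
    <mxCell id="b" value="Start" style="ellipse" vertex="1" parent="1" />
    <mxCell id="c" value="Unfinished" vertex=`

    // Returns cells "a" and "b" joined by a newline; the incomplete "c" is dropped,
    // so the canvas can render progressively without ever seeing broken XML.
    const renderable = extractCompleteMxCells(partial)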
1370 package-lock.json (generated)
File diff suppressed because it is too large
26 package.json
@@ -24,21 +24,22 @@
    "dist:all": "npm run electron:build && npm run electron:prepare && npx electron-builder --mac --win --linux"
},
"dependencies": {
    "@ai-sdk/amazon-bedrock": "^3.0.70",
    "@ai-sdk/anthropic": "^2.0.44",
    "@ai-sdk/azure": "^2.0.69",
    "@ai-sdk/deepseek": "^1.0.30",
    "@ai-sdk/gateway": "^2.0.21",
    "@ai-sdk/google": "^2.0.0",
    "@ai-sdk/openai": "^2.0.19",
    "@ai-sdk/react": "^2.0.107",
    "@ai-sdk/amazon-bedrock": "^4.0.1",
    "@ai-sdk/anthropic": "^3.0.0",
    "@ai-sdk/azure": "^3.0.0",
    "@ai-sdk/deepseek": "^2.0.0",
    "@ai-sdk/gateway": "^3.0.0",
    "@ai-sdk/google": "^3.0.0",
    "@ai-sdk/openai": "^3.0.0",
    "@ai-sdk/react": "^3.0.1",
    "@aws-sdk/client-dynamodb": "^3.957.0",
    "@aws-sdk/credential-providers": "^3.943.0",
    "@formatjs/intl-localematcher": "^0.7.2",
    "@langfuse/client": "^4.4.9",
    "@langfuse/otel": "^4.4.4",
    "@langfuse/tracing": "^4.4.9",
    "@next/third-parties": "^16.0.6",
    "@openrouter/ai-sdk-provider": "^1.2.3",
    "@openrouter/ai-sdk-provider": "^1.5.4",
    "@opentelemetry/exporter-trace-otlp-http": "^0.208.0",
    "@opentelemetry/sdk-trace-node": "^2.2.0",
    "@radix-ui/react-alert-dialog": "^1.1.15",
@@ -53,7 +54,7 @@
    "@radix-ui/react-tooltip": "^1.1.8",
    "@radix-ui/react-use-controllable-state": "^1.2.2",
    "@xmldom/xmldom": "^0.9.8",
    "ai": "^5.0.89",
    "ai": "^6.0.1",
    "base-64": "^1.0.0",
    "class-variance-authority": "^0.7.1",
    "clsx": "^2.1.1",
@@ -111,5 +112,10 @@
    "tailwindcss": "^4",
    "typescript": "^5",
    "wait-on": "^9.0.3"
},
"overrides": {
    "@openrouter/ai-sdk-provider": {
        "ai": "^6.0.1"
    }
}
}