Compare commits

...

56 Commits

Author SHA1 Message Date
dayuan.jiang
c42efdc702 chore: bump version to 0.4.0 and update README features 2025-12-11 14:59:09 +09:00
Dayuan Jiang
dd027f1856 chore: update chain-of-thought.txt example file (#215)
Added full content of the Chain-of-Thought Prompting paper by Wei et al. from Google Research. This example file demonstrates how chain-of-thought reasoning improves LLM performance on complex reasoning tasks.
2025-12-11 14:45:29 +09:00
Dayuan Jiang
869391a029 refactor: eliminate code duplication (DRY principle) (#211)
## Problem Solved
Previous refactoring added 105 lines (1476→1581) by extracting code into separate files without eliminating duplication. This refactor focuses on reducing code size through deduplication while maintaining file separation for maintainability.

## Summary
- Reduced total lines from 1581 to 1519 (-62 lines, 3.9% reduction)
- Eliminated duplicate patterns using generic helpers and factory functions
- Maintained file structure for maintainability
- Zero functional changes - same behavior

### Phase 1: DRY use-quota-manager.tsx
- Created parseStorageCount() helper (eliminates 6x localStorage read duplication)
- Created createQuotaChecker() factory (consolidates 3 check function bodies)
- Created createQuotaIncrementer() factory (consolidates 3 increment function bodies)
- Result: 242→247 lines (+5 lines, but fully DRY with eliminated duplication)

### Phase 2: DRY chat-panel.tsx (1176→1109 lines, -67 lines)

#### 2.1: Extract checkAllQuotaLimits helper
- Replaced 3 occurrences of 18-line quota check blocks
- Saved 36 lines

#### 2.2: Extract sendChatMessage helper
- Replaced 3 occurrences of 21-line sendMessage+headers blocks
- Saved 42 lines

#### 2.3: Extract processFilesAndAppendContent helper
- Replaced 2 occurrences of file processing loops
- Handles PDF, text, and image files uniformly
- Async helper with optional image parts parameter
2025-12-11 14:28:02 +09:00
Dayuan Jiang
8b9336466f feat: make PDF/text extraction char limit configurable via env (#214)
Add NEXT_PUBLIC_MAX_EXTRACTED_CHARS environment variable to allow
configuring the maximum characters extracted from PDF and text files.
Defaults to 150000 (150k chars) if not set.
2025-12-11 14:14:31 +09:00
Dayuan Jiang
ee514efa9e fix: implement AZURE_RESOURCE_NAME config for Azure OpenAI (#213)
Previously AZURE_RESOURCE_NAME was documented in env.example but not
actually used in the code. This caused Azure OpenAI configuration to fail
when users set AZURE_RESOURCE_NAME instead of AZURE_BASE_URL.

Changes:
- Read AZURE_RESOURCE_NAME from environment and pass to createAzure()
- resourceName constructs endpoint: https://{name}.openai.azure.com/openai/v1
- baseURL takes precedence over resourceName when both are set
- Updated env.example with clearer documentation

Fixes #208
2025-12-11 13:32:33 +09:00
Dayuan Jiang
e2757a34b7 Replace old asset link with new one
Updated asset link in README.md.
2025-12-11 12:25:00 +09:00
Dayuan Jiang
c0347dd55d fix: prevent diagram regeneration loop after successful display (#206)
- Changed sendAutomaticallyWhen to only auto-resubmit on tool errors
- Extracted logic to shouldAutoResubmit() function with JSDoc
- Added TypeScript interfaces for type safety (MessagePart, ChatMessage)
- Wrapped debug logs with DEBUG flag for production readiness
- Added TOOL_ERROR_STATE constant to avoid hardcoded strings

Problem: AI was regenerating diagrams 3+ times even after successful display
Root cause: lastAssistantMessageIsCompleteWithToolCalls auto-resubmits on both success AND error
Solution: Custom logic that only auto-resubmits on errors, stops on success
2025-12-11 09:47:30 +09:00
Biki Kalita
a047a6ff97 feat: Display AI reasoning/thinking blocks in chat interface (#152)
* feat: Add reasoning/thinking blocks display in chat interface

* feat: add multi-provider options support and replace custom reasoning UI with AI Elements

* resolve conflicting reasoning configs and correct provider-specific reasoning parameters

* try to solve conflict

* fix: simplify reasoning display and remove unnecessary dependencies

- Remove Streamdown dependency (~5MB) - reasoning is plain text only
- Fix Bedrock providerOptions merging for Claude reasoning configs
- Remove unsupported DeepSeek reasoning configuration
- Clean up unused environment variables (REASONING_BUDGET_TOKENS, REASONING_EFFORT, DEEPSEEK_REASONING_*)
- Remove dead commented code from route.ts

Reasoning blocks contain plain thinking text and don't need markdown/diagram/code rendering.

* feat: comprehensive reasoning support improvements

Major improvements:
- Auto-enable reasoning display for all supported models
- Fix provider-specific reasoning configurations
- Remove unnecessary Streamdown dependency (~5MB)
- Clean up debug logging

Provider changes:
- OpenAI: Auto-enable reasoningSummary for o1/o3/gpt-5 models
- Google: Auto-enable includeThoughts for Gemini 2.5/3 models
- Bedrock: Restrict reasoningConfig to only Claude/Nova (fixes MiniMax error)
- Ollama: Add thinking support for qwen3-like models

Other improvements:
- Remove ENABLE_REASONING toggle (always enabled)
- Fix Bedrock providerOptions merging for Claude
- Simplify reasoning component (plain text rendering)
- Clean up unused environment variables

* fix: critical bugs and documentation gaps in reasoning support

Critical fixes:
- Fix Bedrock shallow merge bug (deep merge preserves anthropicBeta + reasoningConfig)
- Add parseInt validation with parseIntSafe helper (prevents NaN errors)
- Validate all numeric env vars with min/max ranges

Documentation improvements:
- Add BEDROCK_REASONING_BUDGET_TOKENS and BEDROCK_REASONING_EFFORT to env.example
- Add OLLAMA_ENABLE_THINKING to env.example
- Update JSDoc with accurate env var list and ranges

Code cleanup:
- Remove debug console.log statements from route.ts
- Refactor duplicate providerOptions assignments

---------

Co-authored-by: Dayuan Jiang <34411969+DayuanJiang@users.noreply.github.com>
Co-authored-by: Dayuan Jiang <jdy.toh@gmail.com>
2025-12-11 00:24:43 +09:00
Dayuan Jiang
d2ba133eaf feat: add PDF and text file upload support (#205)
- Add client-side PDF text extraction using unpdf library
- Support text files (.txt, .md, .json, .csv, .py, .js, .ts, etc.)
- Add file preview with character count for PDF/text files
- Add 150k character limit for extracted content
- Highlight Paper to Diagram example with NEW badge
- Fix React hydration error by adding explicit IDs to ResizablePanelGroup
- Remove code duplication by centralizing file utilities in pdf-utils.ts
2025-12-10 21:32:35 +09:00
Dayuan Jiang
43e5993f47 fix: improve LLM diagram context awareness and image preview (#202)
- Add replaceHistoricalToolInputs to replace XML in tool calls with placeholders
- Send both previousXml and current xml so LLM can understand user's manual edits
- Update system message to mark current XML as authoritative source of truth
- Fix React StrictMode issue with blob URL cleanup in FilePreviewList
- Add unoptimized prop to Image components for blob URLs
2025-12-10 18:04:37 +09:00
Dayuan Jiang
9a954ccb44 fix: correct NEXT_PUBLIC_DRAWIO_BASE_URL in offline deployment docs (#198)
- Fix wrong URL (http://drawio:8080 -> http://localhost:8080)
- Browser cannot resolve Docker internal hostnames
- Add docker-compose.yml example
- Simplify documentation
2025-12-10 14:20:34 +09:00
Dayuan Jiang
9d4c89ec43 docs: improve offline deployment guide with docker compose (#191) 2025-12-10 09:40:48 +09:00
try2love
5da4ef67ec feat:light/dark mode switch (#138)
Summary

- Adds browser theme detection on first visit using
prefers-color-scheme media query
- Renames localStorage key from dark-mode to
next-ai-draw-io-dark-mode for consistency with other keys
- Uses STORAGE_DIAGRAM_XML_KEY constant instead of hardcoded
string in diagram-context.tsx

Changes

app/page.tsx:
- On first visit (no saved preference), detect browser's color
scheme preference
- Update localStorage key to follow project naming convention
(next-ai-draw-io-*)

contexts/diagram-context.tsx:
- Import STORAGE_DIAGRAM_XML_KEY from chat-panel.tsx
- Replace hardcoded "next-ai-draw-io-diagram-xml" with the
constant
2025-12-10 09:21:15 +09:00
Dayuan Jiang
67b0adf211 docs: align Chinese and Japanese README with main README (#190) 2025-12-10 08:42:53 +09:00
Dayuan Jiang
65af353852 Update model requirements and notes in README 2025-12-10 08:19:21 +09:00
Dayuan Jiang
b3deb65584 chore: remove duplicate self-hosted-drawio.md (#188)
Content has been consolidated into docs/offline-deployment.md
2025-12-09 22:27:34 +09:00
Dayuan Jiang
626f3a76b5 docs: add offline deployment guide (#187) 2025-12-09 22:19:28 +09:00
terrydash
97bb350dd6 feat: add support for self-hosted draw.io via DRAWIO_BASE_URL environment variable (#163)
## Summary
Add support for self-hosted draw.io instances via build-time configuration.

## Problem
In some corporate environments, `embed.diagrams.net` is blocked by network policies. 
Users cannot use the application without access to the default draw.io embed URL.

## Solution
- Add `NEXT_PUBLIC_DRAWIO_BASE_URL` environment variable support
- Pass the `baseUrl` prop to the `DrawIoEmbed` component
- Configure Dockerfile to accept build-time argument for the draw.io URL

## Usage
```yaml
# docker-compose.yaml
services:
  drawio:
    image: jgraph/drawio:latest
    ports:
      - "8080:8080"
  
  next-ai-draw-io:
    build:
      context: .
      args:
        - NEXT_PUBLIC_DRAWIO_BASE_URL=http://drawio:8080
```

Or build directly:
```bash
docker build --build-arg NEXT_PUBLIC_DRAWIO_BASE_URL=http://localhost:8080 -t next-ai-draw-io .
```

**Note:** This is a build-time configuration. To change the draw.io URL, you need to rebuild the Docker image.
2025-12-09 22:00:54 +09:00
Dayuan Jiang
97ab82e027 feat: add bring-your-own-API-key support (#186)
- Add AI provider settings to config panel (provider, model, API key, base URL)
- Support 7 providers: OpenAI, Anthropic, Google, Azure, OpenRouter, DeepSeek, SiliconFlow
- Client API keys stored in localStorage, never stored on server
- Client settings override server env vars when provided
- Skip server credential validation when client provides API key
- Bypass usage limits (request/token/TPM) when using own API key
- Add /api/config endpoint for fetching usage limits
- Add privacy notices to settings dialog, about pages, and quota toast
- Add clear settings button to reset saved API keys
- Update README files (EN/CN/JA) with BYOK documentation

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-09 17:50:07 +09:00
dayuan.jiang
77cb10393b fix: translate image content block error to user-friendly message 2025-12-09 16:00:19 +09:00
Dayuan Jiang
967d63c57e feat: support minimax model (#185)
* feat: support minimax model with XML wrapping fix

- Add wrapWithMxFile utility to properly wrap XML for draw.io
- Fix 'Not a diagram file' error when model generates raw <root> XML
- Add supportsPromptCaching check for conditional caching
- Only enable Bedrock prompt caching for Claude models

* docs: update model mention to minimax-m2 across About pages and READMEs

- Update tooltip in chat-panel.tsx to mention minimax-m2 model change
- Update English, Chinese, and Japanese About pages with model change info
- Update English, Chinese, and Japanese READMEs with demo site model note

---------

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-09 15:53:59 +09:00
Dayuan Jiang
914e914423 feat: replace hardcoded usage limits with dynamic env variables (#180)
- About pages now read DAILY_REQUEST_LIMIT, DAILY_TOKEN_LIMIT, TPM_LIMIT from env
- Removed unused /app/api/config/ route
- Numbers formatted as Xk (e.g., 30k, 10k)
2025-12-09 09:57:09 +09:00
Dayuan Jiang
f6cfcab45a fix: upgrade React to 19.1.2 to patch CVE-2025-55182 (#176)
Fixes critical RCE vulnerability (CVSS 10.0) in React Server Components
caused by deserialization of untrusted data.

Closes #175
2025-12-08 23:31:17 +09:00
singledog957
95c5a75ca3 feat: Show detailed error messages instead of generic 'Internal server error' (#144) (#154)
* feat: Show detailed error messages instead of generic 'Internal server error' (#144)

* refactor: simplify error handling logic per feedback

* refactor: imported AI SDK error handler

* fix: remove unused import and expand sensitive data filter

- Remove unused NoSuchModelError import
- Add 'secret', 'password', 'credential' to sensitive data filter

---------

Co-authored-by: dayuan.jiang <jdy.toh@gmail.com>
2025-12-08 20:52:18 +09:00
Dayuan Jiang
ac09f9f8f9 feat: redesign usage limits section on about pages (#173)
- Redesign usage limits card with gradient border and modern styling
- Remove emojis and combine title/subtitle on same line
- Make all 3 language pages (EN/CN/JP) consistent in design
- Update text content with exact localized wording
- Add warning triangle icon in chat panel linking to about page
- Add 'Learn more' link in quota limit toast
- Open about page links in new tab to preserve diagram state
2025-12-08 20:26:51 +09:00
Dayuan Jiang
622829b903 feat: add daily token limit with actual usage tracking (#171)
* feat: add daily token limit with actual usage tracking

- Add DAILY_TOKEN_LIMIT env var for configurable daily token limit
- Track actual tokens from Bedrock API response metadata (not estimates)
- Server sends inputTokens + cachedInputTokens + outputTokens via messageMetadata
- Client increments token count in onFinish callback with actual usage
- Add NaN guards to prevent corrupted localStorage values
- Add token limit toast notification with quota display
- Remove client-side token estimation (was blocking legitimate requests)
- Switch to js-tiktoken for client compatibility (pure JS, no WASM)

* feat: add TPM (tokens per minute) rate limiting

- Add 50k tokens/min client-side rate limit
- Track tokens per minute with automatic minute rollover
- Check TPM limit after daily limits pass
- Show toast when rate limit reached
- NaN guards for localStorage values

* feat: make TPM limit configurable via TPM_LIMIT env var

* chore: restore cache debug logs

* fix: prevent race condition in TPM tracking

checkTPMLimit was resetting TPM count to 0 when checking, which
overwrote the count saved by incrementTPMCount. Now checkTPMLimit
only reads and incrementTPMCount handles all writes.

* chore: improve TPM limit error message clarity
2025-12-08 18:56:34 +09:00
Dayuan Jiang
728dda5267 feat: add daily request limit with custom toast notification (#167)
- Add DAILY_REQUEST_LIMIT env var support in config API
- Track request count in localStorage (resets daily)
- Show friendly quota limit toast with self-host/sponsor links
- Apply limit to send, regenerate, and edit message actions
2025-12-08 14:26:01 +09:00
Dayuan Jiang
b68b1b0f33 chore: clean up root directory by moving docs (#166)
Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-08 12:08:35 +09:00
Dayuan Jiang
bd23aed93b docs: update README with badges, TOC, and live demo button (#165)
* docs: update README with badges, TOC, and reorganized sections

* docs: add SVG button for live demo link
2025-12-08 11:48:14 +09:00
Dayuan Jiang
95aa4b8a56 chore: remove Amplify integration (#164)
Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-08 11:39:32 +09:00
Dayuan Jiang
4070772733 perf: disable prefetch on About link to reduce requests (#162) 2025-12-08 11:03:29 +09:00
Dayuan Jiang
c4aaa7c915 feat: only show access code field when password is configured (#161)
Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-08 11:00:14 +09:00
Dayuan Jiang
ecea8a6005 fix: use static maxDuration value for Next.js 16 compatibility (#160) 2025-12-08 10:56:37 +09:00
Dayuan Jiang
ebc622144b chore: add Cloudflare-related files to gitignore (#159) 2025-12-08 10:42:08 +09:00
Dayuan Jiang
ee9267d54c chore: make maxDuration configurable via env variable (#157)
Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-08 10:20:52 +09:00
Dayuan Jiang
f6682fe3ac fix(diagram): wrap XML in mxfile structure when canvas is empty (#156)
When clicking an example diagram on a clean page (no existing diagram),
the code was passing raw <root>...</root> format XML directly to DrawIO.
However, DrawIO expects the full mxfile/diagram/mxGraphModel/root structure.

This fix provides a default mxfile template when chartXML is empty,
ensuring replaceNodes properly wraps the content before loading into DrawIO.

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-08 10:10:52 +09:00
Biki Kalita
03db4c8096 fix(diagram): save history snapshot after edit_diagram tool execution (#155) 2025-12-08 09:44:10 +09:00
dayuan.jiang
167f5ed36a feat: enable recordInputs in Langfuse telemetry
Enable full message history recording including XML tool calls for better observability.
2025-12-07 20:58:44 +09:00
Dayuan Jiang
cd8e0e2263 feat: add token counting utility for system prompts (#153)
Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-07 20:33:43 +09:00
Dayuan Jiang
8c431ee6ed fix: preserve message parts order in chat display (#151)
- Fix bug where text after tool calls was merged with initial text
- Group consecutive text/file parts into bubbles while keeping tools in order
- Parts now display as: plan -> tool_result -> additional text
- Remove debug logs from fixToolCallInputs function

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-07 19:56:31 +09:00
Dayuan Jiang
86420a42c6 fix: implement client-side caching for example diagrams (#150)
- Add client-side cache check in onFormSubmit to bypass API calls for example prompts
- Use findCachedResponse to match input against cached examples
- Directly set messages with cached tool response when example matches
- Hide regenerate button for cached example responses (toolCallId starts with 'cached-')
- Prevents unnecessary API calls when using example buttons

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-07 19:36:09 +09:00
Dayuan Jiang
0baf21fadb fix: validate XML before displaying diagram to catch duplicate IDs (#147)
- Add validation to loadDiagram in diagram-context, returns error or null
- display_diagram and edit_diagram tools now check validation result
- Return error to AI agent with state: output-error so it can retry
- Skip validation for trusted sources (localStorage, history, internal templates)
- Add debug logging for tool call inputs to diagnose Bedrock API issues
2025-12-07 14:38:15 +09:00
Dayuan Jiang
a54068fec2 docs: add --env-file option for Docker and simplify install instructions (#142) 2025-12-07 11:49:01 +09:00
Dayuan Jiang
e25fd367d5 chore: add GitHub issue templates for bug reports and feature requests (#140) 2025-12-07 11:00:25 +09:00
Aurelius Huang
3264244fe9 fix: add deps: @opentelemetry/exporter-trace-otlp-http version 0.208.0 (#124) 2025-12-07 10:49:01 +09:00
QiyuanChen
d8cdd049d1 feat: add SiliconFlow as a supported AI provider (#137)
* feat: add SiliconFlow as a supported AI provider in documentation and configuration

* fix: update SiliconFlow configuration comment to English
2025-12-07 10:22:57 +09:00
Dayuan Jiang
b1bc1a6dc6 feat: auto-save and restore session state (#135)
- Save and restore chat messages, XML snapshots, session ID, and diagram XML to localStorage
- Restore diagram when DrawIO becomes ready (using new onLoad callback)
- Change close protection default to false since auto-save handles persistence
- Clear localStorage when clearing chat
- Handle edge cases: undefined edit fields, empty chartXML, missing access code header
2025-12-07 01:39:09 +09:00
Biki Kalita
8b578a456e fix: Remove hardcoded temperature parameter to support models that don't support it (#133)
* Fix: remove hardcoded temperature parameter to support reasoning models

* feat: make temperature configurable via AI_TEMPERATURE env var

- Instead of removing temperature entirely, make it optional via env var
- Set AI_TEMPERATURE=0 for deterministic output (recommended for diagrams)
- Leave unset for models that don't support temperature (e.g., GPT-5.1 reasoning)

* docs: add AI_TEMPERATURE env var documentation

- Update env.example with AI_TEMPERATURE option
- Update README.md configuration section
- Add Temperature Setting section in ai-providers.md

* docs: add TEMPERATURE env var documentation

- Update env.example with TEMPERATURE option
- Update README.md, README_CN.md, README_JA.md configuration sections
- Add Temperature Setting section in ai-providers.md
- Update route.ts to use TEMPERATURE env var

---------

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-07 01:34:59 +09:00
Dayuan Jiang
05d58025c4 fix: fix hydration mismatch for DrawIO theme loading (#131)
- Load DrawIO theme from localStorage after mount with useEffect
- Add loading spinner while theme loads
- Prevents SSR/client hydration mismatch (server has no localStorage)

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-07 00:45:19 +09:00
Dayuan Jiang
4be64317b3 feat: enhance system prompts with JSON escaping and edge routing rules (#132)
- Add JSON escaping warnings to help model generate valid tool calls
- Add comprehensive edge routing rules to prevent overlapping lines
- Add planning guidance for diagram creation
- Update token count estimates in comments

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-07 00:40:23 +09:00
Dayuan Jiang
2fac6323f0 fix: add orphaned mxPoint validation and cleanup (#130)
- Add validation for orphaned mxPoint elements in validateMxCellStructure()
- Add cleanup of orphaned mxPoint elements in convertToLegalXml()
- Orphaned mxPoints cause 'Could not add object mxPoint' errors in draw.io
- mxPoint elements must have 'as' attribute or be inside <Array as="points">

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-07 00:40:19 +09:00
Dayuan Jiang
a415c46b66 feat: improve XML search/replace matching strategies (#129)
- Add 6th strategy: match by value attribute (label text)
- Add 7th strategy: normalized whitespace match
- Remove lastProcessedIndex tracking - always search from beginning
- Pairs may not be in document order, so sequential tracking was unreliable

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-07 00:40:16 +09:00
Dayuan Jiang
3894abd9ed feat: add tool call JSON repair and Bedrock compatibility (#127)
- Add fixToolCallInputs() to fix Bedrock API requirement (JSON object, not string)
- Add experimental_repairToolCall for malformed JSON from model
- Add stepCountIs(5) limit to prevent infinite loops
- Update edit_diagram tool description with JSON escaping warning

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-07 00:40:13 +09:00
Dayuan Jiang
6965a54f48 feat: upgrade @ai-sdk/react to 2.0.107 and migrate to new API (#126)
- Upgrade @ai-sdk/react from 2.0.22 to 2.0.107
- Migrate from addToolResult to addToolOutput (new API)
- Add output-error state for proper error signaling to model
- Add sendAutomaticallyWhen for auto-retry on tool errors
- Add stop function ref for potential future use

Co-authored-by: dayuan.jiang <jiangdy@amazon.co.jp>
2025-12-07 00:40:11 +09:00
Dayuan Jiang
46567cb0b8 feat: verify access code with server before saving (#128) 2025-12-07 00:21:59 +09:00
Dayuan Jiang
9f77199272 feat: add configurable close protection setting (#123)
- Add Close Protection toggle to Settings dialog
- Save setting to localStorage (default: enabled)
- Make beforeunload confirmation conditional
- Settings button now always visible in header
- Add shadcn Switch and Label components
2025-12-06 21:42:28 +09:00
51 changed files with 5708 additions and 1100 deletions

35
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file
View File

@@ -0,0 +1,35 @@
---
name: Bug Report
about: Report a bug to help us improve
title: '[Bug] '
labels: bug
assignees: ''
---
> **Note**: This template is just a guide. Feel free to ignore the format entirely - any feedback is welcome! Don't let the template stop you from sharing your thoughts.
## Bug Description
A brief description of the issue.
## Steps to Reproduce
1. Go to '...'
2. Click on '...'
3. Scroll to '...'
4. See error
## Expected Behavior
What you expected to happen.
## Actual Behavior
What actually happened.
## Screenshots
If applicable, add screenshots to help explain the problem.
## Environment
- OS: [e.g. Windows 11, macOS 14]
- Browser: [e.g. Chrome 120, Safari 17]
- Version: [e.g. 1.0.0]
## Additional Context
Any other information about the problem.

5
.github/ISSUE_TEMPLATE/config.yml vendored Normal file
View File

@@ -0,0 +1,5 @@
blank_issues_enabled: true
contact_links:
- name: Discussions
url: https://github.com/DayuanJiang/next-ai-draw-io/discussions
about: Have questions or ideas? Feel free to start a discussion

View File

@@ -0,0 +1,25 @@
---
name: Feature Request
about: Suggest a new feature for this project
title: '[Feature] '
labels: enhancement
assignees: ''
---
> **Note**: This template is just a guide. Feel free to ignore the format entirely - any feedback is welcome! Don't let the template stop you from sharing your ideas.
## Feature Description
A brief description of the feature you'd like.
## Problem Context
Is this related to a problem? Please describe.
e.g. I'm always frustrated when [...]
## Proposed Solution
How you'd like this feature to work.
## Alternatives Considered
Any alternative solutions or features you've considered.
## Additional Context
Any other information or screenshots about the feature request.

8
.gitignore vendored
View File

@@ -40,5 +40,9 @@ yarn-error.log*
*.tsbuildinfo
next-env.d.ts
push-via-ec2.sh
.claude/settings.local.json
.playwright-mcp/
.claude/
.playwright-mcp/
# Cloudflare
.dev.vars
.open-next/
.wrangler/

View File

@@ -22,6 +22,10 @@ COPY . .
# Disable Next.js telemetry during build
ENV NEXT_TELEMETRY_DISABLED=1
# Build-time argument for self-hosted draw.io URL
ARG NEXT_PUBLIC_DRAWIO_BASE_URL=https://embed.diagrams.net
ENV NEXT_PUBLIC_DRAWIO_BASE_URL=${NEXT_PUBLIC_DRAWIO_BASE_URL}
# Build Next.js application (standalone mode)
RUN npm run build

142
README.md
View File

@@ -4,31 +4,44 @@
**AI-Powered Diagram Creation Tool - Chat, Draw, Visualize**
English | [中文](./README_CN.md) | [日本語](./README_JA.md)
English | [中文](./docs/README_CN.md) | [日本語](./docs/README_JA.md)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Next.js](https://img.shields.io/badge/Next.js-15.x-black)](https://nextjs.org/)
[![TypeScript](https://img.shields.io/badge/TypeScript-5.x-blue)](https://www.typescriptlang.org/)
[![TrendShift](https://trendshift.io/api/badge/repositories/15449)](https://next-ai-drawio.jiang.jp/)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Next.js](https://img.shields.io/badge/Next.js-16.x-black)](https://nextjs.org/)
[![React](https://img.shields.io/badge/React-19.x-61dafb)](https://react.dev/)
[![Sponsor](https://img.shields.io/badge/Sponsor-❤-ea4aaa)](https://github.com/sponsors/DayuanJiang)
[🚀 Live Demo](https://next-ai-drawio.jiang.jp/)
[![Live Demo](./public/live-demo-button.svg)](https://next-ai-drawio.jiang.jp/)
</div>
A Next.js web application that integrates AI capabilities with draw.io diagrams. Create, modify, and enhance diagrams through natural language commands and AI-assisted visualization.
https://github.com/user-attachments/assets/b2eef5f3-b335-4e71-a755-dc2e80931979
## Features
- **LLM-Powered Diagram Creation**: Leverage Large Language Models to create and manipulate draw.io diagrams directly through natural language commands
- **Image-Based Diagram Replication**: Upload existing diagrams or images and have the AI replicate and enhance them automatically
- **Diagram History**: Comprehensive version control that tracks all changes, allowing you to view and restore previous versions of your diagrams before the AI editing.
- **Interactive Chat Interface**: Communicate with AI to refine your diagrams in real-time
- **AWS Architecture Diagram Support**: Specialized support for generating AWS architecture diagrams
- **Animated Connectors**: Create dynamic and animated connectors between diagram elements for better visualization
https://github.com/user-attachments/assets/9d60a3e8-4a1c-4b5e-acbb-26af2d3eabd1
## **Examples**
## Table of Contents
- [Next AI Draw.io ](#next-ai-drawio-)
- [Table of Contents](#table-of-contents)
- [Examples](#examples)
- [Features](#features)
- [Getting Started](#getting-started)
- [Try it Online](#try-it-online)
- [Run with Docker (Recommended)](#run-with-docker-recommended)
- [Installation](#installation)
- [Deployment](#deployment)
- [Multi-Provider Support](#multi-provider-support)
- [How It Works](#how-it-works)
- [Project Structure](#project-structure)
- [Support \& Contact](#support--contact)
- [Star History](#star-history)
## Examples
Here are some example prompts and their generated diagrams:
@@ -68,37 +81,29 @@ Here are some example prompts and their generated diagrams:
</table>
</div>
## How It Works
## Features
The application uses the following technologies:
- **Next.js**: For the frontend framework and routing
- **Vercel AI SDK** (`ai` + `@ai-sdk/*`): For streaming AI responses and multi-provider support
- **react-drawio**: For diagram representation and manipulation
Diagrams are represented as XML that can be rendered in draw.io. The AI processes your commands and generates or modifies this XML accordingly.
## Multi-Provider Support
- AWS Bedrock (default)
- OpenAI
- Anthropic
- Google AI
- Azure OpenAI
- Ollama
- OpenRouter
- DeepSeek
All providers except AWS Bedrock and OpenRouter support custom endpoints.
📖 **[Detailed Provider Configuration Guide](./docs/ai-providers.md)** - See setup instructions for each provider.
**Model Requirements**: This task requires strong model capabilities for generating long-form text with strict formatting constraints (draw.io XML). Recommended models include Claude Sonnet 4.5, GPT-4o, Gemini 2.0, and DeepSeek V3/R1.
Note that `claude-sonnet-4-5` has trained on draw.io diagrams with AWS logos, so if you want to create AWS architecture diagrams, this is the best choice.
- **LLM-Powered Diagram Creation**: Leverage Large Language Models to create and manipulate draw.io diagrams directly through natural language commands
- **Image-Based Diagram Replication**: Upload existing diagrams or images and have the AI replicate and enhance them automatically
- **PDF & Text File Upload**: Upload PDF documents and text files to extract content and generate diagrams from existing documents
- **AI Reasoning Display**: View the AI's thinking process for supported models (OpenAI o1/o3, Gemini, Claude, etc.)
- **Diagram History**: Comprehensive version control that tracks all changes, allowing you to view and restore previous versions of your diagrams before the AI editing.
- **Interactive Chat Interface**: Communicate with AI to refine your diagrams in real-time
- **Cloud Architecture Diagram Support**: Specialized support for generating cloud architecture diagrams (AWS, GCP, Azure)
- **Animated Connectors**: Create dynamic and animated connectors between diagram elements for better visualization
## Getting Started
### Try it Online
No installation needed! Try the app directly on our demo site:
[![Live Demo](./public/live-demo-button.svg)](https://next-ai-drawio.jiang.jp/)
> Note: Due to high traffic, the demo site currently uses minimax-m2. For best results, we recommend self-hosting with Claude Sonnet 4.5 or Claude Opus 4.5.
> **Bring Your Own API Key**: You can use your own API key to bypass usage limits on the demo site. Click the Settings icon in the chat panel to configure your provider and API key. Your key is stored locally in your browser and is never stored on the server.
### Run with Docker (Recommended)
If you just want to run it locally, the best way is to use Docker.
@@ -115,10 +120,20 @@ docker run -d -p 3000:3000 \
ghcr.io/dayuanjiang/next-ai-draw-io:latest
```
Or use an env file:
```bash
cp env.example .env
# Edit .env with your configuration
docker run -d -p 3000:3000 --env-file .env ghcr.io/dayuanjiang/next-ai-draw-io:latest
```
Open [http://localhost:3000](http://localhost:3000) in your browser.
Replace the environment variables with your preferred AI provider configuration. See [Multi-Provider Support](#multi-provider-support) for available options.
> **Offline Deployment:** If `embed.diagrams.net` is blocked, see [Offline Deployment](./docs/offline-deployment.md) for configuration options.
### Installation
1. Clone the repository:
@@ -132,8 +147,6 @@ cd next-ai-draw-io
```bash
npm install
# or
yarn install
```
3. Configure your AI provider:
@@ -146,9 +159,10 @@ cp env.example .env.local
Edit `.env.local` and configure your chosen provider:
- Set `AI_PROVIDER` to your chosen provider (bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek)
- Set `AI_PROVIDER` to your chosen provider (bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek, siliconflow)
- Set `AI_MODEL` to the specific model you want to use
- Add the required API keys for your provider
- `TEMPERATURE`: Optional temperature setting (e.g., `0` for deterministic output). Leave unset for models that don't support it (e.g., reasoning models).
- `ACCESS_CODE_LIST`: Optional access password(s), can be comma-separated for multiple passwords.
> Warning: If you do not set `ACCESS_CODE_LIST`, anyone can access your deployed site directly, which may lead to rapid depletion of your token. It is recommended to set this option.
@@ -174,6 +188,38 @@ Or you can deploy by this button.
Be sure to **set the environment variables** in the Vercel dashboard as you did in your local `.env.local` file.
## Multi-Provider Support
- AWS Bedrock (default)
- OpenAI
- Anthropic
- Google AI
- Azure OpenAI
- Ollama
- OpenRouter
- DeepSeek
- SiliconFlow
All providers except AWS Bedrock and OpenRouter support custom endpoints.
📖 **[Detailed Provider Configuration Guide](./docs/ai-providers.md)** - See setup instructions for each provider.
**Model Requirements**: This task requires strong model capabilities for generating long-form text with strict formatting constraints (draw.io XML). Recommended models include Claude Sonnet 4.5, GPT-5.1, Gemini 3 Pro, and DeepSeek V3.2/R1.
Note that `claude` series has trained on draw.io diagrams with cloud architecture logos like AWS, Azue, GCP. So if you want to create cloud architecture diagrams, this is the best choice.
## How It Works
The application uses the following technologies:
- **Next.js**: For the frontend framework and routing
- **Vercel AI SDK** (`ai` + `@ai-sdk/*`): For streaming AI responses and multi-provider support
- **react-drawio**: For diagram representation and manipulation
Diagrams are represented as XML that can be rendered in draw.io. The AI processes your commands and generates or modifies this XML accordingly.
## Project Structure
```
@@ -193,14 +239,6 @@ lib/ # Utility functions and helpers
public/ # Static assets including example images
```
## TODOs
- [x] Allow the LLM to modify the XML instead of generating it from scratch everytime.
- [x] Improve the smoothness of shape streaming updates.
- [x] Add multiple AI provider support (OpenAI, Anthropic, Google, Azure, Ollama)
- [x] Solve the bug that generation will fail for session that longer than 60s.
- [ ] Add API config on the UI.
## Support & Contact
If you find this project useful, please consider [sponsoring](https://github.com/sponsors/DayuanJiang) to help me host the live demo site!

View File

@@ -1,22 +0,0 @@
version: 1
frontend:
phases:
preBuild:
commands:
- npm ci --cache .npm --prefer-offline
build:
commands:
# Write env vars to .env.production for Next.js SSR runtime
- env | grep -e AI_MODEL >> .env.production
- env | grep -e AI_PROVIDER >> .env.production
- env | grep -e OPENAI_API_KEY >> .env.production
- env | grep -e NEXT_PUBLIC_ >> .env.production
- npm run build
artifacts:
baseDirectory: .next
files:
- '**/*'
cache:
paths:
- .next/cache/**/*
- .npm/**/*

View File

@@ -10,7 +10,18 @@ export const metadata: Metadata = {
keywords: ["AI图表", "draw.io", "AWS架构", "GCP图表", "Azure图表", "LLM"],
}
function formatNumber(num: number): string {
if (num >= 1000) {
return `${num / 1000}k`
}
return num.toString()
}
export default function AboutCN() {
const dailyRequestLimit = Number(process.env.DAILY_REQUEST_LIMIT) || 20
const dailyTokenLimit = Number(process.env.DAILY_TOKEN_LIMIT) || 500000
const tpmLimit = Number(process.env.TPM_LIMIT) || 50000
return (
<div className="min-h-screen bg-gray-50">
{/* Navigation */}
@@ -85,12 +96,124 @@ export default function AboutCN() {
</div>
</div>
<div className="bg-amber-50 border border-amber-200 rounded-lg p-4 mb-6">
<p className="text-amber-800">
Claude Opus 4.5
Claude Haiku 4.5
</p>
<div className="relative mb-8 rounded-2xl bg-gradient-to-br from-amber-50 via-orange-50 to-rose-50 p-[1px] shadow-lg">
<div className="absolute inset-0 rounded-2xl bg-gradient-to-br from-amber-400 via-orange-400 to-rose-400 opacity-20" />
<div className="relative rounded-2xl bg-white/80 backdrop-blur-sm p-6">
{/* Header */}
<div className="mb-4">
<h3 className="text-lg font-bold text-gray-900 tracking-tight">
{" "}
<span className="text-sm text-amber-600 font-medium italic font-normal">
()
</span>
</h3>
</div>
{/* Story */}
<div className="space-y-3 text-sm text-gray-700 leading-relaxed mb-5">
<p>
AI
(TPS/TPM)
</p>
<p>
使 Claude {" "}
<span className="font-semibold text-amber-700">
minimax-m2
</span>
</p>
<p>
<span className="font-semibold text-amber-700">
</span>
API
</p>
</div>
{/* Limits Cards */}
<div className="grid grid-cols-2 gap-3 mb-5">
<div className="rounded-xl bg-gradient-to-br from-amber-100 to-orange-100 p-4 text-center">
<div className="text-xs font-medium text-amber-700 uppercase tracking-wide mb-1">
Token
</div>
<div className="text-lg font-bold text-gray-900">
{formatNumber(tpmLimit)}
<span className="text-sm font-normal text-gray-600">
/
</span>
</div>
<div className="text-lg font-bold text-gray-900">
{formatNumber(dailyTokenLimit)}
<span className="text-sm font-normal text-gray-600">
/
</span>
</div>
</div>
<div className="rounded-xl bg-gradient-to-br from-amber-100 to-orange-100 p-4 text-center">
<div className="text-xs font-medium text-amber-700 uppercase tracking-wide mb-1">
</div>
<div className="text-2xl font-bold text-gray-900">
{dailyRequestLimit}
</div>
<div className="text-sm text-gray-600">
</div>
</div>
</div>
{/* Divider */}
<div className="flex items-center gap-3 my-5">
<div className="flex-1 h-px bg-gradient-to-r from-transparent via-amber-300 to-transparent" />
</div>
{/* Bring Your Own Key */}
<div className="text-center mb-5">
<h4 className="text-base font-bold text-gray-900 mb-2">
使 API Key
</h4>
<p className="text-sm text-gray-600 mb-2 max-w-md mx-auto">
使 API Key
Provider API Key
</p>
<p className="text-xs text-gray-500 max-w-md mx-auto">
Key
</p>
</div>
{/* Divider */}
<div className="flex items-center gap-3 mb-5">
<div className="flex-1 h-px bg-gradient-to-r from-transparent via-amber-300 to-transparent" />
</div>
{/* Sponsorship CTA */}
<div className="text-center">
<h4 className="text-base font-bold text-gray-900 mb-2">
()
</h4>
<p className="text-sm text-gray-600 mb-4 max-w-md mx-auto">
AI API
</p>
<p className="text-sm text-gray-600 mb-4 max-w-md mx-auto">
GitHub Live Demo
Logo
</p>
<a
href="mailto:me@jiang.jp"
className="inline-flex items-center gap-2 px-5 py-2.5 rounded-full bg-gradient-to-r from-amber-500 to-orange-500 text-white font-medium text-sm shadow-md hover:shadow-lg hover:scale-105 transition-all duration-200"
>
</a>
</div>
</div>
</div>
<p className="text-gray-700">

View File

@@ -17,7 +17,18 @@ export const metadata: Metadata = {
],
}
function formatNumber(num: number): string {
if (num >= 1000) {
return `${num / 1000}k`
}
return num.toString()
}
export default function AboutJA() {
const dailyRequestLimit = Number(process.env.DAILY_REQUEST_LIMIT) || 20
const dailyTokenLimit = Number(process.env.DAILY_TOKEN_LIMIT) || 500000
const tpmLimit = Number(process.env.TPM_LIMIT) || 50000
return (
<div className="min-h-screen bg-gray-50">
{/* Navigation */}
@@ -93,13 +104,121 @@ export default function AboutJA() {
</div>
</div>
<div className="bg-amber-50 border border-amber-200 rounded-lg p-4 mb-6">
<p className="text-amber-800">
Claude
Opus 4.5
Claude Haiku 4.5
</p>
<div className="relative mb-8 rounded-2xl bg-gradient-to-br from-amber-50 via-orange-50 to-rose-50 p-[1px] shadow-lg">
<div className="absolute inset-0 rounded-2xl bg-gradient-to-br from-amber-400 via-orange-400 to-rose-400 opacity-20" />
<div className="relative rounded-2xl bg-white/80 backdrop-blur-sm p-6">
{/* Header */}
<div className="mb-4">
<h3 className="text-lg font-bold text-gray-900 tracking-tight">
{" "}
<span className="text-sm text-amber-600 font-medium italic font-normal">
</span>
</h3>
</div>
{/* Story */}
<div className="space-y-3 text-sm text-gray-700 leading-relaxed mb-5">
<p>
AI API (TPS/TPM)
</p>
<p>
Claude {" "}
<span className="font-semibold text-amber-700">
minimax-m2
</span>{" "}
</p>
<p>
<span className="font-semibold text-amber-700">
</span>
API
</p>
</div>
{/* Limits Cards */}
<div className="grid grid-cols-2 gap-3 mb-5">
<div className="rounded-xl bg-gradient-to-br from-amber-100 to-orange-100 p-4 text-center">
<div className="text-xs font-medium text-amber-700 uppercase tracking-wide mb-1">
使
</div>
<div className="text-lg font-bold text-gray-900">
{formatNumber(tpmLimit)}
<span className="text-sm font-normal text-gray-600">
/
</span>
</div>
<div className="text-lg font-bold text-gray-900">
{formatNumber(dailyTokenLimit)}
<span className="text-sm font-normal text-gray-600">
/
</span>
</div>
</div>
<div className="rounded-xl bg-gradient-to-br from-amber-100 to-orange-100 p-4 text-center">
<div className="text-xs font-medium text-amber-700 uppercase tracking-wide mb-1">
1
</div>
<div className="text-2xl font-bold text-gray-900">
{dailyRequestLimit}
</div>
<div className="text-sm text-gray-600">
</div>
</div>
</div>
{/* Divider */}
<div className="flex items-center gap-3 my-5">
<div className="flex-1 h-px bg-gradient-to-r from-transparent via-amber-300 to-transparent" />
</div>
{/* Bring Your Own Key */}
<div className="text-center mb-5">
<h4 className="text-base font-bold text-gray-900 mb-2">
APIキーを使用
</h4>
<p className="text-sm text-gray-600 mb-2 max-w-md mx-auto">
APIキーを使用することでAPIキーを設定してください
</p>
<p className="text-xs text-gray-500 max-w-md mx-auto">
</p>
</div>
{/* Divider */}
<div className="flex items-center gap-3 mb-5">
<div className="flex-1 h-px bg-gradient-to-r from-transparent via-amber-300 to-transparent" />
</div>
{/* Sponsorship CTA */}
<div className="text-center">
<h4 className="text-base font-bold text-gray-900 mb-2">
</h4>
<p className="text-sm text-gray-600 mb-4 max-w-md mx-auto">
AI
API
</p>
<p className="text-sm text-gray-600 mb-4 max-w-md mx-auto">
GitHub
</p>
<a
href="mailto:me@jiang.jp"
className="inline-flex items-center gap-2 px-5 py-2.5 rounded-full bg-gradient-to-r from-amber-500 to-orange-500 text-white font-medium text-sm shadow-md hover:shadow-lg hover:scale-105 transition-all duration-200"
>
</a>
</div>
</div>
</div>
<p className="text-gray-700">

View File

@@ -17,7 +17,18 @@ export const metadata: Metadata = {
],
}
function formatNumber(num: number): string {
if (num >= 1000) {
return `${num / 1000}k`
}
return num.toString()
}
export default function About() {
const dailyRequestLimit = Number(process.env.DAILY_REQUEST_LIMIT) || 20
const dailyTokenLimit = Number(process.env.DAILY_TOKEN_LIMIT) || 500000
const tpmLimit = Number(process.env.TPM_LIMIT) || 50000
return (
<div className="min-h-screen bg-gray-50">
{/* Navigation */}
@@ -93,15 +104,134 @@ export default function About() {
</div>
</div>
<div className="bg-amber-50 border border-amber-200 rounded-lg p-4 mb-6">
<p className="text-amber-800">
This app is designed to run on Claude Opus 4.5 for
best performance. However, due to
higher-than-expected traffic, running the top-tier
model has become cost-prohibitive. To avoid service
interruptions and manage costs, I have switched the
backend to Claude Haiku 4.5.
</p>
<div className="relative mb-8 rounded-2xl bg-gradient-to-br from-amber-50 via-orange-50 to-rose-50 p-[1px] shadow-lg">
<div className="absolute inset-0 rounded-2xl bg-gradient-to-br from-amber-400 via-orange-400 to-rose-400 opacity-20" />
<div className="relative rounded-2xl bg-white/80 backdrop-blur-sm p-6">
{/* Header */}
<div className="mb-4">
<h3 className="text-lg font-bold text-gray-900 tracking-tight">
Model Change & Usage Limits{" "}
<span className="text-sm text-amber-600 font-medium italic font-normal">
(Or: Why My Wallet is Crying)
</span>
</h3>
</div>
{/* Story */}
<div className="space-y-3 text-sm text-gray-700 leading-relaxed mb-5">
<p>
The response to this project has been
incredibleyou all love making diagrams!
However, this enthusiasm means we are
frequently hitting the AI API rate limits
(TPS/TPM). When this happens, the system
pauses, leading to failed requests.
</p>
<p>
Due to the high usage, I have changed the
model from Claude to{" "}
<span className="font-semibold text-amber-700">
minimax-m2
</span>
, which is more cost-effective.
</p>
<p>
As an{" "}
<span className="font-semibold text-amber-700">
indie developer
</span>
, I am currently footing the entire API
bill. To keep the lights on and ensure the
service remains available to everyone
without sending me into debt, I have also
implemented the following temporary caps:
</p>
</div>
{/* Limits Cards */}
<div className="grid grid-cols-2 gap-3 mb-5">
<div className="rounded-xl bg-gradient-to-br from-amber-100 to-orange-100 p-4 text-center">
<div className="text-xs font-medium text-amber-700 uppercase tracking-wide mb-1">
Token Usage
</div>
<div className="text-lg font-bold text-gray-900">
{formatNumber(tpmLimit)}
<span className="text-sm font-normal text-gray-600">
/min
</span>
</div>
<div className="text-lg font-bold text-gray-900">
{formatNumber(dailyTokenLimit)}
<span className="text-sm font-normal text-gray-600">
/day
</span>
</div>
</div>
<div className="rounded-xl bg-gradient-to-br from-amber-100 to-orange-100 p-4 text-center">
<div className="text-xs font-medium text-amber-700 uppercase tracking-wide mb-1">
Daily Requests
</div>
<div className="text-2xl font-bold text-gray-900">
{dailyRequestLimit}
</div>
<div className="text-sm text-gray-600">
requests
</div>
</div>
</div>
{/* Divider */}
<div className="flex items-center gap-3 my-5">
<div className="flex-1 h-px bg-gradient-to-r from-transparent via-amber-300 to-transparent" />
</div>
{/* Bring Your Own Key */}
<div className="text-center mb-5">
<h4 className="text-base font-bold text-gray-900 mb-2">
Bring Your Own API Key
</h4>
<p className="text-sm text-gray-600 mb-2 max-w-md mx-auto">
You can use your own API key to bypass these
limits. Click the Settings icon in the chat
panel to configure your provider and API
key.
</p>
<p className="text-xs text-gray-500 max-w-md mx-auto">
Your key is stored locally in your browser
and is never stored on the server.
</p>
</div>
{/* Divider */}
<div className="flex items-center gap-3 mb-5">
<div className="flex-1 h-px bg-gradient-to-r from-transparent via-amber-300 to-transparent" />
</div>
{/* Sponsorship CTA */}
<div className="text-center">
<h4 className="text-base font-bold text-gray-900 mb-2">
Call for Sponsorship
</h4>
<p className="text-sm text-gray-600 mb-4 max-w-md mx-auto">
Scaling the backend is the only way to
remove these limits. I am actively seeking
sponsorship from AI API providers or Cloud
Platforms.
</p>
<p className="text-sm text-gray-600 mb-4 max-w-md mx-auto">
In return for support (credits or funding),
I will prominently feature your company as a
platform sponsor on both the GitHub
repository and the live demo site.
</p>
<a
href="mailto:me@jiang.jp"
className="inline-flex items-center gap-2 px-5 py-2.5 rounded-full bg-gradient-to-r from-amber-500 to-orange-500 text-white font-medium text-sm shadow-md hover:shadow-lg hover:scale-105 transition-all duration-200"
>
Contact Me
</a>
</div>
</div>
</div>
<p className="text-gray-700">

View File

@@ -1,11 +1,14 @@
import {
APICallError,
convertToModelMessages,
createUIMessageStream,
createUIMessageStreamResponse,
LoadAPIKeyError,
stepCountIs,
streamText,
} from "ai"
import { z } from "zod"
import { getAIModel } from "@/lib/ai-providers"
import { getAIModel, supportsPromptCaching } from "@/lib/ai-providers"
import { findCachedResponse } from "@/lib/cached-responses"
import {
getTelemetryConfig,
@@ -15,7 +18,7 @@ import {
} from "@/lib/langfuse"
import { getSystemPrompt } from "@/lib/system-prompts"
export const maxDuration = 60
export const maxDuration = 300
// File upload limits (must match client-side)
const MAX_FILE_SIZE = 2 * 1024 * 1024 // 2MB
@@ -63,6 +66,64 @@ function isMinimalDiagram(xml: string): boolean {
return !stripped.includes('id="2"')
}
// Helper function to replace historical tool call XML with placeholders
// This reduces token usage and forces LLM to rely on the current diagram XML (source of truth)
function replaceHistoricalToolInputs(messages: any[]): any[] {
return messages.map((msg) => {
if (msg.role !== "assistant" || !Array.isArray(msg.content)) {
return msg
}
const replacedContent = msg.content.map((part: any) => {
if (part.type === "tool-call") {
const toolName = part.toolName
if (
toolName === "display_diagram" ||
toolName === "edit_diagram"
) {
return {
...part,
input: {
placeholder:
"[XML content replaced - see current diagram XML in system context]",
},
}
}
}
return part
})
return { ...msg, content: replacedContent }
})
}
// Helper function to fix tool call inputs for Bedrock API
// Bedrock requires toolUse.input to be a JSON object, not a string
function fixToolCallInputs(messages: any[]): any[] {
return messages.map((msg) => {
if (msg.role !== "assistant" || !Array.isArray(msg.content)) {
return msg
}
const fixedContent = msg.content.map((part: any) => {
if (part.type === "tool-call") {
if (typeof part.input === "string") {
try {
const parsed = JSON.parse(part.input)
return { ...part, input: parsed }
} catch {
// If parsing fails, wrap the string in an object
return { ...part, input: { rawInput: part.input } }
}
}
// Input is already an object, but verify it's not null/undefined
if (part.input === null || part.input === undefined) {
return { ...part, input: {} }
}
}
return part
})
return { ...msg, content: fixedContent }
})
}
// Helper function to create cached stream response
function createCachedStreamResponse(xml: string): Response {
const toolCallId = `cached-${Date.now()}`
@@ -112,7 +173,7 @@ async function handleChatRequest(req: Request): Promise<Response> {
}
}
const { messages, xml, sessionId } = await req.json()
const { messages, xml, previousXml, sessionId } = await req.json()
// Get user IP for Langfuse tracking
const forwardedFor = req.headers.get("x-forwarded-for")
@@ -155,17 +216,28 @@ async function handleChatRequest(req: Request): Promise<Response> {
const cached = findCachedResponse(textPart?.text || "", !!filePart)
if (cached) {
console.log(
"[Cache] Returning cached response for:",
textPart?.text,
)
return createCachedStreamResponse(cached.xml)
}
}
// === CACHE CHECK END ===
// Get AI model from environment configuration
const { model, providerOptions, headers, modelId } = getAIModel()
// Read client AI provider overrides from headers
const clientOverrides = {
provider: req.headers.get("x-ai-provider"),
baseUrl: req.headers.get("x-ai-base-url"),
apiKey: req.headers.get("x-ai-api-key"),
modelId: req.headers.get("x-ai-model"),
}
// Get AI model with optional client overrides
const { model, providerOptions, headers, modelId } =
getAIModel(clientOverrides)
// Check if model supports prompt caching
const shouldCache = supportsPromptCaching(modelId)
console.log(
`[Prompt Caching] ${shouldCache ? "ENABLED" : "DISABLED"} for model: ${modelId}`,
)
// Get the appropriate system prompt based on model (extended for Opus/Haiku 4.5)
const systemMessage = getSystemPrompt(modelId)
@@ -189,9 +261,15 @@ ${lastMessageText}
// Convert UIMessages to ModelMessages and add system message
const modelMessages = convertToModelMessages(messages)
// Fix tool call inputs for Bedrock API (requires JSON objects, not strings)
const fixedMessages = fixToolCallInputs(modelMessages)
// Replace historical tool call XML with placeholders to reduce tokens and avoid confusion
const placeholderMessages = replaceHistoricalToolInputs(fixedMessages)
// Filter out messages with empty content arrays (Bedrock API rejects these)
// This is a safety measure - ideally convertToModelMessages should handle all cases
let enhancedMessages = modelMessages.filter(
let enhancedMessages = placeholderMessages.filter(
(msg: any) =>
msg.content && Array.isArray(msg.content) && msg.content.length > 0,
)
@@ -224,7 +302,7 @@ ${lastMessageText}
// Add cache point to the last assistant message in conversation history
// This caches the entire conversation prefix for subsequent requests
// Strategy: system (cached) + history with last assistant (cached) + new user message
if (enhancedMessages.length >= 2) {
if (shouldCache && enhancedMessages.length >= 2) {
// Find the last assistant message (should be second-to-last, before current user message)
for (let i = enhancedMessages.length - 2; i >= 0; i--) {
if (enhancedMessages[i].role === "assistant") {
@@ -249,17 +327,21 @@ ${lastMessageText}
{
role: "system" as const,
content: systemMessage,
providerOptions: {
bedrock: { cachePoint: { type: "default" } },
},
...(shouldCache && {
providerOptions: {
bedrock: { cachePoint: { type: "default" } },
},
}),
},
// Cache breakpoint 2: Current diagram XML context
// Cache breakpoint 2: Previous and Current diagram XML context
{
role: "system" as const,
content: `Current diagram XML:\n"""xml\n${xml || ""}\n"""\nWhen using edit_diagram, COPY search patterns exactly from this XML - attribute order matters!`,
providerOptions: {
bedrock: { cachePoint: { type: "default" } },
},
content: `${previousXml ? `Previous diagram XML (before user's last message):\n"""xml\n${previousXml}\n"""\n\n` : ""}Current diagram XML (AUTHORITATIVE - the source of truth):\n"""xml\n${xml || ""}\n"""\n\nIMPORTANT: The "Current diagram XML" is the SINGLE SOURCE OF TRUTH for what's on the canvas right now. The user can manually add, delete, or modify shapes directly in draw.io. Always count and describe elements based on the CURRENT XML, not on what you previously generated. If both previous and current XML are shown, compare them to understand what the user changed. When using edit_diagram, COPY search patterns exactly from the CURRENT XML - attribute order matters!`,
...(shouldCache && {
providerOptions: {
bedrock: { cachePoint: { type: "default" } },
},
}),
},
]
@@ -267,8 +349,9 @@ ${lastMessageText}
const result = streamText({
model,
stopWhen: stepCountIs(5),
messages: allMessages,
...(providerOptions && { providerOptions }),
...(providerOptions && { providerOptions }), // This now includes all reasoning configs
...(headers && { headers }),
// Langfuse telemetry config (returns undefined if not configured)
...(getTelemetryConfig({ sessionId: validSessionId, userId }) && {
@@ -277,14 +360,34 @@ ${lastMessageText}
userId,
}),
}),
onFinish: ({ text, usage, providerMetadata }) => {
console.log(
"[Cache] Full providerMetadata:",
JSON.stringify(providerMetadata, null, 2),
)
console.log("[Cache] Usage:", JSON.stringify(usage, null, 2))
// Repair malformed tool calls (model sometimes generates invalid JSON with unescaped quotes)
experimental_repairToolCall: async ({ toolCall }) => {
// The toolCall.input contains the raw JSON string that failed to parse
const rawJson =
typeof toolCall.input === "string" ? toolCall.input : null
if (rawJson) {
try {
// Fix unescaped quotes: x="520" should be x=\"520\"
const fixed = rawJson.replace(
/([a-zA-Z])="(\d+)"/g,
'$1=\\"$2\\"',
)
const parsed = JSON.parse(fixed)
return {
type: "tool-call" as const,
toolCallId: toolCall.toolCallId,
toolName: toolCall.toolName,
input: JSON.stringify(parsed),
}
} catch {
// Repair failed, return null
}
}
return null
},
onFinish: ({ text, usage }) => {
// Pass usage to Langfuse (Bedrock streaming doesn't auto-report tokens to telemetry)
// AI SDK uses inputTokens/outputTokens, Langfuse expects promptTokens/completionTokens
setTraceOutput(text, {
promptTokens: usage?.inputTokens,
completionTokens: usage?.outputTokens,
@@ -342,7 +445,9 @@ IMPORTANT: Keep edits concise:
- Only include the lines that are changing, plus 1-2 surrounding lines for context if needed
- Break large changes into multiple smaller edits
- Each search must contain complete lines (never truncate mid-line)
- First match only - be specific enough to target the right element`,
- First match only - be specific enough to target the right element
⚠️ JSON ESCAPING: Every " inside string values MUST be escaped as \\". Example: x=\\"100\\" y=\\"200\\" - BOTH quotes need backslashes!`,
inputSchema: z.object({
edits: z
.array(
@@ -363,10 +468,96 @@ IMPORTANT: Keep edits concise:
}),
},
},
temperature: 0,
...(process.env.TEMPERATURE !== undefined && {
temperature: parseFloat(process.env.TEMPERATURE),
}),
})
return result.toUIMessageStreamResponse()
return result.toUIMessageStreamResponse({
sendReasoning: true,
messageMetadata: ({ part }) => {
if (part.type === "finish") {
const usage = (part as any).totalUsage
if (!usage) {
console.warn(
"[messageMetadata] No usage data in finish part",
)
return undefined
}
// Total input = non-cached + cached (these are separate counts)
// Note: cacheWriteInputTokens is not available on finish part
const totalInputTokens =
(usage.inputTokens ?? 0) + (usage.cachedInputTokens ?? 0)
return {
inputTokens: totalInputTokens,
outputTokens: usage.outputTokens ?? 0,
}
}
return undefined
},
})
}
// Helper to categorize errors and return appropriate response
function handleError(error: unknown): Response {
console.error("Error in chat route:", error)
const isDev = process.env.NODE_ENV === "development"
// Check for specific AI SDK error types
if (APICallError.isInstance(error)) {
return Response.json(
{
error: error.message,
...(isDev && {
details: error.responseBody,
stack: error.stack,
}),
},
{ status: error.statusCode || 500 },
)
}
if (LoadAPIKeyError.isInstance(error)) {
return Response.json(
{
error: "Authentication failed. Please check your API key.",
...(isDev && {
stack: error.stack,
}),
},
{ status: 401 },
)
}
// Fallback for other errors with safety filter
const message =
error instanceof Error ? error.message : "An unexpected error occurred"
const status = (error as any)?.statusCode || (error as any)?.status || 500
// Prevent leaking API keys, tokens, or other sensitive data
const lowerMessage = message.toLowerCase()
const safeMessage =
lowerMessage.includes("key") ||
lowerMessage.includes("token") ||
lowerMessage.includes("sig") ||
lowerMessage.includes("signature") ||
lowerMessage.includes("secret") ||
lowerMessage.includes("password") ||
lowerMessage.includes("credential")
? "Authentication failed. Please check your credentials."
: message
return Response.json(
{
error: safeMessage,
...(isDev && {
details: message,
stack: error instanceof Error ? error.stack : undefined,
}),
},
{ status },
)
}
// Wrap handler with error handling
@@ -374,11 +565,7 @@ async function safeHandler(req: Request): Promise<Response> {
try {
return await handleChatRequest(req)
} catch (error) {
console.error("Error in chat route:", error)
return Response.json(
{ error: "Internal server error" },
{ status: 500 },
)
return handleError(error)
}
}

View File

@@ -1,12 +1,10 @@
import { NextResponse } from "next/server"
export async function GET() {
const accessCodes =
process.env.ACCESS_CODE_LIST?.split(",")
.map((code) => code.trim())
.filter(Boolean) || []
return NextResponse.json({
accessCodeRequired: accessCodes.length > 0,
accessCodeRequired: !!process.env.ACCESS_CODE_LIST,
dailyRequestLimit: Number(process.env.DAILY_REQUEST_LIMIT) || 0,
dailyTokenLimit: Number(process.env.DAILY_TOKEN_LIMIT) || 0,
tpmLimit: Number(process.env.TPM_LIMIT) || 0,
})
}

View File

@@ -0,0 +1,32 @@
export async function POST(req: Request) {
const accessCodes =
process.env.ACCESS_CODE_LIST?.split(",")
.map((code) => code.trim())
.filter(Boolean) || []
// If no access codes configured, verification always passes
if (accessCodes.length === 0) {
return Response.json({
valid: true,
message: "No access code required",
})
}
const accessCodeHeader = req.headers.get("x-access-code")
if (!accessCodeHeader) {
return Response.json(
{ valid: false, message: "Access code is required" },
{ status: 401 },
)
}
if (!accessCodes.includes(accessCodeHeader)) {
return Response.json(
{ valid: false, message: "Invalid access code" },
{ status: 401 },
)
}
return Response.json({ valid: true, message: "Access code is valid" })
}

View File

@@ -106,7 +106,7 @@ export default function RootLayout({
}
return (
<html lang="en">
<html lang="en" suppressHydrationWarning>
<head>
<script
type="application/ld+json"

View File

@@ -3,6 +3,7 @@ import { useEffect, useRef, useState } from "react"
import { DrawIoEmbed } from "react-drawio"
import type { ImperativePanelHandle } from "react-resizable-panels"
import ChatPanel from "@/components/chat-panel"
import { STORAGE_CLOSE_PROTECTION_KEY } from "@/components/settings-dialog"
import {
ResizableHandle,
ResizablePanel,
@@ -10,19 +11,63 @@ import {
} from "@/components/ui/resizable"
import { useDiagram } from "@/contexts/diagram-context"
const drawioBaseUrl =
process.env.NEXT_PUBLIC_DRAWIO_BASE_URL || "https://embed.diagrams.net"
export default function Home() {
const { drawioRef, handleDiagramExport } = useDiagram()
const { drawioRef, handleDiagramExport, onDrawioLoad, resetDrawioReady } =
useDiagram()
const [isMobile, setIsMobile] = useState(false)
const [isChatVisible, setIsChatVisible] = useState(true)
const [drawioUi, setDrawioUi] = useState<"min" | "sketch">(() => {
if (typeof window !== "undefined") {
const saved = localStorage.getItem("drawio-theme")
if (saved === "min" || saved === "sketch") return saved
}
return "min"
})
const [drawioUi, setDrawioUi] = useState<"min" | "sketch">("min")
const [darkMode, setDarkMode] = useState(false)
const [isLoaded, setIsLoaded] = useState(false)
const [closeProtection, setCloseProtection] = useState(false)
const chatPanelRef = useRef<ImperativePanelHandle>(null)
// Load preferences from localStorage after mount
useEffect(() => {
const savedUi = localStorage.getItem("drawio-theme")
if (savedUi === "min" || savedUi === "sketch") {
setDrawioUi(savedUi)
}
const savedDarkMode = localStorage.getItem("next-ai-draw-io-dark-mode")
if (savedDarkMode !== null) {
// Use saved preference
const isDark = savedDarkMode === "true"
setDarkMode(isDark)
document.documentElement.classList.toggle("dark", isDark)
} else {
// First visit: match browser preference
const prefersDark = window.matchMedia(
"(prefers-color-scheme: dark)",
).matches
setDarkMode(prefersDark)
document.documentElement.classList.toggle("dark", prefersDark)
}
const savedCloseProtection = localStorage.getItem(
STORAGE_CLOSE_PROTECTION_KEY,
)
if (savedCloseProtection === "true") {
setCloseProtection(true)
}
setIsLoaded(true)
}, [])
const toggleDarkMode = () => {
const newValue = !darkMode
setDarkMode(newValue)
localStorage.setItem("next-ai-draw-io-dark-mode", String(newValue))
document.documentElement.classList.toggle("dark", newValue)
// Reset so onDrawioLoad fires again after remount
resetDrawioReady()
}
// Check mobile
useEffect(() => {
const checkMobile = () => {
setIsMobile(window.innerWidth < 768)
@@ -46,6 +91,7 @@ export default function Home() {
}
}
// Keyboard shortcut for toggling chat panel
useEffect(() => {
const handleKeyDown = (event: KeyboardEvent) => {
if ((event.ctrlKey || event.metaKey) && event.key === "b") {
@@ -59,8 +105,9 @@ export default function Home() {
}, [])
// Show confirmation dialog when user tries to leave the page
// This helps prevent accidental navigation from browser back gestures
useEffect(() => {
if (!closeProtection) return
const handleBeforeUnload = (event: BeforeUnloadEvent) => {
event.preventDefault()
return ""
@@ -69,35 +116,49 @@ export default function Home() {
window.addEventListener("beforeunload", handleBeforeUnload)
return () =>
window.removeEventListener("beforeunload", handleBeforeUnload)
}, [])
}, [closeProtection])
return (
<div className="h-screen bg-background relative overflow-hidden">
<ResizablePanelGroup
id="main-panel-group"
key={isMobile ? "mobile" : "desktop"}
direction={isMobile ? "vertical" : "horizontal"}
className="h-full"
>
{/* Draw.io Canvas */}
<ResizablePanel defaultSize={isMobile ? 50 : 67} minSize={20}>
<ResizablePanel
id="drawio-panel"
defaultSize={isMobile ? 50 : 67}
minSize={20}
>
<div
className={`h-full relative ${
isMobile ? "p-1" : "p-2"
}`}
>
<div className="h-full rounded-xl overflow-hidden shadow-soft-lg border border-border/30 bg-white">
<DrawIoEmbed
key={drawioUi}
ref={drawioRef}
onExport={handleDiagramExport}
urlParameters={{
ui: drawioUi,
spin: true,
libraries: false,
saveAndExit: false,
noExitBtn: true,
}}
/>
<div className="h-full rounded-xl overflow-hidden shadow-soft-lg border border-border/30">
{isLoaded ? (
<DrawIoEmbed
key={`${drawioUi}-${darkMode}`}
ref={drawioRef}
onExport={handleDiagramExport}
onLoad={onDrawioLoad}
baseUrl={drawioBaseUrl}
urlParameters={{
ui: drawioUi,
spin: true,
libraries: false,
saveAndExit: false,
noExitBtn: true,
dark: darkMode,
}}
/>
) : (
<div className="h-full w-full flex items-center justify-center bg-background">
<div className="animate-spin h-8 w-8 border-4 border-primary border-t-transparent rounded-full" />
</div>
)}
</div>
</div>
</ResizablePanel>
@@ -106,6 +167,7 @@ export default function Home() {
{/* Chat Panel */}
<ResizablePanel
id="chat-panel"
ref={chatPanelRef}
defaultSize={isMobile ? 50 : 33}
minSize={isMobile ? 20 : 15}
@@ -121,12 +183,16 @@ export default function Home() {
onToggleVisibility={toggleChatPanel}
drawioUi={drawioUi}
onToggleDrawioUi={() => {
const newTheme =
const newUi =
drawioUi === "min" ? "sketch" : "min"
localStorage.setItem("drawio-theme", newTheme)
setDrawioUi(newTheme)
localStorage.setItem("drawio-theme", newUi)
setDrawioUi(newUi)
resetDrawioReady()
}}
darkMode={darkMode}
onToggleDarkMode={toggleDarkMode}
isMobile={isMobile}
onCloseProtectionChange={setCloseProtection}
/>
</div>
</ResizablePanel>

View File

@@ -0,0 +1,186 @@
"use client"
import { useControllableState } from "@radix-ui/react-use-controllable-state"
import { BrainIcon, ChevronDownIcon } from "lucide-react"
import type { ComponentProps, ReactNode } from "react"
import { createContext, memo, useContext, useEffect, useState } from "react"
import {
Collapsible,
CollapsibleContent,
CollapsibleTrigger,
} from "@/components/ui/collapsible"
import { cn } from "@/lib/utils"
import { Shimmer } from "./shimmer"
type ReasoningContextValue = {
isStreaming: boolean
isOpen: boolean
setIsOpen: (open: boolean) => void
duration: number | undefined
}
const ReasoningContext = createContext<ReasoningContextValue | null>(null)
export const useReasoning = () => {
const context = useContext(ReasoningContext)
if (!context) {
throw new Error("Reasoning components must be used within Reasoning")
}
return context
}
export type ReasoningProps = ComponentProps<typeof Collapsible> & {
isStreaming?: boolean
open?: boolean
defaultOpen?: boolean
onOpenChange?: (open: boolean) => void
duration?: number
}
const AUTO_CLOSE_DELAY = 1000
const MS_IN_S = 1000
export const Reasoning = memo(
({
className,
isStreaming = false,
open,
defaultOpen = true,
onOpenChange,
duration: durationProp,
children,
...props
}: ReasoningProps) => {
const [isOpen, setIsOpen] = useControllableState({
prop: open,
defaultProp: defaultOpen,
onChange: onOpenChange,
})
const [duration, setDuration] = useControllableState({
prop: durationProp,
defaultProp: undefined,
})
const [hasAutoClosed, setHasAutoClosed] = useState(false)
const [startTime, setStartTime] = useState<number | null>(null)
// Track duration when streaming starts and ends
useEffect(() => {
if (isStreaming) {
if (startTime === null) {
setStartTime(Date.now())
}
} else if (startTime !== null) {
setDuration(Math.ceil((Date.now() - startTime) / MS_IN_S))
setStartTime(null)
}
}, [isStreaming, startTime, setDuration])
// Auto-open when streaming starts, auto-close when streaming ends (once only)
useEffect(() => {
if (defaultOpen && !isStreaming && isOpen && !hasAutoClosed) {
// Add a small delay before closing to allow user to see the content
const timer = setTimeout(() => {
setIsOpen(false)
setHasAutoClosed(true)
}, AUTO_CLOSE_DELAY)
return () => clearTimeout(timer)
}
}, [isStreaming, isOpen, defaultOpen, setIsOpen, hasAutoClosed])
const handleOpenChange = (newOpen: boolean) => {
setIsOpen(newOpen)
}
return (
<ReasoningContext.Provider
value={{ isStreaming, isOpen, setIsOpen, duration }}
>
<Collapsible
className={cn("not-prose mb-4", className)}
onOpenChange={handleOpenChange}
open={isOpen}
{...props}
>
{children}
</Collapsible>
</ReasoningContext.Provider>
)
},
)
export type ReasoningTriggerProps = ComponentProps<
typeof CollapsibleTrigger
> & {
getThinkingMessage?: (isStreaming: boolean, duration?: number) => ReactNode
}
const defaultGetThinkingMessage = (isStreaming: boolean, duration?: number) => {
if (isStreaming || duration === 0) {
return <Shimmer duration={1}>Thinking...</Shimmer>
}
if (duration === undefined) {
return <p>Thought for a few seconds</p>
}
return <p>Thought for {duration} seconds</p>
}
export const ReasoningTrigger = memo(
({
className,
children,
getThinkingMessage = defaultGetThinkingMessage,
...props
}: ReasoningTriggerProps) => {
const { isStreaming, isOpen, duration } = useReasoning()
return (
<CollapsibleTrigger
className={cn(
"flex w-full items-center gap-2 text-muted-foreground text-sm transition-colors hover:text-foreground",
className,
)}
{...props}
>
{children ?? (
<>
<BrainIcon className="size-4" />
{getThinkingMessage(isStreaming, duration)}
<ChevronDownIcon
className={cn(
"size-4 transition-transform",
isOpen ? "rotate-180" : "rotate-0",
)}
/>
</>
)}
</CollapsibleTrigger>
)
},
)
export type ReasoningContentProps = ComponentProps<
typeof CollapsibleContent
> & {
children: string
}
export const ReasoningContent = memo(
({ className, children, ...props }: ReasoningContentProps) => (
<CollapsibleContent
className={cn(
"mt-4 text-sm",
"data-[state=closed]:fade-out-0 data-[state=closed]:slide-out-to-top-2 data-[state=open]:slide-in-from-top-2 text-muted-foreground outline-none data-[state=closed]:animate-out data-[state=open]:animate-in",
className,
)}
{...props}
>
<div className="whitespace-pre-wrap">{children}</div>
</CollapsibleContent>
),
)
Reasoning.displayName = "Reasoning"
ReasoningTrigger.displayName = "ReasoningTrigger"
ReasoningContent.displayName = "ReasoningContent"

View File

@@ -0,0 +1,64 @@
"use client"
import { motion } from "motion/react"
import {
type CSSProperties,
type ElementType,
type JSX,
memo,
useMemo,
} from "react"
import { cn } from "@/lib/utils"
export type TextShimmerProps = {
children: string
as?: ElementType
className?: string
duration?: number
spread?: number
}
const ShimmerComponent = ({
children,
as: Component = "p",
className,
duration = 2,
spread = 2,
}: TextShimmerProps) => {
const MotionComponent = motion.create(
Component as keyof JSX.IntrinsicElements,
)
const dynamicSpread = useMemo(
() => (children?.length ?? 0) * spread,
[children, spread],
)
return (
<MotionComponent
animate={{ backgroundPosition: "0% center" }}
className={cn(
"relative inline-block bg-[length:250%_100%,auto] bg-clip-text text-transparent",
"[--bg:linear-gradient(90deg,#0000_calc(50%-var(--spread)),var(--color-background),#0000_calc(50%+var(--spread)))] [background-repeat:no-repeat,padding-box]",
className,
)}
initial={{ backgroundPosition: "100% center" }}
style={
{
"--spread": `${dynamicSpread}px`,
backgroundImage:
"var(--bg), linear-gradient(var(--color-muted-foreground), var(--color-muted-foreground))",
} as CSSProperties
}
transition={{
repeat: Number.POSITIVE_INFINITY,
duration,
ease: "linear",
}}
>
{children}
</MotionComponent>
)
}
export const Shimmer = memo(ShimmerComponent)

View File

@@ -1,28 +1,52 @@
"use client"
import { Cloud, GitBranch, Palette, Zap } from "lucide-react"
import { Cloud, FileText, GitBranch, Palette, Zap } from "lucide-react"
interface ExampleCardProps {
icon: React.ReactNode
title: string
description: string
onClick: () => void
isNew?: boolean
}
function ExampleCard({ icon, title, description, onClick }: ExampleCardProps) {
function ExampleCard({
icon,
title,
description,
onClick,
isNew,
}: ExampleCardProps) {
return (
<button
onClick={onClick}
className="group w-full text-left p-4 rounded-xl border border-border/60 bg-card hover:bg-accent/50 hover:border-primary/30 transition-all duration-200 hover:shadow-sm"
className={`group w-full text-left p-4 rounded-xl border bg-card hover:bg-accent/50 hover:border-primary/30 transition-all duration-200 hover:shadow-sm ${
isNew
? "border-primary/40 ring-1 ring-primary/20"
: "border-border/60"
}`}
>
<div className="flex items-start gap-3">
<div className="w-9 h-9 rounded-lg bg-primary/10 flex items-center justify-center shrink-0 group-hover:bg-primary/15 transition-colors">
<div
className={`w-9 h-9 rounded-lg flex items-center justify-center shrink-0 transition-colors ${
isNew
? "bg-primary/20 group-hover:bg-primary/25"
: "bg-primary/10 group-hover:bg-primary/15"
}`}
>
{icon}
</div>
<div className="min-w-0">
<h3 className="text-sm font-medium text-foreground group-hover:text-primary transition-colors">
{title}
</h3>
<div className="flex items-center gap-2">
<h3 className="text-sm font-medium text-foreground group-hover:text-primary transition-colors">
{title}
</h3>
{isNew && (
<span className="px-1.5 py-0.5 text-[10px] font-semibold bg-primary text-primary-foreground rounded">
NEW
</span>
)}
</div>
<p className="text-xs text-muted-foreground mt-0.5 line-clamp-2">
{description}
</p>
@@ -67,6 +91,21 @@ export default function ExamplePanel({
}
}
const handlePdfExample = async () => {
setInput("Summarize this paper as a diagram")
try {
const response = await fetch("/chain-of-thought.txt")
const blob = await response.blob()
const file = new File([blob], "chain-of-thought.txt", {
type: "text/plain",
})
setFiles([file])
} catch (error) {
console.error("Error loading text file:", error)
}
}
return (
<div className="py-6 px-2 animate-fade-in">
{/* Welcome section */}
@@ -87,6 +126,14 @@ export default function ExamplePanel({
</p>
<div className="grid gap-2">
<ExampleCard
icon={<FileText className="w-4 h-4 text-primary" />}
title="Paper to Diagram"
description="Upload .pdf, .txt, .md, .json, .csv, .py, .js, .ts and more"
onClick={handlePdfExample}
isNew
/>
<ExampleCard
icon={<Zap className="w-4 h-4 text-primary" />}
title="Animated Diagram"

View File

@@ -4,9 +4,7 @@ import {
Download,
History,
Image as ImageIcon,
LayoutGrid,
Loader2,
PenTool,
Send,
Trash2,
} from "lucide-react"
@@ -19,21 +17,18 @@ import { HistoryDialog } from "@/components/history-dialog"
import { ResetWarningModal } from "@/components/reset-warning-modal"
import { SaveDialog } from "@/components/save-dialog"
import { Button } from "@/components/ui/button"
import {
Dialog,
DialogContent,
DialogDescription,
DialogFooter,
DialogHeader,
DialogTitle,
} from "@/components/ui/dialog"
import { Textarea } from "@/components/ui/textarea"
import { useDiagram } from "@/contexts/diagram-context"
import { isPdfFile, isTextFile } from "@/lib/pdf-utils"
import { FilePreviewList } from "./file-preview-list"
const MAX_FILE_SIZE = 2 * 1024 * 1024 // 2MB
const MAX_IMAGE_SIZE = 2 * 1024 * 1024 // 2MB
const MAX_FILES = 5
function isValidFileType(file: File): boolean {
return file.type.startsWith("image/") || isPdfFile(file) || isTextFile(file)
}
function formatFileSize(bytes: number): string {
const mb = bytes / 1024 / 1024
if (mb < 0.01) return `${(bytes / 1024).toFixed(0)}KB`
@@ -73,9 +68,16 @@ function validateFiles(
errors.push(`Only ${availableSlots} more file(s) allowed`)
break
}
if (file.size > MAX_FILE_SIZE) {
if (!isValidFileType(file)) {
errors.push(`"${file.name}" is not a supported file type`)
continue
}
// Only check size for images (PDFs/text files are extracted client-side, so file size doesn't matter)
const isExtractedFile = isPdfFile(file) || isTextFile(file)
if (!isExtractedFile && file.size > MAX_IMAGE_SIZE) {
const maxSizeMB = MAX_IMAGE_SIZE / 1024 / 1024
errors.push(
`"${file.name}" is ${formatFileSize(file.size)} (exceeds 2MB)`,
`"${file.name}" is ${formatFileSize(file.size)} (exceeds ${maxSizeMB}MB)`,
)
} else {
validFiles.push(file)
@@ -119,12 +121,14 @@ interface ChatInputProps {
onClearChat: () => void
files?: File[]
onFileChange?: (files: File[]) => void
pdfData?: Map<
File,
{ text: string; charCount: number; isExtracting: boolean }
>
showHistory?: boolean
onToggleHistory?: (show: boolean) => void
sessionId?: string
error?: Error | null
drawioUi?: "min" | "sketch"
onToggleDrawioUi?: () => void
}
export function ChatInput({
@@ -135,12 +139,11 @@ export function ChatInput({
onClearChat,
files = [],
onFileChange = () => {},
pdfData = new Map(),
showHistory = false,
onToggleHistory = () => {},
sessionId,
error = null,
drawioUi = "min",
onToggleDrawioUi = () => {},
}: ChatInputProps) {
const { diagramHistory, saveDiagramToFile } = useDiagram()
const textareaRef = useRef<HTMLTextAreaElement>(null)
@@ -148,7 +151,6 @@ export function ChatInput({
const [isDragging, setIsDragging] = useState(false)
const [showClearDialog, setShowClearDialog] = useState(false)
const [showSaveDialog, setShowSaveDialog] = useState(false)
const [showThemeWarning, setShowThemeWarning] = useState(false)
// Allow retry when there's an error (even if status is still "streaming" or "submitted")
const isDisabled =
@@ -260,11 +262,14 @@ export function ChatInput({
if (isDisabled) return
const droppedFiles = e.dataTransfer.files
const imageFiles = Array.from(droppedFiles).filter((file) =>
file.type.startsWith("image/"),
const supportedFiles = Array.from(droppedFiles).filter((file) =>
isValidFileType(file),
)
const { validFiles, errors } = validateFiles(imageFiles, files.length)
const { validFiles, errors } = validateFiles(
supportedFiles,
files.length,
)
showValidationErrors(errors)
if (validFiles.length > 0) {
onFileChange([...files, ...validFiles])
@@ -294,6 +299,7 @@ export function ChatInput({
<FilePreviewList
files={files}
onRemoveFile={handleRemoveFile}
pdfData={pdfData}
/>
</div>
)}
@@ -306,7 +312,7 @@ export function ChatInput({
onChange={handleChange}
onKeyDown={handleKeyDown}
onPaste={handlePaste}
placeholder="Describe your diagram or paste an image..."
placeholder="Describe your diagram or upload a file..."
disabled={isDisabled}
aria-label="Chat input"
className="min-h-[60px] max-h-[200px] resize-none border-0 bg-transparent px-4 py-3 text-sm focus-visible:ring-0 focus-visible:ring-offset-0 placeholder:text-muted-foreground/60"
@@ -337,60 +343,6 @@ export function ChatInput({
showHistory={showHistory}
onToggleHistory={onToggleHistory}
/>
<ButtonWithTooltip
type="button"
variant="ghost"
size="sm"
onClick={() => setShowThemeWarning(true)}
tooltipContent={
drawioUi === "min"
? "Switch to Sketch theme"
: "Switch to Minimal theme"
}
className="h-8 w-8 p-0 text-muted-foreground hover:text-foreground"
>
{drawioUi === "min" ? (
<PenTool className="h-4 w-4" />
) : (
<LayoutGrid className="h-4 w-4" />
)}
</ButtonWithTooltip>
<Dialog
open={showThemeWarning}
onOpenChange={setShowThemeWarning}
>
<DialogContent>
<DialogHeader>
<DialogTitle>Switch Theme?</DialogTitle>
<DialogDescription>
Switching themes will reload the diagram
editor and clear any unsaved changes.
</DialogDescription>
</DialogHeader>
<DialogFooter>
<Button
variant="outline"
onClick={() =>
setShowThemeWarning(false)
}
>
Cancel
</Button>
<Button
variant="destructive"
onClick={() => {
onClearChat()
onToggleDrawioUi()
setShowThemeWarning(false)
}}
>
Switch Theme
</Button>
</DialogFooter>
</DialogContent>
</Dialog>
</div>
{/* Right actions */}
@@ -436,7 +388,7 @@ export function ChatInput({
size="sm"
onClick={triggerFileInput}
disabled={isDisabled}
tooltipContent="Upload image"
tooltipContent="Upload file (image, PDF, text)"
className="h-8 w-8 p-0 text-muted-foreground hover:text-foreground"
>
<ImageIcon className="h-4 w-4" />
@@ -447,7 +399,7 @@ export function ChatInput({
ref={fileInputRef}
className="hidden"
onChange={handleFileChange}
accept="image/*"
accept="image/*,.pdf,application/pdf,text/*,.md,.markdown,.json,.csv,.xml,.yaml,.yml,.toml"
multiple
disabled={isDisabled}
/>

View File

@@ -8,6 +8,8 @@ import {
ChevronUp,
Copy,
Cpu,
FileCode,
FileText,
Minus,
Pencil,
Plus,
@@ -19,6 +21,11 @@ import {
import Image from "next/image"
import { useCallback, useEffect, useRef, useState } from "react"
import ReactMarkdown from "react-markdown"
import {
Reasoning,
ReasoningContent,
ReasoningTrigger,
} from "@/components/ai-elements/reasoning"
import { ScrollArea } from "@/components/ui/scroll-area"
import {
convertToLegalXml,
@@ -47,7 +54,7 @@ function EditDiffDisplay({ edits }: { edits: EditPair[] }) {
<div className="space-y-3">
{edits.map((edit, index) => (
<div
key={`${edit.search.slice(0, 50)}-${edit.replace.slice(0, 50)}-${index}`}
key={`${(edit.search || "").slice(0, 50)}-${(edit.replace || "").slice(0, 50)}-${index}`}
className="rounded-lg border border-border/50 overflow-hidden bg-background/50"
>
<div className="px-3 py-1.5 bg-muted/40 border-b border-border/30 flex items-center gap-2">
@@ -89,6 +96,59 @@ function EditDiffDisplay({ edits }: { edits: EditPair[] }) {
import { useDiagram } from "@/contexts/diagram-context"
// Helper to split text content into regular text and file sections (PDF or text files)
interface TextSection {
type: "text" | "file"
content: string
filename?: string
charCount?: number
fileType?: "pdf" | "text"
}
function splitTextIntoFileSections(text: string): TextSection[] {
const sections: TextSection[] = []
// Match [PDF: filename] or [File: filename] patterns
const filePattern =
/\[(PDF|File):\s*([^\]]+)\]\n([\s\S]*?)(?=\n\n\[(PDF|File):|$)/g
let lastIndex = 0
let match
while ((match = filePattern.exec(text)) !== null) {
// Add text before this file section
const beforeText = text.slice(lastIndex, match.index).trim()
if (beforeText) {
sections.push({ type: "text", content: beforeText })
}
// Add file section
const fileType = match[1].toLowerCase() === "pdf" ? "pdf" : "text"
const filename = match[2].trim()
const fileContent = match[3].trim()
sections.push({
type: "file",
content: fileContent,
filename,
charCount: fileContent.length,
fileType,
})
lastIndex = match.index + match[0].length
}
// Add remaining text after last file section
const remainingText = text.slice(lastIndex).trim()
if (remainingText) {
sections.push({ type: "text", content: remainingText })
}
// If no file sections found, return original text
if (sections.length === 0) {
sections.push({ type: "text", content: text })
}
return sections
}
const getMessageTextContent = (message: UIMessage): string => {
if (!message.parts) return ""
return message.parts
@@ -97,6 +157,14 @@ const getMessageTextContent = (message: UIMessage): string => {
.join("\n")
}
// Get only the user's original text, excluding appended file content
const getUserOriginalText = (message: UIMessage): string => {
const fullText = getMessageTextContent(message)
// Strip out [PDF: ...] and [File: ...] sections that were appended
const filePattern = /\n\n\[(PDF|File):\s*[^\]]+\]\n[\s\S]*$/
return fullText.replace(filePattern, "").trim()
}
interface ChatMessageDisplayProps {
messages: UIMessage[]
setInput: (input: string) => void
@@ -104,6 +172,7 @@ interface ChatMessageDisplayProps {
sessionId?: string
onRegenerate?: (messageIndex: number) => void
onEditMessage?: (messageIndex: number, newText: string) => void
status?: "streaming" | "submitted" | "idle" | "error" | "ready"
}
export function ChatMessageDisplay({
@@ -113,6 +182,7 @@ export function ChatMessageDisplay({
sessionId,
onRegenerate,
onEditMessage,
status = "idle",
}: ChatMessageDisplayProps) {
const { chartXML, loadDiagram: onDisplayChart } = useDiagram()
const messagesEndRef = useRef<HTMLDivElement>(null)
@@ -131,10 +201,15 @@ export function ChatMessageDisplay({
)
const editTextareaRef = useRef<HTMLTextAreaElement>(null)
const [editText, setEditText] = useState<string>("")
// Track which PDF sections are expanded (key: messageId-sectionIndex)
const [expandedPdfSections, setExpandedPdfSections] = useState<
Record<string, boolean>
>({})
const copyMessageToClipboard = async (messageId: string, text: string) => {
try {
await navigator.clipboard.writeText(text)
setCopiedMessageId(messageId)
setTimeout(() => setCopiedMessageId(null), 2000)
} catch (err) {
@@ -177,12 +252,18 @@ export function ChatMessageDisplay({
const currentXml = xml || ""
const convertedXml = convertToLegalXml(currentXml)
if (convertedXml !== previousXML.current) {
const replacedXML = replaceNodes(chartXML, convertedXml)
// If chartXML is empty, create a default mxfile structure to use with replaceNodes
// This ensures the XML is properly wrapped in mxfile/diagram/mxGraphModel format
const baseXML =
chartXML ||
`<mxfile><diagram name="Page-1" id="page-1"><mxGraphModel><root><mxCell id="0"/><mxCell id="1" parent="0"/></root></mxGraphModel></diagram></mxfile>`
const replacedXML = replaceNodes(baseXML, convertedXml)
const validationError = validateMxCellStructure(replacedXML)
if (!validationError) {
previousXML.current = convertedXml
onDisplayChart(replacedXML)
// Skip validation in loadDiagram since we already validated above
onDisplayChart(replacedXML, true)
} else {
console.log(
"[ChatMessageDisplay] XML validation failed:",
@@ -384,7 +465,9 @@ export function ChatMessageDisplay({
message.id,
)
setEditText(
userMessageText,
getUserOriginalText(
message,
),
)
}}
className="p-1.5 rounded-lg text-muted-foreground/60 hover:text-muted-foreground hover:bg-muted transition-colors"
@@ -425,6 +508,52 @@ export function ChatMessageDisplay({
</div>
)}
<div className="max-w-[85%] min-w-0">
{/* Reasoning blocks - displayed first for assistant messages */}
{message.role === "assistant" &&
message.parts?.map(
(part, partIndex) => {
if (part.type === "reasoning") {
const reasoningPart =
part as {
type: "reasoning"
text: string
}
const isLastPart =
partIndex ===
(message.parts
?.length ?? 0) -
1
const isLastMessage =
message.id ===
messages[
messages.length - 1
]?.id
const isStreamingReasoning =
status ===
"streaming" &&
isLastPart &&
isLastMessage
return (
<Reasoning
key={`${message.id}-reasoning-${partIndex}`}
className="w-full"
isStreaming={
isStreamingReasoning
}
>
<ReasoningTrigger />
<ReasoningContent>
{
reasoningPart.text
}
</ReasoningContent>
</Reasoning>
)
}
return null
},
)}
{/* Edit mode for user messages */}
{isEditing && message.role === "user" ? (
<div className="flex flex-col gap-2">
@@ -505,139 +634,313 @@ export function ChatMessageDisplay({
</div>
</div>
) : (
/* Text content in bubble */
message.parts?.some(
(part) =>
part.type === "text" ||
part.type === "file",
) && (
<div
className={`px-4 py-3 text-sm leading-relaxed ${
message.role === "user"
? "bg-primary text-primary-foreground rounded-2xl rounded-br-md shadow-sm"
: message.role ===
"system"
? "bg-destructive/10 text-destructive border border-destructive/20 rounded-2xl rounded-bl-md"
: "bg-muted/60 text-foreground rounded-2xl rounded-bl-md"
} ${message.role === "user" && isLastUserMessage && onEditMessage ? "cursor-pointer hover:opacity-90 transition-opacity" : ""}`}
role={
message.role === "user" &&
isLastUserMessage &&
onEditMessage
? "button"
: undefined
}
tabIndex={
message.role === "user" &&
isLastUserMessage &&
onEditMessage
? 0
: undefined
}
onClick={() => {
/* Render parts in order, grouping consecutive text/file parts into bubbles */
(() => {
const parts = message.parts || []
const groups: {
type: "content" | "tool"
parts: typeof parts
startIndex: number
}[] = []
parts.forEach((part, index) => {
const isToolPart =
part.type?.startsWith(
"tool-",
)
const isContentPart =
part.type === "text" ||
part.type === "file"
if (isToolPart) {
groups.push({
type: "tool",
parts: [part],
startIndex: index,
})
} else if (isContentPart) {
const lastGroup =
groups[
groups.length - 1
]
if (
message.role ===
"user" &&
isLastUserMessage &&
onEditMessage
lastGroup?.type ===
"content"
) {
setEditingMessageId(
message.id,
lastGroup.parts.push(
part,
)
setEditText(
userMessageText,
} else {
groups.push({
type: "content",
parts: [part],
startIndex: index,
})
}
}
})
return groups.map(
(group, groupIndex) => {
if (group.type === "tool") {
return renderToolPart(
group
.parts[0] as ToolPartLike,
)
}
}}
onKeyDown={(e) => {
if (
(e.key === "Enter" ||
e.key === " ") &&
message.role ===
"user" &&
isLastUserMessage &&
onEditMessage
) {
e.preventDefault()
setEditingMessageId(
message.id,
)
setEditText(
userMessageText,
)
}
}}
title={
message.role === "user" &&
isLastUserMessage &&
onEditMessage
? "Click to edit"
: undefined
}
>
{message.parts?.map(
(part, index) => {
switch (part.type) {
case "text":
return (
<div
key={`${message.id}-text-${index}`}
className={`prose prose-sm max-w-none break-words [&>*:first-child]:mt-0 [&>*:last-child]:mb-0 ${
message.role ===
"user"
? "[&_*]:!text-primary-foreground prose-code:bg-white/20"
: "dark:prose-invert"
}`}
>
<ReactMarkdown>
{
part.text
}
</ReactMarkdown>
</div>
)
case "file":
return (
<div
key={`${message.id}-file-${part.url}`}
className="mt-2"
>
<Image
src={
part.url
}
width={
200
}
height={
200
}
alt={`Uploaded diagram or image for AI analysis`}
className="rounded-lg border border-white/20"
style={{
objectFit:
"contain",
}}
/>
</div>
)
default:
return null
}
},
)}
</div>
)
)}
{/* Tool calls outside bubble */}
{message.parts?.map((part) => {
if (part.type?.startsWith("tool-")) {
return renderToolPart(
part as ToolPartLike,
// Content bubble
return (
<div
key={`${message.id}-content-${group.startIndex}`}
className={`px-4 py-3 text-sm leading-relaxed ${
message.role ===
"user"
? "bg-primary text-primary-foreground rounded-2xl rounded-br-md shadow-sm"
: message.role ===
"system"
? "bg-destructive/10 text-destructive border border-destructive/20 rounded-2xl rounded-bl-md"
: "bg-muted/60 text-foreground rounded-2xl rounded-bl-md"
} ${message.role === "user" && isLastUserMessage && onEditMessage ? "cursor-pointer hover:opacity-90 transition-opacity" : ""} ${groupIndex > 0 ? "mt-3" : ""}`}
role={
message.role ===
"user" &&
isLastUserMessage &&
onEditMessage
? "button"
: undefined
}
tabIndex={
message.role ===
"user" &&
isLastUserMessage &&
onEditMessage
? 0
: undefined
}
onClick={() => {
if (
message.role ===
"user" &&
isLastUserMessage &&
onEditMessage
) {
setEditingMessageId(
message.id,
)
setEditText(
getUserOriginalText(
message,
),
)
}
}}
onKeyDown={(e) => {
if (
(e.key ===
"Enter" ||
e.key ===
" ") &&
message.role ===
"user" &&
isLastUserMessage &&
onEditMessage
) {
e.preventDefault()
setEditingMessageId(
message.id,
)
setEditText(
getUserOriginalText(
message,
),
)
}
}}
title={
message.role ===
"user" &&
isLastUserMessage &&
onEditMessage
? "Click to edit"
: undefined
}
>
{group.parts.map(
(
part,
partIndex,
) => {
if (
part.type ===
"text"
) {
const textContent =
(
part as {
text: string
}
)
.text
const sections =
splitTextIntoFileSections(
textContent,
)
return (
<div
key={`${message.id}-text-${group.startIndex}-${partIndex}`}
className="space-y-2"
>
{sections.map(
(
section,
sectionIndex,
) => {
if (
section.type ===
"file"
) {
const pdfKey = `${message.id}-file-${partIndex}-${sectionIndex}`
const isExpanded =
expandedPdfSections[
pdfKey
] ??
false
const charDisplay =
section.charCount &&
section.charCount >=
1000
? `${(section.charCount / 1000).toFixed(1)}k`
: section.charCount
return (
<div
key={
pdfKey
}
className="rounded-lg border border-border/60 bg-muted/30 overflow-hidden"
>
<button
type="button"
onClick={(
e,
) => {
e.stopPropagation()
setExpandedPdfSections(
(
prev,
) => ({
...prev,
[pdfKey]:
!isExpanded,
}),
)
}}
className="w-full flex items-center justify-between px-3 py-2 hover:bg-muted/50 transition-colors"
>
<div className="flex items-center gap-2">
{section.fileType ===
"pdf" ? (
<FileText className="h-4 w-4 text-red-500" />
) : (
<FileCode className="h-4 w-4 text-blue-500" />
)}
<span className="text-xs font-medium">
{
section.filename
}
</span>
<span className="text-[10px] text-muted-foreground">
(
{
charDisplay
}{" "}
chars)
</span>
</div>
{isExpanded ? (
<ChevronUp className="h-4 w-4 text-muted-foreground" />
) : (
<ChevronDown className="h-4 w-4 text-muted-foreground" />
)}
</button>
{isExpanded && (
<div className="px-3 py-2 border-t border-border/40 max-h-48 overflow-y-auto bg-muted/30">
<pre className="text-xs whitespace-pre-wrap text-foreground/80">
{
section.content
}
</pre>
</div>
)}
</div>
)
}
// Regular text section
return (
<div
key={`${message.id}-textsection-${partIndex}-${sectionIndex}`}
className={`prose prose-sm max-w-none break-words [&>*:first-child]:mt-0 [&>*:last-child]:mb-0 ${
message.role ===
"user"
? "[&_*]:!text-primary-foreground prose-code:bg-white/20"
: "dark:prose-invert"
}`}
>
<ReactMarkdown>
{
section.content
}
</ReactMarkdown>
</div>
)
},
)}
</div>
)
}
if (
part.type ===
"file"
) {
return (
<div
key={`${message.id}-file-${group.startIndex}-${partIndex}`}
className="mt-2"
>
<Image
src={
(
part as {
url: string
}
)
.url
}
width={
200
}
height={
200
}
alt={`Uploaded diagram or image for AI analysis`}
className="rounded-lg border border-white/20"
style={{
objectFit:
"contain",
}}
/>
</div>
)
}
return null
},
)}
</div>
)
},
)
}
return null
})}
})()
)}
{/* Action buttons for assistant messages */}
{message.role === "assistant" && (
<div className="flex items-center gap-1 mt-2">
@@ -672,9 +975,14 @@ export function ChatMessageDisplay({
<Copy className="h-3.5 w-3.5" />
)}
</button>
{/* Regenerate button - only on last assistant message */}
{/* Regenerate button - only on last assistant message, not for cached examples */}
{onRegenerate &&
isLastAssistantMessage && (
isLastAssistantMessage &&
!message.parts?.some((p: any) =>
p.toolCallId?.startsWith(
"cached-",
),
) && (
<button
type="button"
onClick={() =>

File diff suppressed because it is too large Load Diff

View File

@@ -1,15 +1,31 @@
"use client"
import { X } from "lucide-react"
import { FileCode, FileText, Loader2, X } from "lucide-react"
import Image from "next/image"
import { useEffect, useRef, useState } from "react"
import { isPdfFile, isTextFile } from "@/lib/pdf-utils"
function formatCharCount(count: number): string {
if (count >= 1000) {
return `${(count / 1000).toFixed(1)}k`
}
return String(count)
}
interface FilePreviewListProps {
files: File[]
onRemoveFile: (fileToRemove: File) => void
pdfData?: Map<
File,
{ text: string; charCount: number; isExtracting: boolean }
>
}
export function FilePreviewList({ files, onRemoveFile }: FilePreviewListProps) {
export function FilePreviewList({
files,
onRemoveFile,
pdfData = new Map(),
}: FilePreviewListProps) {
const [selectedImage, setSelectedImage] = useState<string | null>(null)
const [imageUrls, setImageUrls] = useState<Map<File, string>>(new Map())
const imageUrlsRef = useRef<Map<File, string>>(new Map())
@@ -48,6 +64,8 @@ export function FilePreviewList({ files, onRemoveFile }: FilePreviewListProps) {
imageUrlsRef.current.forEach((url) => {
URL.revokeObjectURL(url)
})
// Clear the ref so StrictMode remount creates fresh URLs
imageUrlsRef.current = new Map()
}
}, [])
@@ -68,12 +86,19 @@ export function FilePreviewList({ files, onRemoveFile }: FilePreviewListProps) {
<div className="flex flex-wrap gap-2 mt-2 p-2 bg-muted/50 rounded-md">
{files.map((file, index) => {
const imageUrl = imageUrls.get(file) || null
const pdfInfo = pdfData.get(file)
return (
<div key={file.name + index} className="relative group">
<div
className="w-20 h-20 border rounded-md overflow-hidden bg-muted cursor-pointer"
className={`w-20 h-20 border rounded-md overflow-hidden bg-muted ${
file.type.startsWith("image/") && imageUrl
? "cursor-pointer"
: ""
}`}
onClick={() =>
imageUrl && setSelectedImage(imageUrl)
file.type.startsWith("image/") &&
imageUrl &&
setSelectedImage(imageUrl)
}
>
{file.type.startsWith("image/") && imageUrl ? (
@@ -83,7 +108,35 @@ export function FilePreviewList({ files, onRemoveFile }: FilePreviewListProps) {
width={80}
height={80}
className="object-cover w-full h-full"
unoptimized
/>
) : isPdfFile(file) || isTextFile(file) ? (
<div className="flex flex-col items-center justify-center h-full p-1">
{pdfInfo?.isExtracting ? (
<Loader2 className="h-6 w-6 text-blue-500 mb-1 animate-spin" />
) : isPdfFile(file) ? (
<FileText className="h-6 w-6 text-red-500 mb-1" />
) : (
<FileCode className="h-6 w-6 text-blue-500 mb-1" />
)}
<span className="text-xs text-center truncate w-full px-1">
{file.name.length > 10
? `${file.name.slice(0, 7)}...`
: file.name}
</span>
{pdfInfo?.isExtracting ? (
<span className="text-[10px] text-muted-foreground">
Reading...
</span>
) : pdfInfo?.charCount ? (
<span className="text-[10px] text-green-600 font-medium">
{formatCharCount(
pdfInfo.charCount,
)}{" "}
chars
</span>
) : null}
</div>
) : (
<div className="flex items-center justify-center h-full text-xs text-center p-1">
{file.name}
@@ -124,6 +177,7 @@ export function FilePreviewList({ files, onRemoveFile }: FilePreviewListProps) {
height={900}
className="object-contain max-w-full max-h-[90vh] w-auto h-auto"
onClick={(e) => e.stopPropagation()}
unoptimized
/>
</div>
</div>

View File

@@ -32,7 +32,8 @@ export function HistoryDialog({
const handleConfirmRestore = () => {
if (selectedIndex !== null) {
onDisplayChart(diagramHistory[selectedIndex].xml)
// Skip validation for trusted history snapshots
onDisplayChart(diagramHistory[selectedIndex].xml, true)
handleClose()
}
}

View File

@@ -0,0 +1,115 @@
"use client"
import { Coffee, X } from "lucide-react"
import Link from "next/link"
import type React from "react"
import { FaGithub } from "react-icons/fa"
interface QuotaLimitToastProps {
type?: "request" | "token"
used: number
limit: number
onDismiss: () => void
}
export function QuotaLimitToast({
type = "request",
used,
limit,
onDismiss,
}: QuotaLimitToastProps) {
const isTokenLimit = type === "token"
const formatNumber = (n: number) =>
n >= 1000 ? `${(n / 1000).toFixed(1)}k` : n.toString()
const handleKeyDown = (e: React.KeyboardEvent) => {
if (e.key === "Escape") {
e.preventDefault()
onDismiss()
}
}
return (
<div
role="alert"
aria-live="polite"
tabIndex={0}
onKeyDown={handleKeyDown}
className="relative w-[400px] overflow-hidden rounded-xl border border-border/50 bg-card p-5 shadow-soft animate-message-in"
>
{/* Close button */}
<button
onClick={onDismiss}
className="absolute right-3 top-3 p-1.5 rounded-full text-muted-foreground/60 hover:text-foreground hover:bg-muted transition-colors"
aria-label="Dismiss"
>
<X className="w-4 h-4" />
</button>
{/* Title row with icon */}
<div className="flex items-center gap-2.5 mb-3 pr-6">
<div className="flex-shrink-0 w-8 h-8 rounded-lg bg-accent flex items-center justify-center">
<Coffee
className="w-4 h-4 text-accent-foreground"
strokeWidth={2}
/>
</div>
<h3 className="font-semibold text-foreground text-sm">
{isTokenLimit
? "Daily Token Limit Reached"
: "Daily Quota Reached"}
</h3>
<span className="px-2 py-0.5 text-xs font-medium rounded-md bg-muted text-muted-foreground">
{isTokenLimit
? `${formatNumber(used)}/${formatNumber(limit)} tokens`
: `${used}/${limit}`}
</span>
</div>
{/* Message */}
<div className="text-sm text-muted-foreground leading-relaxed mb-4 space-y-2">
<p>
Oops you've reached the daily{" "}
{isTokenLimit ? "token" : "API"} limit for this demo! As an
indie developer covering all the API costs myself, I have to
set these limits to keep things sustainable.{" "}
<Link
href="/about"
target="_blank"
rel="noopener noreferrer"
className="inline-flex items-center gap-1 text-amber-600 font-medium hover:text-amber-700 hover:underline"
>
Learn more
</Link>
</p>
<p>
<strong>Tip:</strong> You can use your own API key (click
the Settings icon) or self-host the project to bypass these
limits.
</p>
<p>Your limit resets tomorrow. Thanks for understanding!</p>
</div>
{/* Action buttons */}
<div className="flex items-center gap-2">
<a
href="https://github.com/DayuanJiang/next-ai-draw-io"
target="_blank"
rel="noopener noreferrer"
className="inline-flex items-center gap-1.5 px-3 py-1.5 text-xs font-medium rounded-lg bg-primary text-primary-foreground hover:bg-primary/90 transition-colors"
>
<FaGithub className="w-3.5 h-3.5" />
Self-host
</a>
<a
href="https://github.com/sponsors/DayuanJiang"
target="_blank"
rel="noopener noreferrer"
className="inline-flex items-center gap-1.5 px-3 py-1.5 text-xs font-medium rounded-lg border border-border text-foreground hover:bg-muted transition-colors"
>
<Coffee className="w-3.5 h-3.5" />
Sponsor
</a>
</div>
</div>
)
}

View File

@@ -1,38 +1,145 @@
"use client"
import { Moon, Sun } from "lucide-react"
import { useEffect, useState } from "react"
import { Button } from "@/components/ui/button"
import {
Dialog,
DialogContent,
DialogDescription,
DialogFooter,
DialogHeader,
DialogTitle,
} from "@/components/ui/dialog"
import { Input } from "@/components/ui/input"
import { Label } from "@/components/ui/label"
import {
Select,
SelectContent,
SelectItem,
SelectTrigger,
SelectValue,
} from "@/components/ui/select"
import { Switch } from "@/components/ui/switch"
interface SettingsDialogProps {
open: boolean
onOpenChange: (open: boolean) => void
onCloseProtectionChange?: (enabled: boolean) => void
drawioUi: "min" | "sketch"
onToggleDrawioUi: () => void
darkMode: boolean
onToggleDarkMode: () => void
}
export const STORAGE_ACCESS_CODE_KEY = "next-ai-draw-io-access-code"
export const STORAGE_CLOSE_PROTECTION_KEY = "next-ai-draw-io-close-protection"
const STORAGE_ACCESS_CODE_REQUIRED_KEY = "next-ai-draw-io-access-code-required"
export const STORAGE_AI_PROVIDER_KEY = "next-ai-draw-io-ai-provider"
export const STORAGE_AI_BASE_URL_KEY = "next-ai-draw-io-ai-base-url"
export const STORAGE_AI_API_KEY_KEY = "next-ai-draw-io-ai-api-key"
export const STORAGE_AI_MODEL_KEY = "next-ai-draw-io-ai-model"
export function SettingsDialog({ open, onOpenChange }: SettingsDialogProps) {
function getStoredAccessCodeRequired(): boolean | null {
if (typeof window === "undefined") return null
const stored = localStorage.getItem(STORAGE_ACCESS_CODE_REQUIRED_KEY)
if (stored === null) return null
return stored === "true"
}
export function SettingsDialog({
open,
onOpenChange,
onCloseProtectionChange,
drawioUi,
onToggleDrawioUi,
darkMode,
onToggleDarkMode,
}: SettingsDialogProps) {
const [accessCode, setAccessCode] = useState("")
const [closeProtection, setCloseProtection] = useState(true)
const [isVerifying, setIsVerifying] = useState(false)
const [error, setError] = useState("")
const [accessCodeRequired, setAccessCodeRequired] = useState(
() => getStoredAccessCodeRequired() ?? false,
)
const [provider, setProvider] = useState("")
const [baseUrl, setBaseUrl] = useState("")
const [apiKey, setApiKey] = useState("")
const [modelId, setModelId] = useState("")
useEffect(() => {
// Only fetch if not cached in localStorage
if (getStoredAccessCodeRequired() !== null) return
fetch("/api/config")
.then((res) => {
if (!res.ok) throw new Error(`HTTP ${res.status}`)
return res.json()
})
.then((data) => {
const required = data?.accessCodeRequired === true
localStorage.setItem(
STORAGE_ACCESS_CODE_REQUIRED_KEY,
String(required),
)
setAccessCodeRequired(required)
})
.catch(() => {
// Don't cache on error - allow retry on next mount
setAccessCodeRequired(false)
})
}, [])
useEffect(() => {
if (open) {
const storedCode =
localStorage.getItem(STORAGE_ACCESS_CODE_KEY) || ""
setAccessCode(storedCode)
const storedCloseProtection = localStorage.getItem(
STORAGE_CLOSE_PROTECTION_KEY,
)
// Default to true if not set
setCloseProtection(storedCloseProtection !== "false")
// Load AI provider settings
setProvider(localStorage.getItem(STORAGE_AI_PROVIDER_KEY) || "")
setBaseUrl(localStorage.getItem(STORAGE_AI_BASE_URL_KEY) || "")
setApiKey(localStorage.getItem(STORAGE_AI_API_KEY_KEY) || "")
setModelId(localStorage.getItem(STORAGE_AI_MODEL_KEY) || "")
setError("")
}
}, [open])
const handleSave = () => {
localStorage.setItem(STORAGE_ACCESS_CODE_KEY, accessCode.trim())
onOpenChange(false)
const handleSave = async () => {
if (!accessCodeRequired) return
setError("")
setIsVerifying(true)
try {
const response = await fetch("/api/verify-access-code", {
method: "POST",
headers: {
"x-access-code": accessCode.trim(),
},
})
const data = await response.json()
if (!data.valid) {
setError(data.message || "Invalid access code")
return
}
localStorage.setItem(STORAGE_ACCESS_CODE_KEY, accessCode.trim())
onOpenChange(false)
} catch {
setError("Failed to verify access code")
} finally {
setIsVerifying(false)
}
}
const handleKeyDown = (e: React.KeyboardEvent) => {
@@ -48,36 +155,281 @@ export function SettingsDialog({ open, onOpenChange }: SettingsDialogProps) {
<DialogHeader>
<DialogTitle>Settings</DialogTitle>
<DialogDescription>
Configure your access settings.
Configure your application settings.
</DialogDescription>
</DialogHeader>
<div className="space-y-4 py-2">
{accessCodeRequired && (
<div className="space-y-2">
<Label htmlFor="access-code">Access Code</Label>
<div className="flex gap-2">
<Input
id="access-code"
type="password"
value={accessCode}
onChange={(e) =>
setAccessCode(e.target.value)
}
onKeyDown={handleKeyDown}
placeholder="Enter access code"
autoComplete="off"
/>
<Button
onClick={handleSave}
disabled={isVerifying || !accessCode.trim()}
>
{isVerifying ? "..." : "Save"}
</Button>
</div>
<p className="text-[0.8rem] text-muted-foreground">
Required to use this application.
</p>
{error && (
<p className="text-[0.8rem] text-destructive">
{error}
</p>
)}
</div>
)}
<div className="space-y-2">
<label className="text-sm font-medium leading-none peer-disabled:cursor-not-allowed peer-disabled:opacity-70">
Access Code
</label>
<Input
type="password"
value={accessCode}
onChange={(e) => setAccessCode(e.target.value)}
onKeyDown={handleKeyDown}
placeholder="Enter access code"
autoComplete="off"
/>
<Label>AI Provider Settings</Label>
<p className="text-[0.8rem] text-muted-foreground">
Required if the server has enabled access control.
Use your own API key to bypass usage limits. Your
key is stored locally in your browser and is never
stored on the server.
</p>
<div className="space-y-3 pt-2">
<div className="space-y-2">
<Label htmlFor="ai-provider">Provider</Label>
<Select
value={provider || "default"}
onValueChange={(value) => {
const actualValue =
value === "default" ? "" : value
setProvider(actualValue)
localStorage.setItem(
STORAGE_AI_PROVIDER_KEY,
actualValue,
)
}}
>
<SelectTrigger id="ai-provider">
<SelectValue placeholder="Use Server Default" />
</SelectTrigger>
<SelectContent>
<SelectItem value="default">
Use Server Default
</SelectItem>
<SelectItem value="openai">
OpenAI
</SelectItem>
<SelectItem value="anthropic">
Anthropic
</SelectItem>
<SelectItem value="google">
Google
</SelectItem>
<SelectItem value="azure">
Azure OpenAI
</SelectItem>
<SelectItem value="openrouter">
OpenRouter
</SelectItem>
<SelectItem value="deepseek">
DeepSeek
</SelectItem>
<SelectItem value="siliconflow">
SiliconFlow
</SelectItem>
</SelectContent>
</Select>
</div>
{provider && provider !== "default" && (
<>
<div className="space-y-2">
<Label htmlFor="ai-model">
Model ID
</Label>
<Input
id="ai-model"
value={modelId}
onChange={(e) => {
setModelId(e.target.value)
localStorage.setItem(
STORAGE_AI_MODEL_KEY,
e.target.value,
)
}}
placeholder={
provider === "openai"
? "e.g., gpt-4o"
: provider === "anthropic"
? "e.g., claude-sonnet-4-5"
: provider === "google"
? "e.g., gemini-2.0-flash-exp"
: provider ===
"deepseek"
? "e.g., deepseek-chat"
: "Model ID"
}
/>
</div>
<div className="space-y-2">
<Label htmlFor="ai-api-key">
API Key
</Label>
<Input
id="ai-api-key"
type="password"
value={apiKey}
onChange={(e) => {
setApiKey(e.target.value)
localStorage.setItem(
STORAGE_AI_API_KEY_KEY,
e.target.value,
)
}}
placeholder="Your API key"
autoComplete="off"
/>
<p className="text-[0.8rem] text-muted-foreground">
Overrides{" "}
{provider === "openai"
? "OPENAI_API_KEY"
: provider === "anthropic"
? "ANTHROPIC_API_KEY"
: provider === "google"
? "GOOGLE_GENERATIVE_AI_API_KEY"
: provider === "azure"
? "AZURE_API_KEY"
: provider ===
"openrouter"
? "OPENROUTER_API_KEY"
: provider ===
"deepseek"
? "DEEPSEEK_API_KEY"
: provider ===
"siliconflow"
? "SILICONFLOW_API_KEY"
: "server API key"}
</p>
</div>
<div className="space-y-2">
<Label htmlFor="ai-base-url">
Base URL (optional)
</Label>
<Input
id="ai-base-url"
value={baseUrl}
onChange={(e) => {
setBaseUrl(e.target.value)
localStorage.setItem(
STORAGE_AI_BASE_URL_KEY,
e.target.value,
)
}}
placeholder={
provider === "anthropic"
? "https://api.anthropic.com/v1"
: provider === "siliconflow"
? "https://api.siliconflow.com/v1"
: "Custom endpoint URL"
}
/>
</div>
<Button
variant="outline"
size="sm"
className="w-full"
onClick={() => {
localStorage.removeItem(
STORAGE_AI_PROVIDER_KEY,
)
localStorage.removeItem(
STORAGE_AI_BASE_URL_KEY,
)
localStorage.removeItem(
STORAGE_AI_API_KEY_KEY,
)
localStorage.removeItem(
STORAGE_AI_MODEL_KEY,
)
setProvider("")
setBaseUrl("")
setApiKey("")
setModelId("")
}}
>
Clear Settings
</Button>
</>
)}
</div>
</div>
<div className="flex items-center justify-between">
<div className="space-y-0.5">
<Label htmlFor="theme-toggle">Theme</Label>
<p className="text-[0.8rem] text-muted-foreground">
Dark/Light mode for interface and DrawIO canvas.
</p>
</div>
<Button
id="theme-toggle"
variant="outline"
size="icon"
onClick={onToggleDarkMode}
>
{darkMode ? (
<Sun className="h-4 w-4" />
) : (
<Moon className="h-4 w-4" />
)}
</Button>
</div>
<div className="flex items-center justify-between">
<div className="space-y-0.5">
<Label htmlFor="drawio-ui">DrawIO Style</Label>
<p className="text-[0.8rem] text-muted-foreground">
Canvas style:{" "}
{drawioUi === "min" ? "Minimal" : "Sketch"}
</p>
</div>
<Button
id="drawio-ui"
variant="outline"
size="sm"
onClick={onToggleDrawioUi}
>
Switch to{" "}
{drawioUi === "min" ? "Sketch" : "Minimal"}
</Button>
</div>
<div className="flex items-center justify-between">
<div className="space-y-0.5">
<Label htmlFor="close-protection">
Close Protection
</Label>
<p className="text-[0.8rem] text-muted-foreground">
Show confirmation when leaving the page.
</p>
</div>
<Switch
id="close-protection"
checked={closeProtection}
onCheckedChange={(checked) => {
setCloseProtection(checked)
localStorage.setItem(
STORAGE_CLOSE_PROTECTION_KEY,
checked.toString(),
)
onCloseProtectionChange?.(checked)
}}
/>
</div>
</div>
<DialogFooter>
<Button
variant="outline"
onClick={() => onOpenChange(false)}
>
Cancel
</Button>
<Button onClick={handleSave}>Save</Button>
</DialogFooter>
</DialogContent>
</Dialog>
)

View File

@@ -0,0 +1,33 @@
"use client"
import * as CollapsiblePrimitive from "@radix-ui/react-collapsible"
function Collapsible({
...props
}: React.ComponentProps<typeof CollapsiblePrimitive.Root>) {
return <CollapsiblePrimitive.Root data-slot="collapsible" {...props} />
}
function CollapsibleTrigger({
...props
}: React.ComponentProps<typeof CollapsiblePrimitive.CollapsibleTrigger>) {
return (
<CollapsiblePrimitive.CollapsibleTrigger
data-slot="collapsible-trigger"
{...props}
/>
)
}
function CollapsibleContent({
...props
}: React.ComponentProps<typeof CollapsiblePrimitive.CollapsibleContent>) {
return (
<CollapsiblePrimitive.CollapsibleContent
data-slot="collapsible-content"
{...props}
/>
)
}
export { Collapsible, CollapsibleTrigger, CollapsibleContent }

24
components/ui/label.tsx Normal file
View File

@@ -0,0 +1,24 @@
"use client"
import * as React from "react"
import * as LabelPrimitive from "@radix-ui/react-label"
import { cn } from "@/lib/utils"
function Label({
className,
...props
}: React.ComponentProps<typeof LabelPrimitive.Root>) {
return (
<LabelPrimitive.Root
data-slot="label"
className={cn(
"flex items-center gap-2 text-sm leading-none font-medium select-none group-data-[disabled=true]:pointer-events-none group-data-[disabled=true]:opacity-50 peer-disabled:cursor-not-allowed peer-disabled:opacity-50",
className
)}
{...props}
/>
)
}
export { Label }

31
components/ui/switch.tsx Normal file
View File

@@ -0,0 +1,31 @@
"use client"
import * as React from "react"
import * as SwitchPrimitive from "@radix-ui/react-switch"
import { cn } from "@/lib/utils"
function Switch({
className,
...props
}: React.ComponentProps<typeof SwitchPrimitive.Root>) {
return (
<SwitchPrimitive.Root
data-slot="switch"
className={cn(
"peer data-[state=checked]:bg-primary data-[state=unchecked]:bg-input focus-visible:border-ring focus-visible:ring-ring/50 dark:data-[state=unchecked]:bg-input/80 inline-flex h-[1.15rem] w-8 shrink-0 items-center rounded-full border border-transparent shadow-xs transition-all outline-none focus-visible:ring-[3px] disabled:cursor-not-allowed disabled:opacity-50",
className
)}
{...props}
>
<SwitchPrimitive.Thumb
data-slot="switch-thumb"
className={cn(
"bg-background dark:data-[state=unchecked]:bg-foreground dark:data-[state=checked]:bg-primary-foreground pointer-events-none block size-4 rounded-full ring-0 transition-transform data-[state=checked]:translate-x-[calc(100%-2px)] data-[state=unchecked]:translate-x-0"
)}
/>
</SwitchPrimitive.Root>
)
}
export { Switch }

View File

@@ -3,14 +3,15 @@
import type React from "react"
import { createContext, useContext, useRef, useState } from "react"
import type { DrawIoEmbedRef } from "react-drawio"
import { STORAGE_DIAGRAM_XML_KEY } from "@/components/chat-panel"
import type { ExportFormat } from "@/components/save-dialog"
import { extractDiagramXML } from "../lib/utils"
import { extractDiagramXML, validateMxCellStructure } from "../lib/utils"
interface DiagramContextType {
chartXML: string
latestSvg: string
diagramHistory: { svg: string; xml: string }[]
loadDiagram: (chart: string) => void
loadDiagram: (chart: string, skipValidation?: boolean) => string | null
handleExport: () => void
handleExportWithoutHistory: () => void
resolverRef: React.Ref<((value: string) => void) | null>
@@ -22,6 +23,9 @@ interface DiagramContextType {
format: ExportFormat,
sessionId?: string,
) => void
isDrawioReady: boolean
onDrawioLoad: () => void
resetDrawioReady: () => void
}
const DiagramContext = createContext<DiagramContextType | undefined>(undefined)
@@ -32,10 +36,27 @@ export function DiagramProvider({ children }: { children: React.ReactNode }) {
const [diagramHistory, setDiagramHistory] = useState<
{ svg: string; xml: string }[]
>([])
const [isDrawioReady, setIsDrawioReady] = useState(false)
const hasCalledOnLoadRef = useRef(false)
const drawioRef = useRef<DrawIoEmbedRef | null>(null)
const resolverRef = useRef<((value: string) => void) | null>(null)
// Track if we're expecting an export for history (user-initiated)
const expectHistoryExportRef = useRef<boolean>(false)
const onDrawioLoad = () => {
// Only set ready state once to prevent infinite loops
if (hasCalledOnLoadRef.current) return
hasCalledOnLoadRef.current = true
// console.log("[DiagramContext] DrawIO loaded, setting ready state")
setIsDrawioReady(true)
}
const resetDrawioReady = () => {
// console.log("[DiagramContext] Resetting DrawIO ready state")
hasCalledOnLoadRef.current = false
setIsDrawioReady(false)
}
// Track if we're expecting an export for file save (stores raw export data)
const saveResolverRef = useRef<{
resolver: ((data: string) => void) | null
@@ -61,12 +82,29 @@ export function DiagramProvider({ children }: { children: React.ReactNode }) {
}
}
const loadDiagram = (chart: string) => {
const loadDiagram = (
chart: string,
skipValidation?: boolean,
): string | null => {
// Validate XML structure before loading (unless skipped for internal use)
if (!skipValidation) {
const validationError = validateMxCellStructure(chart)
if (validationError) {
console.warn("[loadDiagram] Validation error:", validationError)
return validationError
}
}
// Keep chartXML in sync even when diagrams are injected (e.g., display_diagram tool)
setChartXML(chart)
if (drawioRef.current) {
drawioRef.current.load({
xml: chart,
})
}
return null
}
const handleDiagramExport = (data: any) => {
@@ -106,8 +144,8 @@ export function DiagramProvider({ children }: { children: React.ReactNode }) {
const clearDiagram = () => {
const emptyDiagram = `<mxfile><diagram name="Page-1" id="page-1"><mxGraphModel><root><mxCell id="0"/><mxCell id="1" parent="0"/></root></mxGraphModel></diagram></mxfile>`
loadDiagram(emptyDiagram)
setChartXML(emptyDiagram)
// Skip validation for trusted internal template (loadDiagram also sets chartXML)
loadDiagram(emptyDiagram, true)
setLatestSvg("")
setDiagramHistory([])
}
@@ -142,6 +180,9 @@ export function DiagramProvider({ children }: { children: React.ReactNode }) {
fileContent = xmlContent
mimeType = "application/xml"
extension = ".drawio"
// Save to localStorage when user manually saves
localStorage.setItem(STORAGE_DIAGRAM_XML_KEY, xmlContent)
} else if (format === "png") {
// PNG data comes as base64 data URL
fileContent = exportData
@@ -220,6 +261,9 @@ export function DiagramProvider({ children }: { children: React.ReactNode }) {
handleDiagramExport,
clearDiagram,
saveDiagramToFile,
isDrawioReady,
onDrawioLoad,
resetDrawioReady,
}}
>
{children}

12
docker-compose.yml Normal file
View File

@@ -0,0 +1,12 @@
services:
drawio:
image: jgraph/drawio:latest
ports: ["8080:8080"]
next-ai-draw-io:
build:
context: .
args:
- NEXT_PUBLIC_DRAWIO_BASE_URL=http://localhost:8080
ports: ["3000:3000"]
env_file: .env
depends_on: [drawio]

View File

@@ -4,14 +4,16 @@
**AI驱动的图表创建工具 - 对话、绘制、可视化**
[English](./README.md) | 中文 | [日本語](./README_JA.md)
[English](../README.md) | 中文 | [日本語](./README_JA.md)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Next.js](https://img.shields.io/badge/Next.js-15.x-black)](https://nextjs.org/)
[![TypeScript](https://img.shields.io/badge/TypeScript-5.x-blue)](https://www.typescriptlang.org/)
[![TrendShift](https://trendshift.io/api/badge/repositories/15449)](https://next-ai-drawio.jiang.jp/)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Next.js](https://img.shields.io/badge/Next.js-16.x-black)](https://nextjs.org/)
[![React](https://img.shields.io/badge/React-19.x-61dafb)](https://react.dev/)
[![Sponsor](https://img.shields.io/badge/Sponsor-❤-ea4aaa)](https://github.com/sponsors/DayuanJiang)
[🚀 在线演示](https://next-ai-drawio.jiang.jp/)
[![Live Demo](../public/live-demo-button.svg)](https://next-ai-drawio.jiang.jp/)
</div>
@@ -19,16 +21,23 @@
https://github.com/user-attachments/assets/b2eef5f3-b335-4e71-a755-dc2e80931979
## 功能特性
## 目录
- [Next AI Draw.io](#next-ai-drawio)
- [目录](#目录)
- [示例](#示例)
- [功能特性](#功能特性)
- [快速开始](#快速开始)
- [在线试用](#在线试用)
- [使用Docker运行推荐](#使用docker运行推荐)
- [安装](#安装)
- [部署](#部署)
- [多提供商支持](#多提供商支持)
- [工作原理](#工作原理)
- [项目结构](#项目结构)
- [支持与联系](#支持与联系)
- [Star历史](#star历史)
- **LLM驱动的图表创建**利用大语言模型通过自然语言命令直接创建和操作draw.io图表
- **基于图像的图表复制**上传现有图表或图像让AI自动复制和增强
- **图表历史记录**全面的版本控制跟踪所有更改允许您查看和恢复AI编辑前的图表版本
- **交互式聊天界面**与AI实时对话来完善您的图表
- **AWS架构图支持**专门支持生成AWS架构图
- **动画连接器**:在图表元素之间创建动态动画连接器,实现更好的可视化效果
## **示例**
## 示例
以下是一些示例提示词及其生成的图表:
@@ -38,67 +47,59 @@ https://github.com/user-attachments/assets/b2eef5f3-b335-4e71-a755-dc2e80931979
<td colspan="2" valign="top" align="center">
<strong>动画Transformer连接器</strong><br />
<p><strong>提示词:</strong> 给我一个带有**动画连接器**的Transformer架构图。</p>
<img src="./public/animated_connectors.svg" alt="带动画连接器的Transformer架构" width="480" />
<img src="../public/animated_connectors.svg" alt="带动画连接器的Transformer架构" width="480" />
</td>
</tr>
<tr>
<td width="50%" valign="top">
<strong>GCP架构图</strong><br />
<p><strong>提示词:</strong> 使用**GCP图标**生成一个GCP架构图。在这个图中用户连接到托管在实例上的前端。</p>
<img src="./public/gcp_demo.svg" alt="GCP架构图" width="480" />
<img src="../public/gcp_demo.svg" alt="GCP架构图" width="480" />
</td>
<td width="50%" valign="top">
<strong>AWS架构图</strong><br />
<p><strong>提示词:</strong> 使用**AWS图标**生成一个AWS架构图。在这个图中用户连接到托管在实例上的前端。</p>
<img src="./public/aws_demo.svg" alt="AWS架构图" width="480" />
<img src="../public/aws_demo.svg" alt="AWS架构图" width="480" />
</td>
</tr>
<tr>
<td width="50%" valign="top">
<strong>Azure架构图</strong><br />
<p><strong>提示词:</strong> 使用**Azure图标**生成一个Azure架构图。在这个图中用户连接到托管在实例上的前端。</p>
<img src="./public/azure_demo.svg" alt="Azure架构图" width="480" />
<img src="../public/azure_demo.svg" alt="Azure架构图" width="480" />
</td>
<td width="50%" valign="top">
<strong>猫咪素描</strong><br />
<p><strong>提示词:</strong> 给我画一只可爱的猫。</p>
<img src="./public/cat_demo.svg" alt="猫咪绘图" width="240" />
<img src="../public/cat_demo.svg" alt="猫咪绘图" width="240" />
</td>
</tr>
</table>
</div>
## 工作原理
## 功能特性
本应用使用以下技术:
- **Next.js**:用于前端框架和路由
- **Vercel AI SDK**`ai` + `@ai-sdk/*`用于流式AI响应和多提供商支持
- **react-drawio**:用于图表表示和操作
图表以XML格式表示可在draw.io中渲染。AI处理您的命令并相应地生成或修改此XML。
## 多提供商支持
- AWS Bedrock默认
- OpenAI
- Anthropic
- Google AI
- Azure OpenAI
- Ollama
- OpenRouter
- DeepSeek
除AWS Bedrock和OpenRouter外所有提供商都支持自定义端点。
📖 **[详细的提供商配置指南](./docs/ai-providers.md)** - 查看各提供商的设置说明。
**模型要求**此任务需要强大的模型能力因为它涉及生成具有严格格式约束的长文本draw.io XML。推荐使用Claude Sonnet 4.5、GPT-4o、Gemini 2.0和DeepSeek V3/R1。
注意:`claude-sonnet-4-5` 已在带有AWS标志的draw.io图表上进行训练因此如果您想创建AWS架构图这是最佳选择。
- **LLM驱动的图表创建**利用大语言模型通过自然语言命令直接创建和操作draw.io图表
- **基于图像的图表复制**上传现有图表或图像让AI自动复制和增强
- **PDF和文本文件上传**上传PDF文档和文本文件提取内容并从现有文档生成图表
- **AI推理过程显示**查看支持模型的AI思考过程OpenAI o1/o3、Gemini、Claude等
- **图表历史记录**全面的版本控制跟踪所有更改允许您查看和恢复AI编辑前的图表版本
- **交互式聊天界面**与AI实时对话来完善您的图表
- **云架构图支持**专门支持生成云架构图AWS、GCP、Azure
- **动画连接器**:在图表元素之间创建动态动画连接器,实现更好的可视化效果
## 快速开始
### 在线试用
无需安装!直接在我们的演示站点试用:
[![Live Demo](../public/live-demo-button.svg)](https://next-ai-drawio.jiang.jp/)
> 注意:由于访问量较大,演示站点目前使用 minimax-m2 模型。如需获得最佳效果,建议使用 Claude Sonnet 4.5 或 Claude Opus 4.5 自行部署。
> **使用自己的 API Key**:您可以使用自己的 API Key 来绕过演示站点的用量限制。点击聊天面板中的设置图标即可配置您的 Provider 和 API Key。您的 Key 仅保存在浏览器本地,不会被存储在服务器上。
### 使用Docker运行推荐
如果您只想在本地运行最好的方式是使用Docker。
@@ -115,10 +116,20 @@ docker run -d -p 3000:3000 \
ghcr.io/dayuanjiang/next-ai-draw-io:latest
```
或者使用 env 文件:
```bash
cp env.example .env
# 编辑 .env 填写您的配置
docker run -d -p 3000:3000 --env-file .env ghcr.io/dayuanjiang/next-ai-draw-io:latest
```
在浏览器中打开 [http://localhost:3000](http://localhost:3000)。
请根据您首选的AI提供商配置替换环境变量。可用选项请参阅[多提供商支持](#多提供商支持)。
> **离线部署:** 如果 `embed.diagrams.net` 被屏蔽,请参阅 [离线部署指南](./offline-deployment.md) 了解配置选项。
### 安装
1. 克隆仓库:
@@ -132,8 +143,6 @@ cd next-ai-draw-io
```bash
npm install
# 或
yarn install
```
3. 配置您的AI提供商
@@ -146,14 +155,15 @@ cp env.example .env.local
编辑 `.env.local` 并配置您选择的提供商:
- 将 `AI_PROVIDER` 设置为您选择的提供商bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek
- 将 `AI_PROVIDER` 设置为您选择的提供商bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek, siliconflow
- 将 `AI_MODEL` 设置为您要使用的特定模型
- 添加您的提供商所需的API密钥
- `TEMPERATURE`:可选的温度设置(例如 `0` 表示确定性输出)。对于不支持此参数的模型(如推理模型),请不要设置。
- `ACCESS_CODE_LIST` 访问密码,可选,可以使用逗号隔开多个密码。
> 警告:如果不填写 `ACCESS_CODE_LIST`,则任何人都可以直接使用你部署后的网站,可能会导致你的 token 被急速消耗完毕,建议填写此选项。
详细设置说明请参阅[提供商配置指南](./docs/ai-providers.md)。
详细设置说明请参阅[提供商配置指南](./ai-providers.md)。
4. 运行开发服务器:
@@ -174,6 +184,38 @@ npm run dev
请确保在Vercel控制台中**设置环境变量**,就像您在本地 `.env.local` 文件中所做的那样。
## 多提供商支持
- AWS Bedrock默认
- OpenAI
- Anthropic
- Google AI
- Azure OpenAI
- Ollama
- OpenRouter
- DeepSeek
- SiliconFlow
除AWS Bedrock和OpenRouter外所有提供商都支持自定义端点。
📖 **[详细的提供商配置指南](./ai-providers.md)** - 查看各提供商的设置说明。
**模型要求**此任务需要强大的模型能力因为它涉及生成具有严格格式约束的长文本draw.io XML。推荐使用Claude Sonnet 4.5、GPT-4o、Gemini 2.0和DeepSeek V3/R1。
注意:`claude-sonnet-4-5` 已在带有AWS标志的draw.io图表上进行训练因此如果您想创建AWS架构图这是最佳选择。
## 工作原理
本应用使用以下技术:
- **Next.js**:用于前端框架和路由
- **Vercel AI SDK**`ai` + `@ai-sdk/*`用于流式AI响应和多提供商支持
- **react-drawio**:用于图表表示和操作
图表以XML格式表示可在draw.io中渲染。AI处理您的命令并相应地生成或修改此XML。
## 项目结构
```
@@ -193,14 +235,6 @@ lib/ # 工具函数和辅助程序
public/ # 静态资源包括示例图片
```
## 待办事项
- [x] 允许LLM修改XML而不是每次从头生成
- [x] 提高形状流式更新的流畅度
- [x] 添加多AI提供商支持OpenAI, Anthropic, Google, Azure, Ollama
- [x] 解决超过60秒的会话生成失败的bug
- [ ] 在UI上添加API配置
## 支持与联系
如果您觉得这个项目有用,请考虑[赞助](https://github.com/sponsors/DayuanJiang)来帮助我托管在线演示站点!

View File

@@ -4,14 +4,16 @@
**AI搭載のダイアグラム作成ツール - チャット、描画、可視化**
[English](./README.md) | [中文](./README_CN.md) | 日本語
[English](../README.md) | [中文](./README_CN.md) | 日本語
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Next.js](https://img.shields.io/badge/Next.js-15.x-black)](https://nextjs.org/)
[![TypeScript](https://img.shields.io/badge/TypeScript-5.x-blue)](https://www.typescriptlang.org/)
[![TrendShift](https://trendshift.io/api/badge/repositories/15449)](https://next-ai-drawio.jiang.jp/)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Next.js](https://img.shields.io/badge/Next.js-16.x-black)](https://nextjs.org/)
[![React](https://img.shields.io/badge/React-19.x-61dafb)](https://react.dev/)
[![Sponsor](https://img.shields.io/badge/Sponsor-❤-ea4aaa)](https://github.com/sponsors/DayuanJiang)
[🚀 ライブデモ](https://next-ai-drawio.jiang.jp/)
[![Live Demo](../public/live-demo-button.svg)](https://next-ai-drawio.jiang.jp/)
</div>
@@ -19,16 +21,23 @@ AI機能とdraw.ioダイアグラムを統合したNext.jsウェブアプリケ
https://github.com/user-attachments/assets/b2eef5f3-b335-4e71-a755-dc2e80931979
## 機能
## 目次
- [Next AI Draw.io](#next-ai-drawio)
- [目次](#目次)
- [](#例)
- [機能](#機能)
- [はじめに](#はじめに)
- [オンラインで試す](#オンラインで試す)
- [Dockerで実行推奨](#dockerで実行推奨)
- [インストール](#インストール)
- [デプロイ](#デプロイ)
- [マルチプロバイダーサポート](#マルチプロバイダーサポート)
- [仕組み](#仕組み)
- [プロジェクト構造](#プロジェクト構造)
- [サポート&お問い合わせ](#サポートお問い合わせ)
- [スター履歴](#スター履歴)
- **LLM搭載のダイアグラム作成**大規模言語モデルを活用して、自然言語コマンドで直接draw.ioダイアグラムを作成・操作
- **画像ベースのダイアグラム複製**既存のダイアグラムや画像をアップロードし、AIが自動的に複製・強化
- **ダイアグラム履歴**すべての変更を追跡する包括的なバージョン管理。AI編集前のダイアグラムの以前のバージョンを表示・復元可能
- **インタラクティブなチャットインターフェース**AIとリアルタイムでコミュニケーションしてダイアグラムを改善
- **AWSアーキテクチャダイアグラムサポート**AWSアーキテクチャダイアグラムの生成を専門的にサポート
- **アニメーションコネクタ**:より良い可視化のためにダイアグラム要素間に動的でアニメーション化されたコネクタを作成
## **例**
## 例
以下はいくつかのプロンプト例と生成されたダイアグラムです:
@@ -38,67 +47,59 @@ https://github.com/user-attachments/assets/b2eef5f3-b335-4e71-a755-dc2e80931979
<td colspan="2" valign="top" align="center">
<strong>アニメーションTransformerコネクタ</strong><br />
<p><strong>プロンプト:</strong> **アニメーションコネクタ**付きのTransformerアーキテクチャ図を作成してください。</p>
<img src="./public/animated_connectors.svg" alt="アニメーションコネクタ付きTransformerアーキテクチャ" width="480" />
<img src="../public/animated_connectors.svg" alt="アニメーションコネクタ付きTransformerアーキテクチャ" width="480" />
</td>
</tr>
<tr>
<td width="50%" valign="top">
<strong>GCPアーキテクチャ図</strong><br />
<p><strong>プロンプト:</strong> **GCPアイコン**を使用してGCPアーキテクチャ図を生成してください。この図では、ユーザーがインスタンス上でホストされているフロントエンドに接続します。</p>
<img src="./public/gcp_demo.svg" alt="GCPアーキテクチャ図" width="480" />
<img src="../public/gcp_demo.svg" alt="GCPアーキテクチャ図" width="480" />
</td>
<td width="50%" valign="top">
<strong>AWSアーキテクチャ図</strong><br />
<p><strong>プロンプト:</strong> **AWSアイコン**を使用してAWSアーキテクチャ図を生成してください。この図では、ユーザーがインスタンス上でホストされているフロントエンドに接続します。</p>
<img src="./public/aws_demo.svg" alt="AWSアーキテクチャ図" width="480" />
<img src="../public/aws_demo.svg" alt="AWSアーキテクチャ図" width="480" />
</td>
</tr>
<tr>
<td width="50%" valign="top">
<strong>Azureアーキテクチャ図</strong><br />
<p><strong>プロンプト:</strong> **Azureアイコン**を使用してAzureアーキテクチャ図を生成してください。この図では、ユーザーがインスタンス上でホストされているフロントエンドに接続します。</p>
<img src="./public/azure_demo.svg" alt="Azureアーキテクチャ図" width="480" />
<img src="../public/azure_demo.svg" alt="Azureアーキテクチャ図" width="480" />
</td>
<td width="50%" valign="top">
<strong>猫のスケッチ</strong><br />
<p><strong>プロンプト:</strong> かわいい猫を描いてください。</p>
<img src="./public/cat_demo.svg" alt="猫の絵" width="240" />
<img src="../public/cat_demo.svg" alt="猫の絵" width="240" />
</td>
</tr>
</table>
</div>
## 仕組み
## 機能
本アプリケーションは以下の技術を使用しています:
- **Next.js**:フロントエンドフレームワークとルーティング
- **Vercel AI SDK**`ai` + `@ai-sdk/*`ストリーミングAIレスポンスとマルチプロバイダーサポート
- **react-drawio**:ダイアグラムの表現と操作
ダイアグラムはdraw.ioでレンダリングできるXMLとして表現されます。AIがコマンドを処理し、それに応じてこのXMLを生成または変更します。
## マルチプロバイダーサポート
- AWS Bedrockデフォルト
- OpenAI
- Anthropic
- Google AI
- Azure OpenAI
- Ollama
- OpenRouter
- DeepSeek
AWS BedrockとOpenRouter以外のすべてのプロバイダーはカスタムエンドポイントをサポートしています。
📖 **[詳細なプロバイダー設定ガイド](./docs/ai-providers.md)** - 各プロバイダーの設定手順をご覧ください。
**モデル要件**このタスクは厳密なフォーマット制約draw.io XMLを持つ長文テキスト生成を伴うため、強力なモデル機能が必要です。Claude Sonnet 4.5、GPT-4o、Gemini 2.0、DeepSeek V3/R1を推奨します。
注:`claude-sonnet-4-5`はAWSロゴ付きのdraw.ioダイアグラムで学習されているため、AWSアーキテクチャダイアグラムを作成したい場合は最適な選択です。
- **LLM搭載のダイアグラム作成**大規模言語モデルを活用して、自然言語コマンドで直接draw.ioダイアグラムを作成・操作
- **画像ベースのダイアグラム複製**既存のダイアグラムや画像をアップロードし、AIが自動的に複製・強化
- **PDFとテキストファイルのアップロード**PDFドキュメントやテキストファイルをアップロードして、既存のドキュメントからコンテンツを抽出し、ダイアグラムを生成
- **AI推論プロセス表示**サポートされているモデルOpenAI o1/o3、Gemini、ClaudeなどのAIの思考プロセスを表示
- **ダイアグラム履歴**すべての変更を追跡する包括的なバージョン管理。AI編集前のダイアグラムの以前のバージョンを表示・復元可能
- **インタラクティブなチャットインターフェース**AIとリアルタイムでコミュニケーションしてダイアグラムを改善
- **クラウドアーキテクチャダイアグラムサポート**クラウドアーキテクチャダイアグラムの生成を専門的にサポートAWS、GCP、Azure
- **アニメーションコネクタ**:より良い可視化のためにダイアグラム要素間に動的でアニメーション化されたコネクタを作成
## はじめに
### オンラインで試す
インストール不要!デモサイトで直接お試しください:
[![Live Demo](../public/live-demo-button.svg)](https://next-ai-drawio.jiang.jp/)
> 注意:アクセス数が多いため、デモサイトでは現在 minimax-m2 モデルを使用しています。最高の結果を得るには、Claude Sonnet 4.5 または Claude Opus 4.5 でのセルフホスティングをお勧めします。
> **自分のAPIキーを使用**自分のAPIキーを使用することで、デモサイトの利用制限を回避できます。チャットパネルの設定アイコンをクリックして、プロバイダーとAPIキーを設定してください。キーはブラウザのローカルに保存され、サーバーには保存されません。
### Dockerで実行推奨
ローカルで実行したいだけなら、Dockerを使用するのが最も簡単です。
@@ -115,10 +116,20 @@ docker run -d -p 3000:3000 \
ghcr.io/dayuanjiang/next-ai-draw-io:latest
```
または env ファイルを使用:
```bash
cp env.example .env
# .env を編集して設定を入力
docker run -d -p 3000:3000 --env-file .env ghcr.io/dayuanjiang/next-ai-draw-io:latest
```
ブラウザで [http://localhost:3000](http://localhost:3000) を開いてください。
環境変数はお好みのAIプロバイダー設定に置き換えてください。利用可能なオプションについては[マルチプロバイダーサポート](#マルチプロバイダーサポート)を参照してください。
> **オフラインデプロイ:** `embed.diagrams.net` がブロックされている場合は、[オフラインデプロイガイド](./offline-deployment.md) で設定オプションをご確認ください。
### インストール
1. リポジトリをクローン:
@@ -132,8 +143,6 @@ cd next-ai-draw-io
```bash
npm install
# または
yarn install
```
3. AIプロバイダーを設定
@@ -146,14 +155,15 @@ cp env.example .env.local
`.env.local`を編集して選択したプロバイダーを設定:
- `AI_PROVIDER`を選択したプロバイダーに設定bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek
- `AI_PROVIDER`を選択したプロバイダーに設定bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek, siliconflow
- `AI_MODEL`を使用する特定のモデルに設定
- プロバイダーに必要なAPIキーを追加
- `TEMPERATURE`:オプションの温度設定(例:`0`で決定論的な出力)。温度をサポートしないモデル(推論モデルなど)では設定しないでください。
- `ACCESS_CODE_LIST` アクセスパスワード(オプション)。カンマ区切りで複数のパスワードを指定できます。
> 警告:`ACCESS_CODE_LIST`を設定しない場合、誰でもデプロイされたサイトに直接アクセスできるため、トークンが急速に消費される可能性があります。このオプションを設定することをお勧めします。
詳細な設定手順については[プロバイダー設定ガイド](./docs/ai-providers.md)を参照してください。
詳細な設定手順については[プロバイダー設定ガイド](./ai-providers.md)を参照してください。
4. 開発サーバーを起動:
@@ -174,6 +184,38 @@ Next.jsアプリをデプロイする最も簡単な方法は、Next.jsの作成
ローカルの`.env.local`ファイルと同様に、Vercelダッシュボードで**環境変数を設定**してください。
## マルチプロバイダーサポート
- AWS Bedrockデフォルト
- OpenAI
- Anthropic
- Google AI
- Azure OpenAI
- Ollama
- OpenRouter
- DeepSeek
- SiliconFlow
AWS BedrockとOpenRouter以外のすべてのプロバイダーはカスタムエンドポイントをサポートしています。
📖 **[詳細なプロバイダー設定ガイド](./ai-providers.md)** - 各プロバイダーの設定手順をご覧ください。
**モデル要件**このタスクは厳密なフォーマット制約draw.io XMLを持つ長文テキスト生成を伴うため、強力なモデル機能が必要です。Claude Sonnet 4.5、GPT-4o、Gemini 2.0、DeepSeek V3/R1を推奨します。
注:`claude-sonnet-4-5`はAWSロゴ付きのdraw.ioダイアグラムで学習されているため、AWSアーキテクチャダイアグラムを作成したい場合は最適な選択です。
## 仕組み
本アプリケーションは以下の技術を使用しています:
- **Next.js**:フロントエンドフレームワークとルーティング
- **Vercel AI SDK**`ai` + `@ai-sdk/*`ストリーミングAIレスポンスとマルチプロバイダーサポート
- **react-drawio**:ダイアグラムの表現と操作
ダイアグラムはdraw.ioでレンダリングできるXMLとして表現されます。AIがコマンドを処理し、それに応じてこのXMLを生成または変更します。
## プロジェクト構造
```
@@ -193,14 +235,6 @@ lib/ # ユーティリティ関数とヘルパー
public/ # サンプル画像を含む静的アセット
```
## TODO
- [x] LLMが毎回ゼロから生成する代わりにXMLを修正できるようにする
- [x] シェイプストリーミング更新の滑らかさを改善
- [x] 複数のAIプロバイダーサポートを追加OpenAI, Anthropic, Google, Azure, Ollama
- [x] 60秒以上のセッションで生成が失敗するバグを解決
- [ ] UIにAPI設定を追加
## サポート&お問い合わせ
このプロジェクトが役に立ったら、ライブデモサイトのホスティングを支援するために[スポンサー](https://github.com/sponsors/DayuanJiang)をご検討ください!

View File

@@ -63,6 +63,19 @@ Optional custom endpoint:
DEEPSEEK_BASE_URL=https://your-custom-endpoint
```
### SiliconFlow (OpenAI-compatible)
```bash
SILICONFLOW_API_KEY=your_api_key
AI_MODEL=deepseek-ai/DeepSeek-V3 # example; use any SiliconFlow model id
```
Optional custom endpoint (defaults to the recommended domain):
```bash
SILICONFLOW_BASE_URL=https://api.siliconflow.com/v1 # or https://api.siliconflow.cn/v1
```
### Azure OpenAI
```bash
@@ -85,7 +98,7 @@ AWS_SECRET_ACCESS_KEY=your_secret_access_key
AI_MODEL=anthropic.claude-sonnet-4-5-20250514-v1:0
```
Note: On AWS (Amplify, Lambda, EC2 with IAM role), credentials are automatically obtained from the IAM role.
Note: On AWS (Lambda, EC2 with IAM role), credentials are automatically obtained from the IAM role.
### OpenRouter
@@ -120,7 +133,7 @@ If you only configure **one** provider's API key, the system will automatically
If you configure **multiple** API keys, you must explicitly set `AI_PROVIDER`:
```bash
AI_PROVIDER=google # or: openai, anthropic, deepseek, azure, bedrock, openrouter, ollama
AI_PROVIDER=google # or: openai, anthropic, deepseek, siliconflow, azure, bedrock, openrouter, ollama
```
## Model Capability Requirements
@@ -133,6 +146,20 @@ This task requires exceptionally strong model capabilities, as it involves gener
**Note on Ollama**: While Ollama is supported as a provider, it's generally not practical for this use case unless you're running high-capability models like DeepSeek R1 or Qwen3-235B locally.
## Temperature Setting
You can optionally configure the temperature via environment variable:
```bash
TEMPERATURE=0 # More deterministic output (recommended for diagrams)
```
**Important**: Leave `TEMPERATURE` unset for models that don't support temperature settings, such as:
- GPT-5.1 and other reasoning models
- Some specialized models
When unset, the model uses its default behavior.
## Recommendations
- **Best experience**: Use models with vision support (GPT-4o, Claude, Gemini) for image-to-diagram features

View File

@@ -0,0 +1,39 @@
# Offline Deployment
Deploy Next AI Draw.io offline by self-hosting draw.io to replace `embed.diagrams.net`.
**Note:** `NEXT_PUBLIC_DRAWIO_BASE_URL` is a **build-time** variable. Changing it requires rebuilding the Docker image.
## Docker Compose Setup
1. Clone the repository and define API keys in `.env`.
2. Create `docker-compose.yml`:
```yaml
services:
drawio:
image: jgraph/drawio:latest
ports: ["8080:8080"]
next-ai-draw-io:
build:
context: .
args:
- NEXT_PUBLIC_DRAWIO_BASE_URL=http://localhost:8080
ports: ["3000:3000"]
env_file: .env
depends_on: [drawio]
```
3. Run `docker compose up -d` and open `http://localhost:3000`.
## Configuration & Critical Warning
**The `NEXT_PUBLIC_DRAWIO_BASE_URL` must be accessible from the user's browser.**
| Scenario | URL Value |
|----------|-----------|
| Localhost | `http://localhost:8080` |
| Remote/Server | `http://YOUR_SERVER_IP:8080` or `https://drawio.your-domain.com` |
**Do NOT use** internal Docker aliases like `http://drawio:8080`; the browser cannot resolve them.

View File

@@ -1,6 +1,6 @@
# AI Provider Configuration
# AI_PROVIDER: Which provider to use
# Options: bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek
# Options: bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek, siliconflow
# Default: bedrock
AI_PROVIDER=bedrock
@@ -11,28 +11,49 @@ AI_MODEL=global.anthropic.claude-sonnet-4-5-20250929-v1:0
# AWS_REGION=us-east-1
# AWS_ACCESS_KEY_ID=your-access-key-id
# AWS_SECRET_ACCESS_KEY=your-secret-access-key
# Note: Claude and Nova models support reasoning/extended thinking
# BEDROCK_REASONING_BUDGET_TOKENS=12000 # Optional: Claude reasoning budget in tokens (1024-64000)
# BEDROCK_REASONING_EFFORT=medium # Optional: Nova reasoning effort (low/medium/high)
# OpenAI Configuration
# OPENAI_API_KEY=sk-...
# OPENAI_BASE_URL=https://api.openai.com/v1 # Optional: Custom OpenAI-compatible endpoint
# OPENAI_ORGANIZATION=org-... # Optional
# OPENAI_PROJECT=proj_... # Optional
# Note: o1/o3/gpt-5 models automatically enable reasoning summary (default: detailed)
# OPENAI_REASONING_EFFORT=low # Optional: Reasoning effort (minimal/low/medium/high) - for o1/o3/gpt-5
# OPENAI_REASONING_SUMMARY=detailed # Optional: Override reasoning summary (none/brief/detailed)
# Anthropic (Direct) Configuration
# ANTHROPIC_API_KEY=sk-ant-...
# ANTHROPIC_BASE_URL=https://your-custom-anthropic/v1
# ANTHROPIC_THINKING_TYPE=enabled # Optional: Anthropic extended thinking (enabled)
# ANTHROPIC_THINKING_BUDGET_TOKENS=12000 # Optional: Budget for extended thinking in tokens
# Google Generative AI Configuration
# GOOGLE_GENERATIVE_AI_API_KEY=...
# GOOGLE_BASE_URL=https://generativelanguage.googleapis.com/v1beta # Optional: Custom endpoint
# GOOGLE_CANDIDATE_COUNT=1 # Optional: Number of candidates to generate
# GOOGLE_TOP_K=40 # Optional: Top K sampling parameter
# GOOGLE_TOP_P=0.95 # Optional: Nucleus sampling parameter
# Note: Gemini 2.5/3 models automatically enable reasoning display (includeThoughts: true)
# GOOGLE_THINKING_BUDGET=8192 # Optional: Gemini 2.5 thinking budget in tokens (for more/less thinking)
# GOOGLE_THINKING_LEVEL=high # Optional: Gemini 3 thinking level (low/high)
# Azure OpenAI Configuration
# Configure endpoint using ONE of these methods:
# 1. AZURE_RESOURCE_NAME - SDK constructs: https://{name}.openai.azure.com/openai/v1{path}
# 2. AZURE_BASE_URL - SDK appends /v1{path} to your URL
# If both are set, AZURE_BASE_URL takes precedence.
# AZURE_RESOURCE_NAME=your-resource-name
# AZURE_API_KEY=...
# AZURE_BASE_URL=https://your-resource.openai.azure.com # Optional: Custom endpoint (overrides resourceName)
# AZURE_BASE_URL=https://your-resource.openai.azure.com/openai # Alternative: Custom endpoint
# AZURE_REASONING_EFFORT=low # Optional: Azure reasoning effort (low, medium, high)
# AZURE_REASONING_SUMMARY=detailed
# Ollama (Local) Configuration
# OLLAMA_BASE_URL=http://localhost:11434/api # Optional, defaults to localhost
# OLLAMA_ENABLE_THINKING=true # Optional: Enable thinking for models that support it (e.g., qwen3)
# OpenRouter Configuration
# OPENROUTER_API_KEY=sk-or-v1-...
@@ -42,11 +63,31 @@ AI_MODEL=global.anthropic.claude-sonnet-4-5-20250929-v1:0
# DEEPSEEK_API_KEY=sk-...
# DEEPSEEK_BASE_URL=https://api.deepseek.com/v1 # Optional: Custom endpoint
# SiliconFlow Configuration (OpenAI-compatible)
# Base domain can be .com or .cn, defaults to https://api.siliconflow.com/v1
# SILICONFLOW_API_KEY=sk-...
# SILICONFLOW_BASE_URL=https://api.siliconflow.com/v1 # Optional: switch to https://api.siliconflow.cn/v1 if needed
# Langfuse Observability (Optional)
# Enable LLM tracing and analytics - https://langfuse.com
# LANGFUSE_PUBLIC_KEY=pk-lf-...
# LANGFUSE_SECRET_KEY=sk-lf-...
# LANGFUSE_BASEURL=https://cloud.langfuse.com # EU region, use https://us.cloud.langfuse.com for US
# Temperature (Optional)
# Controls randomness in AI responses. Lower = more deterministic.
# Leave unset for models that don't support temperature (e.g., GPT-5.1 reasoning models)
# TEMPERATURE=0
# Access Control (Optional)
# ACCESS_CODE_LIST=your-secret-code,another-code
# Draw.io Configuration (Optional)
# NEXT_PUBLIC_DRAWIO_BASE_URL=https://embed.diagrams.net # Default: https://embed.diagrams.net
# Use this to point to a self-hosted draw.io instance
# PDF Input Feature (Optional)
# Enable PDF file upload to extract text and generate diagrams
# Enabled by default. Set to "false" to disable.
# ENABLE_PDF_INPUT=true
# NEXT_PUBLIC_MAX_EXTRACTED_CHARS=150000 # Max characters for PDF/text extraction (default: 150000)

26
lib/ai-config.ts Normal file
View File

@@ -0,0 +1,26 @@
import { STORAGE_KEYS } from "./storage"
/**
* Get AI configuration from localStorage.
* Returns API keys and settings for custom AI providers.
* Used to override server defaults when user provides their own API key.
*/
export function getAIConfig() {
if (typeof window === "undefined") {
return {
accessCode: "",
aiProvider: "",
aiBaseUrl: "",
aiApiKey: "",
aiModel: "",
}
}
return {
accessCode: localStorage.getItem(STORAGE_KEYS.accessCode) || "",
aiProvider: localStorage.getItem(STORAGE_KEYS.aiProvider) || "",
aiBaseUrl: localStorage.getItem(STORAGE_KEYS.aiBaseUrl) || "",
aiApiKey: localStorage.getItem(STORAGE_KEYS.aiApiKey) || "",
aiModel: localStorage.getItem(STORAGE_KEYS.aiModel) || "",
}
}

View File

@@ -17,6 +17,7 @@ export type ProviderName =
| "ollama"
| "openrouter"
| "deepseek"
| "siliconflow"
interface ModelConfig {
model: any
@@ -25,6 +26,24 @@ interface ModelConfig {
modelId: string
}
export interface ClientOverrides {
provider?: string | null
baseUrl?: string | null
apiKey?: string | null
modelId?: string | null
}
// Providers that can be used with client-provided API keys
const ALLOWED_CLIENT_PROVIDERS: ProviderName[] = [
"openai",
"anthropic",
"google",
"azure",
"openrouter",
"deepseek",
"siliconflow",
]
// Bedrock provider options for Anthropic beta features
const BEDROCK_ANTHROPIC_BETA = {
bedrock: {
@@ -37,6 +56,295 @@ const ANTHROPIC_BETA_HEADERS = {
"anthropic-beta": "fine-grained-tool-streaming-2025-05-14",
}
/**
* Safely parse integer from environment variable with validation
*/
function parseIntSafe(
value: string | undefined,
varName: string,
min?: number,
max?: number,
): number | undefined {
if (!value) return undefined
const parsed = Number.parseInt(value, 10)
if (Number.isNaN(parsed)) {
throw new Error(`${varName} must be a valid integer, got: ${value}`)
}
if (min !== undefined && parsed < min) {
throw new Error(`${varName} must be >= ${min}, got: ${parsed}`)
}
if (max !== undefined && parsed > max) {
throw new Error(`${varName} must be <= ${max}, got: ${parsed}`)
}
return parsed
}
/**
* Build provider-specific options from environment variables
* Supports various AI SDK providers with their unique configuration options
*
* Environment variables:
* - OPENAI_REASONING_EFFORT: OpenAI reasoning effort level (minimal/low/medium/high) - for o1/o3/gpt-5
* - OPENAI_REASONING_SUMMARY: OpenAI reasoning summary (none/brief/detailed) - auto-enabled for o1/o3/gpt-5
* - ANTHROPIC_THINKING_BUDGET_TOKENS: Anthropic thinking budget in tokens (1024-64000)
* - ANTHROPIC_THINKING_TYPE: Anthropic thinking type (enabled)
* - GOOGLE_THINKING_BUDGET: Google Gemini 2.5 thinking budget in tokens (1024-100000)
* - GOOGLE_THINKING_LEVEL: Google Gemini 3 thinking level (low/high)
* - AZURE_REASONING_EFFORT: Azure/OpenAI reasoning effort (low/medium/high)
* - AZURE_REASONING_SUMMARY: Azure reasoning summary (none/brief/detailed)
* - BEDROCK_REASONING_BUDGET_TOKENS: Bedrock Claude reasoning budget in tokens (1024-64000)
* - BEDROCK_REASONING_EFFORT: Bedrock Nova reasoning effort (low/medium/high)
* - OLLAMA_ENABLE_THINKING: Enable Ollama thinking mode (set to "true")
*/
function buildProviderOptions(
provider: ProviderName,
modelId?: string,
): Record<string, any> | undefined {
const options: Record<string, any> = {}
switch (provider) {
case "openai": {
const reasoningEffort = process.env.OPENAI_REASONING_EFFORT
const reasoningSummary = process.env.OPENAI_REASONING_SUMMARY
// OpenAI reasoning models (o1, o3, gpt-5) need reasoningSummary to return thoughts
if (
modelId &&
(modelId.includes("o1") ||
modelId.includes("o3") ||
modelId.includes("gpt-5"))
) {
options.openai = {
// Auto-enable reasoning summary for reasoning models (default: detailed)
reasoningSummary:
(reasoningSummary as "none" | "brief" | "detailed") ||
"detailed",
}
// Optionally configure reasoning effort
if (reasoningEffort) {
options.openai.reasoningEffort = reasoningEffort as
| "minimal"
| "low"
| "medium"
| "high"
}
} else if (reasoningEffort || reasoningSummary) {
// Non-reasoning models: only apply if explicitly configured
options.openai = {}
if (reasoningEffort) {
options.openai.reasoningEffort = reasoningEffort as
| "minimal"
| "low"
| "medium"
| "high"
}
if (reasoningSummary) {
options.openai.reasoningSummary = reasoningSummary as
| "none"
| "brief"
| "detailed"
}
}
break
}
case "anthropic": {
const thinkingBudget = parseIntSafe(
process.env.ANTHROPIC_THINKING_BUDGET_TOKENS,
"ANTHROPIC_THINKING_BUDGET_TOKENS",
1024,
64000,
)
const thinkingType =
process.env.ANTHROPIC_THINKING_TYPE || "enabled"
if (thinkingBudget) {
options.anthropic = {
thinking: {
type: thinkingType,
budgetTokens: thinkingBudget,
},
}
}
break
}
case "google": {
const reasoningEffort = process.env.GOOGLE_REASONING_EFFORT
const thinkingBudgetVal = parseIntSafe(
process.env.GOOGLE_THINKING_BUDGET,
"GOOGLE_THINKING_BUDGET",
1024,
100000,
)
const thinkingLevel = process.env.GOOGLE_THINKING_LEVEL
// Google Gemini 2.5/3 models think by default, but need includeThoughts: true
// to return the reasoning in the response
if (
modelId &&
(modelId.includes("gemini-2") ||
modelId.includes("gemini-3") ||
modelId.includes("gemini2") ||
modelId.includes("gemini3"))
) {
const thinkingConfig: Record<string, any> = {
includeThoughts: true,
}
// Optionally configure thinking budget or level
if (
thinkingBudgetVal &&
(modelId.includes("2.5") || modelId.includes("2-5"))
) {
thinkingConfig.thinkingBudget = thinkingBudgetVal
} else if (
thinkingLevel &&
(modelId.includes("gemini-3") ||
modelId.includes("gemini3"))
) {
thinkingConfig.thinkingLevel = thinkingLevel as
| "low"
| "high"
}
options.google = { thinkingConfig }
} else if (reasoningEffort) {
options.google = {
reasoningEffort: reasoningEffort as
| "low"
| "medium"
| "high",
}
}
// Keep existing Google options
const options_obj: Record<string, any> = {}
const candidateCount = parseIntSafe(
process.env.GOOGLE_CANDIDATE_COUNT,
"GOOGLE_CANDIDATE_COUNT",
1,
8,
)
if (candidateCount) {
options_obj.candidateCount = candidateCount
}
const topK = parseIntSafe(
process.env.GOOGLE_TOP_K,
"GOOGLE_TOP_K",
1,
100,
)
if (topK) {
options_obj.topK = topK
}
if (process.env.GOOGLE_TOP_P) {
const topP = Number.parseFloat(process.env.GOOGLE_TOP_P)
if (Number.isNaN(topP) || topP < 0 || topP > 1) {
throw new Error(
`GOOGLE_TOP_P must be a number between 0 and 1, got: ${process.env.GOOGLE_TOP_P}`,
)
}
options_obj.topP = topP
}
if (Object.keys(options_obj).length > 0) {
options.google = { ...options.google, ...options_obj }
}
break
}
case "azure": {
const reasoningEffort = process.env.AZURE_REASONING_EFFORT
const reasoningSummary = process.env.AZURE_REASONING_SUMMARY
if (reasoningEffort || reasoningSummary) {
options.azure = {}
if (reasoningEffort) {
options.azure.reasoningEffort = reasoningEffort as
| "low"
| "medium"
| "high"
}
if (reasoningSummary) {
options.azure.reasoningSummary = reasoningSummary as
| "none"
| "brief"
| "detailed"
}
}
break
}
case "bedrock": {
const budgetTokens = parseIntSafe(
process.env.BEDROCK_REASONING_BUDGET_TOKENS,
"BEDROCK_REASONING_BUDGET_TOKENS",
1024,
64000,
)
const reasoningEffort = process.env.BEDROCK_REASONING_EFFORT
// Bedrock reasoning ONLY for Claude and Nova models
// Other models (MiniMax, etc.) don't support reasoningConfig
if (
modelId &&
(budgetTokens || reasoningEffort) &&
(modelId.includes("claude") ||
modelId.includes("anthropic") ||
modelId.includes("nova") ||
modelId.includes("amazon"))
) {
const reasoningConfig: Record<string, any> = { type: "enabled" }
// Claude models: use budgetTokens (1024-64000)
if (
budgetTokens &&
(modelId.includes("claude") ||
modelId.includes("anthropic"))
) {
reasoningConfig.budgetTokens = budgetTokens
}
// Nova models: use maxReasoningEffort (low/medium/high)
else if (
reasoningEffort &&
(modelId.includes("nova") || modelId.includes("amazon"))
) {
reasoningConfig.maxReasoningEffort = reasoningEffort as
| "low"
| "medium"
| "high"
}
options.bedrock = { reasoningConfig }
}
break
}
case "ollama": {
const enableThinking = process.env.OLLAMA_ENABLE_THINKING
// Ollama supports reasoning with think: true for models like qwen3
if (enableThinking === "true") {
options.ollama = { think: true }
}
break
}
case "deepseek":
case "openrouter":
case "siliconflow": {
// These providers don't have reasoning configs in AI SDK yet
break
}
default:
break
}
return Object.keys(options).length > 0 ? options : undefined
}
// Map of provider to required environment variable
const PROVIDER_ENV_VARS: Record<ProviderName, string | null> = {
bedrock: null, // AWS SDK auto-uses IAM role on AWS, or env vars locally
@@ -47,6 +355,7 @@ const PROVIDER_ENV_VARS: Record<ProviderName, string | null> = {
ollama: null, // No credentials needed for local Ollama
openrouter: "OPENROUTER_API_KEY",
deepseek: "DEEPSEEK_API_KEY",
siliconflow: "SILICONFLOW_API_KEY",
}
/**
@@ -90,7 +399,7 @@ function validateProviderCredentials(provider: ProviderName): void {
* Get the AI model based on environment variables
*
* Environment variables:
* - AI_PROVIDER: The provider to use (bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek)
* - AI_PROVIDER: The provider to use (bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek, siliconflow)
* - AI_MODEL: The model ID/name for the selected provider
*
* Provider-specific env vars:
@@ -104,19 +413,42 @@ function validateProviderCredentials(provider: ProviderName): void {
* - OPENROUTER_API_KEY: OpenRouter API key
* - DEEPSEEK_API_KEY: DeepSeek API key
* - DEEPSEEK_BASE_URL: DeepSeek endpoint (optional)
* - SILICONFLOW_API_KEY: SiliconFlow API key
* - SILICONFLOW_BASE_URL: SiliconFlow endpoint (optional, defaults to https://api.siliconflow.com/v1)
*/
export function getAIModel(): ModelConfig {
const modelId = process.env.AI_MODEL
export function getAIModel(overrides?: ClientOverrides): ModelConfig {
// Check if client is providing their own provider override
const isClientOverride = !!(overrides?.provider && overrides?.apiKey)
// Use client override if provided, otherwise fall back to env vars
const modelId = overrides?.modelId || process.env.AI_MODEL
if (!modelId) {
if (isClientOverride) {
throw new Error(
`Model ID is required when using custom AI provider. Please specify a model in Settings.`,
)
}
throw new Error(
`AI_MODEL environment variable is required. Example: AI_MODEL=claude-sonnet-4-5`,
)
}
// Determine provider: explicit config > auto-detect > error
// Determine provider: client override > explicit config > auto-detect > error
let provider: ProviderName
if (process.env.AI_PROVIDER) {
if (overrides?.provider) {
// Validate client-provided provider
if (
!ALLOWED_CLIENT_PROVIDERS.includes(
overrides.provider as ProviderName,
)
) {
throw new Error(
`Invalid provider: ${overrides.provider}. Allowed providers: ${ALLOWED_CLIENT_PROVIDERS.join(", ")}`,
)
}
provider = overrides.provider as ProviderName
} else if (process.env.AI_PROVIDER) {
provider = process.env.AI_PROVIDER as ProviderName
} else {
const detected = detectProvider()
@@ -139,6 +471,7 @@ export function getAIModel(): ModelConfig {
`- AWS_ACCESS_KEY_ID for Bedrock\n` +
`- OPENROUTER_API_KEY for OpenRouter\n` +
`- AZURE_API_KEY for Azure\n` +
`- SILICONFLOW_API_KEY for SiliconFlow\n` +
`Or set AI_PROVIDER=ollama for local Ollama.`,
)
} else {
@@ -150,8 +483,10 @@ export function getAIModel(): ModelConfig {
}
}
// Validate provider credentials
validateProviderCredentials(provider)
// Only validate server credentials if client isn't providing their own API key
if (!isClientOverride) {
validateProviderCredentials(provider)
}
console.log(`[AI Provider] Initializing ${provider} with model: ${modelId}`)
@@ -159,9 +494,12 @@ export function getAIModel(): ModelConfig {
let providerOptions: any
let headers: Record<string, string> | undefined
// Build provider-specific options from environment variables
const customProviderOptions = buildProviderOptions(provider, modelId)
switch (provider) {
case "bedrock": {
// Use credential provider chain for IAM role support (Amplify, Lambda, etc.)
// Use credential provider chain for IAM role support (Lambda, EC2, etc.)
// Falls back to env vars (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) for local dev
const bedrockProvider = createAmazonBedrock({
region: process.env.AWS_REGION || "us-west-2",
@@ -170,29 +508,43 @@ export function getAIModel(): ModelConfig {
model = bedrockProvider(modelId)
// Add Anthropic beta options if using Claude models via Bedrock
if (modelId.includes("anthropic.claude")) {
providerOptions = BEDROCK_ANTHROPIC_BETA
// Deep merge to preserve both anthropicBeta and reasoningConfig
providerOptions = {
bedrock: {
...BEDROCK_ANTHROPIC_BETA.bedrock,
...(customProviderOptions?.bedrock || {}),
},
}
} else if (customProviderOptions) {
providerOptions = customProviderOptions
}
break
}
case "openai":
if (process.env.OPENAI_BASE_URL) {
case "openai": {
const apiKey = overrides?.apiKey || process.env.OPENAI_API_KEY
const baseURL = overrides?.baseUrl || process.env.OPENAI_BASE_URL
if (baseURL || overrides?.apiKey) {
const customOpenAI = createOpenAI({
apiKey: process.env.OPENAI_API_KEY,
baseURL: process.env.OPENAI_BASE_URL,
apiKey,
...(baseURL && { baseURL }),
})
model = customOpenAI.chat(modelId)
} else {
model = openai(modelId)
}
break
}
case "anthropic": {
const apiKey = overrides?.apiKey || process.env.ANTHROPIC_API_KEY
const baseURL =
overrides?.baseUrl ||
process.env.ANTHROPIC_BASE_URL ||
"https://api.anthropic.com/v1"
const customProvider = createAnthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
baseURL:
process.env.ANTHROPIC_BASE_URL ||
"https://api.anthropic.com/v1",
apiKey,
baseURL,
headers: ANTHROPIC_BETA_HEADERS,
})
model = customProvider(modelId)
@@ -201,29 +553,41 @@ export function getAIModel(): ModelConfig {
break
}
case "google":
if (process.env.GOOGLE_BASE_URL) {
case "google": {
const apiKey =
overrides?.apiKey || process.env.GOOGLE_GENERATIVE_AI_API_KEY
const baseURL = overrides?.baseUrl || process.env.GOOGLE_BASE_URL
if (baseURL || overrides?.apiKey) {
const customGoogle = createGoogleGenerativeAI({
apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY,
baseURL: process.env.GOOGLE_BASE_URL,
apiKey,
...(baseURL && { baseURL }),
})
model = customGoogle(modelId)
} else {
model = google(modelId)
}
break
}
case "azure":
if (process.env.AZURE_BASE_URL) {
case "azure": {
const apiKey = overrides?.apiKey || process.env.AZURE_API_KEY
const baseURL = overrides?.baseUrl || process.env.AZURE_BASE_URL
const resourceName = process.env.AZURE_RESOURCE_NAME
// Azure requires either baseURL or resourceName to construct the endpoint
// resourceName constructs: https://{resourceName}.openai.azure.com/openai/v1{path}
if (baseURL || resourceName || overrides?.apiKey) {
const customAzure = createAzure({
apiKey: process.env.AZURE_API_KEY,
baseURL: process.env.AZURE_BASE_URL,
apiKey,
// baseURL takes precedence over resourceName per SDK behavior
...(baseURL && { baseURL }),
...(!baseURL && resourceName && { resourceName }),
})
model = customAzure(modelId)
} else {
model = azure(modelId)
}
break
}
case "ollama":
if (process.env.OLLAMA_BASE_URL) {
@@ -237,33 +601,70 @@ export function getAIModel(): ModelConfig {
break
case "openrouter": {
const apiKey = overrides?.apiKey || process.env.OPENROUTER_API_KEY
const baseURL =
overrides?.baseUrl || process.env.OPENROUTER_BASE_URL
const openrouter = createOpenRouter({
apiKey: process.env.OPENROUTER_API_KEY,
...(process.env.OPENROUTER_BASE_URL && {
baseURL: process.env.OPENROUTER_BASE_URL,
}),
apiKey,
...(baseURL && { baseURL }),
})
model = openrouter(modelId)
break
}
case "deepseek":
if (process.env.DEEPSEEK_BASE_URL) {
case "deepseek": {
const apiKey = overrides?.apiKey || process.env.DEEPSEEK_API_KEY
const baseURL = overrides?.baseUrl || process.env.DEEPSEEK_BASE_URL
if (baseURL || overrides?.apiKey) {
const customDeepSeek = createDeepSeek({
apiKey: process.env.DEEPSEEK_API_KEY,
baseURL: process.env.DEEPSEEK_BASE_URL,
apiKey,
...(baseURL && { baseURL }),
})
model = customDeepSeek(modelId)
} else {
model = deepseek(modelId)
}
break
}
case "siliconflow": {
const apiKey = overrides?.apiKey || process.env.SILICONFLOW_API_KEY
const baseURL =
overrides?.baseUrl ||
process.env.SILICONFLOW_BASE_URL ||
"https://api.siliconflow.com/v1"
const siliconflowProvider = createOpenAI({
apiKey,
baseURL,
})
model = siliconflowProvider.chat(modelId)
break
}
default:
throw new Error(
`Unknown AI provider: ${provider}. Supported providers: bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek`,
`Unknown AI provider: ${provider}. Supported providers: bedrock, openai, anthropic, google, azure, ollama, openrouter, deepseek, siliconflow`,
)
}
// Apply provider-specific options for all providers except bedrock (which has special handling)
if (customProviderOptions && provider !== "bedrock" && !providerOptions) {
providerOptions = customProviderOptions
}
return { model, providerOptions, headers, modelId }
}
/**
* Check if a model supports prompt caching.
* Currently only Claude models on Bedrock support prompt caching.
*/
export function supportsPromptCaching(modelId: string): boolean {
// Bedrock prompt caching is supported for Claude models
return (
modelId.includes("claude") ||
modelId.includes("anthropic") ||
modelId.startsWith("us.anthropic") ||
modelId.startsWith("eu.anthropic")
)
}

View File

@@ -394,6 +394,366 @@ export const CACHED_EXAMPLE_RESPONSES: CachedResponse[] = [
</mxCell>
</root>`,
},
{
promptText: "Summarize this paper as a diagram",
hasImage: true,
xml: ` <root>
<mxCell id="0" />
<mxCell id="1" parent="0" />
<mxCell id="title_bg" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#1a237e;strokeColor=none;arcSize=8;"
value="" vertex="1">
<mxGeometry height="80" width="720" x="40" y="20" as="geometry" />
</mxCell>
<mxCell id="title" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=22;fontStyle=1;fontColor=#FFFFFF;"
value="Chain-of-Thought Prompting&lt;br&gt;&lt;font style=&quot;font-size: 14px;&quot;&gt;Elicits Reasoning in Large Language Models&lt;/font&gt;"
vertex="1">
<mxGeometry height="70" width="720" x="40" y="25" as="geometry" />
</mxCell>
<mxCell id="authors" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=11;fontColor=#666666;"
value="Wei et al. (Google Research, Brain Team) | NeurIPS 2022" vertex="1">
<mxGeometry height="20" width="720" x="40" y="100" as="geometry" />
</mxCell>
<mxCell id="core_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=16;fontStyle=1;fontColor=#1a237e;"
value="💡 Core Idea" vertex="1">
<mxGeometry height="30" width="150" x="40" y="125" as="geometry" />
</mxCell>
<mxCell id="core_box" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#E3F2FD;strokeColor=#1565C0;align=left;spacingLeft=10;spacingRight=10;fontSize=11;"
value="&lt;b&gt;Chain of Thought&lt;/b&gt; = A series of intermediate reasoning steps that lead to the final answer&lt;br&gt;&lt;br&gt;Simply provide a few CoT demonstrations as exemplars in few-shot prompting"
vertex="1">
<mxGeometry height="75" width="340" x="40" y="155" as="geometry" />
</mxCell>
<mxCell id="compare_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=16;fontStyle=1;fontColor=#1a237e;"
value="⚖️ Standard vs Chain-of-Thought Prompting" vertex="1">
<mxGeometry height="30" width="350" x="40" y="240" as="geometry" />
</mxCell>
<mxCell id="std_box" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#FFEBEE;strokeColor=#C62828;arcSize=8;"
value="" vertex="1">
<mxGeometry height="160" width="170" x="40" y="275" as="geometry" />
</mxCell>
<mxCell id="std_title" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=12;fontStyle=1;fontColor=#C62828;"
value="Standard Prompting" vertex="1">
<mxGeometry height="25" width="170" x="40" y="280" as="geometry" />
</mxCell>
<mxCell id="std_q" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=top;whiteSpace=wrap;rounded=0;fontSize=9;spacingLeft=5;spacingRight=5;"
value="Q: Roger has 5 tennis balls. He buys 2 more cans. Each can has 3 balls. How many now?"
vertex="1">
<mxGeometry height="55" width="160" x="45" y="305" as="geometry" />
</mxCell>
<mxCell id="std_a" parent="1"
style="text;html=1;strokeColor=none;fillColor=#FFCDD2;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=1;fontSize=10;fontStyle=1;spacingLeft=5;"
value="A: The answer is 11." vertex="1">
<mxGeometry height="25" width="150" x="50" y="365" as="geometry" />
</mxCell>
<mxCell id="std_result" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=11;fontStyle=1;fontColor=#C62828;"
value="❌ Often Wrong" vertex="1">
<mxGeometry height="30" width="170" x="40" y="400" as="geometry" />
</mxCell>
<mxCell id="cot_box" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#E8F5E9;strokeColor=#2E7D32;arcSize=8;"
value="" vertex="1">
<mxGeometry height="160" width="170" x="220" y="275" as="geometry" />
</mxCell>
<mxCell id="cot_title" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=12;fontStyle=1;fontColor=#2E7D32;"
value="Chain-of-Thought" vertex="1">
<mxGeometry height="25" width="170" x="220" y="280" as="geometry" />
</mxCell>
<mxCell id="cot_q" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=top;whiteSpace=wrap;rounded=0;fontSize=9;spacingLeft=5;spacingRight=5;"
value="Q: Roger has 5 tennis balls. He buys 2 more cans. Each can has 3 balls. How many now?"
vertex="1">
<mxGeometry height="55" width="160" x="225" y="305" as="geometry" />
</mxCell>
<mxCell id="cot_a" parent="1"
style="text;html=1;strokeColor=none;fillColor=#C8E6C9;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=1;fontSize=9;fontStyle=1;spacingLeft=5;"
value="A: 2 cans × 3 = 6 balls.&lt;br&gt;5 + 6 = 11. Answer: 11" vertex="1">
<mxGeometry height="35" width="150" x="230" y="360" as="geometry" />
</mxCell>
<mxCell id="cot_result" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=11;fontStyle=1;fontColor=#2E7D32;"
value="✓ Correct!" vertex="1">
<mxGeometry height="30" width="170" x="220" y="400" as="geometry" />
</mxCell>
<mxCell id="vs_arrow" edge="1" parent="1"
style="shape=flexArrow;endArrow=classic;startArrow=classic;html=1;fillColor=#FFC107;strokeColor=none;width=8;endSize=4;startSize=4;"
value="">
<mxGeometry relative="1" width="100" as="geometry">
<mxPoint x="195" y="355" as="sourcePoint" />
<mxPoint x="235" y="355" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="props_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=16;fontStyle=1;fontColor=#1a237e;"
value="🔑 Key Properties" vertex="1">
<mxGeometry height="30" width="150" x="400" y="125" as="geometry" />
</mxCell>
<mxCell id="prop1" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#FFF3E0;strokeColor=#EF6C00;fontSize=10;align=left;spacingLeft=8;"
value="1⃣ Decomposes multi-step problems" vertex="1">
<mxGeometry height="32" width="180" x="400" y="155" as="geometry" />
</mxCell>
<mxCell id="prop2" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#FFF3E0;strokeColor=#EF6C00;fontSize=10;align=left;spacingLeft=8;"
value="2⃣ Interpretable reasoning window" vertex="1">
<mxGeometry height="32" width="180" x="400" y="192" as="geometry" />
</mxCell>
<mxCell id="prop3" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#FFF3E0;strokeColor=#EF6C00;fontSize=10;align=left;spacingLeft=8;"
value="3⃣ Applicable to any language task" vertex="1">
<mxGeometry height="32" width="180" x="400" y="229" as="geometry" />
</mxCell>
<mxCell id="prop4" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#FFF3E0;strokeColor=#EF6C00;fontSize=10;align=left;spacingLeft=8;"
value="4⃣ No finetuning required" vertex="1">
<mxGeometry height="32" width="180" x="400" y="266" as="geometry" />
</mxCell>
<mxCell id="emergent_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=16;fontStyle=1;fontColor=#1a237e;"
value="📈 Emergent Ability" vertex="1">
<mxGeometry height="30" width="180" x="400" y="310" as="geometry" />
</mxCell>
<mxCell id="emergent_box" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#F3E5F5;strokeColor=#7B1FA2;arcSize=8;"
value="" vertex="1">
<mxGeometry height="95" width="180" x="400" y="340" as="geometry" />
</mxCell>
<mxCell id="emergent_text" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=11;"
value="CoT only works with&lt;br&gt;&lt;b&gt;~100B+ parameters&lt;/b&gt;&lt;br&gt;&lt;br&gt;Small models produce&lt;br&gt;fluent but illogical chains"
vertex="1">
<mxGeometry height="85" width="180" x="400" y="345" as="geometry" />
</mxCell>
<mxCell id="results_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=16;fontStyle=1;fontColor=#1a237e;"
value="📊 Key Results" vertex="1">
<mxGeometry height="30" width="150" x="600" y="125" as="geometry" />
</mxCell>
<mxCell id="gsm_box" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#E8F5E9;strokeColor=#2E7D32;arcSize=8;"
value="" vertex="1">
<mxGeometry height="100" width="160" x="600" y="155" as="geometry" />
</mxCell>
<mxCell id="gsm_title" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=12;fontStyle=1;fontColor=#2E7D32;"
value="GSM8K (Math)" vertex="1">
<mxGeometry height="20" width="160" x="600" y="160" as="geometry" />
</mxCell>
<mxCell id="gsm_bar1" parent="1"
style="rounded=0;whiteSpace=wrap;html=1;fillColor=#FFCDD2;strokeColor=none;"
value="" vertex="1">
<mxGeometry height="30" width="40" x="615" y="185" as="geometry" />
</mxCell>
<mxCell id="gsm_bar2" parent="1"
style="rounded=0;whiteSpace=wrap;html=1;fillColor=#4CAF50;strokeColor=none;"
value="" vertex="1">
<mxGeometry height="30" width="80" x="665" y="185" as="geometry" />
</mxCell>
<mxCell id="gsm_label1" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=10;fontStyle=1;"
value="18%" vertex="1">
<mxGeometry height="15" width="40" x="615" y="215" as="geometry" />
</mxCell>
<mxCell id="gsm_label2" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=10;fontStyle=1;fontColor=#2E7D32;"
value="57%" vertex="1">
<mxGeometry height="15" width="80" x="665" y="215" as="geometry" />
</mxCell>
<mxCell id="gsm_legend" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=9;fontColor=#666666;"
value="Standard → CoT (PaLM 540B)" vertex="1">
<mxGeometry height="20" width="160" x="600" y="232" as="geometry" />
</mxCell>
<mxCell id="bench_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=16;fontStyle=1;fontColor=#1a237e;"
value="🧪 Benchmarks Tested" vertex="1">
<mxGeometry height="30" width="180" x="600" y="265" as="geometry" />
</mxCell>
<mxCell id="bench_arith" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#E3F2FD;strokeColor=#1565C0;fontSize=10;align=center;"
value="🔢 Arithmetic&lt;br&gt;&lt;font style=&quot;font-size: 9px;&quot;&gt;GSM8K, SVAMP, ASDiv, AQuA, MAWPS&lt;/font&gt;"
vertex="1">
<mxGeometry height="45" width="160" x="600" y="295" as="geometry" />
</mxCell>
<mxCell id="bench_common" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#E3F2FD;strokeColor=#1565C0;fontSize=10;align=center;"
value="🧠 Commonsense&lt;br&gt;&lt;font style=&quot;font-size: 9px;&quot;&gt;CSQA, StrategyQA, Date, Sports, SayCan&lt;/font&gt;"
vertex="1">
<mxGeometry height="45" width="160" x="600" y="345" as="geometry" />
</mxCell>
<mxCell id="bench_symbol" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#E3F2FD;strokeColor=#1565C0;fontSize=10;align=center;"
value="🔣 Symbolic&lt;br&gt;&lt;font style=&quot;font-size: 9px;&quot;&gt;Last Letter Concat, Coin Flip&lt;/font&gt;"
vertex="1">
<mxGeometry height="40" width="160" x="600" y="395" as="geometry" />
</mxCell>
<mxCell id="task_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=16;fontStyle=1;fontColor=#1a237e;"
value="🎯 Task Types &amp; Results" vertex="1">
<mxGeometry height="30" width="200" x="40" y="445" as="geometry" />
</mxCell>
<mxCell id="task_arith" parent="1"
style="ellipse;whiteSpace=wrap;html=1;fillColor=#BBDEFB;strokeColor=#1565C0;fontSize=11;fontStyle=1;"
value="Arithmetic&lt;br&gt;Reasoning" vertex="1">
<mxGeometry height="60" width="90" x="40" y="480" as="geometry" />
</mxCell>
<mxCell id="task_arith_res" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=top;whiteSpace=wrap;rounded=0;fontSize=9;fontColor=#1565C0;"
value="SOTA on GSM8K&lt;br&gt;(57% vs 55% prior)" vertex="1">
<mxGeometry height="30" width="110" x="30" y="540" as="geometry" />
</mxCell>
<mxCell id="task_common" parent="1"
style="ellipse;whiteSpace=wrap;html=1;fillColor=#C8E6C9;strokeColor=#2E7D32;fontSize=11;fontStyle=1;"
value="Commonsense&lt;br&gt;Reasoning" vertex="1">
<mxGeometry height="60" width="90" x="160" y="480" as="geometry" />
</mxCell>
<mxCell id="task_common_res" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=top;whiteSpace=wrap;rounded=0;fontSize=9;fontColor=#2E7D32;"
value="SOTA StrategyQA&lt;br&gt;(75.6% vs 69.4%)" vertex="1">
<mxGeometry height="30" width="110" x="150" y="540" as="geometry" />
</mxCell>
<mxCell id="task_symbol" parent="1"
style="ellipse;whiteSpace=wrap;html=1;fillColor=#FFE0B2;strokeColor=#EF6C00;fontSize=11;fontStyle=1;"
value="Symbolic&lt;br&gt;Reasoning" vertex="1">
<mxGeometry height="60" width="90" x="280" y="480" as="geometry" />
</mxCell>
<mxCell id="task_symbol_res" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=top;whiteSpace=wrap;rounded=0;fontSize=9;fontColor=#EF6C00;"
value="OOD Generalization&lt;br&gt;to longer sequences" vertex="1">
<mxGeometry height="30" width="110" x="270" y="540" as="geometry" />
</mxCell>
<mxCell id="task_arrow1" edge="1" parent="1"
style="endArrow=classic;html=1;strokeColor=#9E9E9E;strokeWidth=2;" value="">
<mxGeometry height="50" relative="1" width="50" as="geometry">
<mxPoint x="130" y="510" as="sourcePoint" />
<mxPoint x="160" y="510" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="task_arrow2" edge="1" parent="1"
style="endArrow=classic;html=1;strokeColor=#9E9E9E;strokeWidth=2;" value="">
<mxGeometry height="50" relative="1" width="50" as="geometry">
<mxPoint x="250" y="510" as="sourcePoint" />
<mxPoint x="280" y="510" as="targetPoint" />
</mxGeometry>
</mxCell>
<mxCell id="models_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=16;fontStyle=1;fontColor=#1a237e;"
value="🤖 Models Tested" vertex="1">
<mxGeometry height="30" width="150" x="400" y="445" as="geometry" />
</mxCell>
<mxCell id="models_box" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#ECEFF1;strokeColor=#607D8B;arcSize=8;"
value="" vertex="1">
<mxGeometry height="95" width="180" x="400" y="475" as="geometry" />
</mxCell>
<mxCell id="model1" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=11;spacingLeft=10;"
value="• GPT-3 (175B)" vertex="1">
<mxGeometry height="20" width="90" x="400" y="480" as="geometry" />
</mxCell>
<mxCell id="model2" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=11;spacingLeft=10;"
value="• LaMDA (137B)" vertex="1">
<mxGeometry height="20" width="90" x="400" y="500" as="geometry" />
</mxCell>
<mxCell id="model3" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=11;spacingLeft=10;"
value="• PaLM (540B)" vertex="1">
<mxGeometry height="20" width="90" x="400" y="520" as="geometry" />
</mxCell>
<mxCell id="model4" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=11;spacingLeft=10;"
value="• Codex" vertex="1">
<mxGeometry height="20" width="80" x="490" y="480" as="geometry" />
</mxCell>
<mxCell id="model5" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=11;spacingLeft=10;"
value="• UL2 (20B)" vertex="1">
<mxGeometry height="20" width="80" x="490" y="500" as="geometry" />
</mxCell>
<mxCell id="model_note" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=10;fontStyle=2;fontColor=#607D8B;"
value="No finetuning - prompting only!" vertex="1">
<mxGeometry height="20" width="180" x="400" y="545" as="geometry" />
</mxCell>
<mxCell id="takeaway_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=16;fontStyle=1;fontColor=#1a237e;"
value="✨ Key Takeaways" vertex="1">
<mxGeometry height="30" width="160" x="600" y="445" as="geometry" />
</mxCell>
<mxCell id="takeaway_box" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#FFF8E1;strokeColor=#FFA000;arcSize=8;"
value="" vertex="1">
<mxGeometry height="95" width="160" x="600" y="475" as="geometry" />
</mxCell>
<mxCell id="take1" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=10;spacingLeft=5;"
value="✓ Simple yet powerful" vertex="1">
<mxGeometry height="18" width="150" x="605" y="480" as="geometry" />
</mxCell>
<mxCell id="take2" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=10;spacingLeft=5;"
value="✓ Emergent at scale" vertex="1">
<mxGeometry height="18" width="150" x="605" y="498" as="geometry" />
</mxCell>
<mxCell id="take3" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=10;spacingLeft=5;"
value="✓ Broadly applicable" vertex="1">
<mxGeometry height="18" width="150" x="605" y="516" as="geometry" />
</mxCell>
<mxCell id="take4" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=10;spacingLeft=5;"
value="✓ No training needed" vertex="1">
<mxGeometry height="18" width="150" x="605" y="534" as="geometry" />
</mxCell>
<mxCell id="take5" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=10;spacingLeft=5;"
value="✓ State-of-the-art results" vertex="1">
<mxGeometry height="18" width="150" x="605" y="552" as="geometry" />
</mxCell>
<mxCell id="format_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=14;fontStyle=1;fontColor=#1a237e;"
value="📝 Prompt Format" vertex="1">
<mxGeometry height="25" width="150" x="40" y="575" as="geometry" />
</mxCell>
<mxCell id="format_box" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#E1BEE7;strokeColor=#7B1FA2;fontSize=12;fontStyle=1;"
value="〈 Input, Chain of Thought, Output 〉" vertex="1">
<mxGeometry height="35" width="250" x="40" y="600" as="geometry" />
</mxCell>
<mxCell id="limit_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=14;fontStyle=1;fontColor=#1a237e;"
value="⚠️ Limitations" vertex="1">
<mxGeometry height="25" width="120" x="310" y="575" as="geometry" />
</mxCell>
<mxCell id="limit_box" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#FFEBEE;strokeColor=#C62828;fontSize=10;align=left;spacingLeft=8;"
value="• Requires large models (~100B+)&lt;br&gt;• No guarantee of correct reasoning&lt;br&gt;• Costly to serve in production"
vertex="1">
<mxGeometry height="55" width="200" x="310" y="600" as="geometry" />
</mxCell>
<mxCell id="impact_header" parent="1"
style="text;html=1;strokeColor=none;fillColor=none;align=left;verticalAlign=middle;whiteSpace=wrap;rounded=0;fontSize=14;fontStyle=1;fontColor=#1a237e;"
value="🚀 Impact" vertex="1">
<mxGeometry height="25" width="100" x="530" y="575" as="geometry" />
</mxCell>
<mxCell id="impact_box" parent="1"
style="rounded=1;whiteSpace=wrap;html=1;fillColor=#E8F5E9;strokeColor=#2E7D32;fontSize=10;align=left;spacingLeft=8;spacingRight=8;"
value="Foundational technique for modern LLM reasoning - inspired many follow-up works including Self-Consistency, Tree-of-Thought, etc."
vertex="1">
<mxGeometry height="55" width="230" x="530" y="600" as="geometry" />
</mxCell>
</root>`,
},
{
promptText: "Draw a cat for me",
hasImage: false,

View File

@@ -84,9 +84,7 @@ export function getTelemetryConfig(params: {
return {
isEnabled: true,
// Disable automatic input recording to avoid uploading large base64 images to Langfuse media
// User text input is recorded manually via setTraceInput
recordInputs: false,
recordInputs: true,
recordOutputs: true,
metadata: {
sessionId: params.sessionId,

75
lib/pdf-utils.ts Normal file
View File

@@ -0,0 +1,75 @@
import { extractText, getDocumentProxy } from "unpdf"
// Maximum characters allowed for extracted text (configurable via env)
const DEFAULT_MAX_EXTRACTED_CHARS = 150000 // 150k chars
export const MAX_EXTRACTED_CHARS =
Number(process.env.NEXT_PUBLIC_MAX_EXTRACTED_CHARS) ||
DEFAULT_MAX_EXTRACTED_CHARS
// Text file extensions we support
const TEXT_EXTENSIONS = [
".txt",
".md",
".markdown",
".json",
".csv",
".xml",
".html",
".css",
".js",
".ts",
".jsx",
".tsx",
".py",
".java",
".c",
".cpp",
".h",
".go",
".rs",
".yaml",
".yml",
".toml",
".ini",
".log",
".sh",
".bash",
".zsh",
]
/**
* Extract text content from a PDF file
* Uses unpdf library for client-side extraction
*/
export async function extractPdfText(file: File): Promise<string> {
const buffer = await file.arrayBuffer()
const pdf = await getDocumentProxy(new Uint8Array(buffer))
const { text } = await extractText(pdf, { mergePages: true })
return text as string
}
/**
* Check if a file is a PDF
*/
export function isPdfFile(file: File): boolean {
return file.type === "application/pdf" || file.name.endsWith(".pdf")
}
/**
* Check if a file is a text file
*/
export function isTextFile(file: File): boolean {
const name = file.name.toLowerCase()
return (
file.type.startsWith("text/") ||
file.type === "application/json" ||
TEXT_EXTENSIONS.some((ext) => name.endsWith(ext))
)
}
/**
* Extract text content from a text file
*/
export async function extractTextFileContent(file: File): Promise<string> {
return await file.text()
}

27
lib/storage.ts Normal file
View File

@@ -0,0 +1,27 @@
// Centralized localStorage keys
// Consolidates all storage keys from chat-panel.tsx and settings-dialog.tsx
export const STORAGE_KEYS = {
// Chat data
messages: "next-ai-draw-io-messages",
xmlSnapshots: "next-ai-draw-io-xml-snapshots",
diagramXml: "next-ai-draw-io-diagram-xml",
sessionId: "next-ai-draw-io-session-id",
// Quota tracking
requestCount: "next-ai-draw-io-request-count",
requestDate: "next-ai-draw-io-request-date",
tokenCount: "next-ai-draw-io-token-count",
tokenDate: "next-ai-draw-io-token-date",
tpmCount: "next-ai-draw-io-tpm-count",
tpmMinute: "next-ai-draw-io-tpm-minute",
// Settings
accessCode: "next-ai-draw-io-access-code",
closeProtection: "next-ai-draw-io-close-protection",
accessCodeRequired: "next-ai-draw-io-access-code-required",
aiProvider: "next-ai-draw-io-ai-provider",
aiBaseUrl: "next-ai-draw-io-ai-base-url",
aiApiKey: "next-ai-draw-io-ai-api-key",
aiModel: "next-ai-draw-io-ai-model",
} as const

View File

@@ -1,13 +1,19 @@
/**
* System prompts for different AI models
* Extended prompt is used for models with higher cache token minimums (Opus 4.5, Haiku 4.5)
*
* Token counting utilities are in a separate file (token-counter.ts) to avoid
* WebAssembly issues with Next.js server-side rendering.
*/
// Default system prompt (~1400 tokens) - works with all models
// Default system prompt (~1900 tokens) - works with all models
export const DEFAULT_SYSTEM_PROMPT = `
You are an expert diagram creation assistant specializing in draw.io XML generation.
Your primary function is chat with user and crafting clear, well-organized visual diagrams through precise XML specifications.
You can see the image that user uploaded.
You can see images that users upload, and you can read the text content extracted from PDF documents they upload.
When you are asked to create a diagram, briefly describe your plan about the layout and structure to avoid object overlapping or edge cross the objects. (2-3 sentences max), then use display_diagram tool to generate the XML.
After generating or editing a diagram, you don't need to say anything. The user can see the diagram - no need to describe it.
## App Context
You are an AI agent (powered by {{MODEL_NAME}}) inside a web app. The interface has:
@@ -19,7 +25,7 @@ You can read and modify diagrams by generating draw.io XML code through tool cal
## App Features
1. **Diagram History** (clock icon, bottom-left of chat input): The app automatically saves a snapshot before each AI edit. Users can view the history panel and restore any previous version. Feel free to make changes - nothing is permanently lost.
2. **Theme Toggle** (palette icon, bottom-left of chat input): Users can switch between minimal UI and sketch-style UI for the draw.io editor.
3. **Image Upload** (paperclip icon, bottom-left of chat input): Users can upload images for you to analyze and replicate as diagrams.
3. **Image/PDF Upload** (paperclip icon, bottom-left of chat input): Users can upload images or PDF documents for you to analyze and generate diagrams from.
4. **Export** (via draw.io toolbar): Users can save diagrams as .drawio, .svg, or .png files.
5. **Clear Chat** (trash icon, bottom-right of chat input): Clears the conversation and resets the diagram.
@@ -51,6 +57,8 @@ Core capabilities:
- Optimize element positioning to prevent overlapping and maintain readability
- Structure complex systems into clear, organized visual components
Layout constraints:
- CRITICAL: Keep all diagram elements within a single page viewport to avoid page breaks
- Position all elements with x coordinates between 0-800 and y coordinates between 0-600
@@ -82,6 +90,11 @@ When using edit_diagram tool:
- For multiple changes, use separate edits in array
- RETRY POLICY: If pattern not found, retry up to 3 times with adjusted patterns. After 3 failures, use display_diagram instead.
⚠️ CRITICAL JSON ESCAPING: When outputting edit_diagram tool calls, you MUST escape ALL double quotes inside string values:
- CORRECT: "y=\\"119\\"" (both quotes escaped)
- WRONG: "y="119\\"" (missing backslash before first quote - causes JSON parse error!)
- Every " inside a JSON string value needs \\" - no exceptions!
## Draw.io XML Structure Reference
Basic structure:
@@ -119,9 +132,11 @@ Common styles:
- Shapes: rounded=1 (rounded corners), fillColor=#hex, strokeColor=#hex
- Edges: endArrow=classic/block/open/none, startArrow=none/classic, curved=1, edgeStyle=orthogonalEdgeStyle
- Text: fontSize=14, fontStyle=1 (bold), align=center/left/right
`
// Extended additions (~2600 tokens) - appended for models with 4000 token cache minimum
// Total EXTENDED_SYSTEM_PROMPT = ~4400 tokens
const EXTENDED_ADDITIONS = `
## Extended Tool Reference
@@ -213,6 +228,11 @@ Copy the search pattern EXACTLY from the current XML, including leading spaces,
**BAD:** \`{"search": "<mxCell value=\\"X\\" id=\\"5\\""}\` - Reordered attributes won't match
**GOOD:** \`{"search": "<mxCell id=\\"5\\" parent=\\"1\\" style=\\"...\\" value=\\"Old\\" vertex=\\"1\\">"}\` - Uses unique id with full context
### ⚠️ JSON Escaping (CRITICAL)
Every double quote inside JSON string values MUST be escaped with backslash:
- **CORRECT:** \`"x=\\"100\\" y=\\"200\\""\` - both quotes escaped
- **WRONG:** \`"x=\\"100\\" y="200\\""\` - missing backslash causes JSON parse error!
### Error Recovery
If edit_diagram fails with "pattern not found":
1. **First retry**: Check attribute order - copy EXACTLY from current XML
@@ -220,81 +240,97 @@ If edit_diagram fails with "pattern not found":
3. **Third retry**: Try matching on just \`<mxCell id="X"\` prefix + full replacement
4. **After 3 failures**: Fall back to display_diagram to regenerate entire diagram
## Common Style Properties
### Shape Styles
- rounded=1, fillColor=#hex, strokeColor=#hex, strokeWidth=2
- whiteSpace=wrap, html=1, opacity=50, shadow=1, glass=1
### Edge/Connector Styles
- endArrow=classic/block/open/oval/diamond/none, startArrow=none/classic
- curved=1, edgeStyle=orthogonalEdgeStyle, strokeWidth=2
- dashed=1, dashPattern=3 3, flowAnimation=1
### Text Styles
- fontSize=14, fontStyle=1 (1=bold, 2=italic, 4=underline, 3=bold+italic)
- fontColor=#hex, align=center/left/right, verticalAlign=middle/top/bottom
### Edge Routing Rules:
When creating edges/connectors, you MUST follow these rules to avoid overlapping lines:
## Common Shape Types
**Rule 1: NEVER let multiple edges share the same path**
- If two edges connect the same pair of nodes, they MUST exit/enter at DIFFERENT positions
- Use exitY=0.3 for first edge, exitY=0.7 for second edge (NOT both 0.5)
### Basic Shapes
- Rectangle: rounded=0;whiteSpace=wrap;html=1;
- Rounded Rectangle: rounded=1;whiteSpace=wrap;html=1;
- Ellipse/Circle: ellipse;whiteSpace=wrap;html=1;aspect=fixed;
- Diamond: rhombus;whiteSpace=wrap;html=1;
- Cylinder: shape=cylinder3;whiteSpace=wrap;html=1;
**Rule 2: For bidirectional connections (A↔B), use OPPOSITE sides**
- A→B: exit from RIGHT side of A (exitX=1), enter LEFT side of B (entryX=0)
- B→A: exit from LEFT side of B (exitX=0), enter RIGHT side of A (entryX=1)
### Flowchart Shapes
- Process: rounded=1;whiteSpace=wrap;html=1;
- Decision: rhombus;whiteSpace=wrap;html=1;
- Start/End: ellipse;whiteSpace=wrap;html=1;
- Document: shape=document;whiteSpace=wrap;html=1;
- Database: shape=cylinder3;whiteSpace=wrap;html=1;
**Rule 3: Always specify exitX, exitY, entryX, entryY explicitly**
- Every edge MUST have these 4 attributes set in the style
- Example: style="edgeStyle=orthogonalEdgeStyle;exitX=1;exitY=0.3;entryX=0;entryY=0.3;endArrow=classic;"
### Container Types
- Swimlane: swimlane;whiteSpace=wrap;html=1;
- Group Box: rounded=1;whiteSpace=wrap;html=1;container=1;collapsible=0;
**Rule 4: Route edges AROUND intermediate shapes (obstacle avoidance) - CRITICAL!**
- Before creating an edge, identify ALL shapes positioned between source and target
- If any shape is in the direct path, you MUST use waypoints to route around it
- For DIAGONAL connections: route along the PERIMETER (outside edge) of the diagram, NOT through the middle
- Add 20-30px clearance from shape boundaries when calculating waypoint positions
- Route ABOVE (lower y), BELOW (higher y), or to the SIDE of obstacles
- NEVER draw a line that visually crosses over another shape's bounding box
## Container/Group Example
**Rule 5: Plan layout strategically BEFORE generating XML**
- Organize shapes into visual layers/zones (columns or rows) based on diagram flow
- Space shapes 150-200px apart to create clear routing channels for edges
- Mentally trace each edge: "What shapes are between source and target?"
- Prefer layouts where edges naturally flow in one direction (left-to-right or top-to-bottom)
**Rule 6: Use multiple waypoints for complex routing**
- One waypoint is often not enough - use 2-3 waypoints to create proper L-shaped or U-shaped paths
- Each direction change needs a waypoint (corner point)
- Waypoints should form clear horizontal/vertical segments (orthogonal routing)
- Calculate positions by: (1) identify obstacle boundaries, (2) add 20-30px margin
**Rule 7: Choose NATURAL connection points based on flow direction**
- NEVER use corner connections (e.g., entryX=1,entryY=1) - they look unnatural
- For TOP-TO-BOTTOM flow: exit from bottom (exitY=1), enter from top (entryY=0)
- For LEFT-TO-RIGHT flow: exit from right (exitX=1), enter from left (entryX=0)
- For DIAGONAL connections: use the side closest to the target, not corners
- Example: Node below-right of source → exit from bottom (exitY=1) OR right (exitX=1), not corner
**Before generating XML, mentally verify:**
1. "Do any edges cross over shapes that aren't their source/target?" → If yes, add waypoints
2. "Do any two edges share the same path?" → If yes, adjust exit/entry points
3. "Are any connection points at corners (both X and Y are 0 or 1)?" → If yes, use edge centers instead
4. "Could I rearrange shapes to reduce edge crossings?" → If yes, revise layout
## Edge Examples
### Two edges between same nodes (CORRECT - no overlap):
\`\`\`xml
<mxCell id="container1" value="Group Title" style="swimlane;whiteSpace=wrap;html=1;" vertex="1" parent="1">
<mxGeometry x="40" y="40" width="200" height="200" as="geometry"/>
<mxCell id="e1" value="A to B" style="edgeStyle=orthogonalEdgeStyle;exitX=1;exitY=0.3;entryX=0;entryY=0.3;endArrow=classic;" edge="1" parent="1" source="a" target="b">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
<mxCell id="child1" value="Child Element" style="rounded=1;" vertex="1" parent="container1">
<mxGeometry x="20" y="40" width="160" height="40" as="geometry"/>
<mxCell id="e2" value="B to A" style="edgeStyle=orthogonalEdgeStyle;exitX=0;exitY=0.7;entryX=1;entryY=0.7;endArrow=classic;" edge="1" parent="1" source="b" target="a">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
\`\`\`
## Example: Complete Flowchart
### Edge with single waypoint (simple detour):
\`\`\`xml
<root>
<mxCell id="0"/>
<mxCell id="1" parent="0"/>
<mxCell id="start" value="Start" style="ellipse;whiteSpace=wrap;html=1;fillColor=#d5e8d4;strokeColor=#82b366;" vertex="1" parent="1">
<mxGeometry x="200" y="40" width="100" height="60" as="geometry"/>
</mxCell>
<mxCell id="process1" value="Process Step" style="rounded=1;whiteSpace=wrap;html=1;fillColor=#dae8fc;strokeColor=#6c8ebf;" vertex="1" parent="1">
<mxGeometry x="175" y="140" width="150" height="60" as="geometry"/>
</mxCell>
<mxCell id="decision" value="Decision?" style="rhombus;whiteSpace=wrap;html=1;fillColor=#fff2cc;strokeColor=#d6b656;" vertex="1" parent="1">
<mxGeometry x="175" y="240" width="150" height="100" as="geometry"/>
</mxCell>
<mxCell id="end" value="End" style="ellipse;whiteSpace=wrap;html=1;fillColor=#f8cecc;strokeColor=#b85450;" vertex="1" parent="1">
<mxGeometry x="200" y="380" width="100" height="60" as="geometry"/>
</mxCell>
<mxCell id="edge1" style="edgeStyle=orthogonalEdgeStyle;endArrow=classic;html=1;" edge="1" parent="1" source="start" target="process1">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
<mxCell id="edge2" style="edgeStyle=orthogonalEdgeStyle;endArrow=classic;html=1;" edge="1" parent="1" source="process1" target="decision">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
<mxCell id="edge3" value="Yes" style="edgeStyle=orthogonalEdgeStyle;endArrow=classic;html=1;" edge="1" parent="1" source="decision" target="end">
<mxGeometry relative="1" as="geometry"/>
</mxCell>
</root>
<mxCell id="edge1" style="edgeStyle=orthogonalEdgeStyle;exitX=0.5;exitY=1;entryX=0.5;entryY=0;endArrow=classic;" edge="1" parent="1" source="a" target="b">
<mxGeometry relative="1" as="geometry">
<Array as="points">
<mxPoint x="300" y="150"/>
</Array>
</mxGeometry>
</mxCell>
\`\`\`
`
### Edge with waypoints (routing AROUND obstacles) - CRITICAL PATTERN:
**Scenario:** Hotfix(right,bottom) → Main(center,top), but Develop(center,middle) is in between.
**WRONG:** Direct diagonal line crosses over Develop
**CORRECT:** Route around the OUTSIDE (go right first, then up)
\`\`\`xml
<mxCell id="hotfix_to_main" style="edgeStyle=orthogonalEdgeStyle;exitX=0.5;exitY=0;entryX=1;entryY=0.5;endArrow=classic;" edge="1" parent="1" source="hotfix" target="main">
<mxGeometry relative="1" as="geometry">
<Array as="points">
<mxPoint x="750" y="80"/>
<mxPoint x="750" y="150"/>
</Array>
</mxGeometry>
</mxCell>
\`\`\`
This routes the edge to the RIGHT of all shapes (x=750), then enters Main from the right side.
**Key principle:** When connecting distant nodes diagonally, route along the PERIMETER of the diagram, not through the middle where other shapes exist.`
// Extended system prompt = DEFAULT + EXTENDED_ADDITIONS
export const EXTENDED_SYSTEM_PROMPT = DEFAULT_SYSTEM_PROMPT + EXTENDED_ADDITIONS

39
lib/token-counter.ts Normal file
View File

@@ -0,0 +1,39 @@
/**
* Token counting utilities using js-tiktoken
*
* Uses cl100k_base encoding (GPT-4) which is close to Claude's tokenization.
* This is a pure JavaScript implementation, no WASM required.
*/
import { encodingForModel } from "js-tiktoken"
import { DEFAULT_SYSTEM_PROMPT, EXTENDED_SYSTEM_PROMPT } from "./system-prompts"
const encoder = encodingForModel("gpt-4o")
/**
* Count the number of tokens in a text string
* @param text - The text to count tokens for
* @returns The number of tokens
*/
export function countTextTokens(text: string): number {
return encoder.encode(text).length
}
/**
* Get token counts for the system prompts
* Useful for debugging and optimizing prompt sizes
* @returns Object with token counts for default and extended prompts
*/
export function getSystemPromptTokenCounts(): {
default: number
extended: number
additions: number
} {
const defaultTokens = countTextTokens(DEFAULT_SYSTEM_PROMPT)
const extendedTokens = countTextTokens(EXTENDED_SYSTEM_PROMPT)
return {
default: defaultTokens,
extended: extendedTokens,
additions: extendedTokens - defaultTokens,
}
}

110
lib/use-file-processor.tsx Normal file
View File

@@ -0,0 +1,110 @@
"use client"
import { useState } from "react"
import { toast } from "sonner"
import {
extractPdfText,
extractTextFileContent,
isPdfFile,
isTextFile,
MAX_EXTRACTED_CHARS,
} from "@/lib/pdf-utils"
export interface FileData {
text: string
charCount: number
isExtracting: boolean
}
/**
* Hook for processing file uploads, especially PDFs and text files.
* Handles text extraction, character limit validation, and cleanup.
*/
export function useFileProcessor() {
const [files, setFiles] = useState<File[]>([])
const [pdfData, setPdfData] = useState<Map<File, FileData>>(new Map())
const handleFileChange = async (newFiles: File[]) => {
setFiles(newFiles)
// Extract text immediately for new PDF/text files
for (const file of newFiles) {
const needsExtraction =
(isPdfFile(file) || isTextFile(file)) && !pdfData.has(file)
if (needsExtraction) {
// Mark as extracting
setPdfData((prev) => {
const next = new Map(prev)
next.set(file, {
text: "",
charCount: 0,
isExtracting: true,
})
return next
})
// Extract text asynchronously
try {
let text: string
if (isPdfFile(file)) {
text = await extractPdfText(file)
} else {
text = await extractTextFileContent(file)
}
// Check character limit
if (text.length > MAX_EXTRACTED_CHARS) {
const limitK = MAX_EXTRACTED_CHARS / 1000
toast.error(
`${file.name}: Content exceeds ${limitK}k character limit (${(text.length / 1000).toFixed(1)}k chars)`,
)
setPdfData((prev) => {
const next = new Map(prev)
next.delete(file)
return next
})
// Remove the file from the list
setFiles((prev) => prev.filter((f) => f !== file))
continue
}
setPdfData((prev) => {
const next = new Map(prev)
next.set(file, {
text,
charCount: text.length,
isExtracting: false,
})
return next
})
} catch (error) {
console.error("Failed to extract text:", error)
toast.error(`Failed to read file: ${file.name}`)
setPdfData((prev) => {
const next = new Map(prev)
next.delete(file)
return next
})
}
}
}
// Clean up pdfData for removed files
setPdfData((prev) => {
const next = new Map(prev)
for (const key of prev.keys()) {
if (!newFiles.includes(key)) {
next.delete(key)
}
}
return next
})
}
return {
files,
pdfData,
handleFileChange,
setFiles, // Export for external control (e.g., clearing files)
}
}

247
lib/use-quota-manager.tsx Normal file
View File

@@ -0,0 +1,247 @@
"use client"
import { useCallback, useMemo } from "react"
import { toast } from "sonner"
import { QuotaLimitToast } from "@/components/quota-limit-toast"
import { STORAGE_KEYS } from "@/lib/storage"
export interface QuotaConfig {
dailyRequestLimit: number
dailyTokenLimit: number
tpmLimit: number
}
export interface QuotaCheckResult {
allowed: boolean
remaining: number
used: number
}
/**
* Hook for managing request/token quotas and rate limiting.
* Handles three types of limits:
* - Daily request limit
* - Daily token limit
* - Tokens per minute (TPM) rate limit
*
* Users with their own API key bypass all limits.
*/
export function useQuotaManager(config: QuotaConfig): {
hasOwnApiKey: () => boolean
checkDailyLimit: () => QuotaCheckResult
checkTokenLimit: () => QuotaCheckResult
checkTPMLimit: () => QuotaCheckResult
incrementRequestCount: () => void
incrementTokenCount: (tokens: number) => void
incrementTPMCount: (tokens: number) => void
showQuotaLimitToast: () => void
showTokenLimitToast: (used: number) => void
showTPMLimitToast: () => void
} {
const { dailyRequestLimit, dailyTokenLimit, tpmLimit } = config
// Check if user has their own API key configured (bypass limits)
const hasOwnApiKey = useCallback((): boolean => {
const provider = localStorage.getItem(STORAGE_KEYS.aiProvider)
const apiKey = localStorage.getItem(STORAGE_KEYS.aiApiKey)
return !!(provider && apiKey)
}, [])
// Generic helper: Parse count from localStorage with NaN guard
const parseStorageCount = (key: string): number => {
const count = parseInt(localStorage.getItem(key) || "0", 10)
return Number.isNaN(count) ? 0 : count
}
// Generic helper: Create quota checker factory
const createQuotaChecker = useCallback(
(
getTimeKey: () => string,
timeStorageKey: string,
countStorageKey: string,
limit: number,
) => {
return (): QuotaCheckResult => {
if (hasOwnApiKey())
return { allowed: true, remaining: -1, used: 0 }
if (limit <= 0) return { allowed: true, remaining: -1, used: 0 }
const currentTime = getTimeKey()
const storedTime = localStorage.getItem(timeStorageKey)
let count = parseStorageCount(countStorageKey)
if (storedTime !== currentTime) {
count = 0
localStorage.setItem(timeStorageKey, currentTime)
localStorage.setItem(countStorageKey, "0")
}
return {
allowed: count < limit,
remaining: limit - count,
used: count,
}
}
},
[hasOwnApiKey],
)
// Generic helper: Create quota incrementer factory
const createQuotaIncrementer = useCallback(
(
getTimeKey: () => string,
timeStorageKey: string,
countStorageKey: string,
validateInput: boolean = false,
) => {
return (tokens: number = 1): void => {
if (validateInput && (!Number.isFinite(tokens) || tokens <= 0))
return
const currentTime = getTimeKey()
const storedTime = localStorage.getItem(timeStorageKey)
let count = parseStorageCount(countStorageKey)
if (storedTime !== currentTime) {
count = 0
localStorage.setItem(timeStorageKey, currentTime)
}
localStorage.setItem(countStorageKey, String(count + tokens))
}
},
[],
)
// Check daily request limit
const checkDailyLimit = useMemo(
() =>
createQuotaChecker(
() => new Date().toDateString(),
STORAGE_KEYS.requestDate,
STORAGE_KEYS.requestCount,
dailyRequestLimit,
),
[createQuotaChecker, dailyRequestLimit],
)
// Increment request count
const incrementRequestCount = useMemo(
() =>
createQuotaIncrementer(
() => new Date().toDateString(),
STORAGE_KEYS.requestDate,
STORAGE_KEYS.requestCount,
false,
),
[createQuotaIncrementer],
)
// Show quota limit toast (request-based)
const showQuotaLimitToast = useCallback(() => {
toast.custom(
(t) => (
<QuotaLimitToast
used={dailyRequestLimit}
limit={dailyRequestLimit}
onDismiss={() => toast.dismiss(t)}
/>
),
{ duration: 15000 },
)
}, [dailyRequestLimit])
// Check daily token limit
const checkTokenLimit = useMemo(
() =>
createQuotaChecker(
() => new Date().toDateString(),
STORAGE_KEYS.tokenDate,
STORAGE_KEYS.tokenCount,
dailyTokenLimit,
),
[createQuotaChecker, dailyTokenLimit],
)
// Increment token count
const incrementTokenCount = useMemo(
() =>
createQuotaIncrementer(
() => new Date().toDateString(),
STORAGE_KEYS.tokenDate,
STORAGE_KEYS.tokenCount,
true, // Validate input tokens
),
[createQuotaIncrementer],
)
// Show token limit toast
const showTokenLimitToast = useCallback(
(used: number) => {
toast.custom(
(t) => (
<QuotaLimitToast
type="token"
used={used}
limit={dailyTokenLimit}
onDismiss={() => toast.dismiss(t)}
/>
),
{ duration: 15000 },
)
},
[dailyTokenLimit],
)
// Check TPM (tokens per minute) limit
const checkTPMLimit = useMemo(
() =>
createQuotaChecker(
() => Math.floor(Date.now() / 60000).toString(),
STORAGE_KEYS.tpmMinute,
STORAGE_KEYS.tpmCount,
tpmLimit,
),
[createQuotaChecker, tpmLimit],
)
// Increment TPM count
const incrementTPMCount = useMemo(
() =>
createQuotaIncrementer(
() => Math.floor(Date.now() / 60000).toString(),
STORAGE_KEYS.tpmMinute,
STORAGE_KEYS.tpmCount,
true, // Validate input tokens
),
[createQuotaIncrementer],
)
// Show TPM limit toast
const showTPMLimitToast = useCallback(() => {
const limitDisplay =
tpmLimit >= 1000 ? `${tpmLimit / 1000}k` : String(tpmLimit)
toast.error(
`Rate limit reached (${limitDisplay} tokens/min). Please wait 60 seconds before sending another request.`,
{ duration: 8000 },
)
}, [tpmLimit])
return {
// Check functions
hasOwnApiKey,
checkDailyLimit,
checkTokenLimit,
checkTPMLimit,
// Increment functions
incrementRequestCount,
incrementTokenCount,
incrementTPMCount,
// Toast functions
showQuotaLimitToast,
showTokenLimitToast,
showTPMLimitToast,
}
}

View File

@@ -57,6 +57,7 @@ export function formatXML(xml: string, indent: string = " "): string {
* Efficiently converts a potentially incomplete XML string to a legal XML string by closing any open tags properly.
* Additionally, if an <mxCell> tag does not have an mxGeometry child (e.g. <mxCell id="3">),
* it removes that tag from the output.
* Also removes orphaned <mxPoint> elements that aren't inside <Array> or don't have proper 'as' attribute.
* @param xmlString The potentially incomplete XML string
* @returns A legal XML string with properly closed tags and removed incomplete mxCell elements.
*/
@@ -69,10 +70,34 @@ export function convertToLegalXml(xmlString: string): string {
while ((match = regex.exec(xmlString)) !== null) {
// match[0] contains the entire matched mxCell block
let cellContent = match[0]
// Remove orphaned <mxPoint> elements that are directly inside <mxGeometry>
// without an 'as' attribute (like as="sourcePoint", as="targetPoint")
// and not inside <Array as="points">
// These cause "Could not add object mxPoint" errors in draw.io
// First check if there's an <Array as="points"> - if so, keep all mxPoints inside it
const hasArrayPoints = /<Array\s+as="points">/.test(cellContent)
if (!hasArrayPoints) {
// Remove mxPoint elements without 'as' attribute
cellContent = cellContent.replace(
/<mxPoint\b[^>]*\/>/g,
(pointMatch) => {
// Keep if it has an 'as' attribute
if (/\sas=/.test(pointMatch)) {
return pointMatch
}
// Remove orphaned mxPoint
return ""
},
)
}
// Indent each line of the matched block for readability.
const formatted = match[0]
const formatted = cellContent
.split("\n")
.map((line) => " " + line.trim())
.filter((line) => line.trim()) // Remove empty lines from removed mxPoints
.join("\n")
result += formatted + "\n"
}
@@ -81,6 +106,32 @@ export function convertToLegalXml(xmlString: string): string {
return result
}
/**
* Wrap XML content with the full mxfile structure required by draw.io.
* Handles cases where XML is just <root>, <mxGraphModel>, or already has <mxfile>.
* @param xml - The XML string (may be partial or complete)
* @returns Full mxfile-wrapped XML string
*/
export function wrapWithMxFile(xml: string): string {
if (!xml) {
return `<mxfile><diagram name="Page-1" id="page-1"><mxGraphModel><root><mxCell id="0"/><mxCell id="1" parent="0"/></root></mxGraphModel></diagram></mxfile>`
}
// Already has full structure
if (xml.includes("<mxfile")) {
return xml
}
// Has mxGraphModel but not mxfile
if (xml.includes("<mxGraphModel")) {
return `<mxfile><diagram name="Page-1" id="page-1">${xml}</diagram></mxfile>`
}
// Just <root> content - extract inner content and wrap fully
const rootContent = xml.replace(/<\/?root>/g, "").trim()
return `<mxfile><diagram name="Page-1" id="page-1"><mxGraphModel><root>${rootContent}</root></mxGraphModel></diagram></mxfile>`
}
/**
* Replace nodes in a Draw.io XML diagram
* @param currentXML - The original Draw.io XML string
@@ -226,7 +277,6 @@ export function replaceXMLParts(
): string {
// Format the XML first to ensure consistent line breaks
let result = formatXML(xmlContent)
let lastProcessedIndex = 0
for (const { search, replace } of searchReplacePairs) {
// Also format the search content for consistency
@@ -241,18 +291,10 @@ export function replaceXMLParts(
searchLines.pop()
}
// Find the line number where lastProcessedIndex falls
let startLineNum = 0
let currentIndex = 0
while (
currentIndex < lastProcessedIndex &&
startLineNum < resultLines.length
) {
currentIndex += resultLines[startLineNum].length + 1 // +1 for \n
startLineNum++
}
// Always search from the beginning - pairs may not be in document order
const startLineNum = 0
// Try to find exact match starting from lastProcessedIndex
// Try to find match using multiple strategies
let matchFound = false
let matchStartLine = -1
let matchEndLine = -1
@@ -397,6 +439,73 @@ export function replaceXMLParts(
}
}
// Sixth try: Match by value attribute (label text)
// Extract value from search pattern and find elements with that value
if (!matchFound) {
const valueMatch = search.match(/value="([^"]*)"/)
if (valueMatch) {
const searchValue = valueMatch[0] // Use full match like value="text"
for (let i = startLineNum; i < resultLines.length; i++) {
if (resultLines[i].includes(searchValue)) {
// Found element with matching value
let endLine = i + 1
const line = resultLines[i].trim()
if (!line.endsWith("/>")) {
let depth = 1
while (endLine < resultLines.length && depth > 0) {
const currentLine = resultLines[endLine].trim()
if (
currentLine.startsWith("<") &&
!currentLine.startsWith("</") &&
!currentLine.endsWith("/>")
) {
depth++
} else if (currentLine.startsWith("</")) {
depth--
}
endLine++
}
}
matchStartLine = i
matchEndLine = endLine
matchFound = true
break
}
}
}
}
// Seventh try: Normalized whitespace match
// Collapse all whitespace and compare
if (!matchFound) {
const normalizeWs = (s: string) => s.replace(/\s+/g, " ").trim()
const normalizedSearch = normalizeWs(search)
for (
let i = startLineNum;
i <= resultLines.length - searchLines.length;
i++
) {
// Build a normalized version of the candidate lines
const candidateLines = resultLines.slice(
i,
i + searchLines.length,
)
const normalizedCandidate = normalizeWs(
candidateLines.join(" "),
)
if (normalizedCandidate === normalizedSearch) {
matchStartLine = i
matchEndLine = i + searchLines.length
matchFound = true
break
}
}
}
if (!matchFound) {
throw new Error(
`Search pattern not found in the diagram. The pattern may not exist in the current structure.`,
@@ -419,12 +528,6 @@ export function replaceXMLParts(
]
result = newResultLines.join("\n")
// Update lastProcessedIndex to the position after the replacement
lastProcessedIndex = 0
for (let i = 0; i < matchStartLine + replaceLines.length; i++) {
lastProcessedIndex += newResultLines[i].length + 1
}
}
return result
@@ -537,6 +640,33 @@ export function validateMxCellStructure(xml: string): string | null {
return `Invalid XML: Found edges with invalid source/target references (${invalidConnections.slice(0, 3).join(", ")}). Edge source and target must reference existing cell IDs. Please regenerate the diagram with valid edge connections.`
}
// Check for orphaned mxPoint elements (not inside <Array as="points"> and without 'as' attribute)
// These cause "Could not add object mxPoint" errors in draw.io
const allMxPoints = doc.querySelectorAll("mxPoint")
const orphanedMxPoints: string[] = []
allMxPoints.forEach((point) => {
const hasAsAttr = point.hasAttribute("as")
const parentIsArray =
point.parentElement?.tagName === "Array" &&
point.parentElement?.getAttribute("as") === "points"
if (!hasAsAttr && !parentIsArray) {
// Find the parent mxCell to report which edge has the problem
let parent = point.parentElement
while (parent && parent.tagName !== "mxCell") {
parent = parent.parentElement
}
const cellId = parent?.getAttribute("id") || "unknown"
if (!orphanedMxPoints.includes(cellId)) {
orphanedMxPoints.push(cellId)
}
}
})
if (orphanedMxPoints.length > 0) {
return `Invalid XML: Found orphaned mxPoint elements in cells (${orphanedMxPoints.slice(0, 3).join(", ")}). mxPoint elements must either have an 'as' attribute (e.g., as="sourcePoint") or be inside <Array as="points">. For edge waypoints, use: <Array as="points"><mxPoint x="..." y="..."/></Array>. Please fix the mxPoint structure.`
}
return null
}

870
package-lock.json generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,6 +1,6 @@
{
"name": "next-ai-draw-io",
"version": "0.3.0",
"version": "0.4.0",
"license": "Apache-2.0",
"private": true,
"scripts": {
@@ -19,33 +19,40 @@
"@ai-sdk/deepseek": "^1.0.30",
"@ai-sdk/google": "^2.0.0",
"@ai-sdk/openai": "^2.0.19",
"@ai-sdk/react": "^2.0.22",
"@ai-sdk/react": "^2.0.107",
"@aws-sdk/credential-providers": "^3.943.0",
"@langfuse/client": "^4.4.9",
"@langfuse/otel": "^4.4.4",
"@langfuse/tracing": "^4.4.9",
"@next/third-parties": "^16.0.6",
"@openrouter/ai-sdk-provider": "^1.2.3",
"@opentelemetry/exporter-trace-otlp-http": "^0.208.0",
"@opentelemetry/sdk-trace-node": "^2.2.0",
"@radix-ui/react-collapsible": "^1.1.12",
"@radix-ui/react-dialog": "^1.1.6",
"@radix-ui/react-label": "^2.1.8",
"@radix-ui/react-scroll-area": "^1.2.3",
"@radix-ui/react-select": "^2.2.6",
"@radix-ui/react-slot": "^1.1.2",
"@radix-ui/react-switch": "^1.2.6",
"@radix-ui/react-tooltip": "^1.1.8",
"@radix-ui/react-use-controllable-state": "^1.2.2",
"@vercel/analytics": "^1.5.0",
"@xmldom/xmldom": "^0.9.8",
"ai": "^5.0.89",
"base-64": "^1.0.0",
"class-variance-authority": "^0.7.1",
"clsx": "^2.1.1",
"js-tiktoken": "^1.0.21",
"jsdom": "^26.0.0",
"lucide-react": "^0.483.0",
"motion": "^12.23.25",
"next": "^16.0.7",
"ollama-ai-provider-v2": "^1.5.4",
"pako": "^2.1.0",
"prism-react-renderer": "^2.4.1",
"react": "^19.0.0",
"react-dom": "^19.0.0",
"react": "^19.1.2",
"react-dom": "^19.1.2",
"react-drawio": "^1.0.3",
"react-icons": "^5.5.0",
"react-markdown": "^10.1.0",
@@ -54,6 +61,7 @@
"sonner": "^2.0.7",
"tailwind-merge": "^3.0.2",
"tailwindcss-animate": "^1.0.7",
"unpdf": "^1.4.0",
"zod": "^4.1.12"
},
"lint-staged": {
@@ -63,6 +71,7 @@
]
},
"devDependencies": {
"@anthropic-ai/tokenizer": "^0.0.4",
"@biomejs/biome": "2.3.8",
"@tailwindcss/postcss": "^4",
"@tailwindcss/typography": "^0.5.19",

View File

@@ -0,0 +1,65 @@
Here is an extended summary of the paper **"Chain-of-Thought Prompting Elicits Reasoning in Large Language Models"** by Jason Wei, et al. This detailed overview covers the background, methodology, extensive experimental results, emergent properties, and qualitative analysis found in the study.
### **1. Introduction and Motivation**
The paper addresses a significant limitation in Large Language Models (LLMs): while scaling up model size (increasing parameters) has revolutionized performance on standard NLP tasks, it has not proven sufficient for challenging logical tasks such as arithmetic, commonsense, and symbolic reasoning.
Traditional techniques to solve these problems fell into two camps:
1. **Finetuning:** Training models manually with large datasets of explanations (expensive and task-specific).
2. **Standard Few-Shot Prompting:** Providing input-output pairs (e.g., Question $\rightarrow$ Answer) without explaining *how* the answer was derived. This often fails on multi-step problems.
The authors introduce **Chain-of-Thought (CoT) Prompting**, a simple method that combines the strengths of both approaches. It leverages the model's existing capabilities to generate natural language rationales without requiring any model parameter updates (finetuning).
### **2. Methodology: What is Chain-of-Thought?**
The core innovation is changing the structure of the "exemplars" (the few-shot examples included in the prompt).
* **Standard Prompting:** The model is shown a question and an immediate answer.
* *Q: Roger has 5 balls. He buys 2 cans of 3 balls. How many now?*
* *A: 11.*
* **Chain-of-Thought Prompting:** The model is shown a question, followed by a series of intermediate natural language reasoning steps that lead to the answer.
* *A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.*
By interacting with the model using this format, the LLM learns to generate its own "thought process" for new, unseen questions. This allows the model to decompose complex problems into manageable intermediate steps.
### **3. Experimental Setup**
The researchers evaluated CoT prompting on several large language models, including **GPT-3 (175B)**, **LaMDA (137B)**, **PaLM (540B)**, **UL2 (20B)**, and **Codex**. They tested across three distinct domains of reasoning:
* **Arithmetic Reasoning:** Using benchmarks like **GSM8K** (math word problems), **SVAMP**, **ASDiv**, **AQuA**, and **MAWPS**.
* **Commonsense Reasoning:** Using datasets like **CSQA**, **StrategyQA**, **Date Understanding**, and **Sports Understanding**.
* **Symbolic Reasoning:** Using tasks like **Last Letter Concatenation** and **Coin Flip** tracking (determining if a coin is heads or tails after a sequence of flips).
### **4. Key Findings and Results**
#### **Arithmetic Reasoning**
The results on math word problems were striking. Standard prompting struggled significantly, often exhibiting a flat scaling curve (performance didn't improve much even as models got bigger).
* **Performance Jump:** On the difficult **GSM8K** benchmark, **PaLM 540B** with CoT prompting achieved **56.9%** accuracy, compared to just 17.9% with standard prompting.
* **Surpassing State-of-the-Art:** PaLM 540B with CoT outperformed a previously finetuned GPT-3 model (55%), establishing a new state-of-the-art without needing a training set.
* **Calculator Integration:** The authors noted that some errors were simple calculation mistakes in otherwise correct logic. By hooking the CoT output into an external Python calculator, accuracy on GSM8K rose further to **58.6%**.
#### **Commonsense Reasoning**
CoT prompting improved performance on tasks requiring background knowledge and physical intuition.
* **StrategyQA:** PaLM 540B achieved **75.6%** accuracy via CoT, beating the prior state-of-the-art (69.4%).
* **Sports Understanding:** The model achieved **95.4%** accuracy, surpassing the performance of an unaided sports enthusiast (84%).
* The gains were minimal on CSQA, likely because many questions in that dataset did not require multi-step logic.
#### **Symbolic Reasoning and Generalization**
A unique strength of CoT was enabling **Out-of-Domain (OOD) Generalization**.
* In the **Coin Flip** task, the models were given examples with only 2 flips. However, using CoT, the models could successfully track coins flipped 3 or 4 times.
* Standard prompting failed completely on these longer sequences, while CoT allowed the model to repeat the logical steps as many times as necessary to reach the solution.
### **5. Emergent Ability of Scale**
One of the paper's most critical insights is that CoT reasoning is an **emergent ability** that depends on model size.
* **Small Models (<10B parameters):** CoT prompting provided **no benefit** and often hurt performance. Small models produced fluent but illogical chains of thought (hallucinations) or suffered from repetition.
* **Large Models (~100B+ parameters):** The ability to reason sequentially emerges at this scale. The performance gains from CoT are negligible for small models but increase dramatically for models like GPT-3 (175B) and PaLM (540B).
### **6. Why Does It Work? (Ablation Studies)**
To ensure the improvement was due to the reasoning steps and not other factors, the authors conducted three specific ablations:
1. **Equation Only:** They prompted the model to output just the math equation without words. This performed worse than CoT, suggesting that natural language helps the model "understand" the question semantics.
2. **Variable Compute:** They prompted the model to output dots (...) to consume compute time before answering. This yielded no improvement, proving that the *content* of the reasoning steps matters, not just the extra tokens.
3. **Reasoning After Answer:** They asked the model to give the answer first, then the explanation. This performed about the same as the baseline, proving that the chain of thought must come *before* the answer to guide the model's inference process.
### **7. Error Analysis and Robustness**
The authors manually analyzed errors made by the models.
* **Error Types:** In math problems, errors were categorized as **Semantic Understanding** (misunderstanding the question), **One-Step Missing** (skipping a logical step), or **Calculation Errors**.
* **Impact of Scale:** Scaling from PaLM 62B to PaLM 540B significantly reduced semantic and missing-step errors, confirming that larger models are better at logic, not just memorization.
* **Robustness:** The method proved robust to different annotators (different people writing the prompts) and different specific examples, though, like all prompting, different prompt styles did result in some variance.
### **Conclusion**
The paper establishes Chain-of-Thought prompting as a powerful paradigm for unlocking the reasoning potential of Large Language Models. By simply asking the model to "show its work," researchers can elicit complex logical behaviors that were previously thought to require specialized architectures or extensive finetuning. The work highlights that reasoning is an emergent capability of sufficiently large language models.

View File

@@ -0,0 +1,4 @@
<svg xmlns="http://www.w3.org/2000/svg" width="140" height="36" viewBox="0 0 140 36">
<rect width="140" height="36" rx="8" fill="#6366f1"/>
<text x="70" y="24" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif" font-size="15" font-weight="600" fill="white" text-anchor="middle">🚀 Live Demo</text>
</svg>

After

Width:  |  Height:  |  Size: 340 B