Commit Graph

143 Commits

Author SHA1 Message Date
fawney19
7e26af5476 fix: correct dashboard cache hit rate and quota display logic
- Compute this month's cache hit rate from this month's data (monthly_input_tokens + cache_read_tokens) instead of all-time data
- Fix quota display: show the actual amount when a quota is set; when no quota is set, show $0 and flag it as a high-warning state
2025-12-29 18:12:33 +08:00
fawney19
c8dfb784bc fix: correct dashboard monthly statistics to use the calendar month instead of the past 30 days 2025-12-29 17:55:42 +08:00
fawney19
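599b3d4c95 feat: add viewing and copying of Provider API Keys
- Backend: add the GET /api/admin/endpoints/keys/{key_id}/reveal endpoint
- Frontend: add an eye button (show/hide the full key) and a copy button to the key list
- Automatically clear any revealed keys when the drawer closes (for security)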

Fixes #53
2025-12-29 17:14:26 +08:00
fawney19
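41719a00e7 refactor: improve the cleanup strategy for distributed task locks
Implements two lock cleanup modes:
- Single-instance mode (default): on startup, atomically clear old locks with a Lua script, fixing stale locks left behind when a worker restarts
- Multi-instance mode: compete for the lock with the NX option and rely on the TTL to handle abnormal exits

The mode is selected with the SINGLE_INSTANCE_MODE environment variable.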
2025-12-28 21:34:43 +08:00
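The two cleanup modes described in 41719a00e7 can be sketched roughly as follows. This is a minimal illustration, not the project's code: InMemoryStore is a hypothetical stand-in for a Redis-like store, set_nx mimics the semantics of SET with the NX and EX options, and acquire_task_lock is an invented helper name. Against a real Redis server the delete-then-set step would not be atomic, which is presumably why the commit uses a Lua script for it.

```python
import time
from typing import Dict, Optional, Tuple


class InMemoryStore:
    """Hypothetical stand-in for a Redis-like store, for illustration only."""

    def __init__(self) -> None:
        # key -> (owner, expiry timestamp)
        self._data: Dict[str, Tuple[str, float]] = {}

    def set_nx(self, key: str, value: str, ttl_seconds: float,
               now: Optional[float] = None) -> bool:
        """Set the key only if it is absent or expired (like SET ... NX EX ttl)."""
        now = time.monotonic() if now is None else now
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False  # another holder still owns an unexpired lock
        self._data[key] = (value, now + ttl_seconds)
        return True

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def acquire_task_lock(store: InMemoryStore, key: str, owner: str,
                      ttl_seconds: float, single_instance: bool,
                      now: Optional[float] = None) -> bool:
    """Try to take the lock using one of the two cleanup modes."""
    if single_instance:
        # Single-instance mode: any existing lock is stale by definition,
        # so clear it before taking the lock.
        store.delete(key)
    # Multi-instance mode: compete with NX and let the TTL expire stale locks.
    return store.set_nx(key, owner, ttl_seconds, now)
```

In multi-instance mode a crashed holder blocks other workers until the TTL expires; in single-instance mode the lock is reclaimed immediately on startup.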
fawney19
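7d6d262ed3 feat: add a confirmation check when changing a user's password
When editing a user, entering a new password now requires a password confirmation so that both entries match.
Also updates the backend request model to support the password field.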
2025-12-28 20:00:25 +08:00
fawney19
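d0ce798881 fix: enable random Key rotation when TTL=0
- When cache_ttl_minutes is 0 for all Keys, use random ordering instead of deterministic hashing
- Move the hashlib and random imports to the top of the file
- Simplify handling of the single-Key case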

Closes #57
2025-12-28 19:07:25 +08:00
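A rough sketch of the ordering rule in d0ce798881, under stated assumptions: order_keys, affinity_seed, and the choice of SHA-256 as the hash are all hypothetical; the commit only says that a deterministic hash is replaced by random ordering when every Key has cache_ttl_minutes set to 0.

```python
import hashlib
import random
from typing import List, Optional, Sequence


def order_keys(key_ids: Sequence[str], affinity_seed: str, all_ttl_zero: bool,
               rng: Optional[random.Random] = None) -> List[str]:
    """Return candidate keys in the order they should be tried."""
    keys = list(key_ids)
    if len(keys) <= 1:
        return keys  # single-Key case: nothing to order
    if all_ttl_zero:
        # No cache affinity to preserve, so spread load by shuffling.
        (rng or random).shuffle(keys)
        return keys

    def score(key_id: str) -> str:
        # Deterministic per (seed, key) pair, so the same caller keeps
        # hitting the same key and benefits from upstream caching.
        return hashlib.sha256(f"{affinity_seed}:{key_id}".encode("utf-8")).hexdigest()

    return sorted(keys, key=score)
```

With a non-zero TTL the order is stable for a given seed; with TTL=0 each call may produce a different order.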
fawney19
71bc2e6aab fix: add parameter validation to prevent division-by-zero errors 2025-12-25 22:44:17 +08:00
fawney19
afb329934a fix: fix a division-by-zero error in the time-segment calculation for endpoint health statistics 2025-12-25 19:54:16 +08:00
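The two division-by-zero fixes above (71bc2e6aab and afb329934a) suggest guards along these lines. Both function names and parameters are hypothetical; the actual code is not shown in the log.

```python
def segment_width_seconds(window_seconds: float, segments: int) -> float:
    """Width of one time segment; reject parameters that would divide by zero."""
    if segments <= 0:
        raise ValueError("segments must be a positive integer")
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return window_seconds / segments


def safe_ratio(numerator: float, denominator: float, default: float = 0.0) -> float:
    """Return numerator / denominator, or a default when the denominator is zero."""
    if denominator == 0:
        return default
    return numerator / denominator
```

Validating inputs up front suits request parameters, while a default suits statistics where an empty bucket is a normal case.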
fawney19
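dddb327885 refactor: restructure model test error parsing and fix a variable reference in usage statistics
- Extract the duplicated error parsing logic from ModelsTab and ModelAliasesTab into errorParser.ts
- Add a parseTestModelError function to handle test response errors uniformly
- Add TypeScript type definitions for the testModel API (TestModelRequest/TestModelResponse)
- Fix the incorrect usage_data variable reference in endpoint_checker.py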
2025-12-25 19:36:29 +08:00
hoping
26b4a37323 feat: introduce a unified endpoint checker to refactor the adapters and improve error handling and usage statistics. 2025-12-25 00:02:56 +08:00
fawney19
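9dad194130 fix: fix API Key access restriction fields that could not be cleared
- Unify empty-array handling in the frontend when creating and updating an API Key
- Backend create and update endpoints both convert an empty array to NULL (meaning unrestricted)
- Refresh data once immediately when auto-refresh is turned on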
2025-12-24 22:35:30 +08:00
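The empty-array-to-NULL rule from 9dad194130 might look like this on the backend. normalize_restriction is a hypothetical helper name; the commit only states that an empty array is stored as NULL, meaning unrestricted.

```python
from typing import List, Optional


def normalize_restriction(values: Optional[List[str]]) -> Optional[List[str]]:
    """Map a missing or empty restriction list to None (stored as NULL)."""
    if not values:
        return None  # NULL means "no restriction"
    return list(values)
```

Applying the same helper in both the create and update paths keeps the two endpoints consistent.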
fawney19
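03ad16ea8a fix: fix migration script failure on fresh installs and improve stats backfill logic
Migration script fixes:
- Remove AUTOCOMMIT mode and create indexes within the same transaction
- Check whether each index exists individually and create only the missing ones
- Fix the AUTOCOMMIT connection being unable to see uncommitted tables on a fresh install (#46)

Stats backfill improvements:
- Check StatsDaily and StatsDailyModel for missing dates separately
- Backfill only the data that is actually missing rather than a contiguous range
- Add a failure counter and rollback error logging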
2025-12-24 21:50:05 +08:00
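The "backfill only what is missing" idea in 03ad16ea8a can be illustrated with a small date helper. missing_dates is a hypothetical name, and the set of existing dates would in practice come from a query on StatsDaily or StatsDailyModel.

```python
from datetime import date, timedelta
from typing import Iterable, List


def missing_dates(start: date, end: date, existing: Iterable[date]) -> List[date]:
    """List every day in [start, end] that has no stats row yet."""
    have = set(existing)
    total_days = (end - start).days
    candidates = (start + timedelta(days=offset) for offset in range(total_days + 1))
    return [day for day in candidates if day not in have]
```

Running the helper once per table gives each table its own gap list, so a date present in one table but absent from the other is still backfilled where needed.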
Hwwwww-dev
15a9b88fc8 feat: enhance extract_cache_creation_tokens function to support three formats[#41] (#42)
- Updated the function to prioritize nested format, followed by flat new format, and finally old format for cache creation tokens.
- Added fallback logic for cases where the preferred formats return zero.
- Expanded unit tests to cover new format scenarios and ensure proper functionality across all formats.

Co-authored-by: heweimin <heweimin@retaileye.ai>
2025-12-24 01:31:45 +08:00
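The three-format fallback described in 15a9b88fc8 could be sketched as below. Every field name here is an assumption made for illustration (the log does not list them), and the function is named with a _sketch suffix to make clear it is not the project's extract_cache_creation_tokens.

```python
from typing import Any, Mapping


def extract_cache_creation_tokens_sketch(usage: Mapping[str, Any]) -> int:
    """Try nested, then flat new, then old format; fall through on zero."""
    # 1. Nested format (assumed shape): {"cache_creation": {"<bucket>": n, ...}}
    nested = usage.get("cache_creation")
    if isinstance(nested, Mapping):
        nested_total = sum(int(v) for v in nested.values() if isinstance(v, (int, float)))
        if nested_total > 0:
            return nested_total
    # 2. Flat new format (hypothetical field names).
    flat_total = (int(usage.get("cache_creation_5m_tokens") or 0)
                  + int(usage.get("cache_creation_1h_tokens") or 0))
    if flat_total > 0:
        return flat_total
    # 3. Old format (assumed field name): a single total.
    return int(usage.get("cache_creation_input_tokens") or 0)
```

Falling through when a preferred format reports zero matches the commit's note about fallback logic for zero values.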
fawney19
03eb7203ec fix(api): update chat_handler_base to use aiter_bytes as well, supporting automatic decompression 2025-12-24 01:13:35 +08:00
hank9999
e38cd6819b fix(api): optimize the byte stream iterator to support automatic gzip decompression (#39) 2025-12-24 01:11:35 +08:00
fawney19
d44cfaddf6 fix: rename variable to avoid shadowing in model mapping cache stats
The loop-internal variable provider_model_mappings had the same name as the outer list, so the outer list was overwritten with None, raising an AttributeError
2025-12-23 00:38:37 +08:00
fawney19
65225710a8 refactor: use ConcurrencyDefaults for CACHE_RESERVATION_RATIO constant 2025-12-23 00:34:18 +08:00
fawney19
d7384e69d9 fix: improve code quality and add type safety for Key updates
- Replace f-string logging with lazy formatting in keys.py (lines 256, 265)
- Add EndpointAPIKeyUpdate type interface for frontend type safety
- Use typed EndpointAPIKeyUpdate instead of any in KeyFormDialog.vue
2025-12-23 00:11:10 +08:00
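The logging change in d7384e69d9 replaces eager f-string interpolation with lazy formatting. A minimal sketch, assuming the standard library logging module (the project may use a different logging library); the logger name, log_key_update, and CountingKey are hypothetical.

```python
import logging

logger = logging.getLogger("keys_sketch")


class CountingKey:
    """Counts how often it is rendered to text, to make laziness observable."""

    def __init__(self) -> None:
        self.rendered = 0

    def __str__(self) -> str:
        self.rendered += 1
        return "key-123"


def log_key_update(key: object) -> None:
    # Lazy: the %s argument is only formatted if DEBUG is enabled.
    # An f-string would call str(key) before the logger checks its level.
    logger.debug("Updated key %s", key)
```

When the logger's level filters the message out, the argument is never converted to a string, which avoids needless work on hot paths.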
fawney19
1d5c378343 feat: add TTFB timeout detection and improve stream handling
- Add stream first byte timeout (TTFB) detection to trigger failover
  when provider responds too slowly (configurable via STREAM_FIRST_BYTE_TIMEOUT)
- Add rate limit fail-open/fail-close strategy configuration
- Improve exception handling in stream prefetch with proper error classification
- Refactor UsageService with shared _prepare_usage_record method
- Add batch deletion for old usage records to avoid long transaction locks
- Update CLI adapters to use proper User-Agent headers for each CLI client
- Add composite indexes migration for usage table query optimization
- Fix streaming status display in frontend to show TTFB during streaming
- Remove sensitive JWT secret logging in auth service
2025-12-22 23:44:42 +08:00
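The first-byte timeout from 1d5c378343 can be approximated with asyncio.wait_for around the first chunk only. This is a sketch under assumptions: FirstByteTimeout and stream_with_ttfb are hypothetical names, and the real handler presumably reads the limit from STREAM_FIRST_BYTE_TIMEOUT and triggers failover where this sketch simply raises.

```python
import asyncio
from typing import AsyncIterator


class FirstByteTimeout(Exception):
    """Raised when the upstream sends nothing within the allowed time."""


async def stream_with_ttfb(source: AsyncIterator[bytes],
                           first_byte_timeout: float) -> AsyncIterator[bytes]:
    """Yield chunks from source, enforcing a deadline on the first chunk only."""
    iterator = source.__aiter__()
    try:
        first = await asyncio.wait_for(iterator.__anext__(), timeout=first_byte_timeout)
    except asyncio.TimeoutError as exc:
        # The caller can catch this and fail over to another provider.
        raise FirstByteTimeout(f"no data within {first_byte_timeout}s") from exc
    except StopAsyncIteration:
        return  # empty stream
    yield first
    # After the first byte arrives, later chunks are not subject to this deadline.
    async for chunk in iterator:
        yield chunk
```

Limiting the deadline to the first chunk lets long generations continue while still catching providers that never start responding.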
fawney19
4e1aed9976 feat: add daily model statistics aggregation with stats_daily_model table 2025-12-20 02:39:10 +08:00
fawney19
e2e7996a54 feat: implement upstream model import and batch model assignment with UI components 2025-12-20 02:01:17 +08:00
fawney19
df9f9a9f4f feat: add internal model list query interface with configurable User-Agent headers 2025-12-19 23:40:42 +08:00
hoping
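8c12174521 Customizations
1. Add ESC-key closing to all drawers and dialogs;
2. Add an auto-refresh toggle to the `Usage Records` table;
3. Add a User-Agent header to backend API requests;
4. Change the startup command to read database and Redis configuration from .env.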
2025-12-19 17:31:15 +08:00
fawney19
af476ff21e feat: enhance error logging and upstream response tracking for provider failures 2025-12-19 15:29:48 +08:00
fawney19
3bbc1c6b66 feat: add provider compatibility error detection for intelligent failover
- Introduce ProviderCompatibilityException for unsupported parameter/feature errors
- Add COMPATIBILITY_ERROR_PATTERNS to detect provider-specific limitations
- Implement _is_compatibility_error() method in ErrorClassifier
- Prioritize compatibility error checking before client error validation
- Remove 'max_tokens' from CLIENT_ERROR_PATTERNS as it can indicate compatibility issues
- Enable automatic failover when provider doesn't support requested features
- Improve error classification accuracy with pattern matching for common compatibility issues
2025-12-19 13:28:26 +08:00
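The classification order introduced in 3bbc1c6b66 (compatibility check before client-error check) might be sketched like this. The pattern lists below are invented examples, not the project's COMPATIBILITY_ERROR_PATTERNS or CLIENT_ERROR_PATTERNS, and classify_error is a hypothetical name.

```python
import re
from typing import List, Pattern

# Hypothetical example patterns, for illustration only.
COMPATIBILITY_PATTERNS_SKETCH: List[Pattern[str]] = [
    re.compile(r"unsupported parameter", re.IGNORECASE),
    re.compile(r"does not support", re.IGNORECASE),
]
CLIENT_PATTERNS_SKETCH: List[Pattern[str]] = [
    re.compile(r"invalid request", re.IGNORECASE),
]


def classify_error(message: str) -> str:
    """Return 'compatibility', 'client', or 'unknown' for an upstream error message."""
    # Compatibility is checked first: another provider may accept the request.
    if any(p.search(message) for p in COMPATIBILITY_PATTERNS_SKETCH):
        return "compatibility"
    # Client errors would fail on every provider, so they are not retried.
    if any(p.search(message) for p in CLIENT_PATTERNS_SKETCH):
        return "client"
    return "unknown"
```

The order matters when a message matches both lists: treating it as a compatibility error allows failover instead of returning the error to the client immediately.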
fawney19
c69a0a8506 refactor: remove stream smoothing config from system settings and improve base image caching
- Remove stream_smoothing configuration from SystemConfigService (moved to handler default)
- Remove stream smoothing UI controls from admin settings page
- Add AdminClearSingleAffinityAdapter for targeted cache invalidation
- Add clearSingleAffinity API endpoint to clear specific affinity cache entries
- Include global_model_id in affinity list response for UI deletion support
- Improve CI/CD workflow with hash-based base image change detection
- Add hash label to base image for reliable cache invalidation detection
- Use remote image inspection to determine if base image rebuild is needed
- Include Dockerfile.base in hash calculation for proper dependency tracking
2025-12-19 13:09:56 +08:00
fawney19
1fae202bde Merge pull request #30 from AAEE86/master
chore: Modify the order of API format enumeration
2025-12-19 12:34:22 +08:00
AAEE86
e42bd35d48 chore: Modify the order of API format enumeration - Move CLAUDE_CLI before OPENAI 2025-12-19 11:44:10 +08:00
fawney19
97425ac68f refactor: make stream smoothing parameters configurable and add models cache invalidation
- Move stream smoothing parameters (chunk_size, delay_ms) to database config
- Remove hardcoded stream smoothing constants from StreamProcessor
- Simplify dynamic delay calculation by using config values directly
- Add invalidate_models_list_cache() function to clear /v1/models endpoint cache
- Call cache invalidation on model create, update, delete, and bulk operations
- Update admin UI to allow runtime configuration of smoothing parameters
- Improve model listing freshness when models are modified
2025-12-19 11:03:46 +08:00
fawney19
912f6643e2 tune: adjust stream smoothing parameters for better user experience
- Increase chunk size from 5 to 20 characters for fewer delays
- Reduce min delay from 15ms to 8ms for faster playback
- Reduce max delay from 24ms to 15ms for better responsiveness
- Adjust text thresholds to better differentiate content types
- Apply parameter tuning to both StreamProcessor and _LightweightSmoother
2025-12-19 09:51:09 +08:00
fawney19
6c0373fda6 refactor: simplify text splitting logic in stream processor
- Remove complex conditional logic for short/medium/long text differentiation
- Unify text splitting to always use consistent CHUNK_SIZE-based splitting
- Rely on dynamic delay calculation for output speed adjustment
- Reduce code complexity in both main smoother and lightweight smoother
2025-12-19 09:48:11 +08:00
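The unified CHUNK_SIZE-based splitting from 6c0373fda6 can be sketched as a fixed-width split followed by a paced async generator. split_text and smooth are hypothetical names; the defaults of 20 characters and 8 ms are taken from the tuning commit 912f6643e2, and the real smoother computes its delay dynamically rather than using a constant.

```python
import asyncio
from typing import AsyncIterator, List


def split_text(text: str, chunk_size: int = 20) -> List[str]:
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


async def smooth(text: str, chunk_size: int = 20,
                 delay_ms: float = 8.0) -> AsyncIterator[str]:
    """Emit the pieces one at a time with a short pause between them."""
    for piece in split_text(text, chunk_size):
        yield piece
        await asyncio.sleep(delay_ms / 1000)
```

Because every text length goes through the same split, output speed is controlled by the delay alone.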
fawney19
070121717d refactor: consolidate stream smoothing into StreamProcessor with intelligent timing
- Move StreamSmoother functionality directly into StreamProcessor for better integration
- Create ContentExtractor strategy pattern for format-agnostic content extraction
- Implement intelligent dynamic delay calculation based on text length
- Support three text length tiers: short (char-by-char), medium (chunked), long (chunked)
- Remove manual chunk_size and delay_ms configuration - now auto-calculated
- Simplify admin UI to single toggle switch with auto timing adjustment
- Extract format detection logic to reusable content_extractors module
- Improve code maintainability with cleaner architecture
2025-12-19 09:46:22 +08:00
fawney19
85fafeacb8 feat: add stream smoothing feature for improved user experience
- Implement StreamSmoother class to split large content chunks into smaller pieces with delay
- Support OpenAI, Claude, and Gemini API response formats for smooth playback
- Add stream smoothing configuration to system settings (enable, chunk size, delay)
- Create streamlined API for stream smoothing with StreamSmoothingConfig dataclass
- Add admin UI controls for configuring stream smoothing parameters
- Use batch configuration loading to minimize database queries
- Enable typing effect simulation for better user experience in streaming responses
2025-12-19 03:15:19 +08:00
fawney19
7e792dabfc refactor: use background task for client disconnection monitoring
- Replace time-based throttling with background task for disconnect checks
- Remove time.monotonic() and related throttling logic
- Prevent blocking of stream transmission during connection checks
- Properly clean up background task with try/finally block
- Improve throughput and responsiveness of stream processing
2025-12-19 01:59:56 +08:00
fawney19
cd06169b2f fix: detect OpenAI format stream completion via finish_reason
- Add detection of finish_reason in OpenAI API responses to mark stream completion
- Ensures OpenAI API streams are properly marked as complete even without explicit completion events
- Complements existing completion event detection for other API formats
2025-12-19 01:44:35 +08:00
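Detecting completion via finish_reason, as in cd06169b2f, could look like the following for a single OpenAI-style server-sent event line. openai_chunk_is_final is a hypothetical helper; the chunk layout (choices with a finish_reason field, and a [DONE] sentinel) follows the OpenAI streaming format.

```python
import json


def openai_chunk_is_final(sse_line: str) -> bool:
    """Return True if this SSE line signals that the stream has completed."""
    if not sse_line.startswith("data:"):
        return False
    payload = sse_line[len("data:"):].strip()
    if payload == "[DONE]":
        return True  # explicit end-of-stream sentinel
    try:
        chunk = json.loads(payload)
    except json.JSONDecodeError:
        return False
    if not isinstance(chunk, dict):
        return False
    choices = chunk.get("choices") or []
    # A non-null finish_reason (for example "stop") marks the last content chunk.
    return any(isinstance(c, dict) and c.get("finish_reason") for c in choices)
```

Checking finish_reason means the stream is marked complete even if the connection closes before the [DONE] line arrives.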
fawney19
50ffd47546 fix: handle client disconnection after stream completion gracefully
- Check has_completion flag before marking client disconnection as failure
- Allow graceful termination if response already completed when client disconnects
- Change logging level to info for post-completion disconnections
- Prevent false error reporting when client closes connection after receiving full response
2025-12-19 01:36:20 +08:00
fawney19
5f0c1fb347 refactor: remove unused response normalizer module
- Delete unused ResponseNormalizer class and its initialization logic
- Remove response_normalizer and enable_response_normalization parameters from handlers
- Simplify chat adapter base initialization by removing normalizer setup
- Clean up unused imports in handler modules
2025-12-19 01:20:30 +08:00
fawney19
7b932d7afb refactor: optimize middleware with pure ASGI implementation and enhance security measures
- Replace BaseHTTPMiddleware with pure ASGI implementation in plugin middleware for better streaming response handling
- Add trusted proxy count configuration for client IP extraction in reverse proxy environments
- Implement audit log cleanup scheduler with configurable retention period
- Replace plaintext token logging with SHA256 hash fingerprints for security
- Fix database session lifecycle management in middleware
- Improve request tracing and error logging throughout the system
- Add comprehensive tests for pipeline architecture
2025-12-18 19:07:20 +08:00
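Logging a SHA256 fingerprint instead of the plaintext token, as 7b932d7afb describes, can be done with hashlib. token_fingerprint and the 12-character truncation are assumptions for illustration.

```python
import hashlib


def token_fingerprint(token: str, length: int = 12) -> str:
    """Return a short, non-reversible identifier for a token, safe to log."""
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return digest[:length]
```

The same token always yields the same fingerprint, so log lines can still be correlated without exposing the secret.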
fawney19
293bb592dc fix: enhance proxy configuration with password preservation and UI improvements
- Add 'enabled' field to ProxyConfig for preserving config when disabled
- Mask proxy password in API responses (return '***' instead of actual password)
- Preserve existing password on update when new password not provided
- Add URL encoding for proxy credentials (handle special chars like @, :, /)
- Enhanced URL validation: block SOCKS4, require valid host, forbid embedded auth
- UI improvements: use Switch component, dynamic password placeholder
- Add confirmation dialog for orphaned credentials (URL empty but has username/password)
- Prevent browser password autofill with randomized IDs and CSS text-security
- Unify ProxyConfig type definition in types.ts
2025-12-18 16:14:37 +08:00
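The credential encoding and URL validation listed in 293bb592dc might be combined as in this sketch. build_proxy_url is a hypothetical name and the allowed scheme set is an assumption based on the HTTP/SOCKS5 support mentioned in the log.

```python
from typing import Optional
from urllib.parse import quote, urlsplit, urlunsplit

ALLOWED_SCHEMES = {"http", "https", "socks5"}  # assumed; SOCKS4 is rejected


def build_proxy_url(base_url: str, username: Optional[str] = None,
                    password: Optional[str] = None) -> str:
    """Validate a proxy URL and attach percent-encoded credentials."""
    parts = urlsplit(base_url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("proxy URL must include a host")
    if parts.username or parts.password:
        raise ValueError("credentials must not be embedded in the URL")
    if not username:
        return base_url
    # safe="" also encodes '/', so '@', ':' and '/' cannot break URL parsing.
    userinfo = quote(username, safe="")
    if password:
        userinfo += ":" + quote(password, safe="")
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Keeping credentials in separate fields and encoding them at the last moment is what makes it possible to mask or preserve the password independently of the URL.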
fawney19
3e50c157be feat: add HTTP/SOCKS5 proxy support for API endpoints
- Add proxy field to ProviderEndpoint database model with migration
- Add ProxyConfig Pydantic model for proxy URL validation
- Extend HTTP client pool with create_client_with_proxy method
- Integrate proxy configuration in chat_handler_base.py and cli_handler_base.py
- Update admin API endpoints to support proxy configuration CRUD
- Add proxy configuration UI in frontend EndpointFormDialog

Fixes #28
2025-12-18 14:46:47 +08:00
fawney19
21587449c8 fix: improve error classification and logging system
- Enhance error classifier to properly handle API key failures with fallback support
- Add error reason/code parsing for better AWS and multi-provider compatibility
- Improve error message structure detection for non-standard formats
- Refactor file logging with size-based rotation (100MB) instead of daily
- Optimize production logging by disabling backtrace and diagnose
- Clean up model validation and remove redundant configurations
2025-12-18 10:57:31 +08:00
fawney19
3d0ab353d3 refactor: migrate Pydantic Config to v2 ConfigDict 2025-12-18 02:20:53 +08:00
fawney19
b2a857c164 refactor: consolidate transaction management and remove legacy modules
- Remove unused context.py module (replaced by request.state)
- Remove provider_cache.py (no longer needed)
- Unify environment loading in config/settings.py instead of __init__.py
- Add deprecation warning for get_async_db() (consolidating on sync Session)
- Enhance database.py documentation with comprehensive transaction strategy
- Simplify audit logging to reuse request-level Session (no separate connections)
- Extract UsageService._build_usage_params() helper to reduce code duplication
- Update model and user cache implementations with refined transaction handling
- Remove unnecessary sessionmaker from pipeline
- Clean up audit service exception handling
2025-12-18 01:59:40 +08:00
fawney19
4d1d863916 refactor: improve authentication and user data handling
- Replace user cache queries with direct database queries to ensure data consistency
- Fix token_type parameter in verify_token calls (access token verification)
- Fix role-based permission check using dictionary ranking instead of string comparison
- Fix logout operation to use correct JWT claim name (user_id instead of sub)
- Simplify user authentication flow by removing unnecessary cache layer
- Optimize session initialization in main.py using create_session helper
- Remove unused imports and exception variables
2025-12-18 01:09:22 +08:00
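The role check fix in 4d1d863916 (dictionary ranking instead of string comparison) addresses a pitfall that is easy to reproduce. The role names and ROLE_RANK values below are hypothetical.

```python
from typing import Dict

# Hypothetical roles; a higher number means more privilege.
ROLE_RANK: Dict[str, int] = {"user": 1, "admin": 2}


def has_required_role(role: str, required: str) -> bool:
    """Compare roles by explicit rank; unknown roles get no privileges."""
    required_rank = ROLE_RANK.get(required)
    if required_rank is None:
        raise ValueError(f"unknown required role: {required!r}")
    return ROLE_RANK.get(role, 0) >= required_rank
```

Comparing the strings directly is misleading because "user" sorts after "admin" alphabetically, so a plain string comparison would treat an ordinary user as outranking an administrator.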
fawney19
b579420690 refactor: optimize database session lifecycle and middleware architecture
- Improve database pool capacity logging with detailed configuration parameters
- Optimize database session dependency injection with middleware-managed lifecycle
- Simplify plugin middleware by delegating session creation to FastAPI dependencies
- Fix import path in auth routes (relative to absolute)
- Add safety checks for database session management across middleware exception handlers
- Ensure session cleanup only when not managed by middleware (avoid premature cleanup)
2025-12-18 00:35:46 +08:00
fawney19
9d5c84f9d3 refactor: add scheduling mode support and optimize system settings UI
- Add fixed_order and cache_affinity scheduling modes to CacheAwareScheduler
- Only apply cache affinity in cache_affinity mode; use fixed order otherwise
- Simplify Dialog components with title/description props
- Remove unnecessary button shadows in SystemSettings
- Optimize import dialog UI structure
- Update ModelAliasesTab shadow styling
- Fix fallback orchestrator type hints
- Add scheduling_mode configuration in system config
2025-12-17 19:15:08 +08:00
fawney19
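bd11ebdbd5 fix: fix the dark mode setting on the personal settings page being lost after a refresh
- Frontend: use the useDarkMode composable to unify theme switching logic
- Backend: support the system theme value (previously only auto was supported)
- Treat local localStorage as the source of truth for the theme, so a refresh does not overwrite it with a stale server value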

Fixes #22
2025-12-17 18:02:19 +08:00
fawney19
1dac4cb156 refactor: optimize provider query and stats aggregation logic 2025-12-17 16:41:10 +08:00
fawney19
d24c3885ab feat(admin): add config and user data import/export functionality
Add comprehensive import/export endpoints for:
- Provider and model configuration (with key decryption for export)
- User data and API keys (preserving encrypted data)

Includes merge modes (skip/overwrite/error) for conflict handling,
10MB size limit for imports, and automatic cache invalidation.

Also fix optional field in GlobalModelResponse tiered_pricing.
2025-12-16 18:33:14 +08:00
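The skip/overwrite/error merge modes from d24c3885ab could behave roughly as follows. merge_records and ImportConflictError are hypothetical names, and records are keyed by name purely for illustration.

```python
from typing import Any, Dict


class ImportConflictError(Exception):
    """Raised in 'error' mode when an imported record already exists."""


def merge_records(existing: Dict[str, Any], incoming: Dict[str, Any],
                  mode: str) -> Dict[str, Any]:
    """Merge incoming records into a copy of existing ones according to mode."""
    if mode not in {"skip", "overwrite", "error"}:
        raise ValueError(f"unknown merge mode: {mode!r}")
    merged = dict(existing)
    for name, record in incoming.items():
        if name in merged:
            if mode == "skip":
                continue  # keep the existing record
            if mode == "error":
                raise ImportConflictError(name)
        merged[name] = record  # new record, or overwrite
    return merged
```

Working on a copy means an import that fails in error mode leaves the original data untouched.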
fawney19
46ff5a1a50 refactor(models): enhance model management with official provider marking and extended metadata
- Add OFFICIAL_PROVIDERS set to mark first-party vendors in models.dev
- Implement official provider marking function with cache compatibility
- Extend model metadata with family, context_limit, output_limit fields
- Improve frontend model selection UI with wider panel and better search
- Add dark mode support for provider logos
- Optimize scrollbar styling for model lists
- Update deployment documentation with clearer migration steps
2025-12-16 17:28:40 +08:00