Commit Graph

23 Commits

Author SHA1 Message Date
fawney19
718f56ba75 refactor(cache): optimize cache service architecture and provider transport 2025-12-15 23:12:34 +08:00
fawney19
f16fb28405 feat(model): improve cache invalidation for model updates and deletion
- Handle both old and new aliases when invalidating cache during model updates
- Preserve cache info before deletion to properly invalidate after deletion
- Clear both Redis and in-memory caches on model changes
2025-12-15 20:39:39 +08:00
fawney19
a0ffc2c406 refactor(metrics): rename model_alias_* to model_mapping_* for clarity 2025-12-15 20:39:32 +08:00
fawney19
a7bfab1475 debug: add logging for model support checking and refactor cache resolution priority
- 在 aware_scheduler.py 中添加调试日志,用于跟踪模型支持检查过程
- 重构 model_cache.py 的别名解析逻辑:调整优先级为 alias > provider_model_name > direct_match
- 优化缓存命中路径,将直接匹配逻辑移到别名匹配失败后执行
2025-12-15 18:52:34 +08:00
fawney19
84d4db0f8d feat(model): include alias info in cache invalidation
- Pass provider_model_name to invalidate_model_cache() when creating models
- Pass provider_model_aliases to invalidate_model_cache() when updating models
- Ensures alias-based resolve cache keys are properly cleared on model changes
2025-12-15 18:27:49 +08:00
fawney19
903b182fdf fix(scheduler): correct whitelist validation logic
- Use 'is not None' instead of truthiness check for allowed_api_formats
- Use 'is not None' instead of truthiness check for allowed_models
- Use 'is not None' instead of truthiness check for allowed_providers
- Use 'is not None' check for allowed_endpoints to distinguish empty list from None
- Fixes issue where empty whitelist (empty list) was incorrectly treated as no restriction
2025-12-15 18:27:41 +08:00
fawney19
d9bd0790fe feat(cache): improve model cache invalidation for alias resolution
- Add provider_model_name and provider_model_aliases to invalidate_model_cache()
- Clear resolve cache keys for both model name and aliases when invalidating
- Also clear resolve cache in invalidate_global_model_cache() for GlobalModel names
- Handle SQLite gracefully by catching OperationalError and ProgrammingError
- Optimize fallback query to pre-filter by provider_model_name when JSONB fails
2025-12-15 18:27:31 +08:00
fawney19
11774c69b6 refactor(mapper): use model alias resolution service
- Replace direct GlobalModel.name lookup with ModelCacheService.resolve_global_model_by_name_or_alias()
- Support model aliases in source_model parameter
- Leverage model resolution caching for better performance
2025-12-15 18:13:41 +08:00
fawney19
8f0a0cbdb1 refactor(scheduler): integrate model alias resolution
- Use ModelCacheService.resolve_global_model_by_name_or_alias() for model lookups
- Support both requested model name and resolved GlobalModel name in validation
- Track resolved_model_name for proper allow_models checking
- Improve model availability checks to handle alias resolution
- Fix transient/detached object handling in global_model merge
- Add more descriptive debug logs for alias resolution mismatches
- Clean up code formatting (line length, imports organization)
2025-12-15 18:13:35 +08:00
fawney19
51b85915d2 feat(cache): implement model alias resolution with caching
- Add resolve_global_model_by_name_or_alias() supporting direct match and alias lookup
- Support both provider_model_name and provider_model_aliases matching
- Implement caching for resolved models with TTL
- Add conflict detection when alias maps to multiple GlobalModels
- Record resolution metrics: method, cache hits, duration, conflicts
- Fallback to Python-level filtering for non-PostgreSQL databases
- Add cache invalidation methods for GlobalModel
2025-12-15 18:13:28 +08:00
fawney19
7068aa9130 refactor(backend): optimize cache system and model/provider services 2025-12-15 14:30:21 +08:00
fawney19
728f9bb126 refactor(backend): remove model mappings module 2025-12-15 14:30:00 +08:00
fawney19
2f9d943647 fix(system): fix timezone handling in dashboard and stats services
- Use app timezone instead of UTC for date calculations in dashboard routes
- Ensure consistency between stats_daily.date and timezone-aware comparisons
- Fix date calculations in cleanup scheduler to handle DST correctly
- Update log message in stats aggregator to use business date
2025-12-14 00:16:03 +08:00
fawney19
7d0003e61e refactor(backend): optimize usage service and database helpers 2025-12-14 00:16:03 +08:00
fawney19
53bf74429e refactor: 重构流式处理模块,提取 StreamContext/Processor/Telemetry
- 将 chat_handler_base.py 中的流式处理逻辑拆分为三个独立模块:
  - StreamContext: 类型安全的流式上下文数据类,替代原有的 ctx dict
  - StreamProcessor: SSE 解析、预读、嵌套错误检测
  - StreamTelemetryRecorder: 统计记录(Usage/Audit/Candidate)
- 将硬编码配置外置到 settings.py,支持环境变量覆盖:
  - HTTP 超时配置(connect/write/pool)
  - 流式处理配置(预读行数、统计延迟)
  - 并发控制配置(槽位 TTL、缓存预留比例)
2025-12-12 15:42:45 +08:00
fawney19
39defce71c fix: 修复统计聚合的时区问题,启动时自动回填缺失数据
- 统计聚合使用业务时区(APP_TIMEZONE)计算日期,而非UTC
- 新增 _get_business_day_range() 将业务日期转换为UTC时间范围
- 启动时检查并自动回填因容器重启等原因缺失的统计数据
- 修复 aggregate_daily_stats/update_summary/get_today_realtime_stats 等方法的时区计算
2025-12-12 10:06:07 +08:00
fawney19
5722b1422e fix: 启动时自动回填缺失的统计数据 2025-12-12 09:48:17 +08:00
fawney19
0e8bf0a23b feat: 请求间隔散点图按模型区分颜色
- 后端 get_interval_timeline 接口返回数据添加 model 字段
- 前端散点图按模型分组显示不同颜色的数据点
- 横线统计信息支持按模型分别显示统计数据
- 管理员视图保持按用户分组,用户视图按模型分组
- 更新 mock 数据支持模型字段
2025-12-11 21:33:39 +08:00
fawney19
6e8107e340 fix: 修复管理员散点图只显示部分用户的问题
- 改为按比例采样,保持各用户数据量比例不变
- 散点图默认时间从7天改为当天(24小时)
- limit 从 2000 提高到 10000
2025-12-11 19:34:56 +08:00
fawney19
abc41c7d3c feat: 添加缓存监控和使用量统计 API 端点 2025-12-11 17:47:59 +08:00
fawney19
323a514f77 refactor: 优化活跃请求状态查询逻辑
- 重命名 get_active_requests 为 get_active_requests_status
- 支持从端点配置读取超时时间
- 新增 content_length_limit 错误类型
2025-12-11 10:45:06 +08:00
fawney19
913a87d7f3 refactor: 重构活跃请求查询逻辑到 UsageService
- 在 UsageService 新增 get_active_requests 方法,统一处理活跃请求查询
- 支持自动清理超时的 pending 请求(默认 5 分钟)
- admin 和 user 接口均复用该方法,减少重复代码
- 支持按 ID 列表查询或查询所有活跃请求
2025-12-11 10:04:15 +08:00
fawney19
f784106826 Initial commit 2025-12-10 20:52:44 +08:00