Aether

mirror of https://github.com/fawney19/Aether.git synced 2026-01-03 00:02:28 +08:00

Author	SHA1	Message	Date
hoping	26b4a37323	feat: introduce a unified endpoint checker to refactor adapters and improve error handling and usage statistics.	2025-12-25 00:02:56 +08:00
Hwwwww-dev	15a9b88fc8	feat: enhance extract_cache_creation_tokens function to support three formats [#41] (#42) - Updated the function to prioritize nested format, followed by flat new format, and finally old format for cache creation tokens. - Added fallback logic for cases where the preferred formats return zero. - Expanded unit tests to cover new format scenarios and ensure proper functionality across all formats. Co-authored-by: heweimin <heweimin@retaileye.ai>	2025-12-24 01:31:45 +08:00
fawney19	03eb7203ec	fix(api): sync chat_handler_base to use aiter_bytes for automatic decompression	2025-12-24 01:13:35 +08:00
hank9999	e38cd6819b	fix(api): optimize the byte-stream iterator to support automatic gzip decompression (#39)	2025-12-24 01:11:35 +08:00
fawney19	1d5c378343	feat: add TTFB timeout detection and improve stream handling - Add stream first byte timeout (TTFB) detection to trigger failover when provider responds too slowly (configurable via STREAM_FIRST_BYTE_TIMEOUT) - Add rate limit fail-open/fail-close strategy configuration - Improve exception handling in stream prefetch with proper error classification - Refactor UsageService with shared _prepare_usage_record method - Add batch deletion for old usage records to avoid long transaction locks - Update CLI adapters to use proper User-Agent headers for each CLI client - Add composite indexes migration for usage table query optimization - Fix streaming status display in frontend to show TTFB during streaming - Remove sensitive JWT secret logging in auth service	2025-12-22 23:44:42 +08:00
fawney19	4e1aed9976	feat: add daily model statistics aggregation with stats_daily_model table	2025-12-20 02:39:10 +08:00
fawney19	df9f9a9f4f	feat: add internal model list query interface with configurable User-Agent headers	2025-12-19 23:40:42 +08:00
fawney19	af476ff21e	feat: enhance error logging and upstream response tracking for provider failures	2025-12-19 15:29:48 +08:00
fawney19	97425ac68f	refactor: make stream smoothing parameters configurable and add models cache invalidation - Move stream smoothing parameters (chunk_size, delay_ms) to database config - Remove hardcoded stream smoothing constants from StreamProcessor - Simplify dynamic delay calculation by using config values directly - Add invalidate_models_list_cache() function to clear /v1/models endpoint cache - Call cache invalidation on model create, update, delete, and bulk operations - Update admin UI to allow runtime configuration of smoothing parameters - Improve model listing freshness when models are modified	2025-12-19 11:03:46 +08:00
fawney19	912f6643e2	tune: adjust stream smoothing parameters for better user experience - Increase chunk size from 5 to 20 characters for fewer delays - Reduce min delay from 15ms to 8ms for faster playback - Reduce max delay from 24ms to 15ms for better responsiveness - Adjust text thresholds to better differentiate content types - Apply parameter tuning to both StreamProcessor and _LightweightSmoother	2025-12-19 09:51:09 +08:00
fawney19	6c0373fda6	refactor: simplify text splitting logic in stream processor - Remove complex conditional logic for short/medium/long text differentiation - Unify text splitting to always use consistent CHUNK_SIZE-based splitting - Rely on dynamic delay calculation for output speed adjustment - Reduce code complexity in both main smoother and lightweight smoother	2025-12-19 09:48:11 +08:00
fawney19	070121717d	refactor: consolidate stream smoothing into StreamProcessor with intelligent timing - Move StreamSmoother functionality directly into StreamProcessor for better integration - Create ContentExtractor strategy pattern for format-agnostic content extraction - Implement intelligent dynamic delay calculation based on text length - Support three text length tiers: short (char-by-char), medium (chunked), long (chunked) - Remove manual chunk_size and delay_ms configuration - now auto-calculated - Simplify admin UI to single toggle switch with auto timing adjustment - Extract format detection logic to reusable content_extractors module - Improve code maintainability with cleaner architecture	2025-12-19 09:46:22 +08:00
fawney19	85fafeacb8	feat: add stream smoothing feature for improved user experience - Implement StreamSmoother class to split large content chunks into smaller pieces with delay - Support OpenAI, Claude, and Gemini API response formats for smooth playback - Add stream smoothing configuration to system settings (enable, chunk size, delay) - Create streamlined API for stream smoothing with StreamSmoothingConfig dataclass - Add admin UI controls for configuring stream smoothing parameters - Use batch configuration loading to minimize database queries - Enable typing effect simulation for better user experience in streaming responses	2025-12-19 03:15:19 +08:00
fawney19	7e792dabfc	refactor: use background task for client disconnection monitoring - Replace time-based throttling with background task for disconnect checks - Remove time.monotonic() and related throttling logic - Prevent blocking of stream transmission during connection checks - Properly clean up background task with try/finally block - Improve throughput and responsiveness of stream processing	2025-12-19 01:59:56 +08:00
fawney19	cd06169b2f	fix: detect OpenAI format stream completion via finish_reason - Add detection of finish_reason in OpenAI API responses to mark stream completion - Ensures OpenAI API streams are properly marked as complete even without explicit completion events - Complements existing completion event detection for other API formats	2025-12-19 01:44:35 +08:00
fawney19	50ffd47546	fix: handle client disconnection after stream completion gracefully - Check has_completion flag before marking client disconnection as failure - Allow graceful termination if response already completed when client disconnects - Change logging level to info for post-completion disconnections - Prevent false error reporting when client closes connection after receiving full response	2025-12-19 01:36:20 +08:00
fawney19	5f0c1fb347	refactor: remove unused response normalizer module - Delete unused ResponseNormalizer class and its initialization logic - Remove response_normalizer and enable_response_normalization parameters from handlers - Simplify chat adapter base initialization by removing normalizer setup - Clean up unused imports in handler modules	2025-12-19 01:20:30 +08:00
fawney19	7b932d7afb	refactor: optimize middleware with pure ASGI implementation and enhance security measures - Replace BaseHTTPMiddleware with pure ASGI implementation in plugin middleware for better streaming response handling - Add trusted proxy count configuration for client IP extraction in reverse proxy environments - Implement audit log cleanup scheduler with configurable retention period - Replace plaintext token logging with SHA256 hash fingerprints for security - Fix database session lifecycle management in middleware - Improve request tracing and error logging throughout the system - Add comprehensive tests for pipeline architecture	2025-12-18 19:07:20 +08:00
fawney19	3e50c157be	feat: add HTTP/SOCKS5 proxy support for API endpoints - Add proxy field to ProviderEndpoint database model with migration - Add ProxyConfig Pydantic model for proxy URL validation - Extend HTTP client pool with create_client_with_proxy method - Integrate proxy configuration in chat_handler_base.py and cli_handler_base.py - Update admin API endpoints to support proxy configuration CRUD - Add proxy configuration UI in frontend EndpointFormDialog Fixes #28	2025-12-18 14:46:47 +08:00
fawney19	21587449c8	fix: improve error classification and logging system - Enhance error classifier to properly handle API key failures with fallback support - Add error reason/code parsing for better AWS and multi-provider compatibility - Improve error message structure detection for non-standard formats - Refactor file logging with size-based rotation (100MB) instead of daily - Optimize production logging by disabling backtrace and diagnose - Clean up model validation and remove redundant configurations	2025-12-18 10:57:31 +08:00
fawney19	1dac4cb156	refactor: optimize provider query and stats aggregation logic	2025-12-17 16:41:10 +08:00
fawney19	a3df41d63d	refactor(cli-handler): improve stream handling and response processing - Refactor CLI handler base for better stream context management - Optimize request/response handling for Claude, OpenAI, and Gemini CLI adapters - Enhance telemetry tracking across CLI handlers	2025-12-16 02:39:20 +08:00
fawney19	ad1c8c394c	refactor(handler): optimize stream processing and telemetry pipeline - Enhance stream context for better token and latency tracking - Refactor stream processor for improved performance metrics - Improve telemetry integration with first_byte_time_ms support - Add comprehensive stream context unit tests	2025-12-16 02:39:03 +08:00
fawney19	f3a69a6160	refactor(handler): implement defensive token update strategy and extract cache creation token utility - Add extract_cache_creation_tokens utility to handle new/old cache creation token formats - Implement defensive update strategy in StreamContext to prevent zero values overwriting valid data - Simplify cache creation token parsing in Claude handler using new utility - Add comprehensive test suite for cache creation token extraction - Improve type hints in handler classes	2025-12-16 00:02:49 +08:00
fawney19	cf67160821	feat(cache): enhance cache monitoring endpoints and handler integrations	2025-12-15 23:12:48 +08:00
fawney19	88e37594cf	refactor(backend): update handlers, utilities and core modules after models restructure	2025-12-15 14:30:53 +08:00
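fawney19	53bf74429e	refactor: restructure the streaming module and extract StreamContext/Processor/Telemetry - Split the streaming logic in chat_handler_base.py into three independent modules: - StreamContext: type-safe streaming context dataclass that replaces the previous ctx dict - StreamProcessor: SSE parsing, prefetch, nested error detection - StreamTelemetryRecorder: statistics recording (Usage/Audit/Candidate) - Move hardcoded configuration into settings.py with environment variable overrides: - HTTP timeout settings (connect/write/pool) - Streaming settings (prefetch line count, statistics delay) - Concurrency control settings (slot TTL, cache reservation ratio)	2025-12-12 15:42:45 +08:00
fawney19	22ea0e245d	refactor: unify nested error detection in response parsing - Extract a _check_nested_error function to handle multiple error formats - Detect top-level error, type=error, and errors nested inside chunks - Simplify error handling code in OpenAIResponseParser and ClaudeResponseParser - Improve code reuse and maintainability	2025-12-11 11:33:07 +08:00
fawney19	8f914d89bb	fix: increase write timeout to support large request bodies - Increase the chat_handler_base write timeout from 10 seconds to 60 seconds - Increase the cli_handler_base write timeout from 10 seconds to 60 seconds - Increase the http_client write timeout from 10 seconds to 60 seconds - Support long conversation requests that carry large payloads (such as images)	2025-12-11 11:21:46 +08:00
fawney19	0474f63403	refactor: complete handler base class type annotations and streaming status updates - Add full type annotations to BaseMessageHandler and MessageTelemetry - Add the _update_usage_to_streaming method to asynchronously update the Usage status to streaming - Improve type hints in the chat/cli handlers for better maintainability - Fix type-check warnings so that mypy passes	2025-12-11 10:05:06 +08:00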
fawney19	f784106826	Initial commit	2025-12-10 20:52:44 +08:00
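
Commit 15a9b88fc8 describes a priority order for reading cache creation tokens: nested format first, then the flat new format, then the old format, falling back whenever a preferred format yields zero. The sketch below illustrates that fallback chain only. It is not the repository's implementation: the function name and every key name (`cache_creation`, `cache_creation_5m_tokens`, `cache_creation_1h_tokens`, `cache_creation_input_tokens`) are assumptions chosen for illustration, since the log does not state the actual field names.

```python
from collections.abc import Mapping
from typing import Any

# Hypothetical key names, chosen only for illustration; the commit log
# does not say which fields the real function reads.
NESTED_KEY = "cache_creation"
FLAT_NEW_KEYS = ("cache_creation_5m_tokens", "cache_creation_1h_tokens")
OLD_KEY = "cache_creation_input_tokens"


def _as_count(value: Any) -> int:
    """Coerce a token count to a non-negative int; anything else counts as 0."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return 0
    return max(int(value), 0)


def extract_cache_creation_tokens_sketch(usage: Mapping[str, Any]) -> int:
    """Try nested, then flat new, then old format; skip any that sums to zero."""
    # 1. Nested format: a sub-mapping whose values are summed.
    nested = usage.get(NESTED_KEY)
    if isinstance(nested, Mapping):
        nested_total = sum(_as_count(v) for v in nested.values())
        if nested_total > 0:
            return nested_total

    # 2. Flat new format: several top-level counters that are summed.
    flat_total = sum(_as_count(usage.get(key)) for key in FLAT_NEW_KEYS)
    if flat_total > 0:
        return flat_total

    # 3. Old format: a single top-level counter (0 if absent).
    return _as_count(usage.get(OLD_KEY))
```

Treating zero as "not present" means a provider that reports an empty nested object alongside a populated old-style field still produces the right count, which appears to be the motivation for the fallback logic the commit mentions.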

31 Commits
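
The stream smoothing commits (85fafeacb8, 912f6643e2) describe splitting large content chunks into smaller pieces and emitting them with a short delay to simulate a typing effect. A minimal sketch of that idea follows, assuming fixed-size splitting and a constant delay. The names `split_text`, `smooth`, and `collect` are hypothetical, and the defaults of 20 characters and 8 ms are taken from the tuning commit message rather than from the code itself.

```python
import asyncio
from typing import AsyncIterator, List


def split_text(text: str, chunk_size: int = 20) -> List[str]:
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


async def smooth(
    text: str, chunk_size: int = 20, delay_ms: float = 8.0
) -> AsyncIterator[str]:
    """Yield the pieces of text one by one, pausing between them."""
    pieces = split_text(text, chunk_size)
    for index, piece in enumerate(pieces):
        yield piece
        # No pause after the final piece, so the stream ends promptly.
        if index < len(pieces) - 1:
            await asyncio.sleep(delay_ms / 1000)


async def collect(text: str, chunk_size: int = 20, delay_ms: float = 8.0) -> List[str]:
    """Helper for demonstration: gather every smoothed piece into a list."""
    return [piece async for piece in smooth(text, chunk_size, delay_ms)]


if __name__ == "__main__":
    # 45 characters with chunk_size=20 yields pieces of 20, 20, and 5 characters.
    print(asyncio.run(collect("x" * 45)))
```

In a real proxy the pieces would be re-wrapped in the provider's SSE event format before being sent; this sketch only covers the splitting and pacing.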