feat: add daily token limit with actual usage tracking (#171)

* feat: add daily token limit with actual usage tracking

- Add DAILY_TOKEN_LIMIT env var for configurable daily token limit
- Track actual tokens from Bedrock API response metadata (not estimates)
- Server sends inputTokens + cachedInputTokens + outputTokens via messageMetadata
- Client increments token count in onFinish callback with actual usage
- Add NaN guards to prevent corrupted localStorage values
- Add token limit toast notification with quota display
- Remove client-side token estimation (was blocking legitimate requests)
- Switch to js-tiktoken for client compatibility (pure JS, no WASM)

* feat: add TPM (tokens per minute) rate limiting

- Add 50k tokens/min client-side rate limit
- Track tokens per minute with automatic minute rollover
- Check TPM limit after daily limits pass
- Show toast when rate limit reached
- NaN guards for localStorage values

* feat: make TPM limit configurable via TPM_LIMIT env var

* chore: restore cache debug logs

* fix: prevent race condition in TPM tracking

checkTPMLimit was resetting TPM count to 0 when checking, which
overwrote the count saved by incrementTPMCount. Now checkTPMLimit
only reads and incrementTPMCount handles all writes.

* chore: improve TPM limit error message clarity
This commit is contained in:
Dayuan Jiang
2025-12-08 18:56:34 +09:00
committed by GitHub
parent 728dda5267
commit 622829b903
7 changed files with 285 additions and 66 deletions

View File

@@ -9,5 +9,7 @@ export async function GET() {
return NextResponse.json({
accessCodeRequired: accessCodes.length > 0,
dailyRequestLimit: parseInt(process.env.DAILY_REQUEST_LIMIT || "0", 10),
dailyTokenLimit: parseInt(process.env.DAILY_TOKEN_LIMIT || "0", 10),
tpmLimit: parseInt(process.env.TPM_LIMIT || "0", 10),
})
}