feat: add daily token limit with actual usage tracking (#171)

* feat: add daily token limit with actual usage tracking

- Add DAILY_TOKEN_LIMIT env var for configurable daily token limit
- Track actual tokens from Bedrock API response metadata (not estimates)
- Server sends inputTokens + cachedInputTokens + outputTokens via messageMetadata
- Client increments token count in onFinish callback with actual usage
- Add NaN guards to prevent corrupted localStorage values
- Add token limit toast notification with quota display
- Remove client-side token estimation (was blocking legitimate requests)
- Switch to js-tiktoken for client compatibility (pure JS, no WASM)
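
A minimal sketch of the NaN-guarded counter described in the bullets above, not the code from this commit. The storage interface, the key name `daily-token-count`, and the helpers `readCount`, `incrementTokenCount`, and `isOverDailyLimit` are hypothetical names chosen for illustration; only the three usage fields come from the commit message. Resetting the count when the day changes is left out.

```typescript
// Hypothetical storage shape; in a browser, window.localStorage satisfies it.
interface KeyValueStore {
    getItem(key: string): string | null
    setItem(key: string, value: string): void
}

// In-memory stand-in so the sketch runs outside a browser.
class MemoryStore implements KeyValueStore {
    private data = new Map<string, string>()
    getItem(key: string): string | null {
        return this.data.get(key) ?? null
    }
    setItem(key: string, value: string): void {
        this.data.set(key, value)
    }
}

// Hypothetical key name, not taken from the commit.
const TOKEN_COUNT_KEY = "daily-token-count"

// NaN guard: a missing or corrupted stored value is read as 0.
function readCount(store: KeyValueStore, key: string): number {
    const parsed = Number.parseInt(store.getItem(key) ?? "0", 10)
    return Number.isNaN(parsed) || parsed < 0 ? 0 : parsed
}

// Usage fields named in the commit message, as reported by the server.
interface Usage {
    inputTokens: number
    cachedInputTokens: number
    outputTokens: number
}

// Add the actual usage of a finished response to the stored count.
function incrementTokenCount(store: KeyValueStore, usage: Usage): number {
    const total =
        usage.inputTokens + usage.cachedInputTokens + usage.outputTokens
    // Guard the increment as well, so a bad payload cannot poison the count.
    const safe = Number.isFinite(total) && total > 0 ? Math.floor(total) : 0
    const next = readCount(store, TOKEN_COUNT_KEY) + safe
    store.setItem(TOKEN_COUNT_KEY, String(next))
    return next
}

// True once the stored count has reached the configured limit.
function isOverDailyLimit(store: KeyValueStore, limit: number): boolean {
    return limit > 0 && readCount(store, TOKEN_COUNT_KEY) >= limit
}
```

Because the count is incremented only after a response finishes, a request is never blocked on an estimate; the limit takes effect on the next request instead.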

* feat: add TPM (tokens per minute) rate limiting

- Add 50k tokens/min client-side rate limit
- Track tokens per minute with automatic minute rollover
- Check TPM limit after daily limits pass
- Show toast when rate limit reached
- Add NaN guards for localStorage values

* feat: make TPM limit configurable via TPM_LIMIT env var

* chore: restore cache debug logs

* fix: prevent race condition in TPM tracking

checkTPMLimit was resetting TPM count to 0 when checking, which
overwrote the count saved by incrementTPMCount. Now checkTPMLimit
only reads and incrementTPMCount handles all writes.
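
The read/write split described above can be sketched as follows. This is an illustration under stated assumptions, not the committed code: `checkTPMLimit` and `incrementTPMCount` are named in the message, but their signatures, the storage interface, the key names, and the explicit `nowMs` parameter are hypothetical.

```typescript
// Hypothetical storage shape; in a browser, window.localStorage satisfies it.
interface KeyValueStore {
    getItem(key: string): string | null
    setItem(key: string, value: string): void
}

// In-memory stand-in so the sketch runs outside a browser.
class MemoryStore implements KeyValueStore {
    private data = new Map<string, string>()
    getItem(key: string): string | null {
        return this.data.get(key) ?? null
    }
    setItem(key: string, value: string): void {
        this.data.set(key, value)
    }
}

// Hypothetical key names, not taken from the commit.
const TPM_COUNT_KEY = "tpm-count"
const TPM_MINUTE_KEY = "tpm-minute"

// Index of the one-minute window that contains the given timestamp.
function currentMinute(nowMs: number): number {
    return Math.floor(nowMs / 60_000)
}

// NaN guard: a missing or corrupted stored value is read as 0.
function readInt(store: KeyValueStore, key: string): number {
    const parsed = Number.parseInt(store.getItem(key) ?? "0", 10)
    return Number.isNaN(parsed) || parsed < 0 ? 0 : parsed
}

// Read-only check: a stale minute is treated as a count of 0 without
// writing, so this can never overwrite a count saved by incrementTPMCount.
function checkTPMLimit(
    store: KeyValueStore,
    limit: number,
    nowMs: number,
): boolean {
    const sameMinute = readInt(store, TPM_MINUTE_KEY) === currentMinute(nowMs)
    const count = sameMinute ? readInt(store, TPM_COUNT_KEY) : 0
    return count < limit
}

// The only writer: handles the minute rollover and the increment together.
function incrementTPMCount(
    store: KeyValueStore,
    tokens: number,
    nowMs: number,
): number {
    const minute = currentMinute(nowMs)
    const sameMinute = readInt(store, TPM_MINUTE_KEY) === minute
    const base = sameMinute ? readInt(store, TPM_COUNT_KEY) : 0
    const added = Number.isFinite(tokens) && tokens > 0 ? Math.floor(tokens) : 0
    const next = base + added
    store.setItem(TPM_MINUTE_KEY, String(minute))
    store.setItem(TPM_COUNT_KEY, String(next))
    return next
}
```

Keeping every write in one function means the check can run at any time, including between a response finishing and its usage being recorded, without losing tokens already counted.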

* chore: improve TPM limit error message clarity
Dayuan Jiang
2025-12-08 18:56:34 +09:00
committed by GitHub
parent 728dda5267
commit 622829b903
7 changed files with 285 additions and 66 deletions

@@ -5,16 +5,21 @@ import type React from "react"
import { FaGithub } from "react-icons/fa"
interface QuotaLimitToastProps {
+type?: "request" | "token"
used: number
limit: number
onDismiss: () => void
}
export function QuotaLimitToast({
+type = "request",
used,
limit,
onDismiss,
}: QuotaLimitToastProps) {
+const isTokenLimit = type === "token"
+const formatNumber = (n: number) =>
+n >= 1000 ? `${(n / 1000).toFixed(1)}k` : n.toString()
const handleKeyDown = (e: React.KeyboardEvent) => {
if (e.key === "Escape") {
e.preventDefault()
@@ -48,19 +53,24 @@ export function QuotaLimitToast({
/>
</div>
<h3 className="font-semibold text-foreground text-sm">
-Daily Quota Reached
+{isTokenLimit
+? "Daily Token Limit Reached"
+: "Daily Quota Reached"}
</h3>
<span className="px-2 py-0.5 text-xs font-medium rounded-md bg-muted text-muted-foreground">
-{used}/{limit}
+{isTokenLimit
+? `${formatNumber(used)}/${formatNumber(limit)} tokens`
+: `${used}/${limit}`}
</span>
</div>
{/* Message */}
<div className="text-sm text-muted-foreground leading-relaxed mb-4 space-y-2">
<p>
-Oops you've reached the daily API limit for this demo! As
-an indie developer covering all the API costs myself, I have
-to set these limits to keep things sustainable.
+Oops you've reached the daily{" "}
+{isTokenLimit ? "token" : "API"} limit for this demo! As an
+indie developer covering all the API costs myself, I have to
+set these limits to keep things sustainable.
</p>
<p>
The good news is that you can self-host the project in