Rate Limiting

Per-IP, per-route rate limiting using a Count-Min Sketch sliding-window estimator. The entire state fits in ~32 KB of memory, is fully thread-safe with no locks, and is enabled with a single directive in your Dwaarfile.

Quick Start

api.example.com {
    reverse_proxy localhost:8080
    rate_limit 100/s
}

Every IP address is limited to 100 requests per second on api.example.com. Requests over the limit receive a 429 Too Many Requests response immediately — no queuing, no delay.

How It Works

Sliding-window interpolation

The estimator uses a 1-second measurement window and the PROPORTIONAL_RATE_ESTIMATE_CALC_FN from pingora-limits. On every request, the plugin:

Records a count of 1 for the composite key.
Computes a weighted estimate that interpolates between the completed previous interval and the still-accumulating current interval, proportional to how far through the current second you are.

This avoids the step-function artifact of pure fixed windows. With a limit of 200/s, for example, a burst of 200 requests at 00:00.999 followed by 200 more at 00:01.001 would pass a naive counter that resets every second, because each burst lands in its own window. The sliding estimate sees roughly 400 rps at that point and rejects the excess.
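The interpolation can be sketched in a few lines of Rust. This is a minimal illustration of the idea described above, not the actual body of PROPORTIONAL_RATE_ESTIMATE_CALC_FN; the function names and the exact weighting are assumptions made for the example.

```rust
/// Interpolated rate estimate for a sliding window built from two fixed intervals.
///
/// `prev` is the count from the completed previous interval, `curr` the count
/// so far in the current one, and `fraction` (0.0..=1.0) is how far through the
/// current interval we are. The previous interval's weight shrinks as the
/// current interval fills up. With a 1-second interval the result is in
/// requests per second.
fn sliding_estimate(prev: u64, curr: u64, fraction: f64) -> f64 {
    let fraction = fraction.clamp(0.0, 1.0);
    prev as f64 * (1.0 - fraction) + curr as f64
}

/// Hypothetical admission check: reject once the estimate exceeds the limit.
fn is_over_limit(prev: u64, curr: u64, fraction: f64, limit: u64) -> bool {
    sliding_estimate(prev, curr, fraction) > limit as f64
}
```

For the burst scenario above, `sliding_estimate(200, 200, 0.001)` weights the previous interval at 0.999 and yields an estimate just under 400, which is well over a 200/s limit.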

Composite key isolation

The key {client_ip}:{route_domain} gives each (IP, site) pair its own counter. A single IP accessing two different virtual hosts on the same Dwaar instance is tracked independently — reaching the limit on api.example.com does not affect the same IP’s budget on www.example.com.

Keys up to 24 bytes are stored inline in a CompactString with no heap allocation. An IPv4 address plus a short domain name (e.g., 203.0.113.1:api.example.com = 27 bytes) falls just over the inline threshold and allocates once per request; this is a single small allocation and is unlikely to affect throughput in any measurable way.
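To check whether a particular IP and domain combination stays inline, the key length can be computed directly. The sketch below uses a plain String rather than CompactString so that it needs only the standard library; the helper names are hypothetical and the 24-byte capacity is taken from the text above.

```rust
use std::net::IpAddr;

/// Inline capacity stated in the text above, in bytes; longer keys spill to the heap.
const INLINE_CAPACITY: usize = 24;

/// Builds the `{client_ip}:{route_domain}` composite key.
fn composite_key(client_ip: IpAddr, route_domain: &str) -> String {
    format!("{client_ip}:{route_domain}")
}

/// True when a key of this length would stay inline under the assumed capacity.
fn fits_inline(key: &str) -> bool {
    key.len() <= INLINE_CAPACITY
}
```

Calling `fits_inline(&composite_key(ip, "api.example.com"))` for a given client address shows whether that pair would need a heap allocation.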

Configuration

rate_limit <requests_per_second>/s

Token	Type	Required	Description
`requests_per_second`	positive integer	yes	Maximum requests per second per IP for this route. The `/s` suffix is required.
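The validation rules in the table (positive integer, mandatory `/s` suffix) can be expressed as a small parser. This is a hypothetical helper written for illustration; Dwaar's real directive parser may be structured differently and report different errors.

```rust
/// Parses a rate token such as `100/s` into requests per second.
/// Hypothetical helper; the error messages are illustrative only.
fn parse_rate(token: &str) -> Result<u32, String> {
    // The `/s` suffix is mandatory.
    let digits = token
        .strip_suffix("/s")
        .ok_or_else(|| format!("missing `/s` suffix in `{token}`"))?;
    // The remainder must be an unsigned integer.
    let value: u32 = digits
        .parse()
        .map_err(|_| format!("`{digits}` is not a positive integer"))?;
    // Zero would block every request, so treat it as a configuration error.
    if value == 0 {
        return Err("rate must be greater than zero".to_string());
    }
    Ok(value)
}
```

Under these rules `100/s` is accepted, while `100`, `0/s`, and `abc/s` are all rejected.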

Place rate_limit inside any site block. It applies to all paths on that site. To apply different limits to different paths, use handle blocks:

api.example.com {
    handle /public/* {
        reverse_proxy localhost:8080
        rate_limit 1000/s
    }

    handle /admin/* {
        reverse_proxy localhost:8080
        rate_limit 10/s
    }
}

Choosing a limit

Start with a limit that is comfortably above your 99th-percentile legitimate traffic rate for a single IP. Common starting points:

Use case	Suggested limit
Public API (authenticated)	`100/s` – `500/s`
Public API (unauthenticated)	`20/s` – `100/s`
Login / auth endpoints	`5/s` – `20/s`
Webhook receivers	`50/s` – `200/s`
Static asset CDN fallback	`500/s` – `2000/s`

These are starting points. Tune against your actual traffic profile and check your analytics logs for false-positive 429s before tightening further.
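One way to turn "comfortably above the 99th percentile" into a number is to take the nearest-rank percentile of observed per-IP peak rates and multiply it by a headroom factor. The sketch below assumes you have already extracted those per-IP rates from your logs; the function names and the headroom factor are illustrative choices, not part of Dwaar.

```rust
/// Nearest-rank percentile of per-IP peak rates (requests per second).
/// `pct` is a whole percentage in 1..=100. Returns `None` for empty input.
fn percentile(mut rates: Vec<u32>, pct: usize) -> Option<u32> {
    if rates.is_empty() {
        return None;
    }
    rates.sort_unstable();
    // Nearest-rank: ceil(pct / 100 * n), computed with integer arithmetic.
    let rank = (pct * rates.len() + 99) / 100;
    Some(rates[rank.clamp(1, rates.len()) - 1])
}

/// Suggested starting limit: the 99th percentile scaled by a headroom factor.
fn suggested_limit(rates: Vec<u32>, headroom: f64) -> Option<u32> {
    percentile(rates, 99).map(|p99| (p99 as f64 * headroom).ceil() as u32)
}
```

A headroom of 1.5 to 2.0 is a reasonable place to start; the result is still only a first guess to be tuned against real traffic.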

Under Attack Mode

Dwaar includes an Under Attack Mode plugin (UnderAttackPlugin, priority 15) for L7 DDoS mitigation. When enabled for a route, it intercepts every request from an unverified client and serves a JavaScript proof-of-work challenge page instead of forwarding to the upstream.

How the challenge works:

A client without a valid clearance cookie receives a 200 response containing an HTML page with embedded JavaScript.
The JavaScript computes SHA-256 hashes of challenge || nonce in a loop, incrementing the nonce until it finds a hash with 20 leading zero bits (~1 million iterations, ~200 ms on a modern browser).
Once solved, the browser redirects itself to the original URL with _dwaar_solved=1, the challenge value, and the winning nonce as query parameters.
Dwaar verifies the proof-of-work server-side and issues a signed _dwaar_clearance cookie (HMAC-SHA256 over timestamp and client IP, valid for 1 hour).
Subsequent requests that present a valid, unexpired clearance cookie pass through to the upstream without re-solving.
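The server-side difficulty check in the steps above comes down to counting leading zero bits in the digest. The sketch below covers only that check and takes the 32-byte SHA-256 digest as input, since the standard library has no SHA-256; the function names are hypothetical.

```rust
/// Required leading zero bits, as stated in the steps above.
const DIFFICULTY_BITS: u32 = 20;

/// Counts leading zero bits across a digest, most significant byte first.
fn leading_zero_bits(digest: &[u8]) -> u32 {
    let mut bits = 0;
    for &byte in digest {
        if byte == 0 {
            bits += 8;
        } else {
            // First non-zero byte: add its own leading zeros and stop.
            return bits + byte.leading_zeros();
        }
    }
    bits
}

/// True when the digest satisfies the proof-of-work difficulty.
fn meets_difficulty(digest: &[u8], difficulty_bits: u32) -> bool {
    leading_zero_bits(digest) >= difficulty_bits
}

/// Expected number of hash attempts for a given difficulty: 2^bits.
fn expected_attempts(difficulty_bits: u32) -> u64 {
    1u64 << difficulty_bits
}
```

`expected_attempts(DIFFICULTY_BITS)` is 1,048,576, which is where the "~1 million iterations" figure comes from.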

Clients that cannot execute JavaScript (curl, simple scrapers and bots) never get past the challenge page, so they never reach your upstream.

Cookie security properties:

The HMAC binds the cookie to the client’s IP address. A cookie stolen from one IP is rejected on any other IP.
The timestamp prevents replay attacks after the TTL expires.
Cookie verification uses constant-time comparison to eliminate timing side-channels.
Cookies are issued with HttpOnly and SameSite=Lax; Secure is added when the connection is TLS.
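The expiry and constant-time properties can be illustrated without any cryptography crate. The sketch below assumes the expected HMAC tag has already been recomputed from the timestamp and client IP; the function names are hypothetical, and production code would normally rely on a vetted constant-time primitive (for example, the subtle crate) rather than a hand-written loop.

```rust
/// Clearance lifetime from the text above: 1 hour, in seconds.
const CLEARANCE_TTL_SECS: u64 = 3600;

/// Compares two byte strings without returning early on the first mismatch,
/// so the running time does not reveal how many leading bytes matched.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // the tag length is not treated as secret here
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Hypothetical check: the tag must match and the timestamp must be within the TTL.
fn clearance_is_valid(expected_tag: &[u8], presented_tag: &[u8], issued_at: u64, now: u64) -> bool {
    let fresh = now >= issued_at && now - issued_at <= CLEARANCE_TTL_SECS;
    constant_time_eq(expected_tag, presented_tag) && fresh
}
```

Because the client IP is part of the HMAC input, a cookie replayed from a different address produces a different expected tag and fails the comparison.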

Note: Under Attack Mode is currently an internal/programmatic feature. There is no Dwaarfile directive to enable it; it must be activated via the Dwaar admin API or control plane integration. An under_attack Dwaarfile directive is planned for a future release.

Response

When a request is rate-limited, the client receives:

HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Length: 0

The body is empty. The Retry-After: 1 header tells standards-compliant clients to wait at least one second before retrying — which corresponds to the estimator’s 1-second window.

What the client should do:

HTTP clients that respect Retry-After will wait 1 second automatically.
Clients that hammer through 429s will continue to be rejected: the rate estimate includes the rejected requests.
A 429 is not a ban. A client that drops its rate below the limit starts receiving normal responses again as soon as the sliding estimate falls back under the limit, typically within a second or two.
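On the client side, honoring the header amounts to reading Retry-After on a 429 and sleeping for that long. A minimal sketch, assuming the status code and header value have already been extracted from the response by whatever HTTP client you use; the function name and the 1-second fallback are assumptions.

```rust
use std::time::Duration;

/// Returns how long to wait before retrying, or `None` if no retry is needed.
/// Only the delta-seconds form of Retry-After is handled; the HTTP-date form
/// and missing headers fall back to the default delay.
fn retry_delay(status: u16, retry_after: Option<&str>) -> Option<Duration> {
    const DEFAULT_DELAY: Duration = Duration::from_secs(1);
    if status != 429 {
        return None;
    }
    let delay = retry_after
        .and_then(|value| value.trim().parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(DEFAULT_DELAY);
    Some(delay)
}
```

A caller would pass the result to `std::thread::sleep` (or an async timer) before sending the request again.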

Plugin Priority

RateLimitPlugin runs at priority 20.

Priority	Plugin	What it does
10	`BotDetectionPlugin`	Sets `ctx.is_bot` flag
15	`UnderAttackPlugin`	JS proof-of-work challenge
20	`RateLimitPlugin`	Per-IP sliding-window limit
30+	Other plugins	Auth, headers, etc.

Running after bot detection (priority 10) means the rate limiter can see ctx.is_bot if you build custom logic on top of the plugin chain. Lower priority numbers run first, so the rate limiter runs before forward auth and the other plugins at priority 30 and above, and rate-limited requests are rejected before any upstream credential checks are made.
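The ordering can be pictured as a chain sorted by ascending priority that stops at the first rejection. The types below are invented for this illustration and do not reflect Dwaar's actual plugin trait.

```rust
/// Outcome of a single plugin for one request.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Decision {
    Continue,
    Reject(u16), // HTTP status to return
}

/// Hypothetical plugin descriptor: a priority, a name, and a decision function.
struct Plugin {
    priority: u32,
    name: &'static str,
    run: fn() -> Decision,
}

/// Runs plugins in ascending priority order and stops at the first rejection.
/// Returns the names of the plugins that ran and the final decision.
fn run_chain(mut plugins: Vec<Plugin>) -> (Vec<&'static str>, Decision) {
    plugins.sort_by_key(|p| p.priority);
    let mut ran = Vec::new();
    for plugin in &plugins {
        ran.push(plugin.name);
        if let Decision::Reject(status) = (plugin.run)() {
            return (ran, Decision::Reject(status));
        }
    }
    (ran, Decision::Continue)
}
```

With a rate limiter at priority 20 that rejects with 429, a plugin registered at priority 30 never runs for that request.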

Memory Usage

The estimator allocates a fixed table of 4 hashes × 1024 slots = 4096 counters at startup. Each counter is an atomic value (8 bytes), giving:

4 × 1024 × 8 bytes = 32,768 bytes ≈ 32 KB

This is the total memory cost regardless of how many unique IPs are tracked. There is no per-IP allocation.

Collision behaviour: Count-Min Sketch is a probabilistic structure. With 1024 slots per hash row, collision rates remain low up to roughly 1000 distinct active keys. Above that threshold, estimates become slightly inflated — some IPs may be rate-limited slightly earlier than their true rate would warrant. No IP is ever permitted above the limit due to collisions; the sketch only over-counts, never under-counts.
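The over-count-only property follows from how a Count-Min Sketch answers queries: it takes the minimum across rows. Below is a minimal standard-library sketch of the structure, not the pingora-limits implementation; the hashing scheme (DefaultHasher seeded with the row index) is an assumption chosen for simplicity.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

/// Fixed-size Count-Min Sketch: `hashes` rows of `slots` atomic counters.
struct CountMinSketch {
    rows: Vec<Vec<AtomicU64>>,
}

impl CountMinSketch {
    fn new(hashes: usize, slots: usize) -> Self {
        let rows: Vec<Vec<AtomicU64>> = (0..hashes)
            .map(|_| (0..slots).map(|_| AtomicU64::new(0)).collect())
            .collect();
        Self { rows }
    }

    /// Slot index for `key` in a given row; the row number seeds the hash.
    fn slot(&self, row: usize, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        row.hash(&mut hasher);
        key.hash(&mut hasher);
        (hasher.finish() as usize) % self.rows[row].len()
    }

    /// Adds one to the key's counter in every row and returns the new estimate.
    fn incr(&self, key: &str) -> u64 {
        (0..self.rows.len())
            .map(|row| self.rows[row][self.slot(row, key)].fetch_add(1, Ordering::Relaxed) + 1)
            .min()
            .unwrap_or(0)
    }

    /// The estimate is the minimum across rows, so collisions can only inflate it.
    fn estimate(&self, key: &str) -> u64 {
        (0..self.rows.len())
            .map(|row| self.rows[row][self.slot(row, key)].load(Ordering::Relaxed))
            .min()
            .unwrap_or(0)
    }

    /// Bytes used by the counters alone (excludes Vec bookkeeping).
    fn counter_bytes(&self) -> usize {
        self.rows.iter().map(|r| r.len() * std::mem::size_of::<AtomicU64>()).sum()
    }
}
```

Every increment for a key raises its counter in all rows, and other keys can only add to those counters, so `estimate` never returns less than the true count. `CountMinSketch::new(4, 1024).counter_bytes()` reproduces the 32,768-byte figure.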

For most deployments, 1000 simultaneously active IP:domain pairs per Dwaar worker is a generous ceiling. If your traffic profile regularly exceeds this (e.g., a very large public API), consider deploying multiple worker processes or increasing the sketch dimensions by modifying RateLimiter::new() to pass a custom Rate config.

Complete Example

# Global TLS — Dwaar handles certificate provisioning
{
    email admin@example.com
}

# Public website — relaxed limit
www.example.com {
    reverse_proxy localhost:3000
    rate_limit 500/s

    header {
        Strict-Transport-Security "max-age=63072000; includeSubDomains; preload"
        X-Content-Type-Options nosniff
    }
}

# API — strict limit with per-path overrides
api.example.com {
    handle /v1/* {
        reverse_proxy localhost:8080
        rate_limit 100/s
    }

    handle /admin/* {
        ip_filter {
            allow 10.0.0.0/8
            allow 192.168.0.0/16
            default deny
        }
        reverse_proxy localhost:8080
        rate_limit 10/s
    }
}

# Login endpoint — aggressive rate limiting to slow credential stuffing
auth.example.com {
    handle /login {
        rate_limit 5/s
        reverse_proxy localhost:9000
    }

    handle {
        reverse_proxy localhost:9000
    }
}

Related

Bot Detection: classify traffic as bot or human before rate-limiting (priority 10)
IP Filtering: block or allow specific IP ranges outright
Security Headers: add Strict-Transport-Security, CSP, and other response headers