Lesson 78: Rate Limiting with governor, Per-IP and Per-User

Table of Contents

Lesson Objectives
4 Rate Limiting Algorithms
Installing the governor Crate
AppError::TooManyRequests Variant 21
Per-IP Rate Limit Middleware
Per-User Rate Limit Middleware
In-Memory Pitfall + Redis Preview G15
Verify End-To-End
Summary
Review Exercises
Next Lesson

Lesson Objectives

After this lesson, you will be able to:

Explain the 4 rate limiting algorithms (fixed window, sliding window, token bucket, leaky bucket) and know when to pick each.
Install and use the governor crate v0.6, an idiomatic Rust rate limiter with token-bucket behavior.
Implement per-IP rate limiting for sensitive endpoints (/auth/login, /users/register) to counter brute force.
Implement per-user rate limiting for mutation endpoints (/orders, /cart/checkout) keyed on the authenticated identity.
Return a standard 429 Too Many Requests response (RFC 6585) with a Retry-After header (RFC 9110) so the client knows when to retry.
Understand the pitfalls of in-memory rate limiting in a multi-instance app and the planned migration to a Redis-backed limiter in G15.
Add the new AppError::TooManyRequests variant, bumping AppError from 20 to 21 variants.

4 Rate Limiting Algorithms

There are 4 classic rate limiting algorithms; each trades off accuracy, memory, and burst behavior differently. Quick comparison:

Algorithm        | Pros                                   | Cons
-----------------|----------------------------------------|------------------------------------------
Fixed window     | simple, little memory                  | boundary spike (20 req in 2 s across two 10-req windows)
Sliding window   | smooth, no boundary spike              | memory grows O(N) with stored timestamps
Token bucket     | burst-friendly + steady refill         | more complex to implement than fixed window
Leaky bucket     | smooth output rate                     | higher latency (requests queue for a slot)

Fixed window counts requests within an N-second window and resets the counter at the start of each window. It is simple to implement (1 counter + 1 timestamp) but suffers from the boundary spike: with a limit of 10 req/min, an attacker can fire 10 requests in the last second of window 1 and 10 more in the first second of window 2, which is effectively 20 requests in 2 seconds, well above the intended rate.
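The boundary spike can be reproduced with a few lines of code. The following is a minimal sketch, not part of the Shop API: FixedWindow and boundary_spike_accepted are hypothetical names, and time is passed in as plain seconds so the result is deterministic.

```rust
/// Fixed-window counter: one counter plus the index of the current window.
/// Hypothetical illustration type; time is given in seconds since an arbitrary start.
pub struct FixedWindow {
    limit: u32,
    window_secs: u64,
    current_window: u64,
    count: u32,
}

impl FixedWindow {
    pub fn new(limit: u32, window_secs: u64) -> Self {
        assert!(window_secs > 0, "window_secs must be > 0");
        Self { limit, window_secs, current_window: 0, count: 0 }
    }

    /// Returns true if the request arriving at `now_secs` is allowed.
    pub fn allow(&mut self, now_secs: u64) -> bool {
        let window = now_secs / self.window_secs;
        if window != self.current_window {
            // A new window started: the counter resets to zero.
            self.current_window = window;
            self.count = 0;
        }
        if self.count < self.limit {
            self.count += 1;
            true
        } else {
            false
        }
    }
}

/// Sends 10 requests at t = 59 s and 10 more at t = 60 s against a
/// 10 req/min limit and returns how many were accepted.
pub fn boundary_spike_accepted() -> u32 {
    let mut limiter = FixedWindow::new(10, 60);
    let mut accepted = 0;
    for t in [59u64, 60] {
        for _ in 0..10 {
            if limiter.allow(t) {
                accepted += 1;
            }
        }
    }
    accepted
}
```

Under these assumptions all 20 requests pass even though they arrive within about one second of each other, because the counter resets at t = 60.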

Sliding window tracks the timestamp of every request and counts the timestamps that fall within the rolling last N seconds. This removes the boundary spike, but memory grows O(N) in the number of requests per window: every user/IP stores a list of timestamps, so high traffic means heavy RAM usage.
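A sliding window log makes the memory cost visible, since every accepted request leaves a timestamp behind. This is a rough sketch with a hypothetical SlidingWindowLog type and integer seconds as the clock, not code from the lesson's repository.

```rust
use std::collections::VecDeque;

/// Sliding-window log: stores one timestamp per accepted request.
/// Hypothetical illustration type; time is given in seconds.
pub struct SlidingWindowLog {
    limit: usize,
    window_secs: u64,
    stamps: VecDeque<u64>,
}

impl SlidingWindowLog {
    pub fn new(limit: usize, window_secs: u64) -> Self {
        Self { limit, window_secs, stamps: VecDeque::new() }
    }

    /// Returns true if the request arriving at `now_secs` is allowed.
    pub fn allow(&mut self, now_secs: u64) -> bool {
        // Drop timestamps that have left the rolling window.
        while let Some(&oldest) = self.stamps.front() {
            if oldest + self.window_secs <= now_secs {
                self.stamps.pop_front();
            } else {
                break;
            }
        }
        if self.stamps.len() < self.limit {
            self.stamps.push_back(now_secs);
            true
        } else {
            false
        }
    }

    /// Number of timestamps currently held: this is the O(N) memory cost.
    pub fn stored(&self) -> usize {
        self.stamps.len()
    }
}
```

With a limit of 10 per 60 seconds, 10 requests at t = 59 fill the log, a request at t = 60 is rejected (no boundary spike), and the next one is only accepted once the old timestamps expire at t = 119.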

Token bucket keeps a bucket of up to N tokens, refilled at K tokens/second; each request consumes 1 token. A full bucket permits a short burst (e.g. 60 tokens for 60 back-to-back requests), the steady refill enforces the long-term rate, and an empty bucket means the request is rejected. It is widely deployed: AWS API Gateway and the Stripe API both document token-bucket throttling.
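The burst-then-refill behavior is easiest to see in code. The sketch below is a simplified token bucket with a hypothetical TokenBucket type and a millisecond clock passed in by the caller; governor itself uses GCRA rather than an explicit token counter.

```rust
/// Simplified token bucket. Hypothetical illustration type, not governor's API.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_ms: u64,
}

impl TokenBucket {
    /// Starts full, so an initial burst of `capacity` requests is allowed.
    pub fn new(capacity: u32, refill_per_sec: f64) -> Self {
        Self {
            capacity: capacity as f64,
            tokens: capacity as f64,
            refill_per_sec,
            last_ms: 0,
        }
    }

    /// Returns true if the request arriving at `now_ms` may proceed.
    pub fn allow(&mut self, now_ms: u64) -> bool {
        // Refill proportionally to the elapsed time, capped at capacity.
        let elapsed_ms = now_ms.saturating_sub(self.last_ms);
        self.last_ms = self.last_ms.max(now_ms);
        let refill = elapsed_ms as f64 / 1000.0 * self.refill_per_sec;
        self.tokens = (self.tokens + refill).min(self.capacity);

        if self.tokens >= 1.0 {
            self.tokens -= 1.0; // each request spends one token
            true
        } else {
            false
        }
    }
}
```

With a capacity of 60 and a refill of 1 token/second, 60 back-to-back requests pass, the 61st is rejected, and one more request is allowed a second later.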

Leaky bucket queues requests and processes them at a steady rate, so output is smooth. The server controls downstream load easily, but clients have to wait, so latency increases; it is a poor fit for a REST API that needs fast responses.
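The latency cost of a leaky bucket shows up as the delay each queued request receives. The following sketch is one possible model, with a hypothetical LeakyBucket type that schedules one request per interval and rejects when the request would be scheduled too many slots ahead; it only computes delays and does not actually sleep.

```rust
/// Leaky bucket used as a queue. Hypothetical illustration type; time in milliseconds.
pub struct LeakyBucket {
    max_slots_ahead: u64,
    interval_ms: u64,
    next_slot_ms: u64,
}

impl LeakyBucket {
    pub fn new(max_slots_ahead: u64, interval_ms: u64) -> Self {
        assert!(interval_ms > 0, "interval_ms must be > 0");
        Self { max_slots_ahead, interval_ms, next_slot_ms: 0 }
    }

    /// Returns Some(delay_ms) the request has to wait before being processed,
    /// or None when the queue is full and the request is rejected.
    pub fn enqueue(&mut self, now_ms: u64) -> Option<u64> {
        // The next free processing slot is never in the past.
        let slot = self.next_slot_ms.max(now_ms);
        let slots_ahead = (slot - now_ms) / self.interval_ms;
        if slots_ahead >= self.max_slots_ahead {
            return None;
        }
        self.next_slot_ms = slot + self.interval_ms;
        Some(slot - now_ms)
    }
}
```

With one slot every 100 ms and room for 3 slots ahead, four simultaneous requests get delays of 0, 100 and 200 ms, and the fourth is rejected: output is smooth, but callers pay for it in waiting time.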

Decision for the Shop API: token bucket via the governor crate. Reasons: it is burst-friendly for legitimate users (quick page refreshes, adding 5 items to the cart in a batch), the steady refill enforces the long-term quota, and it is common enough that client developers already know the Retry-After pattern.

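Installing the governor Crate

The governor crate v0.6 implements GCRA (Generic Cell Rate Algorithm), which its documentation describes as a leaky-bucket-type algorithm; used as a meter it behaves like a token bucket while storing only a single timestamp per key. The API is simple and does not depend on the tokio runtime (it works from both sync and async code).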
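To give a feel for how GCRA gets token-bucket behavior out of one stored timestamp, here is a simplified sketch. It is not governor's implementation: the Gcra type, its field names and the millisecond clock are assumptions made for illustration.

```rust
/// Simplified GCRA: the only per-key state is the "theoretical arrival time" (TAT).
/// Hypothetical illustration type, not governor's internals.
pub struct Gcra {
    emission_interval_ms: u64, // time that one request "costs"
    burst_tolerance_ms: u64,   // how far ahead of schedule a client may run
    tat_ms: u64,
}

impl Gcra {
    /// Allows `burst` requests at once and one more every `period_ms / burst`.
    pub fn per_period(burst: u64, period_ms: u64) -> Self {
        assert!(burst > 0, "burst must be > 0");
        let interval = period_ms / burst;
        Self {
            emission_interval_ms: interval,
            burst_tolerance_ms: interval * (burst - 1),
            tat_ms: 0,
        }
    }

    /// Ok(()) if the request at `now_ms` conforms, Err(wait_ms) otherwise.
    pub fn check(&mut self, now_ms: u64) -> Result<(), u64> {
        let tat = self.tat_ms.max(now_ms);
        // Earliest instant at which a request is still within the burst tolerance.
        let earliest = tat.saturating_sub(self.burst_tolerance_ms);
        if now_ms < earliest {
            return Err(earliest - now_ms);
        }
        // Accept: push the schedule forward by one emission interval.
        self.tat_ms = tat + self.emission_interval_ms;
        Ok(())
    }
}
```

In this model a quota of 60 per minute accepts a burst of 60, then reports a wait of at most one emission interval (1 second) before the next request conforms.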

Add it to the workspace root shop/Cargo.toml:

# File: shop/Cargo.toml — workspace root
[workspace.dependencies]
governor = "0.6"

Add the dependency to the shop-api crate:

# File: crates/shop-api/Cargo.toml
[dependencies]
governor = { workspace = true }

Create a new file crates/shop-api/src/middleware/rate_limit.rs containing 2 type aliases and 2 builder functions:

// File: crates/shop-api/src/middleware/rate_limit.rs
use std::net::IpAddr;
use std::num::NonZeroU32;
use std::sync::Arc;

use governor::{
    clock::DefaultClock,
    state::keyed::DefaultKeyedStateStore,
    Quota, RateLimiter,
};

/// Per-IP rate limiter: keyed by IpAddr, in-memory state.
pub type IpRateLimiter = Arc<RateLimiter<IpAddr, DefaultKeyedStateStore<IpAddr>, DefaultClock>>;

/// Per-user rate limiter: keyed by user_id (i64 = BIGSERIAL in PG), in-memory state.
pub type UserRateLimiter = Arc<RateLimiter<i64, DefaultKeyedStateStore<i64>, DefaultClock>>;

/// Builds a per-IP limiter with a quota in requests/minute.
pub fn build_ip_limiter(requests_per_minute: u32) -> IpRateLimiter {
    let quota = Quota::per_minute(
        NonZeroU32::new(requests_per_minute)
            .expect("requests_per_minute must be > 0"),
    );
    Arc::new(RateLimiter::keyed(quota))
}

/// Builds a per-user limiter with a quota in requests/minute.
pub fn build_user_limiter(requests_per_minute: u32) -> UserRateLimiter {
    let quota = Quota::per_minute(
        NonZeroU32::new(requests_per_minute)
            .expect("requests_per_minute must be > 0"),
    );
    Arc::new(RateLimiter::keyed(quota))
}

governor's Quota supports several units (N is a NonZeroU32 and also the maximum burst size):

Quota::per_second(N): N requests/second.
Quota::per_minute(N): N requests/minute (the Shop API choice).
Quota::per_hour(N): N requests/hour.
Quota::with_period(Duration::from_millis(N)): custom interval, one request per period; returns an Option<Quota>.

The Shop API standardizes on Quota::per_minute as its default unit: granular enough for a typical web API, and the numbers read well in config (60 rather than 1.0).

RateLimiter::keyed returns a limiter that applies the same quota independently to each key: IP A gets a quota of 60 and IP B gets its own separate 60; nothing is shared between them. The default backing state store is a concurrent in-memory DashMap.
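The idea of one independent quota per key can be sketched with a plain map behind a mutex. This is an illustration only: KeyedBurstLimiter is a hypothetical type that tracks a burst budget without any refill, whereas governor keeps GCRA state per key in a DashMap.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::Mutex;

/// Keyed limiter sketch: every key gets its own budget of `burst` requests.
/// Hypothetical illustration type; there is no refill, to keep the example short.
pub struct KeyedBurstLimiter<K> {
    burst: u32,
    used: Mutex<HashMap<K, u32>>,
}

impl<K: Hash + Eq + Clone> KeyedBurstLimiter<K> {
    pub fn new(burst: u32) -> Self {
        Self { burst, used: Mutex::new(HashMap::new()) }
    }

    /// Returns true while the given key still has budget left.
    pub fn check_key(&self, key: &K) -> bool {
        let mut used = self.used.lock().unwrap();
        // State is created lazily the first time a key is seen.
        let count = used.entry(key.clone()).or_insert(0);
        if *count < self.burst {
            *count += 1;
            true
        } else {
            false
        }
    }
}
```

Exhausting the budget for one key leaves every other key untouched, which is the property the per-IP and per-user limiters rely on.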

AppError::TooManyRequests Variant 21

After B73 (variant Forbidden(String)), AppError has 20 variants. B78 adds one new variant, TooManyRequests { retry_after_seconds: u64 }, bringing the total from 20 to 21. The retry_after_seconds field holds how long the client must wait before retrying; the middleware computes it from the limiter's response.

// File: crates/shop-common/src/error.rs (snippet: only the B78 addition)
#[derive(Debug, thiserror::Error)]
pub enum AppError {
    // ... 20 existing variants from B10/B16/B48/B55/B73 ...

    #[error("too many requests, retry after {retry_after_seconds}s")]
    TooManyRequests { retry_after_seconds: u64 },
}

Update impl IntoResponse for AppError with a new match arm that sets the Retry-After header (defined in RFC 9110; RFC 6585 allows it on 429 responses). The client then knows how many seconds remain until the quota refills and can implement exponential backoff or a UI countdown timer:

// File: crates/shop-common/src/error.rs (impl IntoResponse, snippet)
use axum::{
    http::{header::RETRY_AFTER, HeaderValue, StatusCode},
    response::{IntoResponse, Response},
    Json,
};
use serde_json::json;

impl IntoResponse for AppError {
    fn into_response(self) -> Response {
        match self {
            // ... 20 existing arms ...

            AppError::TooManyRequests { retry_after_seconds } => {
                let body = json!({
                    "error": format!("rate limit exceeded, retry after {retry_after_seconds}s"),
                    "code": "TOO_MANY_REQUESTS",
                    "request_id": null,
                    "detail": {
                        "retry_after_seconds": retry_after_seconds,
                    }
                });
                let mut response = (StatusCode::TOO_MANY_REQUESTS, Json(body)).into_response();
                // HeaderValue implements From<u64>, so no fallible string parsing is needed.
                response.headers_mut().insert(
                    RETRY_AFTER,
                    HeaderValue::from(retry_after_seconds),
                );
                response
            }
        }
    }
}

The two exhaustive-match helpers status_code() and code() also need one new arm each. The compiler reports any missing arm as a build error, so the update cannot be forgotten.
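The lesson does not show these helpers, so the following is only a guess at their shape, reduced to two variants. The u16 return type and the "FORBIDDEN" code string are assumptions; the real helpers may well return axum's StatusCode and cover all 21 variants.

```rust
/// Reduced stand-in for the real AppError (only 2 of the 21 variants).
#[derive(Debug)]
pub enum AppError {
    Forbidden(String),
    TooManyRequests { retry_after_seconds: u64 },
}

impl AppError {
    /// HTTP status as a plain number (assumption: the real helper may return StatusCode).
    pub fn status_code(&self) -> u16 {
        // No wildcard arm: adding a variant without an arm here fails to compile.
        match self {
            AppError::Forbidden(_) => 403,
            AppError::TooManyRequests { .. } => 429,
        }
    }

    /// Machine-readable error code for the response envelope.
    pub fn code(&self) -> &'static str {
        match self {
            AppError::Forbidden(_) => "FORBIDDEN",
            AppError::TooManyRequests { .. } => "TOO_MANY_REQUESTS",
        }
    }
}
```

Keeping both matches free of a `_ =>` arm is what lets the compiler flag every place that has to learn about a new variant.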

The envelope follows the same pattern as the 20 existing variants:

error: human-readable message including the retry delay in seconds; the frontend can display it as is.
code: the SCREAMING_SNAKE_CASE constant TOO_MANY_REQUESTS, for programmatic checks on the client.
request_id: a null placeholder (the B39 middleware enriches it from Extension<RequestId>).
detail.retry_after_seconds: structured field; the client reads the number straight from the JSON instead of parsing the header.

Note: B55 previewed a RateLimited pattern mapped to 503 Service Unavailable. B78 officially settles on 429 with a dedicated variant, which is the more accurate HTTP semantic; 503 is meant for maintenance or outages.

Per-IP Rate Limit Middleware

Use case: sensitive endpoints such as register and login need a per-IP limit to slow down brute force and credential stuffing (for example, an attacker trying 10,000 passwords against one account from the same IP).

Extend middleware/rate_limit.rs with a middleware function:

// File: crates/shop-api/src/middleware/rate_limit.rs (extend)
use std::net::SocketAddr;

use axum::{
    extract::{ConnectInfo, Request, State},
    middleware::Next,
    response::Response,
};
use shop_common::error::AppError;

pub async fn ip_rate_limit_middleware(
    State(limiter): State<IpRateLimiter>,
    ConnectInfo(addr): ConnectInfo<SocketAddr>,
    req: Request,
    next: Next,
) -> Result<Response, AppError> {
    let ip = addr.ip();

    match limiter.check_key(&ip) {
        Ok(_) => Ok(next.run(req).await),
        Err(negative) => {
            let wait = negative.wait_time_from(governor::clock::Clock::now(&DefaultClock::default()));
            Err(AppError::TooManyRequests {
                retry_after_seconds: wait.as_secs().max(1),
            })
        }
    }
}

3 points of this pattern are fixed from here on:

ConnectInfo<SocketAddr>: axum extracts the client IP from the TCP connection. The app must be served with into_make_service_with_connect_info::<SocketAddr>() instead of the default into_make_service(); otherwise the extractor is rejected at runtime and the request fails with 500 Internal Server Error.
State<IpRateLimiter>: the limiter is injected through axum state (it is an Arc, so cloning is cheap). It is kept separate from the main AppState because it is wired per route via from_fn_with_state.
negative.wait_time_from(...): when the bucket is empty, check_key returns a NotUntil value (bound to negative here) from which the exact wait until the next token can be computed. as_secs() drops the fractional part, and .max(1) keeps Retry-After at 1 second or more so the client does not retry within the same millisecond.
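Because as_secs() truncates, a wait of 1.9 s is reported as Retry-After: 1 and a well-behaved client may come back slightly too early. One possible refinement, shown here as a sketch with a hypothetical helper name, is to round the wait up to whole seconds before applying the 1-second floor.

```rust
use std::time::Duration;

/// Converts a limiter wait time into a Retry-After value in whole seconds,
/// rounding up so the client never retries before the quota is available.
/// Hypothetical helper; the lesson's middleware uses `wait.as_secs().max(1)`.
pub fn retry_after_seconds(wait: Duration) -> u64 {
    // Add one second whenever there is a fractional remainder.
    let rounded_up = wait.as_secs() + u64::from(wait.subsec_nanos() > 0);
    rounded_up.max(1)
}
```

With this helper a wait of 1900 ms becomes 2, an exact 30 s stays 30, and a zero wait still yields 1.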

Wire it into the router for specific routes:

// File: crates/shop-api/src/router.rs (snippet: B78 adds per-route rate limiting)
use axum::{middleware, routing::post, Router};

use crate::middleware::rate_limit::{
    build_ip_limiter, build_user_limiter, ip_rate_limit_middleware,
    user_rate_limit_middleware,
};

pub fn build_router(state: AppState) -> Router {
    // B78: per-IP limiter for auth endpoints (60 req/min is enough for legitimate users).
    let ip_limiter = build_ip_limiter(60);

    let auth_routes = Router::new()
        .route("/users/register", post(routes::users::register))
        .route("/auth/login", post(routes::auth::login)) // B112
        .layer(middleware::from_fn_with_state(
            ip_limiter.clone(),
            ip_rate_limit_middleware,
        ));

    // ... nest auth_routes under /api/v1 ...
}

Pattern: 60 requests/minute per IP for auth endpoints. That is plenty for a legitimate user (5 to 10 login retries after mistyping a password is already a lot) and blocks brute force (10K req/min from one IP).

Serve the main app with ConnectInfo so the extractor works:

// File: crates/shop-api/src/main.rs (snippet — B78 update bind)
let listener = tokio::net::TcpListener::bind(&config.server_bind).await?;
axum::serve(
    listener,
    app.into_make_service_with_connect_info::<SocketAddr>(),
)
.with_graceful_shutdown(shutdown_signal())
.await?;

Per-User Rate Limit Middleware

Use case: authenticated mutation endpoints (POST /orders, POST /cart/checkout) are limited by user_id once B112 wires JWT auth. Per-user is fairer than per-IP, since one IP can be an office NAT shared by 100 legitimate users.

Middleware preview ahead of B112 (Extension<CurrentUser> does not exist yet, so user_id is the placeholder 1, which means every request shares a single bucket for now):

// File: crates/shop-api/src/middleware/rate_limit.rs (extend)
pub async fn user_rate_limit_middleware(
    State(limiter): State<UserRateLimiter>,
    // TODO B112: Extension<CurrentUser> will inject user_id from the JWT claims after the auth middleware.
    req: Request,
    next: Next,
) -> Result<Response, AppError> {
    // B78 placeholder. After B112 this becomes:
    // req.extensions().get::<CurrentUser>().map(|u| u.id).ok_or(AppError::Unauthenticated)?
    let user_id = 1i64;

    match limiter.check_key(&user_id) {
        Ok(_) => Ok(next.run(req).await),
        Err(negative) => {
            let wait = negative.wait_time_from(governor::clock::Clock::now(&DefaultClock::default()));
            Err(AppError::TooManyRequests {
                retry_after_seconds: wait.as_secs().max(1),
            })
        }
    }
}

Wire it for /orders and /cart/checkout:

// File: crates/shop-api/src/router.rs (snippet, B78 continued)
// B78: per-user limiter for mutation endpoints (300 req/min = 5 req/s is generous).
let user_limiter = build_user_limiter(300);

let order_routes = Router::new()
    .route("/orders", post(routes::orders::create_order))
    .route("/cart/checkout", post(routes::cart::checkout))
    .layer(middleware::from_fn_with_state(
        user_limiter.clone(),
        user_rate_limit_middleware,
    ));

Pattern: 300 requests/minute per authenticated user for mutations. That is high enough for legitimate flows (adding 10 cart items in a row, retrying checkout 3 times) and blocks bots that scrape data under a user identity.

After B112, replace the user_id = 1i64 placeholder:

// After B112 (preview):
let user_id = req
    .extensions()
    .get::<CurrentUser>()
    .map(|u| u.id)
    .ok_or(AppError::Unauthenticated)?;

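A request that reaches the per-user limiter without being authenticated is rejected with 401 right away. This requires the auth middleware to run before the per-user rate limit, i.e. auth must be the outer layer; in axum that means its .layer(...) call comes after the rate limit layer's call, because the last layer added runs first.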

In-Memory Pitfall + Redis Preview G15

governor::RateLimiter uses an in-memory DashMap as its backing store. It is fast (no network round-trip) but has 3 pitfalls in production:

Per-instance state: every app instance has its own limiter. Deploying 5 replicas multiplies the quota by 5 (300 × 5 = 1500 req/min in practice for one user), so an attacker whose requests are spread across instances easily exceeds the limit.
Reset on restart: an app restart (new deploy, crash, OOM kill) wipes the entire limiter state. An attacker who knows the deploy time can exploit the fresh quota window.
Single point of failure: when an instance fails or is cut off by a network partition, its state is lost and cannot be recovered.
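The per-instance multiplication is simple arithmetic, but a tiny simulation makes it concrete. The function below is a hypothetical model, assuming a round-robin load balancer and ignoring refill: each replica enforces the limit on its own counter.

```rust
/// Simulates `requests` from one user spread round-robin over `replicas`
/// instances, each enforcing `limit` independently (no refill, no shared state).
/// Hypothetical model for illustration; returns the number of accepted requests.
pub fn accepted_across_replicas(replicas: usize, limit: u32, requests: u32) -> u32 {
    assert!(replicas > 0, "replicas must be > 0");
    let mut used = vec![0u32; replicas];
    let mut accepted = 0;
    for i in 0..requests as usize {
        let instance = i % replicas; // round-robin load balancing
        if used[instance] < limit {
            used[instance] += 1;
            accepted += 1;
        }
    }
    accepted
}
```

In this model a single instance accepts 300 of 2000 requests, while 5 replicas together accept 1500, which is the 5× gap that a shared store such as Redis closes.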

Solution in G15, Redis-backed:

Redis INCR plus an expiry (SETEX/EXPIRE), executed atomically, for the counter; state is shared across instances.
Persistent across restarts (AOF/RDB snapshot).
Industry pattern: the redis-rate-limit crate, or a custom build on the fred client (the G18 Redis adapter).

Decision for the Shop API:

Local dev / single-instance staging: governor in-memory is fine (wired in B78).
Multi-instance production: migrate to Redis-backed in the G15 deploy phase.
Trade-off: governor answers in-process with negligible latency, Redis costs about 1-5 ms per network round-trip. For rate limiting, 5 ms is acceptable next to typical handler latency.

Pattern for the migration: design a RateLimiter trait abstraction as early as B78, then swap the implementation in G15 without breaking the API:

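// Preview G15: RateLimiter trait abstraction.
// Note: the name clashes with governor::RateLimiter, so alias one of them on import.
use std::marker::PhantomData;
use std::time::Duration;

#[async_trait::async_trait]
pub trait RateLimiter<K>: Send + Sync {
    /// Ok(()) when the request is allowed, Err(wait) with the time to wait otherwise.
    async fn check(&self, key: &K) -> Result<(), Duration>;
}

// B78 in-memory impl (governor wrapper):
pub struct InMemoryLimiter<K> { /* Arc<governor::RateLimiter<...>> */ }

// G15 Redis-backed impl (PhantomData ties the otherwise unused K to the struct):
pub struct RedisLimiter<K> { pool: RedisPool, key_prefix: String, _key: PhantomData<K> }

B78 does not actually need this trait yet (it calls governor directly); the note is here so that the G15 refactor does not break the middleware call sites.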
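To illustrate why the abstraction protects call sites, here is a synchronous, std-only sketch of the same shape. Everything in it is hypothetical: the Limiter trait is a simplified stand-in for the async trait above, InMemoryBurstLimiter has no refill, and guard plays the role of the middleware.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::Mutex;
use std::time::Duration;

/// Simplified synchronous counterpart of the previewed trait (assumption).
pub trait Limiter<K>: Send + Sync {
    fn check(&self, key: &K) -> Result<(), Duration>;
}

/// In-memory stand-in: `burst` requests per key, then a fixed retry hint.
pub struct InMemoryBurstLimiter<K> {
    burst: u32,
    retry_after: Duration,
    used: Mutex<HashMap<K, u32>>,
}

impl<K> InMemoryBurstLimiter<K> {
    pub fn new(burst: u32, retry_after: Duration) -> Self {
        Self { burst, retry_after, used: Mutex::new(HashMap::new()) }
    }
}

impl<K: Hash + Eq + Clone + Send> Limiter<K> for InMemoryBurstLimiter<K> {
    fn check(&self, key: &K) -> Result<(), Duration> {
        let mut used = self.used.lock().unwrap();
        let count = used.entry(key.clone()).or_insert(0);
        if *count < self.burst {
            *count += 1;
            Ok(())
        } else {
            Err(self.retry_after)
        }
    }
}

/// Call site that only knows the trait: swapping in a Redis-backed
/// implementation later would not change this function.
pub fn guard<K>(limiter: &dyn Limiter<K>, key: &K) -> Result<(), u64> {
    limiter.check(key).map_err(|wait| wait.as_secs().max(1))
}
```

The guard function depends only on the trait object, so a different backend can be passed in without touching it.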

Verify End-To-End

Start the Shop API locally:

cargo run -p shop-api
# > shop-api listening addr=0.0.0.0:3000 environment=Local
# > rate_limit ip_limiter=60req/min user_limiter=300req/min

Test 1: per-IP rate limit on register (60 req/min, send 70):

for i in {1..70}; do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -X POST http://localhost:3000/api/v1/users/register \
    -H 'Content-Type: application/json' \
    -d '{"email":"test'$i'@x.com","password":"Pass1234","display_name":"X"}'
done | sort | uniq -c

# Expected output (approximate: one token refills per second,
# so a slow loop lets a few more requests through):
#   60 201
#   10 429

Test 2: verify the 429 response body and the Retry-After header:

curl -i -X POST http://localhost:3000/api/v1/users/register \
  -H 'Content-Type: application/json' \
  -d '{"email":"user@example.com","password":"Pass1234","display_name":"X"}'

# Expected once the quota is exhausted (at 60 req/min a token
# returns every second, so the reported wait is 1):
# HTTP/1.1 429 Too Many Requests
# content-type: application/json
# retry-after: 1
#
# {
#   "error": "rate limit exceeded, retry after 1s",
#   "code": "TOO_MANY_REQUESTS",
#   "request_id": null,
#   "detail": { "retry_after_seconds": 1 }
# }

Test 3: per-user rate limit (300 req/min, placeholder user_id = 1 until B112):

for i in {1..350}; do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -X POST http://localhost:3000/api/v1/orders \
    -H 'Content-Type: application/json' \
    -H 'Idempotency-Key: '"$(uuidgen)" \
    -d '{"items":[{"product_id":1,"quantity":1}],"payment_method":{"type":"cod","phone":"+84912345678"}}'
done | sort | uniq -c

# Expected output (approximate: 5 tokens refill per second while the loop runs,
# so the 429 count shrinks the slower the loop is):
#  300 201
#   50 429

Test 4: a different IP gets a fresh quota:

# Spoof X-Forwarded-For (local testing only; production must validate a proxy whitelist):
curl -X POST http://localhost:3000/api/v1/users/register \
  -H 'X-Forwarded-For: 192.168.1.100' \
  -H 'Content-Type: application/json' \
  -d '{"email":"user@example.com","password":"Pass1234","display_name":"X"}'

# Note: ConnectInfo extracts the IP at the TCP layer and does NOT read X-Forwarded-For,
# so a spoofed header does NOT bypass the limiter in this local test.
# Production behind a reverse proxy (nginx, fly.io edge) needs a middleware that
# trusts a proxy whitelist and reads X-Forwarded-For at the right layer (G18 deploy).

Production X-Forwarded-For pitfall: if the app sits behind a load balancer/CDN, ConnectInfo returns the proxy's IP (the same IP for every user), so the limit becomes global instead of per-client-IP. In production you must:

Validate a trusted proxy whitelist: trust only known proxy IPs.
Read X-Forwarded-For in a dedicated middleware and take the right-most address that is not a trusted proxy. The left-most entry is whatever the client sent and is only safe when the edge proxy overwrites the header.
Reject requests that carry X-Forwarded-For but arrive from an IP outside the whitelist (anti-spoofing).

Details of the trusted proxy wiring come in G18 (Deploy & Operations), when the load balancer is configured.
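The rules above can be sketched as a small resolver. This is an illustration under stated assumptions, not the G18 implementation: client_ip is a hypothetical function, the whitelist is a flat list of addresses rather than CIDR ranges, it walks the header from the right and returns the first address that is not a trusted proxy, and for untrusted peers it ignores the header instead of rejecting the request.

```rust
use std::net::IpAddr;

/// Resolves the client IP used as the rate limit key.
/// `peer` is the TCP peer address, `xff` the raw X-Forwarded-For value.
/// Hypothetical sketch: exact-match whitelist, right-most untrusted entry wins.
pub fn client_ip(peer: IpAddr, xff: Option<&str>, trusted: &[IpAddr]) -> IpAddr {
    // Only a trusted proxy is allowed to tell us who the client is.
    if !trusted.contains(&peer) {
        return peer;
    }
    let Some(header) = xff else {
        return peer;
    };
    // Walk from the right: entries appended by our own proxies come last.
    for part in header.rsplit(',') {
        match part.trim().parse::<IpAddr>() {
            Ok(ip) if trusted.contains(&ip) => continue,
            Ok(ip) => return ip,
            // Malformed entry: fall back to the peer address.
            Err(_) => return peer,
        }
    }
    peer
}
```

For a trusted load balancer at 10.0.0.1 and the header "1.2.3.4, 203.0.113.7", this returns 203.0.113.7: the spoofed left-most 1.2.3.4 is never reached. A direct connection from an unknown address keeps its TCP peer IP regardless of the header.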

Summary

4 rate limiting algorithms: fixed window, sliding window, token bucket (chosen), leaky bucket.
governor crate v0.6: idiomatic Rust limiter based on GCRA, with token-bucket behavior.
2 strategies in the Shop API: per-IP 60 req/min for auth, per-user 300 req/min for mutations.
IpRateLimiter + UserRateLimiter type aliases over the keyed state store.
Quota::per_minute(N): minutes are the unit for every rate spec in the Shop API.
AppError::TooManyRequests { retry_after_seconds } is variant 21 (20 → 21).
429 response (RFC 6585) with a Retry-After header (RFC 9110): the client knows when to retry.
ConnectInfo<SocketAddr> extracts the IP in axum and requires into_make_service_with_connect_info.
State<IpRateLimiter> injects the limiter through state (cheap Arc clone).
In-memory pitfalls: per-instance state and reset on restart, so multi-instance production needs Redis (G15).
RateLimiter trait abstraction previewed for G15: swap in-memory ↔ Redis without breaking call sites.
X-Forwarded-For pitfall: production must validate a trusted proxy whitelist (G18 deploy).
File changes: NEW middleware/rate_limit.rs; AppError +1 variant; router wires 2 specific endpoint groups.
The stack now has 7 layers (6 from B77 + per-route rate_limit, NOT global).

Review Exercises

Answer these yourself; the answers follow below:

The 4 rate limiting algorithms: what are the pros and cons of each? Why does the Shop API choose token bucket?
Per-IP vs per-user: when do you use which? Give a register scenario and an order scenario.
The Retry-After header: what does it add over a bare 429 status? What is the client-side handling pattern?
The in-memory governor pitfall with multiple instances: what does an attack look like? How does Redis in G15 solve it?
The X-Forwarded-For spoofing pitfall: what must production validate? What is a trusted proxy whitelist?

Answers

Fixed window counts requests within a fixed N-second window and resets at the start of each window. It is simple (1 counter + 1 timestamp) and memory-efficient; the pitfall is the boundary spike: with a limit of 10 req/min, an attacker fires 10 requests in the last second of window 1 and 10 in the first second of window 2, effectively 20 requests in 2 seconds, double the intended rate. It suits endpoints where accuracy does not matter (analytics counters). Sliding window tracks each request's timestamp and counts those within the rolling last N seconds; it removes the boundary spike and smooths the whole window, but memory grows O(N timestamps): every user/IP stores a list, so high traffic means heavy RAM usage and memory pressure. It suits cases where accuracy matters and traffic is moderate. Token bucket holds up to N tokens, refills at K tokens/second, and spends 1 token per request; it is burst-friendly (a full bucket absorbs a quick refresh or a batch action), the steady refill enforces the long-term rate, and the cost is a somewhat more complex implementation than fixed window; AWS API Gateway and Stripe document token-bucket throttling. Leaky bucket queues requests and processes them at a steady rate; output is smooth and downstream load is easy to control, but latency increases because clients wait for a slot in the queue, so it is a poor fit for a REST API that needs fast responses. The Shop API picks token bucket via the governor crate because: (a) it is burst-friendly and does not reject legitimate users (5 quick page refreshes or adding 5 cart items in a batch do not trigger a spurious 429); (b) the steady refill keeps the long-term quota from being exceeded; (c) it is a common pattern, so client SDKs already handle Retry-After plus exponential backoff; (d) governor implements GCRA (Generic Cell Rate Algorithm), an algorithm that originated in ATM networking and gives token-bucket behavior; (e) memory is O(1) per key (a single timestamp) instead of O(N) for a sliding window.
Per-IP is for endpoints that are unauthenticated or security-sensitive: register, login, forgot password, OAuth callback. The user_id is not known yet (no login), so the only discriminator is the IP at the TCP layer; the limit counters brute force and credential stuffing (an attacker trying 10K passwords from one IP). Register scenario: an attacker scripts 10K fake accounts from one IP to spam; per-IP 60 req/min accepts about the first 60 and answers the rest (about 9,940) with 429. Per-user is for authenticated endpoints, where fairness follows the real identity: orders, cart/checkout, profile update, review post. One IP can be an office NAT shared by 100 legitimate users (a company on one internet connection), so per-IP 60/min is not enough; per-user gives each authenticated identity its own quota. Order scenario: user A and user B share an office IP; per-IP would have them eat into each other's quota, while per-user 300/min gives each an independent quota. Bots scraping data under a user identity are rejected more precisely by user_id than by IP. Shop API pattern: auth endpoints (B70 register + B112 login + G11 password reset) are per-IP, mutation endpoints (B66 orders + G7 cart + G14 admin update) are per-user. Public read endpoints (GET /products list/detail) are not rate limited in B78; they get a separate CDN cache layer in G15 (public catalog reads scale horizontally through the cache, and without write side effects there is less to protect with a limit). Sensitive actions (forgot password, reset password) combine per-IP and per-email limits against email spam in G11.
The Retry-After header (defined in RFC 9110 and explicitly allowed on 429 by RFC 6585) tells the client how many seconds remain until the quota refills and a retry can succeed. Advantages over a bare 429 status: (a) actionable information: the client knows the wait time instead of guessing; (b) the UI can show a countdown (e.g. "You have sent too many requests, try again in 30 seconds"); (c) less load: a client SDK retries automatically after the wait time instead of hammering the server with blind retries; (d) it is a universal HTTP standard that every HTTP client library exposes (axios, fetch, requests, reqwest); (e) it is equally useful with other statuses, such as 503 Service Unavailable during a maintenance window. Recommended client-side handling: (a) intercept 429 responses globally in the HTTP client (axios interceptor, fetch wrapper); (b) parse the Retry-After header (a number of seconds or an HTTP date); (c) show a toast/banner with a countdown timer; (d) queue an automatic retry after the wait time plus 0-500 ms of random jitter (to avoid a thundering herd of clients all retrying at exactly 30 s); (e) cap the retry count (e.g. 3) to avoid an infinite loop if the server stays overloaded; (f) log the retry rate as a metric to monitoring (Datadog/Sentry), since a spike in retries is a signal for the ops team. Anti-pattern: a client that retries immediately without respecting Retry-After makes the overload worse and defeats the rate limit. Retry-After has 2 standard forms: an integer delta-seconds (e.g. Retry-After: 30) or an HTTP-date (e.g. Retry-After: Wed, 21 Oct 2026 07:28:00 GMT). The Shop API uses delta-seconds, which is simpler to parse and does not depend on client-server clock sync.
In-memory governor pitfalls with multiple instances: (a) per-instance state: with 5 app replicas behind a load balancer, each instance has an independent DashMap limiter, so the same user/IP key gets 5× the quota (60 × 5 = 300 req/min in practice instead of 60), and an attacker whose requests are spread round-robin over the 5 instances bypasses the limit easily; (b) reset on restart: an app restart (rolling deploy, crash, OOM kill, scale down/up) wipes the limiter state held in RAM, so an attacker who knows the deploy time (public CI/CD logs, GitHub Actions notifications) or who keeps firing until a crash gets a fresh quota window right after the restart; (c) single point of failure: when an instance fails or is cut off by a network partition, its state is lost and cannot be recovered, and the new instance starts with fresh state that the attacker can exploit. Concrete attack scenario: the attacker targets a multi-instance Shop API (5 replicas on fly.io) and sends 60 requests per instance via the sticky-session hash, so an effective 300 req/min goes through without rejection (5× the spec). Alternatively, the attacker times the attack to the rolling deploy every 4 hours (automatic CI release) and gets a fresh quota window every 4 hours, for an effective rate above the intended one. Redis-backed solution in G15: (a) state lives in a Redis cluster shared across instances, so all 5 replicas read and write the same counter; (b) state persists across restarts via Redis AOF (append-only file) or RDB snapshots and recovers after a crash; (c) operations are atomic via Redis INCR + EXPIRE in a Lua script or a MULTI transaction, so there is no race condition; (d) every replica sees the same quota state; (e) trade-off: 1-5 ms of extra network latency for the Redis call on each request, which is acceptable for a rate limit middleware. Industry pattern: the redis-rate-limit crate or a custom build on the fred Redis client (the G18 Redis adapter). Shop API migration path: B78 wires governor in-memory for dev and single-instance staging; G15 introduces the RateLimiter trait abstraction and swaps the implementation (in-memory for tests, Redis for production) without breaking the middleware call sites. In B78, prepare the trait skeleton so that the G15 refactor is a single clean commit.
X-Forwarded-For spoofing pitfall: in production the app sits behind a reverse proxy/CDN (nginx, fly.io edge, Cloudflare, AWS ALB), so ConnectInfo<SocketAddr> returns the proxy's IP rather than the real user's IP (the same IP for every user behind that proxy). If the app blindly trusts an X-Forwarded-For header sent by the client, an attacker can set X-Forwarded-For: 1.2.3.4 and bypass the per-IP limit with ease (each request fakes a different IP, so the quota keeps resetting). Production must validate: (a) a trusted proxy whitelist, i.e. the list of IPs/CIDRs of known reverse proxies (e.g. 10.0.0.0/8 for the internal LB, the public fly.io edge IP ranges); (b) X-Forwarded-For is only read when the request arrives from a trusted proxy; for a request from an IP outside the whitelist, ignore the header and use ConnectInfo directly; (c) the header is parsed at the right layer: X-Forwarded-For can hold a multi-hop chain client_ip, proxy1_ip, proxy2_ip, and since everything left of what your own proxies appended is client-controlled, walk the chain from the right and take the first address that is not a trusted proxy; (d) a request that carries X-Forwarded-For but does not come from a trusted proxy is rejected, with a defensive log entry that alerts the security team. Trusted proxy whitelist: configure an env variable such as TRUSTED_PROXIES=10.0.0.0/8,fly_edge_ip_range as CSV; the middleware parses it, checks whether the ConnectInfo IP is in the whitelist, and decides whether to trust the header. The standardized alternative is the Forwarded header (RFC 7239), which replaces the legacy X-Forwarded-For with a structured format, Forwarded: for=192.0.2.43;proto=https;by=203.0.113.43, that is easier to parse but not universally adopted; production setups still use X-Forwarded-For with careful validation. Supporting crate: axum-client-ip handles the extraction chain (trusted proxy config + fallback strategy); the Shop API will wire it in G18. Summary of the Shop API rules for G18: (1) production behind an LB MUST have a dedicated IP extraction middleware (NOT raw ConnectInfo); (2) the trusted proxy whitelist env config is MANDATORY; (3) reject spoofing attempts and log an alert; (4) local dev may use ConnectInfo directly (there is no proxy).

Next Lesson

Lesson 79: Body Limit Per-Route, Refactoring B47. It refactors the body limit pattern (B47 4-layer defense), settles on DefaultBodyLimit per route, covers the decompression layer body limit pitfall (B50), and composes middleware for /products/import.ndjson (10MB) and /products/upload (100MB).
