Lesson 108: Hot Key Detection & Mitigation

Contents

Learning Objectives
Recap: What a Hot Key Is & When Mitigation Is Needed
Detection in Production
Mitigation 1: Replica Read Scaling
Mitigation 2: Local TTL Cache (cachetools)
Mitigation 3: Sharded Counter
Mitigation 4: Multi-Key Fan-Out
Mitigation 5: Random Expire Jitter
Mitigation 6: Request Coalescing (Single-Flight)
Hot Write vs Hot Read: Choosing the Right Strategy
Pre-emptive Design & Circuit Breaker
Continuous Monitoring
Anti-patterns
Summary & Quiz

Learning Objectives

Understand the practical threshold for intervention: when a hot key is a real problem that needs mitigation.
Match each detection tool to its context (continuous production monitoring vs short-lived debugging).
Implement six mitigation strategies with complete Python code.
Choose the right strategy for hot reads vs hot writes.
Design monitoring and alerts that catch hot keys before they become incidents.
Recognize the common anti-patterns that lead to outages.

Recap: What a Hot Key Is & When Mitigation Is Needed

Working definition for this lesson: a key is hot when it accounts for more than 10-20% of total ops on the Redis instance or node that holds it. This threshold is not an absolute rule. On an instance handling 5k ops/s, one key taking 1k ops/s (20%) is worrying; on an instance handling 500k ops/s, the tolerable share may be higher, depending on how much CPU headroom remains.

Two structural consequences:

Single-threaded command loop: Redis executes commands on one thread. A key that takes 20% of ops consumes roughly 20% of that thread's CPU, which raises latency for every other key.
Node concentration in Cluster: each key belongs to exactly one hash slot, and therefore to one node. Adding nodes to the Cluster does not help if the hot key still lives on a single node.

Lesson 18 introduced basic detection and three initial strategies. This lesson does not repeat that material; to review the single-thread and hash slot mechanics, read Lesson 18: Hot Keys, Basic Identification & Handling.

Detection in Production

redis-cli --hotkeys (requires LFU)

A built-in redis-cli mode available since Redis 4. It scans the keyspace and reports the keys with the highest LFU counters. It requires maxmemory-policy to be allkeys-lfu or volatile-lfu; under any other policy Redis does not track access frequency, OBJECT FREQ returns an error, and --hotkeys cannot produce a meaningful result.

# Check the policy and switch if needed (no restart required).
# Note: switching also changes eviction behavior, and LFU counters need time to warm up.
redis-cli CONFIG GET maxmemory-policy
redis-cli CONFIG SET maxmemory-policy allkeys-lfu

# Run --hotkeys
redis-cli --hotkeys
# Illustrative output (the LFU counter is 8-bit, so values never exceed 255):
# hot key found with counter: 255  keyname: product:sku:trending-001
# hot key found with counter: 212  keyname: config:feature-flags
# hot key found with counter: 187  keyname: user:profile:celebrity-123

On Redis Cluster, run --hotkeys separately against each master node to find out which node is imbalanced:

# For a cluster, loop over each master
redis-cli -h node-a -p 6379 --hotkeys
redis-cli -h node-b -p 6379 --hotkeys
redis-cli -h node-c -p 6379 --hotkeys

Note: --hotkeys walks the entire keyspace with SCAN. On a large database (tens of millions of keys) this takes many seconds and still consumes Redis CPU. Do not run it on a schedule; use it when you already have a specific suspicion.

OBJECT FREQ: inspecting a single key

Once you have a suspect key, read its LFU counter directly:

redis-cli OBJECT FREQ product:sku:trending-001
# (integer) 255
# Range is 0-255. The counter is logarithmic and decays over time,
# so 255 means a very high access frequency, not 255 accesses.
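
To see why an 8-bit counter can still rank keys that receive millions of hits, the sketch below simulates the probabilistic increment Redis applies to the LFU counter. It assumes the default lfu-log-factor of 10 and the initial counter value of 5 given to new keys, and it deliberately ignores time-based decay; the function names are illustrative, not part of any Redis client.

```python
import random

LFU_INIT_VAL = 5      # counter assigned to a newly created key
LFU_LOG_FACTOR = 10   # assumed default of the lfu-log-factor setting


def lfu_log_incr(counter: int, rng: random.Random) -> int:
    """One access: increment with a probability that shrinks as the counter grows."""
    if counter >= 255:
        return 255
    baseval = max(counter - LFU_INIT_VAL, 0)
    p = 1.0 / (baseval * LFU_LOG_FACTOR + 1)
    return counter + 1 if rng.random() < p else counter


def simulate_hits(hits: int, seed: int = 0) -> int:
    """Counter value after `hits` accesses, with decay ignored."""
    rng = random.Random(seed)
    counter = LFU_INIT_VAL
    for _ in range(hits):
        counter = lfu_log_incr(counter, rng)
    return counter


for hits in (100, 1_000, 100_000):
    print(hits, simulate_hits(hits))
```

Each extra step of the counter requires many more hits than the previous one, which is why a reading near 255 indicates sustained heavy traffic.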

Application-side tracking (best for continuous production monitoring)

Track per-key throughput inside the application, with no impact on the Redis runtime. This is the best fit for continuous monitoring:

import random
from collections import Counter
from threading import Lock
import redis

r = redis.Redis(host="127.0.0.1", port=6379, decode_responses=True)

_key_counter: Counter = Counter()
_counter_lock = Lock()
_SAMPLE_RATE = 0.01   # sample 1% of requests


def get_with_track(key: str):
    if random.random() < _SAMPLE_RATE:
        with _counter_lock:
            _key_counter[key] += 1
    return r.get(key)


def top_hot_keys(n: int = 10) -> list[tuple[str, int]]:
    """Call periodically (e.g. every 60s) to get the top keys and reset the counter.

    The returned numbers are estimated call counts for the whole window, not ops/s.
    """
    with _counter_lock:
        top = _key_counter.most_common(n)
        _key_counter.clear()
    # Scale the sampled counts back up by the sample rate
    return [(k, int(cnt / _SAMPLE_RATE)) for k, cnt in top]

Use MONITOR only for short debugging sessions (under 10 seconds), because it can cut Redis throughput by as much as 50%. Do not use it on a heavily loaded production instance.

Mitigation 1: Replica Read Scaling

When a hot key is read-heavy, spread the read load across replicas. Replicas receive data from the master asynchronously, so a replica read can be slightly stale, which most use cases tolerate.

Redis Cluster: READONLY mode

In Redis Cluster, reads are redirected to the master by default. To read from a replica, the client must send READONLY on the connection to that replica before issuing reads. redis-py handles this when read_from_replicas=True:

import redis
from redis.cluster import RedisCluster, ClusterNode

# Create the cluster client with read_from_replicas=True
rc = RedisCluster(
    startup_nodes=[ClusterNode("127.0.0.1", 7000)],
    decode_responses=True,
    read_from_replicas=True,   # redis-py >= 4.1
)

# Read commands are now spread across replicas automatically
value = rc.get("product:sku:trending-001")

redis-py 4.1+ supports read_from_replicas=True in RedisCluster: read commands are distributed round-robin across the primary and its replicas.

Sentinel / Standalone: connecting to replicas directly

With a Sentinel or standalone master-replica setup, the application keeps a separate connection pool for each replica. The hostnames below are static for brevity; under Sentinel, roles can change after a failover, and redis-py's Sentinel.slave_for can discover replicas instead.

import redis
import random

master = redis.Redis(host="redis-master", port=6379, decode_responses=True)
replicas = [
    redis.Redis(host="redis-replica-1", port=6379, decode_responses=True),
    redis.Redis(host="redis-replica-2", port=6379, decode_responses=True),
]


def get_with_replica_routing(key: str):
    """Route the read to a random replica."""
    conn = random.choice(replicas)
    return conn.get(key)


def set_value(key: str, value: str, ex: int = 300):
    """Writes always go to the master."""
    master.set(key, value, ex=ex)

Limits of replica read scaling

Writes still concentrate on the master, so this does not solve a hot write key.
Replication lag: a replica can be stale by a few ms to a few tens of ms under normal network conditions.
The number of replicas has a practical ceiling (3-5 is common); it does not scale indefinitely because replication fan-out consumes master bandwidth.
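
One way to keep an eye on the replication lag mentioned above is to compare replication offsets reported by INFO replication on the master. The sketch below is a pure function over that reply; it assumes the dictionary shape redis-py produces for `master.info("replication")` (a `master_repl_offset` field plus `slave0`, `slave1`, ... entries parsed into dicts with `ip`, `port`, and `offset`). Treat that shape, and the function name, as assumptions to verify against your client version.

```python
def replica_offsets_behind(info: dict) -> dict[str, int]:
    """Bytes of replication stream each replica still has to apply."""
    master_offset = int(info["master_repl_offset"])
    behind: dict[str, int] = {}
    for field, value in info.items():
        # Replica entries look like "slave0", "slave1", ... with a dict value
        if field.startswith("slave") and isinstance(value, dict):
            addr = f"{value['ip']}:{value['port']}"
            behind[addr] = master_offset - int(value["offset"])
    return behind


# Hand-written sample in the assumed shape (not captured from a real server)
sample = {
    "role": "master",
    "connected_slaves": 2,
    "slave0": {"ip": "10.0.0.2", "port": 6379, "state": "online", "offset": 9950, "lag": 0},
    "slave1": {"ip": "10.0.0.3", "port": 6379, "state": "online", "offset": 10000, "lag": 1},
    "master_repl_offset": 10000,
}
print(replica_offsets_behind(sample))
```

A replica that stays far behind is a poor target for hot reads; routing logic could skip it until the gap closes.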

Mitigation 2: Local TTL Cache (cachetools)

Cache the hot value in the memory of the application process. Redis is called only on a local cache miss, typically once every ttl seconds. For a hot key read 50k times/s by one process with a local TTL of 10s, roughly 1 call per 10s reaches Redis for that key instead of 500,000, a reduction of about 99.9998%.
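
The reduction figure follows from simple arithmetic, sketched here under the assumption that exactly one Redis call is made per TTL window per process (the helper name is illustrative):

```python
def redis_call_reduction(reads_per_sec: float, ttl_sec: float) -> float:
    """Fraction of reads that no longer reach Redis with one refresh per TTL window."""
    reads_per_window = reads_per_sec * ttl_sec
    return 1.0 - 1.0 / reads_per_window


# 50k reads/s with a 10s TTL: 500,000 reads per window, 1 of them goes to Redis
print(f"{redis_call_reduction(50_000, 10):.4%}")
```

With several application instances, each instance makes its own call per window, so the total is one call per instance rather than one overall.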

Use cachetools.TTLCache. cachetools caches are not thread-safe, so concurrent reads and writes need their own lock:

import threading
import redis
from cachetools import TTLCache

r = redis.Redis(host="127.0.0.1", port=6379, decode_responses=True)

# maxsize=1000: keep at most 1000 keys in the local cache
# ttl=10: each entry expires after 10 seconds
_local: TTLCache = TTLCache(maxsize=1000, ttl=10)
_local_lock = threading.Lock()


def get_hot(key: str):
    with _local_lock:
        if key in _local:
            return _local[key]   # local cache hit: no Redis call

    # Local cache miss: go to Redis
    value = r.get(key)

    with _local_lock:
        if value is not None:
            _local[key] = value

    return value

Points to watch when using a local cache:

Short TTL (10-60s): bounds the stale window. A key updated in Redis reaches the local cache after at most ttl seconds. Do not use very long TTLs (hours) for data that can change.
Choose maxsize carefully: if it is too large, the local cache takes a lot of heap memory. Watch the process RSS.
Stale data scope: each application instance has its own local cache. After a Redis update, different instances can return the old value for up to ttl seconds. That is acceptable for most use cases but not when strong consistency is required.
Cache invalidation: to invalidate immediately (without waiting for the TTL), combine the cache with keyspace notifications or Pub/Sub to broadcast a signal to every instance; see Lesson 17 on client-side caching.

If you prefer not to depend on cachetools, a plain dict with a timestamp check also works, but cachetools already evicts least-recently-used entries when maxsize is reached, which avoids a memory leak.
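
For comparison, here is a minimal sketch of the plain-dict alternative with a timestamp check. The class name and the injectable clock are illustrative choices, the eviction rule is simple oldest-inserted-first rather than true LRU, and the class is not thread-safe, so it would need the same external lock as the cachetools version.

```python
import time
from typing import Any, Callable, Optional


class DictTTLCache:
    """Tiny TTL cache on a plain dict; evicts the oldest inserted entry when full."""

    def __init__(self, ttl: float, maxsize: int = 1000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._maxsize = maxsize
        self._clock = clock
        self._data: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]       # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        if key not in self._data and len(self._data) >= self._maxsize:
            oldest = next(iter(self._data))   # dicts preserve insertion order
            del self._data[oldest]
        self._data[key] = (self._clock() + self._ttl, value)


cache = DictTTLCache(ttl=10, maxsize=1000)
cache.set("config:feature-flags", "v1")
print(cache.get("config:feature-flags"))
```

The bounded size is the part that is easy to forget in a hand-rolled version; without it, every distinct key ever read stays in memory.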

Mitigation 3: Sharded Counter

The most common hot write pattern is a counter: post views, like counts, download counts. Every request issues INCR article:views:12345, so a single key receives all of the write traffic.

The fix: split one counter into N shard counters. Each write picks a random shard; each read aggregates all N shards. The write load spreads evenly over N keys, which can live on different nodes in a Cluster.

import random
import redis

r = redis.Redis(host="127.0.0.1", port=6379, decode_responses=True)

N_SHARDS = 16  # spread writes across 16 shards


def incr_sharded(base_key: str, amount: int = 1) -> None:
    """Write to a random shard: O(1), no need to know which shard."""
    shard = random.randint(0, N_SHARDS - 1)
    # INCRBY keeps the counter an integer (INCRBYFLOAT would store a float string)
    r.incrby(f"{base_key}:shard:{shard}", amount)


def get_sharded_total(base_key: str) -> int:
    """Read and sum every shard; a pipeline reduces round-trips."""
    keys = [f"{base_key}:shard:{i}" for i in range(N_SHARDS)]
    pipe = r.pipeline()
    for k in keys:
        pipe.get(k)
    values = pipe.execute()
    # Shards that were never written come back as None and count as 0
    return sum(int(v) for v in values if v is not None)

Trade-offs of a sharded counter:

Read cost: reading the total takes N GETs (a pipeline cuts this to one round-trip on a single node, but N responses still have to be processed). With N=16 and reads rarer than writes, this is entirely acceptable.
Eventual consistency: there is no transactional atomicity across the N shards, so the total at any instant is approximate. This suits analytics counters; it does not suit account balances.
Cluster hash slots: <base_key>:shard:0 and <base_key>:shard:1 fall into different slots unless a hash tag is used. This is the desired behavior, since spreading across nodes is the goal.

Ideal N_SHARDS: at least the number of master nodes in the Cluster, and preferably a small multiple of it. Keys map to slots by hash, so N equal to the node count does not guarantee one shard per node; a larger N evens out the distribution. On standalone Redis, sharding a counter does not reduce load: all N keys are served by the same command thread, so only the read cost goes up.
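
To check how a given N_SHARDS actually spreads over nodes, you can compute the hash slot of each shard key offline. The sketch below uses the CRC16 (XMODEM) mod 16384 rule from the Redis Cluster specification, including hash tag handling. The three-node slot ranges are a hypothetical even split for illustration; a real cluster reports its own ranges (for example via CLUSTER SLOTS).

```python
import binascii
from collections import Counter

# Hypothetical slot ranges for a 3-master cluster; replace with the real layout.
SLOT_RANGES = {
    "node-a": (0, 5460),
    "node-b": (5461, 10922),
    "node-c": (10923, 16383),
}


def key_slot(key: str) -> int:
    """Hash slot of a key: CRC16-XMODEM of the key (or its hash tag) mod 16384."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:   # non-empty tag between the braces
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


def node_for_slot(slot: int) -> str:
    for node, (low, high) in SLOT_RANGES.items():
        if low <= slot <= high:
            return node
    raise ValueError(f"slot {slot} is not covered by SLOT_RANGES")


def shards_per_node(base_key: str, n_shards: int) -> Counter:
    """How many shard keys of a counter land on each node."""
    return Counter(
        node_for_slot(key_slot(f"{base_key}:shard:{i}")) for i in range(n_shards)
    )


print(shards_per_node("article:views:12345", 16))
```

If one node ends up with most of the shards for an important counter, raising N_SHARDS is usually simpler than hand-picking key names.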

Mitigation 4: Multi-Key Fan-Out

For a hot read key with a large, frequently read value: instead of one key, create N copies. Each read picks a random copy, which spreads read ops across N keys that fall into different slots in a Cluster.

import random
import redis

r = redis.Redis(host="127.0.0.1", port=6379, decode_responses=True)

N_COPIES = 10  # number of copies


def get_hot_fanout(base_key: str) -> str | None:
    """Read a random copy; fall back to the canonical key if the copy is missing."""
    copy_idx = random.randint(0, N_COPIES - 1)
    value = r.get(f"{base_key}:copy:{copy_idx}")
    if value is None:
        # Copy not populated yet (cold start): read the canonical key
        value = r.get(base_key)
    return value


def set_hot_fanout(base_key: str, value: str, ttl: int = 300) -> None:
    """Write the canonical key and all N copies in one pipeline."""
    pipe = r.pipeline()
    pipe.set(base_key, value, ex=ttl)
    for i in range(N_COPIES):
        pipe.set(f"{base_key}:copy:{i}", value, ex=ttl)
    pipe.execute()


def delete_hot_fanout(base_key: str) -> None:
    """Delete the canonical key and every copy together."""
    keys = [base_key] + [f"{base_key}:copy:{i}" for i in range(N_COPIES)]
    r.delete(*keys)

Trade-offs and when to apply it:

Write cost: every write must update N+1 keys. This fits workloads where reads far outnumber writes (config flags, product info, banner content).
Cache invalidation: all N copies must be deleted or updated on invalidation. Any copy left behind keeps serving stale data until its TTL expires.
Cold start: copies hold a value only after the first set_hot_fanout call. Until then, reads fall back to the canonical key.
Cluster distribution: do not use a hash tag on the fan-out keys; let them hash into different slots. Spreading across slots is the goal.

Fan-out versus local cache: a fan-out read is still a Redis call (just routed to a different node), while a local cache removes the Redis call entirely. Combine both for critical hot keys: a local cache with a 5s TTL cuts the absolute number of calls, and fan-out ensures the calls that still reach Redis do not all land on one node.

Mitigation 5: Random Expire Jitter

When many hot keys are set with the same TTL (for example, every cache entry expires after 300s), they expire together at the same moment and cause a thundering herd: hundreds of requests miss the cache at once and fall through to the database.

A simple fix is to add random jitter to the TTL:

import random
import redis

r = redis.Redis(host="127.0.0.1", port=6379, decode_responses=True)

BASE_TTL = 300    # 5 minutes
JITTER_MAX = 60   # up to 60 seconds of jitter


def set_with_jitter(key: str, value: str) -> None:
    ttl = BASE_TTL + random.randint(0, JITTER_MAX)
    r.set(key, value, ex=ttl)
    # Result: keys expire spread over the 300-360s range,
    # with no synchronized expiry wave

Jitter breaks the expiry synchronization. With 1000 hot keys and 0-60s of jitter:

No jitter: 1000 keys expire in the same second, producing 1000 concurrent cache misses and a database spike.
60s jitter: about 16 keys expire per second over roughly 60 seconds, so the database sees a steady load instead of a spike.

Note: jitter matters most when keys are set at the same moment (for example during a bulk warm-up or a deployment). If keys are already set at evenly spread times, a thundering herd is less likely in the first place.
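
The effect of jitter on the expiry distribution can be checked with a short offline simulation. This sketch assumes all keys are written at t=0 and uses a seeded random generator so runs are repeatable; the function name is illustrative.

```python
import random
from collections import Counter


def expiry_histogram(n_keys: int, base_ttl: int, jitter_max: int, seed: int = 0) -> Counter:
    """Number of keys expiring at each second, for keys all written at t=0."""
    rng = random.Random(seed)
    return Counter(base_ttl + rng.randint(0, jitter_max) for _ in range(n_keys))


no_jitter = expiry_histogram(1000, base_ttl=300, jitter_max=0)
with_jitter = expiry_histogram(1000, base_ttl=300, jitter_max=60)

print("worst second without jitter:", max(no_jitter.values()))
print("worst second with jitter:   ", max(with_jitter.values()))
```

The worst-case second is the number to watch, since that is the burst the database has to absorb.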

Mitigation 6: Request Coalescing (Single-Flight)

Request coalescing (also called single-flight or request deduplication) addresses this problem: many concurrent requests read the same missing key, and all of them hit Redis (or the database on a cache miss) at the same time. With coalescing, only one request actually goes to Redis; the others wait and share its result.

import threading
import redis
from concurrent.futures import Future
from typing import Optional

r = redis.Redis(host="127.0.0.1", port=6379, decode_responses=True)

_pending: dict[str, Future] = {}
_pending_lock = threading.Lock()


def get_coalesced(key: str) -> Optional[str]:
    """
    If another request is already fetching this key, wait and share its result.
    Otherwise fetch the key and publish the result to every waiter.
    """
    with _pending_lock:
        if key in _pending:
            future = _pending[key]
            is_leader = False
        else:
            future = Future()
            _pending[key] = future
            is_leader = True

    if not is_leader:
        # Wait for the leader to finish (raises TimeoutError after 5s)
        return future.result(timeout=5.0)

    # Leader: perform the Redis call
    try:
        value = r.get(key)
        future.set_result(value)
        return value
    except Exception as exc:
        future.set_exception(exc)
        raise
    finally:
        with _pending_lock:
            _pending.pop(key, None)

When to apply it, and its limits:

Most effective when many threads or coroutines in the same process request the same key within a short interval (for example within 1-2ms of a cache miss).
It coalesces only within one process, not across application instances. Combine it with a local cache to reduce how often Redis is reached at all.
Do not use it when every request needs the absolute latest data (real-time trading, billing), because waiters receive the value as of the leader's call.
With an async framework (asyncio, trio), replace Future with asyncio.Future and the lock with asyncio.Lock.

Hot Write vs Hot Read: Choosing the Right Strategy

Not every hot key behaves the same way. Telling hot writes from hot reads helps you pick the right strategy:

Pattern              | Hot type  | Suitable strategy
---------------------|-----------|-------------------------------------------
Counter (post views) | Hot write | Sharded counter
Global leaderboard   | Hot write | Per-period bucket (ZADD per time window)
Config/feature flag  | Hot read  | Local cache (TTL 30-60s)
Popular product info | Hot read  | Replica read + local cache
Trending list        | Hot read  | Pre-compute, fan-out, local cache
Pub/Sub hot channel  | Hot write | Sharded Pub/Sub (Redis 7+)

Hot write: patterns in detail

Distributed counter: use a sharded counter as described in Mitigation 3. Size N_SHARDS relative to the number of master nodes; because slots are assigned by hash, N equal to the node count does not guarantee one shard per node, so a multiple of it is safer.

Per-period leaderboard: instead of one all-time leaderboard (a single hot ZSET), create a leaderboard per time window, such as leaderboard:2026-06-01:hour:14. When a window closes, merge the snapshot and archive it. Each period gets a new ZSET, so load does not accumulate on one key forever.
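
The window key can be derived from the event timestamp. The sketch below follows the key format of the example above; normalizing to UTC and zero-padding the hour are assumptions made here so that every instance computes the same key, and the helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def leaderboard_key(ts: datetime) -> str:
    """Key of the hourly leaderboard window that contains `ts` (UTC)."""
    ts = ts.astimezone(timezone.utc)
    return f"leaderboard:{ts:%Y-%m-%d}:hour:{ts:%H}"


def recent_leaderboard_keys(now: datetime, hours: int) -> list[str]:
    """Keys of the last `hours` windows, newest first, e.g. as merge inputs."""
    return [leaderboard_key(now - timedelta(hours=i)) for i in range(hours)]


now = datetime(2026, 6, 1, 14, 30, tzinfo=timezone.utc)
print(leaderboard_key(now))
print(recent_leaderboard_keys(now, 3))
```

Writes then go to the current window's key only, and a reader that needs a longer range merges the windows it cares about.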

Hot read: patterns in detail

Config flag: written very rarely (a new deploy, a feature toggle) but read on every request, so a local cache with a 30-60s TTL is enough. The miss cost is low (one Redis read on a local miss) and the stale window is acceptable.

Popular product info: combine replica reads (spreading the Redis load) with a local TTL cache (cutting the number of Redis calls). Before a flash sale, pre-load the local cache.

Pre-emptive Design & Circuit Breaker

Pre-emptive design

Some hot keys are predictable: a celebrity account launching a product, a scheduled flash sale, a live broadcast event. In these cases, do not wait for the hot key to appear before acting:

Pre-shard: enable the sharded counter or fan-out before the event, not after the incident.
Local cache warm-up: ahead of the expected spike, prefetch the key into the local cache of every instance.
Load test the viral scenario: simulate one key receiving 100x its normal traffic and measure throughput and latency.

Circuit breaker for hot keys

When a hot key crosses a dangerous threshold and no mitigation is in place, a circuit breaker switches to a degraded mode instead of letting Redis overload:

import time
import redis

r = redis.Redis(host="127.0.0.1", port=6379, decode_responses=True)

# Not used in this minimal sketch, which trips on connection errors only;
# a fuller version would also trip when measured ops/s exceeds this value.
_hot_key_threshold_ops_per_sec = 5000
_circuit_open: dict[str, float] = {}  # key -> time the circuit opened
_CIRCUIT_OPEN_DURATION = 30.0         # seconds


def get_with_circuit(key: str, fallback_value=None):
    now = time.monotonic()
    # Check whether the circuit is still open
    if key in _circuit_open:
        if now - _circuit_open[key] < _CIRCUIT_OPEN_DURATION:
            # Circuit open: return the fallback without calling Redis
            return fallback_value
        else:
            del _circuit_open[key]  # try closing the circuit again

    try:
        return r.get(key)
    except (redis.exceptions.ConnectionError, redis.exceptions.TimeoutError):
        _circuit_open[key] = now
        return fallback_value

This circuit breaker is only a simple illustration of the pattern. Production use needs error-rate tracking and a half-open state. Existing frameworks: pybreaker (Python), resilience4j (Java).
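
As a rough idea of what the half-open state adds, here is a small self-contained sketch. It is not the pybreaker API: the class name, parameters, and injectable clock are illustrative, it counts consecutive failures rather than an error rate, and it is not thread-safe.

```python
import time
from typing import Any, Callable, Optional


class HotKeyBreaker:
    """closed -> open after N consecutive failures -> half-open after a cooldown."""

    def __init__(self, failure_threshold: int = 5, open_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold = failure_threshold
        self._open_seconds = open_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self._open_seconds:
            return "half-open"
        return "open"

    def call(self, fn: Callable[[], Any], fallback: Any = None) -> Any:
        state = self.state()
        if state == "open":
            return fallback               # short-circuit: do not touch the backend
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # A failed probe in half-open re-opens immediately
            if state == "half-open" or self._failures >= self._threshold:
                self._opened_at = self._clock()
            return fallback
        self._failures = 0                # success closes the circuit
        self._opened_at = None
        return result


breaker = HotKeyBreaker(failure_threshold=3, open_seconds=30.0)
print(breaker.call(lambda: "live value", fallback="stale value"))
```

In an application, `fn` would wrap the Redis read for the hot key and `fallback` would be the last known value or a safe default.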

Continuous Monitoring

Hot keys usually appear suddenly, triggered by an unusual event. Continuous monitoring catches them before they become incidents:

Metrics to track

Per-key ops/s (application-side sampling): alert when one key exceeds 10% of total ops.
Node CPU distribution in Cluster: alert when one node's CPU is more than 2x the average.
P99 latency per node: a latency spike confined to one node is a sign of a hot key.
Redis INFO stats, instantaneous_ops_per_sec: total ops; compare per node to detect imbalance.
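
The per-node imbalance rule from the list above can be expressed as a small check. This sketch assumes you already collect one CPU (or ops/s) reading per node from your metrics system; the function name and the sample numbers are illustrative.

```python
from statistics import mean


def overloaded_nodes(load_by_node: dict[str, float], factor: float = 2.0) -> list[str]:
    """Nodes whose load exceeds `factor` times the cluster-wide average."""
    average = mean(load_by_node.values())
    return sorted(node for node, load in load_by_node.items() if load > factor * average)


# Illustrative CPU percentages, not measurements
cpu = {"node-a": 92.0, "node-b": 11.0, "node-c": 9.0, "node-d": 12.0}
print(overloaded_nodes(cpu))
```

The same function works on instantaneous_ops_per_sec values if CPU is not available per node.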

Practical alert threshold

# Alert logic built on top of top_hot_keys()
def check_and_alert(base_ops_per_sec: int = 10000, window_sec: int = 60) -> None:
    """
    Call every window_sec seconds. Alert if one key exceeds 10% of total ops.
    base_ops_per_sec: estimated total Redis throughput.
    """
    top = top_hot_keys(n=5)
    threshold = base_ops_per_sec * 0.10   # 10%

    for key, estimated_calls in top:
        # top_hot_keys() returns calls per window; convert to ops/s before comparing
        estimated_ops = estimated_calls / window_sec
        if estimated_ops > threshold:
            print(f"[ALERT] Hot key detected: {key!r} ~ {estimated_ops:.0f} ops/s "
                  f"({estimated_ops/base_ops_per_sec:.1%} of total)")

Dashboard top keys

Push the top keys into a Prometheus Gauge or a Datadog histogram with a key_name label, then graph the top 10 keys by ops/s. In real production systems, use probabilistic key tracking (Count-Min Sketch) to avoid a cardinality explosion; see Lesson 87 on Count-Min Sketch.
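
For readers who have not reached that lesson yet, this is a minimal Count-Min Sketch in plain Python. It is a teaching sketch with assumed sizes (width 2048, depth 4) and a salted BLAKE2 hash per row; it uses fixed memory regardless of how many distinct keys are seen, and its estimates can be too high but never too low.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency estimator: depth rows of width counters each."""

    def __init__(self, width: int = 2048, depth: int = 4) -> None:
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _indexes(self, key: str):
        for row in range(self.depth):
            # A different salt per row gives independent-looking hash functions
            digest = hashlib.blake2b(
                key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, key: str, count: int = 1) -> None:
        for row, idx in self._indexes(key):
            self.rows[row][idx] += count

    def estimate(self, key: str) -> int:
        """Upper-biased estimate: the minimum counter across all rows."""
        return min(self.rows[row][idx] for row, idx in self._indexes(key))


sketch = CountMinSketch()
sketch.add("product:sku:trending-001", 1000)
for i in range(500):
    sketch.add(f"user:profile:{i}")
print(sketch.estimate("product:sku:trending-001"))
```

Only the keys whose estimate crosses the alert threshold need to be exported as labeled metrics, which keeps label cardinality bounded.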

Anti-patterns

Ignoring the hot key signal: one node at 100% CPU while the others sit idle is an unmistakable sign. Leaving it unhandled leads to an outage once traffic grows further.
Using fan-out for hot writes: fan-out multiplies every write by N. If writes are also hot, fan-out makes things worse. A sharded counter is the right tool for a hot write counter.
Local cache with an infinite TTL: stale data never expires, so users see old values. Always set a reasonable TTL.
Hot key on the critical path with no fallback: if Redis overloads and the hot key read fails, the request fails with it. Keys on the critical path need a fallback (stale value, default value).
Adding Cluster nodes without addressing the hot key: scaling out solves overall capacity, not the imbalance caused by one hot key, because that key still lives on a single node.
Not monitoring distribution: watching only total Redis ops, without per-key or per-node breakdowns, means the imbalance goes unnoticed until a node spikes.

Summary & Quiz

Summary

Hot key: one key taking more than 10-20% of ops, which bottlenecks the single command thread and imbalances nodes in a Cluster.
Production detection: redis-cli --hotkeys (requires an LFU policy; use for targeted debugging); application-side sampling (suited to continuous monitoring, no impact on Redis).
Hot read: replica read scaling (offloads the master) + local TTL cache (fewer Redis calls) + fan-out (spreads slots in a Cluster).
Hot write counter: sharded counter (N shards, aggregated on read).
Thundering herd on expiry: random TTL jitter.
Concurrent miss storm: request coalescing (single-flight per process).
Pre-emptive design: shard or warm the local cache before a viral event happens.
Monitoring: track per-key ops/s and per-node CPU; alert when one key exceeds 10% of total ops.

Quiz: 5 questions

What does redis-cli --hotkeys require? If the instance uses allkeys-lru, is the result meaningful? Why?
A config flag is read on every API request (100k req/s) and written every few hours on deploy. Which strategy fits best, and why? What local cache TTL is reasonable?
You deploy multi-key fan-out (N=10) for a hot read key. A developer forgets to call delete_hot_fanout and calls only r.delete(base_key). What happens?
A sharded counter has 16 shards, and each write picks a random shard. To read the total, the code pipelines GET over 16 keys. Why not use MGET instead of a pipeline? (Hint: think about Cluster.)
Request coalescing works only within one process. With an application running 20 instances, what else do you need to further reduce Redis calls when all 20 instances miss the same hot key?

Suggested answers

It requires maxmemory-policy to be allkeys-lfu or volatile-lfu. With allkeys-lru, Redis does not track access frequency: OBJECT FREQ returns an error instead of a counter, so --hotkeys cannot identify which keys are actually hot.
A local TTL cache fits best: writes are rare, so a short stale window (30-60s) is acceptable, and Redis no longer has to serve 100k ops/s for this key. A TTL of 30-60s is short enough for a new feature flag to take effect within about a minute of the deploy.
The canonical key is deleted, but the 10 copies remain with their old TTLs. Subsequent reads pick a random copy and receive stale data until that copy's TTL expires. A later write recreates the canonical key, but only a full set_hot_fanout refreshes the copies; otherwise the old ones live on until their TTL. Invalidation must delete all N+1 keys.
In Redis Cluster, MGET requires all keys to be in the same hash slot (or the client must route each batch by slot itself). The shard keys base_key:shard:0..15 use no hash tag, so they fall into different slots and MGET on a Cluster fails with a CROSSSLOT error. A pipeline in redis-py Cluster mode routes each GET to the right node.
Add a local TTL cache: each instance caches the value after its first trip to Redis and serves later reads from memory until the TTL expires. With 20 instances, each one misses once per TTL window, so Redis sees about 20 calls per window for that key instead of one call per request.

Next lesson

Lesson 109 examines fork and Copy-on-Write (COW): why BGSAVE and BGREWRITEAOF cause latency spikes, and how to measure and reduce them in production.
