Week 5 - Day 4: Amazon ElastiCache
Learning objectives
- Distinguish ElastiCache for Redis from ElastiCache for Memcached
- Understand cluster mode, replication, and failover
- Learn the caching patterns: lazy loading and write-through
- Know when to use ElastiCache vs DAX
1. ElastiCache Overview
ElastiCache is a managed in-memory data store that supports the Redis and Memcached engines.
Use cases
- Database caching (giảm load DB)
- Session storage (user session, shopping cart)
- Leaderboards (gaming)
- Real-time analytics
- Message broker (Redis Pub/Sub)
- Rate limiting
Why use a cache
- Lower latency (memory vs disk)
- Less load on the DB (cost optimization)
- Higher throughput
2. ElastiCache for Redis
Characteristics
- In-memory key-value + data structures: strings, lists, sets, hashes, sorted sets, streams
- Persistence: snapshots stored in S3; AOF (append-only file) is a Redis feature, but ElastiCache offers it only on older engine versions
- Replication: read replicas (up to 5)
- Multi-AZ with automatic failover
- Encryption in-transit + at-rest
- Automated backups
- Pub/Sub messaging
2 deployment modes
Cluster mode disabled (default)
- 1 primary node + 0-5 read replicas
- Single shard
- Multi-AZ failover
Cluster mode enabled
- Sharding: data split across multiple shards (up to 500)
- Each shard: 1 primary + 0-5 replicas
- Total nodes = shards × (1 + replicas)
- Use case: data > single node memory, need horizontal scale
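As a quick check of the node-count formula above, here is a minimal sketch. The function name `total_nodes` and the constant `MAX_NODES_PER_CLUSTER` are illustrative assumptions; the 500-node ceiling is the figure quoted in these notes and should be verified against current AWS quotas.

```python
MAX_NODES_PER_CLUSTER = 500  # figure quoted in these notes; verify against AWS quotas


def total_nodes(shards: int, replicas_per_shard: int) -> int:
    """Total nodes = shards x (1 primary + replicas per shard)."""
    if shards < 1:
        raise ValueError("a cluster needs at least one shard")
    if not 0 <= replicas_per_shard <= 5:
        raise ValueError("each shard supports 0-5 replicas")
    return shards * (1 + replicas_per_shard)


# 3 shards, each with 1 primary + 2 replicas
print(total_nodes(3, 2))  # 9

# 83 shards with 5 replicas each still fits under the node ceiling
print(total_nodes(83, 5) <= MAX_NODES_PER_CLUSTER)  # True
```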
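Architecture (Cluster Mode Disabled)
Clients → single shard: 1 primary (reads and writes) + 0-5 read replicas (reads only), with replicas in other AZs for failover
Architecture (Cluster Mode Enabled)
Clients → shard 1 ... shard N; each shard has 1 primary + 0-5 replicas and holds only part of the data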
3. ElastiCache for Memcached
Characteristics
- Pure key-value cache (no data structures)
- Multi-threaded architecture (good for multi-core)
- Sharding: client-side (no built-in replication)
- No persistence (data lost on restart)
- No multi-AZ failover
- No backup
- Simpler, with fewer features than Redis
Architecture
Client uses consistent hashing:
Key → hash → Node 1, 2, 3, ...
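To make the client-side hashing above concrete, the following is a toy consistent-hash ring. It is a sketch, not how any particular Memcached client implements it: the class name `HashRing`, the use of SHA-256, and the 100 virtual nodes per server are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash value, node name)
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.sha256(value.encode()).hexdigest(), 16)

    def add_node(self, node: str, vnodes: int = 100) -> None:
        # Each server is placed on the ring many times to even out the load.
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or after the key's hash.
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-1", "node-2", "node-3"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}

ring.add_node("node-4")
moved = [k for k in keys if ring.get_node(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding node-4")
```

The point of the ring is that adding a node only remaps the keys that now belong to the new node; with naive `hash(key) % node_count`, most keys would move and the cache would go cold.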
When to use Memcached
- Simple cache (no advanced features)
- Multi-threaded performance critical
- Cache can be rebuilt (data loss OK)
When to use Redis
- Data persistence required
- Need advanced data structures (sorted sets, hashes)
- Pub/Sub messaging
- Multi-AZ HA needed
- Cluster mode for sharding
4. Redis vs Memcached
| Feature | Redis | Memcached |
|---|---|---|
| Data structures | Strings, lists, sets, hashes, sorted sets, streams | String only |
| Persistence | Yes (snapshot + AOF) | No |
| Replication | Master-replica | No (client sharding only) |
| Multi-AZ | Yes (automatic failover) | No |
| Backup/Restore | Yes | No |
| Encryption | Yes (TLS in transit, KMS at rest) | TLS in transit on newer engine versions (1.6.12+); at-rest support is more limited than for Redis |
| Pub/Sub | Yes | No |
| Transactions | Yes (MULTI/EXEC) | No |
| Sharding | Cluster mode enabled | Client-side |
| Multi-threaded | No (Redis 6 added I/O threads) | Yes |
| Use case | Complex caching, sessions, leaderboards | Simple caching, ephemeral data |
2025 trend: Redis is the more widely used engine; Memcached usage is declining.
5. Caching Patterns
5.1 Lazy Loading (Cache-Aside)
def get_user(user_id):
    # 1. Check cache
    user = cache.get(user_id)
    if user is not None:
        return user  # Cache HIT
    # 2. Cache miss → read from DB
    user = db.get(user_id)
    # 3. Write to cache
    cache.set(user_id, user)
    return user
Pros
- Only cache requested data (less memory)
- Cache failure ≠ app failure (read from DB)
Cons
- Cache miss penalty (3 trips: cache → DB → cache → app)
- Stale data possible (if DB updated, cache outdated)
- Cold start slow
5.2 Write-Through
def update_user(user_id, data):
    # 1. Write to DB
    db.update(user_id, data)
    # 2. Then write the same data to the cache (once the DB write has succeeded)
    cache.set(user_id, data)
Pros
- Cache always fresh
- No cache miss penalty for data that has been written through (keys never written, or already evicted, still miss)
Cons
- Write latency higher (2 writes)
- Cache might contain rarely-read data (wasted memory)
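The contrast between a write-through update and a write that bypasses the cache can be sketched with plain dictionaries. Everything here is a stand-in: `db` and `cache` are ordinary dicts rather than a real database or ElastiCache client, and the function names are hypothetical.

```python
db = {}     # stand-in for the database
cache = {}  # stand-in for the cache


def update_user_write_through(user_id, data):
    db[user_id] = data      # 1. write to the DB first
    cache[user_id] = data   # 2. then refresh the cache with the same data


def update_user_db_only(user_id, data):
    db[user_id] = data      # the cache is not touched, so it may go stale


def get_user(user_id):
    if user_id in cache:    # cache hit
        return cache[user_id]
    user = db.get(user_id)  # cache miss: fall back to the DB
    if user is not None:
        cache[user_id] = user
    return user


update_user_write_through("u-001", {"name": "An"})
print(get_user("u-001"))  # {'name': 'An'}, served from the cache

update_user_write_through("u-001", {"name": "Binh"})
print(get_user("u-001"))  # {'name': 'Binh'}, cache was refreshed on write

update_user_db_only("u-001", {"name": "Chi"})
print(get_user("u-001"))  # {'name': 'Binh'}, stale: the write skipped the cache
```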
5.3 TTL (Time To Live) + Lazy Loading
- Set TTL on cached items (e.g., 5 minutes)
- Auto-expire stale data
- Combine with lazy loading: after the TTL expires, the next read pulls fresh data from the DB
Common pattern: Lazy loading + TTL
- Best of both worlds
- Simple to implement
- Acceptable staleness (5-10 min)
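A minimal sketch of lazy loading combined with a TTL, using an in-memory stand-in so it runs without a cache server. `TTLCache`, its injectable `clock`, and the 300-second default are assumptions for illustration; with a real Redis client the expiry would be handled by the server.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache with per-key expiry (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at)
        self._clock = clock  # injectable so expiry can be simulated

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


def get_user(user_id, cache, db, ttl=300):
    user = cache.get(user_id)
    if user is not None:
        return user                # cache hit
    user = db[user_id]             # cache miss: read from the DB
    cache.set(user_id, user, ttl)  # populate the cache with a TTL
    return user


now = [0.0]                        # fake clock, in seconds
cache = TTLCache(clock=lambda: now[0])
db = {"u-001": "v1"}

print(get_user("u-001", cache, db))  # v1 (miss, loaded from the DB)
db["u-001"] = "v2"                   # the DB changes behind the cache
print(get_user("u-001", cache, db))  # v1 (stale hit, TTL not yet expired)
now[0] = 301.0                       # advance past the 300 s TTL
print(get_user("u-001", cache, db))  # v2 (entry expired, reloaded from the DB)
```

The staleness window is bounded by the TTL, which is why the TTL should match how much staleness the application can accept.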
6. Cache Replacement (Eviction)
Redis maxmemory policies
- noeviction: error when full
- allkeys-lru: evict least recently used (any key)
- volatile-lru: evict LRU among keys with TTL set
- allkeys-random, volatile-random
- volatile-ttl: evict keys with shortest TTL first
- allkeys-lfu, volatile-lfu (Redis 4+): least frequently used
Memcached
- LRU only
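To illustrate what LRU eviction means, here is a small exact-LRU cache built on `OrderedDict`. It is a conceptual sketch only: the class name `LRUCache` is hypothetical, and Redis itself uses an approximated LRU based on sampling rather than a strict ordering like this.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU cache with a fixed capacity (conceptual sketch)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")         # "a" is now more recent than "b"
cache.set("c", 3)      # over capacity: "b" is evicted
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```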
7. Redis Cluster Mode
Sharding
- 16384 hash slots total
- Distributed across shards
- Key → CRC16(key) % 16384 → slot → shard
Resharding
- Add/remove shards → redistribute slots
- Online operation (no downtime)
Limitations
- Multi-key operations limited (keys must be in same slot)
- Use hash tags {tag} to group keys: user:{u-001}:profile, user:{u-001}:orders
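The key-to-slot mapping and the effect of hash tags can be sketched in a few lines. This assumes Redis Cluster's CRC16 is the XMODEM variant, which Python's `binascii.crc_hqx` computes when started from 0; the function name `key_slot` is hypothetical, and a real client library does this internally.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: str) -> int:
    """Map a key to a hash slot, honoring a {hash tag} if one is present."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" means the whole key is hashed.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


profile = key_slot("user:{u-001}:profile")
orders = key_slot("user:{u-001}:orders")
print(profile == orders)  # True: both keys hash only "u-001"

# Without a hash tag each key is hashed in full, so the slots are unrelated.
print(key_slot("user:u-001:profile"), key_slot("user:u-001:orders"))
```

Because both tagged keys land in the same slot, they live on the same shard and can be used together in multi-key operations.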
8. Security
Encryption in transit (TLS)
- Enable at cluster creation (older engine versions cannot change it later; newer versions, Redis 7+, can enable it on an existing cluster)
Encryption at rest
- KMS encryption
- Snapshots also encrypted
Authentication (Redis)
- Redis AUTH token
- Redis IAM (newer): IAM-based authentication
- Memcached: SASL authentication
Network
- Deploy in VPC
- Security groups control access
- Subnet group selection
9. Backup & Restore (Redis)
Snapshots
- Automated: daily, retention 0-35 days
- Manual: on-demand, indefinite retention
- Stored in S3 (managed)
- Minimal performance impact on the primary when the snapshot is taken from a replica
Restore
- Restore from snapshot to new cluster
- Cross-region restore possible
10. DAX vs ElastiCache
| | DAX | ElastiCache (Redis) |
|---|---|---|
| For | DynamoDB only | Any data source (RDS, app, etc.) |
| API | DynamoDB API (transparent) | Custom (Redis client) |
| Caching pattern | Write-through automatically | Lazy loading or write-through (manual) |
| Cache key | Same as DynamoDB key | Custom |
| Setup | Simple (DynamoDB drop-in) | Custom code needed |
Use case
- DAX: Pure DynamoDB caching with minimal code change (swap the DynamoDB client for the DAX client)
- ElastiCache Redis: General cache, advanced data structures, sessions
11. ElastiCache Serverless (2023+)
Characteristics
- Pay per use (no capacity planning)
- Auto-scale instantly
- Supports both Redis and Memcached
- Min: ~512 MB cache size
- Covers the primary ElastiCache use cases without operational overhead
Use case
- Variable workload
- Dev/test
- New apps (don't know capacity)
12. Common Patterns
Pattern 1: Database cache (lazy loading + TTL)
App → Cache (Redis, lazy load) → RDS
TTL 5 min on cached items
Pattern 2: Session storage
User → ALB (no sticky session) → App tier (any instance)
App reads/writes session in Redis (shared)
Pattern 3: Leaderboard (Redis Sorted Set)
User score event → ZADD leaderboard score user_id
Top 10: ZREVRANGE leaderboard 0 9
Real-time updates with O(log N) performance per ZADD
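A minimal sketch of Pattern 3, assuming a client object with `zadd(name, mapping)` and `zrevrange(name, start, end, withscores=...)` methods shaped like redis-py's. `FakeSortedSets`, `record_score`, and `top_n` are hypothetical names; the fake exists only so the example runs without a Redis server, and it sorts on every read rather than offering Redis's O(log N) updates.

```python
class FakeSortedSets:
    """In-memory stand-in for the two sorted-set calls used below."""

    def __init__(self):
        self._sets = {}  # set name -> {member: score}

    def zadd(self, name, mapping):
        self._sets.setdefault(name, {}).update(mapping)

    def zrevrange(self, name, start, end, withscores=False):
        # Highest score first; `end` is inclusive, -1 means "to the end".
        ranked = sorted(self._sets.get(name, {}).items(),
                        key=lambda item: (item[1], item[0]), reverse=True)
        stop = None if end == -1 else end + 1
        page = ranked[start:stop]
        return page if withscores else [member for member, _ in page]


def record_score(client, user_id, score):
    # ZADD leaderboard score user_id
    client.zadd("leaderboard", {user_id: score})


def top_n(client, n=10):
    # ZREVRANGE leaderboard 0 n-1 WITHSCORES
    return client.zrevrange("leaderboard", 0, n - 1, withscores=True)


client = FakeSortedSets()
record_score(client, "an", 120)
record_score(client, "binh", 300)
record_score(client, "chi", 210)
print(top_n(client, 2))  # [('binh', 300), ('chi', 210)]
```

With a real redis-py client the same two helper functions should work unchanged, though members come back as bytes unless the client is created with `decode_responses=True`.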
Pattern 4: Rate limiting
Redis INCR on a per-user, per-minute key (set EXPIRE on the key so old counters are cleaned up)
If count > limit → reject request
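Pattern 4 can be sketched as a fixed-window counter. The code assumes a client with `incr(name)` and `expire(name, seconds)` methods shaped like redis-py's; `FakeCounterStore`, `allow_request`, the key format, and the 120-second expiry are hypothetical choices for illustration.

```python
import time


class FakeCounterStore:
    """In-memory stand-in for INCR/EXPIRE (the fake never actually expires keys)."""

    def __init__(self):
        self.counts = {}
        self.ttls = {}

    def incr(self, name):
        self.counts[name] = self.counts.get(name, 0) + 1
        return self.counts[name]

    def expire(self, name, seconds):
        self.ttls[name] = seconds  # recorded only, for inspection


def allow_request(client, user_id, limit, now=None):
    """Return True if the user is still within `limit` requests this minute."""
    now = time.time() if now is None else now
    window = int(now // 60)             # index of the current one-minute window
    key = f"rate:{user_id}:{window}"
    count = client.incr(key)            # INCR creates the key at 1 if missing
    if count == 1:
        client.expire(key, 120)         # let old windows clean themselves up
    return count <= limit


store = FakeCounterStore()
results = [allow_request(store, "u-001", limit=3, now=0) for _ in range(5)]
print(results)  # [True, True, True, False, False]
print(allow_request(store, "u-001", limit=3, now=60))  # True: new window
```

A fixed window is simple but allows short bursts around a window boundary, and INCR followed by EXPIRE is two separate commands; a stricter limiter would need a sliding window or an atomic script.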
Review questions
- Redis supports persistence; what about Memcached?
Answer: Redis supports persistence through RDB (snapshots) and AOF (an append-only file log), so data can be reloaded from disk after a node restart. It also supports replication, Pub/Sub, sorted sets, and streams. Memcached has no persistence: data is lost when a node fails or restarts. It is purely in-memory and multi-threaded, with no replication. Memcached is simpler and can be faster for pure caching use cases.
- What are the drawbacks of the lazy loading pattern?
Answer: Cache miss penalty: the first request for a piece of data (a cache miss) must query the DB and then write to the cache, so first requests see higher latency. Stale data: if the DB is updated but the cache is not invalidated, reads from the cache return old data. Mitigations: set a TTL (Time To Live) so entries expire automatically, or combine with the write-through pattern so the cache is updated on writes.
- Is Multi-AZ failover supported in Redis or in Memcached?
Answer: Only Redis supports Multi-AZ with automatic failover. Redis has a primary plus replicas; when the primary fails, a replica is promoted automatically (failover typically completes in under 60 seconds). Memcached has no replication: if a node fails, the data on that node is lost and there is no failover. In a multi-node Memcached cluster, only the failed node's data is lost, not the whole cache.
- How does DAX differ from ElastiCache?
Answer: DAX is a DynamoDB-specific cache. It is transparent and needs almost no code change (use the DAX SDK in place of the DynamoDB SDK), caches DynamoDB table results directly, is write-through, and offers microsecond latency. ElastiCache is a general-purpose cache (Redis/Memcached). It needs application logic to check the cache before the DB, but it is more flexible and can cache anything: SQL query results, session data, computed values.
- What is Redis Cluster Mode used for?
Answer: Redis Cluster Mode enables horizontal scaling of the dataset by sharding data across multiple shards (each shard = 1 primary + N replicas). Without Cluster Mode, all data sits on one shard and is limited by the memory of a single node. With Cluster Mode, the dataset can exceed the memory of a single node, and write throughput is higher because shards work in parallel. The maximum is 500 nodes per cluster.
Hands-on exercises
- Create an ElastiCache Redis cluster (cluster mode disabled) with 1 primary + 1 replica
- Implement the lazy loading pattern in an app: read the cache, fall back to the DB
- Set a TTL of 60 seconds and observe expiry
- Create a Redis Cluster Mode Enabled cluster with 3 shards
- Test failover: stop the primary and observe the replica being promoted
- Compare latency: DB direct vs DB + cache hit