Week 9 - Day 5: Aurora Global Database (Deep Dive)

Learning objectives

  • Understand the Aurora Global Database architecture
  • Learn Write Forwarding and the failover process
  • Compare it with a regular cross-region read replica

1. Aurora Global Database Overview

Aurora Global Database = an Aurora cluster with one primary region + up to 5 secondary regions, with sub-second replication.

Characteristics

  • 1 Primary region + up to 5 Secondary regions
  • Replicate via dedicated infrastructure (not standard MySQL/PostgreSQL replication)
  • < 1 second replication lag typical
  • RPO < 1 second
  • RTO < 1 minute (failover)
  • Read-only in secondary regions (or use Write Forwarding)
  • Supports both Aurora MySQL and Aurora PostgreSQL

Limits

  • Up to 5 secondary regions
  • Up to 16 read replicas per secondary region (the primary cluster has 1 writer + up to 15 readers)
  • Total: up to 80 read replicas across the 5 secondary regions (5 × 16)
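
The limits above reduce to simple arithmetic. A minimal Python sketch, assuming the primary cluster's up-to-15 readers (as shown in the architecture in section 2) are counted separately from the 16 per secondary region. The constant and function names are illustrative only, not an AWS API:

```python
# Limits as stated in this lesson; names are illustrative, not an AWS API.
MAX_SECONDARY_REGIONS = 5
MAX_READERS_PER_SECONDARY = 16
MAX_READERS_IN_PRIMARY = 15  # the primary cluster is 1 writer + up to 15 readers


def max_read_replicas(secondary_regions: int) -> dict:
    """Return the maximum reader counts for a given number of secondary regions."""
    if not 0 <= secondary_regions <= MAX_SECONDARY_REGIONS:
        raise ValueError("secondary_regions must be between 0 and 5")
    in_secondaries = secondary_regions * MAX_READERS_PER_SECONDARY
    return {
        "primary": MAX_READERS_IN_PRIMARY,
        "secondaries": in_secondaries,
        "total": MAX_READERS_IN_PRIMARY + in_secondaries,
    }


print(max_read_replicas(5))
```

The "secondaries" entry is the figure quoted in the limits list; "total" also counts the readers in the primary region.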

2. Architecture

Primary Region (us-east-1)
  Writer Instance (writes)
  Reader 1 (sync), Reader 2 ... up to 15
  Storage (shared, 6 copies, 3 AZs)
  Aurora Global Database engine (replicates to secondary regions)
        |
        |  Replication (dedicated infra), < 1 sec lag
        v
Secondary Region (eu-west-1)
  Read-only replicas (up to 16)
  Local reads (low latency for EU users)
  Storage (replicated from primary)

3. Use Cases

Disaster Recovery

  • Primary fails → promote secondary → new primary
  • RTO < 1 minute (vs hours for traditional restore)
  • RPO < 1 second (minimal data loss)

Multi-Region Reads

  • Global users get low-latency reads from nearest region
  • e.g., US users → us-east-1, EU users → eu-west-1, Asia → ap-southeast-1

Cross-Region Migration

  • Use Global Database for migration: replicate to new region, then cutover

4. Replication Mechanism

How it works

  • Storage-level replication (not query replication)
  • Dedicated infrastructure between regions
  • Asynchronous BUT very fast (sub-second)
  • Little to no impact on primary performance (replication runs in the storage layer, not in the database engine)

vs Standard Cross-Region Read Replica

Aspect                        | Aurora Global Database          | Cross-region Read Replica
Replication                   | Storage-level (dedicated infra) | Query-level (logical)
Lag                           | < 1 second                      | Seconds to minutes
Performance impact on primary | None                            | Some
Setup                         | Single click                    | Per replica
Failover time                 | < 1 minute (planned/unplanned)  | Manual promote, longer

5. Failover Process

Planned Failover

  • Manually switch the primary role to a secondary region (a switchover)
  • Use case: scheduled maintenance, region rebalancing
  • ~30-60 seconds RTO

Unplanned Failover (DR)

  • Primary region down
  • Detach secondary from global DB
  • Promote secondary to standalone cluster
  • Update app connection string
  • RTO: < 1 minute

Steps for unplanned failover

1. Detect primary region failure
2. Click "Failover global database" or use API
3. Select secondary region to promote
4. Aurora promotes secondary cluster to standalone
5. Update app to point to new primary endpoint
6. (Optionally) Re-add detached cluster as new secondary
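
Steps 2 to 4 correspond to a single API request. The sketch below only builds the request parameters in Python. It assumes the boto3 RDS client's `failover_global_cluster` operation, and the cluster identifiers are placeholders. The real call is left in a comment so the snippet runs without AWS credentials.

```python
def build_failover_request(global_cluster_id: str,
                           target_cluster_arn: str,
                           planned: bool) -> dict:
    """Build keyword arguments for the RDS failover_global_cluster operation.

    planned=True  -> switchover: the target catches up first, no data loss.
    planned=False -> unplanned failover: accepts losing writes not yet replicated.
    """
    request = {
        "GlobalClusterIdentifier": global_cluster_id,
        "TargetDbClusterIdentifier": target_cluster_arn,
    }
    if planned:
        request["Switchover"] = True
    else:
        request["AllowDataLoss"] = True
    return request


# Placeholder identifiers (hypothetical, for illustration only).
params = build_failover_request(
    "my-global-db",
    "arn:aws:rds:eu-west-1:123456789012:cluster:my-secondary",
    planned=False,
)
print(params)

# With credentials configured, the call would look roughly like:
#   import boto3
#   boto3.client("rds").failover_global_cluster(**params)
```

Step 5 (repointing the application) is not covered here, because the new writer endpoint depends on how the application resolves its database host.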

6. Write Forwarding

Definition

Write Forwarding = a secondary region can accept writes and forward them to the primary.

Characteristics

  • App can write to secondary region's endpoint
  • Aurora forwards write to primary, then replicates back
  • Write latency = round-trip to primary + replication
  • Read-after-write consistency options
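
The latency bullet above can be turned into a rough estimate. This is a back-of-the-envelope model rather than a measurement: the function name, the split into three terms, and the sample numbers are all assumptions made for illustration.

```python
def forwarded_write_latency_ms(rtt_to_primary_ms: float,
                               primary_commit_ms: float,
                               replication_lag_ms: float,
                               wait_for_replication: bool) -> float:
    """Estimate how long a forwarded write takes from the secondary's view.

    The write always pays the round trip to the primary plus the commit there.
    If the session also waits to read its own write locally, add the time for
    the change to replicate back to the secondary region.
    """
    latency = rtt_to_primary_ms + primary_commit_ms
    if wait_for_replication:
        latency += replication_lag_ms
    return latency


# Sample numbers only: 80 ms round trip, 5 ms commit, 500 ms replication lag.
print(forwarded_write_latency_ms(80, 5, 500, wait_for_replication=False))
print(forwarded_write_latency_ms(80, 5, 500, wait_for_replication=True))
```

Even in this simplified model, the round trip dominates for distant regions, which is why forwarded writes suit read-heavy applications.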

Use case

  • Multi-region apps where reads dominate but occasional writes
  • Simpler app architecture (single connection logic)

Limitations

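  • Originally Aurora MySQL only; Aurora PostgreSQL support was added later for newer engine versions (check the AWS documentation for your version)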
  • Higher write latency than direct primary writes
  • Not for write-heavy workloads (use primary directly)

7. Consistency Modes (Write Forwarding)

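The mode is chosen per session; in Aurora MySQL this is the aurora_replica_read_consistency parameter (EVENTUAL, SESSION, or GLOBAL).

Eventual Consistency

  • Reads do not wait for forwarded writes to replicate back to the secondary
  • A read may not see a write yet, even one made earlier in the same session

Session Consistency

  • A read in the same session sees that session's previous writes
  • Writes from other sessions may still lag

Global Consistency

  • Every read waits until the secondary has caught up with all changes committed on the primary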
  • Highest latency
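
The three modes differ only in what a read waits for. The toy simulation below is a sketch of that idea, not Aurora's implementation: it assumes a fixed replication lag, integer millisecond timestamps, and hypothetical class and method names.

```python
from dataclasses import dataclass, field


@dataclass
class Write:
    session: str
    key: str
    value: str
    committed_at_ms: int  # when the primary committed the forwarded write


@dataclass
class SecondaryRegion:
    replication_lag_ms: int
    log: list = field(default_factory=list)  # writes in commit order

    def forward_write(self, session: str, key: str, value: str, now_ms: int) -> None:
        self.log.append(Write(session, key, value, now_ms))

    def _visible_at(self, write: Write) -> int:
        # A write shows up locally only after it has replicated back.
        return write.committed_at_ms + self.replication_lag_ms

    def read(self, session: str, key: str, now_ms: int, mode: str):
        """Return (value, wait_ms) for a read issued at now_ms."""
        committed = [w for w in self.log if w.committed_at_ms <= now_ms]
        if mode == "eventual":
            must_see = []  # wait for nothing
        elif mode == "session":
            must_see = [w for w in committed if w.session == session]
        elif mode == "global":
            must_see = committed
        else:
            raise ValueError(f"unknown mode: {mode}")
        read_time = max([now_ms] + [self._visible_at(w) for w in must_see])
        value = None
        for w in self.log:
            if w.key == key and self._visible_at(w) <= read_time:
                value = w.value  # later commits overwrite earlier ones
        return value, read_time - now_ms


region = SecondaryRegion(replication_lag_ms=500)
region.forward_write("A", "color", "blue", now_ms=0)
for mode in ("eventual", "session", "global"):
    print(mode, region.read("B", "color", now_ms=100, mode=mode))
```

In this model, session B sees nothing under eventual or session consistency at t = 100 ms, while global consistency makes it wait for session A's write to arrive.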

8. Pricing

Components

  • Primary cluster: standard Aurora pricing
  • Secondary cluster: standard Aurora pricing (cheaper if smaller instance)
  • Cross-region replication: about $0.20 per million replicated write I/O operations (the rate varies by region and cluster configuration)
  • Cross-region data transfer
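
As a worked example of the replication charge, here is a small Python estimate. It uses the $0.20 per million figure quoted above as a default; the function name and the monthly volume are made up for illustration, and data transfer and instance costs are not included.

```python
def replication_io_cost_usd(replicated_write_ios: int,
                            usd_per_million: float = 0.20) -> float:
    """Estimate the cross-region replication charge for a number of write I/Os."""
    if replicated_write_ios < 0:
        raise ValueError("replicated_write_ios must not be negative")
    return replicated_write_ios / 1_000_000 * usd_per_million


# Hypothetical month with 50 million replicated write I/Os to one secondary.
monthly_cost = replication_io_cost_usd(50_000_000)
print(f"Estimated replication cost: ${monthly_cost:.2f}")
```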

Cost optimization

  • Smaller instances in secondary (read-only workload may be smaller)
  • Aurora Serverless v2 in secondary (auto-scale)

9. Monitoring

CloudWatch metrics

  • AuroraGlobalDBReplicationLag: time lag between regions, in milliseconds
  • AuroraGlobalDBProgressLag: log-based lag
  • Per-secondary-region metrics

Alarms

  • Alert when replication lag > threshold (e.g., 1 second)
  • Trigger DR procedures if critical lag
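
A lag alarm like the one described above could be defined as follows. This Python sketch only builds the parameters for CloudWatch's `put_metric_alarm`. The alarm name, the three-period evaluation window, and the use of the `DBClusterIdentifier` dimension on the secondary cluster are assumptions to check against your own metrics.

```python
def build_lag_alarm(secondary_cluster_id: str,
                    threshold_ms: float = 1000.0,
                    sns_topic_arn: str = "") -> dict:
    """Build put_metric_alarm arguments for Global Database replication lag."""
    alarm = {
        "AlarmName": f"{secondary_cluster_id}-global-replication-lag",
        "Namespace": "AWS/RDS",
        "MetricName": "AuroraGlobalDBReplicationLag",  # reported in milliseconds
        "Dimensions": [
            {"Name": "DBClusterIdentifier", "Value": secondary_cluster_id},
        ],
        "Statistic": "Maximum",
        "Period": 60,              # seconds per datapoint
        "EvaluationPeriods": 3,    # assumption: 3 breaching minutes in a row
        "Threshold": threshold_ms,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "breaching",  # no data may mean replication stopped
    }
    if sns_topic_arn:
        alarm["AlarmActions"] = [sns_topic_arn]
    return alarm


print(build_lag_alarm("my-secondary-cluster"))

# With credentials configured, the call would look roughly like:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**build_lag_alarm("my-secondary-cluster"))
```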

10. Backup Strategy with Global DB

Backups in each region

  • Primary region: automated backups + PITR
  • Secondary region: also has automated backups (independent)
  • → 2x backup storage but additional protection

Manual snapshots

  • Can take from primary
  • Can share/copy across accounts/regions

11. Common Patterns

Pattern 1: Multi-region SaaS

Primary: us-east-1 (writes)
Secondary: eu-west-1, ap-southeast-1, sa-east-1
Each region: ALB → App → local read replica
Writes: all go to primary (acceptable latency)
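
The routing rule in Pattern 1 can be sketched in a few lines of Python. The endpoint hostnames below are invented placeholders; a real application would read its cluster endpoints from configuration.

```python
PRIMARY_REGION = "us-east-1"

# Hypothetical endpoints, for illustration only.
ENDPOINTS = {
    "us-east-1": {"writer": "db-writer.us-east-1.example.internal",
                  "reader": "db-reader.us-east-1.example.internal"},
    "eu-west-1": {"reader": "db-reader.eu-west-1.example.internal"},
    "ap-southeast-1": {"reader": "db-reader.ap-southeast-1.example.internal"},
    "sa-east-1": {"reader": "db-reader.sa-east-1.example.internal"},
}


def pick_endpoint(app_region: str, is_write: bool) -> str:
    """Send writes to the primary writer and reads to the local reader."""
    if is_write:
        return ENDPOINTS[PRIMARY_REGION]["writer"]
    # Fall back to the primary region if the app runs somewhere without a replica.
    region = app_region if app_region in ENDPOINTS else PRIMARY_REGION
    return ENDPOINTS[region]["reader"]


print(pick_endpoint("eu-west-1", is_write=False))
print(pick_endpoint("eu-west-1", is_write=True))
```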

Pattern 2: DR with auto-failover

Primary: us-east-1
Secondary: us-west-2 (DR)
Route 53 health check on us-east-1 app
If unhealthy → auto-trigger Lambda → Promote Aurora secondary + DNS failover
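
The core of the Lambda in Pattern 2 might look like the sketch below. It is a simplified Python model under stated assumptions: it waits for three consecutive failed health checks (a threshold this lesson does not specify), takes the RDS client as a parameter, and covers only the promotion, not the DNS failover. `FakeRdsClient` is a stand-in so the example runs without AWS access; a real boto3 RDS client exposes `failover_global_cluster` with the same keyword arguments.

```python
class FailoverController:
    """Promote the DR secondary after N consecutive failed health checks."""

    def __init__(self, rds_client, global_cluster_id: str,
                 target_cluster_arn: str, failures_needed: int = 3):
        self.rds = rds_client
        self.global_cluster_id = global_cluster_id
        self.target_cluster_arn = target_cluster_arn
        self.failures_needed = failures_needed
        self.consecutive_failures = 0
        self.failed_over = False

    def record_health_check(self, healthy: bool) -> bool:
        """Record one health check result; return True if failover was triggered."""
        if self.failed_over:
            return False  # never trigger twice
        if healthy:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures < self.failures_needed:
            return False
        self.rds.failover_global_cluster(
            GlobalClusterIdentifier=self.global_cluster_id,
            TargetDbClusterIdentifier=self.target_cluster_arn,
            AllowDataLoss=True,  # unplanned failover accepts losing recent writes
        )
        self.failed_over = True
        return True


class FakeRdsClient:
    """Test double that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def failover_global_cluster(self, **kwargs):
        self.calls.append(kwargs)
        return {}


fake = FakeRdsClient()
controller = FailoverController(
    fake, "my-global-db",
    "arn:aws:rds:us-west-2:123456789012:cluster:my-dr-cluster",  # placeholder
)
results = [controller.record_health_check(h)
           for h in (True, False, False, False, False)]
print(results, len(fake.calls))
```

Requiring several consecutive failures is a design choice to avoid promoting on a single transient error, since an unplanned failover can lose the most recent writes.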

Pattern 3: Global app with Write Forwarding

Aurora MySQL Global DB:
Primary: us-east-1
Secondary: eu-west-1 (Write Forwarding enabled)
Secondary: ap-southeast-1 (Write Forwarding enabled)
App reads + writes from local region endpoint
Aurora forwards writes to primary transparently

12. Aurora Global vs Alternatives

Aurora Global Database

  • RTO < 1 min, RPO < 1 sec
  • 2-6 regions in total (1 primary + up to 5 secondaries)
  • Aurora MySQL or PostgreSQL only

DynamoDB Global Tables

  • Multi-master (active-active writes everywhere)
  • Last-write-wins conflict resolution
  • NoSQL only

RDS Cross-Region Read Replica

  • Each replica is created and managed individually (replicas in several regions are possible)
  • Slower replication
  • Standard RDS engines (MySQL, PostgreSQL, MariaDB)

Manual replication (DMS)

  • Highest flexibility, more complex
  • For non-Aurora databases

Decision

  • Aurora workload, need fast DR: Aurora Global Database
  • NoSQL multi-region writes: DynamoDB Global Tables
  • RDS non-Aurora: Cross-region read replica
  • Custom logic: DMS
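
The decision list above can be written as a small lookup. This Python sketch is only a restatement of those four bullets; the workload labels ("aurora", "nosql", "rds") and the function name are invented for the example.

```python
def choose_replication_option(workload: str, custom_logic: bool = False) -> str:
    """Map a workload type to the option recommended in the decision list."""
    if custom_logic:
        return "DMS"
    options = {
        "aurora": "Aurora Global Database",        # Aurora workload, fast DR
        "nosql": "DynamoDB Global Tables",         # NoSQL multi-region writes
        "rds": "RDS cross-region read replica",    # RDS non-Aurora engines
    }
    try:
        return options[workload.lower()]
    except KeyError:
        raise ValueError(f"unknown workload type: {workload}") from None


for kind in ("aurora", "nosql", "rds"):
    print(kind, "->", choose_replication_option(kind))
```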

Review questions

  1. What is the maximum number of secondary regions in an Aurora Global Database?

    Answer: Up to 5 secondary regions, each hosting a read-only Aurora cluster. In total: 1 primary + 5 secondary = 6 clusters. Each secondary cluster can have up to 16 read replicas to scale the read workload in that region. This is useful for global applications that need low-latency reads from multiple continents.

  2. What are the typical RTO and RPO of Aurora Global Database?

    Answer: RPO < 1 second (replication lag is usually under 1 second because storage-level replication is very fast). RTO < 1 minute when promoting a secondary region to primary (managed failover). For an unplanned failover after the primary region fails, allow about 1 minute to detect the failure and promote. This lets Aurora Global Database meet the RTO/RPO requirements of tier-1 applications that need cross-region DR.

  3. Which engine supports Write Forwarding?

    Answer: Aurora MySQL. (Write Forwarding launched for Aurora MySQL only; newer Aurora PostgreSQL versions have since added it, so check the AWS documentation for your engine version.) Write Forwarding lets a secondary region forward write requests to the primary region instead of returning a "read-only" error. Latency increases by the round trip to the primary, so it is not suitable for write-heavy workloads over a long-distance network. Use it for applications that need occasional writes while most of the traffic is reads.

  4. At which layer does Aurora Global Database replicate (storage vs query)?

    Answer: The storage layer. Aurora Global Database replication happens at the storage engine level, not at the SQL query level. All storage writes go through the distributed storage layer and are forwarded to the secondary regions over dedicated replication infrastructure. This is why replication lag is very low (< 1 s) and the primary's write performance is not affected: replication does not pass through the database query engine.

  5. How does it differ from a regular cross-region read replica?

    Answer: Cross-region read replica (binlog): replication uses the binlog at the database layer; lag can be 5-30 seconds, with overhead on the primary. Aurora Global Database (storage): storage-level replication with lag < 1 s, no impact on primary performance, and support for planned failover with a data integrity guarantee. Promoting a Global Database secondary is much faster and loses less data. It costs more than a read replica but provides a DR-grade solution.

Hands-on exercises

  • Create an Aurora MySQL cluster
  • Convert it to a Global Database and add a secondary in another region
  • Test cross-region read latency
  • Practice failover: promote secondary to new primary
  • (Optional) Enable Write Forwarding, test from secondary
