Debugging Phantom Redis Outages on GKE

How we traced 518 Redis failures in 7 days to shared-core VMs and HAProxy defaults - and what fixed it (and what didn't).

How a Single React Query Setting Took Down Our System

How a default React Query setting triggered a synchronized retry storm that overwhelmed our infrastructure during a backend outage.

How to deliver data quickly

What techniques can we apply to improve the performance of data transmission?

System design - collection of links

A collection of courses, videos, and books for system design interview preparation. Constantly updated.