Klaviyo built a resilient event publisher around a dual failure capture design so that no incoming event is lost during processing, even when network issues or serialization errors occur. Using Kafka topics and S3 for backup, the system captures failed events and recovers from them while continuing to publish events to customers in real time. In production, the design has automatically retried and recovered a significant number of events.
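The summary does not describe how failures are routed, so the following is only a minimal sketch of one way a two-level capture could work, not Klaviyo's implementation. Everything in it is an assumption made for illustration: the `InMemoryStore` class stands in for a Kafka topic or an S3 bucket, and the `publish` function, the sink names, and the fallback order (primary topic, then a retry topic, then a backup store) are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a Kafka topic or an S3 bucket."""
    name: str
    records: List[bytes] = field(default_factory=list)
    fail: bool = False  # set to True to simulate an outage

    def write(self, payload: bytes) -> None:
        if self.fail:
            raise ConnectionError(f"{self.name} unavailable")
        self.records.append(payload)


def publish(event: Any, primary: InMemoryStore,
            retry_topic: InMemoryStore, backup: InMemoryStore) -> str:
    """Try each sink in order and return the name of the one that accepted the event."""
    try:
        payload = json.dumps(event).encode("utf-8")
        sinks = [primary, retry_topic, backup]
    except (TypeError, ValueError):
        # Serialization failed: keep a best-effort raw copy and skip the
        # primary topic, so the event is captured rather than dropped.
        payload = repr(event).encode("utf-8")
        sinks = [retry_topic, backup]

    for sink in sinks:
        try:
            sink.write(payload)
            return sink.name
        except ConnectionError:
            continue  # fall through to the next capture level
    raise RuntimeError("all sinks failed; event could not be captured")
```

A short usage example under the same assumptions: with the primary topic marked as failing, the event lands in the retry topic; if that is also down, it falls through to the backup store.

```python
primary = InMemoryStore("primary", fail=True)
retry_topic = InMemoryStore("retry-topic")
backup = InMemoryStore("s3-backup")

accepted_by = publish({"id": 1}, primary, retry_topic, backup)
print(accepted_by)
```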
event-publishing
resilience
kafka
data-recovery
system-design