Wired for Scale: Sid Rao's Musings

The Night DNS Died: A Pager Tale About File Descriptors, Runaway Logs, and Debugging in the Dark

Tales of Debugging While Drunk

Sid Rao
May 31, 2025

Let me tell you a story.

It was a quiet autumn evening. The kind of evening that whispers, “Pour a glass of wine and relax.” And I did. One glass turned into three, maybe four, but who’s counting? Certainly not the person whose pager was supposed to stay silent.

But it didn’t.
Bzzzt.
Bzzzt.
BZZZT.

The whole damn service was down. Fuck, that Amazon COE is going to hurt. Really love those early Wednesday morning drive-ins for a colonoscopy.

And that’s when the adrenaline hits harder than any cabernet.


Act I: All Systems Down, and No One Knows Why

A critical service—let’s call it the nexus of our control plane—had fallen over. This wasn’t some toy microservice tucked away in a dark corner of the mesh. This was the hub through which everything flowed. Traffic was backed up. Alerts were cascading across the SRE channel. Metrics? Gone. Control plane servers weren’t emitting anything.
