Wired for Scale: Sid Rao's Musings

The Night DNS Died: A Pager Tale About File Descriptors, Runaway Logs, and Debugging in the Dark

Tales of Debugging While Drunk

Sid Rao
May 31, 2025


Let me tell you a story.

It was a quiet autumn evening. The kind of evening that whispers, “Pour a glass of wine and relax.” And I did. One glass turned into three, maybe four, but who’s counting? Certainly not the person whose pager was supposed to remain silent.

But it didn’t.
Bzzzt.
Bzzzt.
BZZZT.

The whole damn service was down. Fuck, that Amazon COE is going to hurt. Really love those early Wednesday morning drive-ins for a colonoscopy.

And that’s when the adrenaline hits harder than any cabernet.


Act I: All Systems Down, and No One Knows Why

A critical service—let’s call it the nexus of our control plane—had fallen over. This wasn’t some toy microservice tucked away in a dark corner of the mesh. This was the hub through which everything flowed. Traffic was backed up. Alerts were cascading across the SRE channel. Metrics? Gone. Control plane servers weren’t emitting anything.

© 2025 Sid Rao