High Availability is not Cheap

bq. The system will fail at some point, no matter what, even if it’s only for a few seconds. That’s reality.

Well said, bro. The quote comes from discussions on the MySQL mailing list, but it applies just as easily to my areas of expertise (networks and security).

Networks are fun because they can fail in surprising ways. The most obvious example is an expensive redundant connection that, somewhere along its route, shares the same copper or fibre bundle as the primary link. One backhoe takes out both at the same time! There are many less obvious failures, too. Consider a host that starts transmitting garbage packets: the network is still _up_, just unusable…

High Availability is not just expensive, it’s also _hard_…

