Subscribe for updates!

Search this blog..

Top Stories of the week

Data Center Feng Shui: Reliability is not the Absence of Failure

Posted in : Data Center Feng Shui

(added last year!)

Before commoditization and the advent of cheap computing (a.k.a. cloud computing ) we worried about redundant power supplies and network connections. We leveraged fail-over as a means to ensure that when the inevitable happened, a second, minty-fresh server/application/switch was ready to take over without dropping so much as a single packet on the data center floor.

Notice I said “inevitable.” That’s important, because we know with near-absolute certainty that inevitably hardware and software fails. Interestingly, it is only hardware that comes with an MTBF (Mean Time Between Failures) rating. It is nearly as inevitable that software (applications) will also experience some sort of failure – whether due to error or overload or because of a dependency on some other piece of software that will, too, inevitably fail because of hardware or internal issues. Failure happens. That doesn’t mean, however, that an application or an architecture or a network is unreliable.

Reliability is not an absence of failure in the data center, it’s a measure of how quickly such failures can be compensated for. Being able to rely upon an application does not mean it never fails, it simply means that such failures as do occur are corrected for or otherwise addressed quickly, before they have an impact on the availability of the application. And that means the entire application delivery chain.

Hardware. Software. Any given piece of a critical system should be able to fail without negatively impacting availability.  Performance may degrade, but availability itself is maintained. This often takes the form of “standby” systems; duplicates of a given infrastructure or application service that, in the event of a failure, are ready to stand in for the primary and continue doing what needs to be done. They’re the second-stringers, the bench warmers, the idle resources that are the devil’s playground in the data center.

And we’re getting rid of them: As we optimize the data center for cost and efficiency, we’re eliminating the redundant duplication (see what I did there?) within the architecture and replacing it with something more aligned with the business goals of maximizing  the return on investment associated with all that hardware and software that makes the business go. We’re automating fail-over processes that no longer assume a secondary exists: instead, we automatically provision a new primary in the event of a failure from the much larger pool of resources that were once reserved. We’re modifying the notion of architectural reliability to mean we don’t need to fail-over, we’ll just fail-through instead.

It can’t. At least not yet. And while we’re getting quite good at leveraging intelligent health monitoring and collaborative infrastructure architectures, we still haven’t figured out how to predict a failure. Auto-scaling works because it does not account for failure. It assumes infinite resources and consistent availability. We can tell when an application is nearing capacity and adjust the resources accordingly before it becomes necessary. And it is exactly that “before” that is important in maintaining availability and thus providing a reliable application. But we can’t predict a failure, and thus we can’t know when to begin provisioning until it’s essentially too late. There are only two viable solutions: pre-provisioning, which defeats the purpose of such real-time automation and scalability services in the first place and reserved resources, which can have a deleterious affect on efficiency and costs – you’re purposefully creating a pool of idle resources again. Both tactics have the same effect: idle resources waiting to be needed which runs contrary to one of the desired intents of implementing a virtualized or cloud computing-based infrastructure in the first place. 

Related Posts

» Fusion Feng Shui: Ford's Marketing Misfire

» Data Center Feng Shui: Architecting for Predictable Performance

» Feng shui: Fact vs fiction

» Feng shui: Fact vs fiction

» Corporate Feng Shui: The Good and Bad

» Feng Shui: Life-changing or pseudoscience? (Video)

» Thanksgiving Feng Shui: 7 Tips to Take the Bite out of Turkey Day

» Packers feng shui: Fans devote rooms to favorite team

(added last year!) / 261 views