Risk #5: Downtime

The fifth risk listed in the Six Key Risks a CIO Must Avoid post is downtime.

This risk is actually two things: downtime, plus what I like to call “lack of systems availability,” when Users can’t access the technology they need to do their job.

Downtime is straightforward: a server has crashed, a printer has broken, or a remote office router has failed. Something isn’t working, so we have downtime.

System unavailability can mean the systems and network are all working properly but something prevents a User from accessing a system. An example might be when the IT organization freezes a server to perform an upgrade or maintenance.

In both situations, the User sees it as downtime. “I can’t work so something must be broken.”

A CIO must create a stable and reliable technology environment. Nothing will get you fired quicker than managing an IT organization that experiences lots of downtime. It is simply unacceptable.

Downtime is unacceptable because it costs the company in so many ways:

Loss of productivity
Morale issues
Client satisfaction problems
Troubleshooting and resolution expense
Loss of revenue

Effective CIOs understand that “UPTIME IS KING!”

It’s important for a CIO to build a stable systems and network environment. To do this, the CIO should put a few key things in place:

Reliable hardware and network components – It goes without saying that an environment made up of old, dilapidated systems and network components is going to have failures. Understand where your “Achilles’ heels” are and upgrade as needed to improve the stability of your technology environment.
Infrastructure support staff – Your infrastructure support can be staffed in-house or outsourced, but the staff must be capable and qualified to support the technologies used by the company. This team must also be positioned to respond quickly to problem issues.
Reliable support vendors – You need vendors you can count on, the type that provide reliable and responsive support.
Change management processes – Implementing processes to control changes made to networks and systems will help ensure thoroughness and quality of upgrade projects.
Monitoring systems – One of the best tools an infrastructure manager can have is an early warning of an impending failure. Good monitoring systems help you anticipate problems before they turn into downtime.
Escalation procedures – When a system or network component goes down, you need to fix the problem as quickly as possible. It will be handled faster and more effectively when you have sound escalation procedures to follow.

Two additional things the CIO should understand are:

How much downtime the company is experiencing
The cost of downtime

When I joined a small company, I knew we were having downtime issues, but with no Help Desk I couldn’t get a good handle on what kinds of issues we were having. To gain a better understanding of our downtime situation, I created a simple spreadsheet and started tracking every downtime event we encountered. Within a couple of months I had a very good sense of what was going on, which helped me develop our strategy to stabilize our technology environment.
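
The same kind of log can be kept in a few lines of code. What follows is a minimal sketch, not the original spreadsheet: the field names, the `minutes_by_component` helper, and the sample entries are all assumptions made up for illustration. It records each downtime event and totals the minutes lost per component, so the worst offenders rise to the top.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DowntimeEvent:
    """One row of a hypothetical downtime log."""
    date: str            # when the event happened, e.g. "2024-03-04"
    component: str       # what failed or was unavailable
    minutes: int         # how long Users were affected
    users_affected: int  # how many Users could not work


def minutes_by_component(events):
    """Total downtime minutes per component, worst offender first."""
    totals = defaultdict(int)
    for event in events:
        totals[event.component] += event.minutes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up sample entries, for illustration only.
log = [
    DowntimeEvent("2024-03-04", "file server", 90, 40),
    DowntimeEvent("2024-03-11", "remote office router", 45, 12),
    DowntimeEvent("2024-03-19", "file server", 30, 40),
]

for component, minutes in minutes_by_component(log):
    print(f"{component}: {minutes} min")
```

Even a summary this simple shows which components deserve attention first when you plan upgrades.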

The other thing I’m a big advocate of is understanding the cost of downtime.

You can do this very easily for any component in your technology environment, from a large server to a remote office router, or even a desktop PC. Take a look at an ITLever post I wrote about this and download the Cost of Downtime tool. There is a link to a 20 Minute IT Manager training session that explains it all.
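
For a rough feel of the arithmetic, here is a minimal sketch. It is not the formula used in the Cost of Downtime tool. It simply assumes the cost of an event is lost productivity (hours down × Users affected × hourly cost × share of work lost) plus any lost revenue and resolution expense. The function name, parameters, and numbers are hypothetical.

```python
def downtime_cost(hours_down, users_affected, loaded_hourly_cost,
                  productivity_lost=1.0, revenue_lost=0.0, repair_cost=0.0):
    """Rough cost of one downtime event, under the assumptions stated above.

    productivity_lost is the share of work Users cannot do (0.0 to 1.0).
    """
    productivity = (hours_down * users_affected
                    * loaded_hourly_cost * productivity_lost)
    return productivity + revenue_lost + repair_cost


# Hypothetical example: a remote office router is down for 2 hours,
# 12 Users lose about half their productivity, and the repair costs 300.
cost = downtime_cost(hours_down=2, users_affected=12, loaded_hourly_cost=40.0,
                     productivity_lost=0.5, repair_cost=300.0)
print(f"Estimated cost: {cost:,.2f}")
```

Run against your own numbers for each tracked event, an estimate like this turns a downtime log into a dollar figure you can take to senior management.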

Reducing downtime should be a key focus of any CIO.