How NOT to communicate in an IT disaster

I’ve spoken here before about HOW to communicate in an IT disaster. Today I’m going to illustrate the opposite situation.

First of all, the Internet Cheerleaders aka “The Blogosphere” all seem to get hung up on the concept of “community”. This is the idea that somehow everyone will hold hands on top of the mountain and sing in harmony, provided the Internet is used as a free and open exchange of ideas. The echo chamber, circle jerk… um, blogosphere seems to forget that real communities operate under a set of rules and have a subset of the community that enforces those rules, carries out mediation and, if required, punishment, to promote and keep civil order. While very few people I know are as libertarian as I am, I still recognize the need for the occasional exercise of the power of the state to maintain the community. I’m not so naive as to think that if all the rules were lifted we’d all just get along peachy. Self-interest is the prime human motivator and everyone seeks to improve their advantages in life. It isn’t just human nature, it IS NATURE. So when it comes to setting up communications channels with your “community” of customers, it pays to remember the lessons of society and nature.

There are three ways to view communications between service provider and customer (or IT Dept and Users, or Company and Clients, or however you wish to define this relationship):

  • One way/Public – One speaks, all listen
  • Two way/Private – One speaks, all listen; listeners can reply only to the speaker, and only the speaker sees the replies
  • Free way/Chaos – All speak, all listen
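
To make the difference between the three concrete, here is a minimal sketch in Python. Everything in it (`Mode`, `Message`, `Channel`, `post`, `visible_to`) is a hypothetical name made up for illustration, not any real forum software's API. It only models who is allowed to post and who gets to see what under each option.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    ONE_WAY = "one way/public"    # one speaks, all listen
    TWO_WAY = "two way/private"   # replies go only to the speaker
    FREE_WAY = "free way/chaos"   # all speak, all listen


@dataclass
class Message:
    author: str
    text: str


@dataclass
class Channel:
    """Hypothetical status channel run by a single speaker (the provider)."""
    speaker: str
    mode: Mode
    messages: list = field(default_factory=list)

    def post(self, author, text):
        """Store a message if the mode permits it; return True when accepted."""
        if author != self.speaker and self.mode is Mode.ONE_WAY:
            return False  # listeners have no voice at all
        self.messages.append(Message(author, text))
        return True

    def visible_to(self, reader):
        """Return the texts that `reader` is allowed to see."""
        seen = []
        for m in self.messages:
            # Speaker posts are always public; listener posts only in FREE_WAY.
            public = m.author == self.speaker or self.mode is Mode.FREE_WAY
            if public or reader in (self.speaker, m.author):
                seen.append(m.text)
        return seen
```

With a `TWO_WAY` channel, a customer's question is stored and the provider can read it, but other customers (and any competitor sales staff lurking around) see only the provider's announcements. Switch the mode to `FREE_WAY` and every post lands in front of everybody.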

The Blogtards (thank you John Welch for that term!) want you to think that option three is the One True Way. Open and honest communication! Unfettered and free. Sorry, but that is complete BS. Chaos and disorder is what it really is.

I experienced this second-hand today and it was an epiphany for me. We rent some rackspace in a facility in Vancouver, BC. We came to this position when we acquired a smaller competitor in 2002. They had a private suite in this facility and we maintained it for a year or so until some circumstances forced us to relocate the majority of our equipment down to our main facility in Seattle. We left behind a handful of servers, namely things that require geographical redundancy… secondary mail, DNS, offsite monitoring, etc. I live halfway there so I actually go up there once or twice a year to do server maintenance and whatnot. The company we rent the space from uses a web forum to interact with their customers. So far, so good, right? Like most datacenter operators they aren’t really in the real estate biz; they rent space in facilities that provide the infrastructure. (FYI: We don’t do this. We specifically construct our leases to have full control over assets like UPS and generators. But that is pretty unique in this business.)

Today, one of those freak accident/force majeure events happened. A fire in an electrical vault created a large-area power outage in the city of Vancouver. Some of the backup power systems had intermittent problems staying running.

Now, I’m not here to criticize the provider or the building management about their backup power systems right now. The post-mortems haven’t been completed, but from what little I do know now it sounds like they don’t quite have their ducks lined up properly. No, I’m here to disprove the blogtards about the wonderfulness of open communications. Their web forum as a customer communications channel blew up in their faces.

Here, go read this.

If you didn’t have the stomach to wade through the whole thing, here is the entire 27 pages (as of Monday, July 14, 2008 @ 8 PM PDT) of it in a nutshell:

Provider: The power went out, a generator failed, we are working as fast as we can to fix the problem.
Customers: Oh crap…
Customers: WTF!? We pay you outrageous prices for uptime! Where’s your redundancy??!!
Provider: We’ll have more data as soon as it is available. Please be patient.
Customers: Didn’t this happen once before?? OMG! You Guys Suck!
Customers: OMG! I’m losing thousands of dollars EVERY MINUTE!!!
Customers: Don’t you test these things?? Ever??
Customer: Hey, my stuff over at (other facility) is still up!
Customers: WTF??! We’re pulling our equipment out ASAP!
Provider: Current status is X, ETA for full turn-up is Y. Please be patient while we sort this out!
Customers: (Rampant speculation and worry based on uninformed observation)
Competitor Sales Staff: Hey, our stuff is still online, We’re offering discounts for new setups TODAY only!
Customers: Cool! Sign us up!
Provider: (deletes post from Competitor Sales Staff)
Customers: Hey! WTF!?? You are deleting posts! That is CENSORSHIP! You can’t do that!
Provider: We now have an ETA of X:XX for full recovery. Almost there folks, hang on!
Customers: How come the ETA just changed?? You Guys Suck!
Provider: (tries to correct rampant speculation and worry based on uninformed observation, with some facts)
Customers: You guys are lying bastards, get your story straight!
Provider: Any minute now, trust us! We’re working REALLY hard here!
Customers: How come nobody answers the phone?
Customers: Hey, what about (names company)’s servers, when will they be up?
Competitor Sales Staff: Hey, our stuff is still online, We’re offering discounts for new setups TODAY only!
Customer: hey, take it easy on them guys… they are nice people.
Competitor Sales Staff: Hey, our stuff is still online, We’re offering discounts for new setups TODAY only!
Provider: We’re partially up! Rolling starts are being conducted by NOC staff.
Provider: (deletes post from Competitor Sales Staff)

Customers: My stuff is down still!
Customers: My stuff is back up! Thanks guys!
Provider: (deletes post from Competitor Sales Staff)
Customers: My stuff is down still! I’m losing MILLIONS OF DOLLARS PER SECOND!! I expect to be compensated!
Competitor Sales Staff: Hey, our stuff is still online, We’re offering discounts for new setups TODAY only!
Provider: (deletes post from Competitor Sales Staff)

etc, etc, etc.

You will note that the provider actually did everything that they should have and could have. They were informative, open, honest and direct. The real problem was the “community” which devolved into complete chaos within minutes and kept getting worse by the second. Once the blood was in the water the sharks arrived and started picking off the survivors one by one. What a disaster.

Why have an option for anonymous contribution to a forum?
Why even have an open forum about facility status?

You can serve the same function with either of the two other methods: a straight announcement-only broadcast or, if you want feedback, an announcement channel with a private feedback loop. No public chaos, no feeding frenzy of competitors preying on your misfortune. No accusations or random speculation. Just focussed communication that stays on-point and useful.

There is a time and a place for an open exchange of ideas. Two way communication is valuable. Free-for-all communication even has its place. But NOT when you are dealing with a crisis.

Your thoughts?
(this is, after all, a semi-open forum!) 😉

3 thoughts on “How NOT to communicate in an IT disaster”

  1. I know it’s terribly politically incorrect, but a poster I once saw sums up the (Internet) situation. I’ve replaced the offensive prefix in [brackets]:

    “Arguing on the Internet is like running in the Special Olympics: It doesn’t matter who wins….in the end, you’re all still [blog]tards.”

  2. My blog is open to new comments, today only!

    I guess I agree with you that an open forum on facility status is just kinda dumb.

    In a situation like this, I want:

    a) To know what’s going on, specifically, even if I have to go find out what that means
    b) A time estimate, or, preferably, range (best case, worst case)
    c) An explanation when the situation is over, or is largely over, that’s honest about what happened
    d) A solution so that the probability that this will happen in the future is minimal
    e) Compensation, if appropriate, even if it’s just “we credited your account $2 for today’s downtime” or a coupon for x% off my next renewal

    I think where a lot of hosting companies fail is in clear communication about what the situation currently is and in accurate estimates of what it will take to bring the system back up. That’s more frustrating than silence–almost.

    P.S. If xkedata went down for a day, I’d actually make money, such is its lucrative revenue stream. 😉

  3. An open forum on anything support, whether it’s a data center or a webhost, is just plain bad.

    While you are using Peer1 as an example, have you ever strolled onto DreamHost’s status website or official blog? Whenever something breaks badly (they run into about 4-5 ‘bad’ breaks a year, or their homemade billing system starts charging people…), thousands of people complain.

    However… in their defense, they truly don’t censor any of it, so you truly do get competitors advertising in their comments.
