Cisco Blogs

The New Bronze Age: SLAs too high and they prevent innovation, too low and they prevent operation

October 10, 2011

Where I grew up, you could buy individual cigarettes. While I played ball at the park, I’d see the young men approach the paper kiosk to get a cigarette. Not a pack, just one lonely stick. The customers overpaid on a per-cigarette basis, but it helped them manage their budget. I’d watch them and think nothing of it. It was normal.

People could also buy shampoo in ketchup-sized packages. Unilever still sells them in India. I grew up in the third world. It was the bronze age, but only on good days. We’re back to bronze with cloud computing, and I’m hyper ready.

For me, the biggest invention cloud computing brings about is unreliable service levels, and with it the realization of how important it is to have low-quality service levels available on a metered basis. A metered basis the customer can manage. Hear me out.

Today, Amazon’s block storage is unpredictable for databases. The latency in the network is funky. Machines fail to start. Machines don’t fail to fail. Service levels in the cloud don’t exist.

This is not your typical datacenter. It’s a bronze age datacenter. No great expectations, but diminished expectations. And for a young segment of the market, it’s just right and couldn’t be better.

I sat down with a young startup and asked them why they use cloud computing if it’s so unreliable and requires so much more coding.

Answer: They have more time than money. And with the money they have, they have to be parsimonious, avaricious and cautious. They are OK coding more to deal with the cloud’s weirdness. But running out of cash would kill them. The bronze age suits them just fine.

So all the cool kids in Silicon Valley are super excited about writing software for “Designed-to-Fail” infrastructure. We can’t wait for a chaos monkey to spank us. Well… that’s a San Francisco thing.

So what’s the lesson of this meditation? It’s that service levels are important. Too high and they prevent innovation, too low and they prevent operation.

Here’s how somebody is going through the process of figuring it out. They were kind enough to write about their thinking, so be sure to say thank you. Here it is:

If you think about combining these primitives in customer-friendly ways, you’ll probably recognize that it’s often the case that most file shares don’t require real-time remote replication and failover.  It’s also usually the case that transactionally-intensive applications can’t get by with only a daily backup.

The goal is to combine performance and protection attributes into something mere mortals can understand.  You might end up with something that looks like this:

  • File Share Class 1 – base offering for generic file storage
    (low bandwidth, moderate latency, “business support” availability)
  • File Share Class 2 – if you need something special
    (moderate bandwidth, moderate latency, “business critical” availability)
  • Decision Support Class 1 – best for running departmental reports
    (moderate bandwidth, moderate latency, “business support” availability)
  • Decision Support Class 2 – bigger data marts and small warehouses
    (high bandwidth, moderate latency, “business critical” availability)
  • Transaction Support Class 1 – logging requests or events
    (moderate bandwidth, moderate latency, “business critical” availability)
  • Transaction Support Class 2 – the bigger OLTP apps
    (high bandwidth, low latency, “mission critical” availability).

Ideally, you could move a data set between, say, Class 1 and Class 2 (or back!) non-disruptively. Broadcasting your ability to do so would help keep people from over-specifying up front, and would give them a place to put that oh-so-important project when it ain’t so important anymore.
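
To make the catalog above a little more concrete, here is a minimal Python sketch of it as plain data, plus a helper that picks the lowest class in a family that still meets a stated need. None of this comes from the quoted author. The names (ServiceClass, CATALOG, lowest_class_that_fits) are hypothetical, and the orderings are assumptions: low < moderate < high, “business support” < “business critical” < “mission critical”, and Class 1 is the more modest offering than Class 2.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """Coarse rating for bandwidth and latency (ordering is an assumption)."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


class Availability(IntEnum):
    """Availability labels from the list; the ranking is an assumption."""
    BUSINESS_SUPPORT = 1
    BUSINESS_CRITICAL = 2
    MISSION_CRITICAL = 3


@dataclass(frozen=True)
class ServiceClass:
    family: str          # e.g. "File Share"
    tier: int            # Class 1 or Class 2
    bandwidth: Level     # higher is better
    latency: Level       # lower is better
    availability: Availability


# The six classes exactly as listed in the quoted passage.
CATALOG = [
    ServiceClass("File Share", 1, Level.LOW, Level.MODERATE, Availability.BUSINESS_SUPPORT),
    ServiceClass("File Share", 2, Level.MODERATE, Level.MODERATE, Availability.BUSINESS_CRITICAL),
    ServiceClass("Decision Support", 1, Level.MODERATE, Level.MODERATE, Availability.BUSINESS_SUPPORT),
    ServiceClass("Decision Support", 2, Level.HIGH, Level.MODERATE, Availability.BUSINESS_CRITICAL),
    ServiceClass("Transaction Support", 1, Level.MODERATE, Level.MODERATE, Availability.BUSINESS_CRITICAL),
    ServiceClass("Transaction Support", 2, Level.HIGH, Level.LOW, Availability.MISSION_CRITICAL),
]


def lowest_class_that_fits(
    family: str,
    bandwidth: Level,
    max_latency: Level,
    availability: Availability,
) -> Optional[ServiceClass]:
    """Return the lowest tier in `family` that meets every requirement, or None."""
    candidates = [
        c for c in CATALOG
        if c.family == family
        and c.bandwidth >= bandwidth
        and c.latency <= max_latency
        and c.availability >= availability
    ]
    return min(candidates, key=lambda c: c.tier, default=None)


if __name__ == "__main__":
    # A generic file share that only needs the basics.
    print(lowest_class_that_fits(
        "File Share", Level.LOW, Level.MODERATE, Availability.BUSINESS_SUPPORT))
```

Starting from the lowest class that fits is one way to avoid over-specifying up front. If the project gets more important later, you ask again with stricter requirements and move up a class.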

