How do you recognize a good network or datacenter design? What contributes to a good design? How do you learn to do network design well, and then continue improving?

Spotting design problems is key to the network assessments my employer, NetCraftsmen, does. There are other important aspects of assessments: network configuration (execution of the design and attention to details), and performance (network bottlenecks, poor choices of equipment, etc.). However, problems in those areas are more often flaws in implementing the design than flaws in the design itself. If the design itself is flawed, good implementation cannot do much to compensate. We generally require one of our Craftsmen Assessments so we can identify and fix important design problems before agreeing to provide support via our Craftsmen Assurance Managed Services program.

Whether conducting a new design or an assessment, the first step is determining the requirements, both business and technical. You can build a super-fast network, but if the business needs resiliency and high availability, you’ve focused on the wrong thing. I’ve seen a few networks that I would describe as a “CCIE Practice Lab”, with a lot of complexity. If most of the staff (or the contractors) cannot maintain that, and if the network has constant outages, it is likely not the right design: It is failing to meet the business needs.

A recent complicating factor is what I have been calling “integrated design.” Security and server/virtualization staff may have their own needs. If one group (network, security, server/virtualization, or even storage) gets things all its own way, that may not be a good thing for one of the others. The problem here is that what is good for one technical area may impose a poor design and/or complexity on another.

One example: VMware Layer 2 desires (often poorly understood use cases) can drive a design to inappropriate Layer 2 Datacenter Interconnect (DCI) – inappropriate because it doesn’t actually help with VMware, and is later regretted. Another example is security imposing firewalls or complex segmentation where an alternative approach might be less cumbersome, cheaper, and perform better.

Every once in a while I’ve been involved in a complex design, or in trying to understand one that is already in place. Is such complexity appropriate, or not? The answer depends on whether the design is sustainable. Are there enough staff, with enough skills, to support the network? Is the network unnecessarily complex, or is there a simpler way it could have been built? Are there going to be outages because of maintenance mistakes, due to all the complexity? Or worse: Is there already a history of outages, “self-inflicted wounds” (due to complexity, not brain dropouts)?

Shifting gears, how does someone learn to design networks, and to recognize good and bad designs?

I’ve found that working with many designs helps. Exercising critical thinking to decide whether something you’ve never seen before is good design can build skills, especially if you focus on the pros and cons of the design. Thinking about alternatives can help – if you think a design is bad, how would you improve it? What should be changed?

If you’re going to work with design, you’ve also got to be prepared to explain it to management, customers, or whoever is the decision maker. You may have a great idea, but if you can’t explain why it is great, it is not going to be implemented. Pro/con analysis (critical thinking), diagramming skills, and verbal skills are factors there. Knowledge of hardware capabilities, performance factors, quirks, and reference designs also helps.

There are some good design sources out there. I used to love the Cisco SRNDGs (Solution Reference Network Design Guides). They’ve been updated over the years and replaced by the Cisco Design Zone. Easy-access URLs for that: cisco.com/go/designzone and cisco.com/go/cvd.

There are some interesting documents to be found there. Some gaps, some design documents that are not located there, but a lot of Good Reading. I don’t necessarily agree with 100% of them, but I’d say 90% isn’t shabby.

The Cisco Validated Designs (CVDs) you’ll find there are a great intellectual resource, and a useful labor-saver because Cisco engineers put some solid work into testing and validating the design and alternatives, and writing it up. I love the idea of Front Door VRFs in the IWAN design guide, for example.

There’s another reason to read, understand, and follow a CVD, if there’s one that’s relevant: TAC is probably going to be well-prepared to support it, and due to the validation, you may encounter fewer bugs if you stick with the CVD. I’ve worked with some customers recently that got themselves far into the fringes of new technologies (e.g., trying to make LISP VM-mobility work with EIGRP OTP; thanks to Jim Kiker for his efforts on that!). When a bug was encountered, it took a while to get to where TAC could confirm the bug existed and initiate getting it fixed.

That example might have been particularly pernicious because it may have crossed internal programming and TAC team boundaries. This suggests a design principle: don’t mix new technology from Routing/Switching and Datacenter areas. This is something I’ve also seen with IP multicast + WLAN + MPLS and VRFs. The TAC engineers tried hard, but when we said “VRF” the WLAN folks had to bring in the R&S folks, and it took quite a while to get traction on the problems. The design tidbit learned there is that it is a Really Good Idea if the AP and WLC management interfaces are in the same VRF.
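That last tidbit is the kind of rule that is easy to check mechanically. Here is a minimal Python sketch of such a check, assuming you can export each AP's management VRF and controller, and each WLC's management VRF, from your own inventory or configs. The device names, VRF names, and data layout below are all hypothetical; nothing here talks to real devices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPoint:
    name: str
    mgmt_vrf: str  # VRF that carries the AP's management interface
    wlc: str       # name of the controller this AP joins


def find_vrf_mismatches(aps, wlc_mgmt_vrf):
    """Return (ap, ap_vrf, wlc, wlc_vrf) for each AP whose management VRF
    differs from that of its controller. A controller missing from the
    inventory shows up with wlc_vrf set to None."""
    mismatches = []
    for ap in aps:
        wlc_vrf = wlc_mgmt_vrf.get(ap.wlc)
        if wlc_vrf != ap.mgmt_vrf:
            mismatches.append((ap.name, ap.mgmt_vrf, ap.wlc, wlc_vrf))
    return mismatches


# Hypothetical inventory, purely for illustration.
WLC_MGMT_VRF = {"wlc-1": "MGMT"}
APS = [
    AccessPoint("ap-lobby", "MGMT", "wlc-1"),
    AccessPoint("ap-warehouse", "GUEST", "wlc-1"),
]

for ap_name, ap_vrf, wlc_name, wlc_vrf in find_vrf_mismatches(APS, WLC_MGMT_VRF):
    print(f"{ap_name} is in VRF {ap_vrf}, but {wlc_name} management is in VRF {wlc_vrf}")
```

Run against this made-up inventory, only ap-warehouse is flagged. The point is not the code itself but that a design rule stated this concretely can be audited before it turns into a TAC case.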

To wrap up, concerning design, there’s a course for that. I recommend the Cisco ARCH course and CCDP certification, which has CCNA (R&S) and CCDA (DESGN course) as prerequisites. There’s also the less well-known but prestigious CCDE certification, which has a slightly different focus. Reading books by my friend Russ White also might help with that!

My colleague Carole Warner Reece and I developed the 2.0 version of the ARCH course, and learned a lot in doing that. We based it primarily on CiscoLive slides that had been developed over the years, so a lot of credit is due to the Cisco authors of those slides. The e-commerce topic was probably one topic too far – we had to skimp since we were exceeding the slide and time budget, for what can be reasonably covered in a one-week course.

You might also consider reading the ARCH Cisco Press book, authored by John Tiso (and tech edited by yours truly): Designing Cisco Network Service Architectures (ARCH) Foundation Learning Guide: (CCDP ARCH 642-874), 3rd Edition.

A parting thought: the wealth of material Cisco makes available, both documentation and especially Use Cases and the CVD documents, really helps. I’ve recently been working with documentation by several other vendors. I have really noticed the lack of concrete design use case documents or CVD equivalents.

By the way, NetCraftsmen also does server/virtualization and other types of assessments. That’ll have to be the topic of a different post!

Twitter: @pjwelcher


Peter J Welcher

Networking Consultant