Rob Hirschfeld on OpenStack Core: Your baby is ugly, and so is everyone else’s.
Rob’s not talking about real babies, of course. He’s talking about OpenStack Core and how shiny new projects that need nurturing and care, as babies do, are not good candidates for it, no matter how beautiful and chock-full of potential they may be to their doting supporters in the community. Core is a place for mature, stable projects that people can depend on.
And that’s just a tiny little preview of the spot-on insights Rob shared during our interview with him on OpenStack Podcast #14. He also spoke about:
- How Crowbar came to be and what it does
- The perils of saying “Yes” to everybody
- The mission of DefCore
- How a benevolent dictator might help contributors focus on what the OpenStack community really needs
- How the OpenStack Product Group can serve as a bridge between developers and customers
- Why having code designated as “Core” kind of sucks
- Why the word “the” can be problematic
A full transcript of the interview follows.
Niki Acosta: All right and we are live. Good morning viewing audience or good afternoon depending on wherever you may be. I am Niki Acosta from Cisco and I am serving today for Jeff Dickey of Redapt. We are here with a very awesome guest today, we are super excited to have Rob. Rob, introduce yourself.
Rob Hirschfeld: My name is Rob Hirschfeld. I am an OpenStack board member, I’m also the founder of the OpenCrowbar Project which is a physical infrastructure provisioning project. I recently stepped out of Dell where I was leading the OpenStack and Hadoop technical project to pursue OpenCrowbar full time. We’ve built this company called RackN or RackN as we like to say where we’re working on building up that as a supported platform and really being able to help customers leverage the potential we’ve opened up with Crowbar.
Niki Acosta: Wow. Rob, I’ve known you for quite some time. Talk about somebody who’s been involved in OpenStack from the very beginning and I definitely want to talk to you about that. Before we do that let’s go back a little bit further. Obviously you and I are in the same city-ish. Talk to us about your … How you got into tech.
Rob Hirschfeld: Sure. Let me start where I got into Cloud. I got into Cloud, believe it or not, in New Orleans in ’99 with partner in crime Dave McCrory, who is pretty well known in certain circles for his data gravity work. He and I basically started at what was at the time called an application service provider back in ’99. We would now call it an IaaS or a Cloud. I’ve been playing at this game for quite a while, trying to make it work, trying to figure out what the right way is to make infrastructure more available.
Niki Acosta: Where are your roots? Are your roots in systems administration? Are they in devops? What do you classify as your roots?
Rob Hirschfeld: I’m an engineer by training, industrial engineering, which might sound very strange, “How does an industrial engineer get into Cloud?” What I found over time is that Cloud is about process, it’s about automation and recently I’ve been really involved in lean, agile, that type of process. That type of process translates into what we do in Cloud. It’s actually a natural evolution. It’s the same type of revolution we saw in industry in the ’90s, when we moved into just-in-time manufacturing.
Cloud has been creating the same overall change in the industry as just-in-time and lean deployment have. In operations we’ve really changed the way we operate IT. I put that frame on, I usually consider myself a developer. I’ve been doing ops since ’99 because I found it was so hard to be a developer and actually deploy things. I’ve been trying to make that part of my job easier ever since.
If you look at my career it’s always been about how do we take great ideas people can write as code and then make them usable for people at scale. Sometimes that scale has two users at a time, sometimes that can be hundreds or thousands. It’s a significant gap. It’s much easier to write code than deploy code unfortunately.
Niki Acosta: How was the Crowbar project… I guess you… It might be great to sum that up for people what intentionally it was. But how did that get started and fast forward now to where you are today with it.
Rob Hirschfeld: Crowbar started frankly because my team was up against the ropes at Dell. Not through anything we expected, we had a new class of servers from Dell that were cloud-enabled. We were dealing with (this was pre- OpenStack) we were dealing with the Azures and the Eucalyptuses and the early Hadoops, some of the early big data plays. We had the software, we had the hardware, but we would go to customers with it and fall apart. It was just really, really hard.
What we discovered was… This is the same time devops was getting defined as a word. What we found was this huge gap between the software and the hardware, we had to fill it with some automation, otherwise we would be doing one-off deployments every single time. Crowbar came out of those battle scars of: We can’t walk into a data center and hope to get the install right in a reasonable amount of time. We had to automate our processes and do it in a repeatable way.
So we could duplicate it in our lab, we could duplicate it in the field and then we could go back to that site six months later and have them still be successful. Crowbar was really taking those best practices and automating the best practices, that’s really how we see Crowbar. When we did it for OpenStack people … We described it as an OpenStack installer. That’s really sort of what we needed to describe it as. But when you look at what Crowbar is really doing, it’s really an orchestration system that sequences physical operations as part of provisioning.
So it’s really a system tool. I wouldn’t call it system management–that has its own meaning–but it really is responsible for coordinating all the operational activities and start doing scale deployment.
Niki Acosta: That is used for… Namely for installation and then what, for scaling as well?
Rob Hirschfeld: This is one of the things, you asked me about a little bit of history. We started Crowbar back in… The 2011 time frame is when we open sourced it and we had our first OpenStack install. Crowbar was actually the first OpenStack installer back in the Cactus days. I’ve been doing OpenStack and OpenStack installs for a long time. When we took the feedback and literally re-architected Crowbar into OpenCrowbar…
Niki Acosta: Oh, oh—we lost you?
Rob Hirschfeld: You lost my audio?
Niki Acosta: We’re losing a little bit. There we go. We’re back.
Rob Hirschfeld: Okay. We literally re-architected Crowbar as Crowbar2, and in that process we defined something called ready state and it’s been a really important thing for us. We found that there are a lot of ways to deploy OpenStack: Chef, Puppet, Ansible, Salt, and there’s a whole bunch of installers out in the market. Packstack is a good example of one that we’ve been playing with a little bit.
What we found was that all of the installers for them to work had to get to a point where there was a good ready state. The networking was set up, all of the infrastructure was passed out, keys were set up, all of that work. One of the things I know we want to talk about is how do you make this successful OpenStack, what are my favorite parts of OpenStack?
One of the things I really feel like is that for the community to come together around these installers we have to have a baseline so that somebody can say, “I did it this way,” and you can get to the same place and break in the same place when they’re doing installs. That’s a lot of what we want OpenCrowbar to be about. It’s about getting you to a baseline, a repeatable baseline and then you build on top of that. Then you can build all sorts of different ways on top of that.
We’re very flexible from that and what’s fun is that you don’t have to just build OpenStack on top of it because of this ready state boundary. You could build Cloud Foundry on top or Ceph on top or Hadoop or Mesos, or Kubernetes. Our goal is to take all of that pain from the physical infrastructure, be vendor-agnostic so you can swap out different types of hardware but your scripts above that would still run. Sort of a long answer but it’s a big vision. It’s really an important thing to help make operations more consistent for a community like OpenStack.
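The ready state boundary Rob describes can be pictured as a simple gate: a node is handed to a workload installer only once a baseline checklist is complete. The Python below is a minimal sketch under assumptions, not OpenCrowbar code. The `Node` class, the step names, and the `deploy` function are all hypothetical, loosely based on the items mentioned in the interview (networking set up, keys set up).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical record of one physical server being provisioned."""
    name: str
    completed: set = field(default_factory=set)


# Assumed baseline checklist; a real deployment would define its own steps.
READY_STATE_STEPS = ("network_configured", "os_installed", "ssh_keys_installed")


def is_ready(node: Node) -> bool:
    """A node is at ready state once every baseline step has completed."""
    return all(step in node.completed for step in READY_STATE_STEPS)


def deploy(node: Node, workload: str) -> str:
    """Hand a node to a workload installer only after it reaches ready state."""
    if not is_ready(node):
        missing = [s for s in READY_STATE_STEPS if s not in node.completed]
        raise RuntimeError(f"{node.name} is not at ready state; missing {missing}")
    return f"{workload} installer may start on {node.name}"
```

Because the gate does not care which workload comes next, the same check would serve OpenStack, Hadoop, or Ceph installers alike, which is the point of putting the boundary there.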
Niki Acosta: Being open I’m imagining that there would be a mechanism by which different types of hardware would be compatible with OpenCrowbar. Is that right?
Rob Hirschfeld: Exactly. Crowbar was designed to have basically an abstraction layer for hardware and so we can deal with the fact that you have a Dell gear, HP, Cisco. It really doesn’t matter from that perspective because what we do is we break out the control actions into very small pieces. Then each piece can be run separately. If it’s a Dell, actually Dell has two types of gear that we’ve dealt with.
One type you have to make all your changes and corrections internal to the system, what we call the side-band or in-band change. Then some types of gear you only… You control them through their BMC networks, in their out-of-band control planes. That’s perfectly typical. We find the amount of gear, variety of gear, very high and we just have to deal with it. We have to deal with it inside of Dell, we have to deal with it in every data center.
We can’t… practically… maybe if you want to buy from only one vendor and only one model you could have a completely homogeneous system. You get to a thousand nodes and even from one vendor there’s a fair bit of variety.
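One way to picture the abstraction layer Rob describes is a common interface of small control actions, with one implementation per control path. This is an illustrative Python sketch, not Crowbar’s actual design. The class names, method names, and returned strings are hypothetical, and a real driver would call vendor tooling rather than return text.

```python
from abc import ABC, abstractmethod


class HardwareDriver(ABC):
    """Hypothetical interface: each method is one small control action."""

    @abstractmethod
    def set_boot_order(self, node: str) -> str: ...

    @abstractmethod
    def configure_raid(self, node: str) -> str: ...


class InBandDriver(HardwareDriver):
    """For gear whose changes are made from inside the running system."""

    def set_boot_order(self, node: str) -> str:
        return f"{node}: boot order set in-band"

    def configure_raid(self, node: str) -> str:
        return f"{node}: RAID configured in-band"


class OutOfBandDriver(HardwareDriver):
    """For gear controlled over the BMC network."""

    def set_boot_order(self, node: str) -> str:
        return f"{node}: boot order set via BMC"

    def configure_raid(self, node: str) -> str:
        return f"{node}: RAID configured via BMC"


def prepare(node: str, driver: HardwareDriver) -> list:
    """Run the same sequence of actions regardless of the control path."""
    return [driver.set_boot_order(node), driver.configure_raid(node)]
```

The calling code in `prepare` never branches on vendor or model, so swapping hardware means swapping the driver and nothing above it.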
Niki Acosta: Now, speaking of vendors you obviously spent a long time at Dell and there’s always been this notion–at least how Rackspace founded OpenStack–there was a notion that the hardware really shouldn’t matter. You should be able to put whatever kind of hardware you want or make your own or buy in mass quantities from some black label provider. Do you believe that’s true? How much does hardware matter in OpenStack?
Rob Hirschfeld: Really interesting question. Um, the hardware … The hardware shouldn’t matter at all to the end user of course. The whole purpose of OpenStack is to make all of the mess that is a physical data center go away. Unfortunately when you’re building something like OpenStack (Hadoop is a really good example, and Ceph, and Swift inside of OpenStack), they all really care about the physical topology, the infrastructure.
You care about your switch layouts, you care about your hard drive configurations, you care about which machines are next to each other in the racks. Because they have … they’re in the same power zones. Those details do matter, the vendor doesn’t matter. The vendor is much more specific just from your operational needs and what you’re trying to accomplish but there’s a really interesting movement called OpenCompute that we started to play in a little bit where people try to open source the hardware designs.
Even with open source the hardware still comes from a vendor. It’s still Supermicro or Dell or HP or Quanta making those boards. It’s not until you get to Facebook scale that you go directly to the OEM and tell them what to make. The hardware… We thought that we could make hardware not matter. When you’re dealing at the hardware level, the hardware matters. You have to get it right, you have to create an abstraction for it.
What we had to do is we decided that we needed a boundary and to say, “All right, we’re gonna let … We’re going to try to keep all hardware details below this level, this ready state level. Then focus our scripts on dealing with the abstraction above it.” I think we’re getting to a point where there’s sort of three layers, right. There’s a cloud user level where you’re using OpenStack and that OpenStack infrastructure is very portable and homogeneous. There’s a level below that where you’re doing physical ops on top of an abstraction boundary so that the scripts should be the same site to site.
Then there’s the actual physical ops themselves where you have to deal with the variety of NICs and RAID cards and topologies and how people want to manage their gear.
Niki Acosta: At Cisco, probably more so than when working at Rackspace. When I joined Metacloud we dealt with a ton of enterprise users and it seems like the enterprise users actually do care what the hardware is but I’m not sure it’s for the right reasons. Obviously your data center guys and your infrastructure teams have probably been working with the same vendors for eons and have these big vendor relationships.
What happens to all the intelligence that people have laid down on all their boxes? Does that go away in the Crowbar scenario or do they still have access to that, the tools and different things that they can use to troubleshoot hardware?
Rob Hirschfeld: From our perspective a lot of those tools we’ve tried to preserve and work with. Crowbar’s job is not to be very opinionated. We started off more opinionated and we heard very clearly that we were way too opinionated by selecting Chef. What we found is that there’s a couple of aspects, one is that people are heterogeneous by design. Most data centers, to prevent themselves from being locked in to one vendor, pick multiple suppliers. They do it for business continuity reasons. In some cases they’ve done it because they’ve acquired companies, or they have different projects and have bought different gear and they have to pull those together.
Also very normal, we see that a lot in the devops tools. You’ll get a team that likes Chef and a team that likes Puppet and then they have to come together and live in harmony in the operations environment. What we’ve tried to do is make those things much more neutral and sort of be a neutral territory for that. When we were first building OpenStack certified hardware, building Dell’s first reference architecture—and I know that I was working with your team at Rackspace when we were in that process.
We would talk to one customer and they would have one opinion, so we’d morph to that opinion. Then we talk to somebody else and they’d have a second opinion and maybe even a third opinion. They’d be fighting internally. We found that it just wasn’t worth telling people they were wrong because they weren’t. There are six different correct ways to install OpenStack at least–probably exponentially more than that. There’s a lot of different hardware that works for this. I think the market over the last four years has converged into two or three patterns that are pretty consistent.
For example people with compute typically are going for a 1 or 2 U box with six drives and teamed 10 gig NICs and dual proc with 48 to 96 Gigs… I can give you … It’s not that hard to come up with the general spec. Even inside that spec you see how much wiggle room I’m giving myself. The reality is that’s normal and it’s a fool’s errand in my opinion to try and tell people they’re wrong when they have just a reasonable alternative. Our job was to create ways that you could abstract that… From Dell selling OpenStack reference architectures.
If I told people, “Hey, we don’t use teamed 10 gig NICs,” which our original spec didn’t, they told us no. It wasn’t helpful and they weren’t wrong. It was more expensive. This is the biggest shock. Maybe it comes back to your question. Sometimes people make choices that are much more expensive than they have to because it’s what they’re comfortable with.
Because they’re comfortable with the vendor or because they think they need 10 gig teamed NICs where they want to physically segregate their public and private traffic onto different physical networks. It’s not helpful for me to tell them they’re wrong and to me this is part of the challenge with community. One of our big challenges here is that in communities there’s a lot of right answers. There’s a lot of people who are right, there’s a lot of use cases that are different. You get into a weird corner if you tell everybody, yes.
You’ll also get into an equally weird corner if you don’t tell anybody no, and I think that as we bridge into the OpenStack community and things like that we can come back to that.
Niki Acosta: Have you learned that the hard way? Telling people “no” is just as important as not telling everybody “yes.”
Rob Hirschfeld: On the OpenStack side and I hope we have people who have … Who are here to talk here about OpenStack, to talk about the OpenStack core pieces. Two years ago we had a real dilemma about how we were going to define OpenStack core. OpenStack was still growing–at that point it seemed modestly compared to today–but it was still growing, and people were having trouble figuring out how to make two OpenStack clouds work together.
Most famously Rackspace’s and HP’s didn’t, although we maybe gave them a harder time because they were the first for something that’s become a pattern a little bit.
Niki Acosta: Thanks for that Rob.
Rob Hirschfeld: This is the challenge being first, right? I think that kudos to getting the sites up. I think that this was a classic case where the aspirations of having a uniform public cloud across multiple vendors and the reality of what it took to do that was much, much more sophisticated than we thought. In part because we had to say no to things in order to create this interoperable base, and we didn’t. We spent a lot of times building a community which meant saying yes to people.
You see this in the OpenStack community today as we keep adding more and more projects. We love to bring in developers, we love to bring in more projects, we like saying, “Yes, we want your code.” That’s exactly what a community should do. As an operator–and when I talk to operators–that same behavior is very frustrating. They will turn around and say, “Wait a second do I have to have this component? Do I have to have that component? Is Ceilometer a required piece? Or can I substitute?”
Leaving that ambiguous is very frustrating to the community and it leads to funny behaviors where… I was listening to Michael Still talk about Nova, and additions that were made into Nova to support, I think it was Trove–I don’t remember exactly which project–that were causing other projects to have challenges in implementation, or something was getting stalled in the gate. It’s a complex series of interconnections.
The fact that Trove is a “yes” in OpenStack (which makes sense) while it is not yet clear to the operators whether it’s a required piece really causes a lot of confusion. Then it causes us to make changes to the APIs to support the project because we feel like it’s required. Then we potentially interrupt other components that other people will think are required. You end up with this interlock dilemma of who’s most important. OpenStack has been going through some really interesting things on the TC side with this levels definition.
This big tent says, “We’re going to say yes to more people,” and levels says, “But not all animals are equal,” in my Animal Farm reference. There needs to be somebody at the bottom that says, “These things have to integrate together as level zero.” Then we layer things on top of it and more radially. Things at the base have to be there that brighten my sunshine and then these two pieces don’t necessarily have to be related. It’s part of how you grow a project the size of OpenStack.
Jumping all the way back this was just an emerging problem two years ago when we started this core definition work where there’s a lot of stress between Swift and Nova and how we would work out what was required, what wasn’t required. We already singled out Ceph deployments where people were substituting Ceph for Swift. I wish they’d made those phonetically more different.
What we’ve seen here is that we needed to be able to tell people very clearly “This is what you had to have to have an OpenStack cloud.” “This is what you didn’t have to have to have an OpenStack cloud.”
Niki Acosta: That’s a tough thing to do, right? Just by sheer number of vendors that are involved, you look at all the projects, you look at all the priorities of different companies that are participating. You’re trying to sort through all of that and figure out what you need, what you don’t need, what’s required, what’s not required. Does this work seamlessly with this other project? If I implement this, does this other one work or does it break? That can be a very, very difficult process.
To the point where someone might just throw their hands up and say, “I’m done with OpenStack.” I think we’ve seen a lot of people go that route, the DIY route and say, “I’m going to build this on my own. It’s going to be great.” Then they start experiencing some of these intricacies and they just say, “Whoa.” Rob, you’ve done probably more than anybody else on the DefCore front. If you wanted to talk … Maybe you define DefCore for the people, I guess we already have.
You’ve written a lot of blogs about it, you’ve talked to a lot of people about it and I think it’s an important movement. One that I certainly get a lot of questions about. Let’s hear a little bit about that.
Rob Hirschfeld: I’d be happy to talk about it. It’s not as big and scary as some people think. I’ve had people come back and say “When you explained it—it all makes perfect sense now.” Let me see if I can help take the people who are DefCore’s critics and explain what we’re doing. The first thing that people don’t realize is it’s about commercial use of OpenStack. We get a lot of people up in arms because they think we’re trying to run the technical side of this and it’s not at all that.
What we’re really doing is, the OpenStack Board controls the trademark for OpenStack. It lets people use the word OpenStack in their product and use the logo. If you want to do that we need to … For actually trademark management reasons we have to tell people you can use or can’t use OpenStack in this way. If we don’t do it we actually could lose control of the trademark. It’s very important for us to describe that but it’s only commercial.
People using it in the community and contributing code, that’s managed by the TC, the Technical Committee, and they control which projects are in and which code is in and all that stuff. That’s the first thing that sort of gets people up and people understanding. We have a very commercial flavor about what we’re doing for DefCore because we’re trying to help create a commercial ecosystem.
I’m very unapologetic, right, OpenStack has to make money for the people, the companies that are paying, the developers who participate. We’re at over 90% corporate sponsored development in OpenStack, I think the number is even higher. The people sponsoring those developers have to see some return for their investment. It’s all tied into that. I hate to be … In an open source community you want to be able to say, “Kumbaya, we’re all doing this because we love open source.”
But OpenStack is not exactly a Kumbaya project. It’s operations, infrastructure and at the end of the day like you were saying, it has to deliver workloads. It has to be stable. Stability is a primary feature for OpenStack even more than some of the bells and whistles we want to add on. That said, when you start looking at Core you’re going to have to say no to people. Right, this is, I’m thinking, going to be our theme for the podcast: you have to have a way to say no.
It doesn’t work to say no if you just look at them and say, “You know I don’t like you. No.” You have to give them a way to … A reason why you’re saying no, you have to tell them what it would take to say yes and that’s a lot of what we spent the last two years doing with DefCore. In DefCore we started with some basic principles, to describe how everything fits together. We made a decision at the time that it would be test driven so it would be very quantitative, not qualitative in making decisions.
Then we had to say, “How do we pick the test?” We spent another couple of months figuring out how to pick tests and came up with 12 criteria that say, “These tests are going to be in, these tests are not.” They give very clear signals to the community. It seems like documentation is part of it, use is part of it. That is important. Then we also had to say which parts of the code would be required because OpenStack is not just an API, it’s also a project with living code. We had to go through a process to say, “This is part of … These parts of the code are required or not.”
The reason we’ve done that is when we start saying, “You can be. You are core. You’re not core.” It’s really not as much you are core, you’re not, you’re telling a vendor “You must implement these parts of OpenStack, you don’t have to implement those parts of OpenStack.”
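A quantitative, test-driven selection like the one Rob outlines might look roughly like the Python below. It is only a sketch under stated assumptions. The interview says there are 12 criteria and mentions documentation and use, but it does not give the full list, any weights, or a cutoff, so the two criteria names, the test names, and the `cutoff` parameter here are hypothetical.

```python
# Hypothetical stand-ins for the 12 criteria mentioned in the interview.
CRITERIA = ("documented", "widely_used")


def score(flags: dict) -> int:
    """Count how many of the assumed criteria a single test satisfies."""
    return sum(1 for criterion in CRITERIA if flags.get(criterion, False))


def select_required(tests: dict, cutoff: int) -> list:
    """Return the names of tests whose score meets the assumed cutoff."""
    return sorted(name for name, flags in tests.items() if score(flags) >= cutoff)


# Usage with made-up test names:
candidates = {
    "test_boot_server": {"documented": True, "widely_used": True},
    "test_exotic_feature": {"documented": True, "widely_used": False},
}
required = select_required(candidates, cutoff=2)
```

The value of scoring this way is that a “no” comes with a reason attached: a test that misses the cutoff shows exactly which criteria it failed.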
Niki Acosta: Do you expect a shorter list of vendors that meet a ton of criteria or do you see a larger list of vendors that meet a smaller set of criteria? Where is that balance?
Rob Hirschfeld: It’s a great question. We actually were struggling with that balance on the Board. Because we have some vendors, SwiftStack is a really good example but they’re not the only one, there are going to be others in their wake. They just want to use Swift, right? They don’t need to use OpenStack Nova in their product. That doesn’t help them. They really want a core definition that fits for them for their use. We have some vendors like DreamHost who want to use Ceph, they were one of the original proponents for Ceph.
They want to use Ceph as their object storage instead of Swift. They want OpenStack Nova and the packaging around the compute side and they don’t want to be told they have to use Swift as part of their deployment. In those cases, yes, we can support smaller vendors–this is the change we made in October–to create OpenStack components for core so there’s a core component concept.
Then there’ll be broader level vendors, I know IBM is very interested at this level, I believe Red Hat will be too. At the platform level where they say, “We use all the components. Everything is good.” I think Rackspace will be a component because they have the history with Swift would be a component or a platform level where they are using both Swift and components but everything…
There’s no free ride with any of this. I haven’t talked to any vendor in the DefCore process that said, “We’re already in compliance. It’s not going to be a big deal for us.” A great example is Keystone for Rackspace… Rackspace famously didn’t implement Keystone for I believe very sound technical reasons when they made those decisions. They have to figure out how they’re going to implement the Keystone requirements. Today Keystone code isn’t required–just the APIs. Actually not even the APIs in our current set.
It’s very important to understand how all these pieces work together but we definitely have an outlet for people who don’t want the whole project. We have an added mark or more flexible mark for people, vendors who can implement the whole project. Then what’s very important to me is we have a community process by which people can see what’s coming with this. Can talk about which capabilities and tests they think should be in Core. They can talk about which code should be required or not.
Our goal is not… And really the board isn’t capable of doing this. I can explain capable maybe a little bit more. We’re not in a position to make a by-fiat decision with 24 board members. We want this, we don’t want that. We have to have a way that people can say, “I see what you’re doing. I come to you with an opinion. I have an objection to this.” If the objection makes sense we incorporate it. If they don’t reflect the majority view then we have an outlet so that they don’t get trampled on by the community.
There’s been a lot of nuance in those things. There’s a lot of safety valves in the process but we at the end of the day have to be able to define something, say “This is the limit,” and then move on.
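The component and platform levels Rob mentions earlier in this answer can be sketched as a set comparison: a vendor qualifies for a component when it passes every capability required for that component, and for the platform when it qualifies for all components. This Python example is an assumption-laden illustration, not the DefCore process itself. The component names, capability names, and the `qualifying_marks` function are hypothetical.

```python
# Hypothetical required capabilities per component.
REQUIRED = {
    "compute": {"boot-server", "delete-server"},
    "object-storage": {"put-object", "get-object"},
}


def qualifying_marks(passed: set) -> list:
    """List the component marks a vendor qualifies for, plus 'platform'
    if every component's required capabilities were passed."""
    marks = sorted(name for name, caps in REQUIRED.items() if caps <= passed)
    if len(marks) == len(REQUIRED):
        marks.append("platform")
    return marks
```

Under this sketch a storage-only vendor that passes just the object storage capabilities would get the `object-storage` mark without being forced to ship compute.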
Niki Acosta: That’s got to be a really tough job. Just thinking about the scope of the number of people involved in those decisions and all the things that vendors are going to have to do to meet that core definition. At the end of the day for people who are seeking that commercial option, it’s probably a good thing. It doesn’t mean that if you’re not using … Someone who meets that core definition that you’re not going to get what you need, right?
There’s going to be probably some due diligence on the part of users to figure that out for their own sake, right?
Rob Hirschfeld: I think so. What we really want is people to start with the Core when they do an implementation then extend forward, right. People will pull in projects that they think are valuable. We want to know what those projects are so we can start adding them to Core. That’s one of the other things, this isn’t a recommendation, it’s a base. And then from that, if you want to add Trove (Trove’s a really good, I think, up-and-coming candidate to be a component of Core), then you have Trove, you start using it.
The more people use it the more likely it is to become a core component because here’s our goal. If I write an application and I’m very ecosystem focused with OpenStack. OpenStack has a great community, it has a great development community but it’s evolved. To be very frank on this. It doesn’t really do us that much good if we have a whole bunch of people getting together in exotic locations, Paris isn’t that exotic but in Paris or whatever, in Vancouver and we all pat each other on the back and say we’re great.
Because that’s internal inside our own community. The thing that we really want is a thriving ecosystem of people who are building applications, building tools and things like that above us. Right? Anyway, Amazon is fantastic for having a very vibrant ecosystem of people who understand how to use it and they build products that extend the product.
Niki Acosta: Pause right there. Let me ask you a trick question. Do you think vendors have been selfish in their interpretations of OpenStack?
Rob Hirschfeld: It’s a very interesting… Selfish in what sense?
Niki Acosta: Selfish in the sense that they may disregard things potentially that may be important to everyone for the sake of making their x work with OpenStack?
Rob Hirschfeld: The tragedy of the commons question if I can ask a clarification. I think that any open source project will have a certain need for tragedy of the commons. This is an interesting need to fix tragedy of the commons. The idea here is that one vendor contributes a lot into making OpenStack work. QA efforts and build efforts, right? HP has been doing a really good job with paying people salaries to do nothing but run the CI system. Right? It’s been fantastic for sponsoring that type of the work.
They are not the only ones but they certainly stepped forward from leadership perspective there. What I think you need to make sure is that people are contributing back upstream and being part of the community and doing things that help everybody else succeed. One of the challenges to me is that there’s two parts. One is you need a lever to make people do upstreams and we don’t really have… This is the benevolent dictator challenge and I’m hoping Core will fill this role a little bit.
Benevolent dictator can say, “We’re not doing what you want until you help the community.” We rely on the community to carry that message and say that. The social pressure is enormous on these companies to contribute and participate. I don’t see them being particularly selfish in that but it’s not very clear where they should be selfish or not selfish. Somebody might say, “I’m doing my community, my work by doing this, by contributing this piece of code.” It could be that piece of code isn’t what the community needs most, so they are not exactly being selfish but they are not being directed in the places that they shouldn’t be selfish.
I feel like it’s a little bit of a cop-out answer but I really feel like this ties into some work I was doing with Allison Randal and Sean Roberts about what we call hidden influencers, which became the OpenStack product managers group.
Niki Acosta: Yup.
Rob Hirschfeld: This was really exciting to me, after the Atlanta Summit we had some great conversations where people were pulling their hair out. The Atlanta Summit theme to me was “Where are the product managers, right?” We need product management. We look to each other, we realize that all these companies have product managers. Product managers understand OpenStack, they are relatively involved in the community but they have their own timelines to answer to.
A percent of time they allocate to the community, they do allocate some, we all agree on that, but it’s not allocated in a consistent way so they sometimes undo each other’s work. They sometimes show up at the wrong place.
Niki Acosta: And they don’t talk to each other, right?
Rob Hirschfeld: That’s 100% the challenge. So we started putting together a product managers group, it’s called now the product group. They are starting to have meet ups and things like that. Just talk to each other. The reason we called it hidden influencers was because they are not talking. They don’t show up as an organized group at the Summit and so you don’t realize that the developers in this … This to me is one of the real challenges with OpenStack and the process. We do a whole bunch of work in design and face to face things with developers in Summit sitting in a circle saying, “Yes I think we can do that, yes we should do that.”
At the end of the day developers really can’t make big commitments of their time if they are paid by a company to do certain work. While three developers from different companies might sit down and agree that they are going to support each other in doing the work, if their respective product managers don’t back up that commitment, you’re going to end up at the end of the release and we see this happen. You’ll end up at the end of the release with code that people want to come in but hasn’t been finished or hasn’t been supported by the other people who are needed to support it to get it over the finish line.
That’s where it’s maybe a subtle type of selfishness but during the release focusing on getting your own stuff done, which is good normally, can cost us features, collaboration and things moving, getting good flow through our gate process because people are more focused on their own work. That is the point.
Niki Acosta: It’s what’s slowing things down, right? It’s interesting, we talked to … At this point quite a few people on the show, and you speak to people who believe that there should be components–for reasons I can completely understand–that you have some proprietary special sauce that you put in or under your OpenStack powered cloud to make it your own. Then you talk to other people who are just like, “Hey, anything we write we’re going to try to give it back.”
They might develop something and they think it’s a really good idea and they might put in their project or in their product but then when they try to give it to OpenStack, OpenStack is like, “We don’t necessarily want that. Maybe if you do it this way.” You get all of these different forks of what could be really good ideas moving different directions. I think that a lot of the developers, the actual design conference portion of the summit is meant to address these.
But I think oftentimes I’ve sat in a lot of those design sessions and you end up getting a lot of developers and none of the business people who are trying to help, you know, meeting you halfway and helping meet their customers’ needs. I’m wondering if there’s a better way to do that, you know? I think the product group is a good start.
Rob Hirschfeld: I think product group is a really good start. I was concerned, is the right word for this? I felt like there was a lot of drift in Paris between the product teams, the business side and the teams that were productizing OpenStack in the developer summit. I think that we product managers were the place where that gets on together. It means that the developers are going to have to take those influences and reprioritize things at times.
One of the things about Core is and it’s funny because there’s two things. Let me track three threads with this. We have to actually change the bylaws that I need to encourage people to vote in the bylaws election. Let’s placemark that and come back to it because it’s interesting there. Being Core or not Core, originally we defined that as “project,” so Nova was core and Swift was Core.
Niki Acosta: Right.
Rob Hirschfeld: The reality is Nova is huge. Swift is huge. There are pieces in Swift that are stable and haven’t changed. There’s pieces in Swift that are changing and dynamic and new. Having your code designated as core as a developer sort of sucks, frankly because what you’re doing is you’re saying, “Hey, this code is in use and it’s production grade and people depend on it so don’t change it.” If you think you found a better way to swizzle this, you can’t do it here because if you change the APIs you break things. If you change the behaviors you break things. So code in Core really should be under a microscope, it should be slowed down, it won’t move as fast. It’s not where the innovation is going to occur in OpenStack…by definition. Core is about repeatable interoperable clouds and if we’re innovating at the core then we can’t interoperate because somebody’s going to be at a different version and it breaks everything.
Part of your thread with all this stuff is that we need to actually have much more fine-grained definitions of where the innovation goes and where the stable core is. I’m hoping that Core itself will be a signal that will help product managers differentiate and know when they are dealing with making Core more stable. Core extensions versus whole new projects that are more of the ecosystem around this. There’s another point in there and I lost it.
Niki Acosta: That’s okay. It happens. It happens to me all the time.
Rob Hirschfeld: I was so excited about tying the core and the other pieces together.
Niki Acosta: The members of the foundation, do they feel the pressure of all this happening? From the outside it’s probably easy to say, “OpenStack should just make up its mind and they should just decide and we should move on.” I can imagine it’s probably a lot harder for those who choose to run and are chosen to be a project team lead or be on the technical committee. How do you manage through all of those conversations and emotions? People probably, the naysayers, right? I mean, there’s always going to be naysayers.
Rob Hirschfeld: We definitely, I was calling some of the OpenStack … Some of the core blog posting I did was, sometimes you just dive straight in and you got to take it. I was saying the Core is that your-baby-is-ugly process. That’s part of what people feel like we’re doing, we’re back at the emotional piece. People feel like we’re saying, “Hey, your baby is ugly. We don’t want it.” Our perspective on that has been, “All babies are ugly.” People’s babies to themselves are beautiful and you look at somebody else’s baby…it’s not that exciting.
What we realize with Core is: That’s good. Right? We’re looking for the adolescents, we’re looking for not even adolescents, we’re looking for the grown-ups. Right? We’re looking for the mature, stable part of the community, those are the core ones. Those are the ones we depend on. Babies are things you take care of, right? You nurture and you help them grow because they have potential. That’s how I look at the project and it’s not something everybody sees when they look at this and I wish they did.
I wish we were more excited about the new projects but thought of them as babies. I will tell you that operators and customers do. They have that perspective and maybe I’ve never articulated it quite this way but one of the things, one of the harms that we do in the OpenStack community is when we talk about a baby as if it’s a grown-up. We undermine our own credibility in the community and we really hurt OpenStack from that perspective. There have been a lot of times where people have talked about projects that were still at even just being considered–they have no real code–as if they were OpenStack projects, and they were going to solve problems for people.
You need to talk about their potential but when I talk to operators and users and customers of OpenStack they want to know, “Can it run my workload? Is the API stable? Does it deliver this?” It doesn’t help when we spend a lot of time in the presentation talking about the ins and outs of, I’ll pick on Ceilometer today, the ins and outs of Ceilometer. Ceilometer performs an important function and conceptually it’s really important but it’s not a blocking factor for whether or not people can run a workload.
We need to respect the relative maturity points for that. And DefCore’s job has been to help define, and make sure that we have, a way to pick these things because for some backwards reason everybody wants to be a Core project because they think it’s a mark of importance, certainly a mark of use, but you need to figure that out. Right? Otherwise you end up looking like a child’s soccer team. Everybody’s getting a trophy and you wouldn’t want to field them against a pro soccer team.
Niki Acosta: Your analogies are priceless today, Rob.
Rob Hirschfeld: I need to open my drawer.
Niki Acosta: All of the babies, everyone gets a trophy. Look, obviously today we’re talking about a lot of the issues, problems. Maybe perhaps showing people a little bit of how to be … How the sausage is made, behind the scenes in OpenStack and the real challenges that the community, people like yourself and others, have to grapple with on a regular basis. For other people, especially end-users, what would you say to them?
Obviously they are hearing, “Gosh, there’s all these projects. They are probably not ready.” What do you say to a user who you want to instill confidence in the direction of OpenStack? Obviously you want them to believe, “Hey, look it has its problems but from where it’s at right now it’s pretty dang good.” What do you say?
Rob Hirschfeld: A couple of things. The first thing I tell people when they are looking at not just OpenStack but any cloud is focus on operations and automation. Right? Deal with automated deployments and get that working. Right? That’s your table stakes. It comes back to a lot of what we talked about with the puppies and cattle analogy and what I found is customers of OpenStack have been successful back in the Cactus days. They were successful when they automated their provisioning, when they brought their workloads in and they knew what they were doing.
If you look at the user studies there’s great case studies of people who are very effective with OpenStack deployments. By and large they had a specific use case that fit the capabilities of OpenStack at the time they were using it and they automated their provisioning so that if they had to upgrade or change or fix something they weren’t manually deploying things. They were using the APIs the way you would expect to do those deployments.
Niki Acosta: In other words, they were treating compute units and VMs as dumb compute, right?
Rob Hirschfeld: Exactly right. They were being cloud users and I consider the DevOps drivers to be fundamentally part of using cloud computing. If you’re setting up things by hand you’re not, in my opinion, doing cloud computing, you’re doing old school IT.
Niki Acosta: I’m feeling that. Look, it’s such a struggle too because I think especially when you look at enterprises you’re starting to see a big divide of your more traditional application authors, developers, infrastructure folks, and then you’ve got these new cloud groups that are just moving super fast and super agile and are delivering stuff at an alarming rate. Gartner calls it bimodal IT.
There’s your two groups that are grappling with this but it’s going to be interesting to see what that end state looks like because I think we’re still… none of that old stuff is going away and everybody wants to get on at least on a path where they are just trying to do that new stuff but I still see a lot of occasions where people are bringing old habits over to cloud. They are still trying to apply the principles in which they develop and scale applications over to cloud. I think a lot of that is just frankly it’s a lot of VMware users that expect OpenStack to have these substitutes for features that they’ve been depending on for a long time.
Rob Hirschfeld: I think VMware is a great crutch for OpenStack. Meaning that it keeps OpenStack focused on delivering cloud workloads and doesn’t force us to solve how do we create puppies, how do we support puppies in OpenStack. I’m not a believer that people should uninstall everything else in their data center, put in an OpenStack layer and then move forward. I don’t think that OpenStack is going to run on every server in the data center. Maybe that’s heresy from a community perspective, but I don’t.
I think OpenStack has its utility and it should focus on delivering that utility as infrastructure as a service and a great ecosystem of things on top of that. There are going to be applications that don’t make sense to run on OpenStack and there’s going to be alternatives to OpenStack that companies are going to explore. Just like we started this conversation saying that one size doesn’t fit all for hardware and people have reasons for adopting it.
I think it’s much better for us to be good at what we’re good at, than try to cover every use case and spread ourselves really thin. My preference is to see OpenStack focusing on doing the things that it needs to do really well. One part of that is servicing people who have automated, bring automated workloads into the cloud. That’s a really important specialty. As soon as we start to try to attack turning off VMware in data centers I think we’ve made our life a lot harder for ourselves. We’ve actually lost what people are really excited about in the beginning.
Niki Acosta: I wonder how many comments are going to come out of this from viewers. Look, you always have great opinions. I can see where a lot of people wouldn’t agree with you but I think there’s a lot of people that probably do. I don’t know. It’s good to hear from our viewers on that one.
Rob Hirschfeld: I’ll be interested to see, I think some of these opinions I’ve had and maybe this is a bigger megaphone. I think when you look at what has been important to me for Core, it was sending a signal as to what parts of OpenStack were usable and ready and people can rely on. We’ve been doing that, you can go back to a blog post I did in 2011 saying OpenStack is ready. And what I said was it was ready if you knew what you were doing and knew how to use it and wanted to make the operational investments for it. I think that’s been true in every single release that we’ve gone on.
Niki Acosta: Do you think that a lot of the extra projects are being driven by demands from end-users to their vendors like, “Hey, this is great but figure out a way to deploy an automatically scalable bad ass implementation of Hadoop.” Do you think it’s coming from the customers and that’s why people are trying to work that back into OpenStack is to make that easier for them?
Rob Hirschfeld: I think that there’s several motivations for that. I’ll throw Crowbar into the middle of this because we use Crowbar because it makes it easier for us to do automated deployment of things like Hadoop. Which in my opinion run really, really best, I won’t say best. Really, there’s a lot of use cases for Hadoop that are bare metal use cases just like I wouldn’t run Ceph inside of VMs.
That wouldn’t make any sense because I need access to the physical media. What I think is that and this is good. We want people to be building ecosystems on top of and around OpenStack. That makes a lot of sense, we should be doing that. Some of the Hadoop scalability stuff, half of the use cases I know for Hadoop are very strongly done in elastic compute where you have, you do nightly workloads or you spin up VMs to run a job. It’s a good use case for Hadoop. I don’t think it’s a right or wrong thing.
Niki Acosta: Should that be an OpenStack project? Or part of an OpenStack project?
Rob Hirschfeld: Here’s an existential question, I’m trying to help us use Core as a definition for this. I think that there are, OpenStack is deciding if it’s a big tent, we’ve been calling it big tent or a suite project that includes the ecosystem and Hadoop, Sahara might be an example of that. Then, it has a core that is sort of these fundamental building blocks that everybody could have. This isn’t just my opinion. I’ve talked to a lot of different people and there’s things that … There’s no clear consensus at this point. There’s people who believe very strongly that OpenStack should be a very simple small core of projects that deliver IaaS and then everything else that we’re talking about are ecosystem projects and should be in the ecosystem.
Niki Acosta: Let’s talk about requirements because we just got a comment in the chat box.
Rob Hirschfeld: Okay.
Niki Acosta: I think it’s a really good one. Feature comparison with Amazon is inevitable. Isn’t that a legitimate source of requirements?
Rob Hirschfeld: I think it’s a great source of requirements; I just am not sure if they are OpenStack’s requirements or OpenStack’s ecosystem requirements. One of the things that I’ve seen in OpenStack that I think is not a particularly healthy thing for us to do. I hear a lot of people express the same opinion that when we create a project in OpenStack that has, that does something one way. Heat is a good example, Heat is an orchestration system originally designed to compete with the Amazon capability. When we make, and I think it’s totally appropriate and I think we should have it and it’s great, I think that as soon as you make it the OpenStack orchestration with the word ‘the’ in it.
Niki Acosta: You’re killing the rest, right?
Rob Hirschfeld: You are sending a very clear signal that other ways to solve that problem aren’t welcome. I think that there are a lot of ways, I mean Scalr has an orchestration system that runs on top of OpenStack and Amazon and they’ve been part … I met them at the Austin conference or the Bexar conference. Great, it’s a good community. Really good product. I don’t know if they feel this way. I would say they would appear competitive with Heat to me. Why is that? Heat is the chosen one from a lot of people’s perspective and I think we’re trying to look at how this works and what we need to do.
I’ve talked to a lot of people who want OpenStack the project and so I’m trying to represent a diversity of views here and it’s part of what my job has been with Core is to be balanced with this. I meet a lot of people who believe that OpenStack should have all this capability…should take on Amazon. I think it’s dangerous to do that. I have a very pragmatic reason why I think that we end up… OpenStack as a community ends up in a trap if we pick one component in the ecosystem and make it the OpenStack thing, is that it limits our ability to innovate.
If you took a Core perspective on this, the most likely competitor to Nova or Swift or Heat or Glance or Keystone is those projects’ own next-generation project, where somebody says, “Hey, I have a new way to do Nova. I want to make some serious changes to it. I need to replace it.” Either it changes implementation, heaven forbid changes language in components or potentially changes the way it works to make it more container-friendly.
Right? It actually works with containers in a more seamless way. Those changes are coming, it’s not a matter of “we’ll adapt to them.” We have to have our project definitions resilient enough that we can within the project expand and change. As soon as we lock in to “Nova is the OpenStack compute engine,” we potentially find ourselves very vulnerable to our own community splitting out as people say, “You know, I don’t want to do it that way anymore.”
Niki Acosta: Jeff is writing again. He says, “Isn’t this the API versus code question again, Single API, multiple implementations?”
Rob Hirschfeld: It is and this has been part of the maturity we’ve seen as we go through this. It’s worth trying to respect the clocks because I need to put in the pitch about the bylaws change. I’ll tee it up. Today, there’s a lot of people in the community who believe OpenStack is code and the code must be required. That is a very dominant view. Not dominant, it’s a very prevalent view. There’s a lot of people who think OpenStack ultimately will be an API definition and spec.
In my opinion I think that ultimately we do need to be an API spec. I think maturity-wise that’s going to take a long time to go and right now the best thing for the project is to be code and API together and work that out. Get good at validating the API is consistent and repeatable and interoperable, using the same code base, before we go and try and say, all right, we’re going to do this with different code bases. I think that makes the problem exponentially harder.
DefCore has from the very beginning been a compromise approach. We’ve always tried to look at a lot of different viewpoints. I’ve been trying to reflect those in this podcast and I’ve been very open to how diverse our community is, and it’s our strength. A lot of what we have to do is solve the problems we can solve today and then move forward in time and give ourselves some flexibility. That’s actually one of the things from the bylaws perspective that we didn’t do in the bylaws.
The bylaws were very, “This project is core, this project is not core,” and at the time I think that was fine. Today we realized that even in the project there’s a more nuanced definition of what core is, what core isn’t. Especially when you talk to vendors, they know they need to know which features they have to use and which features they can optionally implement and ones they can extend.
That puts us in this interesting position, we’ve done a lot of work over the last two years to define DefCore and help give the community clear signals but the bylaws don’t reflect all of that work. They reflect some of that but for the most part we’ve never changed the bylaws, we’ve had them in place for several years. It’s now time for us to modify the bylaws to make this core definition and there’s a couple of other provisions in bylaws that we want to be more flexible. That allow us to change things on a more regular basis.
Niki Acosta: We’re about out of time. If people want to reach out or have this discussion or look for more information about this, where can they find that and how can they find information about your company?
Rob Hirschfeld: The best place for my company is to go to its website, RackN.com. OpenCrowbar.com will also take you through all the Crowbar stuff. For DefCore, I would send people to my blog robhirschfeld.com.
Niki Acosta: Tweet at you?
Rob Hirschfeld: Tweet at me, I’m zehicle on Twitter. I’m happy to talk about that. I do need to put in a final plug. Vote in the upcoming elections or in the bylaws process; it’s also board member elections. I’d love for people to vote for me and get back on the board and finish the DefCore work, but even if you don’t vote for me, even if you don’t vote for the bylaws changes, please vote. We could actually not have our bylaws changes even considered by the community if we don’t get enough people voting. We need a quorum of voters so if you are in OpenStack make a point to vote. I don’t care how you vote, just vote.
Niki Acosta: All right. Real quick, last question. Two people you want to see on the show?
Rob Hirschfeld: There are two, I’m going to give you some people who should get a little bit more spotlight. I’d love to see [inaudible 00:59:34] who works for [Yahweh 00:59:35], who has been really influential in test and in [RefStack 00:59:39] and DefCore work. I appreciate her time, she’d be a great guest. Shamail Tahir who works in EMC in the CTO’s office has been quietly on the product manager front and he and I talked in Paris. He has some incredibly strong insights.
Niki Acosta: Awesome. Thank you so much for joining us today, Rob. Really appreciate you taking the time. Hopefully this is a must-view video for anyone who wants to know about the nuances of what the Core definition is.
Rob Hirschfeld: Appreciate the time. Thank you.
Niki Acosta: Great. We’ll see you guys next week. Bye-bye.