The OpenStack community gathered in Tokyo for the 12th-Liberty release of the OpenStack platform. The Foundation reported over 5,000 attended the conference–50% for the first time. Attendees were from across the globe with 46% from APAC and 38% from North America. Job roles varied and included developers (28%), user/operators (25%), manager/architects (19%), sales/marketing (11%), and CxOs (10%).
OpenStack has entered the post-excitement phase, which may appear slow-moving, but reflects deeper customer engagement and a focus on the operationalization of OpenStack. Hundreds of interesting sessions were presented by community members and recorded for those who could not be there. Check out the OpenStack Foundation Summit site for the full schedule. Common themes included overcoming the complexity of configuring, deploying and maintaining OpenStack; retaining workload flexibility; and various approaches to manageability, scalability and extensibility. Having the Summit in Japan was an opportunity to highlight Asia-based users of OpenStack, including Kirin Brewing, Yahoo Japan, NEC, NTT Resonant, GMO Internet, CyberAgent, and Rakuten.
Below are links to the strategic and technical sessions presented on Cisco solutions at the Summit.
Integrating Monasca and Ceilometer seemed like a very good idea from the start. It would integrate all the OpenStack resources notifications and metrics as well as provide a unified storage layer for Monitoring and Metering, simplifying deployment at scale as well as opening the door for new solutions that weren’t possible before.
So the team set out to make it real. The implementation was carried on a three months period and all the code, unit tests and load simulator are open-source and available in the official OpenStack repo at: https://github.com/openstack/monasca-ceilometer
There are at least two aspects that made this “marriage” titillating:
Ceilometer substantially collects metrics for all the OpenStack resources that Monasca does not currently collect today.
Ceilometer biggest issue is scalability and performance and that is where Monasca excels.
So, we embarked in this experiment and found a neat way to integrate the two services. The first part was the ingestion. Ceilometer has two main types of agents:
Notification, a.k.a. Push model agent
Central/Compute, a.k.a. Pull model agent.
Our integration with Monasca was trying to address all the cases and support both the Push and Pull model. The Compute Agent is probably the part where there is the most overlap with the Monasca Agent that is capable of polling from libvirt or other virtualization layers. We decided to extend the Publisher code in Ceilometer to integrate with Monasca client and send the “measurements” to the Monasca API.
Current Ceilometer architecture:
This brought two distinct advantages to the solution: the first is that we can integrate with any of the Ceilometer agents out of the box, so we can integrate data from all the data sources that Ceilometer supports now and in the future; the second is that we remove the RabbitMQ re-publishing of Samples.
This latter aspect is particularly problematic in large deployments. RabbitMQ clustering has some limitations around the 20M-mark load; this can slow down the queue performance and impact other services relying on the queue. In the Ceilosca case the samples are sent as “measurements” directly to the Monasca API, which stores them into Kafka. This allows for a different publishing rate from the storage rate increasing performance and optimization at the distinct layers.
The Monasca Publisher in the Ceilometer agent also leverages another important aspect in publishing to Monasca, batching. The Monasca Publisher for Ceilometer has three parameters that can be set to control the batching behavior and performance:
Batch count. This allows specifying how many messages are buffered and sent at in one http request to the Monasca. The Monasca API accepts “measurements” from different “metrics” without having to aggregate them and this is a huge performance boost.
Batch timeout. This specifies the max time to wait before committing to the batch. Usually this is helpful in the case your message bus is only handling events and not polling, which means it is rare to get a huge amount in a short window of time.
Batch checking interval. This dictates the frequency when the publisher is checking on the batch size and timeout to understand when is time to make the API POST request. Clearly this has to be carefully set to avoid repeated useless checks but cannot be too large to excessively delay the “measurements” publishing.
In several of our tests we found out the batch of 1000 messages and a timeout of 15s with a polling interval of 5s is a good compromise for a mix of Central Agent loads and OpenStack Notifications.
We all know that deploying and running OpenStack services is not the easiest thing on Earth. For this reason we wanted to move away from sophisticated deployments and make sure the deployment was well understood and one command deployment. We wanted that everybody was able to get Ceilosca to run either on a single VM (or box), so we thought to leverage DevStack…. We know, DevStack is for development and not for scale and performance testing, but guess what; if it scales in DevStack it will scale everywhere else. Hmm, not sure you should keep this statement.
What we needed next was a deployment script; a single unified script to install everything and have it running. Fortunately, both Monasca and DevStack had already deployment scripts that we could run and leverage, the only difference? Monasca uses Ansible and DevStack uses bash … so; we created a new bash script that installs devstack and then runs ansible to deploy Monasca on top of Devstack and that did the trick. Once you download the repo just go and execute:
and (depending on your env) after some time you will get a full DevStack with Ceilosca in it and you are: Ready to GO!
The Devstack+Ceilosca+Monasca is the environment where we run all the tests and we had it running both on virtual machines and baremetals.
Note, we now have a complete DevStack plugin for Monasca.
As we mentioned before the tests were running on DevStack. This is to make sure that the tests are repeatable from anyone that is interested in running them. Clearly DevStack brought some restrictions that we had to deal with it. Moreover, some of us decided to run these tests in OpenStack VM and that made it even more challenging … (hey, we may even try stuff on containers later on, maybe using Kolla…). I will post the results of the these tests in 2 separate blogs relating to Private and Public Cloud.
Ceilosca turned out to be a significant improvement over Ceilometer both during data ingestion as well as querying. The performance gain is quite staggering going from 2x to 4x in ingestion speed and throughput as well as 11x to 18x in querying. These are the main takeaways from the extensive testing we performed:
Ceilometer has an exponential performance degradation that is directly proportional to the number of tenants and resources.
Ceilometer has open-ended queries that do not force the requestor to have a query params like tenant_id and time interval. This has been mitigated with the introduction of limits at the Liberty release but still the API could be significantly improved for performance.
Ceilosca has very efficient batching capabilities across the entire workflow and it is configurable based on cloud deployment specific needs. Ceilosca also can select the metadata to be preserved and the one to be discarded. This is a high value feature.
Monasca API are nearly twice as fast than Ceilosca implementing Ceilometer V2 API. For users that do not need backward compatibility we recommend to consume the data directly from Monasca.
Cisco: Fabio Giannetti, Ken Owens, Srinivas Sakhamuri, Pauline Yeung, Steven Irvin
HP: Roland Hockmuth, Dan Dyer, Atul Aggarway, Jenny Wei, Putta Challa, Rohit Jaisway
Back in September 2014 Cisco acquired private OpenStack cloud service company Metacloud (http://newsroom.cisco.com/press-release-content?articleId=1489587). Initially known as Cisco OpenStack Private Cloud (COPC) and now known as Cisco Metapod®. Cisco Metapod represents one of most robust and scalable OpenStack-as-a-Service or On-Premise Public Cloud Experience offering in the market. With the agility and vision of a startup, the stability and expertise of Cisco, this is a solution and a service that helps businesses with the adoption of the agile/mode 2 or cloud native applications. Read More »
Because of the nature of SDN, and specifically the automation available with Cisco’s Application Centric Infrastructure, ACI works really well with cloud orchestration tools such as OpenStack. I was able to be at the OpenStack Summit in Tokyo last week and gave a vBrownBag TechTalk about why Cisco ACI makes OpenStack even better.
So, how does ACI work with OpenStack and perhaps even make it better? First, ACI offers distributed and scalable networking. It supports Floating IP addresses in OpenStack. If you’re not familiar with Floating IPs, they are essentially a pool of publicly routable IP addresses that you purchase from an ISP and assign to instances. This would be especially useful for instances, or VMs, like web servers. It also saves CPU cycles by putting the actual networking in the Nexus 9000 switches that make up the ACI fabric. Since these switches are built to forward packets, and that’s what they’re good at, why not save the CPU cycles for other things like running instances?
OpenStack doesn’t natively work with Layer 4-7 devices like firewalls and load balancers. With ACI we can stitch in these necessary network services in an automated and repeatable way. We do this in a way that doesn’t sacrifice visibility as well. While it’s important that we’re able to automate things, especially in a private or public cloud that is constantly changing and updating, if we lose visibility, we lose the ability to troubleshoot easily. In the demo, shown in the video above, you will see just how easy it is to troubleshoot problems in ACI. We also get the ability to preemptively strike before a problem causes issues on the network by offering easily interpreted health scores for the entire fabric, including hardware and end point groups.
ACI is also a very secure model. Not only does it use a white-list model where traffic is denied by default and only allowed when explicitly configured that way, it will also give more security when it comes to multi-tenancy. In a strict overlay solution, if a hypervisor is attacked or owned the multi-tenancy model could be deemed insecure. In the ACI fabric the security is done at the port level. So even if a hypervisor is attacked the tenants will be safe.
In recent versions of ACI we are able to use OpFlex as a southbound protocol to communicate between OpenStack and ACI. By using OpFlex we get a deeper integration and more visibility into the virtual environment of OpenStack. Instead of attaching hypervisor servers to a physical domain in ACI we can attach them into a VMM (Virtual Machine Manager) domain. This allows us to learn which instances or VMs are on which physical server. It will also automatically learn IP addresses, MAC addresses, states and other information. We can also see which networks or portgroups contain which hypervisors and instances within our OpenStack environments.
For more information on how Cisco ACI works with OpenStack you can go to http://cisco.com/go/aci
The OpenStack Summit is a four-day conference for developers, users, and administrators of OpenStack Cloud Software. It’s a great place to learn about how Application Centric Infrastructure (ACI) makes it better to Build – Deploy – Scale – Connect your OpenStack based applications.