Things that seem obvious today were not always that way. At some point, someone with a bit of courage and a flash of insight makes a bold move—like sticking a digital camera on the back of a phone. The rest of the world responds with a collective “of course!” and the world is changed, never to look back.

We had one of those moments a couple of weeks ago at Mobile World Congress in Barcelona, when Rakuten announced the Rakuten Cloud Platform, or RCP. Mickey Mikitani, Chairman, President and CEO of Rakuten, introduced RCP the following way:

Rakuten has a founding vision of empowering people to realize their dreams and a history of disrupting the status quo to take the lead, in industries from e-commerce to fintech and digital content. We are very excited to launch a mobile network in Japan that is set to become the first choice of consumers and change global standards in telecommunications.

If you want to better understand how Rakuten is building RCP, I have some deep-dive technical links at the end of this blog. For now, I want to explore why Rakuten decided to invest the time, effort and resources in building RCP.

For a while now, there has been growing tension between apps and services and the infrastructure they depend upon. This tension has increased as the center of gravity for app and service deployment has moved into the cloud. This, in turn, has given rise to cloud native architectures, which further stress infrastructure that was not originally designed for this brave new world.

At the customer end of things, we are now engaging with customers in more ways and in more places. Not only do we have an explosion of phones and tablets, we are about to see an even larger explosion of connected cars, drones, cameras, refrigerators, and (my favorite) cows. Customers expect consistent and predictable services whether they are at home, at play or on the move.

Almost every network operator is making the investments to keep up with this sea change. Interestingly enough, app and service owners are also looking to take greater control of their own destiny. We saw the first movement in this direction when the large web players got involved with projects like the Telecom Infra Project (TIP) and CORD. Their objective was to help service providers upgrade the infrastructure on which those web players depended to meet their growth goals. Netflix has, for years, worked with ISPs to help improve the streaming experience of their subscribers. Rakuten has simply taken the logical next step. They are a cloud-first, mobile-first business, and now they are building out bespoke infrastructure that is precisely calibrated to their needs. Moreover, as their business grows and evolves, Rakuten can be assured that their infrastructure will keep up with minimal lag.

While not everyone wants to be or can be Rakuten, it is worthwhile understanding what they did and why they did it, as that insight will be valuable to anyone contemplating an architecture refresh. Tareq Amin, CTO of Rakuten, built RCP around three guiding principles:

  • Zero Touch, End-to-End Automation and Assurance
  • Software Defined Programmable Infrastructure
  • Distributed and Common Carrier Grade Telco Cloud

Looking at the first two principles, we can tell this is an architecture meant to be run by machines (hello SkyNet!). When we look at the scale of Rakuten’s vision and their goals for service agility and customer experience, it’s really the only feasible approach. For velocity, agility and cost reasons, humans simply cannot be in the loop for the day-to-day operations of RCP. To make this a reality, two things need to happen. First, every element of RCP needs to be programmable. For most of you reading this, deployment of programmable infrastructure (and the ability to take advantage of it) is opportunistic and incremental. Any progress is good news. However, there is a significant difference between 99% programmable and 100% programmable. Anything less than 100% means that, at some point, someone is still sitting at a keyboard, introducing friction into your workflows and acting as a constraint on your business. Cisco’s contribution to Rakuten’s programmable infrastructure goal was our NFVI solution and our IOS-XE, IOS-XR and ACI-based transport platforms. They all provide rich, capable, programmatic interfaces that met all of Rakuten’s design requirements, no keyboards required.

In concert with programmability is automation. Much like programmability, partially automating a service chain is helpful, but having 100% coverage of your end-to-end service chain really unlocks new possibilities around how you build and deliver services. For example, operationally, you lower operating costs and reduce the time to stand up and tear down service chains. That opens the door to more dynamic capacity management, auto-scaling and assurance management. That, in turn, increases your efficiency and utilization, which further lowers opex and frees budget dollars for further investment, and a virtuous cycle is spawned. From a customer experience perspective, the real benefit comes from minimizing the lag between the creation of services and the ability of the infrastructure to support them. This frees service owners to iterate on offers more quickly and experiment more easily, and it makes customization and personalization more feasible.

Rakuten’s RCP automation framework is two-tiered to provide flexibility and horizontal scalability. The bottom tier comprises four domains: central data center, WAN, edge data center and far edge data center. The domain-level automation is built from a combination of Cisco Network Services Orchestrator (NSO), the NFVO Function Pack for NSO, Cisco Elastic Services Controller (ESC) as a virtual network function (VNF) manager, and, on an interim basis, other partner VNF managers. Rakuten’s mid-term goal is to consolidate on ESC. In the top tier, NSO uses a feature called Layered Services Architecture (LSA) to tie those four domains together with a cross-domain instance of NSO. Together, this framework provides RCP with fast, dependable, scalable, sophisticated end-to-end service orchestration. Rakuten then takes advantage of the rich northbound software interfaces NSO offers to tie the automation framework to their OSS and BSS systems.
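To make the two-tier idea a bit more concrete, here is a minimal Python sketch of a cross-domain orchestrator fanning a service request out to four domain orchestrators. This is purely illustrative and is not NSO or LSA code: the class names, the `apply` and `deploy` methods, the domain labels and the shape of the request are all assumptions made up for this example.

```python
from dataclasses import dataclass, field

# The four bottom-tier domains named in the post (labels are hypothetical).
DOMAINS = ("central-dc", "wan", "edge-dc", "far-edge-dc")


@dataclass
class DomainOrchestrator:
    """Bottom tier: owns the automation for exactly one domain."""
    name: str
    deployed: list = field(default_factory=list)

    def apply(self, service_id: str, config: dict) -> str:
        # A real domain orchestrator would program devices and VNF managers;
        # this stand-in only records the request.
        self.deployed.append((service_id, config))
        return f"{self.name}:{service_id}:ok"


@dataclass
class CrossDomainOrchestrator:
    """Top tier: splits one end-to-end request into per-domain pieces."""
    domains: dict

    def deploy(self, service_id: str, intent: dict) -> list:
        unknown = set(intent) - set(self.domains)
        if unknown:
            raise ValueError(f"unknown domains: {sorted(unknown)}")
        # Only the domains that the service touches receive work.
        return [self.domains[name].apply(service_id, cfg)
                for name, cfg in intent.items()]


def build_framework() -> CrossDomainOrchestrator:
    return CrossDomainOrchestrator({n: DomainOrchestrator(n) for n in DOMAINS})


if __name__ == "__main__":
    top = build_framework()
    # Hypothetical service spanning the WAN and one edge data center.
    print(top.deploy("svc-1", {"wan": {"vpn": "l3"}, "edge-dc": {"vnf": "example"}}))
```

The point of the layering is that the top tier never needs to know how a domain does its work, so new domain instances can be added without changing the cross-domain logic.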

The final principle, distributed and common carrier grade telco cloud, is a reflection of the changing nature of traffic. It no longer makes sense to try to serve subscribers from some far-away central data center. Providers can also no longer make assumptions about where their customers are located. Instead, RCP needs to be able to serve customers wherever they are, whichever device they are on, whatever service they are consuming. For both customers and service owners, Rakuten needs to be able to pervasively deliver consistent capabilities and a predictable customer experience. Let’s take a closer look at how they do that and where we contribute to the effort.

A “telco cloud” is essentially a private cloud optimized for hosting virtualized network functions (VNFs). It is built from NFV Infrastructure (NFVI) that hosts the VNFs and a management and orchestration (MANO) layer, which is the automation framework discussed earlier. Cisco Virtualized Infrastructure Manager (CVIM) is an open, modular, containerized NFVI software solution that provides the building blocks of RCP. The RCP deployment embeds Red Hat Enterprise Linux and Red Hat OpenStack Platform. Beyond support for Cisco and third-party VNFs, CVIM provides key features like security hardening, automated zero-touch provisioning and full lifecycle management of VNFs. Underpinning it all, Cisco ACI and Cisco Nexus 9000 series switches link network, compute and storage resources.

RCP’s CVIM building blocks are flexible and fungible, so a collection of CVIMs can be adapted to support any service or application today or in the future. This gives Rakuten great cost efficiencies with RCP, but it also gives service owners great freedom to build new services and get them deployed quickly without worrying about what the infrastructure can or cannot do. At the same time, these basic NFVI building blocks can be deployed anywhere along the service chain that makes sense, since managing a CVIM instance in the central data center is no different from managing one in a far edge data center. Along those same lines, VNFs, content and resources can be placed, and even moved around on the fly, wherever makes the most sense for operations and customer experience.
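As a rough illustration of what placing a VNF “wherever makes the most sense” could look like, here is a small Python sketch. It is a toy model, not how RCP or CVIM actually decides placement: the `Site` fields, the `place_vnf` function and the policy itself (pick the most central site that still meets the latency budget, so scarce edge capacity is saved for workloads that need it) are assumptions chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    tier: str          # e.g. "central-dc", "edge-dc", "far-edge-dc"
    latency_ms: float  # assumed latency from this site to the subscriber
    free_vcpus: int    # assumed spare compute capacity at this site


def place_vnf(sites, vcpus_needed, max_latency_ms):
    """Return the chosen Site and reserve capacity, or None if nothing fits."""
    candidates = [
        s for s in sites
        if s.free_vcpus >= vcpus_needed and s.latency_ms <= max_latency_ms
    ]
    if not candidates:
        return None
    # Assumed policy: prefer the highest-latency (most central) site that
    # still satisfies the latency budget.
    chosen = max(candidates, key=lambda s: s.latency_ms)
    chosen.free_vcpus -= vcpus_needed
    return chosen


if __name__ == "__main__":
    sites = [
        Site("central", "central-dc", 40.0, 1000),
        Site("edge-1", "edge-dc", 10.0, 64),
        Site("far-1", "far-edge-dc", 2.0, 8),
    ]
    print(place_vnf(sites, 4, 50.0).name)  # a latency-tolerant workload
    print(place_vnf(sites, 4, 5.0).name)   # a latency-sensitive workload
```

Because every site is described by the same few attributes, the same function works regardless of tier, which mirrors the idea that the building blocks are managed the same way everywhere.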

Mickey Mikitani stated, “[w]ith automation and virtualization, Rakuten is redefining how mobile networks are designed and how services can be consumed.” RCP seems ready to do exactly that. Not only will their investment in RCP help Rakuten and its customers, it will also serve as a lab from which their peers can learn and the industry can evolve.

As Rakuten progresses from design to field trials to production, I’ll be posting follow-on blog posts documenting their progress and what they have learned. If you want to get under the hood of RCP, I suggest you check out this very thorough blog post by my friend, Santanu Dasgupta. There is also an in-depth white paper authored by Cisco, Rakuten and Altiostar that is worth checking out.



Authors

Omar Sultan

Director, PLM, Automation + AI

Cisco Networking / Provider Connectivity