
A recent study shows that up to 85% of trained machine learning models are never deployed. Clearly, data scientists, lines of business, IT, and others need to collaborate closely to ensure that data science projects succeed (instead of remaining science fair projects). Yet we continue to see that the biggest gap is between data scientists and IT.

How Often Are Trained Models Deployed? Up to 85% of trained models may never be used

Here at Cisco, we are working on a detailed case study describing how we handle sales orders that arrive as faxes (yes, it’s 2019, but we still receive fax orders) or as Microsoft Word, Excel, and PDF files, none of which can be processed automatically by the Cisco ordering tool. Typically, these orders need to be entered manually, and our customers and partners may even change their order forms from time to time. To improve this system through robotic process automation, several groups had to work together, namely:

  • Cisco IT: Provisioning a cluster of C480 ML servers enabling multiple teams to share it
  • Cisco Data Science team: Developing the deep learning data pipeline
  • Cisco Commerce team: Incorporating the deep learning models into the business process

Only with an integrated team can data science make a business impact. Because of use cases like this one and many others, Cisco believes that it is critical for IT to be part of the data science team. For advanced customers with large volumes of data and sophisticated data pipelines, IT is already involved in selecting the infrastructure that supports the data pipeline. However, even for data science teams that are just starting to experiment in the cloud, Cisco believes that IT should be part of the team so that best practices for security and compliance are followed. With roughly 60% of all machine learning taking place on-premise, a hybrid cloud architecture in support of machine learning is very beneficial to any data science team. Hence, tight collaboration between the data science team and IT is critical for accelerating AI/ML deployment.

Cisco Live 2019 in San Diego

Are you coming to Cisco Live in San Diego? We will be showing multiple demos in the Data Center section of the World of Solutions, booth DC03, that will highlight operationalizing AI/ML. I would like to highlight two specific demos here.

Kubeflow Data Pipeline on Hybrid Cloud

In this demo, we will show a single integrated data pipeline in which training takes place on-premise and inferencing takes place in the cloud. In this type of deployment, data scientists are able to focus on the data pipeline, while IT teams are able to focus on providing consistent tools for both on-premise and cloud deployments with best practices for security and compliance. The locations of training and inferencing can of course be reversed, depending on the specifics of the data pipeline. Either way, we are very excited to show the hybrid cloud infrastructure with consistent machine learning tools.
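As a rough illustration of the placement idea described above, the sketch below models where each stage of a pipeline runs and how the two locations can be swapped. It is not taken from the demo: the names `Site`, `Placement`, and `swapped` are hypothetical, and a real Kubeflow pipeline would express this through its own deployment configuration.

```python
from dataclasses import dataclass
from enum import Enum


class Site(Enum):
    """The two locations mentioned in the text (hypothetical names)."""
    ON_PREM = "on-premise"
    CLOUD = "cloud"


@dataclass(frozen=True)
class Placement:
    """Where each pipeline stage runs. Illustrative only, not a Kubeflow API."""
    training: Site
    inferencing: Site

    def swapped(self) -> "Placement":
        # Exchange the two sites: training moves to where inferencing ran,
        # and inferencing moves to where training ran.
        return Placement(training=self.inferencing, inferencing=self.training)

    def is_hybrid(self) -> bool:
        # A placement is hybrid when the two stages run at different sites.
        return self.training != self.inferencing


# The arrangement described for the demo: train on-premise, infer in the cloud.
demo = Placement(training=Site.ON_PREM, inferencing=Site.CLOUD)

# The reversed arrangement: train in the cloud, infer on-premise.
reverse = demo.swapped()
```

Keeping the placement separate from the pipeline logic is one way to reflect the split of responsibilities in the text: data scientists work on the stages themselves, while IT decides where those stages run.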

OpenShift Ansible Script

With the IT department so busy just keeping mission-critical applications and infrastructure running, IT managers are demanding both simplicity and flexibility when deploying AI/ML workloads. Simplicity enables IT to deploy AI/ML workloads quickly. Flexibility enables IT to customize specific VLANs, hardware configurations, and other parameters. At Cisco Live in San Diego, we will be demonstrating an Ansible script that automates the deployment of an OpenShift cluster on UCS with master, infrastructure, and storage nodes capable of running NVIDIA NGC containers such as TensorFlow. With this type of script, IT can quickly bring up a flexible cluster of servers customized to fit the needs of the data scientists.

Partners

Please also stop by our partners' booths.

  • NetApp has a Think Tank session in booth 1905 on Tuesday, June 11, 2019, discussing "Is your IT Infrastructure ready for Artificial Intelligence and Machine Learning?"
  • Google booth, number 2919, will be demonstrating Anthos. In addition, Google has a Think Tank session on Monday, June 10 at noon in Think Tank 2 at the World of Solutions discussing HyperFlex running Google ML Pipeline. For more information about the Google sessions at Cisco Live, take a look here.
  • SwiftStack, in booth 2334, will also show how Cisco UCS C480 ML is integrated with object storage to deliver a holistic solution for machine learning.

AI/ML Sessions at Cisco Live San Diego

Here are a few sessions that you can check out focusing on Cisco AI/ML solutions on UCS and HyperFlex.

There are also additional training sessions on Cisco AI/ML solutions located just outside the Cisco Live convention center. Please register here.

See you in San Diego.

@hanyang1234

 



Authors

Han Yang

PhD, Senior Product Manager

Data Center Solutions Engineering