They say that data about your data is more important than the data itself. Having the right data in the data warehouse at the right time, or loaded up for Hadoop analysis, is critical. I have heard stories where the wrong product was sent to the wrong store because of incorrect conclusions about what was selling best: the reports and decisions were based on the wrong data. In this modern world of data-driven product placement around the globe, that can be a resume-impacting mistake. In a previous blog post about Enterprise Job Scheduling (aka Workload Automation), http://blogs.cisco.com/datacenter/workload-automation-job-scheduling-applications-and-the-move-to-cloud/ , I discussed the basic uses of automating and scheduling batch workloads. Business intelligence, data warehousing and Big Data initiatives need to aggregate data from different sources and load it into very large data warehouses.
Let’s look into the life of the administrators and operators of a workload automation tool. The typical enterprise may have thousands, if not tens of thousands, of job definitions. Those are the individual jobs that get run: look for this file in a drop box, FTP data from that location, extract this specific set of data from an Oracle database, connect to that Windows server and launch this process, load this data into a data warehouse using Informatica PowerCenter, run this process chain in SAP BW and take that information to this location. All this occurs to get the right data in the right place at the right time. These jobs are then strung together in sequences that we in the Intelligent Automation Solutions Business Unit at Cisco call Job Groups. These groups can represent business processes that are automated. They may have tens to hundreds of steps. Each job may depend on other jobs completing first, and jobs may be waiting for resources to become available. This all leads to a very complex execution sequence. These job groups run every day; some run multiple times a day, and some run only at the end of the quarter.
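To make the dependency idea concrete, here is a minimal sketch in Python of how a small job group could be resolved into an execution order. It is only an illustration, not how Cisco Tidal Enterprise Scheduler is implemented: the job names and the `execution_waves` helper are hypothetical, and the sketch ignores resource waits and calendars entirely.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job group: each job maps to the set of jobs it depends on.
job_group = {
    "watch_dropbox_file": set(),
    "ftp_fetch": {"watch_dropbox_file"},
    "oracle_extract": set(),
    "load_warehouse": {"ftp_fetch", "oracle_extract"},
    "sap_bw_chain": {"load_warehouse"},
}


def execution_waves(group):
    """Return successive lists of jobs whose dependencies are all complete.

    Jobs in the same list could run in parallel; each list must finish
    before the next one starts.
    """
    sorter = TopologicalSorter(group)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # mark the wave complete to unlock dependents
    return waves


waves = execution_waves(job_group)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Even with five jobs the order is not obvious at a glance; with hundreds of steps per group and thousands of definitions, it is easy to see why operators need tooling to understand what runs when.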
The typical IT operations team has a group of people who design, test and implement these job groups, working with the people in business IT who design and implement business processes. Often these job groups need to finish by a certain time to meet the needs of the business. If you are a stock exchange, some job groups have to finish within a set number of hours after the market closes. If you have to get your data to a downstream business partner (or customer) by a certain time, you become very attached to watching those jobs execute. No pun intended, but your job may be on the line.
A new technology has hit the scene for our customers of the Cisco Tidal Enterprise Scheduler. It is called JAWS Historical and Predictive Analytics. http://www.termalabs.com/products/cisco-tidal-enterprise-scheduler.html . These modules take the historical and real-time performance data from the Scheduler and, through a set of algorithms, produce historical, real-time, predictive, and business analytics. This is the data about the data I mentioned previously. Our customers can run what-if analyses as well as get an early indication that a particular job group will not be able to finish in time. The administrators can then take action before it is too late. This is critical to getting the data in the right place so that analytics can be performed correctly, and so that 1,000 units of the wrong product are not sent to the wrong store location. Thanks to our partners from Terma Software Labs http://info.termalabs.com/cisco-systems-and-terma-software-labs-to-join-forces-for-more-sla-aware-workload-processing/ .
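For readers who want a feel for what an early warning could look like, here is a deliberately simple sketch in Python. It is not the JAWS algorithm, which is not described in this post; it just assumes each job will take its historical average run time and walks the dependency graph to estimate when the whole group finishes. The job names, run times and the `predicted_finish` and `at_risk` helpers are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from statistics import mean

# Hypothetical history: past run times in minutes for each job.
history = {
    "extract": [30, 34, 32],
    "transform": [20, 22, 24],
    "load": [40, 44, 48],
    "report": [10, 10, 10],
}

# Hypothetical dependencies: each job maps to the jobs it must wait for.
depends_on = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"extract"},
}


def predicted_finish(depends_on, history, start_minute=0.0):
    """Estimate the group's finish time, in minutes after the start.

    Assumes every job takes its historical mean and starts as soon as
    all of its dependencies have finished (no resource contention).
    """
    expected = {job: mean(runs) for job, runs in history.items()}
    finish = {}
    for job in TopologicalSorter(depends_on).static_order():
        earliest_start = max(
            (finish[dep] for dep in depends_on[job]), default=start_minute
        )
        finish[job] = earliest_start + expected[job]
    return max(finish.values())


def at_risk(depends_on, history, deadline_minute, start_minute=0.0):
    """True if the estimated finish falls after the deadline."""
    return predicted_finish(depends_on, history, start_minute) > deadline_minute


estimate = predicted_finish(depends_on, history)
print(f"estimated finish: {estimate:.0f} minutes after start")
print("deadline of 90 minutes at risk:", at_risk(depends_on, history, 90))
```

A what-if analysis in this toy model is just a matter of editing `history` or `depends_on` and re-running the estimate. A real product would also need to account for variance in run times, jobs already in flight, and resource limits.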