Following part one of our Big Data in Security series on TRAC tools, I caught up with talented data scientist Mahdi Namazifar to discuss TRAC’s work with the Berkeley AMPLab Big Data stack.
Researchers at University of California, Berkeley AMPLab built this open source Berkeley Data Analytics Stack (BDAS), starting at the bottom what is Mesos?
AMPLab is looking at the big data problem from a slightly different perspective, a novel perspective that includes a number of different components. When you look at the stack at the lowest level, you see Mesos, which is a resource management tool for cluster computing. Suppose you have a cluster that you are using for running Hadoop Map Reduce jobs, MPI jobs, and multi-threaded jobs. Mesos manages the available computing resources and assigns them to different kinds of jobs running on the cluster in an efficient way. In a traditional Hadoop cluster, only one Map-Reduce job is running at any given time and that job blocks all the cluster resources. Mesos on the other hand, sits on top of a cluster and manages the resources for all the different types of computation that might be running on the cluster. Mesos is similar to Apache YARN, which is another cluster resource management tool. TRAC doesn’t currently use Mesos.