Ready to learn about Trove? Oh, sure–you know it’s OpenStack’s database project. But do you really know what it does?
Amrith Kumar is the founder and CEO of Tesora, and on OpenStack Podcast #26, he sat down with us to talk about Trove, Tesora, and the database applications they work with. Specifically, he covered:
- What Trove does well
- Who is using it
- How it interacts with other OpenStack projects
- Why the hardware matters when it comes to databases
- What Tesora does
- How Trove is changing the way data analysts make decisions
- Why OpenStack is a wonderful teaching tool
To see who we’re interviewing next, or to sign up for the OpenStack Podcast, check out the show schedule! Interested in participating? Tweet us at @nextcast and @nikiacosta.
For a full transcript of the interview, click read more below.
Read More »
Tags: Amrith Kumar, Cassandra, database, Mongo, MySQL, Niki Acosta, OpenStack, Oracle, Tesora, Trove
A Guest Blog by Cisco’s Frank Cicalese: Frank is a Technical Solutions Architect with Cisco, assisting customers with their design of SQL Server solutions on Cisco Unified Computing System. Before joining Cisco, Frank worked at Microsoft Corporation for 10 years, excelling in several positions, including as Database TSP. Frank has in-depth technical knowledge of and proficiency with database design, optimization, replication, and clustering, and has extensive virtualization, identity and access management, and application development skills. He has established himself as an architect who can tie core infrastructure, collaboration, and application development platform solutions together in a way that drives understanding and business value for the companies he serves.
Ah yes, it’s that time of year again. It’s time for PASS Summit! I hope all of you are having a great event thus far. During my conversations with customers and peers, I am inevitably asked “Why should we implement SQL on UCS?” In this blog I cover this very common question. First off, for those of you not familiar with Cisco UCS, please visit here when you have a moment to learn more about this great server architecture. So, why would anyone want to consider running their SQL workloads on Cisco UCS? Read on to learn about what I consider to be the top reasons to do so…
High availability is one of the most important factors for companies when it comes to considering an architecture for their database implementations. UCS gives companies the confidence that their database implementations can recover from a catastrophic datacenter event in minutes, as opposed to the hours, if not days, that it would take to recover on a competing architecture. UCS Manager achieves this through its implementation of Service Profiles. Service Profiles contain the identity of a server. The UCS servers themselves are stateless and do not acquire their personality (state) until they are associated with a Service Profile. This stateless architecture allows server hardware to be re-purposed dynamically and can be used to bring a failed server’s workload back into production within five to seven minutes.
Service Profiles can provide considerable relief for SQL Server administrators when bringing failed servers back into production. Service Profiles make this a snap! Just un-associate the Service Profile from the downed server, associate it with a spare server, and the workload will be back up and running in five to seven minutes. This is true for both virtualized and bare-metal workloads! Yes! You read that last statement correctly! Regardless of whether the workload is virtual or bare-metal, Cisco UCS can move the workload from one server to another in five to seven minutes (provided the servers are truly stateless, i.e., booting from SAN).
Since every server in UCS that is serving a workload requires that a Service Profile be associated with it, Cisco UCS Manager provides the ability to create Service Profile Templates, which ease the administrative effort involved in creating Service Profiles. Server administrators can configure Service Profile Templates specifically for their SQL Servers and use them to standardize their SQL Server implementations consistently throughout the enterprise. Once the templates are created, Service Profiles can be created from them and associated with a server in seconds. Furthermore, these operations can be scripted via Cisco’s Open XML API and/or PowerShell integration (discussed next), simplifying the deployment process even more.
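To make the idea of stateless servers concrete, here is a minimal Python sketch of the pattern described above. It is a toy model only: the class names, the template fields, and the `fail_over` helper are hypothetical and are not the UCS Manager object model, the XML API, or PowerTool cmdlets. It only illustrates how an identity stamped out from a template can be detached from one server and attached to a spare.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceProfile:
    """Hypothetical stand-in for a server identity (not the UCS object model)."""
    name: str
    mac_address: str
    boot_target: str  # a SAN location, so no state lives on the server itself


@dataclass
class Blade:
    """A stateless server: it has no identity until a profile is associated."""
    slot: str
    healthy: bool = True
    profile: Optional[ServiceProfile] = None


def make_profile_from_template(template: dict, index: int) -> ServiceProfile:
    """Create a profile from a template so every SQL Server host is configured alike."""
    return ServiceProfile(
        name=f"{template['prefix']}-{index}",
        mac_address=f"{template['mac_prefix']}:{index:02x}",
        boot_target=f"{template['san_pool']}/lun{index}",
    )


def fail_over(failed: Blade, spare: Blade) -> None:
    """Un-associate the profile from the failed blade and associate it with a spare."""
    if failed.profile is None:
        raise ValueError("failed blade has no profile to move")
    if spare.profile is not None or not spare.healthy:
        raise ValueError("spare blade is not available")
    spare.profile, failed.profile = failed.profile, None


# Illustrative values only; the template fields are assumptions for this sketch.
sql_template = {"prefix": "sql", "mac_prefix": "00:25:b5:00:00", "san_pool": "san-a"}
blade1 = Blade("chassis-1/blade-1", profile=make_profile_from_template(sql_template, 1))
blade2 = Blade("chassis-1/blade-2")

blade1.healthy = False      # the original server goes down
fail_over(blade1, blade2)   # the identity, and with it the workload, moves to the spare
```

Because the boot target and addresses travel with the profile rather than with the hardware, the spare blade comes up as the same server, which is why booting from SAN matters in this model.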
To learn more about Service Profile Templates and Service Profiles, please visit here.
Manage Workloads Efficiently:
Cisco UCS has very tight integration with Microsoft System Center. Via Cisco’s Operations Manager Pack, Orchestrator Integration Pack, PowerShell PowerTool and Cisco’s extensions to Microsoft’s Hyper-V switch, administrators are able to monitor, manage and maintain their SQL Server implementations proactively and efficiently on Cisco UCS. Additionally, Cisco’s PowerTool for PowerShell, with its many cmdlets, can help to automate any phase of management with System Center, further streamlining the overall management and administration of Cisco UCS. All of this integration comes as a value-add from Cisco at no extra cost!
Please visit http://communities.cisco.com/ucs to learn more about, download and evaluate Cisco’s Operations Manager Pack, Orchestrator Integration Pack and PowerShell PowerTool.
Tags: business intelligence, Cisco, database, Microsoft, Microsoft SQL Server, Microsoft SQL Server2014, Microsoft Windows Server 2012, Nexus 1000v, OLTP, UCS
In the last chapter of our five part Big Data in Security series, expert Data Scientists Brennan Evans and Mahdi Namazifar join me to discuss their work on a cloud anti-phishing solution.
Phishing is a well-known historical threat. Essentially, it’s social engineering via email and it continues to be effective and potent. What is TRAC currently doing in this space to protect Cisco customers?
Brennan: One of the ways that we have traditionally confronted this threat is through third-party intelligence in the form of data feeds. The problem is that these social engineering attacks are highly time-dependent. If we rely solely on feeds, we risk delivering stale data to our customers, so that solution isn’t terribly attractive. This compounds another issue common to a lot of the data sources out there: many attempt to enumerate the solution by listing compromised hosts, yet in practice each vendor seems to see just a small slice of the problem space, and, as I just said, oftentimes it’s too late.
We have invested a lot of time in looking at how to avoid the problem of essentially being an intelligence redistributor and instead look at the problem firsthand using our own rich data sources – both external and internal – and really develop a system that is more flexible, timely, and robust in the types of attacks it can address.
Mahdi: In principle, we have designed and built prototypes around Cisco’s next generation phishing detection solution. To address the requirements for both an effective and efficient phishing detection solution, our design is based on Big Data and machine learning. The Big Data technology allows us to dig into a tremendous amount of data that we have for this problem and extract predictive signals for the phishing problem. Machine learning algorithms, on the other hand, provide the means for using the predictive signals, captured from historical data, to build mathematical models for predicting the probability of a URL or other content being phishing.
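As a rough illustration of what Mahdi describes, the sketch below trains a tiny logistic regression model on historical, labeled URLs and then outputs the probability that a new URL is phishing. Everything here is an assumption made for illustration: the hand-picked features, the handful of made-up sample URLs (using reserved documentation addresses and example domains), and the choice of logistic regression itself. The interview does not say which signals or algorithms the TRAC prototypes use.

```python
import math
import re


def url_features(url: str) -> list:
    """Turn a URL into a few hand-picked numeric signals (illustrative only)."""
    lowered = url.lower()
    host = re.sub(r"^[a-z]+://", "", lowered).split("/")[0]
    return [
        1.0,                                             # bias term
        1.0 if re.fullmatch(r"[\d.]+", host) else 0.0,   # host is a raw IP address
        float(host.count(".")),                          # number of dots in the host
        1.0 if "@" in lowered else 0.0,                  # '@' appears in the URL
        1.0 if re.search(r"login|verify|account", lowered) else 0.0,  # lure keywords
    ]


def phishing_probability(weights: list, url: str) -> float:
    """Logistic model: squash the weighted sum of signals into a probability."""
    z = sum(w * x for w, x in zip(weights, url_features(url)))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: list, epochs: int = 500, learning_rate: float = 0.1) -> list:
    """Fit the weights with stochastic gradient descent on the log loss."""
    weights = [0.0] * len(url_features(""))
    for _ in range(epochs):
        for url, label in samples:
            error = phishing_probability(weights, url) - label
            features = url_features(url)
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    return weights


# Made-up "historical" data: 1 = phishing, 0 = benign.
history = [
    ("http://192.0.2.7/login/verify", 1),
    ("http://secure.account.update.example.net/verify", 1),
    ("http://user@198.51.100.9/account", 1),
    ("https://example.com/docs", 0),
    ("https://www.example.org/blog/post", 0),
    ("https://example.com/pricing", 0),
]
model = train(history)

suspicious = phishing_probability(model, "http://203.0.113.5/account/login")
ordinary = phishing_probability(model, "https://example.org/about")
print(f"suspicious URL: {suspicious:.2f}, ordinary URL: {ordinary:.2f}")
```

In a real system the signals would be extracted from far larger data sets, which is where the Big Data side of the design comes in. The model stage stays conceptually the same: predictive signals in, a probability out.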
Tags: analytics, Big Data, Cisco, cloud, database, email, innovation, Intelligence, operations, phishing, security, TRAC, TRAC Big Data Analysis
Following part three of our Big Data in Security series on graph analytics, I’m joined by expert data scientists Dazhuo Li and Jisheng Wang to talk about their work in developing an intelligent anti-spam solution using modern machine learning approaches on Hadoop.
What is ARS and what problem is it trying to solve?
Dazhuo: From a high-level view, Auto Rule Scoring (ARS) is the machine learning system for our anti-spam system. The system receives a lot of email and classifies each message as spam or not spam. From a more detailed view, the system has hundreds of millions of sample email messages and each one is tagged with a label. ARS extracts features or rules from these messages, builds a classification model, and predicts whether new messages are spam or not spam. The greater the variety of spam and ham (non-spam) that we receive, the better our system works.
Jisheng: ARS is also a more general large-scale supervised learning use case. Assume you have tens (or hundreds) of thousands of features and hundreds of millions (or even billions) of labeled samples, and you need them to train a classification model which can be used to classify new data in real time.
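The supervised learning workflow described here (labeled samples in, a classification model out, new messages classified as they arrive) can be sketched in a few lines of Python. This is not ARS: it swaps in a plain multinomial naive Bayes classifier with add-one smoothing, uses individual words as the "features," and trains on four invented messages rather than hundreds of millions. The class and method names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text: str) -> list:
    """Use lowercase words as features; a real system would use far richer rules."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Toy multinomial naive Bayes with add-one (Laplace) smoothing."""

    LABELS = ("spam", "ham")

    def __init__(self) -> None:
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = Counter()
        self.vocabulary = set()

    def train(self, text: str, label: str) -> None:
        """Update the counts with one labeled message."""
        self.message_counts[label] += 1
        for token in tokenize(text):
            self.token_counts[label][token] += 1
            self.vocabulary.add(token)

    def log_score(self, text: str, label: str) -> float:
        """Log prior plus the smoothed log likelihood of each token.

        Assumes at least one message has been trained for each label.
        """
        total_messages = sum(self.message_counts.values())
        score = math.log(self.message_counts[label] / total_messages)
        denominator = sum(self.token_counts[label].values()) + len(self.vocabulary)
        for token in tokenize(text):
            score += math.log((self.token_counts[label][token] + 1) / denominator)
        return score

    def classify(self, text: str) -> str:
        """Pick the label with the higher score."""
        return max(self.LABELS, key=lambda label: self.log_score(text, label))


# Invented training messages, for illustration only.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win cash prize now", "spam")
spam_filter.train("cheap pills win now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch on monday with the team", "ham")

print(spam_filter.classify("win a cash prize"))
print(spam_filter.classify("agenda for the monday meeting"))
```

Training here is just counting, so new labeled messages can be folded in one at a time, and classifying a message only needs a pass over its tokens.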
Tags: analytics, ARS, auto rule scoring, Big Data, Cisco, database, email, Hadoop, ham, innovation, Intelligence, offline learning, online learning, operations, security, spam, TRAC
Following part two of our Big Data in Security series on University of California, Berkeley’s AMPLab stack, I caught up with talented data scientists Michael Howe and Preetham Raghunanda to discuss their exciting graph analytics work.
Where did graph databases originate and what problems are they trying to solve?
Michael: Disparate data types have a lot of connections between them and not just the types of connections that have been well represented in relational databases. The actual graph database technology is fairly nascent, really becoming prominent in the last decade. It’s been driven by the cheaper costs of storage and computational capacity and especially the rise of Big Data.
There have been a number of players driving development in this market, specifically research communities and businesses like Google, Facebook, and Twitter. These organizations are looking at large volumes of data with lots of inter-related attributes from multiple sources. They need to be able to view their data in a much cleaner fashion so that the people analyzing it don’t need to have in-depth knowledge of the storage technology or every particular aspect of the data. There are a number of open source and proprietary graph database solutions to address these growing needs and the field continues to grow.
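To show the kind of question a graph model makes easy to ask, here is a small Python sketch that stores disparate entity types as nodes, their relationships as labeled edges, and finds everything within a given number of hops of a starting node. It is an in-memory toy, not any particular graph database or query language; the `Graph` class, the node naming scheme, and the sample relationships are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class Graph:
    """Toy property-free graph: nodes are strings, edges carry a relation label."""

    def __init__(self) -> None:
        # node -> list of (relation, neighbour); edges are stored in both directions
        self.edges = defaultdict(list)

    def add_edge(self, source: str, relation: str, target: str) -> None:
        self.edges[source].append((relation, target))
        self.edges[target].append((relation, source))

    def within_hops(self, start: str, max_hops: int) -> dict:
        """Breadth-first search returning {node: hop distance} up to max_hops."""
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if distance[node] == max_hops:
                continue  # do not expand past the hop limit
            for _relation, neighbour in self.edges[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        del distance[start]
        return distance


# Invented security-flavoured data linking different entity types.
graph = Graph()
graph.add_edge("ip:192.0.2.7", "hosts", "domain:example.net")
graph.add_edge("domain:example.net", "linked_in", "email:msg-1")
graph.add_edge("email:msg-1", "sent_by", "sender:alice@example.org")

nearby = graph.within_hops("ip:192.0.2.7", 2)
print(nearby)
```

The analyst asks "what is connected to this address?" without needing to know how the domains, messages, and senders are stored, which is the cleaner view of the data Michael refers to.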
Tags: analytics, Big Data, Cisco, database, Gremlin, InfiniteGraph, innovation, Intelligence, NoSQL, operations, security, Titan, TRAC, TRAC Big Data Analysis