A consequence of the Moore-Nielsen prediction is the phenomenon known as Data Gravity: big data is hard to move around, and it is much easier for the smaller applications to come to it. Consider this: it took mankind over 2000 years, up until 2012, to produce 2 Exabytes (2×10^18 bytes) of data; now we produce this much in a day! The rate will only go up from here. With data production far exceeding the capacity of the Network, particularly at the Edge, there is only one way to cope, and it takes the shape of what I call the three mega trends in networking and (big) data in Cloud computing scaled to IoT, or as some say, Fog computing:
- Dramatic growth in the applications specialized and optimized for analytics at the Edge: Big Data is hard to move around (data gravity). We cannot move the data fast enough to the analytics, therefore we need to move the analytics to the data. This will cause dramatic growth in applications specialized and optimized for analytics at the edge. Yes, our devices have gotten smarter, yes, P2P traffic has become the largest portion of Internet traffic, and yes, M2M has arrived as the Internet of Things. There is no way to make progress but to make the devices smarter, safer and, of course, better connected.
- Dramatic growth in the computational complexity to ETL (extract-transform-load) essential data from the Edge to be data-warehoused at the Core: Currently, most open standards and open source efforts are buying us some time by squeezing as much information as possible, in as little time as possible, through limited connection paths to billions of devices. Soon enough we will realize there is a much more pragmatic approach to all of this. A jet engine produces more than 20 Terabytes of data for an hour of flight. Imagine the computational complexity we already have in place to boil that down to routing and maintenance decisions in such complex machines. Imagine the consequences of ignoring such capability, which can already be made available at rather trivial cost.
- The drive to instrument the data to be “open” rather than “closed”, with all the information we create, and all of its associated ownership and security concerns addressed: Open Data challenges have already surfaced, and there comes a time when we begin to realize that an Open Data interface, and guarantees about its availability and privacy, need to be made and enforced. This is what drives what is essentially a tie today between Public, Private and Hybrid cloud adoption (nearly one third each). With the ever-growing amount of data at the Edge, the issue of who “owns” it and how access to it is “controlled” becomes ever more relevant and important. At the end of the day, the producer/owner of the data must be in charge of its destiny, not some gatekeeper or web farm. This should not be any different from the very same rules that govern open source or open standards.
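The arithmetic behind the first two trends can be sketched in a few lines of Python. Only the 20 Terabytes per flight hour comes from the text above; the 10 Mbit/s uplink, the function names and the sample readings are assumptions made purely for illustration, not a description of any real engine telemetry system.

```python
from __future__ import annotations

# From the text: a jet engine produces more than 20 Terabytes per flight hour.
TB_PER_FLIGHT_HOUR = 20.0
# Assumption for illustration only: a hypothetical 10 Mbit/s uplink at the Edge.
ASSUMED_LINK_MBPS = 10.0


def transfer_hours(terabytes: float, link_mbps: float) -> float:
    """Hours needed to push `terabytes` of raw data through a `link_mbps` link."""
    bits = terabytes * 1e12 * 8            # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6)     # megabits per second to bits per second
    return seconds / 3600


def summarize(readings: list[float]) -> dict[str, float]:
    """Edge-side extract and transform: reduce raw readings to a few statistics."""
    return {
        "count": float(len(readings)),
        "min": min(readings),
        "max": max(readings),
        "mean": sum(readings) / len(readings),
    }


if __name__ == "__main__":
    hours = transfer_hours(TB_PER_FLIGHT_HOUR, ASSUMED_LINK_MBPS)
    print(f"Shipping one flight hour of raw data would take about {hours:.0f} hours")
    # Hypothetical sensor readings, summarized locally before anything is sent.
    print(summarize([612.0, 615.5, 611.2, 640.8]))
```

Under these assumed numbers, moving the raw data takes thousands of times longer than producing it, while the summary computed at the Edge is a handful of values. That is the sense in which the analytics has to come to the data.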
Last week I addressed these topics at the IEEE Cloud event at Boston University with wonderful colleagues from BU, Cambridge, Carnegie Mellon, MIT, Stanford and other researchers, plus of course, industry colleagues and all the popular, commercial web farms today. I was pleasantly surprised to see not just that the first two are top-of-mind already, but that the third one has emerged and is actually recognized. We have just started to sense the importance of this third wave, with huge implications in Cloud compute. My thanks to Azer Bestavros and Orran Krieger (Boston University), Mahadev Satyanarayanan (Carnegie Mellon University) and Michael Stonebraker (MIT) for the outstanding drive and leadership in addressing these challenges. I found Project Olive intriguing. We are happy to co-sponsor the BU Public Cloud Project, and most importantly, as we just wrapped up EclipseCon 2014 this week, very happy to see we are already walking the talk with Project Krikkit in Eclipse M2M. I made a personal prediction last week: just as most Cloud turned out to be Open Source, IoT software will all be Open Source. Eventually. The hard part is the Data, or should I say, Data Gravity…
Tags: Big Data, core, Data Gravity, Eclipse, edge, Enescu, ETL, Fog computing, IEEE, internet of things, IoT, krikkit, M2M, Moore, Nielsen, Open data, open source, virtualization
As information consumers who depend so much on the Network or Cloud, we sometimes indulge in thinking about what will happen when we really begin to feel the effects of Moore’s Law and Nielsen’s Law combined, at the edges: the amount of data, set against our ability to consume it (let alone stream it to the edge), is simply too much for our minds to process. We have already begun to experience this today: how much information can you consume on a daily basis from the collective of your so-called “smart” devices, your social networks or other networked services, and how much more data is left behind? The same goes for machine to machine: a jet engine produces terabytes of data about its performance in just a few minutes, and it would be impossible to send all this data to some remote computer or network and still act on the engine locally. We already know Big Data is not just growing, it is exploding!
The conclusion is simple: one day we will no longer be able to cope, unless the information is consumed differently, locally. Our brain may no longer be enough, so we hope to get help: Artificial Intelligence comes to the rescue and M2M takes off. But the new system must be highly decentralized in order to stay robust, or else it will crash like some kind of dystopian event from H2G2. Is it any wonder that even today, a large portion, if not the majority, of the world’s Internet traffic is in fact already P2P, and the majority of the world’s downloaded software is Open Source P2P? Just think of BitCoin and how it captures the imagination of the best or bravest developers and investors (and how ridiculous one of those categories could be, not realizing its potential current flaw, to the supreme delight of its developers, who will undoubtedly develop the fix, but that’s the subject of another blog).
Consequently, centralized, high-bandwidth style compute will break down at the bleeding edge, the cloud as we know it won’t scale, and a new form of computing emerges: fog computing, as a direct consequence of Moore’s and Nielsen’s Laws combined. Fighting this trend equates to fighting the laws of physics. I don’t think I can say it more simply than that.
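One way to see why the two laws pull apart is to compound them side by side. The sketch below is only an illustration under commonly quoted rates, both of which are assumptions here rather than figures from this post: roughly 60% per year for compute (a doubling about every 18 months, one popular reading of Moore’s Law) and roughly 50% per year for high-end user bandwidth (Nielsen’s Law). The function names are hypothetical.

```python
# Assumed annual growth rates (not taken from this post):
ASSUMED_COMPUTE_RATE = 0.60    # about a doubling every 18 months
ASSUMED_BANDWIDTH_RATE = 0.50  # Nielsen's Law, high-end user connection speed


def growth(annual_rate: float, years: int) -> float:
    """Compound growth factor after `years` at a fixed `annual_rate`."""
    return (1.0 + annual_rate) ** years


def gap(years: int,
        compute_rate: float = ASSUMED_COMPUTE_RATE,
        bandwidth_rate: float = ASSUMED_BANDWIDTH_RATE) -> float:
    """How many times faster compute has grown than bandwidth after `years`."""
    return growth(compute_rate, years) / growth(bandwidth_rate, years)


if __name__ == "__main__":
    for years in (1, 5, 10, 20):
        print(f"after {years:2d} years: compute x{growth(ASSUMED_COMPUTE_RATE, years):.1f}, "
              f"bandwidth x{growth(ASSUMED_BANDWIDTH_RATE, years):.1f}, "
              f"gap x{gap(years):.2f}")
```

With these assumed rates the gap keeps widening every year and never closes, so whatever is produced and processed at the edge increasingly outruns the pipe that would carry it to the core.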
Thus the compute model has already begun to shift: we will want our Big Data analyzed, visualized, private, secure and ready when we are. Finally, we begin to realize how vital it has become: can you live without your network, data, connection, friends or social network for more than a few minutes? Hours? Days? And when you rejoin it, how does it feel? And if you can’t, are you convinced that one day you must be in control of your own persona, your personal data, or else? Granted, while we shouldn’t worry too much about a Blade Runner dystopia or the H2G2 Krikkit story in Life, the Universe and Everything, there are some interesting things one could be doing, and more than just asking, as Philip K. Dick once did, do androids dream of electric sheep?
To enable this new beginning, we started in Open Source, looking to incubate a project or two. The first one is in Eclipse M2M, among a dozen or so dots we’d like to connect in the days and months to come, and we call it krikkit. The possibilities afforded by this new compute model are endless. One of those could be the ability to put us back in control of our own local and personal data, rather than leaving it with some central place, service or bot currently sold as a matter of convenience, fashion or scale. I hope that with the release of these new projects, we will begin to solve that together. What better way to collaborate than in the open? Perhaps this is what the Internet of Everything and data in motion should be about.
Tags: ai, Android, artificial intelligence, Big Data, BitCoin, Blade Runner, cloud, Do Androids Dream of Electric Sheep, Fog, Fog computing, H2G2, Internet of Everything, internet of things, IoE, IoT, krikkit, M2M, Moore Law, Nielsen Law, open source, p2p, privacy, security