What is Data Gravity?
The term ‘Data Gravity’ was coined by Dave McCrory in 2010. In his blog post he asserted that, analogous to the gravitational pull between objects in space, “. . . as data builds mass there is a greater likelihood that additional services and applications will be attracted to this data.” Once the attraction builds, the question is ‘does the data go to the service or application, or does the service or application go to the data?’ The answer is: it depends. So let’s look at a few considerations.
Evaluating a design scenario starts with several questions: Where is the data generated? Where does it create value? When do you need to know what the data means? And what happens if you need to retain the ability to operate in a disconnected state? To put these questions in a context that informs design, let’s examine a few scenarios.
Data gravity: Starting at the Edge
If your data is generated at the edge, you need to exploit that data in real time at the edge, and you need to retain specific operating capabilities when disconnected, then the answer is that the network needs to support AI/ML at the first device capable of hosting the required storage and compute. With that said, the design task turns to:
- defining a minimum mission viability level of localized compute and storage
- capitalizing on compute capability within Cisco’s Industrial Routers
- implementing data capture and context-aware routing strategies
- building apps that can tolerate connection degradation
- exploiting shared capacity among edge network peers.
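To make the idea of an app that tolerates connection degradation more concrete, here is a minimal store-and-forward sketch in Python. It is only an illustration, not a Cisco API: the class name `StoreAndForwardBuffer`, the `send` callback, and the `max_items` limit are all hypothetical. The assumption is that an edge app can queue readings locally while the uplink is down and drain the queue once it returns.

```python
from collections import deque
from typing import Callable, Deque, Dict

Reading = Dict[str, float]  # hypothetical shape of one edge reading


class StoreAndForwardBuffer:
    """Hold edge readings locally and forward them when the uplink allows.

    Hypothetical helper for illustration; `send` stands in for whatever
    transport the real application uses and returns True on success.
    """

    def __init__(self, send: Callable[[Reading], bool], max_items: int = 1000) -> None:
        self._send = send
        # Bounded queue: once full, the oldest reading is dropped first,
        # which caps local storage on a constrained edge device.
        self._pending: Deque[Reading] = deque(maxlen=max_items)

    def record(self, reading: Reading) -> None:
        """Queue a reading regardless of link state."""
        self._pending.append(reading)

    def flush(self) -> int:
        """Forward queued readings in order; stop at the first failed send."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # link is degraded; keep the rest for a later attempt
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

The bounded queue is one possible design choice: it trades completeness of history for a fixed storage footprint, which fits the "minimum mission viability level" of localized storage mentioned above. An application that cannot afford to lose readings would need a different policy.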
Data gravity: Looking Inward to the Operational Center
Let’s look at another case where we consume edge data for use at the operational level, where workflow optimization takes place. At this level, relevant data from the edge may characterize incoming mission demands, expected arrival of supporting resources, and currently deployed capacity. In this setting, scheduling optimization applications consume data from multiple sources, reaching into enterprise sources and historical archives on- and off-prem. They go beyond the enterprise to interface with partner organizations. And they interface with third-party services seamlessly (via Cisco CloudCenter).
When data gravity rests at the operational level, the design task may consider hybrid and multicloud strategies based on Cisco HyperFlex and Cisco Application Centric Infrastructure to meet on-prem compute, storage, and resiliency requirements. Cisco SD-WAN orchestrates off-prem interfaces, ensuring consistent access control, while Cisco Cloudlock maintains the segmentation boundaries between ecosystem member data and assures application privacy.
Going Big at the Strategic Level
Opposite the edge scenario is the case where data gravity rests at the strategic level. In this case, cross-enterprise data is centralized to support large-scale, deep introspection to find hidden connections, define unrealized relationships, and support downstream application development and training. Designs for AI/ML operations at this level feature:
- high-capacity storage in data lakes efficiently hosted on Cisco UCS
- high-intensity compute executable on the Cisco UCS C480 ML M5 Server, which hosts eight NVIDIA GPUs and NVIDIA NVLink to provide over 100 teraflops of deep learning performance!
Keeping it real on data gravity
So, what does reality look like? It probably reflects a combination of all three scenarios: the edge, the operational center, and the strategic heights. That combination calls for a distributed AI/ML architecture that is postured for today but capable of evolving in the future. Data gravity will likely shift in transforming organizations as they incorporate new data at the edge, make new data connections within their ecosystem, and create growing demand to constantly learn about how their operations are evolving. So, make it easy to work from Edge to Cloud with Cisco.
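As a rough way to reason about where a given workload’s data gravity sits, the three scenarios can be sketched as a simple placement rule in Python. This is a simplified reading of the questions posed earlier, not a prescriptive method: the names `Workload`, `Tier`, and `place` are hypothetical, and real designs weigh many more factors.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    EDGE = "edge"
    OPERATIONAL = "operational center"
    STRATEGIC = "strategic level"


@dataclass
class Workload:
    # Hypothetical yes/no answers to the design questions in this post.
    generated_at_edge: bool
    needs_real_time_at_edge: bool
    must_run_disconnected: bool
    needs_cross_enterprise_data: bool


def place(w: Workload) -> Tier:
    """Suggest where AI/ML compute and storage should sit for a workload."""
    # Assumption: edge-born data that needs real-time use OR disconnected
    # operation stays at the edge (a slightly looser rule than requiring both).
    if w.generated_at_edge and (w.needs_real_time_at_edge or w.must_run_disconnected):
        return Tier.EDGE
    # Large-scale, cross-enterprise analysis and training are centralized.
    if w.needs_cross_enterprise_data:
        return Tier.STRATEGIC
    # Otherwise, treat it as workflow optimization at the operational center.
    return Tier.OPERATIONAL
```

For example, `place(Workload(True, True, True, False))` returns `Tier.EDGE`. As an organization adds data sources, the answers change, and so can the suggested tier.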
Artificial Intelligence: Join the conversation
Cisco is proud to be a Platinum Sponsor of this year’s NVIDIA GPU Technology Conference, the premier event on artificial intelligence. Be sure to visit our booth (#104) and we’ll bring you up to speed on the latest Cisco solutions and our expanding AI ecosystem of partners. Plus, be sure to join us for two exciting Cisco-led sessions.
When: November 5-6
Where: Washington DC
Government attends for FREE
Social Media: #GTC19