About this Talk
Enterprises employ graph systems (such as property graphs, knowledge graphs, etc.) to model business semantics and gather insights from their entire data estate. Sources of truth are generally siloed, and business insight requires traversal through multiple of these silos for scenarios such as Customer 360, logistics, and security. Graph systems are the emerging technology to achieve that.
Data is continuously evolving in multiple ways: schemas change as new columns or details are added to existing sources of truth; entirely new datasets enrich the business logic; and the data itself is continuously augmented, streaming in around the clock. Analytics systems generally deal with data at rest, while real-world enterprises have data in motion.
Graph systems achieve their latency and richness through indexing and/or de-normalization for fast traversal. To minimize the cost of such techniques, the tradeoff is to silo the analysis most of the time but still bring the “whole” graph together when needed.
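To make the traversal benefit concrete, here is a minimal, hypothetical illustration (the data, names, and function are ours, not the talk's): two siloed tables are de-normalized once into an adjacency index, so a Customer 360 lookup becomes a dictionary access rather than a per-query cross-silo join.

```python
from collections import defaultdict

# Hypothetical sources of truth living in separate silos.
crm_rows = [("cust-1", "Alice"), ("cust-2", "Bob")]                # CRM: (customer id, name)
order_rows = [("order-9", "cust-1"), ("order-10", "cust-1"),
              ("order-11", "cust-2")]                              # orders: (order id, customer id)

# De-normalized adjacency index, built once: customer id -> related order ids.
orders_by_customer = defaultdict(list)
for order_id, customer_id in order_rows:
    orders_by_customer[customer_id].append(order_id)

def customer_360(customer_id: str) -> dict:
    """One-hop traversal over the index; no per-query join against the silos."""
    return {
        "customer": dict(crm_rows).get(customer_id, "unknown"),
        "orders": orders_by_customer.get(customer_id, []),
    }

print(customer_360("cust-1"))  # {'customer': 'Alice', 'orders': ['order-9', 'order-10']}
```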
We propose a framework to handle the realities of data in motion using graph systems. These are generic techniques anyone can employ and model over their existing systems; our product blueprint will incorporate this framework into its roadmap. The framework is built on the following principles:
The graph model defines the graph schema and semantics, but the index should be built lazily to optimize cost. We strongly believe that graphs are valuable and best adopted as indices: an overlay that requires no changes to the sources of truth in the enterprise.
Models should be hierarchical so that semantics can be composed from multiple sources, allowing either localized use of smaller graphs or global views that span multiple graphs. Users can further project or extend these models as needed. Enterprises can thus deploy stable semantics while users still experiment, avoiding rigid schemas.
Systems that support accretive schema and data changes enable real-time updates to these graphs, as illustrated in the sketch below.
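The sketch below (assumed names and API, a simplified illustration rather than the product's implementation) shows the three principles together: a graph model that overlays unchanged sources of truth, an index built lazily only for vertices that are traversed, hierarchical composition of smaller models into a global view, and accretive changes that update the index without a rebuild.

```python
from collections import defaultdict
from typing import Callable, Iterable, Tuple

Edge = Tuple[str, str, str]  # (source id, relationship, target id)


class GraphModel:
    """A graph overlay: edges are derived on demand from one source of truth."""

    def __init__(self, name: str, edge_source: Callable[[str], Iterable[Edge]]):
        self.name = name
        self.edge_source = edge_source   # reads the silo; the silo itself is never modified
        self._index = defaultdict(list)  # vertex id -> list of edges
        self._indexed = set()            # vertices whose source edges were already pulled

    def neighbors(self, vertex: str) -> list:
        # Lazy indexing: the cost is paid only for vertices that are actually traversed.
        if vertex not in self._indexed:
            self._index[vertex].extend(self.edge_source(vertex))
            self._indexed.add(vertex)
        return self._index[vertex]

    def apply_accretive_change(self, edge: Edge) -> None:
        # Accretive update: the new fact is appended to the overlay; nothing is rebuilt.
        self._index[edge[0]].append(edge)


class ComposedModel:
    """Hierarchical composition: a global view spanning several smaller models."""

    def __init__(self, models):
        self.models = models

    def neighbors(self, vertex: str) -> list:
        return [edge for model in self.models for edge in model.neighbors(vertex)]


# Hypothetical usage: a CRM model and a logistics model queried as one Customer 360 view.
crm = GraphModel("crm", lambda v: [(v, "PLACED", "order-9")] if v == "cust-1" else [])
logistics = GraphModel("logistics", lambda v: [])
customer_360 = ComposedModel([crm, logistics])

print(customer_360.neighbors("cust-1"))                       # index built on first access
crm.apply_accretive_change(("cust-1", "PLACED", "order-10"))
print(customer_360.neighbors("cust-1"))                       # streaming fact visible, no rebuild
```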
We will provide examples through a fictional graph scenario the audience can relate to, and we will show how these techniques can handle exabyte-scale streaming data estates in a cost-effective manner.