Ryan Wright

Real-time Graph Analytics on Streaming Data

A Talk by Ryan Wright (Philosopher-CTO, thatDot)

About this Talk

Data is infinite now. It’s not going to stop; it isn’t even going to slow down. Building and using tomorrow’s data systems will require new techniques for understanding complex, endless data in real time. That’s what we’re going to tackle in this masterclass.

The goal of this masterclass is to give participants a new superpower: to work comfortably with infinite streaming data and turn it into real-time connected insights. Welcome to the exciting world of streaming graphs!

This is a hands-on class where we learn by doing. After a little background to introduce the key new ideas, we’ll dive into a series of hands-on exercises in which participants use the open source “Quine” streaming graph (https://quine.io) on batch and streaming data input.

What makes a “streaming graph” streaming is not only that data comes into it as a stream, but also that results stream out. We’ll use a powerful capability called “standing queries” to monitor the entire graph for patterns expressed as a standard database query. Think of this as continually querying the entire graph for anything you like, except that it’s far more efficient and you don’t have to know when the right time is to issue your query; it’s just always running. Results from your query stream out over time, or can even call back into the graph to advance to the next step of an algorithm.
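To make the standing-query idea concrete, here is a minimal sketch in plain Python. It is not Quine's API or implementation: the class `ToyStandingQuery`, its methods, and the sample edge labels are all hypothetical, chosen only to illustrate the idea. The sketch watches a stream of edges for a two-hop pattern and emits each match at the moment the edge that completes it arrives, rather than re-running a query over the whole graph.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, label, target)


class ToyStandingQuery:
    """Hypothetical illustration, not Quine's API.

    Watches an edge stream for the pattern (a)-[first]->(b)-[second]->(c).
    """

    def __init__(self, first: str, second: str,
                 on_match: Callable[[str, str, str], None]) -> None:
        self.first = first
        self.second = second
        self.on_match = on_match
        # Partial matches are remembered per middle node b.
        self.incoming: Dict[str, Set[str]] = defaultdict(set)  # b -> {a}
        self.outgoing: Dict[str, Set[str]] = defaultdict(set)  # b -> {c}

    def observe(self, edge: Edge) -> None:
        """Process one edge and emit every match that this edge completes."""
        source, label, target = edge
        if label == self.first and source not in self.incoming[target]:
            self.incoming[target].add(source)
            for c in sorted(self.outgoing[target]):
                self.on_match(source, target, c)
        if label == self.second and target not in self.outgoing[source]:
            self.outgoing[source].add(target)
            for a in sorted(self.incoming[source]):
                self.on_match(a, source, target)


results: List[Tuple[str, str, str]] = []
query = ToyStandingQuery("LOGGED_IN", "DOWNLOADED",
                         lambda a, b, c: results.append((a, b, c)))

stream = [
    ("alice", "LOGGED_IN", "host1"),
    ("host1", "DOWNLOADED", "file9"),  # completes alice -> host1 -> file9
    ("bob", "LOGGED_IN", "host1"),     # completes bob -> host1 -> file9
    ("host1", "DOWNLOADED", "file9"),  # duplicate edge, no new result
]
for edge in stream:
    query.observe(edge)

print(results)
```

Each incoming edge does only the work needed to extend the partial matches it touches, which is the intuition behind "always running" queries. The `on_match` callback could just as well write new edges back into the graph to drive the next step of an algorithm.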

With a solid understanding of what we can do with streaming graphs, the class will conclude with an application of graph neural networks to streaming graphs. To make this practical in a streaming scenario, we’ll make use of Quine’s ability to maintain a fully-versioned graph and query back in time to access historical states, even while new data streams in.
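The idea of querying a versioned graph "back in time" can be sketched with an append-only change log. This is an illustration only and makes no claim about how Quine stores or versions data: the class `ToyVersionedGraph` and its methods are hypothetical. Every change is recorded with a timestamp, so the edge set at any earlier moment can be rebuilt by replaying the log up to that moment, while new changes keep being appended.

```python
from typing import List, Set, Tuple


class ToyVersionedGraph:
    """Hypothetical illustration: an append-only log of timestamped edge changes."""

    def __init__(self) -> None:
        # Each entry is (timestamp, operation, source, target).
        self._log: List[Tuple[int, str, str, str]] = []

    def _append(self, timestamp: int, op: str, source: str, target: str) -> None:
        # Assumption for this sketch: events arrive in timestamp order.
        if self._log and timestamp < self._log[-1][0]:
            raise ValueError("events must arrive in timestamp order")
        self._log.append((timestamp, op, source, target))

    def add_edge(self, timestamp: int, source: str, target: str) -> None:
        self._append(timestamp, "add", source, target)

    def remove_edge(self, timestamp: int, source: str, target: str) -> None:
        self._append(timestamp, "remove", source, target)

    def edges_at(self, timestamp: int) -> Set[Tuple[str, str]]:
        """Replay the log to rebuild the edge set as of the given time (inclusive)."""
        edges: Set[Tuple[str, str]] = set()
        for ts, op, source, target in self._log:
            if ts > timestamp:
                break
            if op == "add":
                edges.add((source, target))
            else:
                edges.discard((source, target))
        return edges


graph = ToyVersionedGraph()
graph.add_edge(1, "a", "b")
graph.add_edge(2, "b", "c")
graph.remove_edge(3, "a", "b")
graph.add_edge(4, "c", "d")

print(sorted(graph.edges_at(2)))  # state before the removal
print(sorted(graph.edges_at(4)))  # latest state
```

Replaying the full log costs time proportional to its length, so this only shows the semantics of a historical query. A production system would need indexing or snapshots to answer such queries efficiently.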

This class is aimed at data engineers, data scientists, product managers, and the managers of these teams. No deep experience is assumed or required. By the end of this masterclass, students will have built several useful applications with streaming graphs, and have the foundational knowledge and skills to apply these tools in their own environments to turn infinite data streams into real-time answers to deep questions.

Key Topics

  • Streaming data vs. Batch data
  • Streaming graphs
  • Event-driven data pipelines
  • Creating a graph from streaming data using Cypher
  • Monitoring dynamically changing graphs for insightful patterns
  • Implementing algorithms on a dynamically changing graph
  • Streaming graph-based machine learning and graph embedding

Target Audience

  • Data Engineers
  • Data Scientists & Machine Learning Engineers
  • Data Analysts
  • Managers of the above

Goals

Get hands-on experience with graph analytics on live streaming data. Build the understanding needed to apply these techniques to problems in your own work and life.

Session outline:

Introduction to streaming graphs (20 minutes)

  • Streaming data vs. Batch data
  • Graphs in streaming data
  • Introduction to the Quine streaming graph

Hands on #1: Build a streaming graph from a static data set (30 minutes)

  • Get up and running
  • Data source and goals
  • Creating the data ingest
  • Exploring the data

Hands on #2: Build and analyze a streaming graph from a live streaming dataset (30 minutes)

  • Streaming sources
  • Standing Queries
  • Graph Algorithms

Hands on #3: Graph Neural Networks with streaming graphs (30 minutes)

  • Temporal Queries
  • Random Walks
  • Graph Neural Networks

Conclusion (10 minutes)

  • Applications in the real world and how to get there

Format

This class is very hands-on. It will begin in a lecture format but will quickly move to hands-on exercises.

Each of the exercises is meant to be run independently on participants’ own laptops (macOS, Windows, or Linux). The exercises will make use of the open source Quine streaming graph software to ingest data, build graphs, perform streaming operations, and produce output from those graphs. Sample data will be provided.

Participants will edit text files and execute command-line programs, then see their changes in a web browser served by their own local web server. The text files are in YAML format and contain Cypher queries, which participants will learn to write in order to orchestrate and customize their streaming graph applications. Some REST API calls to the local web server may be useful for deeper understanding or customization.

The hands-on sessions will conclude with an exercise demonstrating streaming graphs for Graph Neural Networks (GNNs), where participants can execute Python code to train their own graph neural network with streaming graph data.
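The session outline lists random walks alongside graph neural networks. One common use of random walks in graph machine learning is to turn a graph into node sequences that an embedding model can train on. The sketch below shows that step in plain Python under stated assumptions: the function `random_walks`, its parameters, and the sample adjacency list are hypothetical and are not taken from the class materials or from Quine. It samples uniform random walks from a fixed snapshot of a graph.

```python
import random
from typing import Dict, List


def random_walks(adjacency: Dict[str, List[str]], walk_length: int,
                 walks_per_node: int, seed: int = 0) -> List[List[str]]:
    """Sample uniform random walks from every node of a graph snapshot.

    A walk stops early if it reaches a node with no outgoing edges.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    walks: List[List[str]] = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency.get(walk[-1], [])
                if not neighbors:
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical snapshot of a small directed graph.
snapshot = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": [],  # dead end: walks starting here have length 1
}

walks = random_walks(snapshot, walk_length=5, walks_per_node=2)
for walk in walks:
    print(" -> ".join(walk))
```

In a streaming setting the snapshot would come from a historical query at a chosen timestamp, so that all walks in a training batch see one consistent state of the graph.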

Level

Beginner - Intermediate

Prerequisite Knowledge

Basic familiarity with running programs at the command line and editing text files. Some familiarity with the Cypher graph query language is helpful but not required. Basic Python experience is helpful for the final exercise.


11 December 2024, 11:45 AM

Advanced Graph Stage

11:45 AM - 01:45 PM

About The Speakers

Ryan Wright

Philosopher-CTO, thatDot

thatDot turns high-volume data into high-value data. Initially funded by DARPA, we have developed new technology for enterprise users to trigger real-time action from complex patterns pulled from high-volume event streams.

Location

Convene 133 Houndsditch

133 Houndsditch, London
