Real time analytics with Apache Cassandra and Apache Spark

Store
06/01/2015 - 12:00 to 12:40
Stage 2
long talk (40 min)
Intermediate

Session abstract: 

Time series data is everywhere: IoT, sensor data, financial transactions. The industry has moved to databases like Cassandra to handle the high velocity and high volume of data that is now commonplace. However, data is pointless unless you can process it in near real time. That's where Spark combined with Cassandra comes in: what was once just your storage system can be transformed into your analytics system, and you'll be surprised how easy it is!

So, join me for a whirlwind tour of how to use these two awesome open source projects for time series data. We'll cover:

  • An overview of Cassandra - Why is it so good for time series?
  • An introduction to Spark 
  • What can’t be done in Cassandra and how Spark can fill in the gaps
  • How to build analytics on top of your operational data without the typical Extract-Transform-Load (ETL) step
  • Specific use cases: sensor data, customer event data, financial transactions

My goal is that by the end of this talk you’ll know whether Cassandra/Spark are right for your next project.

Video: 

Slide: