Detecting Events on the Web with Java, Kafka and ZooKeeper

Scale
06/01/2015 - 15:50 to 16:30
Stage 3
long talk (40 min)
Intermediate

Session abstract: 

Brandwatch is a world-leading social media monitoring tool. We find, process and store nearly 70M mentions from the Web every day from our crawlers and firehoses. As the amount of data we find continually increases, it becomes harder to separate the signals from the noise for our customers.

Over the last year, we have been building, experimenting with and scaling a distributed cluster of JVMs that processes the data from our clients’ queries, detects influential mentions of their brands online, and alerts on unusual and important trends. We used Apache Kafka, ZooKeeper, and Spring to achieve this.

This talk explains the architecture that we built, how it performs and scales, and some of the difficulties we’ve faced along the way. We’ve learned a lot in the process. We hope you’ll learn from us too!

Video: 

Slide: