From Machine Learning Startup to Big Data Company

Scale
06/02/2015 - 15:20 to 16:00
long talk (40 min)
Intermediate

Session abstract: 

How to start a company based on a machine learning idea? And how to scale it into the "Big Data" region?

In this talk I want to share some insights that I gathered during the last 3 years while founding and successfully scaling a real-time bidding (RTB) company from a two-person startup to a leading technology provider in the field:

  • From fancy algorithms to production-proof algorithms.
  • From thousands of model evaluations per day to trillions.
  • From megabytes to petabytes.
  • From real-time to batch to real-time.
  • From two people to entire teams of data scientists and engineers.

I want to present real world examples of pitfalls we were facing, bad technology decisions we made and other things that can and will go wrong. And how to make the best out of it!

Buzzwords involved: Hadoop, Kafka, Spark, Impala, Redis, Aerospike, …

Slides: https://speakerdeck.com/ctavan/from-machine-learning-startup-to-big-data-company

Video: