Approaching Join Index for Lucene

Search
06/01/2015 - 17:00 to 17:40
Stage 2
long talk (40 min)
Intermediate

Session abstract: 

Lucene works great with independent text documents, but real-life problems often require handling relations between documents. Aside from several workarounds, such as term encodings, field collapsing, or term positions, there are two mainstream approaches to handling document relations: join and block-join. Both have their downsides. Join lacks performance, while block-join makes index updates really expensive, since an update requires wiping the whole block of related documents.

This session presents an attempt to apply the join index, borrowed from the RDBMS world, to address the drawbacks of both join approaches currently present in Lucene. We will look into the idea per se, discuss possible implementation approaches, and review the benchmarking results.
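
As a rough illustration of the general idea only (not the implementation discussed in the session), a join index can be thought of as a precomputed mapping between related document ids, so that a query follows stored pairs instead of matching join keys at search time. The toy data, the field names, and the helpers `build_join_index` and `join_to_parents` below are all hypothetical and do not correspond to any Lucene API.

```python
# Hypothetical toy "index": doc id -> stored fields (not a Lucene structure).
parents = {
    0: {"id": "p1", "name": "shirt"},
    1: {"id": "p2", "name": "jeans"},
}
children = {
    10: {"parent_id": "p1", "color": "red"},
    11: {"parent_id": "p1", "color": "blue"},
    12: {"parent_id": "p2", "color": "blue"},
}


def build_join_index(parents, children):
    """Precompute child doc id -> parent doc id pairs once, ahead of queries."""
    parent_by_key = {fields["id"]: doc_id for doc_id, fields in parents.items()}
    return {
        child_doc: parent_by_key[fields["parent_id"]]
        for child_doc, fields in children.items()
        if fields["parent_id"] in parent_by_key
    }


def join_to_parents(matching_children, join_index):
    """Map matching child doc ids to parent doc ids via the stored pairs."""
    return sorted({join_index[c] for c in matching_children if c in join_index})


join_index = build_join_index(parents, children)

# Child-side query: all children whose color is "blue".
blue_children = [doc for doc, fields in children.items() if fields["color"] == "blue"]

# The join itself is a lookup per matching child, with no key comparison.
blue_parents = join_to_parents(blue_children, join_index)
```

In this sketch, changing one child touches only its own entry in `join_index`, whereas a block-style layout would have to rewrite the parent together with all of its children.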

Join us and get to know a cool forthcoming feature!

During the session, attendees will learn:

  • How modern join algorithms work, along with their strengths and weaknesses.
  • How RDBMS join indices can be applied to Lucene.
  • Which use cases can benefit from join indices, as supported by benchmarks.
