Compression in Lucene

Search
06/02/2015 - 17:20 to 18:00
Stage 2
long talk (40 min)
Beginner

Session abstract: 

Modern search engines can store billions of records containing both text and structured data, but as the amount of data being searched grows, so do the requirements for disk space and memory. Various compression techniques are used to reduce the storage required while still allowing fast access for search.

While Lucene has always used compression for its inverted index, compression techniques have since improved and been extended to other parts of the index, such as the built-in document store and column-oriented data store. In this presentation, Ryan Ernst will give an introduction to how compression is used in Lucene, including recent improvements in Lucene 5.0.
