Published on Jul 27, 2016
Vladimir Rodionov (Hortonworks) / Clara Xiong (Flurry/Yahoo!)
Time-series applications (sensor data, application and system logging events, user interactions, etc.) present a new set of data storage challenges: very high velocity and very high volume of data. This talk presents recent developments in Apache HBase that make it a good fit for time-series applications. With petabytes of data on thousands of nodes replicated across multiple data centers, and growing at an accelerating rate, we have been running a workload at scale in which I/O bandwidth is the bottleneck. The talk covers a new compaction policy that improves the efficiency of time-range scans over various look-back windows by structuring and maintaining a date-tiered store file layout for time-series data with infrequent updates and deletes.
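To make the idea of a date-tiered layout concrete, here is a minimal Python sketch. It is not HBase code. The function names, parameters, and the (name, max timestamp) file representation are all hypothetical, chosen only for illustration. It assumes that recent data sits in small time windows, that older data is merged into windows that grow by a fixed factor per tier, and that each store file is assigned to the window containing its newest timestamp.

```python
from collections import defaultdict

def tiered_windows(now, base_window, windows_per_tier, max_windows):
    """Return contiguous (start, end) windows, newest first.

    Hypothetical helper: windows near `now` have size `base_window`;
    each older tier is `windows_per_tier` times larger than the previous one.
    """
    size = base_window
    pos = now // size  # index of the window that contains `now`
    windows = []
    for _ in range(max_windows):
        windows.append((pos * size, (pos + 1) * size))
        if pos % windows_per_tier > 0:
            # Still inside the current tier: step back one same-size window.
            pos -= 1
        else:
            # Reached a tier boundary: switch to the next, larger window size.
            size *= windows_per_tier
            pos = pos // size * 1 - 1 if False else (pos * (size // windows_per_tier)) // size - 1
    return windows

def group_files(files, windows):
    """Group (name, max_timestamp) files by the window holding max_timestamp.

    Files older than the oldest window are left out, which models data
    that is no longer considered for compaction in this sketch.
    """
    groups = defaultdict(list)
    for name, max_ts in files:
        for start, end in windows:
            if start <= max_ts < end:
                groups[(start, end)].append(name)
                break
    return dict(groups)

def windows_for_scan(windows, scan_start, scan_end):
    """Return only the windows that overlap the half-open range [scan_start, scan_end)."""
    return [(s, e) for (s, e) in windows if s < scan_end and scan_start < e]
```

With a base window of 1 time unit and 4 windows per tier, calling tiered_windows(10, 1, 4, 5) yields three unit-sized windows for the newest data followed by two windows of size 4. A time-range scan over a short look-back window then only needs the files grouped under the windows returned by windows_for_scan, which is the I/O saving the abstract describes.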