Published on Nov 3, 2016
The SoftElegance Data Department is building a unified Data Lake for the oil and gas industry, and Spark is an important part of that infrastructure. Only in recent years have up-to-date Big Data technologies and IoT (the various sensors on oil rigs) made it possible to process gigabytes or terabytes of raw data collected from the rigs (once it is transferred and stored properly) in near real time and to run predictive analytics on it. The introduction of the presentation gives an architectural overview of the Data Lake, with a short description of the technologies used and the business reasons for developing it. The main part shows a practical example of how to use Spark Streaming to collect and preprocess data from oil rigs and then reuse that data with Apache Spark MLlib to build predictive maintenance. It presents the mathematical model used to predict rod pump failures. It also shows the full data flow cycle, with the technologies used for each process executed during data streaming: data ingestion, preprocessing, analysis, and prediction. The main focus is on Spark Streaming batch processing and MLlib. The conclusion offers a few words on why it was not possible to develop predictive models before.