Rafael Schultze-Kraft - Building smart IoT applications with Python and Spark




Published on Jul 26, 2017

In this talk I will present how we use Python, PySpark, and AWS as our preferred data science stack for the Internet of Things. This stack allows us to efficiently develop and deploy smart data applications on top of IoT sensor data. We use these technologies to analyse and model IoT time-series data, and to build automated, scalable data pipelines for smart IoT data applications in the cloud.

The Internet of Things and Industry 4.0 are here, bringing with them a vast number of connected devices and sensors that produce even more data.

To build smart applications on top of IoT sensor data, we need to deal with the challenges that come with time-series data from a large number of devices.

At WATTx we build data application prototypes in the fields of smart homes, smart buildings, and smart climate. This involves making use of data from many IoT sensors measuring, among other things, temperature, humidity, motion, and luminance.

The purpose of this talk is to present how we use Python and Spark to effectively analyse and model IoT data. In particular, I will show how we use Python to process and model data from multiple IoT sensors and build machine learning models on top of it, and how we use Spark to scale and deploy those models in automated data pipelines in the cloud as smart IoT applications.
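
As a rough illustration of the "process data from multiple IoT sensors" step, the sketch below aligns irregular readings from several sensors onto fixed time windows and averages them, producing one feature row per device and window. It is not taken from the talk: the sample readings, the 15-minute window width, and the names `READINGS`, `window_start`, and `to_feature_rows` are all assumptions made for illustration, and it uses only the Python standard library rather than PySpark.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

UTC = timezone.utc

# Hypothetical sample readings: (device_id, sensor, timestamp, value).
READINGS = [
    ("room-1", "temperature", datetime(2017, 7, 1, 12, 1, tzinfo=UTC), 21.0),
    ("room-1", "temperature", datetime(2017, 7, 1, 12, 7, tzinfo=UTC), 22.0),
    ("room-1", "humidity", datetime(2017, 7, 1, 12, 4, tzinfo=UTC), 40.0),
    ("room-1", "temperature", datetime(2017, 7, 1, 12, 16, tzinfo=UTC), 23.0),
    ("room-2", "temperature", datetime(2017, 7, 1, 12, 3, tzinfo=UTC), 19.0),
]


def window_start(ts, width):
    """Floor a timestamp to the start of its fixed-width window."""
    epoch = datetime(1970, 1, 1, tzinfo=UTC)
    # timedelta // timedelta yields the integer number of whole windows.
    return epoch + ((ts - epoch) // width) * width


def to_feature_rows(readings, width=timedelta(minutes=15)):
    """Average each sensor per (device, window) and pivot sensors into columns."""
    buckets = defaultdict(list)
    for device, sensor, ts, value in readings:
        buckets[(device, window_start(ts, width), sensor)].append(value)

    rows = defaultdict(dict)
    for (device, start, sensor), values in buckets.items():
        rows[(device, start)][sensor] = mean(values)
    return dict(rows)


if __name__ == "__main__":
    for (device, start), features in sorted(to_feature_rows(READINGS).items()):
        print(device, start.isoformat(), features)
```

Rows of this shape (one per device and time window, one column per sensor) are a common starting point for a machine learning model. On a cluster, the same grouping could plausibly be expressed as a PySpark `groupBy` over device and a time window, but that mapping is an assumption here, not something the abstract specifies.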

I will use the development of predictive models for smart building applications as a real-world example to demonstrate this setup.

I hope this talk will show how Python and PySpark, in conjunction with AWS, are powerful tools for working with time-series sensor data from the Internet of Things and for building data products on top of it.


PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
