Martina Pugliese - Spotting trends and tailoring recommendations: PySpark on Big Data in fashion





The interactive transcript could not be loaded.


Rating is available when the video has been rented.
This feature is not available right now. Please try again later.
Published on Jun 1, 2016

PyData Berlin 2016

Predicting what people like when they choose what to wear is a non-trivial task involving several ingredients. At Mallzee, the data is variegated, large and has to be processed quickly to produce recommendations on products, tailored to each user. We use PySpark to crunch large sets of different data and create models in order to generate robust and meaningful suggestions.

Mallzee is a fashion app where people can see products (clothes, shoes and accessories) and decide whether they like them or not. They can also buy products and create feeds of preferred brands and categories of items. We have large amounts of data generated by the users when they scroll and search through products and we use it to understand the user. We want to give everyone meaningful recommendations on the items they might like, hence tailoring the experience to who they are. We build pseudo-intelligent algorithms capable of extracting the style profile of a user and we crunch products data to match items to the user based on a classification model. The model is validated and statistical analyses are performed to determine the tipping point when recommendations are valuable to the user. The talk will go through the steps we implement by showing how the full data stack of Python is used in achieving this goal and how it is interfaced with a Spark cluster through PySpark in order to run Machine Learning algorithms on big data.

Comments are disabled for this video.
When autoplay is enabled, a suggested video will automatically play next.

Up next

to add this to Watch Later

Add to

Loading playlists...