Published on May 1, 2015
PyData Dallas 2015

Scikit-Learn is one of the most popular machine learning libraries written in Python. It has an active community and covers a wide range of machine learning algorithms. It provides feature extraction, feature selection and model selection algorithms, and validation methods for building a modern machine learning pipeline. It also provides higher-level structures that make the pipeline and workflow easier to build, such as feature unions, pipelines, grid parameter search and randomized parameter search.

This tutorial introduces common recipes for building a modern machine learning pipeline for different input domains and shows how one might construct the components using advanced features of Scikit-Learn. Specifically, I will try to go over the following steps in Scikit-Learn:
- Introduce various feature extraction methods for image and text
- Explain how one might use various feature selection algorithms to capture information-rich features and ignore the irrelevant or redundant ones
- Show various approaches and methods for parameter optimization within Scikit-Learn
- Explain and compare different validation scores and metrics for evaluating model accuracy
- Introduce how one could do model selection
- Show how one could deploy the model into production

Then I will introduce more advanced features and methods:
- Pipeline structures and parameter optimization within the grid search
- Randomized search to make the parameter search more intelligent and efficient
- Feature unions to make the features more diverse and rich
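As a rough illustration of how the pieces listed above fit together, here is a minimal sketch, not the code from the talk. It assumes a recent Scikit-Learn release (where the search classes live in `sklearn.model_selection`) and SciPy 1.4 or later. The iris dataset, the choice of PCA plus univariate selection inside the feature union, the logistic regression classifier and every parameter range are illustrative assumptions.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    GridSearchCV,
    RandomizedSearchCV,
    train_test_split,
)
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset, used here only as a stand-in (assumption).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Feature union: concatenate the outputs of two feature steps side by side.
features = FeatureUnion([
    ("pca", PCA(n_components=2)),
    ("kbest", SelectKBest(f_classif, k=2)),
])

# Pipeline: scaling, feature construction and the classifier as one estimator.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("features", features),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Nested parameters are addressed as <step>__<substep>__<parameter>.
param_grid = {
    "features__pca__n_components": [1, 2, 3],
    "features__kbest__k": [1, 2],
    "clf__C": [0.1, 1.0, 10.0],
}

# Grid search: cross-validate every combination in param_grid.
grid = GridSearchCV(pipeline, param_grid, cv=5)
grid.fit(X_train, y_train)
print("grid search best params:", grid.best_params_)
print("grid search test accuracy:", grid.score(X_test, y_test))

# Randomized search: sample a fixed number of candidates, drawing C from a
# continuous log-uniform distribution instead of a fixed list.
param_distributions = {
    "features__pca__n_components": [1, 2, 3],
    "features__kbest__k": [1, 2],
    "clf__C": loguniform(1e-2, 1e2),
}
random_search = RandomizedSearchCV(
    pipeline, param_distributions, n_iter=10, cv=5, random_state=0
)
random_search.fit(X_train, y_train)
print("randomized search best params:", random_search.best_params_)
print("randomized search test accuracy:", random_search.score(X_test, y_test))
```

Because the scaler and the feature steps sit inside the pipeline, they are refit on each training fold during cross-validation, so the validation folds do not leak into preprocessing. The randomized search evaluates only `n_iter` candidates, which keeps the cost fixed as more parameters are added.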