Max Humber - Patsy: The Lingua Franca to and from R





Published on Jul 26, 2017

How to build R-like statistical models in Python with Patsy and scikit-learn.

Creating linear and logistic models in R is dead simple. If your numpy/pandas-fu isn’t all that great, then it’s a lot harder to do in Python. In R, for instance, you can declare a model with a formula as simple as y ~ x1 + x2. But in Python, you have to split out your target and input variables yourself and make sure that the resulting matrices work within the scikit-learn API.
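To make the contrast concrete, here is a minimal sketch that builds the same matrices twice: once by hand and once from the formula y ~ x1 + x2 using Patsy's dmatrices. The toy DataFrame and its values are invented for illustration; only the column names come from the formula above.

```python
import numpy as np
import pandas as pd
from patsy import dmatrices

# Hypothetical toy data; the column names mirror the formula y ~ x1 + x2.
df = pd.DataFrame({
    "y":  [0, 1, 0, 1, 1, 0],
    "x1": [1.0, 2.5, 0.3, 3.1, 2.2, 0.9],
    "x2": [5.0, 3.0, 6.5, 1.0, 2.0, 7.0],
})

# Manual approach: select the target and the inputs by hand.
X_manual = df[["x1", "x2"]].to_numpy()
y_manual = df["y"].to_numpy()

# Patsy approach: describe the model with an R-style formula.
# dmatrices returns the response matrix first, then the design matrix.
y, X = dmatrices("y ~ x1 + x2", data=df, return_type="dataframe")

# Like R, Patsy adds an intercept column unless the formula removes it.
print(list(X.columns))
```

Note that the Patsy design matrix carries an extra Intercept column that the hand-built one does not.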

In this talk I will introduce the Patsy package for describing and creating statistical models in Python. I’ll walk through how to implement a logistic regression with Patsy and scikit-learn and I’ll emphasize Patsy as a bridge for those who want to better understand Python and/or R.
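As a rough sketch of what a Patsy plus scikit-learn logistic regression might look like (not necessarily the approach shown in the talk), the example below fits a model on synthetic data. The data, the seed, and the choice to drop Patsy's intercept with "- 1" are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from patsy import dmatrices
from sklearn.linear_model import LogisticRegression

# Synthetic data (an assumption for this sketch): y tends to be 1 when x1 > x2.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = (df["x1"] - df["x2"] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# "- 1" removes Patsy's intercept column, because scikit-learn's
# LogisticRegression fits its own intercept by default.
y, X = dmatrices("y ~ x1 + x2 - 1", data=df, return_type="dataframe")

# scikit-learn expects a 1-D target; Patsy returns a one-column frame of floats.
y_flat = y["y"].astype(int)

model = LogisticRegression()
model.fit(X, y_flat)

# Predicted probability of the positive class for each row.
probs = model.predict_proba(X)[:, 1]
print(dict(zip(X.columns, model.coef_[0])))
```

Keeping the intercept in one place avoids handing scikit-learn a redundant constant column; the alternative is to keep Patsy's intercept and pass fit_intercept=False.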


PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
