Andrew Rowan - Bayesian Deep Learning with Edward (and a trick using Dropout)





The interactive transcript could not be loaded.


Rating is available when the video has been rented.
This feature is not available right now. Please try again later.
Published on May 15, 2017

Filmed at PyData London 2017

Bayesian neural networks have seen a resurgence of interest as a way of generating model uncertainty estimates. I use Edward, a new probabilistic programming framework extending Python and TensorFlow, for inference on deep neural nets for several benchmark data sets. This is compared with dropout training, which has recently been shown to be formally equivalent to approximate Bayesian inference.

Deep learning methods represent the state-of-the-art for many applications such as speech recognition, computer vision and natural language processing. Conventional approaches generate point estimates of deep neural network weights and hence make predictions that can be overconfident since they do not account well for uncertainty in model parameters. However, having some means of quantifying the uncertainty of our predictions is often a critical requirement in fields such as medicine, engineering and finance. One natural response is to consider Bayesian methods, which offer a principled way of estimating predictive uncertainty while also showing robustness to overfitting.

Bayesian neural networks have a long history. Exact Bayesian inference on network weights is generally intractable and much work in the 1990s focused on variational and Monte Carlo based approximations [1-3]. However, these suffered from a lack of scalability for modern applications. Recently the field has seen a resurgence of interest, with the aim of constructing practical, scalable techniques for approximate Bayesian inference on more complex models, deep architectures and larger data sets [4-10].

Edward is a new, Turing-complete probabilistic programming language built on Python [11]. Probabilistic programming frameworks typically face a trade-off between the range of models that can be expressed and the efficiency of inference engines. Edward can leverage graph frameworks such as TensorFlow to enable fast distributed training, parallelism, vectorisation, and GPU support, while also allowing composition of both models and inference methods for a greater degree of flexibility.

In this talk I will give a brief overview of developments in Bayesian deep learning and demonstrate results of Bayesian inference on deep architectures implemented in Edward for a range of publicly available data sets. Dropout is an empirical technique which has been very successfully applied to reduce overfitting in deep learning models [12]. Recent work by Gal and Ghahramani [13] has demonstrated a surprising formal equivalence between dropout and approximate Bayesian inference in neural networks. I will compare some results of inference via the machinery of Edward with model averaging over neural nets with dropout training.


PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

We aim to be an accessible, community-driven conference, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

Comments are turned off
When autoplay is enabled, a suggested video will automatically play next.

Up next

to add this to Watch Later

Add to

Loading playlists...