'How neural networks learn' - Part III: The learning dynamics behind generalization and overfitting





Published on Mar 10, 2019

In this third episode on "How neural nets learn" I dive into a bunch of academic research that tries to explain why neural networks generalize as well as they do. We first look at the remarkable capability of DNNs to simply memorize huge amounts of (random) data. We then see how this picture becomes more subtle when training on real data, and finally dive into some beautiful analysis from the viewpoint of information theory.

Main papers discussed in this video:
First paper on Memorization in DNNs: https://arxiv.org/abs/1611.03530
A closer look at memorization in Deep Networks: https://arxiv.org/abs/1706.05394
Opening the Black Box of Deep Neural Networks via Information: https://arxiv.org/abs/1703.00810

Other links:
Quanta Magazine blogpost on Tishby's work: https://www.quantamagazine.org/new-th...
Tishby's lecture at Stanford: https://youtu.be/XL07WEc2TRI
Amazing lecture by Ilya Sutskever at MIT: https://youtu.be/9EN_HoEk3KY

