Hi, my name is Agustinus Kristiadi, and I'm going to present our paper titled "Learnable Uncertainty under Laplace Approximations". This is joint work with Matthias Hein and Philipp Hennig, and we are from the University of Tübingen, Germany.

We know that Bayesian neural networks promise calibrated predictive uncertainty, and Laplace approximations provide a cheap way to construct Bayesian neural networks. But, perhaps due to their approximate nature, in practice their uncertainty is not necessarily calibrated. In this talk, we want to answer whether we can improve the uncertainty of Laplace approximations in a post-hoc, and thus cheap, manner.

The standard way we train a deep neural network is by finding just a single point estimate in the parameter space. This point estimate induces predictions which, while they might be accurate, are overconfident because they carry no notion of uncertainty. A Laplace approximation can be used to mitigate this issue: it simply surrounds the MAP estimate with a Gaussian that depends on the loss-landscape geometry at that point. By marginalization, we arrive at a prediction that is as accurate as the MAP prediction but has an accompanying error bar, or uncertainty. However, far from the data we can still get narrow error bars, meaning that Laplace approximations can still be overconfident and thus miscalibrated. Our work fixes this issue. We do so by augmenting the loss landscape such that we can find another point which, under a Laplace approximation, induces the same prediction but with better uncertainty characteristics.

To do so, we propose a method called LULA. The premise of LULA is very simple. Suppose we have a pre-trained network with its MAP-estimated parameters. We simply add additional hidden units at each layer of this network. Because we have changed the architecture of the network, we have to adjust the parameters, and we do so as follows: we form block matrices and block vectors whose block elements are either the MAP-estimated values of the weights, constant zero weights, or free parameters. All those free parameters are the LULA parameters, that is, the parameters associated with the additional hidden units, which we can set freely. (A minimal code sketch of this construction is given below.)

This construction might be very simple, but it has interesting properties. The first property is that the LULA construction does not change the output of the network, and thus the prediction. Secondly, under a particular Bayesian treatment of the network, we can show that the output variance does change. And lastly, the gradient of the loss with respect to the additional weights in the last layer of the augmented network is non-linear in the LULA parameters. The first and second properties imply that LULA units are indeed uncertainty units, because they only affect the variance and not the prediction itself. The construction of LULA, along with the third property, implies that LULA units are learnable, because we have free parameters that non-trivially affect the loss landscape. And remember that Laplace approximations exploit the loss-landscape geometry; therefore, LULA is particularly suitable for manipulating Laplace approximations.

We can train LULA as follows. First, we form a Laplace approximation over the augmented network. This approximate posterior then induces a predictive distribution, and we want to find values of the LULA parameters that induce a good predictive distribution in the following sense: we use an uncertainty-aware loss, which simply says that we want to be more uncertain on outliers while remaining confident on the true data. All of this can be done post hoc, meaning that it amounts to a post-processing step after the MAP training, and thus this procedure is cheap. (A sketch of this training step is also given below.)
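To make the block construction concrete, here is a minimal sketch for a network with a single hidden layer, written in PyTorch. Everything in it is made up for illustration: the layer sizes, the number of added units, and the helper names `augmented_params` and `forward` are not from the paper. It demonstrates the first property: because the added units connect to the output with constant zero weights, the augmented network's prediction equals the original MAP prediction for any value of the LULA parameters.

```python
import torch

torch.manual_seed(0)

# A hypothetical pre-trained MLP (MAP estimate): 2 -> 16 -> 1.
d_in, d_hid, d_out, m = 2, 16, 1, 8          # m = number of added LULA units
W1, b1 = torch.randn(d_hid, d_in), torch.randn(d_hid)
W2, b2 = torch.randn(d_out, d_hid), torch.randn(d_out)

# Free LULA parameters for the m additional hidden units.
W1_hat = torch.randn(m, d_in, requires_grad=True)
b1_hat = torch.randn(m, requires_grad=True)

def augmented_params():
    # Block construction: MAP weights on top, LULA weights stacked below;
    # the added units reach the output only through constant zero weights,
    # so the prediction cannot change.
    W1_aug = torch.cat([W1, W1_hat], dim=0)               # (d_hid + m) x d_in
    b1_aug = torch.cat([b1, b1_hat])
    W2_aug = torch.cat([W2, torch.zeros(d_out, m)], dim=1)
    return W1_aug, b1_aug, W2_aug, b2

def forward(x, W1_, b1_, W2_, b2_):
    return torch.relu(x @ W1_.T + b1_) @ W2_.T + b2_

x = torch.randn(5, d_in)
W1_aug, b1_aug, W2_aug, b2_aug = augmented_params()
# Property (i): the augmented network reproduces the MAP prediction exactly.
assert torch.allclose(forward(x, W1, b1, W2, b2),
                      forward(x, W1_aug, b1_aug, W2_aug, b2_aug))
```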
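And here is a similarly hypothetical sketch of the post-hoc training step, continuing from the snippet above. To keep it short I use a simplified last-layer Laplace approximation with a Gaussian likelihood, so the posterior over the augmented last-layer weights is exactly Gaussian, and a variance-based stand-in for the uncertainty-aware loss; the noise level, prior precision, outlier data, and optimizer settings are all illustrative assumptions, not the paper's actual objective or setup. Note that only the LULA parameters are optimized; the MAP weights stay fixed, so the predictive mean is untouched while the predictive variance is reshaped.

```python
sigma2, prior_prec = 0.1, 1.0                 # assumed noise and prior precision
x_train = torch.randn(100, d_in)              # stand-in for the training data
x_out = 10.0 + torch.randn(100, d_in)         # stand-in for outlier data

def features(x):
    # Augmented hidden activations, which depend on the LULA parameters.
    W1_aug, b1_aug, _, _ = augmented_params()
    return torch.relu(x @ W1_aug.T + b1_aug)

def predictive_variance(x, Sigma):
    # Epistemic predictive variance diag(phi(x) @ Sigma @ phi(x)^T).
    phi = features(x)
    return ((phi @ Sigma) * phi).sum(dim=1)

opt = torch.optim.Adam([W1_hat, b1_hat], lr=1e-2)
for step in range(200):
    opt.zero_grad()
    Phi = features(x_train)
    # Exact Gaussian posterior covariance over the augmented last-layer weights.
    Sigma = torch.linalg.inv(Phi.T @ Phi / sigma2
                             + prior_prec * torch.eye(Phi.shape[1]))
    # Uncertainty-aware objective: stay confident on the training data,
    # become uncertain on the outliers.
    loss = (predictive_variance(x_train, Sigma).mean()
            - predictive_variance(x_out, Sigma).mean())
    loss.backward()
    opt.step()
```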
Now let's see some results. The first one is a toy classification dataset. Here, we can see that a Laplace approximation can be used to give a notion of uncertainty in the input space, but notice that far from the data we can still have highly confident predictions, and LULA fixes this. In more challenging scenarios, namely dataset-shift robustness tasks with the Rotated MNIST and Corrupted CIFAR-10 datasets, we can see that LULA (in green) improves the base Laplace approximation (in yellow) in all metrics considered. The same observation can also be made in out-of-distribution data detection. Here, we use two different kinds of Laplace approximations to show that LULA consistently improves Laplace approximations of various kinds, and indeed, LULA makes both Laplace approximations better in all tasks considered in general.

To conclude, we have introduced LULA, a very simple construction in which we add hidden units to a pre-trained network. We have shown that these units can serve as uncertainty units that do not affect the prediction, but can be trained in a post-hoc manner such that they induce good predictive uncertainty under a Laplace approximation. Empirically, we have seen that LULA is effective in various kinds of uncertainty quantification tasks. Thank you very much for your attention.