Tag Archives: Bayesian

WhitePaper: Scalable Bayesian Optimization Using Deep Neural Networks


Click to Download WhitePaper

Bayesian optimization has been demonstrated as an effective methodology for the global optimization of functions with expensive evaluations. Its strategy relies on querying a distribution over functions defined by a relatively cheap surrogate model. The ability to accurately model this distribution over functions is critical to the effectiveness of Bayesian optimization, and it is typically fit using Gaussian processes (GPs). However, since GPs scale cubically with the number of observations, it has been challenging to handle objectives whose optimization requires a large number of evaluations and, as such, to massively parallelize the optimization.
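To make the scaling issue concrete, here is a minimal sketch of a standard GP-based Bayesian optimization loop on a toy 1-D objective. It is not the paper's code; the objective, kernel hyperparameters, and the expected-improvement acquisition are illustrative choices. The matrix inverse inside the GP posterior is the O(n^3) step that motivates the paper.

```python
# Minimal sketch of GP-based Bayesian optimization on a toy 1-D objective.
# The kernel-matrix inverse is the O(n^3) bottleneck discussed above.
import numpy as np
from scipy.stats import norm

def objective(x):
    # Stand-in for an expensive black-box function.
    return np.sin(3 * x) + 0.1 * x ** 2

def rbf_kernel(A, B, lengthscale=0.3, variance=1.0):
    d = A[:, None] - B[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(X, y, X_star, noise=1e-6):
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    K_inv = np.linalg.inv(K)                       # O(n^3) in the number of observations
    K_s = rbf_kernel(X, X_star)
    mu = K_s.T @ K_inv @ y
    var = rbf_kernel(X_star, X_star).diagonal() - np.sum(K_s * (K_inv @ K_s), axis=0)
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, best_y):
    # Acquisition for minimization: expected improvement over the best value so far.
    sigma = np.sqrt(var)
    z = (best_y - mu) / sigma
    return (best_y - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=3)                     # a few initial evaluations
y = objective(X)
grid = np.linspace(-2, 2, 200)                     # candidate points

for _ in range(10):                                # sequential BO iterations
    mu, var = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, var, y.min()))]
    X = np.append(X, x_next)
    y = np.append(y, objective(x_next))

print("best value found:", y.min())
```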

In this work, we explore the use of neural networks as an alternative to Gaussian processes for modeling distributions over functions. We show that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of observations rather than cubically. This allows us to achieve a previously intractable degree of parallelism, which we use to rapidly search over large spaces of models. We achieve state-of-the-art results on benchmark object recognition tasks using convolutional neural networks, and on image caption generation using multimodal neural language models.
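The sketch below illustrates the adaptive basis function regression idea under simplifying assumptions: a network's last hidden layer supplies basis functions phi(x), and a Bayesian linear regression over those features yields the predictive distribution. The hidden-layer weights here are untrained placeholders (the paper learns them by first fitting the network to the observations), and the prior/noise values are arbitrary; only the scaling argument in the comments tracks the abstract.

```python
# Hedged sketch of adaptive basis function regression with a neural-net
# feature map. Weights W1, b1 are placeholders for a trained network.
import numpy as np

def phi(X, W1, b1):
    # Last-hidden-layer features of a one-hidden-layer tanh network.
    return np.tanh(X @ W1 + b1)

def bayes_linreg_posterior(Phi, y, alpha=1.0, beta=25.0):
    # Bayesian linear regression over D basis functions.
    # Cost: O(n D^2) for Phi.T @ Phi plus O(D^3) for the inverse,
    # i.e. linear in the number of observations n, unlike a GP's O(n^3).
    D = Phi.shape[1]
    S = np.linalg.inv(alpha * np.eye(D) + beta * Phi.T @ Phi)
    m = beta * S @ Phi.T @ y
    return m, S

def predict(X_star, W1, b1, m, S, beta=25.0):
    P = phi(X_star, W1, b1)
    mean = P @ m
    var = 1.0 / beta + np.sum((P @ S) * P, axis=1)
    return mean, var

rng = np.random.default_rng(0)
n, d_in, D = 500, 1, 50
X = rng.uniform(-2, 2, size=(n, d_in))
y = np.sin(3 * X[:, 0]) + 0.05 * rng.standard_normal(n)

W1 = rng.standard_normal((d_in, D))    # placeholder for trained weights
b1 = rng.standard_normal(D)

m, S = bayes_linreg_posterior(phi(X, W1, b1), y)
mean, var = predict(np.array([[0.5]]), W1, b1, m, S)
print("predictive mean and variance at x=0.5:", mean[0], var[0])
```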

WhitePaper: Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks


Click to Download WhitePaper

Large multilayer neural networks trained with backpropagation have recently achieved state-of-the-art results in a wide range of problems. However, using backprop for neural net learning still has some disadvantages, e.g., having to tune a large number of hyperparameters to the data, a lack of calibrated probabilistic predictions, and a tendency to overfit the training data. In principle, the Bayesian approach to learning neural networks does not have these problems. However, existing Bayesian techniques lack scalability to large datasets and network sizes. In this work we present a novel scalable method for learning Bayesian neural networks, called probabilistic backpropagation (PBP). Similar to classical backpropagation, PBP works by computing a forward propagation of probabilities through the network and then doing a backward computation of gradients. A series of experiments on ten real-world datasets shows that PBP is significantly faster than other techniques, while offering competitive predictive abilities. Our experiments also show that PBP provides accurate estimates of the posterior variance on the network weights.
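As a rough illustration of the "forward propagation of probabilities" idea, the sketch below pushes Gaussian moments (means and variances) of the activations through one layer whose weights are themselves Gaussian, using closed-form ReLU moments. It omits the assumed density filtering update that PBP uses to refine the weight posteriors, along with the paper's normalization details; all numerical values are illustrative.

```python
# Hedged sketch: propagate activation means/variances through a linear
# layer with Gaussian weights, then through a ReLU via moment matching.
import numpy as np
from scipy.stats import norm

def linear_moments(x_mean, x_var, w_mean, w_var):
    # Mean and variance of W x for independent Gaussian weights and inputs.
    m = w_mean @ x_mean
    v = w_var @ (x_mean ** 2) + (w_mean ** 2) @ x_var + w_var @ x_var
    return m, v

def relu_moments(m, v):
    # Closed-form mean and variance of max(0, z) for z ~ N(m, v).
    s = np.sqrt(v)
    alpha = m / s
    mean = m * norm.cdf(alpha) + s * norm.pdf(alpha)
    second = (m ** 2 + v) * norm.cdf(alpha) + m * s * norm.pdf(alpha)
    return mean, np.maximum(second - mean ** 2, 1e-12)

rng = np.random.default_rng(0)
d_in, d_hidden = 4, 8
x = rng.standard_normal(d_in)

# Illustrative Gaussian posterior approximation over the layer weights.
w_mean = rng.standard_normal((d_hidden, d_in)) / np.sqrt(d_in)
w_var = np.full((d_hidden, d_in), 0.05)

m, v = linear_moments(x, np.zeros(d_in), w_mean, w_var)  # observed input: zero variance
h_mean, h_var = relu_moments(m, v)
print("hidden-unit means:", h_mean)
print("hidden-unit variances:", h_var)
```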