Tag Archives: Deep Neural Networks

Mathematical Intuition for Performance of Rectified Linear Unit in Deep Neural Networks


via Mathematical Intuition for Performance of Rectified Linear Unit in Deep Neural Networks | Alexandre Dalyac – Academia.edu.

Everyone thought it was great to use differentiable, symmetric, non-linear activation functions in feed-forward neural networks, until Alex Krizhevsky [8] found that Rectified Linear Units, despite being not entirely differentiable, not symmetric, and, most of all, piecewise linear, were computationally cheaper and worth the trade-off against their more sophisticated counterparts. Here are just a few thoughts on the properties of these activation functions, a potential explanation for why using ReLUs speeds up training, and possible ways of applying these insights to better learning strategies.
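
As a rough, self-contained illustration (my own sketch, not from the linked write-up), the snippet below contrasts the gradients of tanh and ReLU on a few arbitrary inputs: the tanh gradient saturates toward zero for large |x|, while the ReLU gradient stays at exactly 1 on the active side and costs only a threshold to evaluate.

```python
import numpy as np

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2: shrinks toward 0 as |x| grows (saturation),
    # which slows gradient-based learning in deep networks.
    return 1.0 - np.tanh(x) ** 2

def relu(x):
    # Piecewise linear: just a threshold, no exponentials to evaluate.
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient is exactly 1 for positive inputs, 0 otherwise: no saturation
    # on the active side. (Not differentiable at x == 0; in practice a
    # subgradient of 0 or 1 is used there.)
    return (x > 0).astype(x.dtype)

x = np.array([-4.0, -1.0, 0.5, 4.0])
print("tanh grad:", tanh_grad(x))  # tiny values at |x| = 4 (saturated)
print("ReLU grad:", relu_grad(x))  # 0 or 1, constant wherever the unit is active
```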

WhitePaper: Learning Activation Functions to Improve Deep Neural Networks


Click to Download WhitePaper

Artificial neural networks typically have a fixed, non-linear activation function at each neuron. We have designed a novel form of piecewise linear activation function that is learned independently for each neuron using gradient descent. With this adaptive activation function, we are able to improve upon deep neural network architectures composed of static rectified linear units, achieving state-of-the-art performance on CIFAR-10 (7.51% error), CIFAR-100 (30.83% error), and a benchmark from high-energy physics involving Higgs boson decay modes.
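
As one possible reading of the abstract, here is a minimal PyTorch sketch of a per-neuron learnable piecewise-linear activation of the form h(x) = max(0, x) + Σ_s a_s · max(0, −x + b_s), with a_s and b_s trained by gradient descent alongside the weights. The module name, the number of hinges, and the zero initialization (which makes each unit start out as a plain ReLU) are my own choices for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn

class AdaptivePiecewiseLinear(nn.Module):
    """Per-neuron piecewise-linear activation with learnable hinge parameters.

    Computes h(x) = max(0, x) + sum_s a_s * max(0, -x + b_s), where a_s and b_s
    are learned independently for each of `num_units` neurons.
    """

    def __init__(self, num_units: int, num_hinges: int = 2):
        super().__init__()
        # One (a, b) pair per hinge per neuron; both are ordinary parameters,
        # so they are updated by the same optimizer step as the layer weights.
        # Zero init means the activation is initially an ordinary ReLU.
        self.a = nn.Parameter(torch.zeros(num_hinges, num_units))
        self.b = nn.Parameter(torch.zeros(num_hinges, num_units))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_units)
        out = torch.relu(x)
        # Broadcast hinges over the batch: (num_hinges, batch, num_units)
        hinges = self.a.unsqueeze(1) * torch.relu(-x.unsqueeze(0) + self.b.unsqueeze(1))
        return out + hinges.sum(dim=0)

# Usage: drop it in where a fixed ReLU would normally go.
layer = nn.Sequential(nn.Linear(784, 256), AdaptivePiecewiseLinear(256))
y = layer(torch.randn(32, 784))  # shape: (32, 256)
```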