WhitePaper: Predicting Alzheimer’s disease: a neuroimaging study with 3D convolutional neural networks

Click to Download WhitePaper

Pattern recognition methods using neuroimaging data for the diagnosis of Alzheimer’s disease have been the subject of extensive research in recent years. In this paper, we use deep learning methods, and in particular sparse autoencoders and 3D convolutional neural networks, to build an algorithm that can predict the disease status of a patient, based on an MRI scan of the brain. We report on experiments using the ADNI data set involving 2,265 historical scans. We demonstrate that 3D convolutional neural networks outperform several other classifiers reported in the literature and produce state-of-art results.

Convolutional Neural Networks

via Convolutional Neural Networks – Andrew Gibiansky.

In the previous post, we figured out how to do forward and backward propagation to compute the gradient for fully-connected neural networks, and used those algorithms to derive the Hessian-vector product algorithm for a fully connected neural network.

Next, let’s figure out how to do the exact same thing for convolutional neural networks. While the mathematical theory should be exactly the same, the actual derivation will be slightly more complex due to the architecture of convolutional neural networks.


Click to Download WhitePaper

We propose a simple two-step approach for speeding up convolution layers within large convolutional neural networks based on tensor decomposition and discriminative fine-tuning. Given a layer, we use non-linear least squares to compute a low-rank CP-decomposition of the 4D convolution kernel tensor into a sum of a small number of rank-one tensors. At the second step, this decomposition is used to replace the original convolutional layer with a sequence of four convolutional layers with small kernels. After such replacement, the entire network is fine-tuned on the training data using standard backpropagation process.
We evaluate this approach on two CNNs and show that it yields larger CPU speedups at the cost of lower accuracy drops compared to previous approaches. For the 36-class character classification CNN, our approach obtains a 8.5x CPU speedup of the whole network with only minor accuracy drop (1% from 91% to 90%). For the standard ImageNet architecture (AlexNet), the approach speeds up the second convolution layer by a factor of 4x at the cost of 1% increase of the overall top-5 classification error.