Deploying memory-hungry deep learning algorithms is a challenge for anyone who wants to create a scalable service. Cloud services are expensive in the long run.
Deploying models offline on edge devices is cheaper and has other benefits as well. The drawback is that edge devices have limited memory and compute power.
This blog explores a few techniques that can be used to fit neural networks in memory-constrained settings.
Continue reading “How to Fit Large Neural Networks on the Edge”