Mocha.jl is a deep learning framework for Julia, inspired by the C++ Caffe framework. It offers efficient implementations of gradient descent solvers and common neural network layers, supports optional unsupervised pre-training, and allows switching to a GPU backend for accelerated performance. Mocha has a clean architecture with isolated components such as network layers, activation functions, solvers, regularizers, and initializers.

Mocha.jl was developed in the relatively early days of Julia. Both Julia and its ecosystem have since evolved significantly, and with exciting new technology such as writing GPU kernels directly in Julia and general support for automatic differentiation, the Mocha codebase has become dated. Reworking Mocha around these new technologies would require non-trivial effort, and capable new solutions already exist, so it is a good time to retire Mocha.jl.
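The component-based design shows up in how a model is assembled: data layers, computation layers, the loss, the backend, and the solver are each declared in isolation and wired together by blob names. The sketch below is adapted from the style of Mocha's documented MNIST example and is illustrative rather than authoritative: constructor and keyword names (e.g. `SolverParameters`, `MomPolicy.Fixed`, `LRPolicy.Inv`) varied across Mocha releases, and `train-data-list.txt` is a hypothetical HDF5 file list.

```julia
using Mocha

# Each component is declared independently; layers are wired together
# by symbolic blob names via the tops/bottoms keyword arguments.
data = HDF5DataLayer(name="train-data", source="train-data-list.txt",
                     batch_size=64, tops=[:data, :label])
ip1  = InnerProductLayer(name="ip1", output_dim=500, neuron=Neurons.ReLU(),
                         bottoms=[:data], tops=[:ip1])
ip2  = InnerProductLayer(name="ip2", output_dim=10,
                         bottoms=[:ip1], tops=[:ip2])
loss = SoftmaxLossLayer(name="loss", bottoms=[:ip2, :label])

backend = CPUBackend()
init(backend)

net = Net("MNIST-train", backend, [data, ip1, ip2, loss])

# Solver configuration; keyword names differed between Mocha versions.
params = SolverParameters(max_iter=10000, regu_coef=0.0005,
                          mom_policy=MomPolicy.Fixed(0.9),
                          lr_policy=LRPolicy.Inv(0.01, 0.0001, 0.75))
solver = SGD(params)
solve(solver, net)

shutdown(backend)
```

Because the network is just a list of independently constructed layer objects, swapping an activation function, regularizer, or solver means changing one declaration without touching the rest of the script.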
Features
- Deep learning framework written in Julia, inspired by Caffe
- Efficient implementations of stochastic gradient descent solvers and standard neural network layers
- Native Julia implementation with simple installation and minimal dependencies
- Optional GPU backend leveraging NVIDIA libraries (cuBLAS, cuDNN)
- Switchable between CPU and GPU modes via minimal code changes (see the sketch after this list)
- Supports unsupervised pre-training architectures like stacked auto-encoders
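Switching between CPU and GPU modes touches only the backend selection; the layer and solver definitions stay the same. A minimal sketch follows, assuming a machine with CUDA, cuBLAS, and cuDNN available for the GPU path; per Mocha's documentation, the CUDA extension is enabled by setting the `MOCHA_USE_CUDA` environment variable before loading the package.

```julia
# Flip this flag to move the whole script from CPU to GPU.
use_gpu = false

if use_gpu
    # Must be set before `using Mocha` so the CUDA extension is loaded.
    ENV["MOCHA_USE_CUDA"] = "true"
end
using Mocha

backend = use_gpu ? GPUBackend() : CPUBackend()
init(backend)

# ... define layers, build the Net against `backend`, run the solver ...

shutdown(backend)
```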