How to make your NDIF experiment 130x faster
A user recently reached out to me asking how they could make their nnsight code faster with NDIF to meet a project deadline. After looking at their code, I introduced a number of improvements that leverage nnsight features and remote execution principles. The result was a 130x speedup.
The experience taught me many useful lessons, and I want to share the key principles with you so that you too can implement your experiments optimally for remote execution.
TL;DR
- If you're doing more than one forward pass, wrap them in a model.session
- Downloading large tensors can be costly, only .save() what you need
- Cache all your activations in one go
- Reduce loops with Batching Invokes
- .skip what you can