I am trying to estimate the VRAM needed for a fully connected model without having to build/train the model in PyTorch.
I got pretty close with this formula:
# params = number of parameters
# 24 = bytes per parameter (found empirically)
# 1 MiB = 1048576 bytes
estimate = params * 24 / 1048576  # estimated VRAM in MiB
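For concreteness, here is a minimal sketch of how I apply the formula to the example model below (the function name is just illustrative):

def estimate_vram_mib(params: int, bytes_per_param: int = 24) -> float:
    """Rough VRAM estimate in MiB (1 MiB = 1048576 bytes)."""
    return params * bytes_per_param / 1048576

print(estimate_vram_mib(384048000))  # ~8790.16 MiB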
The example model below has 384048000 parameters, but I have tested the formula on models of various sizes.
The results are pretty accurate. However, the estimate only accounts for the PyTorch session's VRAM, not the extra driver/CUDA buffer VRAM. Here are the estimated results (from the formula) versus the empirical results (measured with nvidia-smi after building and training the model):
ESTIMATE BEFORE EMPIRICAL TEST:
VRAM estimate = 8790.1611328125 MiB
EMPIRICAL RESULT AFTER BUILDING MODEL:
GPU RAM for PyTorch session only (cutorch.max_memory_reserved(0)/1048576): 8466.0 MiB
GPU RAM including extra driver buffer, from nvidia-smi: 9719 MiB
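In case it matters, this is roughly how I take the two measurements (a minimal sketch; I believe these nvidia-smi query flags are correct, but treat them as an assumption):

import subprocess
import torch.cuda as cutorch

# Peak VRAM reserved by PyTorch's caching allocator on GPU 0, in MiB
pytorch_mib = cutorch.max_memory_reserved(0) / 1048576

# Total VRAM in use on GPU 0 as reported by the driver (includes the CUDA
# context and driver buffers that the allocator number does not cover)
smi_out = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=memory.used",
     "--format=csv,noheader,nounits", "-i", "0"]
)
driver_mib = float(smi_out.decode().strip())

print(pytorch_mib, driver_mib)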
Any ideas on how to estimate the extra VRAM shown in the nvidia-smi output?