RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED occurs when cuDNN fails to initialize. The initialization can fail in several cases, such as insufficient free GPU memory for cuDNN or version incompatibility between packages like cuDNN, PyTorch, etc. In this article, we will explore these cases in detail, with solutions.
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED (Solution) –
Now that we have covered the root causes of this RuntimeError, let us look at the solutions. I recommend trying them in the order given to save time and effort.
1. Solution 1: Use torch.cuda.empty_cache() –
This error mainly occurs because of GPU memory issues: too little free memory is left for cuDNN. The first thing we can do is clear the cached memory manually using the function below.
torch.cuda.empty_cache()
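It releases the unused memory that PyTorch's caching allocator is holding, so that cuDNN and other GPU applications can use it. Memory occupied by tensors that are still in use is not freed.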
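To see whether clearing the cache actually returns memory, you can compare the reserved memory before and after the call. The following is a minimal sketch, not a definitive recipe: the helper name free_cached_gpu_memory is hypothetical, and the code assumes that checking the current default CUDA device is enough for your setup.

```python
import torch


def free_cached_gpu_memory():
    """Clear PyTorch's CUDA cache.

    Returns (reserved_before, reserved_after) in bytes,
    or None when no CUDA device is visible.
    """
    if not torch.cuda.is_available():
        return None
    before = torch.cuda.memory_reserved()  # bytes held by the caching allocator
    torch.cuda.empty_cache()               # hand unused cached blocks back to the driver
    after = torch.cuda.memory_reserved()
    return before, after


result = free_cached_gpu_memory()
if result is None:
    print("No CUDA device visible; nothing to clear.")
else:
    before, after = result
    print(f"reserved before: {before / 2**20:.1f} MiB, after: {after / 2**20:.1f} MiB")
```

If the two numbers are almost equal, the memory is held by live tensors rather than by the cache, and you will have to delete those tensors or reduce the batch size instead.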
2. Solution 2: Force early cuDNN initialization –
Generally, PyTorch can consume most of the GPU memory before cuDNN is initialized, leaving too little for cuDNN. To avoid this, you can force cuDNN to initialize at the start of the code (early stage) by running a dummy convolution on the GPU.
import torch

def force_cudnn_initialization():
    # Run a dummy convolution on the GPU so that cuDNN initializes right away.
    s = 32
    dev = torch.device('cuda')
    torch.nn.functional.conv2d(torch.zeros(s, s, s, s, device=dev), torch.zeros(s, s, s, s, device=dev))
Adjust the function to fit your use case and code if required, and call it once before the rest of your GPU code runs.
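One possible adaptation is sketched below. It skips the warm-up on machines without a GPU and reports a failure instead of crashing. The name warm_up_cudnn and the tensor sizes are hypothetical choices, and the sketch assumes that a single small convolution is enough to trigger cuDNN initialization on your setup.

```python
import torch
import torch.nn.functional as F


def warm_up_cudnn(size=8):
    """Run one small convolution so cuDNN initializes now; return True on success."""
    if not (torch.cuda.is_available() and torch.backends.cudnn.is_available()):
        return False  # CPU-only machine or build: nothing to warm up
    dev = torch.device("cuda")
    try:
        x = torch.zeros(1, size, size, size, device=dev)  # (batch, channels, H, W)
        w = torch.zeros(size, size, 3, 3, device=dev)     # (out_ch, in_ch, kH, kW)
        F.conv2d(x, w)
        torch.cuda.synchronize()  # wait for the kernel so errors surface here
        return True
    except RuntimeError as err:
        print(f"cuDNN warm-up failed: {err}")
        return False


print("cuDNN ready:", warm_up_cudnn())
```

Call it as the first GPU operation of the script, before the model and the data are moved to the device.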
3. Solution 3: Upgrade cuDNN and PyTorch –
Finally, verify that cuDNN and PyTorch are compatible. Reinstalling PyTorch together with its companion packages, such as torchvision and torchaudio, usually resolves a mismatch. Installing the latest versions of these modules generally works, but to be on the safe side you can use the versions below. Run the following command to reinstall.
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
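After reinstalling, it helps to confirm which versions the interpreter actually picked up. The snippet below is a small sketch for that purpose; the helper name describe_torch_build is hypothetical, and it only reports what the installed PyTorch build exposes without judging whether the combination is supported.

```python
import torch


def describe_torch_build():
    """Collect the version details relevant to cuDNN initialization."""
    return {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,                  # None on CPU-only builds
        "cudnn_available": torch.backends.cudnn.is_available(),
        "cudnn_version": torch.backends.cudnn.version(),   # None if cuDNN is missing
        "gpu_visible": torch.cuda.is_available(),
    }


for key, value in describe_torch_build().items():
    print(f"{key}: {value}")
```

If cuda_build or cudnn_version prints None, the installed wheel is a CPU-only build and the GPU code path cannot initialize cuDNN at all.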
Thanks
Data Science Learner Team