File "/usr/lib/python3/dist-packages/triton_pre_mlir/compiler.py", line 901, in _compile
name, asm, shared_mem = _triton.code_gen.compile_ttir(backend, module, device, num_warps, num_stages, extern_libs, cc)
RuntimeError: Internal Triton PTX codegen error:
ptxas fatal : Value 'sm_90' is not defined for option 'gpu-name'
Hello @jwatte,
For H100 instances, you will need to install CUDA 11.8 or newer; the ptxas shipped with earlier toolkits does not recognize the sm_90 target, which is exactly what the error above is reporting.
I would suggest using the newest container found here:
NVIDIA NGC Catalog
You can also set up a Conda environment with PyTorch 2.0 + CUDA 11.8. I can send the instructions if you are interested.
Let me know if this works for you.
Hi @JosephM, can you please share the instructions for setting up a Conda env with PyTorch 2.0 + CUDA 11.8 on H100 when you get a chance? Thanks. I tried setting up a conda env on H100 but got the following message:
NVIDIA H100 PCIe with CUDA capability sm_90 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70 sm_75 sm_80 sm_86.
I checked the PyTorch version, which is 2.0.1. Thanks
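For what it's worth, that warning boils down to a membership check: the wheel ships kernels for a fixed list of architectures, and the device's compute capability must appear in it. A minimal pure-Python sketch of that logic (function name is mine, not PyTorch's):

```python
def supports_device(compiled_archs, device_arch):
    """Return True if a wheel compiled for `compiled_archs` can run on
    a device reporting `device_arch` (e.g. "sm_90").

    Simplification: real PyTorch can also JIT forward from embedded PTX,
    but a wheel topping out at sm_86 still cannot target sm_90.
    """
    return device_arch in compiled_archs

# The architecture list from the warning message:
wheel_archs = ["sm_37", "sm_50", "sm_60", "sm_70", "sm_75", "sm_80", "sm_86"]

print(supports_device(wheel_archs, "sm_80"))  # True
print(supports_device(wheel_archs, "sm_90"))  # False: H100 needs a cu118 build
```

So the fix is not the driver or the toolkit alone: the torch wheel itself must be one built with sm_90 support, i.e. a cu118 build.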
Hello @artemis, thanks for confirming that CUDA 11.8 worked.
Here are the steps to set up a Conda environment with PyTorch 2.0 + CUDA 11.8:
Install Miniconda. You can run it without superuser privileges and keep it local. See the link here on how to download and configure it: Miniconda — conda documentation.
For my example I will set up Python 3.9, so here's what I downloaded and ran. Just follow the prompts:
Python 3.9
wget https://repo.anaconda.com/miniconda/Miniconda3-py39_23.3.1-0-Linux-x86_64.sh
chmod u+x Miniconda3-py39_23.3.1-0-Linux-x86_64.sh
bash Miniconda3-py39_23.3.1-0-Linux-x86_64.sh
Create your environment. I created a Python 3.9 environment named 'test'. Make sure you activate your environment:
conda create -n test python=3.9
conda activate test
Install the required PyTorch packages. Note that we call the test environment's own 'pip' so that the packages are installed only within the environment we are setting up:
/home/ubuntu/miniconda3/envs/test/bin/pip install torch==2.0.0+cu118 torchaudio==2.0.0+cu118 torchvision==0.15.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
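If you would rather not hard-code the env path, running "python -m pip install ..." inside the activated env uses the same per-environment pip; you can confirm which interpreter (and therefore which site-packages) is active first. A quick check (the example path is illustrative):

```python
import sys

# Inside an activated conda env this points at that env's interpreter,
# e.g. /home/ubuntu/miniconda3/envs/test/bin/python (path is illustrative).
print(sys.executable)

# `python -m pip install ...` installs into this interpreter's
# site-packages, equivalent to calling the env's pip binary directly.
```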
Install the other CUDA packages:
conda install cudnn
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
conda install -c "nvidia/label/cuda-11.8.0" cuda-nvcc
conda install -c "nvidia/label/cuda-11.8.0" cuda-runtime
After that, test the PyTorch installation:
python -c 'import torch ; print("PyTorch Version: ",torch.__version__) ; print("Is available: ", torch.cuda.is_available()) ; print("Current Device: ", torch.cuda.current_device()) ; print("Pytorch CUDA Compiled version: ", torch._C._cuda_getCompiledVersion()) ; print("Pytorch version: ", torch.version) ; print("pytorch file: ", torch.__file__) ; print("Number of GPUs: ",torch.cuda.device_count())'
Here’s what mine looks like:
(test) ubuntu@209-20-158-254:~$ python -c 'import torch ; print("PyTorch Version: ",torch.__version__) ; print("Is available: ", torch.cuda.is_available()) ; print("Current Device: ", torch.cuda.current_device()) ; print("Pytorch CUDA Compiled version: ", torch._C._cuda_getCompiledVersion()) ; print("Pytorch version: ", torch.version) ; print("pytorch file: ", torch.__file__) ; print("Number of GPUs: ",torch.cuda.device_count())'
PyTorch Version: 2.0.0+cu118
Is available: True
Current Device: 0
Pytorch CUDA Compiled version: 11080
Pytorch version: <module 'torch.version' from '/home/ubuntu/miniconda3/envs/test/lib/python3.9/site-packages/torch/version.py'>
pytorch file: /home/ubuntu/miniconda3/envs/test/lib/python3.9/site-packages/torch/__init__.py
Number of GPUs: 1
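The field to watch in the output above is the "+cu118" local-version suffix on the torch version string: without it you have a default-index wheel, which lacks sm_90 support. A tiny stdlib-only check (helper name is mine):

```python
def cuda_tag(version):
    """Extract the CUDA build tag from a torch version string.

    "2.0.0+cu118" -> "cu118"; a plain "2.0.1" (default-index wheel) -> None.
    """
    _, sep, local = version.partition("+")
    return local if sep and local.startswith("cu") else None

print(cuda_tag("2.0.0+cu118"))  # cu118
print(cuda_tag("2.0.1"))        # None -> not a CUDA 11.8 build
```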
Hi, I am running into the same issue, shown below. I am using a conda env:
NVIDIA H100 PCIe with CUDA capability sm_90 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70 sm_75 sm_80 sm_86.
CUDA version:
nvcc: NVIDIA (R) Cuda compiler driver
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
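The nvcc banner above confirms the toolkit side is already on 11.8, so the toolkit is not the problem here. If you want to check that programmatically, a stdlib sketch (sample copied from the output above):

```python
import re

NVCC_OUTPUT = """\
nvcc: NVIDIA (R) Cuda compiler driver
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
"""

def nvcc_release(output):
    """Pull the 'release X.Y' version out of `nvcc --version` output."""
    m = re.search(r"release (\d+\.\d+)", output)
    return m.group(1) if m else None

print(nvcc_release(NVCC_OUTPUT))  # 11.8
```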
Packages installed:
# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
accelerate 0.23.0 pypi_0 pypi
bitsandbytes 0.41.1 pypi_0 pypi
bzip2 1.0.8 h7b6447c_0
ca-certificates 2023.08.22 h06a4308_0
certifi 2023.7.22 pypi_0 pypi
charset-normalizer 3.2.0 pypi_0 pypi
cmake 3.27.5 pypi_0 pypi
cudatoolkit 11.8.0 h6a678d5_0
filelock 3.12.4 pypi_0 pypi
fsspec 2023.9.1 pypi_0 pypi
huggingface-hub 0.17.2 pypi_0 pypi
idna 3.4 pypi_0 pypi
jinja2 3.1.2 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.4 h6a678d5_0
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libstdcxx-ng 11.2.0 h1234567_1
libuuid 1.41.5 h5eee18b_0
lit 16.0.6 pypi_0 pypi
markupsafe 2.1.3 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
ncurses 6.4 h6a678d5_0
networkx 3.1 pypi_0 pypi
numpy 1.26.0 pypi_0 pypi
nvidia-cublas-cu11 11.10.3.66 pypi_0 pypi
nvidia-cuda-cupti-cu11 11.7.101 pypi_0 pypi
nvidia-cuda-nvrtc-cu11 11.7.99 pypi_0 pypi
nvidia-cuda-runtime-cu11 11.7.99 pypi_0 pypi
nvidia-cudnn-cu11 8.5.0.96 pypi_0 pypi
nvidia-cufft-cu11 10.9.0.58 pypi_0 pypi
nvidia-curand-cu11 10.2.10.91 pypi_0 pypi
nvidia-cusolver-cu11 11.4.0.1 pypi_0 pypi
nvidia-cusparse-cu11 11.7.4.91 pypi_0 pypi
nvidia-nccl-cu11 2.14.3 pypi_0 pypi
nvidia-nvtx-cu11 11.7.91 pypi_0 pypi
openssl 3.0.10 h7f8727e_2
packaging 23.1 pypi_0 pypi
pip 23.2.1 py310h06a4308_0
protobuf 4.24.3 pypi_0 pypi
psutil 5.9.5 pypi_0 pypi
python 3.10.13 h955ad1f_0
pyyaml 6.0.1 pypi_0 pypi
readline 8.2 h5eee18b_0
regex 2023.8.8 pypi_0 pypi
requests 2.31.0 pypi_0 pypi
safetensors 0.3.3 pypi_0 pypi
scipy 1.11.2 pypi_0 pypi
sentencepiece 0.1.99 pypi_0 pypi
setuptools 68.0.0 py310h06a4308_0
sqlite 3.41.2 h5eee18b_0
sympy 1.12 pypi_0 pypi
tk 8.6.12 h1ccaba5_0
tokenizers 0.13.3 pypi_0 pypi
torch 2.0.1 pypi_0 pypi
tqdm 4.66.1 pypi_0 pypi
transformers 4.33.2 pypi_0 pypi
triton 2.0.0 pypi_0 pypi
typing-extensions 4.8.0 pypi_0 pypi
tzdata 2023c h04d1e81_0
urllib3 2.0.4 pypi_0 pypi
wheel 0.38.4 py310h06a4308_0
xz 5.4.2 h5eee18b_0
zlib 1.2.13 h5eee18b_0
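One detail worth noting in the list above: the nvidia-cuda-runtime-cu11 dependency is at 11.7.99, i.e. a CUDA 11.7 runtime, which is what the default-index torch 2.0.1 wheel pulls in; that would explain the missing sm_90 support even though nvcc reports 11.8. A small stdlib sketch to pull that version out of "conda list" output (helper and sample are mine):

```python
import re

SAMPLE = """\
nvidia-cuda-runtime-cu11 11.7.99 pypi_0 pypi
torch 2.0.1 pypi_0 pypi
"""

def cuda_runtime_version(listing):
    """Return the major.minor of the nvidia-cuda-runtime-cu11 wheel,
    which tracks the CUDA toolkit the torch wheel was built against."""
    for line in listing.splitlines():
        m = re.match(r"nvidia-cuda-runtime-cu11\s+(\d+)\.(\d+)", line.strip())
        if m:
            return f"{m.group(1)}.{m.group(2)}"
    return None

print(cuda_runtime_version(SAMPLE))  # 11.7 -> reinstall from the cu118 index
```

If it reports 11.7, reinstalling torch with the cu118 extra index (as in the pip command earlier in this thread) should resolve the sm_90 warning.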