I have never been able to build PyTorch by following the official instructions: GitHub - pytorch/pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
For Ubuntu 18.04, here are the last few lines of the build failure (the CentOS 8 build failure log follows below).

Get:2 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu bionic/main amd64 python3.9-distutils all 3.9.11-1+bionic1 [190 kB]
Fetched 315 kB in 2s (194 kB/s)
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package python3.9-lib2to3.
(Reading database ... 14757 files and directories currently installed.)
Preparing to unpack .../python3.9-lib2to3_3.9.11-1+bionic1_all.deb ...
Unpacking python3.9-lib2to3 (3.9.11-1+bionic1) ...
Selecting previously unselected package python3.9-distutils.
Preparing to unpack .../python3.9-distutils_3.9.11-1+bionic1_all.deb ...
Unpacking python3.9-distutils (3.9.11-1+bionic1) ...
Setting up python3.9-lib2to3 (3.9.11-1+bionic1) ...
Setting up python3.9-distutils (3.9.11-1+bionic1) ...
root@ixt-hq-178:/pytorch# python3 setup.py install
Traceback (most recent call last):
  File "/pytorch/setup.py", line 219, in <module>
    from setuptools import setup, Extension, find_packages
  File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 14, in <module>
    from setuptools.dist import Distribution, Feature
  File "/usr/lib/python3/dist-packages/setuptools/dist.py", line 24, in <module>
    from setuptools.depends import Require
  File "/usr/lib/python3/dist-packages/setuptools/depends.py", line 7, in <module>
    from .py33compat import Bytecode
  File "/usr/lib/python3/dist-packages/setuptools/py33compat.py", line 54, in <module>
    unescape = getattr(html, 'unescape', html_parser.HTMLParser().unescape)
AttributeError: 'HTMLParser' object has no attribute 'unescape'
root@ixt-hq-178:/pytorch# sudo apt install python3-distutils
sudo: unable to resolve host ixt-hq-178
Reading package lists... Done
Building dependency tree
Reading state information... Done
python3-distutils is already the newest version (3.6.9-1~18.04).
python3-distutils set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
root@ixt-hq-178:/pytorch#

For CentOS 8.x:

[root@centos pytorch]# !123
python setup.py build --cmake-only
Traceback (most recent call last):
  File "setup.py", line 219, in <module>
    from setuptools import setup, Extension, find_packages
  File "/usr/local/lib/python3.8/site-packages/setuptools/__init__.py", line 18, in <module>
    from setuptools.dist import Distribution
  File "/usr/local/lib/python3.8/site-packages/setuptools/dist.py", line 34, in <module>
    from setuptools import windows_support
  File "/usr/local/lib/python3.8/site-packages/setuptools/windows_support.py", line 2, in <module>
    import ctypes
  File "/usr/local/lib/python3.8/ctypes/__init__.py", line 7, in <module>
    from _ctypes import Union, Structure, Array
ModuleNotFoundError: No module named '_ctypes'

No, it did not help. I already tried it and it fails.

/root/pt/Python-3.9.10/pytorch/torch/nn/functional.pyi.in -> /root/pt/Python-3.9.10/pytorch/torch/nn/functional.pyi.in skipped
/root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/timeit_template.cpp -> /root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/timeit_template.cpp skipped
/root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/compat_bindings.cpp -> /root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/compat_bindings.cpp skipped
/root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/timer_callgrind_template.cpp -> /root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/timer_callgrind_template.cpp skipped
/root/pt/Python-3.9.10/pytorch/torch/utils/data/datapipes/datapipe.pyi.in -> /root/pt/Python-3.9.10/pytorch/torch/utils/data/datapipes/datapipe.pyi.in skipped
Traceback (most recent call last):
  File "/root/pt/Python-3.9.10/pytorch/setup.py", line 424, in check_pydep
Building wheel torch-1.12.0a0+git23383b1
-- Building version 1.12.0a0+git23383b1
    importlib.import_module(importname)
  File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'yaml'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/pt/Python-3.9.10/pytorch/setup.py", line 906, in <module>
    build_deps()
  File "/root/pt/Python-3.9.10/pytorch/setup.py", line 370, in build_deps
    check_pydep('yaml', 'pyyaml')
  File "/root/pt/Python-3.9.10/pytorch/setup.py", line 426, in check_pydep
    raise RuntimeError(missing_pydep.format(importname=importname, module=module))
RuntimeError: Missing build dependency: Unable to `import yaml`.
Please install it via `conda install pyyaml` or `pip install pyyaml`

So did you try to install the missing package?

It is already installed. That is why I am asking here why the build requests packages that are already installed.

[root@slurm-0 /]# pip install pyyaml
Collecting pyyaml
Downloading PyYAML-6.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (661 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 661.8/661.8 KB 9.8 MB/s eta 0:00:00
Installing collected packages: pyyaml
Successfully installed pyyaml-6.0
WARNING: Running pip as the ‘root’ user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: 12. Virtual Environments and Packages — Python 3.12.0 documentation
[root@slurm-0 /]# pip3 install pyyaml
Requirement already satisfied: pyyaml in /usr/local/lib/python3.9/site-packages (6.0)
WARNING: Running pip as the ‘root’ user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: 12. Virtual Environments and Packages — Python 3.12.0 documentation
[root@slurm-0 /]#
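
One possible reason a build reports a module as missing while pip says it is installed is that `pip` and the `python` that runs setup.py belong to different interpreters. The following is a minimal sketch for checking this, assuming it is run with the same interpreter used for the build; the helper name `describe_module` is hypothetical and not part of PyTorch.

```python
import importlib.util
import sys


def describe_module(name):
    """Report whether `name` is importable by the interpreter running this script."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,  # the Python binary actually in use
        "module": name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),  # file the module would load from
    }


if __name__ == "__main__":
    # 'yaml' is the import name of the pyyaml package named in the build error.
    print(describe_module("yaml"))
```

If `found` is false here while `pip install pyyaml` reports the requirement as satisfied, comparing the interpreter path above with the output of `pip --version` may show that the two point at different Python installations.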

Actually, this time the build somehow moved forward. Now I am getting the following:

(last lines of build)

/root/pt/pytorch/torch/csrc/utils/variadic.cpp → /root/pt/pytorch/torch/csrc/utils/variadic.cpp skipped
/root/pt/pytorch/torch/csrc/utils/variadic.h → /root/pt/pytorch/torch/csrc/utils/variadic.h skipped
/root/pt/pytorch/torch/lib/libshm/alloc_info.h → /root/pt/pytorch/torch/lib/libshm/alloc_info.h skipped
/root/pt/pytorch/torch/lib/libshm/core.cpp → /root/pt/pytorch/torch/lib/libshm/core.cpp skipped
/root/pt/pytorch/torch/lib/libshm/err.h → /root/pt/pytorch/torch/lib/libshm/err.h skipped
/root/pt/pytorch/torch/lib/libshm/libshm.h → /root/pt/pytorch/torch/lib/libshm/libshm.h skipped
/root/pt/pytorch/torch/lib/libshm/manager.cpp → /root/pt/pytorch/torch/lib/libshm/manager.cpp skipped
/root/pt/pytorch/torch/lib/libshm/socket.h → /root/pt/pytorch/torch/lib/libshm/socket.h skipped
/root/pt/pytorch/torch/lib/libshm_windows/core.cpp → /root/pt/pytorch/torch/lib/libshm_windows/core.cpp skipped
/root/pt/pytorch/torch/lib/libshm_windows/libshm.h → /root/pt/pytorch/torch/lib/libshm_windows/libshm.h skipped
/root/pt/pytorch/torch/nn/functional.pyi.in → /root/pt/pytorch/torch/nn/functional.pyi.in skipped
/root/pt/pytorch/torch/utils/benchmark/utils/timeit_template.cpp → /root/pt/pytorch/torch/utils/benchmark/utils/timeit_template.cpp skipped
/root/pt/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/compat_bindings.cpp → /root/pt/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/compat_bindings.cpp skipped
/root/pt/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/timer_callgrind_template.cpp → /root/pt/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/timer_callgrind_template.cpp skipped
/root/pt/pytorch/torch/utils/data/datapipes/datapipe.pyi.in → /root/pt/pytorch/torch/utils/data/datapipes/datapipe.pyi.in skipped
gmake: Makefile: No such file or directory
gmake: *** No rule to make target ‘Makefile’. Stop.
Building wheel torch-1.12.0a0+gitebeea9e
-- Building version 1.12.0a0+gitebeea9e
cmake3 --build . --target install --config Release -- -j 56

This is when building for an AMD GPU:
PYTORCH_ROCM_ARCH=gfx90? python setup.py install

On a platform with an NVIDIA GPU, it also fails spectacularly with the following error:

root@nonroot-MS-7B22:/build-scripts/pt/pytorch# python setup.py install  | tee build-pt.log
CMake Error: File /usr/local/bin/cmake/Modules/CMakeSystem.cmake.in does not exist.
CMake Error at /usr/local/share/cmake-3.16/Modules/CMakeDetermineSystem.cmake:185 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:27 (project)
CMake Error at /usr/local/share/cmake-3.16/Modules/CMakeDetermineCXXCompiler.cmake:23 (include):
  include could not find load file:
    /usr/local/bin/cmake/Modules/CMakeDetermineCompiler.cmake
Call Stack (most recent call first):
  CMakeLists.txt:27 (project)
CMake Error at /usr/local/share/cmake-3.16/Modules/CMakeDetermineCXXCompiler.cmake:64 (_cmake_find_compiler):
  Unknown CMake command "_cmake_find_compiler".
Call Stack (most recent call first):
  CMakeLists.txt:27 (project)
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!
See also "/build-scripts/pt/pytorch/build/CMakeFiles/CMakeOutput.log".
Building wheel torch-1.12.0a0+git5375b2e
-- Building version 1.12.0a0+git5375b2e
cmake -GNinja -DBUILD_PYTHON=True -DBUILD_TEST=True -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/build-scripts/pt/pytorch/torch -DCMAKE_PREFIX_PATH=/usr/local/lib/python3.9/site-packages;/anaconda3 -DCMAKE_ROOT=/usr/local/bin/cmake -DPYTHON_EXECUTABLE=/usr/bin/python -DPYTHON_INCLUDE_DIR=/usr/local/include/python3.9 -DPYTHON_LIBRARY=/usr/local/lib/libpython3.9.a -DTORCH_BUILD_VERSION=1.12.0a0+git5375b2e -DUSE_NUMPY=True /build-scripts/pt/pytorch
root@nonroot-MS-7B22:/build-scripts/pt/pytorch#
 GGPYTORCH000:

CMake Error: File /usr/local/bin/cmake/Modules/CMakeSystem.cmake.in does not exist.

I don’t know what’s causing this, but cmake ships with this file in this location, so it seems your cmake setup might be broken?

I encountered this problem as well.
The error message shows that the file cmake could not find is expected in a location different from where the running cmake binary is installed. I then found that setup.py sets CMAKE_ROOT to an unexpected directory. Inspecting the build process showed that the script inherits environment variables from the system environment. Finally, I found CMAKE_ROOT defined in my .bashrc, which I had copied from another development machine. That is what caused the error, and it is why the bug is so rare.
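
To track down where an inherited variable such as CMAKE_ROOT comes from, one option is to scan the shell startup files for it. This is only a sketch: the helper `find_definitions` is hypothetical, and the list of startup files is an assumption that depends on the shell in use.

```python
from pathlib import Path


def find_definitions(variable, files):
    """Return (file, line_number, text) for each non-comment line mentioning `variable`."""
    hits = []
    for file in files:
        path = Path(file).expanduser()
        if not path.is_file():
            continue  # skip startup files that do not exist on this machine
        lines = path.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, 1):
            stripped = line.strip()
            if variable in stripped and not stripped.startswith("#"):
                hits.append((str(path), number, stripped))
    return hits


if __name__ == "__main__":
    # Assumed set of bash startup files; adjust for other shells.
    startup_files = ["~/.bashrc", "~/.bash_profile", "~/.profile"]
    for hit in find_definitions("CMAKE_ROOT", startup_files):
        print(hit)
```

Any hit is worth reviewing before rebuilding, since the build inherits whatever the shell exports.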

GGPYTORCH000:

and then later on cmake randomly and then appears to have bailed out when that logic is shot down

Unfortunately, you did not follow up with any constructive feedback or more information besides pointing fingers at PyTorch. Unsure what your expectation is since I can build from source and did not see your error before.

GGPYTORCH000:

I don’t think PyTorch has ever built successfully or reliably based on the publicly provided instructions.

I will happily point to our nightly builds using these instructions. 🙂

tk0320:

I then found that setup.py sets CMAKE_ROOT to an unexpected directory. Inspecting the build process showed that the script inherits environment variables from the system environment. Finally, I found CMAKE_ROOT defined in my .bashrc, which I had copied from another development machine. That is what caused the error, and it is why the bug is so rare.

That’s a great explanation and thanks for following up! This would explain why cmake could not find the file I’ve pointed out was shipped in its package.

Although I ran the build differently from you, I think the problem is the same.

-DCMAKE_ROOT=/usr/local/bin/cmake makes cmake search paths like /usr/local/bin/cmake/Modules/.... However, as you can see, /usr/local/share/cmake-3.16/Modules/ is the correct path.
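
A quick way to test whether a CMAKE_ROOT value is plausible is to look for the file named in the error log underneath it. The sketch below assumes, based only on that log, that a usable CMAKE_ROOT contains Modules/CMakeSystem.cmake.in; `check_cmake_root` is a hypothetical helper, not part of CMake or PyTorch.

```python
import os
from pathlib import Path


def check_cmake_root(env=None):
    """Return (value, ok) for CMAKE_ROOT in `env`.

    `ok` is None when the variable is unset, otherwise it says whether
    Modules/CMakeSystem.cmake.in exists under that directory.
    """
    env = os.environ if env is None else env
    root = env.get("CMAKE_ROOT")
    if root is None:
        return None, None
    marker = Path(root) / "Modules" / "CMakeSystem.cmake.in"
    return root, marker.is_file()


if __name__ == "__main__":
    value, ok = check_cmake_root()
    if value is None:
        print("CMAKE_ROOT is not set in this environment")
    else:
        print(f"CMAKE_ROOT={value} contains the expected module file: {ok}")
```

With the value from the log above, /usr/local/bin/cmake, the check would be expected to report false, because the module files live under /usr/local/share/cmake-3.16/ instead.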

I don’t think this will solve anything: “I will happily point to our nightly builds using these instructions. 🙂”

I have been down this path before: it works for the time being, only for something else to break. I am not interested in doing QA for PyTorch build issues.

A better suggestion is to improve the build documentation and, if CI is running, to reduce the number of failed builds. This is not the only error I have seen: I have attempted numerous times to build torch and failed with all sorts of issues.
This is in contrast to the many open source projects I have worked with that build flawlessly.