Error: Multilib version problems found. This often means that the root cause is something else and multilib version checking is just pointing out that there is a problem. Eg.:
  1. You have an upgrade for audit-libs which is missing some dependency that another package requires. Yum is trying to solve this by installing an older version of audit-libs of the different architecture. If you exclude the bad architecture yum will tell you what the root cause is (which package requires what). You can try redoing the upgrade with --exclude audit-libs.otherarch ... this should give you an error message showing the root cause of the problem.
  2. You have multiple architectures of audit-libs installed, but yum can only see an upgrade for one of those architectures. If you don't want/need both architectures anymore then you can remove the one with the missing update and everything will work.
  3. You have duplicate versions of audit-libs installed already. You can use "yum check" to get yum show these errors.
...you can also use --setopt=protected_multilib=false to remove this checking, however this is almost never the correct thing to do as something else is very likely to go wrong (often causing much more problems).
Protected multilib versions: audit-libs-2.8.1-3.el7.x86_64 != audit-libs-2.8.1-3.el7_5.1.i686
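To make the last line of the yum message easier to read, here is a minimal Python sketch that splits the two package strings into name, version, release, and architecture. `parse_nevra` and `explain_multilib` are hypothetical helpers introduced for illustration and are not part of yum. The parsing assumes the `name-version-release.arch` layout seen in the message, with no epoch.

```python
from typing import NamedTuple


class Package(NamedTuple):
    name: str
    version: str
    release: str
    arch: str


def parse_nevra(text: str) -> Package:
    """Split 'name-version-release.arch' (assumed: no epoch) into its parts."""
    rest, arch = text.strip().rsplit(".", 1)
    name, version, release = rest.rsplit("-", 2)
    return Package(name, version, release, arch)


def explain_multilib(line: str) -> str:
    """Describe the mismatch reported on a 'Protected multilib versions' line."""
    _, _, pair = line.partition(":")
    left, right = (parse_nevra(p) for p in pair.split("!="))
    return (
        f"{left.name}: {left.arch} is at {left.version}-{left.release}, "
        f"{right.arch} is at {right.version}-{right.release}"
    )


# English form of the final line of the yum message.
line = ("Protected multilib versions: "
        "audit-libs-2.8.1-3.el7.x86_64 != audit-libs-2.8.1-3.el7_5.1.i686")
print(explain_multilib(line))
```

The two architectures of audit-libs differ only in the release field (`3.el7` versus `3.el7_5.1`), and that difference is what the multilib check rejects.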
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
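Docker reports this same error whenever the service fails to start, whatever the underlying cause. To see the actual error, run
systemctl status docker.service
or
journalctl -xe
and then resolve the specific error it shows (searching online if needed).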
dockerd[3518]: Error starting daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: iptables failed: iptables --wait -t nat -N DOCKER: iptables v1.4.21: can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
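This looks like a network problem involving iptables and the nat table. Scrolling further up the log reveals many errors like
nf_nat_ipv4: Unknown symbol nf_nat_l3proto_register (err 0)
and, further up still, the earliest and most relevant errors: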
kernel: xt_conntrack: Unknown symbol nf_ct_l3proto_module_put (err 0)
kernel: xt_conntrack: Unknown symbol nf_ct_l3proto_try_module_get (err 0)
dockerd[3518]: time="2018-11-02T13:32:27.308082925+08:00" level=warning msg="Running modprobe xt_conntrack failed with message: `modprobe: ERROR: could not insert 'xt_conntrack': Unknown symbol in module, or unknown parameter (see dmesg)\ninstall /bin/true \ninsmod /lib/modules/3.10.0-862.14.4.el7.x86_64/kernel/net/netfilter/xt_conntrack.ko.xz`, error: exit status 1"
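When the log contains many "Unknown symbol" lines, grouping them by module can make it easier to see which kernel modules failed to load and what they were missing. The following is a minimal Python sketch of that idea. `unresolved_symbols` is a hypothetical helper, and the sample text reuses lines quoted in this post. It assumes the journal output has been saved as plain text.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches kernel lines such as "xt_conntrack: Unknown symbol foo (err 0)".
UNKNOWN_SYMBOL = re.compile(r"(\w+): Unknown symbol (\w+) \(err -?\d+\)")


def unresolved_symbols(log_text: str) -> Dict[str, List[str]]:
    """Map each failing module to the symbols it could not resolve."""
    found: Dict[str, List[str]] = defaultdict(list)
    for module, symbol in UNKNOWN_SYMBOL.findall(log_text):
        if symbol not in found[module]:  # skip repeated log lines
            found[module].append(symbol)
    return dict(found)


sample = """\
kernel: xt_conntrack: Unknown symbol nf_ct_l3proto_module_put (err 0)
kernel: xt_conntrack: Unknown symbol nf_ct_l3proto_try_module_get (err 0)
kernel: nf_nat_ipv4: Unknown symbol nf_nat_l3proto_register (err 0)
"""
for module, symbols in unresolved_symbols(sample).items():
    print(module, "->", ", ".join(symbols))
```

This only summarizes the log and does not diagnose why the symbols are missing.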
docker run -it --rm tensorflow/tensorflow \
    python -c "import tensorflow as tf; print(tf.__version__)"
If this prints the TensorFlow version number, the installation works; here it printed 1.11.0.
docker run --runtime=nvidia -it --rm tensorflow/tensorflow:latest-gpu \
    python -c "import tensorflow as tf; print(tf.contrib.eager.num_gpus())"
If this prints TensorFlow's GPU information and the number of GPUs, the GPU setup works. Here the output was:
2018-11-02 10:02:03.213647: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-11-02 10:02:03.827049: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-11-02 10:02:03.827656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties:
name: Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:00:06.0
totalMemory: 22.38GiB freeMemory: 22.22GiB
2018-11-02 10:02:03.916592: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-11-02 10:02:03.917168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 1 with properties:
name: Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:00:07.0
totalMemory: 22.38GiB freeMemory: 22.22GiB
2018-11-02 10:02:03.917248: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0, 1
2018-11-02 10:02:04.566762: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-02 10:02:04.566813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] 0 1
2018-11-02 10:02:04.566824: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0: N N
2018-11-02 10:02:04.566831: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 1: N N
2018-11-02 10:02:04.567666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21551 MB memory) -> physical GPU (device: 0, name: Tesla P40, pci bus id: 0000:00:06.0, compute capability: 6.1)
2018-11-02 10:02:04.960299: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 21551 MB memory) -> physical GPU (device: 1, name: Tesla P40, pci bus id: 0000:00:07.0, compute capability: 6.1)
2
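The success check described here (GPU details in the log, followed by the GPU count) can also be done mechanically. The sketch below is a rough Python illustration. `gpu_count_matches` is a hypothetical helper, and it assumes the container output was captured as text whose final non-empty line is the printed count. The sample reuses two log lines from the output shown in this post.

```python
import re

# TensorFlow logs one "Found device N with properties:" line per GPU.
FOUND_DEVICE = re.compile(r"Found device (\d+) with properties:")


def gpu_count_matches(output: str) -> bool:
    """True if the printed GPU count equals the number of devices logged."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    reported = int(lines[-1])  # assumed: last line is the printed count
    logged = {int(n) for n in FOUND_DEVICE.findall(output)}
    return reported == len(logged)


sample = """\
2018-11-02 10:02:03.827656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties:
2018-11-02 10:02:03.917168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 1 with properties:
2
"""
print(gpu_count_matches(sample))
```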