"The PyPI (pip) releases of TensorFlow are still built against CUDA 9.0. You will need to build from source if you want to use CUDA 9.2. See here for a detailed discussion of how to do this."



Test environment

Linux: Ubuntu 16.04
CUDA: 9.2
Driver: 410.79
cuDNN: 7.4.2
NCCL: 2.3
Bazel: 0.18.0
tensorflow-gpu: 1.12.0

Step 1. Update the system

sudo apt-get update

Step 2. Install the dependencies

sudo apt-get install build-essential
sudo apt-get install cmake git unzip zip
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python2.7-dev python3.5-dev pylint

For Python 2.x

sudo apt-get install python-numpy python-dev python-pip python-wheel

For Python 3.x

sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel

Step 3. Download NCCL 2.3.7

On the NVIDIA NCCL download page, open "Download NCCL v2.3.7, for CUDA 9.2, Nov 8 & Dec 14, 2018".
Select "Local installers (x86)".
Select "O/S agnostic local installer".

Step 4. Install NCCL 2.3

tar -xvf nccl_2.3.7-1+cuda9.2_x86_64.txz
cd nccl_2.3.7-1+cuda9.2_x86_64/
sudo mkdir /usr/local/cuda-9.2/nccl
sudo cp -R * /usr/local/cuda-9.2/nccl
cd /usr/local/cuda-9.2/nccl
sudo mv LICENSE.txt NCCL-SLA.txt
sudo ldconfig
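
Before moving on, it can help to confirm that the copy produced the layout the later steps rely on. The following is a minimal sketch, not part of NCCL or TensorFlow: `missing_nccl_files` is a hypothetical helper, and it assumes the archive unpacks into `lib/` and `include/` subdirectories containing `libnccl.so.2` and `nccl.h`. Adjust the names if your archive differs.

```python
import os

# Files this guide assumes the NCCL archive provides, relative to the install root.
EXPECTED_NCCL_FILES = (
    os.path.join("lib", "libnccl.so.2"),
    os.path.join("include", "nccl.h"),
)


def missing_nccl_files(root, expected=EXPECTED_NCCL_FILES):
    """Return the expected files that are absent under `root` (hypothetical helper)."""
    return [rel for rel in expected if not os.path.exists(os.path.join(root, rel))]


if __name__ == "__main__":
    root = "/usr/local/cuda-9.2/nccl"  # install location used in Step 4
    missing = missing_nccl_files(root)
    if missing:
        print("Missing under %s: %s" % (root, ", ".join(missing)))
    else:
        print("NCCL layout looks complete under %s" % root)
```

An empty result means both files are where Step 7 will later be told to look for NCCL.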

Step 5. Install libcupti

sudo apt-get install libcupti-dev
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
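
The export only takes effect in shells started after `~/.bashrc` has been re-read. As a rough check, the sketch below tests whether a directory is one of the entries of `LD_LIBRARY_PATH`; `on_library_path` is a hypothetical helper written for this guide, and it only inspects the variable, not whether the directory exists.

```python
import os


def on_library_path(directory, env=None):
    """Return True if `directory` is an entry of LD_LIBRARY_PATH in `env`."""
    env = os.environ if env is None else env
    # LD_LIBRARY_PATH is a colon-separated list; skip empty entries.
    entries = [e for e in env.get("LD_LIBRARY_PATH", "").split(":") if e]
    target = os.path.normpath(directory)
    return any(os.path.normpath(e) == target for e in entries)


if __name__ == "__main__":
    cupti = "/usr/local/cuda/extras/CUPTI/lib64"  # path exported in Step 5
    print("CUPTI on LD_LIBRARY_PATH:", on_library_path(cupti))
```

Run it from a new terminal, or after `source ~/.bashrc`, so that the updated variable is visible to the Python process.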

(Optional) Remove an old Bazel that was installed from source

rm -rf ~/.bazel ~/.bazelrc
rm -rf "$HOME/.cache/bazel"
sudo rm -rf /usr/local/bin/bazel /etc/bazelrc /usr/local/lib/bazel
sudo rm -rf /usr/bin/bazel

Step 6. Install Bazel 0.18

sudo apt-get install pkg-config zip g++ zlib1g-dev unzip python
wget https://github.com/bazelbuild/bazel/releases/download/0.18.0/bazel-0.18.0-installer-linux-x86_64.sh
chmod +x bazel-0.18.0-installer-linux-x86_64.sh
./bazel-0.18.0-installer-linux-x86_64.sh --user
echo 'export PATH="$PATH:$HOME/bin"' >> ~/.bashrc
source ~/.bashrc
sudo ldconfig
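
Since the build in this guide was tested with Bazel 0.18.0, it is worth confirming which version ends up on the PATH. The sketch below assumes that `bazel version` prints a line starting with `Build label:`; `parse_build_label` is a hypothetical helper, not a Bazel API.

```python
import re
import subprocess


def parse_build_label(version_output):
    """Extract the version from the 'Build label:' line of `bazel version` output."""
    match = re.search(r"^Build label:\s*(\S+)", version_output, re.MULTILINE)
    return match.group(1) if match else None


if __name__ == "__main__":
    try:
        raw = subprocess.check_output(["bazel", "version"])
    except (OSError, subprocess.CalledProcessError) as err:
        # OSError covers the case where bazel is not on the PATH at all.
        print("Could not run bazel:", err)
    else:
        print("Bazel build label:", parse_build_label(raw.decode("utf-8", "replace")))
```

If the label is not 0.18.0, an older installation is probably still earlier on the PATH.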

Step 7. Configure TensorFlow 1.12

git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout r1.12
./configure
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Please input the desired Python library path to use.  Default is [/usr/local/lib/python3.5/dist-packages]
Do you wish to build TensorFlow with Apache Ignite support? [Y/n]: N
Do you wish to build TensorFlow with XLA JIT support? [Y/n]: N
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with ROCm support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 9.0]: 9.2
Please specify the location where CUDA 9.2 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-9.2
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 7.4.2
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-9.2]: /usr/local/cuda-9.2
Do you wish to build TensorFlow with TensorRT support? [y/N]: N
Please specify the NCCL version you want to use. If NCCL 2.2 is not installed, then you can use version 1.3 that can be fetched automatically but it may have worse performance with multiple GPUs. [Default is 2.2]: 2.3
Please specify the location where NCCL 2 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-9.2]: /usr/local/cuda-9.2/nccl
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 7.5]: 7.2
Do you want to use clang as CUDA compiler? [y/N]: N
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: /usr/bin/gcc
Do you wish to build TensorFlow with MPI support? [y/N]: N
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -march=native
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: N
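
The answers above can be written down once so that a rebuild does not depend on retyping them. The sketch below rests on an assumption that should be checked against `configure.py` in your checkout: that the r1.12 configure script skips a prompt when the matching environment variable is already set, and that the variable names listed here are the ones it reads. Any prompt not covered would still be asked interactively. `export_lines` is a hypothetical helper.

```python
import shlex

# ASSUMED mapping from the Step 7 answers to environment variables read by
# ./configure on r1.12. Verify every name in configure.py before relying on it.
CONFIGURE_ANSWERS = {
    "PYTHON_BIN_PATH": "/usr/bin/python3",
    "TF_NEED_CUDA": "1",
    "TF_CUDA_VERSION": "9.2",
    "CUDA_TOOLKIT_PATH": "/usr/local/cuda-9.2",
    "TF_CUDNN_VERSION": "7.4.2",
    "CUDNN_INSTALL_PATH": "/usr/local/cuda-9.2",
    "TF_NCCL_VERSION": "2.3",
    "NCCL_INSTALL_PATH": "/usr/local/cuda-9.2/nccl",
    "TF_CUDA_COMPUTE_CAPABILITIES": "7.2",
    "TF_CUDA_CLANG": "0",
    "GCC_HOST_COMPILER_PATH": "/usr/bin/gcc",
    "TF_NEED_MPI": "0",
    "CC_OPT_FLAGS": "-march=native",
}


def export_lines(answers):
    """Render the answers as shell `export` lines, quoting values where needed."""
    return ["export %s=%s" % (key, shlex.quote(value))
            for key, value in sorted(answers.items())]


if __name__ == "__main__":
    print("\n".join(export_lines(CONFIGURE_ANSWERS)))
```

The printed lines could be saved to a file and sourced before running `./configure`. Keeping the values in one place also records exactly which CUDA, cuDNN and NCCL paths a given wheel was built against.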

Step 8. Build TensorFlow

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow_pkg

Step 9. Install TensorFlow

cd tensorflow_pkg

For Python 2.x

sudo pip install tensorflow*.whl

For Python 3.x

sudo pip3 install tensorflow*.whl
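
Which of the two commands applies depends on the interpreter the wheel was built for, and the wheel's filename records that. The sketch below splits a filename into its fields following the wheel naming convention (distribution, version, optional build tag, Python tag, ABI tag, platform tag). `parse_wheel_name` is a hypothetical helper, and the filename in the comment is only an illustration of what a Python 3.5 build might produce.

```python
import glob
import os


def parse_wheel_name(path):
    """Split a wheel filename into distribution, version and compatibility tags."""
    name = os.path.basename(path)
    if not name.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % name)
    parts = name[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 fields when an optional build tag is present
        raise ValueError("unexpected wheel name: %r" % name)
    # The last three fields are always the Python, ABI and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    # Illustrative name only: tensorflow-1.12.0-cp35-cp35m-linux_x86_64.whl
    for wheel in glob.glob("tensorflow*.whl"):
        print(wheel, "->", parse_wheel_name(wheel))
```

A Python tag such as `cp27` points to the `pip` command, while `cp35` points to `pip3`.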

Step 10. Verify the TensorFlow installation

python
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

This should print
Hello, TensorFlow!
(under Python 3 the string is shown as bytes: b'Hello, TensorFlow!')
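
The hello-world test passes on a CPU-only build as well, so it does not show that the GPU is being used. A minimal sketch for listing the GPUs TensorFlow can see follows; it assumes the TensorFlow 1.x module `tensorflow.python.client.device_lib`, and `gpu_device_names` is a hypothetical helper that filters its result.

```python
def gpu_device_names(devices):
    """Return the names of the devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    try:
        # Assumed to be available in TensorFlow 1.x installations.
        from tensorflow.python.client import device_lib
    except ImportError as err:
        print("TensorFlow is not importable:", err)
    else:
        print("GPUs visible to TensorFlow:",
              gpu_device_names(device_lib.list_local_devices()))
```

An empty list suggests the wheel was built without CUDA support or that the driver and CUDA libraries are not being found at run time.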

(Optional) If you get "libnccl.so.2: cannot open shared object file: No such file or directory"

Create a symlink to the library in a directory that is on $LD_LIBRARY_PATH

sudo ln -s /usr/local/cuda-9.2/nccl/lib/libnccl.so.2 /usr/local/cuda-9.2/lib64/
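
To see whether the link resolved the problem without importing TensorFlow, the library can be opened directly through the dynamic loader. This is a minimal sketch using the standard `ctypes` module; `can_load` is a hypothetical helper, and it assumes `/usr/local/cuda-9.2/lib64` is already on the loader's search path.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open `library_name`."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is missing or cannot be opened.
        return False
    return True


if __name__ == "__main__":
    print("libnccl.so.2 loadable:", can_load("libnccl.so.2"))
```

Because LD_LIBRARY_PATH is read when a process starts, run the check from a fresh shell after changing the variable.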