Installing Tensorflow from Source with CPU Optimizations

Mayur Deshmukh — Thu, 16 Jul 2020 15:34:26 +0000

Why install Tensorflow from source?

The default TensorFlow builds are configured to run on as many CPUs and GPUs as possible. Building from source lets you use the full capabilities of your hardware, such as AVX instructions on the CPU or Tensor Cores on the GPU, which can make your code up to 3x faster.
The binaries installed by pip install tensorflow are the ones built for this broad compatibility. If you have looked at the console logs while running a TensorFlow program, you have probably seen a warning like “Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA”.
Building from source can therefore speed up your TensorFlow programs significantly, and that warning is TensorFlow itself pointing this out. The goal here is to build TensorFlow with AVX, AVX2, and FMA enabled, depending on which of them your CPU supports.
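
To see which of these instruction sets your own CPU reports, a small shell sketch can read the flags line of /proc/cpuinfo. This assumes a Linux system; the function name supported_simd_flags is made up for illustration and is not part of TensorFlow.

```shell
# List which of the SIMD features discussed above the CPU reports.
# The cpuinfo path is a parameter so the function can be tried on a saved file.
supported_simd_flags() (
    cpuinfo="${1:-/proc/cpuinfo}"
    # Take the first "flags" line and split it into one flag per line.
    flags=$(grep -m1 '^flags' "$cpuinfo" | cut -d: -f2 | tr ' ' '\n')
    for want in sse4_2 avx avx2 fma; do
        if printf '%s\n' "$flags" | grep -qx "$want"; then
            printf '%s\n' "$want"
        fi
    done
)

# Only query the live system where /proc/cpuinfo is readable (Linux).
if [ -r /proc/cpuinfo ]; then
    supported_simd_flags
fi
```

Any name missing from the output is a feature the CPU does not report, so it should not be enabled in the build.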

Let's Start Installing Tensorflow -

Step 1. Install Nvidia CUDA and latest Drivers -

Adding Nvidia package repositories

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo dpkg -i cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update

Installing Nvidia Driver

sudo apt-get install --no-install-recommends nvidia-driver-430

To check that GPUs are visible use -

nvidia-smi
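
For a more compact view, nvidia-smi can print selected fields as CSV. The sketch below wraps that in a hypothetical helper, gpu_summary, which falls back to a hint when the tool is not on the PATH.

```shell
# Print one CSV line per visible GPU, or a hint when the driver tools are missing.
gpu_summary() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
    else
        echo "nvidia-smi not found: the driver is not installed or not on PATH"
    fi
}

gpu_summary
```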

Installing CUDA 10.1 and Runtime libraries

sudo apt-get install --no-install-recommends \
    cuda-10-1 \
    libcudnn7=7.6.4.38-1+cuda10.1  \
    libcudnn7-dev=7.6.4.38-1+cuda10.1

Installing TensorRT (requires libcudnn7, installed above)

sudo apt-get install -y --no-install-recommends libnvinfer6=6.0.1-1+cuda10.1 \
    libnvinfer-dev=6.0.1-1+cuda10.1 \
    libnvinfer-plugin6=6.0.1-1+cuda10.1

Step 2. Install Bazel

The Bazel build system is used to build TensorFlow from source. Bazel requires a JDK to be installed.

Installing JDK if not installed

sudo apt-get update
sudo apt install default-jre
sudo apt install default-jdk

Installing Bazel

echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://storage.googleapis.com/bazel-apt/doc/apt-key.pub.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install bazel

Step 3. Cloning Tensorflow source code Repository

git clone https://github.com/tensorflow/tensorflow
cd tensorflow

If you want to build a specific TensorFlow version, check out that version's branch

git checkout version_branch
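
As an illustration of what version_branch looks like: TensorFlow's release branches are named r followed by the major and minor version, for example r2.2. The helper below is a hypothetical convenience, not something the repository provides, that derives the branch name from a version number.

```shell
# Map a version such as 2.2.0 to the matching release branch name (r2.2).
tf_release_branch() {
    printf 'r%s\n' "$(printf '%s' "$1" | cut -d. -f1,2)"
}

tf_release_branch 2.2.0   # prints r2.2

# Inside the cloned repository one would then run:
#   git checkout "$(tf_release_branch 2.2.0)"
```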

Step 4. Configure Tensorflow Installation

./configure

Output -

Please specify the location of python: [enter]
Please input the desired Python library path to use: [enter]
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with ROCm support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: y
Do you wish to build TensorFlow with TensorRT support? [y/N]: y
Do you want to use clang as CUDA compiler? [y/N]: N
Please specify which gcc should be used by nvcc as the host compiler: [enter]
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: N

During configuration, if you are asked the following

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:

Enter these compiler flags (configure adds the --copt= prefix to each one itself) -

-mavx -mavx2 -mfma -mfpmath=both -msse4.2

Step 5. Building Tensorflow with Bazel

This build needs a GCC version lower than 9 (CUDA 10.1 supports GCC only up to version 8). Check your version -

gcc --version
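
The comparison can also be scripted. This is a minimal sketch; gcc_needs_downgrade is a name chosen here for illustration, and it only looks at the major version number.

```shell
# Succeeds when the given GCC version string has a major version above 8.
gcc_needs_downgrade() {
    major=${1%%.*}
    [ "$major" -gt 8 ]
}

if command -v gcc >/dev/null 2>&1; then
    if gcc_needs_downgrade "$(gcc -dumpversion)"; then
        echo "GCC is newer than 8: switch to gcc-7 as shown in this step"
    else
        echo "GCC is 8 or older: no change needed"
    fi
fi
```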

If the version is greater than 8, follow the steps below. Otherwise, skip them.

sudo apt install software-properties-common
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt install gcc-7 g++-7
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 90 --slave /usr/bin/g++ g++ /usr/bin/g++-9 --slave /usr/bin/gcov gcov /usr/bin/gcov-9
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 70 --slave /usr/bin/g++ g++ /usr/bin/g++-7 --slave /usr/bin/gcov gcov /usr/bin/gcov-7
sudo update-alternatives --config gcc

Output -

There are 3 choices for the alternative gcc (providing /usr/bin/gcc).

  Selection    Path            Priority   Status
------------------------------------------------------------
* 0            /usr/bin/gcc-9   90        auto mode
  1            /usr/bin/gcc-7   70        manual mode
  3            /usr/bin/gcc-9   90        manual mode

Press <enter> to keep the current choice[*], or type selection number: [Enter number of gcc 7]

Bazel uses a lot of resources while building TensorFlow, so increase your swapfile to 16GB to avoid problems

sudo swapoff /swapfile
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

You can pass the flag --local_ram_resources=2048 to bazel build if you have limited memory
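
The value is given in megabytes. Instead of a fixed 2048, one option is to derive it from the machine's RAM. The sketch below uses half of MemTotal from /proc/meminfo; the "half" policy and the function name ram_flag_from_meminfo are assumptions made for illustration, not a recommendation from Bazel or TensorFlow.

```shell
# Print a --local_ram_resources flag (value in MB) equal to half of MemTotal.
# The meminfo path is a parameter so the function can be tried on a saved file.
ram_flag_from_meminfo() (
    meminfo="${1:-/proc/meminfo}"
    # MemTotal is reported in kB.
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    printf '%s\n' "--local_ram_resources=$((total_kb / 2 / 1024))"
)

# Only query the live system where /proc/meminfo is readable (Linux).
if [ -r /proc/meminfo ]; then
    ram_flag_from_meminfo
fi
```

The printed flag can be added to the bazel build command used in this step.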

Then call bazel to build the TensorFlow pip package

bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
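
The --copt list in the build command above enables every instruction set unconditionally. A binary built with an instruction set the CPU lacks can fail at runtime with an illegal instruction error, so it may be safer to pass only the flags your CPU supports. The sketch below maps feature names as they appear in /proc/cpuinfo to the matching GCC flags; copts_for_features is a hypothetical helper name.

```shell
# Turn CPU feature names (as spelled in /proc/cpuinfo) into bazel --copt flags.
# Unknown names are skipped.
copts_for_features() (
    out=""
    for feature in "$@"; do
        case "$feature" in
            sse4_2) flag="-msse4.2" ;;
            avx)    flag="-mavx" ;;
            avx2)   flag="-mavx2" ;;
            fma)    flag="-mfma" ;;
            *)      continue ;;
        esac
        out="$out --copt=$flag"
    done
    # Drop the leading space left by the first append.
    printf '%s\n' "${out# }"
)

copts_for_features avx avx2 fma   # prints --copt=-mavx --copt=-mavx2 --copt=-mfma
```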

This Bazel build can easily take 4-5 hours, depending on your processor. So take a break :)

Step 6. Installing TensorFlow pip package

Once the Bazel build and the build_pip_package script have finished, a TensorFlow .whl file is available in /tmp/tensorflow_pkg

pip3 install --upgrade /tmp/tensorflow_pkg/tensorflow-*.whl

Step 7. Test your Installation

Check if Tensorflow is installed or not

pip3 list | grep tensorflow

Check if Tensorflow is working or not

python3
>>> import tensorflow as tf
tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
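
The same check can be run non-interactively from the shell. This sketch assumes TensorFlow 2.1 or newer, where tf.config.list_physical_devices is available; check_tensorflow is a name chosen here for illustration.

```shell
# Print the installed TensorFlow version and the GPUs it can see,
# or a hint when the package cannot be imported.
check_tensorflow() {
    if python3 -c 'import tensorflow' >/dev/null 2>&1; then
        python3 -c 'import tensorflow as tf; print(tf.__version__); print(tf.config.list_physical_devices("GPU"))'
    else
        echo "tensorflow is not importable from python3"
    fi
}

check_tensorflow
```

An empty list on the second line would mean that TensorFlow imported correctly but did not detect a GPU.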
