<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mayur Deshmukh</title>
    <description>The latest articles on Forem by Mayur Deshmukh (@mayurdeshmukh10).</description>
    <link>https://forem.com/mayurdeshmukh10</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F356130%2F6ab1c740-9610-4e3e-ae36-f2954b162c3e.png</url>
      <title>Forem: Mayur Deshmukh</title>
      <link>https://forem.com/mayurdeshmukh10</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mayurdeshmukh10"/>
    <language>en</language>
    <item>
      <title>Installing Tensorflow from Source with CPU Optimizations</title>
      <dc:creator>Mayur Deshmukh</dc:creator>
      <pubDate>Thu, 16 Jul 2020 15:34:26 +0000</pubDate>
      <link>https://forem.com/mayurdeshmukh10/installing-tensorflow-from-source-with-cpu-optimizations-44k8</link>
      <guid>https://forem.com/mayurdeshmukh10/installing-tensorflow-from-source-with-cpu-optimizations-44k8</guid>
      <description>&lt;h2&gt;
  
  
  Why install Tensorflow from source?
&lt;/h2&gt;

&lt;p&gt;Tensorflow comes with default settings to be compatible with as many CPUs/GPUs as it can. You can easily optimize it to use the full capabilities of your CPU such as AVX or of your GPU such as Tensor Cores leading to up to a 3x accelerated code. &lt;br&gt;
The default builds from &lt;code&gt;pip install tensorflow&lt;/code&gt; are intended to be compatible with as many CPUs as possible. If you ever have seen logs in your console while running your Tensorflow program, you must have seen such a warning- “Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA”.&lt;br&gt;
Building it from the source itself might speed up your Tensorflow program significantly. TensorFlow actually warns you about doing just. We should build TensorFlow from the source for optimizing it with AVX, AVX2, and FMA whichever CPU supports.&lt;/p&gt;

&lt;p&gt;Let's Start Installing Tensorflow - &lt;/p&gt;
&lt;h4&gt;
  
  
  Step 1. Install Nvidia CUDA and latest Drivers -
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Adding Nvidia package repositories
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo dpkg -i cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Installing Nvidia Driver
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get install --no-install-recommends nvidia-driver-430
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;To check that GPUs are visible use -
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nvidia-smi
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Installing CUDA 10.1 and Runtime libraries
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get install --no-install-recommends \
    cuda-10-1 \
    libcudnn7=7.6.4.38-1+cuda10.1  \
    libcudnn7-dev=7.6.4.38-1+cuda10.1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Installing TensorRT and libcudnn7
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get install -y --no-install-recommends libnvinfer6=6.0.1-1+cuda10.1 \
    libnvinfer-dev=6.0.1-1+cuda10.1 \
    libnvinfer-plugin6=6.0.1-1+cuda10.1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Step 2. Install Bazel
&lt;/h4&gt;

&lt;p&gt;Bazel build system is used to build Tensorflow from the source. Bazel requires JDK to be installed&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Installing JDK if not installed
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get update
sudo apt install default-jre
sudo apt install default-jdk
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Installing Bazel
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://storage.googleapis.com/bazel-apt/doc/apt-key.pub.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install bazel
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Step 3. Cloning Tensorflow source code Repository
&lt;/h4&gt;


&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/tensorflow/tensorflow
cd tensorflow
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;If you want to install specific Tensorflow version then checkout to that version branch&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git checkout version_branch
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 4. Configure Tensorflow Installation
&lt;/h4&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./configure
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  Output -
&lt;/h4&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Please specify the location of python: [enter]
Please input the desired Python library path to use: [enter]
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with ROCm support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: y
Do you wish to build TensorFlow with TensorRT support? [y/N]: y
Do you want to use clang as CUDA compiler? [y/N]: N
Please specify which gcc should be used by nvcc as the host compiler: [enter]
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: N
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;During configuration, if following is asked&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Use these flags -&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;--copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 5. Building Tensorflow with Bazel
&lt;/h4&gt;

&lt;p&gt;Bazel requires GCC compiler version less than 9. check your version -&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcc --version
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;If version greater than 8 then following steps. If not then skip the step&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt install software-properties-common
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt install gcc-7 g++-7
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 90 --slave /usr/bin/g++ g++ /usr/bin/g++-9 --slave /usr/bin/gcov gcov /usr/bin/gcov-9
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 70 --slave /usr/bin/g++ g++ /usr/bin/g++-7 --slave /usr/bin/gcov gcov /usr/bin/gcov-7
sudo update-alternatives --config gcc
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  Output -
&lt;/h4&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;There are 3 choices for the alternative gcc (providing /usr/bin/gcc).

  Selection    Path            Priority   Status
------------------------------------------------------------
* 0            /usr/bin/gcc-9   90        auto mode
  1            /usr/bin/gcc-7   70        manual mode
  3            /usr/bin/gcc-9   90        manual mode

Press &amp;lt;enter&amp;gt; to keep the current choice[*], or type selection number: [Enter number of gcc 7]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Bazel takes a lot of resources while building TensorFlow, we will increase your swapfile to 16GB to avoid problems&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo swapoff /swapfile
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;You can pass flag &lt;code&gt;--local_ram_resources=2048&lt;/code&gt; in bezel build if you have limited memory&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Then call bazel to build the TensorFlow pip package
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;This bazel building can easily take up to 4-5 hours depending upon your processor. So take a break :)&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 6. Installing TensorFlow pip package
&lt;/h4&gt;

&lt;p&gt;Once bazel build is finished it will generate a tensorflow .whl file&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip3 install --upgrade /tmp/tensorflow_pkg/tensorflow-*.whl
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 7. Test your Installation
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Check if Tensorflow is installed or not
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip3 list | grep tensorflow
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Check if Tensorflow is working or not
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3
&amp;gt;&amp;gt;&amp;gt; import tensorflow as tf
tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



</description>
      <category>tensorflow</category>
      <category>installation</category>
      <category>tutorial</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
