<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Hamza Anas</title>
    <description>The latest articles on Forem by Hamza Anas (@hamza5485).</description>
    <link>https://forem.com/hamza5485</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F536826%2F399d5387-a740-47a6-86c0-05d912336187.jpeg</url>
      <title>Forem: Hamza Anas</title>
      <link>https://forem.com/hamza5485</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/hamza5485"/>
    <language>en</language>
    <item>
      <title>A Story of Face Recognition using Python</title>
      <dc:creator>Hamza Anas</dc:creator>
      <pubDate>Mon, 14 Dec 2020 19:18:18 +0000</pubDate>
      <link>https://forem.com/hamza5485/a-story-of-face-recognition-using-python-3b1a</link>
      <guid>https://forem.com/hamza5485/a-story-of-face-recognition-using-python-3b1a</guid>
      <description>&lt;h1&gt;
  
  
  Introduction to the Problem
&lt;/h1&gt;

&lt;p&gt;I recently found myself in a position where I was supposed to find pictures of myself from a set of &lt;em&gt;~1700&lt;/em&gt; photos. I decided that coding my way out of it was the only possible solution as I do not have the temperament to sit and look at that many pictures. This would also allow me to play with python and do some deep learning stuff which I don't necessarily get to do as a web developer. Over the weekend, I wrote a small program that leverages face detection &amp;amp; recognition to find images of me from a large dataset and paste them into another directory.&lt;/p&gt;

&lt;h1&gt;
  
  
  Prerequisites
&lt;/h1&gt;

&lt;p&gt;To recreate this experiment, you will need to be aware of the following libraries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/ageitgey/face_recognition" rel="noopener noreferrer"&gt;face_recognition&lt;/a&gt;: An awesome and very simple to use library that abstracts away all the complexity involved in face detection and face recognition.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/ageitgey/image_to_numpy" rel="noopener noreferrer"&gt;image_to_numpy&lt;/a&gt;: Yet another awesome library, written by the same author as above. It is used to load image files into NumPy arrays. (more details later on).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/skvark/opencv-python" rel="noopener noreferrer"&gt;opencv-python&lt;/a&gt;: Python bindings for the OpenCV library&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Setup
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Project Directory
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./
│   main.py
│   logger.py    
│
└───face_found
│   │   image.JPG
│   │   _image.JPG
│   │   ...
│
└───images
│   │   image.JPG
│   │   ...
│
└───source_dataset
│   │   image.JPG
│   │   ...
│   
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;main.py&lt;/code&gt; contains the core of this project.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;logger.py&lt;/code&gt; is a simple logger service that prints out colored logs based on message severity. &lt;a href="https://github.com/hamza5485/PythonLogger" rel="noopener noreferrer"&gt;You can check it out here!&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;images&lt;/code&gt; directory contains all the images in &lt;code&gt;.JPG&lt;/code&gt; format.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;source_dataset&lt;/code&gt; directory contains sample images of the person whose face needs to be searched, in &lt;code&gt;.JPG&lt;/code&gt; format.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;face_found&lt;/code&gt; directory is an image dropbox for the search results. &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Code Flow
&lt;/h2&gt;

&lt;p&gt;The program first iterates through the &lt;code&gt;source_dataset&lt;/code&gt; directory. For each image in the directory, the face encodings are extracted and stored inside an array. Let's call these &lt;strong&gt;"Known Faces"&lt;/strong&gt;. The code then proceeds to iterate through the &lt;code&gt;images&lt;/code&gt; directory. For each image, the face locations of each face are extracted. These locations are then used to extract the face encodings of each face. Let's call these &lt;strong&gt;"Unknown Faces"&lt;/strong&gt;. Each unknown face will now be compared against all the known faces to determine whether there is any similarity. If a similarity exists, the image will be stored inside the &lt;code&gt;face_found&lt;/code&gt; directory. &lt;/p&gt;

&lt;h2&gt;
  
  
  Code Explanation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;INIT&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;SOURCE_DATA_DIR&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source_dataset/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;IMAGE_SET_DIR&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;images/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;FACE_FOUND_DIR&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;face_found/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;TOLERANCE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;
&lt;span class="n"&gt;FRAME_THICKNESS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="n"&gt;COLOR&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;DETECTION_MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hog&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;log&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Logger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;basename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;source_faces&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;  &lt;span class="c1"&gt;# known faces
&lt;/span&gt;
&lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This part of the code sets up the basic conditions for the script. Things to notice here are &lt;code&gt;TOLERANCE&lt;/code&gt;, &lt;code&gt;DETECTION_MODEL&lt;/code&gt; and &lt;code&gt;time.perf_counter()&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;TOLERANCE&lt;/code&gt; is a measure of how strict the comparison should be (distance between a known face and an unknown face). A lower value is more strict whereas a higher value is more tolerant. The documentation for &lt;code&gt;face_recognition&lt;/code&gt; mentions that a value of &lt;strong&gt;0.6&lt;/strong&gt; is typical for best performance. I initially tried &lt;strong&gt;0.7&lt;/strong&gt; and then &lt;strong&gt;0.6&lt;/strong&gt;. These resulted in some inconsistent matches. As a result, I settled for &lt;strong&gt;0.5&lt;/strong&gt;.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;DETECTION_MODEL&lt;/code&gt; are methods &lt;em&gt;(read pre-trained models)&lt;/em&gt; through which we perform face detection. For this project, I initially considered the &lt;strong&gt;CNN&lt;/strong&gt; (&lt;strong&gt;C&lt;/strong&gt;onvolutional &lt;strong&gt;N&lt;/strong&gt;eural &lt;strong&gt;N&lt;/strong&gt;etwork) model, a neural network used primarily in computer vision. To implement this, we need to ensure that we have a CUDA enabled GPU and that dlib (&lt;code&gt;face_recognition&lt;/code&gt;'s underlying core for machine learning and data analysis) can recognize this hardware.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dlib&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cuda&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_num_devices&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;If the above code snippet gives us a value of &amp;gt;=1, then we can proceed to use CNN. We can check if dlib is using our GPU using &lt;code&gt;print(dlib.DLIB_USE_CUDA)&lt;/code&gt;. This should ideally return &lt;code&gt;True&lt;/code&gt; but if it doesn't, then we can simply set dlib to use CUDA: &lt;code&gt;dlib.DLIB_USE_CUDA = True&lt;/code&gt; and everything should work fine.  However, in my case, I was getting &lt;code&gt;MemoryError: std::bad_alloc&lt;/code&gt; when attempting to find face locations using CNN. &lt;a href="https://github.com/ageitgey/face_recognition/issues/691" rel="noopener noreferrer"&gt;From what I understand&lt;/a&gt;, this was because of the large resolution of the images &lt;em&gt;(largest: 6016 × 4016 pixels)&lt;/em&gt; that I was loading. My available solutions were to either reduce the resolution of all my images or move away from CNN (more memory intensive). For the time being, I decided to use &lt;strong&gt;HOG&lt;/strong&gt; (&lt;strong&gt;H&lt;/strong&gt;istogram of &lt;strong&gt;O&lt;/strong&gt;riented &lt;strong&gt;G&lt;/strong&gt;radients) instead. HOG utilizes classification algorithms such as the SVM to determine the existence of faces. A comparison between HOG and CNN suggests that HOG is faster in terms of computation time but less reliable when it comes to accuracy. CNN tends to be the most accurate. You can read &lt;a href="https://maelfabien.github.io/tutorials/face-detection/#3-real-time-face-detection-1" rel="noopener noreferrer"&gt;Maël Fabien's work&lt;/a&gt; on face detection where he covers both these models in great depth.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;start = time.perf_counter()&lt;/code&gt; starts a counter. This value is utilized at the end and serves as a basic time metric for code performance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;FRAME_THICKNESS&lt;/code&gt; and &lt;code&gt;COLOR&lt;/code&gt; are optional and can be ignored.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;LOADING SOURCE IMAGES&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;loading source images&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SOURCE_DATA_DIR&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;JPG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;processing source image &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;img_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SOURCE_DATA_DIR&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;
        &lt;span class="c1"&gt;# using image_to_numpy to load img file -&amp;gt; fixes image orientation -&amp;gt; face encoding is found
&lt;/span&gt;        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image_to_numpy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_image_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;source_img_encoding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;face_recognition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;face_encodings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;face encoding found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;source_faces&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source_img_encoding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;IndexError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;no face detected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SOURCE_DATA_DIR&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source_faces&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# -1 for .gitignore
&lt;/span&gt;    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source_faces&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; faces found in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SOURCE_DATA_DIR&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; images&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We then proceed to iterate through each image in the &lt;code&gt;source_dataset&lt;/code&gt; directory. Most of it is standard python stuff. The if condition &lt;code&gt;if filename.endswith("JPG"):&lt;/code&gt;, albeit a bit crude, only exists to ignore the .gitignore file. Since all my pictures are .JPG, I have a 100% guarantee it won't fail. &lt;br&gt;
Next we construct the image path and pass it as an argument to &lt;code&gt;image_to_numpy.load_image_file()&lt;/code&gt;. To me, this is the most interesting bit in this snippet because if you've used &lt;code&gt;face_recognition&lt;/code&gt;, you know it already has it's own &lt;code&gt;load_image_file()&lt;/code&gt; function that also returns a NumPy array of the image. The reason I had to go and use another library just for loading the image was because of the image orientation. During the earliest version of this code, I was baffled at how some faces failed to be detected. &lt;a href="https://medium.com/@ageitgey/the-dumb-reason-your-fancy-computer-vision-app-isnt-working-exif-orientation-73166c7d39da" rel="noopener noreferrer"&gt;After some research&lt;/a&gt;, I learned that face detection completely fails if the image is turned sideways. &lt;code&gt;image_to_numpy&lt;/code&gt;'s load image function reads the EXIF Orientation tag and then rotates the image file if required. &lt;br&gt;
Lastly, we pass the loaded image file to the &lt;code&gt;face_recognition.face_encodings()&lt;/code&gt; function. This returns a list of face encodings for each face in the image but since I know that&lt;code&gt;source_dataset&lt;/code&gt; contains only single shots of me, I can simply access the first element in the array. Hence the &lt;code&gt;[0]&lt;/code&gt; at the end of this line. This bit is encapsulated in a &lt;code&gt;try-except&lt;/code&gt; block so that if face detection fails, the raised exception doesn't crash (as it did when pictures were being loaded sideways). Otherwise, these encodings are added inside the &lt;code&gt;source_faces&lt;/code&gt; &lt;em&gt;(known faces)&lt;/em&gt; object we instantiated in the &lt;code&gt;INIT&lt;/code&gt; part of the code.&lt;br&gt;
In the end, we check if the length of the encoded faces array is equal to the total items in our directory &lt;em&gt;(minus the .gitignore)&lt;/em&gt; and print a log if it's not. This helped me figure out the sideway quirk.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;MAIN PROCESS&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing dataset&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IMAGE_SET_DIR&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;JPG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;processing dataset image &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;IMAGE_SET_DIR&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;img_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;IMAGE_SET_DIR&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;
        &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image_to_numpy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_image_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;locations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;face_recognition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;face_locations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DETECTION_MODEL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;encodings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;face_recognition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;face_encodings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;face_encoding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;face_location&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encodings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;locations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;face_recognition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compare_faces&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source_faces&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;face_encoding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TOLERANCE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;match found!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="c1"&gt;# optional start
&lt;/span&gt;                    &lt;span class="n"&gt;top_left&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;face_location&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;face_location&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                    &lt;span class="n"&gt;bottom_right&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;face_location&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;face_location&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
                    &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rectangle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;top_left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bottom_right&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;COLOR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FRAME_THICKNESS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;cv2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;imwrite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FACE_FOUND_DIR&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="c1"&gt;# optional end
&lt;/span&gt;                    &lt;span class="nf"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FACE_FOUND_DIR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;break&lt;/span&gt;
                &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;no match found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;IndexError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;no face detected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error encountered: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;stop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Total time elapsed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;stop&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; seconds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We now attempt to read each image in the &lt;code&gt;images&lt;/code&gt; directory to compare it with our list of known faces. The code starts the same way as our previous code block. Things slightly change when we enter the &lt;code&gt;try-except&lt;/code&gt; block. Before, we knew that all images were from the &lt;code&gt;source_dataset&lt;/code&gt; directory, which contained only single shots of m know we need to adjust the tolerance any time a face that isn't ours is in the face_found directory he. The current directory also contains group shots. Therefore, we will first find the locations of all faces in the image: &lt;code&gt;locations = face_recognition.face_locations(img, model=DETECTION_MODEL)&lt;/code&gt; and then use those locations to get the face encodings: &lt;code&gt;encodings = face_recognition.face_encodings(img, locations)&lt;/code&gt;. This will return a list of all face encodings found within an image. We then iteratively test each of these encodings against all our well-known/source/my faces: &lt;code&gt;results = face_recognition.compare_faces(source_faces, face_encoding, TOLERANCE)&lt;/code&gt;. If a match exists, we copy the picture to the &lt;code&gt;face_found&lt;/code&gt; directory: &lt;code&gt;copy(img_path, FACE_FOUND_DIR)&lt;/code&gt;. Else, we move on to the next face encoding and then the next image. &lt;br&gt;
Initially, when the tolerance was set to 0.7 and then 0.6, I had some pictures inside the &lt;code&gt;face_found&lt;/code&gt; directory where my face did not exist. The optional bit comes in handy here. Whichever face the code considers a match, a bounding box of &lt;code&gt;COLOR&lt;/code&gt; and &lt;code&gt;FRAME_THICKENSS&lt;/code&gt; is drawn over it using the face locations we extracted earlier: &lt;code&gt;cv2.rectangle(img, top_left, bottom_right, COLOR, FRAME_THICKNESS)&lt;/code&gt;. This version of the image is saved with an underscore preceding its name. This way, we know we need to adjust the tolerance any time a face that isn't ours has a bounding box on it).&lt;br&gt;
The very last bit stops the counter and prints the total time elapsed for this entire process.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion and Future Work
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fdhbvq3g8y8wsoqrya52q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fdhbvq3g8y8wsoqrya52q.png" alt="Conclusion" width="795" height="323"&gt;&lt;/a&gt;&lt;br&gt;
At the end of this exercise, I had 804 images in the &lt;code&gt;face_found&lt;/code&gt; directory. If half of the saved pictures have bounding boxes on them, then 402 images were a match. Not knowing how many of the pictures are mine beforehand makes it a little hard to judge the accuracy of this code. The code should ideally be modified and tested against a more contained dataset so that actual results can be evaluated. Moreover, It took ~3.7 hours (13312.6160 seconds) for this code to process ~1700 pictures, and that too with a faster face detection model. Implementing this code on a distributed processing model would be a lot more time-efficient. Finally, the program would benefit from an added block of code that creates low-resolution variants of the images and then uses the CNN model over HOG for more accurate results.&lt;br&gt;&lt;br&gt;
For now, however, I was able to extract &lt;em&gt;~400&lt;/em&gt; images out of &lt;em&gt;~1700&lt;/em&gt; without much effort. For me, that is a success story. This exercise also allowed me to revisit python and play around with face recognition, which in itself was a lot of fun. You can check out the complete &lt;a href="https://github.com/hamza5485/FindMyFace" rel="noopener noreferrer"&gt;code repo here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>python</category>
      <category>deeplearning</category>
      <category>facerecognition</category>
    </item>
  </channel>
</rss>
