Your Photos Are Just 512 Numbers

Alemnew Marie — Tue, 03 Feb 2026 08:28:28 +0000

The surprisingly simple math behind image search, reverse image lookup, and duplicate detection.

Modern vision models can represent an image as a vector of numbers (often 512 dimensions) that capture its semantic meaning.

This is the principle behind visual search and duplicate detection in real-world systems.

The Trick in 3 Steps

1. Turn images into fingerprints

A neural network (CLIP) looks at an image and outputs 512 numbers, called an embedding. The same image, or a lightly edited copy of it, yields a nearly identical embedding.

# Input: cat.jpg
# Output: [0.023, -0.156, 0.891, ... 512 numbers]

2. Compare fingerprints with dot product

Two images are similar if their embedding vectors point in the same direction. We measure this with cosine similarity.

After normalizing the vectors, cosine similarity reduces to a simple dot product.

cat1.jpg  →  [0.1, 0.9, -0.3] ──┐
                                ├──→ 0.94 (94% similar) ✓
cat2.jpg  →  [0.2, 0.8, -0.2] ──┘

dog.jpg   →  [-0.8, 0.1, 0.5] ──┐
                                ├──→ 0.12 (12% similar) ✗
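
To make the dot-product claim concrete, here is a minimal sketch in plain Python (no CLIP model involved) that normalizes the three-number toy vectors from the diagram and compares them. The helper names `normalize`, `dot`, and `cosine` are made up for this illustration, and comparing dog.jpg against cat1.jpg is an assumption. The scores in the diagram are illustrative, so the exact values for these toy vectors come out a little differently.

```python
import math

def normalize(v):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Textbook cosine similarity: dot product divided by both lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

cat1 = [0.1, 0.9, -0.3]
cat2 = [0.2, 0.8, -0.2]
dog = [-0.8, 0.1, 0.5]

# After normalization, the dot product alone equals the cosine similarity.
cat_vs_cat = dot(normalize(cat1), normalize(cat2))
cat_vs_dog = dot(normalize(cat1), normalize(dog))

print(f"cat1 vs cat2: {cat_vs_cat:.3f}")  # high: the vectors point the same way
print(f"cat1 vs dog:  {cat_vs_dog:.3f}")  # low (here negative): different directions
```

Cosine similarity can be negative when two vectors point in roughly opposite directions, which is why the dog comparison lands below zero here.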

3. Set a threshold

If similarity > 0.90: it's a match. Same photo, slight crop, resize, or filter.

If similarity < 0.50: completely different content.

Exact thresholds depend on your dataset and model, but these are common starting points.
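
The two cutoffs above can be sketched as a small helper. This is only an illustration: the function name `classify` and the "uncertain" label for scores between the two cutoffs are assumptions, since the post does not say how to treat that middle band.

```python
def classify(similarity, match_threshold=0.90, different_threshold=0.50):
    """Map a cosine similarity score to a coarse label."""
    if similarity > match_threshold:
        return "match"
    if similarity < different_threshold:
        return "different"
    # The middle band is not named in the post; "uncertain" is an assumed label.
    return "uncertain"

for score in (0.94, 0.70, 0.12):
    print(score, "->", classify(score))
```

Keeping the thresholds as parameters makes it easy to tune them against your own dataset and model.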

🚀 See it in action: https://image-similarity.balewgize.app/

Why This Works

CLIP (Contrastive Language-Image Pre-training) was trained on 400 million image-text pairs. It learned that a photo of a "golden retriever on a beach" should have similar numbers to "dog running on sand", even if the pixels don't match.

Traditional image comparison looks at pixels. Embeddings capture meaning.

This lets you search for images based on what they actually show, instead of relying on filenames or alt-text.

While this model uses 512 numbers, larger models may use 768 or 1024 dimensions to capture finer semantic detail.

What You Can Build

Use Case              How
Duplicate detection   Find photos that are 95%+ similar
Reverse image search  Find visually similar images in a database
Content moderation    Detect near-duplicates of flagged images
Image clustering      Group photos by visual similarity
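
As a rough sketch of the first two use cases, the snippet below runs a reverse search and a duplicate scan over a tiny in-memory "database". Everything here is hypothetical: the three-number vectors stand in for real embeddings, and `top_k` and `find_duplicates` are illustrative helper names, not part of any library. It uses a simple linear scan in plain Python, which is fine for a handful of images but would likely need a vectorized or indexed approach for large collections.

```python
import math

def normalize(v):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def top_k(query, database, k=2):
    """Return the k (name, score) pairs most similar to the query, best first."""
    q = normalize(query)
    scored = [(name, dot(q, normalize(vec))) for name, vec in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

def find_duplicates(database, threshold=0.95):
    """Return name pairs whose similarity is at or above the threshold."""
    unit = {name: normalize(vec) for name, vec in database.items()}
    names = sorted(unit)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if dot(unit[a], unit[b]) >= threshold
    ]

# Hypothetical 3-number "embeddings"; real ones would come from the model.
database = {
    "cat1.jpg": [0.1, 0.9, -0.3],
    "cat2.jpg": [0.2, 0.8, -0.2],
    "dog.jpg": [-0.8, 0.1, 0.5],
}

print(top_k([0.15, 0.85, -0.25], database))  # the two cat images rank first
print(find_duplicates(database))             # only the cat pair clears 0.95
```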

The Full Picture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Image 1   │────→│  Embedding  │────→│             │
│             │     │ (512 dims)  │     │   Cosine    │
└─────────────┘     └─────────────┘     │  Similarity │──→ 0.94 → MATCH
                                        │  (-1 to 1)  │
┌─────────────┐     ┌─────────────┐     │             │
│   Image 2   │────→│  Embedding  │────→│             │
│             │     │ (512 dims)  │     └─────────────┘
└─────────────┘     └─────────────┘

That’s it conceptually. The hard work is already baked into the model.

💻 Source Code: GitHub Repo Link

🌐 Live Demo: https://image-similarity.balewgize.app/

If you’re exploring similar problems, let’s connect.

Code snippet

Install: pip install open-clip-torch opencv-python-headless pillow

import cv2
import numpy as np
import torch
import open_clip
from PIL import Image
from typing import Any, Optional, Tuple

_model: Optional[Any] = None
_preprocess: Optional[Any] = None
_device: Optional[str] = None


def _load_model() -> Tuple[Any, Any, str]:
    """Load OpenCLIP model and preprocessing on CPU/GPU."""
    global _model, _preprocess, _device
    if _model is not None:
        if _preprocess is None or _device is None:
            raise RuntimeError("Model cache is incomplete")
        return _model, _preprocess, _device

    _device = "cuda" if torch.cuda.is_available() else "cpu"
    # RN50 is fast and lightweight; ViT-based models give better quality at higher cost.
    # Note: OpenAI's RN50 checkpoint outputs 1024-dim embeddings; "ViT-B-32" outputs 512.
    _model, _, _preprocess = open_clip.create_model_and_transforms(
        "RN50", pretrained="openai"
    )
    if isinstance(_preprocess, (list, tuple)):
        _preprocess = _preprocess[-1]
    _model = _model.to(_device).eval()
    if _preprocess is None or _device is None:
        raise RuntimeError("Model initialization failed")
    return _model, _preprocess, _device


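def _load_image(path: str) -> np.ndarray:
    """Read an image from disk as a BGR array."""
    img = cv2.imread(path)
    if img is None:
        # cv2.imread returns None for missing files and for unreadable ones.
        raise FileNotFoundError(f"Could not read image: {path}")
    return img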


def _embed(img: np.ndarray) -> np.ndarray:
    """Compute a normalized CLIP embedding."""
    model, preprocess, device = _load_model()
    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    tensor = preprocess(Image.fromarray(rgb)).unsqueeze(0).to(device)
    with torch.inference_mode():
        emb = model.encode_image(tensor)
        emb = emb / emb.norm(dim=-1, keepdim=True)
    return emb.cpu().numpy().flatten()


def compare(img1: str, img2: str) -> float:
    """Return cosine similarity for two image paths."""
    e1 = _embed(_load_image(img1))
    e2 = _embed(_load_image(img2))
    return float(np.dot(e1, e2))


if __name__ == "__main__":
    image1 = "path/to/image1.jpg"
    image2 = "path/to/image2.jpg"
    score = compare(image1, image2)
    print("Similarity score:", score)

Think Python: A book that helped me build strong fundamentals of Python

Alemnew Marie — Thu, 09 Mar 2023 18:05:56 +0000

If you are a complete beginner aiming to learn Python, or you have some programming experience in Python and/or other programming languages but want to master the fundamentals deeply, then give this book a try.

In this post, I am going to share how the book really helped me to understand the fundamentals of Python and programming concepts in general from scratch.

Why Think Python?

In programming, understanding the fundamentals of the language you want to learn is a crucial first step in your career. Once you have mastered and practiced the fundamentals, you can do the following with ease:

learning advanced concepts
learning frameworks and tools of the language
understanding other people’s code (especially from Stack Overflow 😅)
working on more complex projects
debugging your program
even switching to other languages

Here is why I find this book to be really important (at least for me):

The author explains fundamental programming concepts clearly, with interesting real-world examples. The book is organized so that you master the simpler concepts first and then build up to the advanced ones step by step.

As it says in the title, it makes you think like a computer scientist by changing your way of thinking about programming. The book is full of programming concepts explained in beginner-friendly and interesting ways. It doesn’t merely teach you about Python syntax, but about Python programming and how to use it in solving real-world problems.

The book has a section called Debugging at the end of each chapter where you will learn how to acquire debugging skills and easily debug your programs. Bugs are inevitable in programming, so unless you learn how to easily fix them, you may find it difficult to go further in programming, and even start to hate coding. Debugging is a must-have skill and the book teaches you this crucial skill practically step by step.

Last but not least, the book has excellent exercises at the end of each chapter that ask you to practice what you learned in that chapter and the previous ones. They challenge you to think creatively like a programmer and help you sharpen your programming and problem-solving skills.

How to read it?

There is a famous quote (I don’t remember exactly who said it):

I hear and I forget. I see and I remember. I do and I understand.

This quote applies to programming very well: the best way to learn programming is by doing it. So it is really important to do (or at least try) the exercises at the end of each chapter on your own.

Once you have done or tried an exercise, you can take a look at the author's solution. When you do, study how the author approaches the problem and how he expresses the solution in code.

You can download the book in PDF format here: Download

You can find solutions to exercises here: Solutions

This book really helped me to understand Python and programming very well, so sharing in case it helps someone out there :)

If you have any thoughts, let me know in the comments below.

Happy learning!

Forem: Alemnew Marie