<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Imthadh Ahamed</title>
    <description>The latest articles on Forem by Imthadh Ahamed (@imthadh_ahamed).</description>
    <link>https://forem.com/imthadh_ahamed</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2742351%2F2c8598ae-cd41-42df-bd8e-dd293337f504.jpg</url>
      <title>Forem: Imthadh Ahamed</title>
      <link>https://forem.com/imthadh_ahamed</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/imthadh_ahamed"/>
    <language>en</language>
    <item>
      <title>From Code Push to Docker Hub: CI/CD with GitHub Actions🚀</title>
      <dc:creator>Imthadh Ahamed</dc:creator>
      <pubDate>Thu, 09 Oct 2025 11:45:51 +0000</pubDate>
      <link>https://forem.com/imthadh_ahamed/from-code-push-to-docker-hub-cicd-with-github-actions-171a</link>
      <guid>https://forem.com/imthadh_ahamed/from-code-push-to-docker-hub-cicd-with-github-actions-171a</guid>
      <description>&lt;p&gt;Imagine you’re running a bakery. Every morning, you bring fresh ingredients (your new code). But instead of kneading, baking, packaging, and delivering each loaf by hand, you want a conveyor belt that handles everything the moment the ingredients arrive.&lt;/p&gt;

&lt;p&gt;That conveyor belt is your &lt;strong&gt;CI/CD pipeline&lt;/strong&gt;.&lt;br&gt;
Each time you git push, it builds, tests, packages, and ships your product automatically.&lt;/p&gt;

&lt;p&gt;In this article, we’ll create that conveyor belt for your Node.js app using &lt;strong&gt;GitHub Actions and Docker&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  Why Automate Docker Builds?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why It Matters&lt;/strong&gt;&lt;br&gt;
Manual builds are slow, repetitive, and error-prone. Automation brings consistency and peace of mind.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explain Like a Friend&lt;/strong&gt;&lt;br&gt;
Think of manually assembling furniture for every order — you’ll eventually forget screws or swap panels. A factory line ensures the same result every single time. CI/CD is that factory line for software.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-World Example&lt;/strong&gt;&lt;br&gt;
A small dev team pushes hotfixes daily. Without automation, they rebuild locally, tag manually, and sometimes push to the wrong repo. With GitHub Actions, every push triggers a clean, predictable build → tag → push pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Action Step&lt;/strong&gt;&lt;br&gt;
Write down every manual command you currently run to deploy your Docker image. That’s your “to-automate” checklist.&lt;/p&gt;
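&lt;p&gt;For a typical Node.js app, that checklist often looks something like the following sketch (the image name &lt;code&gt;yourname/app&lt;/code&gt; and the version tag are placeholders):&lt;/p&gt;

```shell
# Manual deployment steps -- each of these becomes a pipeline step later
docker build -t yourname/app:latest .                # build the image
docker tag yourname/app:latest yourname/app:v1.2.3   # tag a release
docker login                                         # authenticate interactively
docker push yourname/app:latest                      # upload to Docker Hub
docker push yourname/app:v1.2.3
```

&lt;p&gt;Every one of those commands is something the workflow below the fold can run for you.&lt;/p&gt;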


&lt;h2&gt;
  
  
  Prerequisites: Accounts, Secrets &amp;amp; Permissions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why It Matters&lt;/strong&gt;&lt;br&gt;
Your automation can’t log in or deploy if it doesn’t have the right keys.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation&lt;/strong&gt;&lt;br&gt;
Think of GitHub and Docker Hub as two warehouses. Your CI/CD robot needs secure keys to both: one to collect the code, the other to store the built product (your image).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-World Example&lt;/strong&gt;&lt;br&gt;
In the demo, the author logs into Docker Hub using encrypted secrets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;DOCKER_HUB_USERNAME&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DOCKER_HUB_TOKEN&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These secrets are stored safely in GitHub → Settings → Secrets → Actions — never in your source code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Action Steps&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create or log into your Docker Hub account.&lt;/li&gt;
&lt;li&gt;Generate a personal access token (instead of using your password).&lt;/li&gt;
&lt;li&gt;In GitHub, go to &lt;code&gt;Settings → Secrets → Actions&lt;/code&gt;, and add:
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;DOCKER_HUB_USERNAME&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DOCKER_HUB_TOKEN&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
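&lt;p&gt;If you prefer the terminal, the same secrets can be added with the GitHub CLI (a sketch; it assumes &lt;code&gt;gh&lt;/code&gt; is installed and authenticated against your repo):&lt;/p&gt;

```shell
# Store Docker Hub credentials as encrypted Actions secrets
gh secret set DOCKER_HUB_USERNAME --body "your-dockerhub-username"
gh secret set DOCKER_HUB_TOKEN --body "your-dockerhub-access-token"
```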


&lt;h2&gt;
  
  
  Writing the Workflow (main.yaml) — Step by Step
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why It Matters&lt;/strong&gt;&lt;br&gt;
The workflow file is your recipe — the step-by-step plan your robot will follow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation&lt;/strong&gt;&lt;br&gt;
In &lt;code&gt;.github/workflows/main.yaml&lt;/code&gt;, you define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When to run: (trigger event, like push)&lt;/li&gt;
&lt;li&gt;What to run: (the jobs and steps)&lt;/li&gt;
&lt;li&gt;How to run: (what tools or actions to use)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Parts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;on&lt;/code&gt;: push: “Start the oven when new ingredients arrive.”&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;jobs&lt;/code&gt;: “Each job is a workstation — baking, packaging, shipping.”&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;actions/checkout@v3&lt;/code&gt;: “Fetch ingredients from the storage (repo).”&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docker/login-action@v2&lt;/code&gt;: “Unlock the warehouse with your keycard (secret).”&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docker/build-push-action@v3&lt;/code&gt;: “Bake and ship the final product.”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Workflow&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;name: Build and Push Docker Image

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_HUB_USERNAME }}
          password: ${{ secrets.DOCKER_HUB_TOKEN }}

      - uses: docker/build-push-action@v3
        with:
          context: .
          push: true
          tags: yourname/app:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;⚠️ Common Pitfalls&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wrong Dockerfile path → build fails.&lt;/li&gt;
&lt;li&gt;Missing or misnamed secrets → the login step fails.&lt;/li&gt;
&lt;li&gt;Forgot &lt;code&gt;push: true&lt;/code&gt; → image never uploads.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;✅ Action Step&lt;/strong&gt;&lt;br&gt;
Commit your file to &lt;code&gt;.github/workflows/main.yaml&lt;/code&gt; and push it. Your pipeline is ready.&lt;/p&gt;




&lt;h2&gt;
  
  
  Build &amp;amp; Push: Watching It Run
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Why It Matters&lt;/strong&gt;&lt;br&gt;
Watching the pipeline run end to end proves that your automation actually works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explanation&lt;/strong&gt;&lt;br&gt;
Push your code, and watch:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;GitHub detects the push.&lt;/li&gt;
&lt;li&gt;Actions triggers the job.&lt;/li&gt;
&lt;li&gt;The workflow checks out code, logs in to Docker Hub, builds, and pushes the image.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Real-World Example&lt;/strong&gt;&lt;br&gt;
In the demo, after pushing a small code change, the author refreshed Docker Hub and saw the new image tagged &lt;code&gt;latest&lt;/code&gt;. No manual typing, no Docker commands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Action Steps&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Edit a small file (e.g., README).&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;git push&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Open GitHub → Actions tab → watch logs.&lt;/li&gt;
&lt;li&gt;Open Docker Hub → confirm the new tag.&lt;/li&gt;
&lt;/ol&gt;
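&lt;p&gt;You can also verify the published image from any machine with Docker installed (again, &lt;code&gt;yourname/app&lt;/code&gt; is a placeholder for your own repository):&lt;/p&gt;

```shell
docker pull yourname/app:latest                    # fetch the image the pipeline just pushed
docker run --rm -p 3000:3000 yourname/app:latest   # run it locally to confirm it starts
```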

&lt;p&gt;&lt;strong&gt;Counterpoints &amp;amp; Limitations&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Private repos or registries: Require additional permissions or tokens.&lt;/li&gt;
&lt;li&gt;Multi-service setups: Need multiple jobs or build matrix strategies.&lt;/li&gt;
&lt;li&gt;Large images: Consider caching layers for faster builds.&lt;/li&gt;
&lt;li&gt;Security: Always scan base images and protect secrets carefully.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Mini Case Study: “Solo Developer to One-Click Deployment”
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Before:&lt;/strong&gt;&lt;br&gt;
X, a solo Node.js developer, manually built and pushed Docker images every update — often forgetting tags or pushing the wrong version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After GitHub Actions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Added main.yaml&lt;/li&gt;
&lt;li&gt;Configured Docker Hub secrets&lt;/li&gt;
&lt;li&gt;Now, every git push builds + pushes automatically&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Deployment time dropped from 10 minutes to under one, and the tagging mistakes disappeared.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion — From Kitchen Chaos to Conveyor Belt
&lt;/h2&gt;

&lt;p&gt;We began with a bakery. You don’t want to bake, pack, and ship every loaf by hand. You want a conveyor belt that works reliably.&lt;/p&gt;

&lt;p&gt;GitHub Actions and Docker give you that belt. As soon as you push code, your automation handles the rest — repeatable, consistent, hands-off.&lt;/p&gt;

&lt;p&gt;Take this workflow file, plug in your secrets, push a change, and watch your pipeline come alive.&lt;/p&gt;

</description>
      <category>cicd</category>
      <category>devops</category>
      <category>githubactions</category>
      <category>node</category>
    </item>
    <item>
      <title>Docker Doesn’t Bite: A Beginner’s Guide</title>
      <dc:creator>Imthadh Ahamed</dc:creator>
      <pubDate>Tue, 23 Sep 2025 09:00:41 +0000</pubDate>
      <link>https://forem.com/imthadh_ahamed/docker-doesnt-bite-a-beginners-guide-39no</link>
      <guid>https://forem.com/imthadh_ahamed/docker-doesnt-bite-a-beginners-guide-39no</guid>
      <description>&lt;p&gt;When I first heard about Docker, I imagined it was only for “&lt;strong&gt;serious DevOps people&lt;/strong&gt;” running massive cloud systems. The truth? Docker is simply a smarter way to run apps without the headaches of manual setup. Instead of installing Node.js, PostgreSQL, or frameworks directly on your laptop, Docker lets you bundle everything your app needs into neat little containers like bento boxes for software. These containers run the same anywhere: your laptop, your friend’s machine, or a cloud server.&lt;/p&gt;

&lt;p&gt;In this post, I’ll share how I used Docker to spin up a PENN (PostgreSQL, Express/Node, Next.js) project in minutes, no messy installs required. We’ll walk through containerization basics, Docker Compose for multi-service apps, and even pushing images to Docker Hub so teammates can run the project instantly. If Docker has ever felt intimidating, think of this as a friendly guide — it’s easier than you think.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Docker?
&lt;/h2&gt;

&lt;p&gt;Docker is a tool that lets you package an application with everything it needs (code, libraries, and settings) into a container. A container is like a &lt;strong&gt;bento box&lt;/strong&gt; for software: it neatly holds your app and its ingredients so it can run the same way anywhere.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Docker Image&lt;/strong&gt; = the recipe (instructions + ingredients).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker Container&lt;/strong&gt; = the meal (a running instance of the recipe).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Isolation&lt;/strong&gt; = each bento box keeps its food separate, so one app doesn’t interfere with another.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Portability&lt;/strong&gt; = once packed, you can run the container on any computer with Docker installed — your laptop, a server, or the cloud.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In short, Docker makes apps &lt;strong&gt;consistent, portable, and easy to run&lt;/strong&gt;. No more “works on my machine” drama — if it works in your container, it’ll work everywhere.🐳&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Use It?
&lt;/h2&gt;

&lt;p&gt;Docker solves the “&lt;strong&gt;it works on my machine&lt;/strong&gt;” problem by bundling your app with all its dependencies. No more manual installs, version mismatches, or dependency hell. Everyone runs the same container, so the app behaves consistently everywhere. And if something breaks? Just restart or replace the container — your laptop stays clean.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvdnlnlgjcue0r7rizr2z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvdnlnlgjcue0r7rizr2z.png" alt=" " width="720" height="1080"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  When One Container Isn’t Enough: Enter Docker Compose
&lt;/h2&gt;

&lt;p&gt;Running a single app in a container is nice, but real projects often need multiple pieces — a database, an API, a frontend. That’s where &lt;strong&gt;Docker Compose&lt;/strong&gt; comes in. Think of it as the &lt;strong&gt;orchestra conductor&lt;/strong&gt; for containers: instead of starting each one manually, you describe everything in a &lt;code&gt;docker-compose.yml&lt;/code&gt; file, then run &lt;code&gt;docker compose up&lt;/code&gt;. Compose pulls images, builds your code, wires containers together on a private network, and manages startup order.&lt;/p&gt;

&lt;p&gt;For example, in a PENN stack (PostgreSQL, Express/Node, Next.js), your Compose file might define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;db&lt;/strong&gt; → runs Postgres from an official image, with a persistent volume for data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;backend&lt;/strong&gt; → builds from a Node image and connects to the database service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;frontend&lt;/strong&gt; → builds the Next.js app, mapped to port 3000.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With a single command, you have a database, server, and UI all talking to each other. Need Redis later? Just add a service — no local installs required.&lt;/p&gt;

&lt;p&gt;The best part? &lt;strong&gt;Consistency and speed&lt;/strong&gt;. A teammate can clone your repo, run &lt;code&gt;docker compose up&lt;/code&gt;, and be productive in minutes. No dependency hell, no OS-specific setup, just a clean, reproducible dev environment.🎉&lt;/p&gt;

&lt;p&gt;To illustrate, here’s a snippet of what a Compose file might look like for our &lt;strong&gt;PENN stack&lt;/strong&gt; (PostgreSQL, Express, Node, Next.js) example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;services:
  db:
    image: postgres:15
    environment:
      - POSTGRES_USER=myuser
      - POSTGRES_PASSWORD=mypassword
      - POSTGRES_DB=mydb
    volumes:
      - postgres-data:/var/lib/postgresql/data
    # Required: "condition: service_healthy" below only works if the
    # db service actually defines a healthcheck
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U myuser -d mydb"]
      interval: 5s
      timeout: 5s
      retries: 5

  backend:
    build: ./backend
    ports:
      - "8080:8080"
    environment:
      - DATABASE_URL=postgres://myuser:mypassword@db:5432/mydb
    depends_on:
      db:
        condition: service_healthy

  frontend:
    build: ./frontend
    ports:
      - "3000:3000"
    environment:
      - API_URL=http://localhost:8080
    depends_on:
      backend:
        condition: service_started

volumes:
  postgres-data:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
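&lt;p&gt;With that file in place, the whole stack is managed with a handful of commands (the service names match the Compose file above):&lt;/p&gt;

```shell
docker compose up --build -d    # build images and start db, backend, and frontend
docker compose ps               # check that all three services are running
docker compose logs -f backend  # follow the backend logs
docker compose down             # stop and remove the containers
```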



&lt;h2&gt;
  
  
  Docker as a Friendly Tool in Your Toolbox
&lt;/h2&gt;

&lt;p&gt;Docker might look magical at first, but it’s very practical magic. By wrapping your app and its dependencies into containers, you avoid the repetitive installs and endless “it works on my laptop” debates. Think of Docker as a kitchen assistant that preps everything ahead of time, or a shipping manager that delivers your package sealed and intact.&lt;/p&gt;

&lt;p&gt;The key benefits are simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Isolation&lt;/strong&gt; → run multiple apps with conflicting setups without clashing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt; → your app runs the same everywhere, from dev to production.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplicity&lt;/strong&gt; → with Docker Compose, even multi-service projects spin up with a single command.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For beginners, the trick is not to overthink it. Start small: containerize a simple app, then expand. Use Docker Hub to pull ready-made images or share your own. With just &lt;code&gt;docker compose up&lt;/code&gt;, you can run a database, backend, and frontend in minutes—no manual installs, no dependency chaos.&lt;/p&gt;

&lt;p&gt;Once you try it, Docker feels less like a scary sea monster and more like the friendly whale in its logo. It helps you ship software reliably and with less stress. So go ahead — dip your toes in. You’ll wonder how you ever coded without it.🐳&lt;/p&gt;

</description>
      <category>docker</category>
      <category>cicd</category>
      <category>containers</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Breaking Down Text for Better AI Processing: Why Chunk Size and Overlap Matter</title>
      <dc:creator>Imthadh Ahamed</dc:creator>
      <pubDate>Mon, 22 Sep 2025 08:13:37 +0000</pubDate>
      <link>https://forem.com/imthadh_ahamed/breaking-down-text-for-better-ai-processing-why-chunk-size-and-overlap-matter-530m</link>
      <guid>https://forem.com/imthadh_ahamed/breaking-down-text-for-better-ai-processing-why-chunk-size-and-overlap-matter-530m</guid>
      <description>&lt;p&gt;Before an AI model like GPT or Gemini can provide smart answers, summarize documents, or generate insights, the input text needs to be prepared carefully. This process, known as text preprocessing, makes sure the AI can process and comprehend the data you supply.&lt;/p&gt;

&lt;p&gt;Token limits are a major preprocessing challenge. Even sophisticated models can only handle a limited number of tokens in a single request, so they will struggle if you give them a whole book or research paper. At this point, chunking (splitting) — the process of dividing lengthy texts into manageable pieces — becomes crucial.&lt;/p&gt;

&lt;p&gt;Frameworks like LangChain include tools such as &lt;a href="https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html" rel="noopener noreferrer"&gt;RecursiveCharacterTextSplitter&lt;/a&gt; that are built for exactly this purpose. They minimize the loss of meaning while helping divide lengthy texts into digestible sections.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4e0hf7cxgzjk5l3rziz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4e0hf7cxgzjk5l3rziz.png" alt=" " width="720" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Split Text in NLP?
&lt;/h2&gt;

&lt;p&gt;Language models are powerful, but they don’t have infinite memory. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GPT-4 Turbo supports up to &lt;strong&gt;128,000 tokens&lt;/strong&gt; (roughly 300 pages of text).&lt;/li&gt;
&lt;li&gt;Other models may handle much less, sometimes only &lt;strong&gt;4,000–8,000 tokens&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When your text exceeds these limits, you need to split it. But if you split text carelessly, you risk:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Context loss&lt;/strong&gt;: Important details cut in half between chunks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic breaks&lt;/strong&gt;: Sentences or paragraphs split in the middle, making chunks harder to understand.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal of splitting is to preserve semantic coherence — keeping ideas whole and understandable — while staying within token limits.&lt;/p&gt;

&lt;p&gt;📱&lt;strong&gt;Example: Sending Long Messages on WhatsApp&lt;/strong&gt;&lt;br&gt;
Imagine you want to send your friend a long story over WhatsApp, but WhatsApp only lets you send 500 characters per message.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If you paste the whole story in one go, it won’t send (like hitting the AI’s token limit).&lt;/li&gt;
&lt;li&gt;So, you split the story into smaller messages (like chunks).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now two problems can happen if you split carelessly:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context loss&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You cut right in the middle of a sentence.&lt;/li&gt;
&lt;li&gt;Message 1: &lt;em&gt;“The hero opened the doo — ”&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Message 2: “ — &lt;em&gt;r and saw a dragon.&lt;/em&gt;” → The flow feels broken.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Semantic breaks&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You accidentally separate related parts.&lt;/li&gt;
&lt;li&gt;Message 1 ends with: “&lt;em&gt;The hero raised his sword.&lt;/em&gt;”&lt;/li&gt;
&lt;li&gt;Message 2 starts with something completely new: “&lt;em&gt;Meanwhile, in another city…&lt;/em&gt;” → Your friend might get confused because the action was cut too sharply.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;✅ The solution is to split at natural points (like at the end of a sentence or paragraph) and sometimes repeat a little overlap.&lt;br&gt;
For example, you might copy the last few words of one message into the next:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Message 1: “&lt;em&gt;…he opened the door and saw a dragon.&lt;/em&gt;”&lt;/li&gt;
&lt;li&gt;Message 2: “&lt;em&gt;He saw a dragon breathing fire across the room…&lt;/em&gt;”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That way, your friend remembers the scene, and the story feels smooth — just like preserving semantic coherence when chunking text for AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is RecursiveCharacterTextSplitter?
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html" rel="noopener noreferrer"&gt;RecursiveCharacterTextSplitter&lt;/a&gt; is a utility in LangChain (and similar NLP libraries) that intelligently splits large text into smaller parts.&lt;/p&gt;

&lt;p&gt;Here’s how it works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It tries to split text by larger natural boundaries first (paragraphs, sentences).&lt;/li&gt;
&lt;li&gt;If a chunk is still too big, it splits further down to smaller boundaries (words, then characters).&lt;/li&gt;
&lt;li&gt;This recursive process ensures the chunks are manageable but still meaningful.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In short, it’s like cutting a long story into chapters, then into scenes, and only as a last resort into lines — making sure each piece still makes sense.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code Breakdown
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(  
    chunk_size=1000,  
    chunk_overlap=200  
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What does this mean?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;chunk_size=1000&lt;/code&gt; → Each chunk will be about &lt;strong&gt;1,000 characters long&lt;/strong&gt;. Think of it as setting the length of each episode in your story.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;chunk_overlap=200&lt;/code&gt; → The end of one chunk overlaps with the next by &lt;strong&gt;200 characters&lt;/strong&gt;. This ensures continuity.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Real-World Example: Watching a TV Series
&lt;/h2&gt;

&lt;p&gt;Imagine your PDF is a long TV series. If you cut it into episodes without overlap, a scene might end halfway through Episode 1 and continue in Episode 2. Confusing, right?&lt;/p&gt;

&lt;p&gt;With overlap, &lt;strong&gt;the last 5 minutes of Episode 1 are replayed at the start of Episode 2&lt;/strong&gt;.&lt;br&gt;
👉 This way, you don’t forget what happened, and the story flows smoothly.&lt;/p&gt;

&lt;p&gt;That’s exactly what &lt;code&gt;chunk_overlap=200&lt;/code&gt; does: it repeats part of the previous text to keep context intact.&lt;/p&gt;
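&lt;p&gt;To make the idea concrete, here’s a minimal pure-Python sketch of fixed-size chunking with overlap. It is not LangChain’s actual algorithm (which prefers natural boundaries like paragraphs and sentences first), but it shows exactly what &lt;code&gt;chunk_size&lt;/code&gt; and &lt;code&gt;chunk_overlap&lt;/code&gt; control:&lt;/p&gt;

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    # Each new chunk starts (chunk_size - chunk_overlap) characters after the
    # previous one, so consecutive chunks share chunk_overlap characters.
    assert chunk_overlap >= 0 and chunk_size > chunk_overlap
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):  # the last chunk reached the end
            break
    return chunks

story = "The hero opened the door and saw a dragon breathing fire across the room."
for piece in chunk_text(story, chunk_size=30, chunk_overlap=10):
    print(repr(piece))
```

&lt;p&gt;Because each chunk repeats the tail of the previous one, context carries across every cut; that is exactly the role the 200-character overlap plays in the LangChain example.&lt;/p&gt;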

&lt;h2&gt;
  
  
  Practical Use Cases
&lt;/h2&gt;

&lt;p&gt;Chunking text isn’t just academic — it’s used in real projects every day:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Document Q&amp;amp;A systems&lt;/strong&gt;: Splitting a PDF or manual into chunks so an LLM can answer specific questions accurately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text summarization&lt;/strong&gt;: Breaking down large reports into smaller parts before generating summaries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector databases (FAISS, Pinecone, Chroma)&lt;/strong&gt;: Storing chunk embeddings for efficient semantic search and retrieval.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Training data preparation&lt;/strong&gt;: Splitting text before feeding it into custom NLP/LLM training pipelines.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Pros and Cons
&lt;/h2&gt;

&lt;p&gt;✅ &lt;strong&gt;Advantages&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maintains context continuity with overlaps.&lt;/li&gt;
&lt;li&gt;Preserves semantic meaning by splitting at logical points.&lt;/li&gt;
&lt;li&gt;Works well with vector databases and LLMs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;⚠️ &lt;strong&gt;Drawbacks&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Increased processing time: More chunks = more computations.&lt;/li&gt;
&lt;li&gt;Higher token cost: Overlaps mean some text is repeated, slightly increasing usage costs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Thoughtful text splitting is a cornerstone of effective AI text processing. By carefully choosing chunk size and overlap, you make sure your AI has just enough information in each piece without losing the bigger picture.&lt;/p&gt;

&lt;p&gt;While &lt;a href="https://dev.tourl"&gt;RecursiveCharacterTextSplitter&lt;/a&gt; is a go-to tool, alternatives exist: sentence-based splitters, semantic chunkers, or token-level splitters. The key is to balance chunk length and context preservation based on your use case.&lt;/p&gt;

&lt;p&gt;If you’re building anything from a chatbot to a summarizer, applying these chunking strategies will dramatically improve your results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading / Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://python.langchain.com/docs/modules/data_connection/document_transformers/" rel="noopener noreferrer"&gt;LangChain Text Splitters Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/docs/transformers/main/en/tokenizer_summary" rel="noopener noreferrer"&gt;HuggingFace Tokenization Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://platform.openai.com/docs/advanced-usage#managing-tokens" rel="noopener noreferrer"&gt;OpenAI Token Limits&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>rag</category>
      <category>chunk</category>
      <category>llm</category>
      <category>nlp</category>
    </item>
    <item>
      <title>Getting Started with Google Gemini Embeddings in Python: A Hands-On Guide</title>
      <dc:creator>Imthadh Ahamed</dc:creator>
      <pubDate>Mon, 22 Sep 2025 06:36:16 +0000</pubDate>
      <link>https://forem.com/imthadh_ahamed/getting-started-with-google-gemini-embeddings-in-python-a-hands-on-guide-bfi</link>
      <guid>https://forem.com/imthadh_ahamed/getting-started-with-google-gemini-embeddings-in-python-a-hands-on-guide-bfi</guid>
      <description>&lt;p&gt;Artificial Intelligence is evolving rapidly, and one of the most exciting areas is retrieval-augmented generation (RAG) and semantic search. At the heart of these systems lies a powerful concept: embeddings.&lt;/p&gt;

&lt;p&gt;In this article, I'll walk you through the process of generating embeddings using Google's Gemini API in Python. Don't worry if you're a beginner, we'll break it down step by step, with code you can run today.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Are Embeddings (with a real-world twist)?
&lt;/h2&gt;

&lt;p&gt;Imagine walking into a huge supermarket. Instead of wandering, you notice how items are grouped.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Fruits are in one section 🍎🍌🍇&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Vegetables in another 🥦🥕&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Bakery items together 🥖🍩&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even though &lt;em&gt;apple&lt;/em&gt; and &lt;em&gt;banana&lt;/em&gt; are different, they're close together in the fruit section because they &lt;em&gt;mean similar things&lt;/em&gt; (both are fruits).&lt;br&gt;
That's exactly what embeddings do for text. They put similar concepts near each other in a mathematical supermarket.&lt;br&gt;
So if you ask the system about &lt;em&gt;solar energy&lt;/em&gt;, it doesn't just look for the exact word "solar." It also finds &lt;em&gt;photovoltaics, sunlight power&lt;/em&gt;, or &lt;em&gt;renewable energy&lt;/em&gt;, because those live in the same "aisle."&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;Think of embeddings as a way to organize knowledge like a supermarket organizes groceries.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  What Are Embeddings (Technical View)?
&lt;/h2&gt;

&lt;p&gt;An embedding is a numerical representation of data (text, images, audio, etc.) in a continuous vector space. In NLP (natural language processing), embeddings are used to represent words, sentences, or documents as high-dimensional vectors of real numbers.&lt;/p&gt;
&lt;h3&gt;
  
  
  Key Properties
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Dimensionality&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each embedding is a vector of fixed length (768 dimensions for embedding-001 in Gemini).&lt;/li&gt;
&lt;li&gt;Example:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[0.12, -0.34, 0.89, ...]  # length = 768
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Semantic Proximity&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The geometry of the vector space encodes meaning.&lt;/li&gt;
&lt;li&gt;Texts with similar semantic meaning will have embeddings that are closer together (low cosine distance / high cosine similarity).&lt;/li&gt;
&lt;li&gt;Example:
&lt;ul&gt;
&lt;li&gt;"Solar energy" and "photovoltaics" → embeddings close in space&lt;/li&gt;
&lt;li&gt;"Solar energy" and "chocolate cake" → embeddings far apart&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Training Basis&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Embeddings are learned by large language models (LLMs) trained on massive corpora.&lt;/li&gt;
&lt;li&gt;The model optimizes representations such that semantically related text produces vectors with high similarity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Mathematical Use&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can compute distances between embeddings using:
&lt;ul&gt;
&lt;li&gt;Cosine similarity (most common)&lt;/li&gt;
&lt;li&gt;Euclidean distance&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
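&lt;p&gt;As a quick illustration of the cosine-similarity idea, here’s a self-contained sketch with tiny made-up 3-dimensional vectors (real Gemini embeddings have 768 dimensions, but the math is identical):&lt;/p&gt;

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(theta) = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional vectors, standing in for 768-dimensional embeddings
solar = [0.9, 0.1, 0.0]
photovoltaics = [0.85, 0.15, 0.05]
chocolate_cake = [0.0, 0.2, 0.95]

print(cosine_similarity(solar, photovoltaics))  # close to 1.0 (similar meaning)
print(cosine_similarity(solar, chocolate_cake))  # close to 0.0 (unrelated)
```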
&lt;h2&gt;
  
  
  Setting Up the Environment
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install google-generativeai python-dotenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;google-generativeai → The engine (Gemini API)&lt;/li&gt;
&lt;li&gt;python-dotenv → The cashier who checks your membership card (API key)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Inside your .env file, add:&lt;br&gt;
&lt;code&gt;GEMINI_API_KEY=your_api_key_here&lt;/code&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Writing the Python Code
&lt;/h3&gt;

&lt;p&gt;Here's our hands-on demo:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
import google.generativeai as genai
from dotenv import load_dotenv

# Load the API key from .env, dropping any stale value already in the
# environment so the .env file always wins
os.environ.pop("GEMINI_API_KEY", None)
load_dotenv(override=True)
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")

if not GEMINI_API_KEY:
    raise ValueError("GEMINI_API_KEY is not set in the environment variables.")
print("GEMINI_API_KEY is set.")

# Configure Gemini
genai.configure(api_key=GEMINI_API_KEY)

# Text chunks = our knowledge items
chunks = [
    "Chunk 1: Renewable energy comes from resources that are naturally replenished (sunlight, wind, rain, tides, waves, geothermal).",
    "Chunk 2: Solar energy is abundant and captured using photovoltaic panels. Wind energy uses turbines to generate electricity.",
    "Chunk 3: Geothermal energy comes from heat inside the Earth. Biomass energy is derived from organic materials."
]

# Generate embeddings
for i, chunk in enumerate(chunks):
    response = genai.embed_content(
        model="models/embedding-001",
        content=chunk,
        task_type="retrieval_document"
    )
    embedding = response["embedding"]
    print(f"\nEmbedding for Chunk {i + 1}:\n{embedding[:10]}...")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Breaking It Down
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;API Key Handling&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We use a &lt;code&gt;.env&lt;/code&gt; file for security; never hardcode keys.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Chunks of Text&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each chunk represents a small passage of knowledge. In real-world projects, chunks might be paragraphs from PDFs, product descriptions, or support docs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;embed_content&lt;/code&gt; Call&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Gemini model (&lt;code&gt;embedding-001&lt;/code&gt;) converts text into a 768-dimensional vector. We only print the first 10 numbers to keep output readable.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each chunk now has a unique vector representation. These embeddings can be stored in a vector database (like Pinecone, Weaviate, or ChromaDB) for semantic search or powering RAG pipelines.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Real-World Applications
&lt;/h2&gt;

&lt;p&gt;So why does this matter? Here are some examples where embeddings shine:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search Engines → Find relevant docs by meaning, not just keywords.&lt;/li&gt;
&lt;li&gt;Chatbots &amp;amp; RAG Systems → Retrieve context-aware answers.&lt;/li&gt;
&lt;li&gt;Recommendation Engines → Suggest similar products or articles.&lt;/li&gt;
&lt;li&gt;Clustering &amp;amp; Topic Modeling → Group similar content automatically.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Imagine building a renewable energy Q&amp;amp;A bot: the chunks above could serve as knowledge, and embeddings would help the bot fetch the right passage when a user asks, "How does geothermal energy work?"&lt;/p&gt;
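&lt;p&gt;The retrieval step of such a bot can be sketched in a few lines. The vectors below are made-up stand-ins for real &lt;code&gt;embed_content&lt;/code&gt; output (a real system would also embed the question, with &lt;code&gt;task_type="retrieval_query"&lt;/code&gt;); they are chosen only to illustrate the nearest-chunk lookup:&lt;/p&gt;

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical toy vectors standing in for genai.embed_content output
chunk_vectors = {
    "Chunk 1 (renewable energy overview)": [0.5, 0.5, 0.1],
    "Chunk 2 (solar and wind)": [0.9, 0.1, 0.1],
    "Chunk 3 (geothermal and biomass)": [0.1, 0.1, 0.9],
}
# Pretend this is the embedding of "How does geothermal energy work?"
query_vector = [0.15, 0.1, 0.85]

# Retrieval = pick the chunk whose embedding is most similar to the query
best_chunk = max(chunk_vectors, key=lambda name: cosine(query_vector, chunk_vectors[name]))
print(best_chunk)  # Chunk 3 (geothermal and biomass)
```

&lt;p&gt;A production pipeline delegates exactly this nearest-neighbor search to a vector database instead of a Python loop.&lt;/p&gt;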

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Embeddings are like the hidden language that bridges human words and machine understanding. With Google's Gemini API, creating them is no longer rocket science - it's just a few lines of Python.&lt;/p&gt;

&lt;p&gt;If you're planning to build your own &lt;strong&gt;AI-powered search, chatbot, or recommendation system&lt;/strong&gt;, embeddings will be at the core of it. This hands-on example is your first step toward building those advanced systems.&lt;/p&gt;

</description>
      <category>gemini</category>
      <category>nlp</category>
      <category>rag</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
