Forem: Omair Ahmed

I Built a CLI Tool That Makes Text Analysis Beautiful (And You Won't Believe How Simple It Is)

Omair Ahmed — Fri, 19 Dec 2025 09:30:25 +0000

AI Disclaimer: This article was written with AI assistance to document a real open-source project.

The Hook That Changed Everything

I was knee-deep in analyzing a 50,000-word manuscript when it hit me: why is text analysis still so ugly?

You know the drill. You pipe some text through grep, maybe write a quick Python script with Counter, dump the results to a CSV, open Excel, create a chart... and by the time you're done, you've forgotten what you were even looking for.

What if I told you there's a better way? What if analyzing word frequency could be as simple as typing one command and watching your terminal light up with colorful, interactive visualizations?

I couldn't find that tool. So I built it.

The Problem Nobody Talks About

Here's the truth: most data analysis tools treat the terminal like it's 1985.

We have beautiful web dashboards, stunning Jupyter notebooks, and slick GUI applications. But when you're working in the terminal—where developers actually spend most of their time—you get this:

$ python analyze.py document.txt
the: 1247
and: 892
to: 654
of: 543
...

Boring. Uninspiring. Impossible to understand at a glance.

The terminal is powerful. It's fast. It's universal. But somewhere along the way, we accepted that it had to be ugly.

I refused to accept that.

The Breakthrough: Beauty Meets Performance

Enter WordFlow—a lightweight CLI tool that transforms text analysis from a chore into a visual experience.

Here's what blew my mind when I finished building it:

# The entire core algorithm in ~30 lines
import re
from collections import Counter
from termcolor import colored

def analyze_text(text, top_n=10):
    """Extract word frequency with blazing speed"""
    # Tokenize and normalize
    words = re.findall(r'\b[a-z]+\b', text.lower())

    # Count with Python's optimized Counter
    word_counts = Counter(words)

    # Get top N words
    top_words = word_counts.most_common(top_n)

    return top_words

def visualize_bars(word_data, max_bar_length=50):
    """Create beautiful terminal bar charts"""
    if not word_data:
        return  # Nothing to draw when no words were found
    max_count = word_data[0][1]

    for word, count in word_data:
        # Calculate proportional bar length
        bar_length = int((count / max_count) * max_bar_length)

        # Generate colored bars
        bar = colored('█' * bar_length, 'cyan')

        # Format output with padding
        print(f"{word:15} {bar} {count}")

That's it. That's the core of WordFlow.

No heavy frameworks. No bloated dependencies. Just clean Python that does one thing exceptionally well.
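To see what that one-line tokenizer keeps and what it drops, here is a small check. It reuses the regex from analyze_text; the sample sentence is made up for illustration and is not from the WordFlow repo.

```python
import re
from collections import Counter

# Same pattern as analyze_text: runs of ASCII letters between word boundaries
WORD_RE = re.compile(r'\b[a-z]+\b')

# Made-up sample text; \u00e9 is the letter "e" with an acute accent
sample = "Don't stop. Web3 caf\u00e9, stop!"

tokens = WORD_RE.findall(sample.lower())
top = Counter(tokens).most_common(1)

print(tokens)  # contraction is split, "web3" and the accented word are skipped
print(top)
```

Contractions split at the apostrophe, and any token containing a digit or a non-ASCII letter is skipped entirely, because the word boundary cannot fall between two word characters. That is worth knowing before running the tool on non-English text.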

The Magic in the Details Nobody Notices

Here's where it gets interesting. The real challenge wasn't just counting words—it was making the experience delightful. Let me show you the three details that make WordFlow special:

1. Smart Color Mapping

Instead of random colors, I implemented a gradient system:

def get_color_for_rank(rank, total):
    """Color intensity based on word frequency ranking"""
    if rank <= total * 0.2:  # Top 20%
        return 'green'
    elif rank <= total * 0.5:  # Top 50%
        return 'cyan'
    elif rank <= total * 0.8:  # Top 80%
        return 'yellow'
    else:
        return 'white'

The most frequent words pop with green. Less frequent words fade to white. Your eyes are naturally drawn to what matters.
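As a quick illustration of where the thresholds fall, this sketch repeats the function and maps a hypothetical top-10 list to colors. The list size of 10 is only an example.

```python
def get_color_for_rank(rank, total):
    """Same thresholds as the function above, repeated so this runs standalone."""
    if rank <= total * 0.2:
        return 'green'
    elif rank <= total * 0.5:
        return 'cyan'
    elif rank <= total * 0.8:
        return 'yellow'
    else:
        return 'white'

# Ranks are 1-based: rank 1 is the most frequent word
colors = [get_color_for_rank(rank, 10) for rank in range(1, 11)]

for rank, color in enumerate(colors, 1):
    print(f"{rank:2} -> {color}")
```

With ten words, ranks 1 and 2 come out green, 3 to 5 cyan, 6 to 8 yellow, and 9 and 10 white.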

2. Adaptive Bar Scaling

Here's something subtle: WordFlow automatically adjusts bar lengths based on your terminal width.

import os

def get_terminal_width():
    """Dynamically adjust to terminal size"""
    try:
        columns = os.get_terminal_size().columns
        return max(min(columns - 30, 50), 20)  # Reserve space for labels
    except OSError:  # Output is not attached to a terminal (e.g. piped)
        return 50  # Sensible default

Whether you're on a laptop screen or a 4K monitor, the visualization always looks perfect.
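A possible variation, sketched here as an assumption rather than WordFlow's actual code, separates the clamping arithmetic from the terminal query. It uses the standard library's shutil.get_terminal_size, which accepts a fallback size for when no terminal is attached. The helper names clamp_bar_width and get_bar_width are hypothetical.

```python
import shutil

def clamp_bar_width(columns, label_space=30, lo=20, hi=50):
    """Keep the space left for bars within [lo, hi] columns."""
    return max(min(columns - label_space, hi), lo)

def get_bar_width():
    # The fallback size is used when the width cannot be determined,
    # for example when output is piped to a file
    columns = shutil.get_terminal_size(fallback=(80, 24)).columns
    return clamp_bar_width(columns)

# How the clamp behaves for a few terminal widths
widths = {cols: clamp_bar_width(cols) for cols in (40, 60, 80, 200)}
print(widths)
print(get_bar_width())
```

Because the bar width is capped at 50, every terminal of 80 columns or more gets the same chart; only narrower terminals see the bars shrink, down to a floor of 20.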

3. Streaming for Large Files

The early version choked on files over 100MB. The fix was elegant:

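def stream_analyze(filepath, chunk_size=8192):
    """Process massive files without memory overflow"""
    counter = Counter()
    tail = ''  # Unfinished word carried over from the previous chunk

    with open(filepath, 'r', encoding='utf-8') as f:
        while chunk := f.read(chunk_size):
            chunk = tail + chunk.lower()
            # A read can stop mid-word, so hold back the trailing
            # run of word characters until the next chunk arrives
            cut = re.search(r'\w*\Z', chunk).start()
            chunk, tail = chunk[:cut], chunk[cut:]
            counter.update(re.findall(r'\b[a-z]+\b', chunk))

    # Count the word left over after the final chunk
    counter.update(re.findall(r'\b[a-z]+\b', tail))
    return counter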

Now it handles gigabyte-sized files without breaking a sweat.

The Stack: Less is More

I kept the dependencies minimal on purpose:

Python 3.8+ – The runtime (the streaming code uses the := operator)
termcolor – The only third-party dependency, for colored output
argparse – Built in, for clean CLI argument parsing
re & collections – Built-in Python modules doing the heavy lifting

No NumPy. No Pandas. No bloated machine learning libraries.

Total package size? Less than 50KB.

This thing installs in seconds and runs on anything from a Raspberry Pi to a cloud server.

The Technical Deep Dive

Want to understand how it really works? Here's the complete flow:

#!/usr/bin/env python3
import re
import argparse
from collections import Counter
from termcolor import colored

def main():
    # Parse command-line arguments
    parser = argparse.ArgumentParser(
        description='Analyze word frequency with beautiful visualizations'
    )
    parser.add_argument('file', help='Text file to analyze')
    parser.add_argument('-n', '--top', type=int, default=10,
                       help='Number of top words to display')
    parser.add_argument('--no-color', action='store_true',
                       help='Disable colored output')

    args = parser.parse_args()

    # Read and process file
    with open(args.file, 'r', encoding='utf-8') as f:
        text = f.read()

    # Analyze word frequency
    words = re.findall(r'\b[a-z]+\b', text.lower())
    word_counts = Counter(words)
    top_words = word_counts.most_common(args.top)

    # Display results
    print(f"\n📊 Top {args.top} words in {args.file}:\n")

    if not top_words:
        print("No words found.")
        return

    max_count = top_words[0][1]
    max_bar_length = 50

    for rank, (word, count) in enumerate(top_words, 1):
        bar_length = int((count / max_count) * max_bar_length)

        if not args.no_color:
            if rank <= 3:
                bar = colored('█' * bar_length, 'green')
            elif rank <= 7:
                bar = colored('█' * bar_length, 'cyan')
            else:
                bar = colored('█' * bar_length, 'yellow')
        else:
            bar = '█' * bar_length

        print(f"{rank:2}. {word:15} {bar} {count:,}")

    print()

if __name__ == '__main__':
    main()

Clean. Readable. Maintainable.

Lessons I Learned Building This

1. Simplicity Scales

I started with complex features—stopword filtering, stemming, TF-IDF scores. Stripped them all out. The simple version is what people actually use.

2. The Terminal is Underrated

We've been conditioned to think GUIs are superior. But for quick analysis? Nothing beats typing python wordflow.py document.txt and getting instant results.

3. Visual Feedback Matters

The difference between plain text output and colored bar charts isn't just aesthetic—it's cognitive. Your brain processes visual hierarchies faster than reading numbers.

4. Performance Through Minimalism

By avoiding heavy dependencies, WordFlow starts instantly. No import lag. No initialization overhead. Just pure speed.

The Numbers Don't Lie

Since releasing WordFlow:

50+ GitHub stars in the first month
Sub-50ms analysis time for most documents
One third-party dependency (termcolor) on top of the Python standard library
Works on Linux, macOS, and Windows out of the box

Try It Yourself

Installation is stupidly simple:

git clone https://github.com/omairqazi29/wordflow.git
cd wordflow
pip install -r requirements.txt

# Analyze any text file
python wordflow.py sample.txt

# Show top 20 words
python wordflow.py sample.txt -n 20

# Disable colors for piping
python wordflow.py sample.txt --no-color > results.txt

What's Next?

I'm working on:

Export formats – JSON, CSV, and Markdown output
Advanced filtering – Custom stopwords, regex patterns
Multiple files – Compare word frequencies across documents
Language support – Unicode handling for non-English text

But here's the thing: I'm not adding features unless they maintain the core simplicity.

WordFlow will always be lightweight. It will always be fast. And it will always make text analysis beautiful.

The Challenge

I challenge you to:

Clone the repo – Take 2 minutes to try it
Analyze your writing – Run it on your blog posts, documentation, or code comments
Share what you discover – What patterns did you find in your own work?

Because here's what I learned building WordFlow: the best tools don't just solve problems—they reveal insights you didn't know you were missing.

Your Turn

What terminal tools do you wish were more beautiful? What analysis tasks feel unnecessarily complicated?

Drop a comment below. Or better yet, fork WordFlow and make it your own.

The code is open source. The future is collaborative. And the terminal doesn't have to be boring.

Star the repo: github.com/omairqazi29/wordflow

Follow me for more: Building tools that make developers' lives better, one CLI at a time.

Found this useful? Clap it up and share with a friend who needs better text analysis tools.

Let's make the terminal beautiful again.

I Had 24 Hours, Zero Templates, and a Vision. Here's What I Built.

Omair Ahmed — Fri, 19 Dec 2025 09:16:05 +0000

The story of an interactive 3D orb, a terminal that talks back, and why I stopped settling for "good enough."

Disclaimer: This article was written with the help of AI, transforming my scattered development notes into a coherent narrative. The project, the code, the design decisions… those are all mine.

It Started With Frustration

I’ve built websites for clients that generated millions in revenue. Enterprise dashboards. AI platforms. The works.

But when I looked at my own company’s website? It was… fine. Functional. Forgettable.

That’s the trap, isn’t it? We give our best work to everyone else.

Today, I decided that was over. I gave myself 24 hours. No templates. No shortcuts. Just me, a blank editor, and a single question:

What if a website could feel alive?

The Problem With “Professional” Websites

Every tech company website looks the same. Hero section with gradient background. Stock photo of diverse people pointing at a laptop. Three-column feature grid. Contact form that feels like filling out a tax return.

It’s visual white noise. Users scroll, glaze over, leave.

I wanted something different. Something that would make a developer pause mid-scroll and think, “Wait, how did they do that?”

Something that would make a potential client feel like they were already experiencing what we could build for them, or something better than what they were expecting.

The 3D Orb That Took 3 Hours

The hero section needed a centerpiece. Not an illustration. Not a video. Something interactive.

I envisioned a floating orb: layered hexagons rotating in 3D space, responding to the user’s cursor, surrounded by orbiting code snippets. It would breathe. It would react. It would make people want to play with it.

The first version looked like a spinning loading icon from 2005.

The second version made my browser crash.

The third version… something clicked.

I realized I didn’t need WebGL or Three.js. I could fake convincing 3D with pure CSS transforms and clever math:

const rotateX = (mouseY - 0.5) * 30 * rotationSpeed
const rotateY = (mouseX - 0.5) * 30 * rotationSpeed

Suddenly, the orb was alive. It tracked my mouse. The layers rotated at different speeds, creating depth. Twelve particles floated around it, each moving independently. Code snippets: async/await, docker run, SELECT *, orbited like satellites.

300 lines of code. Zero external dependencies. 60 frames per second.

When I added the click ripple effect, I actually whispered “yes” to myself in an empty room.

Turning a Contact Form Into a Conversation

Forms are transactional. You input data. You click submit. You wait.

But what if a form told a story?

I built the contact form as a terminal interface. When you start, you see:

> fazper init --contact
> Initializing secure connection…
> Enter your name: _

Each field appears one at a time, like commands executing in sequence. Your responses become part of the terminal output. When you submit:

> Encrypting payload with AES-256…
> Dispatching to Fazper engineering team…
████████████████████ 100%
> Request ID: lx7k9m2 generated
> Status: SUCCESS
> < 24 hour Response ETA

It’s still a form. It still collects the same information. But now it feels like something is happening. Like you’re interfacing with a system, not filling out a spreadsheet.

316 lines of state management. Seven distinct states. One seamless experience.

A friend tested it and said, “I actually wanted to fill this out.” That’s when I knew it worked.

The Details Nobody Notices (But Everyone Feels)

Great experiences are built on invisible decisions.

The scroll animations: I wrote a custom component with six animation variants. Elements fade up, scale in, blur into focus. But only once. Only when you first see them. The IntersectionObserver disconnects after triggering because wasted resources are wasted experiences.

The color system: I used OKLCH, a perceptually uniform color space. Most developers haven’t heard of it. But it’s why the gradients feel natural and the dark mode doesn’t look like an afterthought.

The sticky header: It starts transparent. As you scroll, it gains a frosted glass effect. The transition is 300ms with ease-out timing. You don’t notice the header. You notice that the site feels polished.

The stats counter: Numbers animate from zero to their target over two seconds. But only when the section enters your viewport. It’s a small dopamine hit. “500+ projects delivered” feels more real when you watch it count up.

The Stack (For the Curious)

React 19.2.0: Yes, the latest. It works.
Next.js 16: App Router, server components where they make sense.
TypeScript: Strict mode. Non-negotiable.
Tailwind CSS v4: With OKLCH color functions.
Zero animation libraries: Just requestAnimationFrame and IntersectionObserver.

The entire site is fast because I refused to add dependencies I didn’t need.

What 24 Hours Taught Me

Constraints unlock creativity: No WebGL meant I had to invent solutions. The orb exists because I couldn’t take the easy path.
Details compound: One smooth animation is nice. Fifty coordinated micro-interactions create magic.
Build for yourself like you build for clients: We give clients our best work but showcase ourselves like leftovers. That’s hypocrisy.
“Good enough” is the enemy: Every tech company has a website. Almost none of them are memorable. The gap between forgettable and remarkable is smaller than you think. It just requires giving a damn.

The Real Reason I’m Sharing This

I could have kept this as just another portfolio piece. But here’s what I’ve learned after years of building:

The best marketing is proof.

You can say you build exceptional software. Or you can show it.

Every hover effect, every animation, every line of code on the Fazper website is a demonstration. If we obsess over details on our own site, imagine what we build when someone’s paying us.

One Last Thing

After I deployed, I sat back and just… scrolled. Watched the orb respond to my mouse. Stepped through the terminal form. Let the animations cascade.

For the first time in a long time, I felt proud of something I built for myself.

If you’re a developer reading this: build something tonight that isn’t for a client, isn’t for money, isn’t for anyone’s approval. Build it because you want to see if you can.

That’s how you remember why you started.

Fazper LLC builds AI solutions, custom software, and data platforms for companies who refuse to settle. If our contact form is this thoughtful, imagine what we’ll build for you.

Visit fazper →

Using ML to Predict Credit Card Defaults

Omair Ahmed — Fri, 19 Dec 2025 09:02:18 +0000

Have you ever wondered what the “system” does to approve you instantly when you apply for a credit limit increase? Does it make an API call to ChatGPT with your credit report, or do banks run their own AI/ML systems? Banks today probably use a complex solution for this problem, which I have tried to touch on by developing my own ML analysis for predicting credit card defaults.

The average American holds 3.9 credit cards, and in the States alone there are hundreds of millions of active cards. For each of these, banks, credit unions, and other issuers do one common but critical thing: run complex calculations to determine which clients may fail to make their next payment. The approach needs to be precise. Too liberal, and defaults pile up and hurt the bank’s business; too conservative, and the bank earns lower profits and risks losing good customers.

Problem Summary

The project’s question is straightforward: predict whether a credit card client will default on their payment next month. This is a binary classification problem in which my ML model needs to flag risky accounts while minimizing false alarms. Banks use more complex versions of these predictions to adjust credit limits, put accounts under financial review, or hold accounts before losses occur.

Exploratory Data Analysis

I was provided a dataset of 30,000 Taiwanese credit card clients from 2005. It contained 24 features, including demographics, credit limits, payment history, and billing information. The variables I identified as most important for my project were the payment status indicators across six months (PAY_0 through PAY_6), which record whether each client paid on time, late, or early.

The dataset also showed a significant class imbalance, as Figure 1 below illustrates. About 77.7% of clients didn’t default while 22.3% did. This roughly 3.5:1 ratio steered me away from accuracy as a metric: a dummy model could reach about 78% accuracy by always predicting “no default.” Figure 2 shows the feature distributions. Here we see huge scale differences: credit limits range from 10,000 to over 1 million, while payment status values span a much smaller range, from -2 to 8. This prompted the need for feature scaling.

Figure 1: Distribution of target variable showing 77.7% non-default vs 22.3% default

Figure 2: Feature distributions comparing default and non-default groups across key variables
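The baseline arithmetic can be checked in a few lines. This is a sketch that works from the rounded percentages quoted above, so the counts are approximations rather than the dataset's exact figures.

```python
n_clients = 30_000
default_rate = 0.223  # rounded share of defaulters quoted above

# Approximate class counts implied by the rounded rate
n_default = round(n_clients * default_rate)
n_no_default = n_clients - n_default

# A model that always predicts "no default" is right for every non-defaulter
baseline_accuracy = n_no_default / n_clients
imbalance_ratio = n_no_default / n_default

# ...and it catches none of the defaulters
baseline_recall = 0 / n_default

print(f"baseline accuracy: {baseline_accuracy:.1%}")
print(f"imbalance ratio:   {imbalance_ratio:.2f}:1")
print(f"defaulters caught: {baseline_recall:.0%}")
```

High accuracy with zero defaulters caught is exactly why accuracy alone is a misleading yardstick here.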

How I approached the problem

I started by engineering new features to capture patterns in the raw data. These included credit utilization ratio (just like in our credit reports), average payment status across all months, average bill amount, and average payment amount.
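A minimal sketch of how such features could be derived for one client record follows. The raw column prefixes (LIMIT_BAL, PAY_, BILL_AMT, PAY_AMT), the output names other than AVG_PAY_STATUS, and the definition of utilization as average bill divided by credit limit are all assumptions for illustration; the project's actual definitions may differ.

```python
from statistics import mean

def engineer_features(row):
    """Derive four summary features from one client record (a dict)."""
    # Payment status columns start with PAY_ but are not payment amounts
    pay_status = [v for k, v in row.items()
                  if k.startswith("PAY_") and not k.startswith("PAY_AMT")]
    bills = [v for k, v in row.items() if k.startswith("BILL_AMT")]
    payments = [v for k, v in row.items() if k.startswith("PAY_AMT")]

    avg_bill = mean(bills)
    return {
        "AVG_PAY_STATUS": mean(pay_status),
        "AVG_BILL_AMT": avg_bill,
        "AVG_PAY_AMT": mean(payments),
        # Assumed definition: average bill as a fraction of the credit limit
        "CREDIT_UTILIZATION": avg_bill / row["LIMIT_BAL"],
    }

# Made-up client with only two months of history to keep the example short
client = {
    "LIMIT_BAL": 100_000,
    "PAY_0": 2, "PAY_2": 0,
    "BILL_AMT1": 60_000, "BILL_AMT2": 40_000,
    "PAY_AMT1": 2_000, "PAY_AMT2": 4_000,
}
features = engineer_features(client)
print(features)
```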

After dropping the ID column and the SEX column (to avoid gender bias), I tested four model types:

Logistic Regression: a baseline linear classifier
Random Forest: an ensemble of decision trees
Gradient Boosting: sequential tree building in which each tree corrects the previous ones’ errors
K-Nearest Neighbors: instance-based learning using similar examples

I applied hyperparameter tuning to each model using 5-fold cross-validation to find its optimal configuration. I used ROC-AUC (Area Under the Receiver Operating Characteristic Curve) as my primary metric because it evaluates model performance across all classification thresholds, which makes it better suited to imbalanced datasets than accuracy.
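One way to build intuition for ROC-AUC is its pairwise reading: the share of (defaulter, non-defaulter) pairs in which the defaulter receives the higher risk score. The sketch below computes it that way in plain Python on made-up labels and scores. It is an illustration of the metric, not the evaluation code used in the project.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count as half a correct pair. Runs in O(n_pos * n_neg),
    which is fine for a small demonstration.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up data: 1 = default, scores are predicted risk
y_true = [0, 0, 0, 0, 1, 1]
model_scores = [0.1, 0.2, 0.6, 0.3, 0.7, 0.4]
constant_scores = [0.0] * len(y_true)  # "always predict no default"

model_auc = roc_auc(y_true, model_scores)
constant_auc = roc_auc(y_true, constant_scores)
print(model_auc, constant_auc)
```

A scorer that gives everyone the same risk ties on every pair and lands at 0.5 regardless of the class balance, which is why an always-"no default" model cannot hide behind the imbalance on this metric.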

Findings

My tuned Gradient Boosting model was the clear winner with a test ROC-AUC of 0.7821, an F1 score of 0.4784, and an accuracy of 82.12%. To put this into perspective, the baseline model that always predicted “no default” achieved 77.68% accuracy (as expected, given the class imbalance described earlier) but had a ROC-AUC of exactly 0.5, which is equivalent to guessing on a coin toss.

Figure 3 below shows the feature importance analysis. PAY_0, the most recent payment status, was by far the most critical feature and accounted for 53% of the model’s decision-making. The feature I engineered, AVG_PAY_STATUS, was the second most important at 20%. Together these two accounted for over 72% of total feature importance, with a steep drop-off after. Credit utilization, average bill amount, and average payment amount also contributed smaller but meaningful signals at around 3% each.

Figure 3: Feature importance showing PAY_0 (53%) and AVG_PAY_STATUS (20%) as dominant predictors

Looking at individual predictions, one client who was correctly flagged as high-risk had PAY_0 = 2 (meaning the payment was delayed by two months), AVG_PAY_STATUS = 2.0 (indicating consistent delays), and credit utilization of 91.7%. On the other hand, a client who was correctly classified as low-risk showed PAY_0 = -1 (meaning the bill was paid duly), a negative average payment status (-0.33), and a more moderate utilization of 72.5%.

Limitations

These results may sound great, but some factors can make the model less reliable in production. I will list the two that come to mind first (not necessarily the most important):

1. Data’s age and locality: The dataset is nearly two decades old. Credit card behavior, economic conditions, and lending practices have evolved significantly since then. What worked for Taiwanese consumers in 2005 may not apply today or in other regions. The US, for example, has significantly higher credit card spending.

2. Personal bias in engineering features: My engineered features, like AVG_PAY_STATUS and credit utilization, rest on my assumption that averaging payment behavior across months is meaningful. This is not necessarily true today. There are Reddit posts that teach how to get a massive credit limit by being a good customer for the first 6 months, then maxing the card out after being approved for a big limit increase. An average smooths over exactly that pattern: it will not reflect the sudden jump in utilization or the recent delayed payments.
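A tiny worked example makes the second limitation concrete. The six-month history below is made up, listed oldest month first, and uses the same status encoding as the dataset (negative values mean no delay, 2 means two months late).

```python
from statistics import mean

# Made-up history: five clean months, then a two-month delay
history = [-1, -1, -1, -1, -1, 2]

avg_status = mean(history)  # what an averaged feature would report
latest_status = history[-1]  # what actually happened most recently

print(f"average status: {avg_status}")
print(f"latest status:  {latest_status}")
```

The average of -0.5 is even lower than the -0.33 of the low-risk client described in the Findings, yet the most recent month shows a two-month delay. An averaged feature on its own would describe this client as a good payer.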

Wrapping Up

We can improve the dataset by collecting more features, for example economic indicators like regional unemployment rates, or seasonal effects in the client’s life that may influence default patterns. Crucial life-changing events like job loss, medical emergencies, and relationship issues can also influence defaulting. Banks usually offer insurance for such events, so a model trained purely on the historical patterns in the client’s credit report may perform well in stable conditions.