Don Johnson

Vibe Coding with Intent: A Tester’s Perspective

There’s a lot of noise right now around vibe coding.

People are generating entire apps off vibes, using LLMs to go from idea to code in minutes. And I’ll admit—it’s impressive. But having been in testing for over 20 years, I couldn’t help but notice something was missing.

The energy is real. The speed is addictive. But the stability? The reliability? Often nowhere in sight.

That’s where I come in.


Who I Am

I’ve been a tester most of my life. My job has always been to make sure products deliver on their promise—not just in ideal demos, but in real-world use.

I’ve seen every testing angle: automation, exploratory, performance, security, chaos. And when vibe coding entered the chat, I was curious.

I started building with it.

I started testing it.

And I started breaking it—easily.


The Problem with Raw Vibes

Vibe coding is missing a backbone. It’s missing intention. The AI can generate, but can it validate? Can it correct itself? Can it know when it’s broken?

Not unless we give it the tools to do so.


How I Made It Work

That’s where my tester mindset kicked in. I created a project called vibe-todo to prove that vibe coding can be viable—if it’s guided by structure and testing.

Here’s what made the difference:

  • Hypothesis tests as self-correction checkpoints
  • A .windsurfrules file that tells the AI how to behave
  • Benchmarks with SLA enforcement (10ms max per op)
  • Logs that are structured for LLMs to observe and react to
  • A Grammatical Evolution engine that generates edge cases dynamically

Yes, I even wired in a GE-based evolutionary testing suite using DEAP to evolve inputs over time. This thing will literally mutate payloads to find failures your unit tests missed.
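
To make this concrete, here is a heavily simplified sketch of what evolutionary input generation with DEAP can look like. To be clear about what's assumed: this is not the vibe-todo code, it evolves raw strings with a plain genetic algorithm rather than decoding a grammar-driven GE genome, and add_task is a hypothetical stand-in for the operation under test.

import random
import string

from deap import algorithms, base, creator, tools

def add_task(title):
    # Hypothetical stand-in for the real operation under test.
    if not title.strip():
        raise ValueError("title must not be blank")

# Maximize fitness: higher scores for inputs that misbehave.
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("attr_char", random.choice, string.printable)
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.attr_char, n=20)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)

def evaluate(individual):
    payload = "".join(individual)
    try:
        add_task(payload)
        return (0.0,)   # handled cleanly: boring
    except ValueError:
        return (0.5,)   # expected rejection: mildly interesting
    except Exception:
        return (1.0,)   # unexpected crash: jackpot

toolbox.register("evaluate", evaluate)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutShuffleIndexes, indpb=0.1)
toolbox.register("select", tools.selTournament, tournsize=3)

pop = toolbox.population(n=50)
algorithms.eaSimple(pop, toolbox, cxpb=0.5, mutpb=0.2, ngen=10, verbose=False)

In the real suite the genome would be decoded through a grammar (that's the "grammatical" in GE), but the loop is the same: evaluate, select, mate, mutate, repeat.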

And just like that—LLM code went from vibes to validated.


Real Talk: What Happens Without Tests?

Let me give you an example.

The LLM once generated a beautiful CRUD API. Looked perfect.

Until I ran it.

POST requests with empty strings didn’t trigger validation. Worse—no unit test existed to catch it. I only saw the issue when Hypothesis tests started failing fast.
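
Here is roughly the shape of the property test that surfaced it. A minimal sketch, assuming a pytest + Hypothesis setup; add_task is a hypothetical stand-in for the generated handler. The point is that Hypothesis generates the blank and whitespace-only strings a human rarely thinks to type.

import pytest
from hypothesis import given, strategies as st

def add_task(title):
    # Hypothetical stand-in for the LLM-generated handler.
    if not title.strip():
        raise ValueError("title must not be blank")
    return {"title": title}

@given(st.text())
def test_add_task_validates_titles(title):
    if title.strip():
        assert add_task(title)["title"] == title
    else:
        # Blank and whitespace-only titles must be rejected.
        with pytest.raises(ValueError):
            add_task(title)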

This wasn’t an edge case. This was a gap.

Without testing, LLMs hallucinate correctness.


Structured Logs = Observability for LLMs

I output structured JSON logs with every operation:

{
  "operation": "add_task",
  "duration_ms": 8,
  "sla_pass": true,
  "timestamp": 1713040000.01
}

This lets Windsurf agents watch the output, check SLA compliance, and even suggest fixes based on violation patterns.

It's like putting Grafana in your LLM’s head.
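
Emitting these logs takes only a few lines. A minimal sketch (the decorator name and the exact fields are illustrative, not lifted from vibe-todo):

import json
import time
from functools import wraps

SLA_MS = 10  # the 10ms-per-op budget mentioned above

def observed(operation):
    # Decorator: time the call and emit one structured JSON log line.
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            duration_ms = (time.perf_counter() - start) * 1000
            print(json.dumps({
                "operation": operation,
                "duration_ms": round(duration_ms, 2),
                "sla_pass": duration_ms <= SLA_MS,
                "timestamp": time.time(),
            }))
            return result
        return wrapper
    return decorator

@observed("add_task")
def add_task(title):
    return {"title": title}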


What Every Vibe Coder Should Add

You don’t need 20 years of testing experience to strengthen your vibes. Start here:

Minimal Testing Manifesto:

  • .windsurfrules (your AI coding contract; see the example after this list)
  • Hypothesis tests for behavior validation
  • Regression tests for known fail states
  • Benchmark tests with SLA enforcement
  • Grammatical Evolution (GE) for fuzzing + edge discovery
  • Structured logs for agent feedback
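
For the first item, the rules file can be as plain as a list of instructions the agent must follow on every change. A hypothetical .windsurfrules to adapt to your own project:

# .windsurfrules (illustrative example)
- Write a failing test before implementing any new behavior.
- Every public function validates its inputs; blank or empty values are rejected.
- Every operation emits a structured JSON log: operation, duration_ms, sla_pass, timestamp.
- No operation may exceed its latency budget; add a benchmark when touching hot paths.
- When a test fails, fix the code, not the test, unless the test is provably wrong.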

If you’re doing even half of this—you’re ahead of the game.


My Ask to the Community

Don’t abandon vibe coding.

Just add intent.

Make your vibes testable. Add rules. Use benchmarks. Think like a tester.

And then? Let the LLM earn that green check.


Closing

I've tested software for decades. And I can tell you this:

Code that feels good isn’t always good.

But code that’s structured, tested, benchmarked—and still feels good?

That’s the future.

Vibe coding doesn’t need to be risky. It can be resilient.

It can be with intent.


Built with vibes. Verified by tests. Ready for prod.


Top comments (4)

Melroy van den Berg

I think you are bringing up great ideas and improvements here. You're right: right now I have to manually validate the code, and even manually validate the generated (unit) tests. I don't trust the output, so most of the work is now in QA.

So your manifesto is very helpful, insightful and inspiring. I will definitely try to apply your ideas.

Brian B.

A little tip from a veteran vibe coder, before it was even a thing: have your LLM create a list of issues in issue form. It's often difficult to get the model to get really granular with its planning at times, but a way to kinda trick it into looking closer and deeper is making it create thought-out issues by category.

It also aids the planning stage in figuring out the ordering as well. Here is a little peek at my Obsidian notebook showing a little section of what I'm talking about:

[screenshot: excerpt from an Obsidian notebook with issues grouped by category]

Ricardo Sueiras

This is a great post, so thanks for sharing these valuable insights. I'm going to explore them in the coming weeks for sure.

Luong Pham

Great to hear from QA's perspective. Hope I can bring the same enthusiasm to vibe testing as you did with vibe coding. Sure will try. Thanks!

