Forem: Koushik Sen

Repo Optimizer: I Let a KISS AI Agent Optimize Itself Overnight. It Cut Its Own Cost by 98%.

Koushik Sen — Thu, 12 Feb 2026 15:32:22 +0000

No manual tuning. No architecture redesign. Just a plain-English instruction and a feedback loop.

The Setup

I maintain KISS, a minimalist multi-agent framework built on one principle: keep it simple, stupid. The framework's flagship coding agent, RelentlessCodingAgent, is a single-agent system with smart auto-continuation — it runs sub-sessions of an LLM-powered coding loop, tracks progress across sessions, and keeps hammering at a task until it succeeds or exhausts its budget. The agent was self-evolved to run relentlessly.

It works. But it was expensive. A single run with Claude Sonnet 4.5 cost $3–5 and took 600–800 seconds. For an agent framework that preaches simplicity and efficiency, that felt like hypocrisy.

So I built a 69-line Python script and told it, in plain English, to fix the problem.

The Tool: `repo_optimizer.py`

The entire optimizer is a RelentlessCodingAgent pointed at its own source code. Here is the core of it:

from kiss.agents.coding_agents.relentless_coding_agent import RelentlessCodingAgent

TASK = """
Your working directory is {work_dir}.

Can you run the command {command}
in the background so that you can monitor the output in real time,
and correct the code in the working directory if needed?  I MUST be able to
see the command output in real time.

If you observe any repeated errors in the output or the command is not able
to complete successfully, please fix the code in the working directory and run the
command again.  Repeat the process until the command can finish successfully.

After the command finishes successfully, run the command again
and monitor its output in real time. You can add diagnostic code which will print
metrics {metrics} information at finer level of granularity.
Check for opportunities to optimize the code
on the basis of the metrics information---you need to minimize the metrics.
If you discover any opportunities to minimize the metrics based on the code
and the command output, optimize the code and run the command again.
Note down the ideas you used to optimize the code and the metrics you achieved in a file,
so that you can use the file to not repeat ideas that have already been tried and failed.
You can also use the file to combine ideas that have been successful in the past.
Repeat the process.  Do not forget to remove the diagnostic
code after the optimization is complete....
"""

agent = RelentlessCodingAgent("RepoAgent")
result = agent.run(
    prompt_template=TASK,
    arguments=...,  # argument values elided
    model_name="claude-opus-4-6",
    work_dir=PROJECT_ROOT,
)

That's it. The agent runs itself, watches the output, diagnoses problems, edits its own code, and runs itself again — in a loop — until the numbers drop.

No gradient descent. No hyperparameter grid search. No reward model. Just an LLM reading logs and rewriting source files.

What the Optimizer Actually Does

The feedback loop works like this:

1. Run the target agent on a benchmark task and capture the output.
2. Monitor the logs in real time. If the agent crashes or hits repeated errors, fix the code and rerun.
3. Analyze a successful run: wall-clock time, token count, dollar cost.
4. Optimize the source code using strategies specified in plain English: compress prompts, switch models, eliminate wasted steps.
5. Repeat until the metrics plateau or the target reduction is hit.
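
The loop above can be sketched in a few lines of Python. This is a toy illustration, not the code in repo_optimizer.py: `run_benchmark` and `propose_change` are hypothetical stand-ins for "run the agent" and "let the LLM edit the source", and the plateau rule (stop after a fixed number of rounds without improvement) is an assumption.

```python
from typing import Callable


def optimize(
    run_benchmark: Callable[[dict], float],
    propose_change: Callable[[dict, list], dict],
    config: dict,
    patience: int = 3,
    max_rounds: int = 50,
) -> tuple[dict, float, list]:
    """Keep a change only if it lowers the metric; stop when progress plateaus."""
    best_cost = run_benchmark(config)
    notes = []  # every idea tried so far, so failed ideas are not repeated
    stale = 0   # consecutive rounds without improvement
    for _ in range(max_rounds):
        if stale >= patience:
            break
        candidate = propose_change(config, notes)
        cost = run_benchmark(candidate)
        improved = cost < best_cost
        notes.append((candidate, cost, improved))
        if improved:
            config, best_cost, stale = candidate, cost, 0
        else:
            stale += 1
    return config, best_cost, notes


# Toy stand-ins: the "metric" is just max_steps, and each proposal
# lowers it by 5 until it reaches a floor of 15.
def toy_benchmark(cfg: dict) -> float:
    return float(cfg["max_steps"])


def toy_propose(cfg: dict, notes: list) -> dict:
    return {"max_steps": max(15, cfg["max_steps"] - 5)}


best, cost, notes = optimize(toy_benchmark, toy_propose, {"max_steps": 25})
print(best, cost, len(notes))
```

In the real optimizer both stand-ins are handled by the LLM itself, and the notes list corresponds to the file the task prompt asks the agent to keep.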

The strategies themselves are just bullet points in the task prompt:

Shorter system prompts that preserve meaning
Remove redundant instructions
Minimize conversation turns
Batch operations, use early termination
Search the web for agentic patterns that improve efficiency and reliability

The optimizer isn't hard-coded to apply any particular technique. It reads, reasons, experiments, and iterates. Which techniques it picks depend on what the logs reveal.

The Results

After running overnight, the optimizer produced this report:

Metric	Before (Claude Sonnet 4.5)	After (Gemini 2.5 Flash)	Reduction
Time	~600–800s	169.5s	~75%
Cost	~$3–5	$0.12	~96–98%
Tokens	millions	300,729	massive

All three benchmark tests passed after optimization: diamond dependency resolution, circular detection, and failure propagation.
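
The reduction column can be checked with a little arithmetic. The snippet below uses only the figures from the table; because the "before" values are ranges, each reduction comes out as a range too.

```python
def reduction(before: float, after: float) -> float:
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100


# Figures from the table above; "before" values are reported as ranges.
time_low, time_high = reduction(600, 169.5), reduction(800, 169.5)
cost_low, cost_high = reduction(3.0, 0.12), reduction(5.0, 0.12)

print(f"Time reduction: {time_low:.1f}% to {time_high:.1f}%")
print(f"Cost reduction: {cost_low:.1f}% to {cost_high:.1f}%")
```

This gives roughly 72% to 79% for time and 96% to 97.6% for cost, which matches the rounded figures in the table.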

What the Optimizer Changed

The optimizer made nine concrete modifications, all discovered autonomously:

Model switch: Claude Sonnet 4.5 ($3/$15 per million tokens) to Gemini 2.5 Flash ($0.30/$2.50 per million tokens) — 10x cheaper input, 6x cheaper output.
Compressed prompts: Stripped verbose CODING_INSTRUCTIONS boilerplate, shortened TASK_PROMPT and CONTINUATION_PROMPT without losing meaning.
Added Write() tool: The original agent only had Edit(), which fails on uniqueness conflicts. Each failure wasted 2–3 steps. Adding Write() eliminated that.
Stronger finish instruction: "IMMEDIATELY call finish once tests pass. NO extra verification." — stopped the agent from burning tokens on redundant confirmation runs.
Bash timeout guidance: "set timeout_seconds=120 for test runs" — prevented hangs on parallel bash execution.
Bounded poll loops: "use bounded poll loops, never unbounded waits" — eliminated infinite-loop risks on background processes.
Reduced max_steps: 25 down to 15. Forced the agent to be efficient. Still enough to complete the task.
Simplified step threshold: Always max_steps - 2 instead of a complex adaptive calculation.
Removed CODING_INSTRUCTIONS import: Eliminated unnecessary token overhead loaded into every prompt.

None of these changes is exotic, and each is obvious in hindsight. Together, though, they compound into a cost reduction of roughly 96–98%. The point is that no human sat down and applied them: the optimizer discovered and validated each one through experimentation.
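
To make the "bounded poll loop" idea concrete, here is a minimal sketch of what such a loop can look like in plain Python. It is not the agent's Bash tool: `run_with_bounded_poll` is a hypothetical helper, and the 120-second default simply mirrors the timeout guidance quoted above.

```python
import subprocess
import sys
import tempfile
import time


def run_with_bounded_poll(
    cmd: list[str],
    timeout_seconds: float = 120.0,
    poll_interval: float = 0.1,
) -> tuple[int, str]:
    """Run cmd in the background and poll until it exits or the deadline passes."""
    # Output goes to a temp file so a chatty process cannot block on a full pipe.
    with tempfile.TemporaryFile(mode="w+") as log:
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
        deadline = time.monotonic() + timeout_seconds
        while proc.poll() is None:  # None means the process is still running
            if time.monotonic() >= deadline:
                proc.kill()
                proc.wait()
                raise TimeoutError(f"{cmd!r} exceeded {timeout_seconds}s")
            time.sleep(poll_interval)
        log.seek(0)
        return proc.returncode, log.read()


code, output = run_with_bounded_poll(
    [sys.executable, "-c", "print('tests passed')"], timeout_seconds=30
)
print(code, output.strip())
```

The loop always terminates: either the process exits or the deadline is reached and the process is killed, so a hung test run cannot consume the whole session.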

Why This Works

The RelentlessCodingAgent is a general-purpose coding loop: it gets a task in natural language, has access to Bash, Read, Edit, and Write tools, and runs sub-sessions until it succeeds. The repo_optimizer.py simply reuses this same loop, pointed inward.

This is possible because of three properties of the KISS framework:

Agents are just Python functions. There's no config ceremony or deployment pipeline. An agent is a class you instantiate and call .run() on. So an agent can instantiate and run another agent — or itself.
Tools are just Python functions. Bash(), Read(), Edit(), Write() — plain functions with type hints. The agent calls them natively. No wrappers, no adapters.
Tasks are just strings. The optimization strategy, the constraints, the success criteria — all expressed in the task prompt. Changing what the optimizer does means editing a paragraph, not rewriting a pipeline.

The result is a self-improving system built from the same primitives as every other KISS agent.

The Bigger Picture: `repo_agent.py`

The optimizer is actually a specialization of an even simpler tool: repo_agent.py. This is a 28-line script that takes any task as a command-line argument and executes it against your project root:

uv run python -m kiss.agents.coding_agents.repo_agent "Add retry logic to the API client."

The repo agent and the repo optimizer share the same engine (RelentlessCodingAgent) and the same interface (a string). The only difference is the task. The optimizer's task happens to be "optimize this agent for speed and cost." It could just as easily be "add comprehensive test coverage" or "migrate from REST to GraphQL."

The agents in KISS don't care what you ask them to do. They care about doing it relentlessly until it's done.

Try It Yourself

# Install KISS: https://github.com/ksenxx/kiss_ai/README.md
# Run the repo optimizer on your own codebase
uv run python -m kiss.agents.coding_agents.repo_optimizer

# Or give the repo agent any task in plain English
uv run python -m kiss.agents.coding_agents.repo_agent "Refactor the database layer for connection pooling."

The framework, the agents, and the optimizer are all open source: github.com/ksenxx/kiss_ai.

KISS is built by Koushik Sen. Contributions welcome.

Agent Evolver: The Darwin of AI Agents

Koushik Sen — Mon, 26 Jan 2026 08:30:14 +0000

Can AI agents be systematically optimized for cost and latency using evolutionary methods?

As multi-agent systems grow in complexity, managing their operational cost and latency becomes a practical concern. Token usage and execution time scale with the number of agents, the length of prompts, and the depth of orchestration logic. Manual optimization of these systems is time-consuming and difficult to do systematically.

Agent Evolver applies genetic evolution to AI agent code, optimizing for cost and speed. It is built using the KISS framework.

The Limits of Prompt Engineering

Prompt engineering is a common approach to improving agent behavior, but it addresses only one dimension of agent performance. An agent's efficiency also depends on:

How the orchestrator delegates to sub-agents
Whether operations are batched or run sequentially
Which tools are created dynamically vs. hardcoded
How checkpointing affects recovery time
Whether task management adds overhead or saves tokens

These are code-level concerns, not prompt-level ones, and they require a different optimization approach.

Evolutionary Optimization

Agent Evolver applies principles from evolutionary computation—specifically, mutation, crossover, and Pareto-based selection—to agent codebases.

Here is how the process works:

1. Seed the Population

You provide a task description specifying what you want the agent system to accomplish. Agent Evolver then uses a coding agent to generate an initial agent implementation. This produces complete, runnable code including:

Orchestrator patterns for long-running tasks
Dynamic todo list management
Tool creation at runtime
Checkpointing for resilience
Sub-agent delegation strategies

The coding agent searches the web for current patterns in building efficient agents, incorporating publicly available techniques.

2. Mutate and Crossover

Each generation, Agent Evolver applies two evolutionary operations:

Mutation: A successful agent variant is selected, its code is analyzed, and targeted improvements are applied—shortening prompts, adding caching, batching operations, or optimizing algorithms. The improver agent reads the code, understands the architecture, and makes specific modifications.

Crossover: Two high-performing variants are selected, and their respective strengths are combined. For example, if Variant A has effective caching logic and Variant B has more compact prompt structures, crossover produces offspring that incorporate both.

3. Pareto Frontier Selection

Agent Evolver optimizes for multiple objectives simultaneously using a Pareto frontier of non-dominated solutions.

Consider two agents:

Agent A: 5,000 tokens, 10 seconds
Agent B: 3,000 tokens, 15 seconds

Neither dominates the other. Agent A is faster; Agent B is cheaper. Both represent valid trade-offs, so both remain on the frontier.

An agent is removed only when another agent is both cheaper and faster. This preserves diversity in the population and avoids premature convergence to a local optimum.

The system uses crowding distance to maintain diversity, ensuring that when the frontier needs trimming, solutions remain distributed across the trade-off curve.
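
The selection rules above can be sketched as follows. This is an illustration under stated assumptions, not Agent Evolver's source: the `Variant` class and function names are hypothetical, dominance uses the standard definition (no worse on both metrics, strictly better on at least one), and the crowding distance follows the usual NSGA-II formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    tokens: int
    seconds: float


def dominates(a: Variant, b: Variant) -> bool:
    """a dominates b if it is no worse on both metrics and strictly better on one."""
    no_worse = a.tokens <= b.tokens and a.seconds <= b.seconds
    better = a.tokens < b.tokens or a.seconds < b.seconds
    return no_worse and better


def pareto_frontier(variants: list[Variant]) -> list[Variant]:
    """Keep every variant that no other variant dominates."""
    return [v for v in variants if not any(dominates(o, v) for o in variants)]


def crowding_distance(frontier: list[Variant]) -> dict[str, float]:
    """Larger distance means fewer neighbours; boundary points get infinity."""
    dist = {v.name: 0.0 for v in frontier}
    if not frontier:
        return dist
    for key in ("tokens", "seconds"):
        ordered = sorted(frontier, key=lambda v: getattr(v, key))
        lo, hi = getattr(ordered[0], key), getattr(ordered[-1], key)
        dist[ordered[0].name] = dist[ordered[-1].name] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur.name] += (getattr(nxt, key) - getattr(prev, key)) / (hi - lo)
    return dist


def trim(frontier: list[Variant], max_size: int) -> list[Variant]:
    """Drop the most crowded variants first when the frontier is too large."""
    dist = crowding_distance(frontier)
    return sorted(frontier, key=lambda v: dist[v.name], reverse=True)[:max_size]


variants = [
    Variant("A", 5000, 10.0),  # Agent A from the example above
    Variant("B", 3000, 15.0),  # Agent B from the example above
    Variant("C", 6000, 16.0),  # hypothetical: worse than A on both metrics
    Variant("D", 4000, 12.0),  # hypothetical: a middle trade-off
]
frontier = pareto_frontier(variants)
print([v.name for v in frontier])
print([v.name for v in trim(frontier, 2)])
```

Here C is dropped because A beats it on both metrics, while A, B, and D all stay on the frontier. When the frontier is trimmed to two, the extremes A and B survive because boundary points get infinite crowding distance.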

Comparison with Prompt Optimization

Prompt optimization tools tune prompt text while leaving agent code unchanged. Agent Evolver operates on both prompts and code:

Traditional Prompt Optimization	Agent Evolver
Tunes prompt text	Optimizes prompts AND code
Single objective (accuracy)	Multi-objective (cost + speed)
Static architecture	Evolves architecture
Manual iteration	Automated generations
Local improvements	Global search via genetics

The improver agent analyzes control flow, identifies redundant API calls, finds opportunities for parallelization, and restructures agent delegation hierarchies.

Architecture

The system follows this structure:

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                           Task Description                                               │
└─────────────────────────────────────────────────────────────┬────────────────────────────────────────────┘
                                                              │
                                                              ▼
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│         Initial Agent Creation (Relentless Coding Agent) + Web Search for Best Practices                 │
└─────────────────────────────────────────────────────────────┬────────────────────────────────────────────┘
                                                              │
                                                              ▼
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                          Evolution Loop                                                  │
│  ┌────────────────────────────────────────────────────────────────────────────────────────────────◄──┐   │
│  │    Mutation (80%): Single parent, Targeted changes   │   Crossover (20%): Two parents, Combine │  │   │
│  └───────────────────────────────────────────────┬────────────────────────────────────────────────┘  │   │
│                                                  ▼                                                   │   │ 
│  ┌────────────────────────────────────────────────────────────────────────────────────────────────┐  │   │
│  │                         Evaluation: Measure tokens_used, execution_time                        │  │   │
│  └───────────────────────────────────────────────┬────────────────────────────────────────────────┘  │   │
│                                                  ▼                                                   │   │
│  ┌────────────────────────────────────────────────────────────────────────────────────────────────┐  │   │
│  │            Pareto Frontier Update: Keep non-dominated solutions, Trim by crowding distance     │  │   │
│  └───────────────────────────────────────────────┬────────────────────────────────────────────────┘  │   │
│                                                  └───────────────── More generations? ───────────────┘   │
└─────────────────────────────────────────────────────┬────────────────────────────────────────────────────┘
                                                      │ Done
                                                      ▼
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                            Optimal Agent Output: Best trade-off on Pareto frontier                       │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Each generation, the system:

1. Samples from the Pareto frontier
2. Applies mutation or crossover
3. Evaluates the offspring
4. Updates the frontier with any non-dominated variants
5. Copies the current best to an optimal_agent directory

The best agent is always available, even while evolution continues.
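
The per-generation steps can be sketched as a short loop. Everything here is a toy under stated assumptions: a "variant" is reduced to its metrics tuple, `mutate` and `crossover` are hypothetical stand-ins for the LLM-driven code edits, and evaluation is folded into them instead of running a real agent. Copying the best variant to a directory is omitted.

```python
import random

Metrics = tuple[float, float]  # (tokens_used, execution_time)


def dominates(a: Metrics, b: Metrics) -> bool:
    """a is no worse on every metric and differs from b on at least one."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def update_frontier(frontier: list[Metrics], child: Metrics) -> list[Metrics]:
    """Add child if nothing dominates it, and evict whatever it dominates."""
    if any(dominates(p, child) or p == child for p in frontier):
        return frontier
    return [p for p in frontier if not dominates(child, p)] + [child]


def mutate(parent: Metrics, rng: random.Random) -> Metrics:
    # Toy stand-in for "an LLM edits the code": jitter each metric by up to 20%.
    return tuple(round(m * rng.uniform(0.8, 1.2), 2) for m in parent)


def crossover(a: Metrics, b: Metrics, rng: random.Random) -> Metrics:
    # Toy stand-in for "combine strengths": take each metric from either parent.
    return tuple(rng.choice(pair) for pair in zip(a, b))


def evolve(seed_population, generations=10, mutation_probability=0.8, seed=0):
    rng = random.Random(seed)
    frontier: list[Metrics] = []
    for variant in seed_population:
        frontier = update_frontier(frontier, variant)
    for _ in range(generations):
        # Crossover needs two parents, so fall back to mutation otherwise.
        if len(frontier) < 2 or rng.random() < mutation_probability:
            child = mutate(rng.choice(frontier), rng)
        else:
            child = crossover(*rng.sample(frontier, 2), rng)
        frontier = update_frontier(frontier, child)
    return frontier


result = evolve([(5000.0, 10.0), (3000.0, 15.0)], generations=25)
print(sorted(result))
```

Because the frontier is updated after every child, the list returned at any point contains only mutually non-dominated variants, which is why a usable best agent exists throughout the run.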

Getting Started

from kiss.agents.create_and_optimize_agent import AgentEvolver

evolver = AgentEvolver()

best_agent = evolver.evolve(
    task_description="""
    Build a code review agent that can:
    1. Analyze pull requests for bugs and style issues
    2. Suggest improvements with explanations  
    3. Auto-fix simple issues when confident
    """,
    max_generations=10,
    initial_frontier_size=4,
    max_frontier_size=6,
    mutation_probability=0.8,
)

print(f"Optimal agent: {best_agent.folder_path}")
print(f"Tokens used: {best_agent.metrics['tokens_used']}")
print(f"Execution time: {best_agent.metrics['execution_time']:.2f}s")
print(f"Success: {best_agent.metrics['success']}")

The result is a complete agent package including code, config, tests, and documentation.

Summary

Agent Evolver automates the optimization of AI agent systems by treating agent code as an evolvable artifact. Rather than manually iterating on prompts and code, you define a task and let the evolutionary loop search for efficient implementations across both cost and latency dimensions.

Each generation of evolution incorporates current publicly available knowledge about building efficient agents, so improvements from the broader community can be absorbed automatically.

Agent Evolver is part of the KISS (Keep It Simple, Stupid) agent framework. It is open-source and available on GitHub.

Meet KISS Agent Framework

Koushik Sen — Fri, 09 Jan 2026 02:13:03 +0000

When Simplicity Becomes Your Superpower: Meet KISS Agent Framework

"Everything should be made as simple as possible, but not simpler." — Albert Einstein

The Problem with AI Agent Frameworks Today

The AI agent ecosystem has grown increasingly complex. New frameworks appear weekly, each layered with abstractions, sprawling configuration files, and dependency trees that rival any large-scale web project. Getting a simple tool call working often requires more effort than the task itself.

There is another way.

KISS — the Keep It Simple, Stupid Agent Framework — takes a fundamentally different approach.

The Philosophy: Radical Simplicity

KISS is more than a clever acronym. It is a design philosophy that shapes every line of code in the framework.

Born from frustration with overly complex agent architectures, KISS strips away the unnecessary and focuses on what matters: getting intelligent agents to solve real problems. The API is simple enough that a coding agent can write complex AI pipelines (called AI programs) from natural language descriptions. You can also optimize an agent program using a built-in optimizer.

Every KISS agent is a ReAct agent by default:

1. You give the agent a prompt
2. The agent thinks and calls tools
3. Repeat until done
4. That's it. That's the framework.

No workflow graphs. No state machines.

Your First Agent in 30 Seconds

Here is a minimal working example:

from kiss.core.kiss_agent import KISSAgent

def calculate(expression: str) -> str:
    """Evaluate a math expression."""
    # eval() runs arbitrary Python: fine for a demo, unsafe for untrusted input.
    return str(eval(expression))

agent = KISSAgent(name="Math Buddy")
result = agent.run(
    model_name="gemini-3-flash-preview",
    prompt_template="Calculate: {question}",
    arguments={"question": "What is 15% of 847?"},
    tools=[calculate]
)
print(result)  # 127.05

That is a fully functional AI agent with tool use — no boilerplate, no annotations, no big setup.

KISS uses native function calling from the LLM providers. Python functions become tools automatically. Type hints become schemas. Docstrings become descriptions. Everything composes naturally.
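
To give a feel for how a function's signature can turn into a tool schema, here is a rough sketch using only the standard library. It is not KISS's implementation: `function_to_tool_schema`, the `JSON_TYPES` mapping, and the `percent_of` example are all hypothetical, and real provider schemas carry more detail.

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def function_to_tool_schema(fn) -> dict:
    """Build a function-calling style schema from a plain Python function."""
    hints = get_type_hints(fn)
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unannotated or unknown types fall back to "string".
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def percent_of(percent: float, total: float) -> float:
    """Return `percent` percent of `total`."""
    return percent / 100 * total


schema = function_to_tool_schema(percent_of)
print(schema)
```

The name comes from the function, the description from the docstring, and the parameter types from the annotations, so nothing needs to be declared twice.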

Multi-Agent Orchestration is Function Composition

Here's where KISS really shines — composing multiple agents into systems greater than the sum of their parts.

Since agents are just functions, you orchestrate them with plain Python. Here's a complete research-to-article pipeline with three agents:

from kiss.core.kiss_agent import KISSAgent

# Agent 1: Research a topic
researcher = KISSAgent(name="Researcher")
research = researcher.run(
    model_name="gpt-4o",
    prompt_template="List 3 key facts about {topic}. Be concise.",
    arguments={"topic": "Python asyncio"},
    is_agentic=False  # Simple generation, no tools
)

# Agent 2: Write a draft using the research
writer = KISSAgent(name="Writer")
draft = writer.run(
    model_name="claude-sonnet-4-5",
    prompt_template="Write a 2-paragraph intro based on:\n{research}",
    arguments={"research": research},
    is_agentic=False
)

# Agent 3: Polish the draft
editor = KISSAgent(name="Editor")
final = editor.run(
    model_name="gemini-2.5-flash",
    prompt_template="Improve clarity and fix any errors:\n{draft}",
    arguments={"draft": draft},
    is_agentic=False
)

print(final)

That's it. Each agent can use a different model. Each agent saves its own trajectory. And you compose them with the most powerful orchestration tool ever invented: regular Python code.

No special orchestration framework needed. No message buses. No complex state machines. Just Python functions calling Python functions.

GEPA: Teaching Your Agents to Evolve

KISS goes beyond simplicity — it offers intelligent simplicity.

GEPA (Genetic-Pareto Prompt Evolution) is a prompt optimization system built into the framework.

Traditional prompt engineering is largely manual: make changes, evaluate, iterate, and hope for convergence. GEPA automates this process:

1. Run your agent
2. Reflect on what went wrong (using AI)
3. Evolve the prompt based on insights
4. Maintain a Pareto frontier of best performers
5. Combine winning strategies through crossover
6. Repeat until convergence

This is not just iteration — it is evolution. GEPA maintains multiple prompt candidates, each optimized for different objectives. Need an agent that is both accurate and concise? GEPA finds the optimal trade-off on the Pareto frontier.

from kiss.agents.gepa import GEPA

gepa = GEPA(
    agent_wrapper=my_agent_function,
    initial_prompt_template="You are a helpful assistant...",
    evaluation_fn=score_the_result,
    max_generations=10,
    population_size=8
)

best_prompt = gepa.optimize(arguments={"task": "solve problems"})

Our research behind this approach: "GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning". The paper demonstrates that prompt evolution can outperform RL on several benchmarks.

KISSEvolve: When Algorithms Write Themselves

What if you could start with a bubble sort and end up with quicksort — without writing a single line of sorting code yourself?

KISSEvolve is an evolutionary algorithm discovery framework. You provide:

Starting code (even a naive implementation)
A fitness function
An LLM to guide mutations

The framework includes features of OpenEvolve along with several new ideas.

KISSEvolve handles the rest:

from kiss.agents.kiss_evolve.kiss_evolve import KISSEvolve

# Start with O(n²) bubble sort
initial_code = """
def sort_array(arr):
    n = len(arr)
    for i in range(n):
        for j in range(n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr
"""

optimizer = KISSEvolve(
    initial_code=initial_code,
    evaluation_fn=measure_performance,
    model_names=[("gemini-3-flash-preview", 0.5), ("gemini-3-pro-preview", 0.5)],
    population_size=8,
    max_generations=10
)

best = optimizer.evolve()
# Discovers O(n log n) algorithms like quicksort or mergesort
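
The example above passes `measure_performance` without defining it. Below is one way such a fitness function could look. The signature (source code in, dict out) and the keys of the returned dict are assumptions, since the post does not show what KISSEvolve expects from `evaluation_fn`.

```python
import time


def measure_performance(code: str) -> dict:
    """Hypothetical fitness function: correctness first, then speed."""
    namespace: dict = {}
    try:
        # exec() uses a separate namespace here, but it is not a sandbox.
        exec(code, namespace)
        sort_array = namespace["sort_array"]
        data = list(range(500, 0, -1))  # reversed input
        start = time.perf_counter()
        result = sort_array(list(data))
        elapsed = time.perf_counter() - start
    except Exception as exc:  # broken candidates score zero instead of crashing
        return {"fitness": 0.0, "correct": False, "error": repr(exc)}
    correct = result == sorted(data)
    # Faster correct candidates get higher fitness; wrong ones get zero.
    fitness = 1.0 / (elapsed + 1e-9) if correct else 0.0
    return {"fitness": fitness, "correct": correct, "seconds": elapsed}


candidate = "def sort_array(arr):\n    return sorted(arr)\n"
report = measure_performance(candidate)
print(report["correct"], report["fitness"] > 0)
```

Scoring incorrect candidates as zero keeps the search from rewarding fast but wrong code. Because candidate code is executed directly, a function like this belongs inside a sandbox when the code comes from an LLM.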

The framework includes several advanced features:

Island-Based Evolution: Multiple populations evolving in parallel with periodic migration
Novelty Rejection Sampling: Ensures diversity by filtering redundant solutions
Power-Law and Performance-Novelty Sampling: Sophisticated parent selection strategies
Multi-Model Support: Use different LLMs with configurable probabilities

The included kissevolve_bubblesort.py script demonstrates the discovery of O(n log n) sorting algorithms from a naive starting point.
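
As an illustration of power-law parent selection, the sketch below picks parents by rank, with weight proportional to rank^(-alpha). This is a common rank-based form and an assumption on my part: the exact formula KISSEvolve uses is not given here, and `power_law_select` is a hypothetical name.

```python
import random


def power_law_select(population, fitness, alpha=1.0, rng=None):
    """Pick one parent; the candidate at rank r (1 = best) has weight r ** -alpha."""
    rng = rng or random.Random()
    ranked = sorted(population, key=fitness, reverse=True)  # best first
    weights = [(rank + 1) ** -alpha for rank in range(len(ranked))]
    return rng.choices(ranked, weights=weights, k=1)[0]


scores = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
rng = random.Random(0)
picks = [power_law_select(list(scores), scores.get, rng=rng) for _ in range(2000)]
print({name: picks.count(name) for name in scores})
```

Raising alpha concentrates selection on the top candidates, and alpha = 0 makes it uniform. Weaker candidates still get picked occasionally, which helps keep the population diverse.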

Model Agnostic: Your LLM, Your Choice

KISS is not locked to any single provider. Out of the box, it supports:

Provider	Models
OpenAI	GPT-4.1, GPT-4o, GPT-5 series
Anthropic	Claude Opus 4.5, Sonnet 4.5, Haiku 4.5
Google	Gemini 2.5/3 Pro, Gemini Flash
Together AI	Llama 4, Qwen 3, DeepSeek R1/V3
OpenRouter	400+ models from all providers

Each model includes accurate pricing, context length limits, and capability flags. Token usage and costs are tracked automatically across all agent runs.

# Switch models with a single parameter
result = agent.run(model_name="claude-sonnet-4-5", ...)
result = agent.run(model_name="gemini-3-pro-preview", ...)
result = agent.run(model_name="openrouter/x-ai/grok-4", ...)

Docker Integration: Safe Sandboxing

Giving AI agents the ability to execute code is powerful but risky. KISS includes a DockerManager that makes sandboxing straightforward:

from kiss.core.kiss_agent import KISSAgent
from kiss.docker.docker_manager import DockerManager

with DockerManager("ubuntu:latest") as env:
    agent = KISSAgent(name="Safe Agent")
    result = agent.run(
        model_name="gemini-3-flash-preview",
        prompt_template="Install nginx and configure it",
        tools=[env.run_bash_command]
    )

The agent can execute any bash command within the container. When the context manager exits, the container is destroyed. The host system remains untouched.

Trajectory Visualization: See What Your Agents Think

Debugging AI agents is notoriously difficult. What was the agent reasoning about? Why did it make that tool call?

KISS automatically saves complete trajectories to YAML files. For easier analysis, the framework includes a web-based trajectory visualizer:

uv run python -m kiss.viz_trajectory.server artifacts

The visualizer provides:

Dark-themed modern UI
Markdown rendering with syntax highlighting
Complete message history with timestamps
Token usage and budget tracking per step
Tool calls and their results

It turns agent debugging from guesswork into structured analysis.

Built-in Budget Tracking

AI API calls cost money. KISS tracks every token:

agent.run(
    model_name="gpt-4o",
    max_budget=1.0,  # USD limit for this run
    ...
)

print(f"Budget used: ${agent.budget_used:.4f}")
print(f"Tokens used: {agent.total_tokens_used}")
print(f"Global budget: ${KISSAgent.global_budget_used:.4f}")

Set per-agent limits or global limits. Cost calculation is automatic, based on actual model pricing. No more surprise API bills.
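
The bookkeeping behind a budget limit is simple enough to sketch. This is not KISS's internal code: `BudgetTracker` and `BudgetExceeded` are hypothetical names, and the per-million-token prices are the Gemini 2.5 Flash figures quoted in the Repo Optimizer post ($0.30 input, $2.50 output).

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised once recorded spending passes the configured limit."""


@dataclass
class BudgetTracker:
    input_price_per_million: float   # USD per million input tokens
    output_price_per_million: float  # USD per million output tokens
    max_budget: float                # USD limit for this run
    budget_used: float = 0.0
    total_tokens_used: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one model call to the running totals and enforce the limit."""
        cost = (
            input_tokens * self.input_price_per_million
            + output_tokens * self.output_price_per_million
        ) / 1_000_000
        # The call has already happened, so record it before checking the limit.
        self.budget_used += cost
        self.total_tokens_used += input_tokens + output_tokens
        if self.budget_used > self.max_budget:
            raise BudgetExceeded(
                f"used ${self.budget_used:.4f} of ${self.max_budget:.2f}"
            )
        return cost


tracker = BudgetTracker(0.30, 2.50, max_budget=1.0)
cost = tracker.record(input_tokens=200_000, output_tokens=20_000)
print(f"Call cost: ${cost:.4f}, tokens so far: {tracker.total_tokens_used}")
```

A call with 200,000 input tokens and 20,000 output tokens costs $0.06 + $0.05 = $0.11 at these prices. A later call that pushes the total past $1.00 raises `BudgetExceeded`, which is the point where an agent loop would stop.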

Getting Started

# Install uv (modern Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and setup
git clone https://github.com/ksenxx/kiss_ai.git
cd kiss_ai
uv venv --python 3.13
uv sync --group dev

# Set your API keys
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export GEMINI_API_KEY="your-key"

# Run your first agent
uv run python -c "
from kiss.core.kiss_agent import KISSAgent
agent = KISSAgent('Hello World')
print(agent.run('gpt-4o', 'Say hello!', is_agentic=False))
"

Why KISS?

In a landscape increasingly defined by complexity, KISS offers a principled alternative.

It is for developers who believe that:

Simplicity is a feature, not a limitation
Code should be readable by humans, not just machines
Agents should be tools, not black boxes
Evolution beats manual engineering when the search space is vast

KISS does not try to be everything. It aims to be exactly what you need — a clean, powerful foundation for building AI agents that work.

What's Next

KISS is actively evolving. The roadmap includes:

Additional benchmark integrations
Enhanced multi-agent orchestration
Improved evolution strategies
Community-contributed tools and agents
Asynchronous tool calling support

The core philosophy will remain unchanged: Keep It Simple, Stupid.

Resources

GitHub: KISS Agent Framework
GEPA Paper: arXiv:2507.19457

Built by Koushik Sen (ksen@berkeley.edu)

Because the best code is the code you don't have to write.

License: Apache-2.0

Python: ≥3.13

Philosophy: KISS
