<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: yuval-qf</title>
    <description>The latest articles on Forem by yuval-qf (@yuvalqf).</description>
    <link>https://forem.com/yuvalqf</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3569214%2F48f5176f-bca2-4f35-b625-e8889b59c4e9.png</url>
      <title>Forem: yuval-qf</title>
      <link>https://forem.com/yuvalqf</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/yuvalqf"/>
    <language>en</language>
    <item>
      <title>Testing Your AI Agents with Rogue using MCP</title>
      <dc:creator>yuval-qf</dc:creator>
      <pubDate>Wed, 29 Oct 2025 13:34:10 +0000</pubDate>
      <link>https://forem.com/yuvalqf/testing-your-ai-agents-with-rogue-using-mcp-1eb2</link>
      <guid>https://forem.com/yuvalqf/testing-your-ai-agents-with-rogue-using-mcp-1eb2</guid>
      <description>&lt;h1&gt;
  
  
  Testing Your AI Agents with Rogue using MCP
&lt;/h1&gt;

&lt;p&gt;Testing AI agents is critical as they move into production. You need to ensure they follow your business rules, handle edge cases, and don't go... well, rogue.&lt;br&gt;
&lt;strong&gt;&lt;a href="https://github.com/qualifire-dev/rogue" rel="noopener noreferrer"&gt;Rogue&lt;/a&gt;&lt;/strong&gt; is an open-source AI agent evaluator that automatically tests your agents by having an intelligent &lt;code&gt;EvaluatorAgent&lt;/code&gt; interact with them across multiple scenarios, then grading their performance.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⭐  Star Rogue on &lt;a href="https://github.com/qualifire-dev/rogue" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; to support the project!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  MCP Support
&lt;/h2&gt;

&lt;p&gt;We recently added support for the &lt;a href="https://modelcontextprotocol.io" rel="noopener noreferrer"&gt;Model Context Protocol (MCP)&lt;/a&gt; to make Rogue even easier to use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simple Integration&lt;/strong&gt;: Just expose a &lt;code&gt;send_message&lt;/code&gt; tool and you're done&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep Your Stack&lt;/strong&gt;: Works with any agent framework (LangGraph, CrewAI, OpenAI Agents, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Growing Ecosystem&lt;/strong&gt;: MCP is widely adopted and has great tooling support&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimal Wrapper Code&lt;/strong&gt;: Usually less than 50 lines to wrap any existing agent&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Wrapping Your Agent with MCP
&lt;/h2&gt;

&lt;p&gt;The beauty of MCP is that your agent can be built with &lt;strong&gt;any framework&lt;/strong&gt; - LangGraph, CrewAI, OpenAI Agents, custom implementations, whatever you prefer. You just need to wrap it with an MCP server that exposes a &lt;code&gt;send_message&lt;/code&gt; tool.&lt;/p&gt;

&lt;p&gt;Let's walk through how to create this wrapper step by step. For this example, we'll use a T-shirt store agent built with LangGraph (full code available &lt;a href="https://github.com/qualifire-dev/rogue/tree/main/examples/mcp/tshirt_store_langgraph_mcp" rel="noopener noreferrer"&gt;here&lt;/a&gt;). Our agent isn't allowed to offer any discounts or promotions, and that is the policy we're going to test.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1: Initialize Your Agent
&lt;/h3&gt;

&lt;p&gt;First, create or import your existing agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.shirtify_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ShirtifyAgent&lt;/span&gt;  &lt;span class="c1"&gt;# Your agent
# Or: from your_agent import MyAgent
&lt;/span&gt;
&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ShirtifyAgent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Initialize your agent
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Create the MCP Server
&lt;/h3&gt;

&lt;p&gt;In this example, we use FastMCP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.server.fastmcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FastMCP&lt;/span&gt;

&lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastMCP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shirtify_agent_mcp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Server name
&lt;/span&gt;    &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Expose the &lt;code&gt;send_message&lt;/code&gt; Tool
&lt;/h3&gt;

&lt;p&gt;This is the &lt;strong&gt;key part&lt;/strong&gt; - create a tool that Rogue will use to communicate with your agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Send a message to the agent and get a response.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Your agent invocation logic here
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# In our case, the agent returns a dictionary
&lt;/span&gt;    &lt;span class="c1"&gt;# where the response is in the "content" key
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
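&lt;p&gt;As a rough, framework-free sketch of the contract this tool relies on: the agent's &lt;code&gt;invoke&lt;/code&gt; returns a dictionary, and the tool returns the text stored under its &lt;code&gt;content&lt;/code&gt; key. &lt;code&gt;StubAgent&lt;/code&gt; and &lt;code&gt;handle_message&lt;/code&gt; are hypothetical stand-ins (not part of Rogue or MCP), so the logic can be tried without starting a server:&lt;/p&gt;

```python
class StubAgent:
    """Hypothetical stand-in for your real agent (illustration only)."""

    def invoke(self, message: str) -> dict:
        # Mimic an agent that answers with a dict holding a "content" key.
        if "discount" in message.lower():
            return {"content": "Sorry, all shirts are $19.99 with no discounts."}
        return {"content": f"You said: {message}"}


def handle_message(agent, message: str) -> str:
    """Same body as the send_message tool, minus the MCP decorator."""
    response = agent.invoke(message)
    # Fall back to an empty string if the agent returned no "content".
    return response.get("content", "")


reply = handle_message(StubAgent(), "Can I get a discount?")
print(reply)
```

&lt;p&gt;If your agent returns a different shape (a plain string, or a message object), only the last line of &lt;code&gt;handle_message&lt;/code&gt; should need to change.&lt;/p&gt;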



&lt;h3&gt;
  
  
  Step 4: Handle Session Management (Optional but Recommended)
&lt;/h3&gt;

&lt;p&gt;For multi-turn conversations, extract the session ID from each request and pass it to your agent so it can keep each conversation's history separate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
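&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;loguru&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.server.fastmcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Context&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;starlette.requests&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;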

&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Send a message to the agent and get a response.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;

        &lt;span class="c1"&gt;# Extract session ID from headers (streamable-http transport)
&lt;/span&gt;        &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp-session-id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Or from query params (SSE transport)
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query_params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error extracting session id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Pass session ID to your agent
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
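&lt;p&gt;To illustrate why the session ID matters, here is a minimal sketch that mirrors the lookup order above (header first, then query parameter) using plain dictionaries in place of the Starlette request, and keeps a separate history per session. &lt;code&gt;extract_session_id&lt;/code&gt; and &lt;code&gt;SessionStore&lt;/code&gt; are hypothetical names for this illustration. Note that real HTTP header lookups are case-insensitive, which plain dictionaries are not:&lt;/p&gt;

```python
from collections import defaultdict
from typing import Mapping, Optional


def extract_session_id(
    headers: Mapping[str, str],
    query_params: Mapping[str, str],
) -> Optional[str]:
    """Mirror the lookup order used in the tool above."""
    # streamable-http transport: the session ID arrives as a header.
    session_id = headers.get("mcp-session-id")
    # SSE transport: the session ID arrives as a query parameter.
    if session_id is None:
        session_id = query_params.get("session_id")
    return session_id


class SessionStore:
    """Hypothetical per-session message history (illustration only)."""

    def __init__(self) -> None:
        self._history = defaultdict(list)

    def append(self, session_id: Optional[str], message: str) -> list:
        # Messages without a session ID share one "anonymous" bucket,
        # which is the cross-talk that session IDs are meant to prevent.
        key = session_id or "anonymous"
        self._history[key].append(message)
        return list(self._history[key])


store = SessionStore()
sid_a = extract_session_id({"mcp-session-id": "abc"}, {})
sid_b = extract_session_id({}, {"session_id": "xyz"})
store.append(sid_a, "Hi")
store.append(sid_b, "Do you sell hoodies?")
print(store.append(sid_a, "What colors do you have?"))
```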



&lt;h3&gt;
  
  
  Step 5: Start the MCP Server
&lt;/h3&gt;

&lt;h4&gt;
  
  
  For SSE transport:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;mcp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  For streamable-http transport:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;mcp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;streamable-http&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Complete MCP Wrapper Example
&lt;/h3&gt;

&lt;p&gt;Here's the full wrapper code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;loguru&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.server.fastmcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FastMCP&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;starlette.requests&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.shirtify_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ShirtifyAgent&lt;/span&gt;  &lt;span class="c1"&gt;# Change with your agent
&lt;/span&gt;

&lt;span class="c1"&gt;# Create an MCP server wrapping your agent.&lt;/span&gt;
&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ShirtifyAgent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastMCP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shirtify_agent_mcp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Send a message to the Shirtify agent and get a response.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;

        &lt;span class="c1"&gt;# Extract session ID from headers (streamable-http transport)
&lt;/span&gt;        &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp-session-id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Or from query params (SSE transport)
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query_params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error extracting session id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Couldn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t extract session id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Invoke your agent
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;mcp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;streamable-http&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# or "sse"
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;📚 &lt;strong&gt;Full Example&lt;/strong&gt;: Check out the complete implementation in &lt;a href="https://github.com/qualifire-dev/rogue/tree/main/examples/mcp/tshirt_store_langgraph_mcp" rel="noopener noreferrer"&gt;examples/mcp/tshirt_store_langgraph_mcp&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;MCP Transport Options:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;streamable-http&lt;/code&gt;: The MCP endpoint is usually &lt;code&gt;http://localhost:10001/mcp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sse&lt;/code&gt;: The MCP endpoint is usually &lt;code&gt;http://localhost:10001/sse&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
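&lt;p&gt;For scripts that switch between transports, a small helper can map each transport name to the endpoint path listed above. This is only a sketch: &lt;code&gt;mcp_endpoint&lt;/code&gt; is a hypothetical helper, and the &lt;code&gt;/mcp&lt;/code&gt; and &lt;code&gt;/sse&lt;/code&gt; paths are the usual defaults mentioned above, which your server configuration may change:&lt;/p&gt;

```python
# Usual default paths per transport, as listed above (may differ if customized).
DEFAULT_PATHS = {
    "streamable-http": "/mcp",
    "sse": "/sse",
}


def mcp_endpoint(transport: str, host: str = "localhost", port: int = 10001) -> str:
    """Build the agent URL to give Rogue for a given transport."""
    try:
        path = DEFAULT_PATHS[transport]
    except KeyError:
        # Catches typos such as "streamable_http" (underscore instead of hyphen).
        raise ValueError(f"Unsupported transport: {transport!r}") from None
    return f"http://{host}:{port}{path}"


print(mcp_endpoint("streamable-http"))
print(mcp_endpoint("sse"))
```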

&lt;h2&gt;
  
  
  Testing with Rogue TUI
&lt;/h2&gt;

&lt;p&gt;The easiest way to see Rogue in action with MCP is using our built-in example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uvx rogue-ai &lt;span class="nt"&gt;--example&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tshirt_store_langgraph_mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This single command:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Starts the MCP-wrapped T-shirt store agent on &lt;code&gt;http://localhost:10001/mcp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;✅ Starts the Rogue server in the background&lt;/li&gt;
&lt;li&gt;✅ Launches Rogue's TUI, ready to evaluate&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Configuring Your MCP Agent in the TUI
&lt;/h3&gt;

&lt;p&gt;Once Rogue's TUI launches, follow these steps to configure and test your agent:&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Configure the Judge Model
&lt;/h4&gt;

&lt;p&gt;Type &lt;code&gt;/models&lt;/code&gt; to set up your LLM API keys and select the judge model that will evaluate your agent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdyq0tnlp8xc37qrn9n65.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdyq0tnlp8xc37qrn9n65.png" alt=" " width="800" height="523"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  2. Set Up Business Context &amp;amp; Generate Scenarios
&lt;/h4&gt;

&lt;p&gt;Type &lt;code&gt;/editor&lt;/code&gt; to open the business context editor. You can either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hit &lt;code&gt;i&lt;/code&gt; for an interactive interview where Rogue asks you questions&lt;/li&gt;
&lt;li&gt;Write your business context manually&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rogue can automatically generate test scenarios based on your context.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqu00q53u9nwrll2zrldt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqu00q53u9nwrll2zrldt.png" alt=" " width="800" height="515"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6d9cw6pqz8j2x4iqk2yg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6d9cw6pqz8j2x4iqk2yg.png" alt=" " width="800" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Example business context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;T-Shirt Store Agent - Shirtify

&lt;span class="gu"&gt;## Products&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Regular and V-neck T-shirts
&lt;span class="p"&gt;-&lt;/span&gt; Colors: White, Black, Red, Blue, Green
&lt;span class="p"&gt;-&lt;/span&gt; Price: $19.99 USD (fixed, no discounts)

&lt;span class="gu"&gt;## Policies&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; No free merchandise
&lt;span class="p"&gt;-&lt;/span&gt; No sales or promotions
&lt;span class="p"&gt;-&lt;/span&gt; Payment required before fulfillment
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  3. Start the Evaluation
&lt;/h4&gt;

&lt;p&gt;Type &lt;code&gt;/eval&lt;/code&gt; to configure and start evaluation. Toggle &lt;strong&gt;Deep Test Mode&lt;/strong&gt; ON for multi-turn conversations (recommended for thorough testing).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzdncr37qozer77qbdhho.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzdncr37qozer77qbdhho.png" alt=" " width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  4. Watch the Live Conversation
&lt;/h4&gt;

&lt;p&gt;Watch in real time as Rogue's &lt;code&gt;EvaluatorAgent&lt;/code&gt; tests your agent across multiple scenarios.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmr6vzorgu1m9ev5ll9cf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmr6vzorgu1m9ev5ll9cf.png" alt=" " width="800" height="511"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  5. View the Report
&lt;/h4&gt;

&lt;p&gt;Hit &lt;code&gt;r&lt;/code&gt; to see the comprehensive evaluation report with pass/fail rates, findings, and recommendations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbv6i4doypuphyb6pi0io.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbv6i4doypuphyb6pi0io.png" alt=" " width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fejfcyuf0jzwz0053b9yv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fejfcyuf0jzwz0053b9yv.png" alt=" " width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing with Rogue CLI (CI/CD)
&lt;/h2&gt;

&lt;p&gt;For automated testing in your deployment pipelines, use Rogue's CLI mode:&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic CLI Usage
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start the Rogue server&lt;/span&gt;
uvx rogue-ai server &amp;amp;

&lt;span class="c"&gt;# Run evaluation&lt;/span&gt;
uvx rogue-ai cli &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--protocol&lt;/span&gt; mcp &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--transport&lt;/span&gt; streamable-http &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--evaluated-agent-url&lt;/span&gt; http://localhost:10001/mcp &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--evaluated-agent-auth-type&lt;/span&gt; no_auth &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--judge-llm&lt;/span&gt; openai/gpt-4o-mini &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--workdir&lt;/span&gt; ./.rogue
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
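&lt;p&gt;If you prefer to drive the CLI from a Python script, the same command can be assembled as an argument list. This sketch uses only the flags shown in the shell example above; &lt;code&gt;build_rogue_cli_command&lt;/code&gt; is a hypothetical helper, and it only prints the command rather than running it:&lt;/p&gt;

```python
import shlex


def build_rogue_cli_command(
    agent_url: str,
    transport: str = "streamable-http",
    judge_llm: str = "openai/gpt-4o-mini",
    workdir: str = "./.rogue",
) -> list:
    """Assemble the same command line as the shell example above."""
    return [
        "uvx", "rogue-ai", "cli",
        "--protocol", "mcp",
        "--transport", transport,
        "--evaluated-agent-url", agent_url,
        "--evaluated-agent-auth-type", "no_auth",
        "--judge-llm", judge_llm,
        "--workdir", workdir,
    ]


command = build_rogue_cli_command("http://localhost:10001/mcp")
# Print a copy-pasteable form; pass `command` to subprocess.run(...) to execute it.
print(shlex.join(command))
```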



&lt;h3&gt;
  
  
  CI/CD Integration Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/test-agent.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Test AI Agent with Rogue&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;test_agent&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Start Agent &amp;amp; Run Rogue&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.OPENAI_API_KEY }}&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;# Start your MCP agent&lt;/span&gt;
          &lt;span class="s"&gt;python -m your_agent --port 10001 &amp;amp;&lt;/span&gt;

          &lt;span class="s"&gt;# Run Rogue evaluation&lt;/span&gt;
          &lt;span class="s"&gt;uvx rogue-ai server --port 8000 &amp;amp;&lt;/span&gt;
          &lt;span class="s"&gt;sleep 10  # Wait for server startup&lt;/span&gt;

          &lt;span class="s"&gt;uvx rogue-ai cli \&lt;/span&gt;
            &lt;span class="s"&gt;--protocol mcp \&lt;/span&gt;
            &lt;span class="s"&gt;--transport streamable-http \&lt;/span&gt;
            &lt;span class="s"&gt;--evaluated-agent-url http://localhost:10001/mcp \&lt;/span&gt;
            &lt;span class="s"&gt;--judge-llm openai/gpt-4o-mini \&lt;/span&gt;
            &lt;span class="s"&gt;--workdir ./.rogue&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
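
&lt;p&gt;A fixed &lt;code&gt;sleep 10&lt;/code&gt; can be too short on a slow runner and wastes time on a fast one. As a minimal sketch (not part of Rogue itself), a small Python helper could poll the port until the server accepts connections. The helper name &lt;code&gt;wait_for_port&lt;/code&gt; and its default values are assumptions made for illustration:&lt;/p&gt;

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; give up once the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

&lt;p&gt;In the workflow above, calling &lt;code&gt;wait_for_port("localhost", 8000)&lt;/code&gt; before the CLI step would replace the fixed sleep. An open port only shows that the server is listening, not that it has finished initializing, so a short extra delay may still be needed.&lt;/p&gt;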



&lt;p&gt;The CLI will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Connect to your MCP agent&lt;/li&gt;
&lt;li&gt;✅ Run all scenarios from &lt;code&gt;.rogue/scenarios.json&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;✅ Exit with a status code (0 = all scenarios passed, non-zero = failures detected)&lt;/li&gt;
&lt;/ul&gt;
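
&lt;p&gt;Because failures surface as a non-zero exit code, any script can gate a later step on the result. The sketch below wraps the CLI call from the workflow in Python. The names &lt;code&gt;evaluation_passed&lt;/code&gt;, &lt;code&gt;ROGUE_COMMAND&lt;/code&gt;, and &lt;code&gt;main&lt;/code&gt; are hypothetical, and the flags are copied from the workflow rather than from a full reference of the CLI:&lt;/p&gt;

```python
import subprocess
import sys


def evaluation_passed(command):
    """Run the evaluation command and report whether it exited with status 0."""
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0


# Flags copied from the workflow above; adjust them to your own setup.
ROGUE_COMMAND = [
    "uvx", "rogue-ai", "cli",
    "--protocol", "mcp",
    "--transport", "streamable-http",
    "--evaluated-agent-url", "http://localhost:10001/mcp",
    "--judge-llm", "openai/gpt-4o-mini",
    "--workdir", "./.rogue",
]


def main():
    """Return 0 when every scenario passed, 1 otherwise."""
    if not evaluation_passed(ROGUE_COMMAND):
        print("Rogue reported failing scenarios; blocking the deployment.")
        return 1
    print("All Rogue scenarios passed.")
    return 0
```

&lt;p&gt;A deploy script could end with &lt;code&gt;sys.exit(main())&lt;/code&gt; so that a failed evaluation stops the pipeline. In GitHub Actions this wrapper is optional, since a step already fails when its command exits with a non-zero status.&lt;/p&gt;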

&lt;h2&gt;
  
  
  Tips for Effective Testing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Write Comprehensive Business Context
&lt;/h3&gt;

&lt;p&gt;Your business context drives scenario quality. Include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Policies&lt;/strong&gt;: What your agent should/shouldn't do&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Available Actions&lt;/strong&gt;: Tools and capabilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Constraints&lt;/strong&gt;: Pricing, inventory, limitations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Expected Behavior&lt;/strong&gt;: How to handle edge cases&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Start with Core Scenarios
&lt;/h3&gt;

&lt;p&gt;Test your most critical use cases first:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Happy path interactions&lt;/li&gt;
&lt;li&gt;Policy violations (discount requests, price negotiations)&lt;/li&gt;
&lt;li&gt;Edge cases and error handling&lt;/li&gt;
&lt;li&gt;Security boundary testing&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Iterate Based on Results
&lt;/h3&gt;

&lt;p&gt;Use evaluation reports to improve your agent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fix failed scenarios&lt;/li&gt;
&lt;li&gt;Add safeguards for edge cases&lt;/li&gt;
&lt;li&gt;Refine system prompts based on findings&lt;/li&gt;
&lt;li&gt;Re-test after changes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Automate in CI/CD
&lt;/h3&gt;

&lt;p&gt;Make evaluation part of your deployment process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run on every pull request&lt;/li&gt;
&lt;li&gt;Block deployments on failed evaluations&lt;/li&gt;
&lt;li&gt;Track evaluation metrics over time&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Rogue + MCP?
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────┐         ┌──────────────┐         ┌─────────────┐
│  Rogue Server   │────────▶│  MCP Server  │────────▶│ Your Agent  │
│  (Evaluator)    │  MCP    │  (Wrapper)   │         │ (Any Stack) │
└─────────────────┘ Protocol└──────────────┘         └─────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Framework Agnostic&lt;/strong&gt;: Works with LangGraph, CrewAI, OpenAI Agents, custom implementations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimal Integration&lt;/strong&gt;: ~50 lines of wrapper code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production-Ready&lt;/strong&gt;: Test the same interface users interact with&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standardized Protocol&lt;/strong&gt;: MCP provides consistency across different agents&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get Started Today
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Try the example&lt;/span&gt;
uvx rogue-ai &lt;span class="nt"&gt;--example&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tshirt_store_langgraph_mcp

&lt;span class="c"&gt;# Or wrap your own agent&lt;/span&gt;
&lt;span class="c"&gt;# 1. Add MCP wrapper (see code above)&lt;/span&gt;
&lt;span class="c"&gt;# 2. Start your agent&lt;/span&gt;
&lt;span class="c"&gt;# 3. Run: uvx rogue-ai&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;📦 &lt;a href="https://github.com/qualifire-dev/rogue" rel="noopener noreferrer"&gt;Rogue GitHub Repository&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📚 &lt;a href="https://github.com/qualifire-dev/rogue/tree/main/examples/mcp/tshirt_store_langgraph_mcp" rel="noopener noreferrer"&gt;Full MCP Example&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;💬 &lt;a href="https://discord.gg/EUfAt7ZDeK" rel="noopener noreferrer"&gt;Join our Discord&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Testing AI agents doesn't have to be complicated. With Rogue and MCP, you can ensure your agents behave correctly before they reach production.&lt;/p&gt;

&lt;p&gt;Have you tested your agents with Rogue? Share your experience below! 👇&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>rogue</category>
      <category>mcp</category>
    </item>
  </channel>
</rss>
