<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Maria for Orkes Conductor</title>
    <description>The latest articles on Forem by Maria for Orkes Conductor (@orkesconductor).</description>
    <link>https://forem.com/orkesconductor</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3878902%2F007cd0a9-609f-40aa-86ef-2ef286811698.png</url>
      <title>Forem: Maria for Orkes Conductor</title>
      <link>https://forem.com/orkesconductor</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/orkesconductor"/>
    <language>en</language>
    <item>
      <title>When Agents Meet Reality: Recapping Our Agents in Production Meetup in London April '26</title>
      <dc:creator>Maria for Orkes Conductor</dc:creator>
      <pubDate>Tue, 21 Apr 2026 18:51:59 +0000</pubDate>
      <link>https://forem.com/orkesconductor/when-agents-meet-reality-recapping-our-agents-in-production-meetup-in-london-april-26-3mbi</link>
      <guid>https://forem.com/orkesconductor/when-agents-meet-reality-recapping-our-agents-in-production-meetup-in-london-april-26-3mbi</guid>
      <description>&lt;p&gt;Everyone has seen that version of AI agents where everything just works. The reasoning is clean, every tool call lands, every output is exactly what you wanted. And then you try to build one yourself for production, and honestly? It's a pretty different experience.&lt;/p&gt;

&lt;p&gt;Last week in London, we got engineers, tech leads, and builders into a room for Agents in Production, a meetup hosted by Orkes. The whole evening was basically one long honest conversation about that gap between demo agents and the ones you actually have to keep running in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  The evening ✨
&lt;/h2&gt;

&lt;p&gt;The format was simple: two talks, and then drinks and questions.&lt;/p&gt;

&lt;p&gt;What made the room really awesome was the mix. Half the people there were already building agents in production and running into real problems. Things like state, retries, observability, and all the stuff that doesn't show up in any demo. The other half were earlier on, and were trying to figure out where to even start without repeating everyone else's mistakes. Honestly, both groups had a lot to share with each other.&lt;/p&gt;

&lt;h2&gt;
  
  
  Talk 1: When Agents Meet Reality, and Why Execution Is the Hard Part
&lt;/h2&gt;

&lt;p&gt;I kicked things off with a talk I've been sitting on for months.&lt;/p&gt;

&lt;p&gt;The short version: stop only asking whether your agent is smart. Start asking if it's actually operable. Because the second you try to run a clever agent in production, a pretty different set of problems comes up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;State has to persist&lt;/strong&gt; across steps that might span minutes, hours, or even days.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failures are partial and messy.&lt;/strong&gt; Not the kind of clean crash you can just catch and retry. More like silent degradations mid-task, the kind you only notice when someone else tells you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Humans need visibility&lt;/strong&gt; into what's happening at each stage, and the ability to step in without breaking the whole workflow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-running coordination&lt;/strong&gt; between agents, tools, and humans needs infrastructure most teams just aren't thinking about enough.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is where orchestration actually earns its keep. Not as a buzzword, but as the actual difference between an agent that demos well and one you'd put in front of real users. Can you observe it? Can you recover when it fails? Can a human step in without everything falling over?&lt;/p&gt;

&lt;p&gt;And based on the questions after, the room was feeling this too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Talk 2: From Prototype to Production, and How First Databank UK Did It
&lt;/h2&gt;

&lt;p&gt;Where talk one was the argument, talk two was the evidence.&lt;/p&gt;

&lt;p&gt;Dan Miller from First Databank UK walked us through how his team actually orchestrates three production AI agents using Orkes Conductor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Noisy cloud alerts.&lt;/strong&gt; Triaging and surfacing only what actually matters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time-consuming SPIKE investigations.&lt;/strong&gt; Automating the research and synthesis work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual clinical guidance monitoring.&lt;/strong&gt; Keeping a continuous eye on changing medical guidelines.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What made Dan's talk so good was how honest it was. He didn't skip the hard parts. Things like the retries, the human checkpoints, the observability that needs to be talked about more. Orkes Conductor gave his team durable execution, full observability, and human-in-the-loop checkpoints. Basically, all the boring stuff that turns a clever prototype into something a team can actually rely on.&lt;/p&gt;

&lt;p&gt;And the clinical angle made it hit even harder. When your agent is working somewhere that patient safety matters, the bar for observability and control just jumps way up.&lt;/p&gt;

&lt;h2&gt;
  
  
  The conversation that followed
&lt;/h2&gt;

&lt;p&gt;Once the talks wrapped, I honestly expected the room to slide into small talk or people to start leaving. It didn't though. People stayed locked in and continued to ask questions until we had to leave because the venue was closing for the night.&lt;/p&gt;

&lt;p&gt;A few themes kept coming up:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Safety and trust.&lt;/strong&gt; When do you actually trust an agent's decision? Where do humans need to stay in the loop, and how do you design those handoffs so they don't turn into bottlenecks? And nobody was speaking in the abstract either. People were wrestling with this in stuff they'd shipped that week.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The "how do we even start" question.&lt;/strong&gt; The gap between "we've seen the demos" and "we've actually shipped something real" is way wider than it looks from the outside. There was real hunger for patterns, reference architectures, and honest stories about what didn't work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-industry patterns.&lt;/strong&gt; Engineers from fintech, healthcare, dev tools, and retail kept comparing notes and landing on the same problem: how to build these agents and put them out there in a way that we can actually trust.&lt;/p&gt;

&lt;h2&gt;
  
  
  One more thing: Agentspan
&lt;/h2&gt;

&lt;p&gt;We also got to drop something new at the event: &lt;a href="https://agentspan.ai/" rel="noopener noreferrer"&gt;Agentspan&lt;/a&gt;, a framework for building agents in a durable, production-ready way. It's basically our direct answer to everything the evening's talks were circling around.&lt;/p&gt;

&lt;p&gt;The reaction in the room made it pretty clear that this is what people have been looking for, and that they were excited to get started.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next stop: Amsterdam
&lt;/h2&gt;

&lt;p&gt;London confirmed something I'd been suspecting for a while. There's a real, growing community of people who want to stop talking about agents in theory and start sharing what actually works (and what doesn't) when you're running them in production. So yeah, we're doing it again.&lt;/p&gt;

&lt;p&gt;If you're:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Building agents right now&lt;/strong&gt; and want to compare notes with people hitting the same walls,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thinking about building&lt;/strong&gt; and want to skip a few expensive mistakes,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Or just trying to make sense&lt;/strong&gt; of where all of this is actually heading,&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;...this is the room for it. &lt;/p&gt;

&lt;p&gt;If you are in Amsterdam and want in, drop a comment or shoot me a message on &lt;a href="https://www.linkedin.com/in/maria-shimkovska/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;. I'm collecting names now, and I'll reach out as soon as we've locked in a date and venue.&lt;/p&gt;

&lt;p&gt;See you in Amsterdam!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>automation</category>
      <category>learning</category>
    </item>
    <item>
      <title>Building a Full Agent System: An Orchestrator and a Customer 360 Example</title>
      <dc:creator>Maria for Orkes Conductor</dc:creator>
      <pubDate>Tue, 21 Apr 2026 14:54:39 +0000</pubDate>
      <link>https://forem.com/orkesconductor/building-a-full-agent-system-an-orchestrator-and-a-customer-360-example-71h</link>
      <guid>https://forem.com/orkesconductor/building-a-full-agent-system-an-orchestrator-and-a-customer-360-example-71h</guid>
      <description>&lt;p&gt;Author: &lt;a href="https://www.linkedin.com/in/maria-shimkovska/?skipRedirect=true" rel="noopener noreferrer"&gt;Maria Shimkovska&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you came to our London tech event, you saw me walk through this as a live demo. A few people asked if I could write it up, so here it is. Same demo, but something you can clone, run, and poke at yourself, so you can see how to take some of your own business processes and build them into an agentic system like this one. Keep in mind this is just a demo, so the goal here is to show you how you can build a production agentic system and how you can add orchestration to oversee everything.&lt;/p&gt;

&lt;p&gt;You can grab the code &lt;a href="https://github.com/maria-shimkovska/customer360demo" rel="noopener noreferrer"&gt;here&lt;/a&gt;, where I also cover setup in more detail.&lt;/p&gt;

&lt;p&gt;Quick context before we dig in. An "agent" in this post means an AI model that can use tools and make judgment calls on its own, not just answer a question. A "Customer 360" is a complete picture of a customer pulled from every system where their data lives, like billing, support, and product usage. The goal of the demo is to show how agents can assemble that picture and decide what to do about it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting it running
&lt;/h2&gt;

&lt;p&gt;The whole thing is designed to go from clone to running UI in about a minute.&lt;/p&gt;

&lt;p&gt;Clone the repo, then copy .env.example to .env at the repo root and fill in your Orkes credentials and OpenAI key. That's the only configuration you need.&lt;/p&gt;

&lt;p&gt;Then start the stack with one command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;./start_demo.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's genuinely it. The script boots the Agentspan server, waits for it to be ready, sets your API credentials, spins up the Conductor workers, the Express backend, and the React frontend, all in one go. If you already have an Agentspan server running from a previous session, it'll restart it cleanly. Logs for each component go to the &lt;code&gt;logs/&lt;/code&gt; folder if you need to debug. Hit &lt;code&gt;Ctrl+C&lt;/code&gt; to stop everything.&lt;/p&gt;

&lt;p&gt;Then open &lt;code&gt;http://localhost:5173&lt;/code&gt; in your browser.&lt;/p&gt;

&lt;p&gt;The UI is honestly the smallest part of this. The interesting pieces are Conductor and Agentspan, but I wanted a full end-to-end flow so you can see how everything connects.&lt;/p&gt;

&lt;h2&gt;
  
  
  What happens when you hit Run
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Pick a scenario in the UI.&lt;/strong&gt; There are three, each designed to exercise different branches of the system:

&lt;ul&gt;
&lt;li&gt;John Doe, an at-risk existing customer&lt;/li&gt;
&lt;li&gt;Marcus Webb, a watchlist case whose usage is softening but isn't yet critical&lt;/li&gt;
&lt;li&gt;Marina Petrova, a brand new customer the system has never seen&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Click Run.&lt;/strong&gt; The frontend calls the Express backend, which starts the Conductor workflow on Orkes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workers pick up each task&lt;/strong&gt; and run the agents via Agentspan.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The UI polls every 500ms&lt;/strong&gt; and shows progress as each step completes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Final output appears&lt;/strong&gt; when the workflow finishes.&lt;/li&gt;
&lt;/ol&gt;
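
&lt;p&gt;Step 4 is just a polling loop. Here is a minimal sketch of the idea in Python. It is not the demo's actual frontend code (that part is React), and &lt;code&gt;fetch_status&lt;/code&gt; plus the set of terminal states are assumptions I'm making for illustration.&lt;/p&gt;

```python
import time

# Terminal states assumed for illustration; adjust to what your backend reports.
TERMINAL_STATES = {"COMPLETED", "FAILED", "TERMINATED", "TIMED_OUT"}


def poll_until_done(fetch_status, interval_s=0.5, max_polls=600, sleep=time.sleep):
    """Call fetch_status() every interval_s seconds until a terminal state.

    fetch_status is a hypothetical callable returning a status string.
    Returns the list of statuses seen, ending with the terminal one.
    """
    seen = []
    for _ in range(max_polls):
        status = fetch_status()
        seen.append(status)
        if status in TERMINAL_STATES:
            return seen
        sleep(interval_s)
    raise TimeoutError("workflow did not finish within the polling budget")


# Usage with a fake backend that finishes on the third poll.
fake_statuses = iter(["RUNNING", "RUNNING", "COMPLETED"])
history = poll_until_done(lambda: next(fake_statuses), sleep=lambda s: None)
print(history)
```

&lt;p&gt;Capping the number of polls keeps a stuck workflow from leaving the loop spinning forever.&lt;/p&gt;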

&lt;p&gt;That's the user-facing loop. Before we dig into the agents themselves, it's worth zooming out to see how the pieces underneath fit together, because the architecture is doing a lot of the heavy lifting.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture, piece by piece
&lt;/h2&gt;

&lt;p&gt;Here's the whole system on one page. This is what the pipeline actually looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Incoming event
      │
      ▼
┌─────────────────┐
│ Identity Agent  │  Works out who the event belongs to
└────────┬────────┘
         │
         ▼
   Is this a new customer?
         │
    ┌────┴────┐
    │ Yes     │ No
    ▼         ▼
┌──────────┐  ┌───────────────┐   ┌────────────────┐
│Onboarding│  │ Health Agent  │──▶│ Strategy Agent │
│  Agent   │  └───────────────┘   └────────────────┘
└──────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A new customer gets routed to Onboarding. An existing customer goes through Health, then Strategy. Every agent receives everything the previous agents produced, so by the end you have one combined payload covering identity, health, and the recommended next action.&lt;/p&gt;
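
&lt;p&gt;To make the routing and the "everyone adds to the same payload" idea concrete, here is a rough sketch in plain Python. The four functions are hypothetical stand-ins that return canned values, not the real agents, and the field names are ones I made up for this example.&lt;/p&gt;

```python
def identity_agent(payload):
    # Stand-in: treats the customer as new when no known ID is attached.
    is_new = payload["event"].get("customer_id") is None
    return {"is_new_customer": is_new}


def onboarding_agent(payload):
    return {"plan": "30-day kickoff"}


def health_agent(payload):
    return {"status": "AT_RISK"}


def strategy_agent(payload):
    # Reads what the health step added earlier in the same payload.
    return {"next_action": "review " + payload["health"]["status"] + " account"}


def run_pipeline(event):
    """Route an event and let each step append its own section to the payload."""
    payload = {"event": event}
    payload["identity"] = identity_agent(payload)
    if payload["identity"]["is_new_customer"]:
        payload["onboarding"] = onboarding_agent(payload)
    else:
        payload["health"] = health_agent(payload)
        payload["strategy"] = strategy_agent(payload)
    return payload


existing = run_pipeline({"customer_id": "cust_123"})
brand_new = run_pipeline({"email": "new@example.com"})
print(sorted(existing))
print(sorted(brand_new))
```

&lt;p&gt;In the real demo this routing lives in the workflow rather than in an &lt;code&gt;if&lt;/code&gt; statement, but the shape of the data moving through is the same idea.&lt;/p&gt;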

&lt;p&gt;Now the systems that make that happen.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three main systems
&lt;/h2&gt;

&lt;p&gt;There are three moving parts: Conductor, Agentspan, and the agents themselves. Each does a distinct job, and they work independently of each other, which is the point.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conductor is the coordinator
&lt;/h3&gt;

&lt;p&gt;This is essentially the project manager for the whole system. It owns the workflow definition: what runs, in what order, and what happens at each fork in the road. When you click Run in the UI, the Express backend tells Orkes (the hosted version of Conductor) to start a new execution of the &lt;code&gt;customer_360_refresh&lt;/code&gt; workflow. &lt;/p&gt;

&lt;p&gt;From that point, Conductor is in charge. It queues up the first task, waits for a worker to pick it up, receives the result, and decides what comes next. It handles retries if something fails, tracks state across every step, and enforces the routing logic. &lt;/p&gt;

&lt;p&gt;For example, it uses a branching step to send new customers down the onboarding path and existing customers down the health and strategy path. Conductor doesn't know or care what the agents are doing inside each task. It just moves data through the pipeline.&lt;/p&gt;
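
&lt;p&gt;For a feel of what that branching step can look like, here is a sketch of a workflow definition written as a Python dict. Only the workflow name and the &lt;code&gt;created&lt;/code&gt; value come from this post. The task names and reference names are hypothetical, and the definition in the repo may be structured differently.&lt;/p&gt;

```python
import json

# Hypothetical task names; only the workflow name and the "created" value
# come from the post. Field names follow Conductor's workflow schema.
workflow_def = {
    "name": "customer_360_refresh",
    "version": 1,
    "schemaVersion": 2,
    "tasks": [
        {
            "name": "identity_agent",
            "taskReferenceName": "identity",
            "type": "SIMPLE",
            "inputParameters": {"event": "${workflow.input.event}"},
        },
        {
            "name": "route_by_customer_type",
            "taskReferenceName": "route",
            "type": "SWITCH",
            "evaluatorType": "value-param",
            "expression": "action",
            "inputParameters": {"action": "${identity.output.action_taken}"},
            # New customers take the onboarding branch.
            "decisionCases": {
                "created": [
                    {"name": "onboarding_agent", "taskReferenceName": "onboarding", "type": "SIMPLE"}
                ]
            },
            # Everyone else goes through health, then strategy.
            "defaultCase": [
                {"name": "health_agent", "taskReferenceName": "health", "type": "SIMPLE"},
                {"name": "strategy_agent", "taskReferenceName": "strategy", "type": "SIMPLE"},
            ],
        },
    ],
}

print(json.dumps(workflow_def, indent=2))
```

&lt;p&gt;Notice that nothing in here says what an agent does. The definition only names tasks and wires outputs to inputs, which is why Conductor can stay ignorant of the reasoning inside each step.&lt;/p&gt;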

&lt;h3&gt;
  
  
  Agentspan is where the agents actually run
&lt;/h3&gt;

&lt;p&gt;It runs as a local server on port 6767 and is what executes the AI model calls. Each agent is registered there with its model, its tools, its instructions, and its safety checks. &lt;/p&gt;

&lt;p&gt;When a worker needs to run the health agent, it calls Agentspan with the input. Agentspan handles the back and forth with the model, including tool calls, retries when a safety check fails, and making sure the output matches the expected format. &lt;/p&gt;

&lt;p&gt;If Conductor is the nervous system connecting everything, Agentspan is the brain doing the actual thinking.&lt;/p&gt;

&lt;h3&gt;
  
  
  The workers are the bridge between the two
&lt;/h3&gt;

&lt;p&gt;They're Python processes that keep asking Conductor, "do you have any tasks for me?" When Conductor hands one off, the worker unpacks the input, calls the right Agentspan agent, and posts the result back to Conductor. &lt;/p&gt;

&lt;p&gt;The workers reach out to Conductor rather than Conductor pushing work to them, which means you can run as many workers as you want and they'll never step on each other.&lt;/p&gt;
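
&lt;p&gt;Here is a tiny simulation of that pull model. An in-memory queue stands in for Conductor, so treat this as a sketch of the pattern and not the demo's actual worker code. The worker names and the task count are made up.&lt;/p&gt;

```python
import queue
import threading

# An in-memory queue stands in for Conductor's task queue in this sketch.
task_queue = queue.Queue()
for task_id in range(20):
    task_queue.put(task_id)

completed = []  # (worker_name, task_id) pairs
completed_lock = threading.Lock()


def worker(name):
    """Keep asking for work until the queue is empty, then exit."""
    while True:
        try:
            task_id = task_queue.get_nowait()  # each task is handed out once
        except queue.Empty:
            return
        with completed_lock:
            completed.append((name, task_id))


threads = [threading.Thread(target=worker, args=("worker-" + str(i),)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

handled = sorted(task_id for _, task_id in completed)
print(handled == list(range(20)))
```

&lt;p&gt;Four workers share twenty tasks and every task is handled exactly once, because the queue decides who gets what. Adding a fifth worker needs no coordination between the workers themselves.&lt;/p&gt;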

&lt;h2&gt;
  
  
  The agents
&lt;/h2&gt;

&lt;p&gt;The agents sit at the end of this chain, and this is where the reasoning actually happens. Each one is scoped to a single responsibility:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Identity&lt;/strong&gt; works out who the incoming event belongs to&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Health&lt;/strong&gt; combines signals from four systems into a score and a risk summary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strategy&lt;/strong&gt; decides the single most important next action&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Onboarding&lt;/strong&gt; runs only for brand new customers, to kick off the welcome process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each agent receives the accumulated output of every step before it, adds its own section, and passes the whole thing forward. By the time the workflow completes, you have one unified payload covering identity, health, and recommended action, assembled piece by piece as it moved through the pipeline. (We'll dig into each agent individually in the next section.)&lt;/p&gt;

&lt;h2&gt;
  
  
  The supporting pieces
&lt;/h2&gt;

&lt;p&gt;Three systems do the orchestration and the thinking, but a few other parts of the repo keep the whole thing honest.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data stores&lt;/strong&gt; live in &lt;code&gt;/data&lt;/code&gt;. &lt;code&gt;customer_store.py&lt;/code&gt; is the identity graph: every known customer and all the different IDs they have across source systems (so an event from Salesforce with a contact ID can be traced back to the same person in Zendesk, Stripe, and so on). &lt;code&gt;health_store.py&lt;/code&gt; holds the signals the Health Agent needs, like product usage, support tickets, billing events, and engagement history, plus the playbooks that match each health status. &lt;code&gt;scenario_inputs.py&lt;/code&gt; is just sample data for the three demo scenarios. In a real system these would be connections to your live databases; for a demo they're self-contained Python files you can read and change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Guardrails&lt;/strong&gt; (in &lt;code&gt;guardrails.py&lt;/code&gt;) are safety checks that run on every agent's inputs and outputs. They're deterministic code, meaning they always run the same way regardless of what the AI model decides, and they sit at the boundary of each agent to catch things the model shouldn't be trusted with. A few examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;validate_input_record&lt;/code&gt; checks that an incoming event has the required fields and comes from a known source system&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;no_prompt_injection&lt;/code&gt; blocks attempts to smuggle instructions into user-supplied text fields&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;conservative_identity_match&lt;/code&gt; flags suspicious combinations, like a &lt;code&gt;NO_MATCH&lt;/code&gt; result paired with a high confidence score, for a human to review&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;no_pii_in_output&lt;/code&gt; blocks patterns like social security numbers or credit card numbers from appearing in any agent's output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These exist because AI models are good at reasoning but bad at being reliably boring. The guardrails handle the boring, must-not-fail parts so the agents don't have to.&lt;/p&gt;
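
&lt;p&gt;To show how plain these checks can be, here is a simplified sketch of two of them. The function names match the ones listed above, but the bodies, the required fields, and the regex patterns are my own assumptions for illustration. Check &lt;code&gt;guardrails.py&lt;/code&gt; in the repo for the real versions.&lt;/p&gt;

```python
import re

# Assumed values for illustration; the real lists live in the repo.
KNOWN_SOURCES = {"salesforce", "zendesk", "stripe"}
REQUIRED_FIELDS = ("source_system", "record_id")

# Simplified patterns: US-style SSNs and 16-digit card numbers.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
]


def validate_input_record(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = ["missing field: " + f for f in REQUIRED_FIELDS if f not in record]
    source = record.get("source_system")
    if source is not None and source not in KNOWN_SOURCES:
        problems.append("unknown source system: " + str(source))
    return problems


def no_pii_in_output(text):
    """Return True when none of the PII patterns appear in the text."""
    return not any(p.search(text) for p in PII_PATTERNS)


print(validate_input_record({"source_system": "stripe", "record_id": "evt_1"}))
print(validate_input_record({"source_system": "fax"}))
print(no_pii_in_output("Renewal is due in 21 days."))
print(no_pii_in_output("SSN on file: 123-45-6789"))
```

&lt;p&gt;No model call anywhere, which is the whole point: the same input always gets the same verdict.&lt;/p&gt;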

&lt;p&gt;The UI (in &lt;code&gt;/demo-ui&lt;/code&gt;) has two halves. The frontend is a React app on port 5173 with the three scenario buttons, a step-by-step progress view, and a results panel. The backend is a small Express API on port 3001 that kicks off workflow executions and proxies the status polling to Orkes. The UI is genuinely the least interesting part of the system, but it gives you a way to see what's happening. The pipeline runs the same way whether the UI is open or not.&lt;/p&gt;

&lt;p&gt;With all of these connections clear, let's get into why each of those four steps is an agent and not just a regular function, because this is another huge part of building agentic systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why every agent is actually an agent in this example
&lt;/h2&gt;

&lt;p&gt;It's tempting, when you're building something like this, to let "agent" become a label you slap on any AI model call. I've tried to be strict about it here. Each of the four components below earns the name because there's real judgment involved that you can't cleanly reduce to code that always follows the same rules.&lt;/p&gt;

&lt;h3&gt;
  
  
  Identity Agent
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt; Takes a raw event from any source system (like Salesforce, Zendesk, Stripe, and so on) and decides whether it belongs to a known customer of this company.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it has to be an agent:&lt;/strong&gt; Matching people is inherently messy. The same person shows up as &lt;code&gt;j.doe@acme.com&lt;/code&gt; in one system and &lt;code&gt;John Doe / Acme, Inc.&lt;/code&gt; in another. A rules engine can calculate similarity scores, and ours does, but it can't reason about whether a 0.78 score with a shared team email like &lt;code&gt;billing@&lt;/code&gt; is actually the account rather than a specific person, or whether two candidates with similar names at the same company are the same human or two colleagues.&lt;/p&gt;

&lt;p&gt;The agent's real job is the judgment call in the gray zone: &lt;code&gt;MATCH&lt;/code&gt;, &lt;code&gt;UNCERTAIN&lt;/code&gt;, or &lt;code&gt;NO_MATCH&lt;/code&gt;. It has to weigh conflicting signals, apply the conservative matching rule ("false merges are worse than missed ones"), and decide when to escalate to a human reviewer. That reasoning step, given all of this, what's the right call and why, is where an AI model earns its place over code in this example.&lt;/p&gt;
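
&lt;p&gt;The deterministic part that sits around that judgment can be sketched like this. The cut-off values and the &lt;code&gt;shared_mailbox&lt;/code&gt; flag are assumptions I picked for the example, not the demo's real thresholds, and this does not try to reproduce the agent's reasoning itself.&lt;/p&gt;

```python
import bisect

# Assumed cut-offs for illustration only; the demo's real thresholds may differ.
THRESHOLDS = [0.60, 0.90]
LABELS = ["NO_MATCH", "UNCERTAIN", "MATCH"]


def classify_match(score, shared_mailbox=False):
    """Map a similarity score to a band, downgrading risky matches.

    shared_mailbox flags addresses like billing@ that identify an account
    rather than a person, so a MATCH on one is never taken at face value.
    """
    label = LABELS[bisect.bisect_right(THRESHOLDS, score)]
    if label == "MATCH" and shared_mailbox:
        return "UNCERTAIN"  # false merges are worse than missed ones
    return label


print(classify_match(0.95))
print(classify_match(0.95, shared_mailbox=True))
print(classify_match(0.78))
print(classify_match(0.30))
```

&lt;p&gt;A function like this can put a score into a band, but everything inside the &lt;code&gt;UNCERTAIN&lt;/code&gt; band is exactly the gray zone where the agent has to weigh the evidence.&lt;/p&gt;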

&lt;h3&gt;
  
  
  Health Agent
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt; Pulls signals from four separate systems (usage, support, billing, and customer records), combines them into a score, and surfaces risks and opportunities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it has to be an agent:&lt;/strong&gt; The scoring logic itself (&lt;code&gt;calculate_health_score&lt;/code&gt;) is fixed, meaning the same inputs always produce the same number. That's intentional. You want a reproducible score. But the agent earns its place in the steps before and after.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Before scoring:&lt;/em&gt; it has to decide which customer ID to use. A person record arrives, but their health data lives on the account. The agent has to navigate that relationship, call the right tools, and pass the right data to &lt;code&gt;calculate_health_score&lt;/code&gt;. A hardcoded pipeline would break the moment the data model shifts.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;After scoring:&lt;/em&gt; it has to interpret the outputs in context and produce a human-readable summary. "Product usage declined 38.2% over the last 30 days" combined with "2 escalated tickets" combined with "renewal in 21 days" tells a story that's more than the sum of its parts. The agent connects those dots into a coherent risk narrative rather than just spitting out a list of triggered rules.&lt;/p&gt;
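
&lt;p&gt;Here is what "fixed scoring logic" means in practice. This sketch borrows the name &lt;code&gt;calculate_health_score&lt;/code&gt; from the demo, but the parameters and weights are hypothetical ones I chose for the example, not the formula in the repo.&lt;/p&gt;

```python
# Hypothetical weights chosen for illustration; not the demo's real formula.
def calculate_health_score(usage_change_pct, escalated_tickets, days_to_renewal):
    """Return a 0-100 score; the same inputs always give the same number."""
    score = 100.0
    score += min(usage_change_pct, 0.0)  # only declines cost points
    score -= 10.0 * escalated_tickets  # each escalated ticket costs 10
    # A renewal within the next 30 days costs a flat 15 (whole days expected).
    score -= 15.0 if days_to_renewal in range(0, 31) else 0.0
    return round(max(0.0, min(100.0, score)), 1)


first = calculate_health_score(-38.2, 2, 21)
second = calculate_health_score(-38.2, 2, 21)
print(first, first == second)
```

&lt;p&gt;The number is reproducible, but it cannot tell you why a usage drop three weeks before renewal is worse than the same drop six months out. That narrative is the part left to the agent.&lt;/p&gt;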

&lt;h3&gt;
  
  
  Strategy Agent
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt; Reads the identity and health output and decides the single most important next action.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it has to be an agent:&lt;/strong&gt; This is the most agent-like of the four. Our &lt;code&gt;prioritize_customer_action&lt;/code&gt; tool has a priority order built in (escalations beat renewal risk, renewal risk beats usage decline), but that order is static. Real accounts don't fit cleanly into one bucket. Marcus Webb (&lt;code&gt;WATCHLIST&lt;/code&gt;) has usage decline and stale engagement and a ticket backlog. None of those trigger the highest-priority rules on their own, but together they tell a different story.&lt;/p&gt;

&lt;p&gt;The agent has to weigh which combination of signals matters most for this specific customer, pull the right playbook, decide whether to create a task or trigger outreach or both, and write the summary in language a customer success manager can act on. That synthesis, turning context into a specific, personalized recommendation with reasoning attached, is what separates it from a simple decision tree.&lt;/p&gt;
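
&lt;p&gt;A static priority order is easy to write down, which is also why it falls short. In this sketch the order comes from the description above, while the signal names and the function body are hypothetical.&lt;/p&gt;

```python
# Order taken from the post: escalations, then renewal risk, then usage decline.
# The signal names themselves are hypothetical labels for this sketch.
PRIORITY_ORDER = ["escalation", "renewal_risk", "usage_decline"]


def prioritize_customer_action(signals):
    """Return the highest-priority signal present, or None if nothing fired."""
    return next((s for s in PRIORITY_ORDER if s in signals), None)


print(prioritize_customer_action({"usage_decline", "renewal_risk"}))
# A Marcus Webb style account: several soft signals at once.
print(prioritize_customer_action({"usage_decline", "stale_engagement", "ticket_backlog"}))
```

&lt;p&gt;For the second account the lookup returns &lt;code&gt;usage_decline&lt;/code&gt; and ignores the other two signals completely. Seeing that the combination matters is the agent's job.&lt;/p&gt;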

&lt;h3&gt;
  
  
  Onboarding Agent
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt; For brand new customers only. It creates a kickoff task, builds a 30-day plan, and triggers a welcome sequence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it has to be an agent:&lt;/strong&gt; This one is the most tool-like of the four; the tools are largely static templates right now. But it still earns the "agent" label for two reasons.&lt;/p&gt;

&lt;p&gt;First, routing. It only runs when &lt;code&gt;action_taken == "created"&lt;/code&gt;. That condition is checked by the workflow router, but the agent still has to confirm it's in the right context before acting, and gracefully handle edge cases like a missing email, an unknown role, or no customer success manager assigned yet.&lt;/p&gt;

&lt;p&gt;Second, personalization. &lt;code&gt;build_onboarding_plan&lt;/code&gt; returns the same four-week template for everyone today, but an agent can adapt it. A VP of Engineering gets different week-3 actions than a Head of Operations. As the tools get richer, the agent can tailor the plan to the customer's role, company size, and plan tier without anyone having to hardcode every combination.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;The thread running through all four: &lt;strong&gt;the parts that should stay consistent stay consistent, and the agents sit around them doing the reasoning work that brittle code can't&lt;/strong&gt;. Scoring is a function. Priority ordering is a lookup. Matching thresholds are numbers in a config. What the agents handle is everything in between: deciding which tool to call, how to interpret the output, when the rules don't fit the situation, and how to narrate the result in a way a human can actually use.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>programming</category>
      <category>javascript</category>
    </item>
    <item>
      <title>You Can Now Let Claude Code Build Workflows For You Using Conductor Skills</title>
      <dc:creator>Maria for Orkes Conductor</dc:creator>
      <pubDate>Tue, 14 Apr 2026 17:16:31 +0000</pubDate>
      <link>https://forem.com/orkesconductor/you-can-now-let-claude-code-build-workflows-for-you-using-conductor-skills-52c5</link>
      <guid>https://forem.com/orkesconductor/you-can-now-let-claude-code-build-workflows-for-you-using-conductor-skills-52c5</guid>
      <description>&lt;p&gt;If you're already using &lt;a href="https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview" rel="noopener noreferrer"&gt;Agent Skills&lt;/a&gt; with &lt;a href="https://code.claude.com/docs/en/overview" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt;, you can now add Conductor Skills to build, deploy, and run entire workflows directly from your Claude terminal.&lt;/p&gt;

&lt;p&gt;You can just "chat" with the Claude Code terminal and let it build your workflows directly in your Conductor clusters. Pretty cool!&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Conductor and What are Conductor Skills?
&lt;/h2&gt;

&lt;p&gt;Conductor is a workflow orchestration engine where you define a workflow as a series of tasks like API calls, custom code, conditionals, parallel branches, and human approvals. From there, Conductor runs them, tracks their state, handles retries, and gives you full visibility into what happened.&lt;/p&gt;

&lt;p&gt;Conductor Skills is the plugin that lets Claude Code create and manage these workflows for you, which gives you another way to build them. I like using it most when I'm getting started with a workflow, because it gives me a really solid start and I can iterate from there.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A very quick note on prerequisites for a project like this&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Before you start, you'll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code&lt;/strong&gt; installed and configured&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Java 21 or later installed&lt;/strong&gt; — the local Conductor server I am using here is a JAR file and won't start without it. Run &lt;code&gt;java -version&lt;/code&gt; to check. If you don’t want to use Conductor’s local server you can just point Claude Code to Orkes Conductor Developer Edition and you don’t need to have Java installed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conductor Skills plugin&lt;/strong&gt; installed (instructions below)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How to Install Conductor Skills in Claude Code&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Open up your Claude Code terminal and type in the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin marketplace add conductor-oss/conductor-skills
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;conductor@conductor-skills
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To verify you have it, you can type this in your Claude Code session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you see &lt;code&gt;conductor&lt;/code&gt; in the output, you're good to go. If you can't spot it under the &lt;strong&gt;Plugins&lt;/strong&gt; tab (there are a lot there by default), check the &lt;strong&gt;Installed&lt;/strong&gt; tab instead.&lt;/p&gt;

&lt;p&gt;Claude now knows how to talk to a Conductor server, register your workflows, start them, monitor their status, manage failures, and write workers (your own custom code/services). I also like using Claude Code because it just helps with planning and brainstorming too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 1: Build Your First Workflow with Claude Code and Conductor Skills
&lt;/h2&gt;

&lt;p&gt;Let's start with something simple. A workflow that takes a URL, fetches its contents, and returns them. The point here is to see how the pattern works and get a feel for the build-run-iterate cycle of building workflows with Claude Code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Start Your Local Conductor Server
&lt;/h3&gt;

&lt;p&gt;First you need a Conductor server running so that Claude Code can connect to it and register the workflows there. You have two options (the local one I am showing here, and the Orkes hosted version which I will show later), but let’s start with a local build. In Claude Code, type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/conductor:conductor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude will ask what you want to do. Tell it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Start a local Conductor server for development
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude runs &lt;code&gt;conductor server start&lt;/code&gt; behind the scenes. Once the server is healthy, you'll have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API: &lt;code&gt;http://localhost:8080/api&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;UI: &lt;code&gt;http://localhost:8080&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can open the UI in your browser to see your workflows visually as you build them. Keep the server running and move on. You can also ask Claude Code to explain what it built and how a workflow is doing, with questions like “Is my server up and running? How many workflows do I have in my Conductor server? What are those workflows? Which Conductor cluster is it pointing to?” It'll query the API and answer you.&lt;/p&gt;
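
&lt;p&gt;If you're curious what such a query looks like without Claude in the middle, here is a minimal Python sketch. It assumes the local API base URL listed above and the &lt;code&gt;/metadata/workflow&lt;/code&gt; endpoint that Conductor's REST API exposes for workflow definitions. The function names are my own for illustration, not part of any Conductor SDK, and Claude may well go about it differently.&lt;/p&gt;

```python
import json
import urllib.request

# Assumption: the local API base URL from the list above.
API_BASE = "http://localhost:8080/api"


def workflow_defs_url(api_base=API_BASE):
    """Build the URL of the endpoint that lists workflow definitions."""
    return api_base.rstrip("/") + "/metadata/workflow"


def unique_names(definitions):
    """Collapse a list of workflow definitions (one per version) into sorted names."""
    return sorted({definition["name"] for definition in definitions})


def list_workflow_names(api_base=API_BASE, timeout=5):
    """Ask the server for its workflow definitions and return their names.

    Raises OSError (for example URLError) if the server is not reachable.
    """
    with urllib.request.urlopen(workflow_defs_url(api_base), timeout=timeout) as resp:
        return unique_names(json.load(resp))
```

&lt;p&gt;With the local server running, &lt;code&gt;list_workflow_names()&lt;/code&gt; returns the names of the registered workflows. A hosted cluster would also need credentials on the request.&lt;/p&gt;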

&lt;p&gt;Now write a prompt like "Build me a very simple workflow" or "Build me a workflow that takes a URL, fetches its contents, and returns them" and watch Claude build it. &lt;/p&gt;
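
&lt;p&gt;For a sense of what gets registered, here is a rough sketch of a one-task workflow definition, written as a Python dictionary. It assumes Conductor's built-in &lt;code&gt;HTTP&lt;/code&gt; system task. The workflow name, task names, and output key are hypothetical, and the definition Claude generates for you will likely differ in its details.&lt;/p&gt;

```python
import json


def build_fetch_workflow(name="fetch_url_contents"):
    """Return a workflow definition with a single HTTP system task.

    All names here are illustrative placeholders, not what Claude produces.
    """
    return {
        "name": name,
        "description": "Fetch a URL and return the response body",
        "version": 1,
        "schemaVersion": 2,
        "inputParameters": ["url"],
        "tasks": [
            {
                "name": "fetch_page",
                "taskReferenceName": "fetch_page_ref",
                "type": "HTTP",
                "inputParameters": {
                    "http_request": {
                        # Conductor resolves this expression from the workflow input.
                        "uri": "${workflow.input.url}",
                        "method": "GET",
                    }
                },
            }
        ],
        "outputParameters": {
            # Expose the HTTP response body as the workflow output.
            "contents": "${fetch_page_ref.output.response.body}"
        },
    }


# Print the definition as JSON, the format the server stores.
print(json.dumps(build_fetch_workflow(), indent=2))
```

&lt;p&gt;The &lt;code&gt;taskReferenceName&lt;/code&gt; is what the output expression refers back to, which is why the two have to match.&lt;/p&gt;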

&lt;p&gt;You can then check it out at &lt;code&gt;localhost:8080&lt;/code&gt; if you want to see it in the OSS UI, but you don’t have to. Anything you want to know about the workflow, you can also ask Claude. &lt;/p&gt;

&lt;p&gt;If I do go to the UI to see the workflow visualized, here is what I see: &lt;/p&gt;

&lt;p&gt;It's just one task, but I can use Claude to run it, then build on top of it and iterate. &lt;/p&gt;

&lt;p&gt;From here, Claude also suggests some improvements, and in this case I agree with the suggestion. So now I can iterate and ask Claude to add a task that gets all the blog titles from the page. &lt;/p&gt;

&lt;h3&gt;
  
  
  You Can Also Connect It to an Orkes Hosted Cluster Instead
&lt;/h3&gt;

&lt;p&gt;If you don't want to run a local server, point Claude Code at an Orkes cluster. The Orkes Developer Edition is a free hosted service where you can build workflows and experiment without installing anything locally. &lt;/p&gt;

&lt;p&gt;Just tell Claude:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Connect to my Conductor server at https://developer.orkescloud.com/api and create the same workflow there instead
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://developer.orkescloud.com/api" rel="noopener noreferrer"&gt;https://developer.orkescloud.com/api&lt;/a&gt; is the URL for the Developer Edition cluster.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Claude will ask for your authentication details like your Key ID and Key Secret. You can generate these from the Applications page in your Orkes dashboard. Create an application (or use an existing one), generate a key pair, and paste the values when Claude asks for them. &lt;/p&gt;

&lt;p&gt;If you'd rather not type credentials into Claude, set them as environment variables in a separate terminal session first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CONDUCTOR_SERVER_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://developer.orkescloud.com/api
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CONDUCTOR_AUTH_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;your_key_id&amp;gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CONDUCTOR_AUTH_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;your_key_secret&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
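
&lt;p&gt;To illustrate what those variables are for, here is a minimal Python sketch of how a key pair can be exchanged for an access token. It assumes the Orkes token endpoint at &lt;code&gt;/token&lt;/code&gt; (taking &lt;code&gt;keyId&lt;/code&gt; and &lt;code&gt;keySecret&lt;/code&gt;) and the &lt;code&gt;X-Authorization&lt;/code&gt; header for later calls. The helper names are hypothetical, and this is not necessarily how Claude or the Conductor CLI handles authentication internally.&lt;/p&gt;

```python
import json
import os
import urllib.request


def build_token_request(server_url, key_id, key_secret):
    """Build (but do not send) the POST request that trades a key pair for a token."""
    payload = json.dumps({"keyId": key_id, "keySecret": key_secret}).encode("utf-8")
    return urllib.request.Request(
        server_url.rstrip("/") + "/token",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_from_env(env=os.environ):
    """Read the three variables exported above and build the token request."""
    return build_token_request(
        env["CONDUCTOR_SERVER_URL"],
        env["CONDUCTOR_AUTH_KEY"],
        env["CONDUCTOR_AUTH_SECRET"],
    )


def fetch_token(request, timeout=10):
    """Send the request; assumes a JSON response with a "token" field."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["token"]


def auth_headers(token):
    """Headers to attach to subsequent API calls."""
    return {"X-Authorization": token}
```

&lt;p&gt;Keeping the request building separate from the network call means the secret is only read from the environment, never hard-coded in a script.&lt;/p&gt;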



&lt;p&gt;Once connected, Claude gives you a summary of everything on the cluster. So your output will look something like this: &lt;/p&gt;

&lt;p&gt;This works with any Orkes cluster: Developer Edition, your team's staging environment, production, whatever. Just swap the URL and credentials. From here, you can create new workflows, run existing ones, or explore what's already there.&lt;/p&gt;

&lt;p&gt;Now you can just tell Claude Code something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Check that you are connected to the Developer Edition of Orkes Conductor and build the same workflow there instead.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Run Your New Conductor Workflow from Claude Code
&lt;/h3&gt;

&lt;p&gt;Let's test with a simple HTML page first. Go ahead and tell Claude Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Run the new workflow with https://orkes.io/blog/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://orkes.io/blog/" rel="noopener noreferrer"&gt;https://orkes.io/blog/&lt;/a&gt; is just the link to the Orkes Conductor blog page. &lt;/p&gt;
&lt;/blockquote&gt;
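
&lt;p&gt;For reference, here is roughly what a run request could look like if you sent it yourself. This is a sketch that assumes Conductor's &lt;code&gt;POST /workflow/{name}&lt;/code&gt; endpoint for starting an execution and &lt;code&gt;GET /workflow/{id}&lt;/code&gt; for checking on it. The workflow name &lt;code&gt;fetch_url_contents&lt;/code&gt; is a placeholder, so use whatever name Claude gave your workflow.&lt;/p&gt;

```python
import json
import urllib.parse
import urllib.request


def build_start_request(api_base, workflow_name, workflow_input, headers=None):
    """Build (but do not send) the request that starts a workflow execution."""
    url = api_base.rstrip("/") + "/workflow/" + urllib.parse.quote(workflow_name, safe="")
    all_headers = {"Content-Type": "application/json"}
    all_headers.update(headers or {})  # e.g. an auth header for a hosted cluster
    return urllib.request.Request(
        url,
        data=json.dumps(workflow_input).encode("utf-8"),
        headers=all_headers,
        method="POST",
    )


def start_workflow(request, timeout=10):
    """Send the request; the server is assumed to reply with the execution ID as text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def status_url(api_base, workflow_id):
    """URL to fetch the execution, including its status, once it has started."""
    return api_base.rstrip("/") + "/workflow/" + urllib.parse.quote(workflow_id, safe="")


# Hypothetical workflow name; the input matches the prompt above.
request = build_start_request(
    "http://localhost:8080/api",
    "fetch_url_contents",
    {"url": "https://orkes.io/blog/"},
)
print(request.get_method(), request.full_url)
```

&lt;p&gt;Passing the request to &lt;code&gt;start_workflow&lt;/code&gt; sends it to the server, and the returned ID is what you would hand to &lt;code&gt;status_url&lt;/code&gt; to follow the run.&lt;/p&gt;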

&lt;p&gt;From there, your Claude Code session might differ from mine depending on what Claude "decides" to do, but it's likely to ask you questions like it did with me. &lt;/p&gt;

&lt;p&gt;It asked me if I would like to create a new task to grab specific information from the page. I said "Yes, please create a task in the workflow to return all the blog post titles from the URL", and Claude continued from there. &lt;/p&gt;
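
&lt;p&gt;I can't show exactly how Claude implemented that task here, but as one possible illustration, the logic of a title-extracting worker could be sketched in Python like this. It assumes the post titles sit inside &lt;code&gt;h2&lt;/code&gt; headings, which may not match the real page markup, and the class and function names are hypothetical.&lt;/p&gt;

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text found inside the given heading tags."""

    def __init__(self, heading_tags=("h2",)):
        super().__init__()
        self.heading_tags = set(heading_tags)
        self.titles = []
        self._buffer = None  # None means "not currently inside a heading"

    def handle_starttag(self, tag, attrs):
        if tag in self.heading_tags:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in self.heading_tags and self._buffer is not None:
            # Join the pieces and normalise runs of whitespace.
            title = " ".join("".join(self._buffer).split())
            if title:
                self.titles.append(title)
            self._buffer = None


def extract_titles(html_text, heading_tags=("h2",)):
    """Return heading texts in document order; this is the worker's core logic."""
    parser = HeadingCollector(heading_tags)
    parser.feed(html_text)
    parser.close()
    return parser.titles
```

&lt;p&gt;In a workflow, a function like &lt;code&gt;extract_titles&lt;/code&gt; would take the fetched page body as its input and return the list of titles as the task output.&lt;/p&gt;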

&lt;p&gt;At the end of this short session, I had a working workflow in my Conductor cluster, and I could interact with it through Claude by describing what I wanted in plain English. &lt;/p&gt;

&lt;p&gt;Here is the final small workflow in the Developer Edition of Orkes Conductor after a successful run: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkz4vcf6199l8x40m2u0r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkz4vcf6199l8x40m2u0r.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In another article, I am going to use Claude Code to create a Content Refresh workflow from a spec. In this one, I wanted to show how you can use Claude Code to build a simple workflow and run it. For anything close to a durable workflow, though, I find it best to approach it the way you would any software project, starting with a good document that outlines the requirements and other key details. &lt;/p&gt;

&lt;p&gt;-- Author: Maria Shimkovska&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>beginners</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
