Forem: Bizbox

Deep Dive: The awaiting_human Status — Rethinking Agent-Human Handoff in Bizbox

Bizbox — Tue, 12 May 2026 05:08:49 +0000

May 2026

The Problem: When "Blocked" Means Two Different Things

For the first few months of Bizbox, we used a single blocked status to mean "this issue can't move forward right now." Simple enough. But as our agent routines grew more sophisticated, we hit a pattern that kept causing friction:

Some blocked issues need an AI agent to unstick them. Others need a human.

When an issue is waiting on a dependency—another task to complete, an external API to respond, a CI pipeline to finish—that's work another AI agent could help with. Maybe a reconciler can auto-assign the blocker, or a monitoring routine can check if the condition cleared.

But when an issue is waiting on a human decision ("Should we proceed with this plan?" or "Which option do you prefer?"), the situation is fundamentally different. An AI agent stepping in to "unstick" a decision the board is still weighing breaks the execution contract.

The blocked status was doing double duty, and it meant our reconciler logic had to choose: either ignore all blocked work (and miss real unblocking opportunities), or risk auto-claiming issues that humans had explicitly parked for review.

We chose to split the concept.

PR #33: Add awaiting_human issue status landed on May 8, 2026, introducing a new status and tightening the semantics of the existing one.

The Design: Two Flavors of Waiting

Here's how we now distinguish parked work:

`blocked`

Meaning: Waiting on another issue, external system, or dependency.
Who can unstick it: Other AI agents, automated workflows, or the blocking condition clearing.
Reconciler behavior: May auto-assign or auto-wake when the blocker resolves.

`awaiting_human`

Meaning: Waiting on a human or board decision, answer, or informal confirmation.
Who can unstick it: Only the board (or the owning user-assignee).
Reconciler behavior: Explicitly excluded—agents must not auto-claim or transition out of this status.

The third status, in_review, remains unchanged—it still covers formal execution-policy signoff gates, which are a separate control.

Implementation: Where It Touches

This wasn't just a database enum tweak. The awaiting_human status flows through the whole system:

Auto-Park Triggers

When an agent creates an interaction of type ask_user_questions or request_confirmation on an in_progress issue, we now auto-park the issue to awaiting_human immediately. This prevents the agent from continuing to act on the issue while a human response is pending.

Interactions created by board users (not agents) don't trigger auto-park—those are considered collaborative edits, not blocking questions.

Here's the relevant logic from services/issue-thread-interactions.ts:

// Auto-park to awaiting_human when an agent asks a question or requests confirmation
if (
  (kind === 'ask_user_questions' || kind === 'request_confirmation') &&
  interaction.createdByAgentId &&
  issue.status === 'in_progress'
) {
  await issueStore.update(issueId, { status: 'awaiting_human' });
}

Reconciler Exclusion

The heartbeat reconciler—which auto-assigns unowned blocking issues—now explicitly skips awaiting_human. That's enforced in services/heartbeat.ts:

const eligibleStatuses = ['todo', 'blocked']; // NOT awaiting_human
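
To make the allow-list concrete, here is a minimal sketch of how a reconciler pass might pick its candidates. It is not the code from services/heartbeat.ts: the `ReconcilerIssue` shape, the `assigneeAgentId` field, and the `selectReconcilable` function are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch: the issue shape and function name are assumptions, not the Bizbox API.
interface ReconcilerIssue {
  id: string;
  status: string;
  assigneeAgentId: string | null; // null means nobody owns the issue yet
}

// Mirrors the allow-list in the text: awaiting_human is deliberately absent.
const ELIGIBLE_STATUSES: readonly string[] = ['todo', 'blocked'];

// Return the unowned issues the reconciler is allowed to auto-assign.
function selectReconcilable(issues: readonly ReconcilerIssue[]): ReconcilerIssue[] {
  return issues.filter(
    (issue) =>
      issue.assigneeAgentId === null && ELIGIBLE_STATUSES.indexOf(issue.status) !== -1,
  );
}
```

Because this is an allow-list rather than a deny-list, any status added later is skipped by default until someone opts it in.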

Wake-Reason Filtering

When a cron wake or external event tries to auto-checkout an issue, we now allow awaiting_human only when the wake reason indicates genuine human action: issue_commented, issue_reopened_via_comment, interaction_resolved, or approval_approved.

Other wake reasons (like scheduled or dependency_unblocked) will skip awaiting_human issues, because those wakes are system-driven, not human-driven.
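
The filtering rule can be sketched as a single predicate. The wake-reason strings come from the list above; everything else (the `HUMAN_WAKE_REASONS` constant, the `wakeMayCheckout` function, and the choice to let other statuses pass through) is an assumption made for illustration, not the actual implementation.

```typescript
// Hypothetical sketch: constant and function names are assumptions for illustration.
// Wake reasons the text lists as genuine human action.
const HUMAN_WAKE_REASONS: readonly string[] = [
  'issue_commented',
  'issue_reopened_via_comment',
  'interaction_resolved',
  'approval_approved',
];

// Decide whether a wake may auto-check out an issue in the given status.
function wakeMayCheckout(issueStatus: string, wakeReason: string): boolean {
  if (issueStatus !== 'awaiting_human') {
    // Assumption: this sketch models only the awaiting_human rule, so other statuses pass.
    return true;
  }
  return HUMAN_WAKE_REASONS.indexOf(wakeReason) !== -1;
}
```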

Agent Mutation Guards

Agents can set an issue to awaiting_human (useful when they detect a blocker that only a human can resolve), but they cannot transition out of it. That's enforced via a 403 response in routes/issues.ts:

if (
  existingIssue.status === 'awaiting_human' &&
  body.status !== 'awaiting_human' &&
  req.auth.principalType === 'agent'
) {
  return res.status(403).json({
    error: 'Agents may not transition issues out of awaiting_human status'
  });
}

Board users and the owning user-assignee can still move the issue forward.
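
Pulling the agent guard and the human exceptions into one place, the rule might look like the pure function below. This is a sketch under stated assumptions: the `Principal` and `IssueRef` shapes, the `isBoardMember` flag, and the decision to deny users who are neither board members nor the assignee are all hypothetical, not the Bizbox auth model.

```typescript
// Hypothetical sketch: these shapes are assumptions, not the Bizbox auth model.
type Principal =
  | { principalType: 'agent'; agentId: string }
  | { principalType: 'user'; userId: string; isBoardMember: boolean };

interface IssueRef {
  status: string;
  assigneeUserId: string | null;
}

// May this principal move the issue from its current status to nextStatus?
function mayChangeStatus(principal: Principal, issue: IssueRef, nextStatus: string): boolean {
  const leavingAwaitingHuman =
    issue.status === 'awaiting_human' && nextStatus !== 'awaiting_human';
  if (!leavingAwaitingHuman) {
    return true; // only the exit from awaiting_human is modelled here
  }
  if (principal.principalType === 'agent') {
    return false; // corresponds to the 403 returned by the route handler
  }
  // Assumption: any other user is denied unless on the board or the owning assignee.
  return principal.isBoardMember || principal.userId === issue.assigneeUserId;
}
```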

UI Changes

The status shows up in the Kanban board between in_review and blocked, rendered in an amber/orange palette (distinct from the red blocked). The dashboard now surfaces a separate tasks.awaitingHuman counter so operators can see at a glance how many issues are parked waiting on them.
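
As a rough illustration of the separate counter, the sketch below tallies the two parked states from a list of status strings. Only the `awaitingHuman` counter name comes from the text; the `ParkedCounters` shape and the `countParked` function are hypothetical and do not reflect how the dashboard actually computes its numbers.

```typescript
// Hypothetical sketch: only the awaitingHuman counter name comes from the text.
interface ParkedCounters {
  blocked: number;
  awaitingHuman: number;
}

// Count how many issues sit in each parked state; other statuses are ignored.
function countParked(statuses: readonly string[]): ParkedCounters {
  const counters: ParkedCounters = { blocked: 0, awaitingHuman: 0 };
  statuses.forEach((issueStatus) => {
    if (issueStatus === 'blocked') {
      counters.blocked += 1;
    } else if (issueStatus === 'awaiting_human') {
      counters.awaitingHuman += 1;
    }
  });
  return counters;
}
```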

Why This Matters

The split solves three concrete problems we were hitting in production:

Reconciler Safety: The auto-assignment logic can now confidently act on blocked work without risking stepping on human decision-making.
Agent Clarity: When an agent routine wakes up and sees a blocked issue, it knows it's allowed to help. When it sees awaiting_human, it knows to leave it alone.
Board Visibility: Operators get a clean signal: the awaiting_human counter is the queue of issues that need their attention. The blocked counter is work the system might auto-resolve.

Trade-Offs and Open Questions

The Overload Risk

We now have three parked states (blocked, awaiting_human, in_review), and the boundaries aren't always obvious. For example:

What if an issue is both dependency-blocked and waiting on a human decision?
Should awaiting_human support a blockedByIssueIds array, or is that mixing concepts?

Right now, the answer is: pick the strongest constraint. If a human needs to weigh in, use awaiting_human even if there's also a dependency blocker. The agent can't act either way, so the human-block is the active gate.
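
The "strongest constraint" heuristic is small enough to write down. The following is a minimal sketch, assuming a hypothetical `ParkReasons` input and `chooseParkedStatus` helper; neither name exists in the source, and the real decision may involve more signals.

```typescript
// Hypothetical sketch: the input shape and function name are assumptions for illustration.
interface ParkReasons {
  needsHumanDecision: boolean;
  hasUnresolvedDependency: boolean;
}

type ParkedStatus = 'awaiting_human' | 'blocked';

// Pick the status for a parked issue; returns null when nothing is holding it.
function chooseParkedStatus(reasons: ParkReasons): ParkedStatus | null {
  if (reasons.needsHumanDecision) {
    return 'awaiting_human'; // the human gate wins even if a dependency is also open
  }
  if (reasons.hasUnresolvedDependency) {
    return 'blocked';
  }
  return null;
}
```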

We're open to feedback on whether that heuristic holds as routine complexity grows.

Auto-Park Scope

We currently auto-park only for ask_user_questions and request_confirmation interactions. Other interaction kinds—like suggest_tasks—don't trigger auto-park, because those are seen as proposals the board can act on asynchronously without blocking the agent.

Is that the right line? Maybe. We're watching for cases where an agent leaves a "what should I do?" interaction and then keeps working, which would suggest we need to widen the auto-park net.
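
One way to keep that line easy to move is to express the auto-park rule as a predicate over a list of interaction kinds. This is a sketch, not the code in services/issue-thread-interactions.ts: the `InteractionInput` shape, the `AUTO_PARK_KINDS` constant, and `shouldAutoPark` are hypothetical names.

```typescript
// Hypothetical sketch: names and shapes are assumptions; only the rule comes from the text.
const AUTO_PARK_KINDS: readonly string[] = ['ask_user_questions', 'request_confirmation'];

interface InteractionInput {
  kind: string;
  createdByAgentId: string | null; // null when a board user created the interaction
}

// True when creating this interaction should park the issue in awaiting_human.
function shouldAutoPark(interaction: InteractionInput, issueStatus: string): boolean {
  return (
    AUTO_PARK_KINDS.indexOf(interaction.kind) !== -1 &&
    interaction.createdByAgentId !== null &&
    issueStatus === 'in_progress'
  );
}
```

In this shape, widening the net would mean adding a kind to `AUTO_PARK_KINDS`, with no change to the predicate itself.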

External Consumers

If you're building on Bizbox or consuming our issue API, note that the status enum just expanded. The canonical list lives in packages/shared/src/constants.ts (ISSUE_STATUSES). Hard-coded status checks in external tooling will need an update.
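
For external tooling, a defensive pattern is to treat the status as an open set so that a new value degrades gracefully instead of throwing. The sketch below is an illustration only: `KNOWN_STATUSES` lists just the statuses named in this article (the real ISSUE_STATUSES constant may contain more), and the labels and function names are hypothetical.

```typescript
// Hypothetical sketch for external tooling. KNOWN_STATUSES lists only the statuses this
// article names; the real ISSUE_STATUSES constant may contain more.
const KNOWN_STATUSES = ['todo', 'in_progress', 'in_review', 'blocked', 'awaiting_human'] as const;
type KnownStatus = (typeof KNOWN_STATUSES)[number];

const STATUS_LABELS: Record<KnownStatus, string> = {
  todo: 'to do',
  in_progress: 'in progress',
  in_review: 'in formal review',
  blocked: 'parked on a dependency',
  awaiting_human: 'parked for a human decision',
};

// Runtime check that narrows an arbitrary string to a status this tool understands.
function isKnownStatus(value: string): value is KnownStatus {
  return (KNOWN_STATUSES as readonly string[]).indexOf(value) !== -1;
}

// Map a raw status string from the API to a label without throwing on new values.
function describeStatus(raw: string): string {
  return isKnownStatus(raw) ? STATUS_LABELS[raw] : `unrecognized status: ${raw}`;
}
```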

What's Next

The awaiting_human status shipped in v0.0.11 on May 8, 2026. We're already seeing cleaner reconciler behavior and fewer "why did the agent touch this?" support questions.

But we're still learning:

Do we need a separate awaiting_external for third-party API blockers that aren't agent-unblockable but also aren't human decisions?
Should the UI show why an issue is awaiting_human—like surfacing the unresolved interaction inline?
How does this interact with approval workflows when those land?

If you're running Bizbox in production, we'd love to hear how the new status fits (or doesn't fit) your workflows. Drop a note in GitHub Discussions or on Discourse.

Related Work

PR #33: Add awaiting_human issue status — full implementation and test coverage
PR #38: Human handoff logging and notifications — ClickUp notification integration for awaiting_human transitions
Execution Semantics doc — updated status definitions

About Bizbox: We're building an AI-native task orchestration system where humans and AI agents collaborate on structured work. This Deep Dive is part of our monthly series on architectural decisions and lessons learned. Follow the project on GitHub.

Bizbox Build Log: May 2–8, 2026

Bizbox — Thu, 07 May 2026 02:18:58 +0000

Four releases, nine PRs merged, and one clear theme this week: making Bizbox agents more capable and trustworthy in multi-turn execution contexts.

Shipped this week

Company AI Builder (Phases 0–4)

#20 landed the full Company AI Builder feature: a curated set of mutation tools delivered via a proposal-approval flow. Phase 0 shipped read-only spike work (sessions, settings, OpenAI-compat interface, six read tools, UI). This update extends it with Phases 1–4: proposal-store infrastructure, mutation tools behind proposals, and the approval surface for company owners.

Trade-off: Mutation tools are gated by human approval for now. We chose safety and trust before convenience. Future iterations will tune the guardrails based on real operator feedback.

Artifact validation and schema hardening

#27 introduced stricter validation for "artifact" work products: each one must now have attachment-backed metadata and a createdByRunId. With new schema validators and runtime type guards in place, artifact handling is now fail-fast instead of fail-silent.

Why it matters: Agents produce artifacts (deliverables, documents, code outputs). Loose validation meant broken artifact references could propagate through the system. This change catches those errors at the boundary.
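
A fail-fast boundary check of this kind can be sketched with a runtime type guard. This is not the validator from #27: apart from `createdByRunId`, every field name here (including `metadata.attachmentId` as the stand-in for "attachment-backed metadata") is an assumption, as are the function names.

```typescript
// Hypothetical sketch: field names other than createdByRunId are assumptions, not the Bizbox schema.
interface ArtifactWorkProduct {
  type: 'artifact';
  createdByRunId: string;
  metadata: { attachmentId: string };
}

function isNonEmptyString(value: unknown): value is string {
  return typeof value === 'string' && value.length > 0;
}

// Runtime type guard: true only when the value carries everything an artifact must have.
function isArtifactWorkProduct(value: unknown): value is ArtifactWorkProduct {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const metadata = candidate.metadata;
  if (typeof metadata !== 'object' || metadata === null) return false;
  return (
    candidate.type === 'artifact' &&
    isNonEmptyString(candidate.createdByRunId) &&
    isNonEmptyString((metadata as Record<string, unknown>).attachmentId)
  );
}

// Fail fast at the boundary instead of letting a broken reference propagate.
function assertArtifactWorkProduct(value: unknown): ArtifactWorkProduct {
  if (!isArtifactWorkProduct(value)) {
    throw new Error('Invalid artifact work product: missing attachment metadata or createdByRunId');
  }
  return value;
}
```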

Artifact persistence and UI updates for issue-backed runs

#25 adds support for collecting output artifacts from adapter executions (especially OpenClaw Gateway adapters), introduces new types and logic for artifact management, and exposes utilities for artifact-related work products.

Open challenge: Artifact handling is still evolving. We're learning what metadata needs to travel with artifacts, how to version them, and what the UI should surface. Feedback welcome.

Agent thread chat with optimistic UI

#21 adds a direct communication channel between operators and agents. Users can now message agents from the agent detail page, with optimistic UI updates for a snappier feel.

Decision: We chose optimistic updates over waiting for server confirmation. It makes the UI feel faster. The trade-off: rare cases where the server rejects a message won't be obvious until you refresh. We're watching for confusion signals.
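
The optimistic pattern, and one possible way to surface rejections later, can be sketched as two pure state transitions. This is an illustration rather than the code from #21: the `ChatMessage` shape, the `rejected` state, and both function names are hypothetical, and the text does not say that Bizbox marks rejected messages this way.

```typescript
// Hypothetical sketch: the message shape and function names are assumptions, not the Bizbox UI code.
interface ChatMessage {
  id: string;
  text: string;
  state: 'pending' | 'sent' | 'rejected';
}

// Show the message immediately, before the server has answered.
function addOptimistic(thread: readonly ChatMessage[], id: string, text: string): ChatMessage[] {
  return [...thread, { id, text, state: 'pending' }];
}

// Reconcile once the server responds: mark the matching message sent or rejected.
function settle(thread: readonly ChatMessage[], id: string, accepted: boolean): ChatMessage[] {
  return thread.map(
    (message): ChatMessage =>
      message.id === id ? { ...message, state: accepted ? 'sent' : 'rejected' } : message,
  );
}
```

Both functions return new arrays instead of mutating the thread, which keeps the pending view and the settled view easy to compare.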

Routine execution recovery logic

#22 fixes how Bizbox handles routine_execution issues in the blocked state. Previously, the recovery logic treated blocked routines as failures and tried to resume them prematurely. Now, blocked is recognized as a healthy, parked wait state.

Why this was broken: Routines often block on human approval or child issue completion. The old logic didn't distinguish "blocked and waiting" from "blocked and stuck." This change codifies the difference.
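
The difference between "waiting" and "stuck" can be sketched as a small decision function. This is not the recovery logic from #22: the `RecoveryAction` type, the `runIsActive` flag, and the assumed failure case (an in_progress issue with no active run) are hypothetical and serve only to show blocked being left alone.

```typescript
// Hypothetical sketch: everything except the 'blocked' rule is an assumption for illustration.
type RecoveryAction = 'resume' | 'leave_parked' | 'none';

// Decide what a recovery pass should do with a routine_execution issue.
function recoveryActionFor(issueStatus: string, runIsActive: boolean): RecoveryAction {
  if (issueStatus === 'blocked') {
    return 'leave_parked'; // blocked is a healthy wait state, not a failure to recover from
  }
  if (issueStatus === 'in_progress' && !runIsActive) {
    return 'resume'; // assumed failure case: marked in progress but nothing is running
  }
  return 'none';
}
```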

Upstream merge and OpenTelemetry metrics

#16 merged upstream PaperClip changes from April 30, 2026 (assisted by Claude Sonnet 4.6).

#14 adds OpenTelemetry metrics, starting with bizbox.issues.human_comments_total — a signal for human intervention frequency.

Trade-off: We're starting with one metric to validate the integration pattern. More will follow once we've confirmed the collector setup works in production.

agentParams refactor and regression fix

#24 fixes a regression introduced in v0.0.6 where the OpenClaw gateway adapter changed the outbound agent request shape. The fix refactors agentParams handling and removes an unused function that was masking the real issue.

Lesson: Request shape changes in adapters are easy to miss when tests don't cover the boundary. We added a test to catch this pattern in the future.

Workflow cleanup

#23 removes the sync-upstream workflow. We're switching to manual upstream merges (with AI assistance) for now.

Why: Automated upstream sync cost more in conflict resolution than it saved in merge time. Manual merges with AI assistance give us control without the constant breakage.

Decisions

Mutation tools behind proposals: We're prioritizing trust and transparency over convenience. Operators see and approve changes before agents make them.
Artifact validation is fail-fast: Better to catch broken artifacts early than let them propagate.
Blocked routine state is healthy: Routines can wait. Not every blocked issue is a failure.
Manual upstream merges: Automation failed here. Human-in-the-loop merges with AI assistance work better for our repo.

Trade-offs

Proposal flow adds friction: Every mutation requires approval. This is intentional for now, but we know it slows down agents. Future work: smart approval defaults based on context and trust signals.
Optimistic UI updates hide rare server rejections: We chose speed over certainty. Watching for user confusion.
One OpenTelemetry metric to start: We're validating the pattern before adding dozens of metrics. Risk: we might miss important signals early.

Open challenges

Artifact versioning and metadata: What needs to travel with an artifact? How do we version it? What should the UI surface? Still figuring this out.
Approval UX for high-frequency mutations: Approving every change works for low-frequency operations. It won't scale to high-frequency agent work. Need smarter defaults.
Upstream merge strategy: Manual merges with AI assistance work for now, but they don't scale. We need a better long-term approach.

Releases

v0.0.9 — May 6, 2026
v0.0.8 — May 5, 2026
v0.0.7 — May 5, 2026
v0.0.6 — May 5, 2026

This Build Log is grounded in real repo activity. Every claim links to a PR, issue, release, or ADR. No internal-only context, no invented features, no marketing fluff.

Questions? Join the discussion on GitHub.