
Playbook Builder — Agent Execution Flow

Transport: SSE + AG-UI. Each chat turn is a short-lived HTTPS POST to Flask, which proxies it into a bedrock-agentcore.invoke_agent_runtime call. The runtime emits AG-UI events (RUN_STARTED, TEXT_MESSAGE_*, TOOL_CALL_*, STATE_SNAPSHOT, CUSTOM, RUN_FINISHED) on the SSE response, and Flask passes the bytes through unmodified. There is no persistent connection: "the session" is a UUID plus a signed token, and continuity comes from AgentCore's runtimeSessionId stickiness, which routes repeat requests to the same microVM (abbreviated "mvm" below). Out-of-band actions (stop / reset / resume_interrupt) are sibling POSTs to /run/<sid>/<action> that land on the same mvm through the same stickiness mechanism.
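
To make the framing concrete, here is a minimal Python sketch of how a consumer could split the passthrough byte stream into AG-UI events. It assumes each frame carries JSON in `data:` lines, ends with a blank line, and uses `\n` line endings, and that the event name lives under a `type` key. `iter_sse_events` and the sample payloads are illustrative, not code from the repo.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield one decoded JSON object per SSE frame.

    Frames may be split across network chunks, so bytes are buffered
    until a blank line (the SSE frame terminator) is seen.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n\n" in buffer:
            frame, buffer = buffer.split(b"\n\n", 1)
            # Keep only the data lines; comments and other fields are skipped.
            data_lines = [
                line[len(b"data:"):].strip()
                for line in frame.split(b"\n")
                if line.startswith(b"data:")
            ]
            if data_lines:
                yield json.loads(b"\n".join(data_lines).decode("utf-8"))


# Example: the second frame arrives split across two chunks.
sample_chunks = [
    b'data: {"type": "RUN_STARTED", "threadId": "t1", "runId": "r1"}\n\n',
    b'data: {"type": "TEXT_MESSAGE_CONTENT", "del',
    b'ta": "Hi"}\n\ndata: {"type": "RUN_FINISHED"}\n\n',
]
events = list(iter_sse_events(sample_chunks))
```

Because Flask forwards bytes unmodified, this buffering has to live in whichever component finally parses the stream (the browser, in this design).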

Session Establishment (Authenticated)

mermaid
sequenceDiagram
    participant Browser as Browser (Vue)
    participant Flask as Flask (MWA)
    participant AC as AgentCore Runtime

    Browser->>Flask: POST /api/agent/<type>/session (session cookie)
    Note over Flask: 1. Validate session cookie<br/>2. Read user_id, vendor_id<br/>3. Mint HMAC-signed token:<br/>   {session_id, user_id, vendor_id, exp}<br/>   signed with AGENT_SIGNING_SECRET
    Flask-->>Browser: { session_id, agent_url, token }

    Note over Browser,AC: No AgentCore call yet — the mvm is not warmed<br/>until the first /run POST. agent_url is the<br/>Flask proxy base — the browser never speaks to<br/>AgentCore directly.
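
The token step can be sketched as follows. The diagram only states that the token is HMAC-signed with AGENT_SIGNING_SECRET and carries {session_id, user_id, vendor_id, exp}. The wire format here (base64url JSON, a dot, then a hex HMAC-SHA256) and the function names `mint_token` / `verify_token` are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def mint_token(claims: dict, secret: bytes) -> str:
    """Serialize the claims and append an HMAC-SHA256 signature (assumed format)."""
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature matches and the token has not expired."""
    payload_b64, _, signature = token.partition(".")
    expected = hmac.new(secret, payload_b64.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    if claims["exp"] < (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


# Example round trip with a throwaway secret.
example_secret = b"not-the-real-secret"
example_token = mint_token(
    {"session_id": "s1", "user_id": 7, "vendor_id": 3, "exp": time.time() + 900},
    example_secret,
)
example_claims = verify_token(example_token, example_secret)
```

Verifying in the dispatcher, rather than trusting the body's user_id and vendor_id, is what lets identity be locked from the token alone.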

Chat Turn — First Message (Router to Playbooks Handoff)

mermaid
sequenceDiagram
    participant Browser as Browser (Vue)
    participant Flask as Flask Proxy
    participant Disp as Dispatcher (in mvm)
    participant Cache as AgentSessionCache
    participant Router as Router Agent (Haiku)
    participant Orch as Playbook Orchestrator (Sonnet)

    Browser->>Flask: POST {agent_url}/run<br/>body: { session_id, token, action: "chat",<br/>prompt: "Onboard new clients...", ui_context, ... }
    Flask->>Disp: invoke_agent_runtime(<br/>  runtimeSessionId=session_id,<br/>  payload=body)<br/>routed to same mvm by sid

    Note over Disp: extract_token(body) → AuthContext<br/>(token validated, identity locked)

    Disp-->>Flask: SSE: data: { RUN_STARTED, threadId, runId }
    Flask-->>Browser: passes SSE bytes through

    Disp->>Cache: get_or_init_state(session_id, user_id, vendor_id)
    Note over Cache: First call: create Agent,<br/>agent_state dict, EphemeralState.<br/>Persists across turns via mvm stickiness.
    Cache-->>Disp: (agent_state, ephemeral)

    Note over Disp: active_domain == "router"

    Disp->>Router: invoke_async(content_blocks)
    Note over Router: Single-turn classifier<br/>Structured output: RouterDecision
    Router-->>Disp: { domain: "playbooks", handoff_context }

    Disp-->>Flask: SSE: data: { CUSTOM, name: "control.handoff", value: { target: "playbooks" } }
    Flask-->>Browser: same

    Note over Disp: Auto-trigger _handle_domain_turn<br/>with same content_blocks

    Disp->>Orch: system_prompt = build_system_prompt(canvas=EMPTY, handoff_context)
    Disp->>Orch: stream_async(content_blocks)

    loop Strands Agent Loop
        Orch->>Orch: LLM call (Sonnet)
        Orch->>Orch: playbook_tool("set_outline", ...)
        Orch-->>Flask: SSE: { TOOL_CALL_START, ARGS, END }<br/>(toolCallName: "set_playbook_outline")
        Flask-->>Browser: same
        Note over Browser: useAgentSse dispatches onToolCall<br/>→ Pinia store.applyToolCall()<br/>→ Canvas renders outline
        Orch-->>Flask: SSE: { TEXT_MESSAGE_START, CONTENT*, END }<br/>(delta: "Here's an outline...")
        Flask-->>Browser: same
    end

    Disp->>Cache: sync_state_to_agent(session_id)
    Disp-->>Flask: SSE: { TOOL_CALL_* } for suggest_replies<br/>{ STATE_SNAPSHOT, snapshot: { phase } }<br/>{ RUN_FINISHED }
    Flask-->>Browser: same
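
The router-to-domain handoff above can be sketched in a few lines of Python. This is a simplified model, not the dispatcher's real code: `dispatch_chat`, `classify`, and `run_domain` are hypothetical stand-ins for the router agent and the orchestrator stream, and the event dicts assume a `type` key.

```python
import json
from typing import Callable, Iterator


def sse(event: dict) -> str:
    """Encode one event as an SSE data frame."""
    return "data: " + json.dumps(event) + "\n\n"


def dispatch_chat(
    state: dict,
    prompt: str,
    classify: Callable[[str], str],
    run_domain: Callable[[str, str], Iterator[dict]],
) -> Iterator[str]:
    yield sse({"type": "RUN_STARTED"})
    if state["active_domain"] == "router":
        # Single-turn classification, then persist the decision in session state.
        target = classify(prompt)
        state["active_domain"] = target
        yield sse({"type": "CUSTOM", "name": "control.handoff", "value": {"target": target}})
    # Auto-trigger the domain turn with the same prompt in the same request.
    for event in run_domain(state["active_domain"], prompt):
        yield sse(event)
    yield sse({"type": "RUN_FINISHED"})


def fake_classify(prompt: str) -> str:
    return "playbooks"


def fake_domain(domain: str, prompt: str) -> Iterator[dict]:
    yield {"type": "TEXT_MESSAGE_CONTENT", "delta": "[" + domain + "] " + prompt}


def decode(frames: Iterator[str]) -> list:
    return [json.loads(frame[len("data: "):]) for frame in frames]


state = {"active_domain": "router"}
first_turn = decode(dispatch_chat(state, "Onboard new clients", fake_classify, fake_domain))
second_turn = decode(dispatch_chat(state, "Looks good, build it!", fake_classify, fake_domain))
```

The handoff event appears only on the first turn; once active_domain is stored, later turns skip the router entirely.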

Chat Turn — Subsequent (Direct to Playbooks)

mermaid
sequenceDiagram
    participant Browser as Browser (Vue)
    participant Flask as Flask Proxy
    participant Disp as Dispatcher (same mvm)
    participant Cache as AgentSessionCache
    participant Orch as Playbook Orchestrator (Sonnet)

    Browser->>Flask: POST {agent_url}/run<br/>body: { session_id, token, action: "chat", prompt: "Looks good, build it!" }
    Flask->>Disp: invoke_agent_runtime(<br/>  runtimeSessionId=session_id,<br/>  payload=body)<br/>same sid → same mvm

    Disp->>Cache: get_or_init_state(session_id, user_id, vendor_id)
    Note over Cache: Returns existing entry<br/>(same agent, same state dict)<br/>This is what mvm stickiness buys us.
    Cache-->>Disp: (agent_state, ephemeral)

    Note over Disp: active_domain == "playbooks"

    Disp->>Orch: system_prompt = build_system_prompt(canvas: 6 modules, 0 tasks)
    Disp->>Orch: stream_async(content_blocks)

    Orch->>Orch: LLM sees canvas state, calls build_modules
    Orch-->>Flask: SSE: { TOOL_CALL_*, toolCallName: "build_modules" }
    Flask-->>Browser: same
    Orch-->>Flask: SSE: { TEXT_MESSAGE_*, delta: "Building 6 modules now..." }
    Flask-->>Browser: same

    Disp->>Cache: sync_state_to_agent(session_id)
    Disp-->>Flask: SSE: { STATE_SNAPSHOT, snapshot: { phase: "building" } }<br/>{ RUN_FINISHED }
    Flask-->>Browser: same
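
What mvm stickiness buys is easiest to see in a toy version of the cache. The sketch below is an assumption about shape only: `SessionCache` and `SessionEntry` are illustrative names, the state fields are placeholders, and the identity check on reuse is added for illustration rather than taken from the source.

```python
from dataclasses import dataclass, field


@dataclass
class SessionEntry:
    user_id: int
    vendor_id: int
    agent_state: dict = field(default_factory=lambda: {"active_domain": "router"})
    ephemeral: dict = field(default_factory=dict)


class SessionCache:
    """In-process cache; it only helps if repeat requests reach the same process."""

    def __init__(self) -> None:
        self._entries = {}

    def get_or_init_state(self, session_id: str, user_id: int, vendor_id: int):
        entry = self._entries.get(session_id)
        if entry is None:
            # First call for this session on this mvm: build fresh state.
            entry = SessionEntry(user_id, vendor_id)
            self._entries[session_id] = entry
        elif (entry.user_id, entry.vendor_id) != (user_id, vendor_id):
            # Illustrative guard: a session id must not be reused by another identity.
            raise PermissionError("session belongs to a different identity")
        return entry.agent_state, entry.ephemeral

    def delete(self, session_id: str) -> None:
        self._entries.pop(session_id, None)


cache = SessionCache()
state_turn_1, _ = cache.get_or_init_state("s1", 7, 3)
state_turn_1["active_domain"] = "playbooks"
state_turn_2, _ = cache.get_or_init_state("s1", 7, 3)
```

Here `state_turn_2` is the same dict object as `state_turn_1`, so the second turn sees the domain chosen in the first without any external store.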

Background Module Building

mermaid
sequenceDiagram
    participant Tool as build_modules tool
    participant App as BedrockAgentCoreApp
    participant Exec as ThreadPoolExecutor (max 3)
    participant Builder as Module Builder (x N)
    participant Eph as EphemeralState
    participant Browser as Browser (Vue)

    Tool->>App: add_async_task("module_building")
    Note over App: Health checks report "HealthyBusy"<br/>Prevents 15-min idle kill

    Tool->>Exec: submit(_build_single_module_sync) per module

    loop Each module (up to 3 parallel)
        Exec->>Builder: _build_single_module_sync(module_index)
        Note over Builder: detach_otel_context()<br/>Creates independent MLflow trace
        Builder->>Builder: Agent(tools=[add_tasks, add_task_step])
        Builder->>Builder: add_tasks(module_index, tasks=[...])
        Builder->>Eph: on_tool_call(tool_name, args)
        Eph->>Browser: push_event({ type: "tool_call", tool, args })
        Note over Browser: Each task appears on<br/>canvas in real-time
    end

    Note over Exec: _monitor thread waits for all futures

    Exec->>Eph: push_event({ type: "status", phase: "refinement" })
    Eph->>Browser: "Done! Take a look and let me know what to change."
    Exec->>App: complete_async_task(task_id)
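
The fan-out pattern in this diagram can be sketched with the standard library alone. The code below is a simplified model: a `queue.Queue` stands in for EphemeralState's event push, `build_single_module` is a stub for the real builder agent, and the AgentCore add_async_task / complete_async_task calls are left out.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor, wait


def build_single_module(module_index: int, events: queue.Queue) -> int:
    # Stub for the builder agent: report one tool call for this module.
    events.put({"type": "tool_call", "tool": "add_tasks", "args": {"module_index": module_index}})
    return module_index


def build_modules(module_count: int, events: queue.Queue) -> threading.Thread:
    """Submit one job per module (at most 3 running at once) and return the monitor thread."""
    executor = ThreadPoolExecutor(max_workers=3)
    futures = [executor.submit(build_single_module, i, events) for i in range(module_count)]

    def monitor() -> None:
        # Block until every builder has finished, then announce the phase change.
        wait(futures)
        events.put({"type": "status", "phase": "refinement"})
        executor.shutdown()

    thread = threading.Thread(target=monitor, daemon=True)
    thread.start()
    return thread


events = queue.Queue()
build_modules(6, events).join()

collected = []
while not events.empty():
    collected.append(events.get())
```

Running the monitor on its own thread is what lets the tool call return immediately while builders keep streaming events; the status event is always last because it is only queued after all futures complete.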

Save Flow

mermaid
sequenceDiagram
    participant Browser as Browser (Vue)
    participant Store as Pinia Store
    participant Flask as Flask (MWA)
    participant DB as PostgreSQL

    Note over Browser: User clicks "Save Playbook"
    Browser->>Store: read store.playbook
    Browser->>Flask: POST /agent/save-playbook { playbook: store.playbook }
    Note over Flask: Uses current_user (Flask-Login)<br/>No AgentCore involvement

    Flask->>Flask: save_playbook_from_session()

    Flask->>DB: Create ORPlaybook
    Flask->>DB: Create ORModule (x N)
    Flask->>DB: Create ORTask (x N per module)
    Flask->>DB: Create ORTaskStep (x N per task)

    Flask-->>Browser: { playbook_id, playbook_id_decoded }
    Browser->>Browser: router.push to playbook edit page
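
The parent-before-child insert order in the save flow can be illustrated with an in-memory SQLite database. This is only a sketch: the real service writes ORPlaybook / ORModule / ORTask / ORTaskStep rows to PostgreSQL, and the table names, columns, and the shape of the `playbook` dict below are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE playbook (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE module (id INTEGER PRIMARY KEY, playbook_id INTEGER REFERENCES playbook(id), name TEXT);
CREATE TABLE task (id INTEGER PRIMARY KEY, module_id INTEGER REFERENCES module(id), name TEXT);
CREATE TABLE task_step (id INTEGER PRIMARY KEY, task_id INTEGER REFERENCES task(id), body TEXT);
"""


def save_playbook(conn: sqlite3.Connection, playbook: dict) -> int:
    """Insert the playbook tree top-down inside one transaction and return its id."""
    with conn:  # commits on success, rolls back if any insert raises
        playbook_id = conn.execute(
            "INSERT INTO playbook (name) VALUES (?)", (playbook["name"],)
        ).lastrowid
        for module in playbook.get("modules", []):
            module_id = conn.execute(
                "INSERT INTO module (playbook_id, name) VALUES (?, ?)",
                (playbook_id, module["name"]),
            ).lastrowid
            for task in module.get("tasks", []):
                task_id = conn.execute(
                    "INSERT INTO task (module_id, name) VALUES (?, ?)",
                    (module_id, task["name"]),
                ).lastrowid
                conn.executemany(
                    "INSERT INTO task_step (task_id, body) VALUES (?, ?)",
                    [(task_id, step) for step in task.get("steps", [])],
                )
    return playbook_id


example_conn = sqlite3.connect(":memory:")
example_conn.executescript(SCHEMA)
example_id = save_playbook(
    example_conn,
    {"name": "Onboarding", "modules": [{"name": "Kickoff", "tasks": [{"name": "Welcome call", "steps": ["Schedule"]}]}]},
)
```

Wrapping the whole tree in one transaction means a failure halfway through does not leave a playbook with missing modules.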

Component Initialization

mermaid
sequenceDiagram
    participant User
    participant Vue as AiPlaybookBuilder.vue
    participant Chat as ChatPane.vue
    participant Canvas as CanvasPane.vue
    participant Store as playbookBuilder.store
    participant SSE as useAgentSse
    participant API as agent.js
    participant Flask as Flask (MWA)
    participant AC as AgentCore Runtime

    User->>Vue: Navigate to AI Playbook Builder
    Vue->>Store: usePlaybookBuilderStore()
    Vue->>SSE: useAgentSse()

    Vue->>API: createSession()
    API->>Flask: POST /api/agent/copilot/session
    Flask-->>API: { session_id, agent_url, token }
    API-->>Vue: session config

    Note over Vue,SSE: No "connect" step — SSE is per-request.<br/>Vue stores agent_url + token in conversation state<br/>and includes them in every subsequent /run POST.

    Vue->>Chat: mount (message list, input)
    Vue->>Canvas: mount (empty state)

    User->>Chat: types message, hits send
    Chat->>SSE: startRun({<br/>  url: `${agent_url}/run`,<br/>  body: { session_id, token, action: "chat", prompt, files },<br/>  callbacks: { onText, onToolCall, onDone, onError, ... }<br/>})
    SSE->>Flask: POST {agent_url}/run<br/>(fetch + ReadableStream reader)
    Flask->>AC: invoke_agent_runtime(runtimeSessionId, payload)

    Note over AC: _dispatch_chat() runs<br/>(see Chat Turn diagrams)

    AC-->>Flask: SSE: { TOOL_CALL_START, ARGS (delta=JSON), END }
    Flask-->>SSE: passes SSE bytes through unmodified
    SSE->>Chat: callbacks.onToolCall(tool, args)<br/>(parsed from TOOL_CALL_ARGS.delta)
    Chat->>Vue: emit("tool-call", { tool, args })
    Vue->>Store: applyToolCall(tool, args)
    Store->>Canvas: reactivity updates canvas

    AC-->>Flask: SSE: { TEXT_MESSAGE_CONTENT, delta }
    Flask-->>SSE: same
    SSE->>Chat: callbacks.onText(delta)

    AC-->>Flask: SSE: { CUSTOM, name: "portal.done", value }<br/>{ RUN_FINISHED }
    Flask-->>SSE: same
    SSE->>Chat: callbacks.onDone(value)<br/>(fired on portal.done — RUN_FINISHED is the<br/>envelope close, not a separate handler call)
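
The callback mapping at the end of this diagram can be modelled as below. The real logic lives in the useAgentSse composable in the browser; this Python version is only a sketch of the same dispatch rules, and the `toolCallId` field used to pair START / ARGS / END events is an assumption about the event payloads.

```python
import json


def handle_events(events, on_text, on_tool_call, on_done):
    """Route decoded AG-UI events to callbacks, reassembling streamed tool arguments."""
    pending = {}  # toolCallId -> {"name": str, "args": [str fragments]}
    for event in events:
        kind = event["type"]
        if kind == "TOOL_CALL_START":
            pending[event["toolCallId"]] = {"name": event["toolCallName"], "args": []}
        elif kind == "TOOL_CALL_ARGS":
            pending[event["toolCallId"]]["args"].append(event["delta"])
        elif kind == "TOOL_CALL_END":
            call = pending.pop(event["toolCallId"])
            # Arguments arrive as JSON text fragments; parse once the call is complete.
            on_tool_call(call["name"], json.loads("".join(call["args"]) or "{}"))
        elif kind == "TEXT_MESSAGE_CONTENT":
            on_text(event["delta"])
        elif kind == "CUSTOM" and event["name"] == "portal.done":
            on_done(event.get("value"))
        # RUN_FINISHED only closes the envelope, so no callback fires for it.


log = []
handle_events(
    [
        {"type": "TOOL_CALL_START", "toolCallId": "c1", "toolCallName": "build_modules"},
        {"type": "TOOL_CALL_ARGS", "toolCallId": "c1", "delta": '{"count"'},
        {"type": "TOOL_CALL_ARGS", "toolCallId": "c1", "delta": ": 6}"},
        {"type": "TOOL_CALL_END", "toolCallId": "c1"},
        {"type": "TEXT_MESSAGE_CONTENT", "delta": "Building 6 modules now..."},
        {"type": "CUSTOM", "name": "portal.done", "value": {"ok": True}},
        {"type": "RUN_FINISHED"},
    ],
    on_text=lambda delta: log.append(("text", delta)),
    on_tool_call=lambda tool, args: log.append(("tool", tool, args)),
    on_done=lambda value: log.append(("done", value)),
)
```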

Out-of-Band Action: Stop / Reset / Resume Interrupt

The SSE response stream is one-way, so out-of-band actions can no longer be sent as a message on an open WebSocket. Instead, the browser POSTs to a sibling endpoint, and AgentCore routes that request to the same mvm via runtimeSessionId stickiness.

mermaid
sequenceDiagram
    participant Browser as Browser (Vue)
    participant Flask as Flask Proxy
    participant AC as AgentCore (same mvm)
    participant Disp as Dispatcher

    Note over Browser,Disp: A /run stream is in flight on the chat path.

    Browser->>Flask: POST {agent_url}/run/<sid>/stop<br/>body: { token }
    Flask->>AC: invoke_agent_runtime(<br/>  runtimeSessionId=sid,<br/>  payload={ action: "stop", session_id, token })
    AC->>Disp: dispatch_action("stop", payload, auth)
    Note over Disp: _stop_events[sid].set()<br/>+ entry.agents[*].cancel()<br/>(load-bearing: same mvm has<br/>the dict the chat loop is polling)

    par
        Disp-->>Flask: SSE on /stop: { CUSTOM, name: "run.cancelled" }<br/>{ RUN_FINISHED }
        Flask-->>Browser: same
    and
        Disp-->>Flask: SSE on original /run (chat loop sees the stop flag):<br/>{ CUSTOM, name: "run.cancelled" }<br/>{ RUN_FINISHED }
        Flask-->>Browser: same
    end

    Note over Browser: useAgentSse maps run.cancelled → onInterrupted callback.

reset is the deliberate counterexample. Its job is to throw away the cached state, so it doesn't depend on stickiness: the handler calls _cache.delete(session_id), and the next /run mints fresh state on whatever mvm it lands on.

Observability — Local (MLflow)

When MLFLOW_TRACKING_URI is set, init_telemetry() registers MLflow's StrandsSpanProcessor, which intercepts OTel spans and writes them to the MLflow tracking server.

mermaid
sequenceDiagram
    participant Orch as Orchestrator
    participant B0 as Builder 0
    participant B1 as Builder 1
    participant MLflow as MLflow Server

    Note over Orch: AgentObservabilityHook<br/>enriches with session_id, user_id, vendor_id

    activate Orch
    Orch->>MLflow: Start trace (orchestrator)
    Orch->>MLflow: agent_loop span
    Orch->>MLflow: model_invoke span
    Orch->>MLflow: tool_use: playbook_tool span
    Orch->>MLflow: tool_use: build_modules span
    Orch->>MLflow: Export + close trace
    deactivate Orch

    Note over B0,B1: detach_otel_context()<br/>Each builder creates its own root trace

    activate B0
    B0->>MLflow: Start trace (builder_0)
    B0->>MLflow: agent_loop span
    B0->>MLflow: model_invoke span
    B0->>MLflow: tool_use: add_tasks span
    B0->>MLflow: Export + close trace
    deactivate B0

    activate B1
    B1->>MLflow: Start trace (builder_1)
    B1->>MLflow: agent_loop span
    B1->>MLflow: Export + close trace
    deactivate B1

    Note over Orch,MLflow: Traces are independent roots,<br/>correlated by shared metadata:<br/>session_id, user_id, vendor_id

Observability — Deployed (AgentCore Runtime + CloudWatch)

When deployed to AgentCore Runtime, the runtime auto-instruments the process with AWS Distro for OpenTelemetry (ADOT). No MLflow, no code changes — OTel spans created by the Strands SDK flow to CloudWatch automatically. The trace_attributes set on each Agent (session.id, user.id, vendor_id, agent_type) become CloudWatch span attributes.

mermaid
sequenceDiagram
    participant Orch as Orchestrator
    participant B0 as Builder 0
    participant B1 as Builder 1
    participant ADOT as ADOT Collector
    participant CW as CloudWatch

    Note over Orch,CW: AgentCore Runtime auto-configures ADOT.<br/>No MLFLOW_TRACKING_URI set — init_telemetry() is a no-op.

    activate Orch
    Orch->>ADOT: OTel spans (orchestrator trace)
    Orch->>ADOT: agent_loop, model_invoke, tool_use spans
    ADOT->>CW: Export trace
    deactivate Orch

    Note over B0,B1: detach_otel_context()<br/>Independent root traces (same as local)

    activate B0
    B0->>ADOT: OTel spans (builder_0 trace)
    ADOT->>CW: Export trace
    deactivate B0

    activate B1
    B1->>ADOT: OTel spans (builder_1 trace)
    ADOT->>CW: Export trace
    deactivate B1

    Note over Orch,CW: CloudWatch dashboard shows latency,<br/>token usage, errors. Traces correlated<br/>by session.id span attribute.

Note: Builder traces are independent root traces (not child spans of the orchestrator). This is intentional — the orchestrator trace may be exported before builders finish, which would orphan child spans. Traces are correlated via session_id metadata instead.
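
The effect of detaching context can be illustrated with plain contextvars, which is what the OpenTelemetry Python API uses for its runtime context by default. The internals of detach_otel_context() are not shown in this document, so the sketch below is an analogy under that assumption: `current_trace`, `start_builder_trace`, and `run_detached` are hypothetical stand-ins, not OTel calls.

```python
import contextvars

# Stand-in for the "current span" slot that tracing libraries keep in context.
current_trace = contextvars.ContextVar("current_trace", default=None)


def start_builder_trace(name: str) -> str:
    """Join the active trace if one is visible, otherwise start a new root."""
    parent = current_trace.get()
    return parent if parent is not None else "root:" + name


def run_detached(fn, *args):
    """Run fn in an empty context so it cannot see the caller's trace."""
    return contextvars.Context().run(fn, *args)


current_trace.set("root:orchestrator")

# With the caller's context copied in, the builder becomes a child of the orchestrator.
inherited = contextvars.copy_context().run(start_builder_trace, "builder_0")

# With an empty context, the builder starts its own root trace.
detached = run_detached(start_builder_trace, "builder_0")
```

The detached variant is the behaviour the note describes: each builder owns a root trace, so an early export of the orchestrator trace cannot orphan it.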

Local vs. deployed: The agent code is identical in both environments. The only difference is which OTel exporter is configured externally — MLflow's StrandsSpanProcessor locally, or ADOT's OTLP exporter in AgentCore Runtime.
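
The environment switch can be sketched as a single guard. This is a simplified model of the behaviour described above, not the real init_telemetry(): the signature is invented for testability, and `register_processor` is a hypothetical callback standing in for whatever registers MLflow's span processor.

```python
from typing import Callable, Mapping


def init_telemetry(
    environ: Mapping[str, str],
    register_processor: Callable[[str], None],
) -> bool:
    """Register the local exporter only when a tracking URI is configured."""
    tracking_uri = environ.get("MLFLOW_TRACKING_URI")
    if not tracking_uri:
        # Deployed case: the runtime has already configured ADOT, so do nothing.
        return False
    register_processor(tracking_uri)
    return True


registered = []
local_enabled = init_telemetry({"MLFLOW_TRACKING_URI": "http://localhost:5000"}, registered.append)
deployed_enabled = init_telemetry({}, registered.append)
```

Keeping the decision in one place is what allows the agent code itself to stay identical across environments.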

Internal documentation — gated behind Cloudflare Access.