Operating Live Instances

This page describes how operators inspect and act on running and completed process instances from the BPMN UI.

The two views you'll spend time in:

View | Purpose
BPMN Instances | List of instances across all deployed processes: filter, drill in, cancel
Instance detail | Everything about a single instance: incidents, history, jobs, user tasks, child instances, variables, and a live diagram view

Instances list

Open BPMN Instances from the BPMN navigation. You'll see one row per instance with workflow ID, status, started timestamp, and a link into the detail view.

Status | Meaning
RUNNING | The instance is in flight
COMPLETED | The instance reached an end event
FAILED | The instance terminated with an error
CANCELED | The instance was cancelled (by an operator or by a parent process)

Filtering

The status filter is deep-linkable — ?status=RUNNING, ?status=FAILED, etc. The operational dashboard's KPI cards link directly here.
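
As a rough illustration of the deep link, the sketch below builds a filtered list URL. Only the `status` query parameter and its four values come from this page; the base URL passed in is a placeholder, so substitute the address of the instances list in your deployment.

```python
from urllib.parse import urlencode

# Status values listed in the status table on this page.
KNOWN_STATUSES = {"RUNNING", "COMPLETED", "FAILED", "CANCELED"}


def instances_link(base_url: str, status: str) -> str:
    """Return a deep link to the instances list filtered by status.

    base_url is a placeholder for the list page's address in your deployment.
    """
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return f"{base_url}?{urlencode({'status': status})}"


# Hypothetical host and path, for illustration only.
print(instances_link("https://example.invalid/bpmn/instances", "FAILED"))
```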

The list can also be scoped to a single deployed process, in which case only instances of that process are shown.

Auto-refresh

When any instance on the page is RUNNING, the list auto-refreshes every 8 seconds. Once everything's terminal, the polling stops.
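
The polling rule can be pictured as a small decision function. This is a sketch of the behavior described above, not the UI's actual code; the function name is made up for illustration.

```python
from typing import Iterable, Optional

REFRESH_SECONDS = 8  # interval stated on this page


def next_refresh_delay(statuses: Iterable[str]) -> Optional[int]:
    """Return the delay before the next refresh, or None when polling stops.

    Polling continues only while at least one visible instance is RUNNING.
    """
    if any(status == "RUNNING" for status in statuses):
        return REFRESH_SECONDS
    return None


print(next_refresh_delay(["COMPLETED", "RUNNING"]))  # 8
print(next_refresh_delay(["COMPLETED", "FAILED"]))   # None
```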

Cancelling from the list

Each running instance has a cancel action. It asks for confirmation and then cancels the instance immediately. Completed, failed, and canceled instances cannot be cancelled.

Instance detail

Click into a row to open the detail page. The top of the page shows:

  • The instance's status (with a count of incidents if any are open).
  • The workflow ID, in monospace, for copy/paste.
  • A breadcrumb back to parent instances, when the instance was started by a call activity (see Drilling into child instances).
  • A View Diagram action that opens a full-screen diagram view.
  • A Cancel action (running instances only).

If the instance is FAILED, the failure reason is shown in a banner at the top.

The body is a multi-open accordion with the following sections.

Incidents

An incident is raised when an activity throws an error that no handler catches. The instance pauses at the failed node and waits for an operator to resolve it.

Each incident shows:

Field | Description
Node | The failed node's ID
Type | Error type as reported by the activity
Message | The error message
Time | When the incident was created

What causes an incident

An error becomes an incident when no matching handler can catch it. The common paths:

Source | When it produces an incident
External worker | A worker reports an error and the retry budget is exhausted with no error boundary catching the BPMN error
User task | A caller submits ThrowError with no matching error boundary on the user task or up the scope chain
Script task | A FEEL expression in the script body raises an evaluation error (division by zero, unresolved variable, type mismatch) with no error boundary attached
Business-rule task | The called DMN decision fails to evaluate or the result mapping fails, with no error boundary attached
Uncaught error throw | An error end event or thrown error from an activity that bubbles up to the root with no matching error boundary or error event sub-process anywhere on the path

If an error is caught — by an error boundary on the activity, a parent's boundary, or an error event sub-process — there's no incident. The token routes through the handler.
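
The caught-versus-incident rule amounts to a walk up the scope chain. The following is a simplified model, not the engine's implementation: the `Scope` class, the error codes, and the node names are all invented for illustration, and it matches handlers by exact error code only.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Scope:
    """One level of the scope chain: an activity, a sub-process, or the root."""
    name: str
    handled_codes: Set[str] = field(default_factory=set)  # codes caught at this level
    parent: Optional["Scope"] = None


def route_error(origin: Scope, error_code: str) -> str:
    """Walk from the failing activity up to the root looking for a handler."""
    scope: Optional[Scope] = origin
    while scope is not None:
        if error_code in scope.handled_codes:
            return f"caught by {scope.name}"
        scope = scope.parent
    # Nothing on the path matched, so the error surfaces as an incident.
    return "incident"


root = Scope("order-process")
fulfilment = Scope("fulfilment-subprocess", {"OUT_OF_STOCK"}, parent=root)
charge = Scope("charge-card", parent=fulfilment)

print(route_error(charge, "OUT_OF_STOCK"))   # caught by fulfilment-subprocess
print(route_error(charge, "CARD_DECLINED"))  # incident
```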

To act on an incident, open the diagram view and use the run panel — see Resolving incidents.

Activity history

A chronological table of every activity that has fired in the instance:

Field | Description
Node | The activity's ID
Type | The element type (serviceTask, userTask, exclusiveGateway, etc.)
Status | Started, Completed, Failed, or Canceled
Started | Timestamp of entry
Duration | Elapsed time, or "Running" if still active

This is the same data the replay slider works from.

External jobs

Service-task and worker-mode send-task jobs that the engine has dispatched. Useful when investigating a stuck job or confirming that a worker actually picked one up.

Field | Description
Node | The activity that produced the job
Task type | The worker type string the job was dispatched with
Status | PENDING, COMPLETED, FAILED, or CANCELED
Cancel reason | The reason the job was cancelled, when applicable
Created / Completed | Timestamps

For the worker-author side of this, see External workers.

User tasks

The user-task lifecycle for the instance:

Field | Description
Node | The user task's ID
Assignee | The single user assigned (if any)
Candidate groups | Groups eligible to claim the task
Status | CREATED, COMPLETED, FAILED, or CANCELED
Detail | Error code (when FAILED) or cancel reason (when CANCELED)
Created | When the task was registered

Child instances

Instances spawned by call activities in this process. Each row links to the child's detail page; clicking Open preserves the breadcrumb so you can walk back up.

Variables

The instance's current variable scope as a JSON object. Empty when the instance has no variables set.

Diagram view

The View Diagram action opens a full-screen modal with the deployed BPMN diagram, overlaid with the instance's live state.

Overlay colors

Color | Meaning
Active | The node is currently executing
Completed | The node finished normally
Failed | The node threw an error
Incident | The node has an unresolved incident

The overlay updates every 3 seconds while the instance is running.

Replay slider

Below the diagram, a replay slider scrubs through the recorded history one event at a time. The slider runs from the first event (step 1) to the latest (step N). Each tick reflects one history entry, labelled with the node ID and what happened (Started, Completed, Failed, or Canceled).

Mode | Behavior
Live (default) | The diagram shows the current state; the run panel is interactive
Replay | The diagram shows historical state at the chosen step; the run panel is hidden

Click Live to drop out of replay mode and return to the current state.
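
One way to picture replay is as folding the first N history entries into a per-node state. The sketch below assumes each entry is a (node ID, event) pair and that a node's displayed state is simply its most recent event; the real UI may derive the overlay differently.

```python
from typing import Dict, List, Tuple

HistoryEntry = Tuple[str, str]  # (node ID, event), e.g. ("charge-card", "Started")


def state_at_step(history: List[HistoryEntry], step: int) -> Dict[str, str]:
    """Replay entries 1..step and return the last event seen for each node."""
    if not 1 <= step <= len(history):
        raise ValueError(f"step must be between 1 and {len(history)}")
    state: Dict[str, str] = {}
    for node_id, event in history[:step]:
        state[node_id] = event  # later entries overwrite earlier ones
    return state


# Made-up history for illustration.
history = [
    ("start", "Started"),
    ("start", "Completed"),
    ("charge-card", "Started"),
    ("charge-card", "Failed"),
]

print(state_at_step(history, 3))  # {'start': 'Completed', 'charge-card': 'Started'}
print(state_at_step(history, 4))  # {'start': 'Completed', 'charge-card': 'Failed'}
```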

Run panel

The run panel sits to the right of the diagram and lets you act on the instance directly. It is visible only when the instance is RUNNING, you have edit permission, and the diagram is not in replay mode.

The panel sections, top to bottom:

Active elements

A list of nodes that are currently active — the same nodes highlighted on the diagram.

Pending tasks

For each pending external job, a card lets you:

  • Inspect the job's input variables and headers.
  • Complete the job by submitting an output variables JSON — the engine treats this exactly like a worker completion.
  • Throw error by submitting an error code and optional variables — the engine treats this as a worker-reported BPMN error, which routes the token through any matching error boundary.

This is most useful when you have no worker yet and want to drive the instance forward by hand, or to recover from a stuck worker.

Ad-hoc scopes

When the instance is inside an ad-hoc sub-process, the run panel offers per-scope controls:

  • Activate inner — pick one of the ad-hoc's child activities and start it. Useful when the scope's activeElementsCollection doesn't match what you want, or when activities should be triggered manually.
  • Update scope variables — submit a JSON object that's merged into the scope's variables. The completion condition is re-evaluated.
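
The merge-then-re-evaluate behavior of Update scope variables might be modelled as below. This is a sketch under two assumptions that this page does not confirm: the merge is shallow (top-level keys only), and the completion condition can be represented as a Python predicate rather than a FEEL expression.

```python
from typing import Any, Callable, Dict, Tuple

Variables = Dict[str, Any]


def update_scope_variables(
    scope_vars: Variables,
    update: Variables,
    completion_condition: Callable[[Variables], bool],
) -> Tuple[Variables, bool]:
    """Merge the update into the scope and re-check the completion condition."""
    merged = {**scope_vars, **update}  # keys in the update win
    return merged, bool(completion_condition(merged))


# Hypothetical scope: the ad-hoc sub-process completes once "approved" is true.
merged, done = update_scope_variables(
    {"approved": False, "reviewer": "sam"},
    {"approved": True},
    lambda v: v.get("approved") is True,
)
print(merged)  # {'approved': True, 'reviewer': 'sam'}
print(done)    # True
```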

Incidents

Each open incident shows the node ID, error message, and error code (if any) with a Retry button. Retrying resolves the incident and re-executes the failed node from the beginning.

Publish signal or message

Send a signal or message directly from the panel.

For a signal (broadcast — wakes every subscriber):

Field | Notes
Name | Signal name to broadcast
Variables | JSON payload merged into the recipients' scope
TTL | Optional buffer expiry: ISO 8601 duration (PT30M), duration string (30m), or absolute timestamp
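
To make the three TTL formats concrete, here is a minimal parser sketch. It is not the engine's parser: the accepted duration-string units (s, m, h, d), the supported ISO 8601 subset (days, hours, minutes, seconds), and the ISO timestamp format are all assumptions.

```python
import re
from datetime import datetime, timedelta
from typing import Union

# Subset of ISO 8601 durations: P[nD][T[nH][nM][nS]]
_ISO_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")
# Short duration strings such as "30m"; the unit set is an assumption.
_SHORT_DURATION = re.compile(r"^(\d+)([smhd])$")
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_ttl(text: str) -> Union[timedelta, datetime]:
    """Return a timedelta for durations or a datetime for absolute timestamps."""
    value = text.strip()
    iso = _ISO_DURATION.match(value)
    if iso and any(iso.groups()):  # reject a bare "P" or "PT"
        days, hours, minutes, seconds = (int(g or 0) for g in iso.groups())
        return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
    short = _SHORT_DURATION.match(value)
    if short:
        return timedelta(seconds=int(short.group(1)) * _UNIT_SECONDS[short.group(2)])
    # Otherwise treat it as an absolute timestamp. Older Python versions do not
    # accept a trailing "Z", so rewrite it as an explicit UTC offset.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)  # raises ValueError if unparseable


print(parse_ttl("PT30M"))  # 0:30:00
print(parse_ttl("30m"))    # 0:30:00
```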

For a message (point-to-point — wakes one matching subscriber):

Field | Notes
Name | Message name
Correlation | Either a single value (number, boolean, or string) or a JSON object; leave empty for no correlation
Variables | JSON payload
TTL | Optional buffer expiry, same formats as signal

If the Correlation value field holds a number, boolean, or quoted string, it is parsed and sent with that type. Anything else is sent as a plain string.
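
The typing rule for the value field (taken here to be the single-value form of Correlation) can be approximated with JSON parsing. This is a sketch of the stated rule, not the panel's code; the JSON-object form of Correlation is out of scope, and returning None for an empty field is an assumption standing in for "no correlation".

```python
import json
from typing import Union


def _reject_constant(name: str):
    # json.loads would otherwise accept NaN and Infinity, which are not JSON numbers.
    raise ValueError(name)


def parse_value_field(raw: str) -> Union[None, bool, int, float, str]:
    """Type a single correlation value the way the rule above describes."""
    text = raw.strip()
    if not text:
        return None  # empty field: no correlation
    try:
        parsed = json.loads(text, parse_constant=_reject_constant)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return raw      # not valid JSON: send as a plain string
    if isinstance(parsed, (bool, int, float, str)):
        return parsed   # number, boolean, or quoted string
    return raw          # null, arrays, objects: send as typed


print(repr(parse_value_field("42")))        # 42
print(repr(parse_value_field("true")))      # True
print(repr(parse_value_field('"A-1001"')))  # 'A-1001'
print(repr(parse_value_field("A-1001")))    # 'A-1001'
```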

Variables

The instance's current variables as a collapsible JSON block. Expands to a scrollable code view.

Activity history

A condensed view of the activity table from the detail page — node, status (with badge), start time, duration. Useful when you want to glance at recent activity without leaving the diagram view.

Cancelling an instance

A Cancel action is available in two places:

  • The cancel icon next to each running row in the instances list.
  • The cancel button in the instance detail header (running instances only).

Both ask for confirmation and terminate the workflow immediately. Cancellation propagates to child instances spawned by call activities.
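
Propagation to children can be thought of as a recursive walk over the call-activity tree. The model below is illustrative only: the `Instance` class is invented, the parent-first ordering is an assumption, and it assumes children that are already terminal are left untouched.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    workflow_id: str
    status: str = "RUNNING"
    children: List["Instance"] = field(default_factory=list)


def cancel(instance: Instance) -> List[str]:
    """Cancel a running instance and its running descendants.

    Returns the workflow IDs that were canceled, parent first.
    """
    if instance.status != "RUNNING":
        return []  # terminal instances cannot be cancelled
    instance.status = "CANCELED"
    canceled = [instance.workflow_id]
    for child in instance.children:
        canceled.extend(cancel(child))
    return canceled


# Made-up parent with one finished child and one running child.
parent = Instance(
    "order-1",
    children=[Instance("ship-1", status="COMPLETED"), Instance("bill-1")],
)
print(cancel(parent))  # ['order-1', 'bill-1']
```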

Resolving incidents

When an instance has unresolved incidents, work from the diagram view:

  1. Click View Diagram on the instance detail page.
  2. In the run panel, find the incident in the Incidents section.
  3. Click Retry. The engine clears the incident and re-runs the activity that produced it.

What "retry" actually does

A retry does not restart the whole instance. It re-runs only the activity that the incident is bound to:

  • For a service task, the engine creates a fresh job in the queue. A worker (or an operator using Pending tasks) picks it up and tries again.
  • For a user task, the task is re-registered as CREATED and waits for an assignee to act on it.
  • For a script task or business-rule task, the engine re-evaluates the script or the decision.

Variables and tokens elsewhere in the instance are unaffected. The activity's scope is the same as it was at the original entry — input mappings are not re-evaluated.
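
As a compact summary of the per-type behavior, the lookup below maps an element type to what a retry does. It is a reading aid rather than engine code; the `scriptTask` and `businessRuleTask` keys follow standard BPMN element names and are assumed to match the Type values shown in Activity history.

```python
# What a retry does, keyed by the failed node's element type.
RETRY_ACTIONS = {
    "serviceTask": "create a fresh job in the queue",
    "userTask": "re-register the task as CREATED",
    "scriptTask": "re-evaluate the script",
    "businessRuleTask": "re-evaluate the decision",
}


def retry_action(node_type: str) -> str:
    """Describe the retry behavior for the given element type."""
    try:
        return RETRY_ACTIONS[node_type]
    except KeyError:
        # This page only documents the four types above.
        raise ValueError(f"retry behavior not documented for {node_type!r}") from None


print(retry_action("userTask"))  # re-register the task as CREATED
```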

Adjusting input before retry

The resolve API accepts an optional variables map that's merged into the activity's scope before the retry runs. Use this when the input itself was the problem — for example, a FEEL expression failed because a value was missing or had the wrong shape, and you want to inject a correction without changing the model.

The run panel's Retry button submits an empty variables map by default. To pass corrections, call the API directly:

POST /projects/{projectID}/bpmn/instances/{workflowID}/incidents/{incidentID}/resolve
{ "variables": { "amount": 49.95 } }
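
A minimal sketch of preparing that request from Python with the standard library. The path and body shape come from the request shown above; the base URL and the IDs are placeholders, and authentication is not covered on this page, so add whatever headers your deployment requires.

```python
import json
from typing import Any, Dict, Optional
from urllib.parse import quote
from urllib.request import Request


def build_resolve_request(
    base_url: str,
    project_id: str,
    workflow_id: str,
    incident_id: str,
    variables: Optional[Dict[str, Any]] = None,
) -> Request:
    """Build (but do not send) the POST that resolves an incident."""
    path = "/projects/{}/bpmn/instances/{}/incidents/{}/resolve".format(
        quote(project_id, safe=""),
        quote(workflow_id, safe=""),
        quote(incident_id, safe=""),
    )
    # An empty map mirrors what the Retry button submits by default.
    body = json.dumps({"variables": variables or {}}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and IDs, for illustration only.
req = build_resolve_request(
    "https://example.invalid/api", "demo", "wf-123", "inc-1", {"amount": 49.95}
)
print(req.get_method(), req.full_url)
```

To send it, pass the request to `urllib.request.urlopen(req)` once the base URL and any required credentials are in place.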

When the cause persists

Resolving doesn't fix the underlying problem; it only re-runs the activity. If the cause is still there (the worker is still buggy, the input is still bad, the decision is still misconfigured), the activity fails again and a new incident is created.

If you need to abandon the instance entirely, cancel it instead of retrying.

Drilling into child instances

Call activities spawn child workflows. The detail view exposes the parent/child relationship in two places:

  • The Child instances accordion section lists all children spawned by this instance.
  • The diagram view's call-activity nodes are clickable while the instance is running; clicking one jumps to the child's detail page.

Both paths preserve a breadcrumb in the URL, so the navigation walks back up cleanly. If you deep-link directly to a child instance (no URL breadcrumb but the API knows it has a parent), an Up to parent button appears in the header for a single hop.

What's API-only today

Two operational features exist on the API but don't yet have a UI:

Feature | Effect
Modify token state | Insert a new token before a node (START_BEFORE_NODE) or cancel an active token (CANCEL_TOKEN)
Migrate to a new process version | Move running instances to a new deployed version with an explicit migration plan

Both are exposed through the REST API. UI surfaces are on the roadmap.