tx reflect

Run session retrospective to analyze agent behavior and tune approach

Purpose

tx reflect aggregates data across recent agent sessions to produce a structured retrospective. It surfaces metrics on throughput, proliferation, stuck tasks, and actionable signals.

Two tiers:

Data tier (default): SQL queries produce structured metrics. No LLM required.
Analysis tier (--analyze): Feeds metrics to an LLM backend for synthesis and recommendations.

Usage

tx reflect [options]

# Last 10 sessions (default)
tx reflect

# Last 5 sessions
tx reflect --sessions 5

# Last hour's activity
tx reflect --hours 1

# JSON output for scripting
tx reflect --json

# With LLM analysis
tx reflect --analyze

# Combine options
tx reflect --sessions 5 --hours 2 --analyze --json

import { TxClient } from '@jamesaphoenix/tx-agent-sdk'

const tx = new TxClient({ dbPath: '.tx/tasks.db' })

const result = await tx.reflect.run({ sessions: 5 })
// {
//   sessions: { total: 5, completed: 3, failed: 1, timeout: 1, avgDurationMinutes: 12.5 },
//   throughput: { created: 20, completed: 8, net: 12, completionRate: 0.4 },
//   proliferation: { avgCreatedPerSession: 4, maxCreatedPerSession: 10, maxDepth: 3, orphanChains: 1 },
//   stuckTasks: [...],
//   signals: [...],
//   analysis: null
// }

const analyzed = await tx.reflect.run({
  sessions: 5,
  analyze: true,
})
// analysis is populated when Claude Agent SDK is available
// or when ANTHROPIC_API_KEY is configured as the fallback backend

Tool name: tx_reflect

{
  "name": "tx_reflect",
  "arguments": {
    "sessions": 5,
    "hours": 2,
    "analyze": true
  }
}

Arg	Type	Required	Description
`sessions`	`number`	No	Number of recent sessions (default: 10)
`hours`	`number`	No	Time window in hours
`analyze`	`boolean`	No	Enable LLM analysis when Claude Agent SDK or `ANTHROPIC_API_KEY` fallback is available

GET /api/reflect?sessions=5&hours=2&analyze=true

Example:

curl http://localhost:3456/api/reflect?sessions=5

Output

Signals

Machine-readable flags for orchestrator integration:

Signal	Severity	Meaning
`HIGH_PROLIFERATION`	critical	More tasks created than completed
`STUCK_TASKS`	warning	Tasks with 3+ failed attempts
`DEPTH_WARNING`	warning	Max depth exceeds guard limit
`PENDING_HIGH`	info	Many pending tasks

Orchestrator Integration

#!/bin/bash
# Auto-tune based on reflect signals
SIGNALS=$(tx reflect --hours 1 --json | jq -r '.signals[].type')

if echo "$SIGNALS" | grep -q "HIGH_PROLIFERATION"; then
  tx guard set --max-pending 20 --enforce
  tx compact
fi

if echo "$SIGNALS" | grep -q "STUCK_TASKS"; then
  STUCK=$(tx reflect --json | jq -r '.stuckTasks[].id')
  for id in $STUCK; do
    tx update "$id" --status blocked
  done
fi

Configuration

[reflect]
default_sessions = 10

--analyze uses any available LLM backend. In Claude Code environments it can run through Claude Agent SDK without an API key. In other environments it falls back to ANTHROPIC_API_KEY. If no LLM backend is available, tx still returns the data tier.

tx guard: Set task creation limits and tune them with reflect signals
tx verify: Attach machine-checkable done criteria
tx label: Scope the ready queue by phase
tx compact: Archive completed tasks