# Agents

## Overview

IPW profiles multi-turn agent workloads through pluggable agent harnesses. Each agent wraps an existing framework and adds energy telemetry instrumentation. All agents inherit from BaseAgent and are registered with AgentRegistry.
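The inherit-and-register pattern can be sketched as follows. This is a hypothetical illustration of the shape of the design, using minimal stand-ins rather than IPW's actual `BaseAgent` and `AgentRegistry` classes:

```python
# Minimal stand-ins illustrating the agent/registry pattern.
# These are hypothetical sketches, not IPW's actual BaseAgent/AgentRegistry.

class BaseAgent:
    """Common interface every agent harness implements."""
    def run(self, query: str) -> str:
        raise NotImplementedError


class AgentRegistry:
    """Maps agent IDs (e.g., "react") to agent classes."""
    _agents: dict[str, type] = {}

    @classmethod
    def register(cls, agent_id: str):
        def decorator(agent_cls):
            cls._agents[agent_id] = agent_cls
            return agent_cls
        return decorator

    @classmethod
    def create(cls, agent_id: str, **kwargs) -> BaseAgent:
        return cls._agents[agent_id](**kwargs)


@AgentRegistry.register("echo")
class EchoAgent(BaseAgent):
    """Trivial demo agent that just echoes its input."""
    def run(self, query: str) -> str:
        return f"echo: {query}"


agent = AgentRegistry.create("echo")
print(agent.run("hello"))  # -> echo: hello
```

A registry keyed by string ID is what lets `ipw run --agent react` select a harness by name at the command line.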

## Available Agents

| Agent ID | Framework | Install Extra | Use Case |
|----------|-----------|---------------|----------|
| `react` | Agno | `ipw[react]` | General tool-augmented reasoning |
| `openhands` | OpenHands SDK | `ipw[openhands]` | Autonomous task execution, coding |
| `terminus` | terminal-bench | `ipw[terminus]` | Terminal/CLI task benchmarking |
| `terminus-tb` | terminal-bench | `ipw[terminus]` | TerminalBench native (Docker managed by runner) |

## Event Types

The EventRecorder captures timestamped events during agent execution, correlated with energy telemetry to compute per-action energy costs.

| Event Type | Description |
|------------|-------------|
| `lm_inference_start` / `lm_inference_end` | LLM call boundaries |
| `tool_call_start` / `tool_call_end` | Tool invocation boundaries |
| `prefill_start` / `prefill_end` | Prefill phase (if detectable) |
| `decode_start` / `decode_end` | Decode phase (if detectable) |
| `submodel_call_start` / `submodel_call_end` | Sub-model calls from MCP tools |
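Correlating events with telemetry is conceptually simple: each start/end pair brackets a time window, and power samples falling inside that window are integrated into a per-action energy figure. A minimal sketch of the idea (not the actual `EventRecorder` API):

```python
# Hypothetical sketch of event/energy correlation, not IPW's EventRecorder:
# attribute power samples to the event window they fall inside.

def energy_for_window(samples, t_start, t_end):
    """Integrate power samples [(t_seconds, watts), ...] over
    [t_start, t_end] into joules via a left Riemann sum."""
    window = [(t, w) for t, w in samples if t_start <= t <= t_end]
    joules = 0.0
    for (t0, w0), (t1, _) in zip(window, window[1:]):
        joules += w0 * (t1 - t0)
    return joules

# Power samples every 0.5 s at a constant 10 W.
samples = [(i * 0.5, 10.0) for i in range(10)]

# An lm_inference_start/lm_inference_end pair bracketing 1.0 s -> 3.0 s:
# 2 s at 10 W should attribute 20 J to that LLM call.
print(energy_for_window(samples, 1.0, 3.0))  # -> 20.0
```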

## ReAct (Agno)

```bash
uv pip install -e 'intelligence-per-watt[react]'
```

The ReAct agent uses Agno to implement Reasoning + Acting (ReAct) style tool-augmented reasoning. It wraps Agno's Agent class and instruments tool calls for energy tracking.

```bash
ipw run \
  --agent react \
  --model gpt-4o \
  --dataset gaia \
  --max-queries 10
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | Any | required | Agno Model instance (e.g., `OpenAIChat`) |
| `tools` | `list[Callable]` | required | List of callable tool functions |
| `instructions` | `str` | built-in | Custom system instructions |
| `max_turns` | `int` | -- | Maximum reasoning iterations |
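The ReAct pattern itself alternates a reasoning step with a tool invocation, feeding each observation back into the next turn until the model produces a final answer. A toy sketch with a scripted stand-in for the model (illustrative only, not Agno's API):

```python
# Toy ReAct loop with a scripted "model"; illustrates the
# reason -> act -> observe cycle, not Agno's actual Agent class.

def calculator(expr: str) -> str:
    return str(eval(expr))  # demo only; never eval untrusted input

TOOLS = {"calculator": calculator}

# Scripted model turns: (tool_name, argument), then a final answer.
SCRIPT = [("calculator", "6 * 7"), ("final", "The answer is 42")]

def react_loop(max_turns: int = 5) -> str:
    observations = []
    for turn in range(max_turns):
        action, arg = SCRIPT[turn]   # reason: stand-in for an LLM call
        if action == "final":
            return arg
        result = TOOLS[action](arg)  # act: invoke the chosen tool
        observations.append(result)  # observe: a real loop feeds this back
    return "max turns exceeded"

print(react_loop())  # -> The answer is 42
```

In the instrumented agent, each pass through such a loop would emit `lm_inference_*` and `tool_call_*` event pairs, which is what makes per-action energy attribution possible.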

## OpenHands

```bash
uv pip install -e 'intelligence-per-watt[openhands]'
```

The OpenHands agent uses the OpenHands SDK for autonomous task execution with per-tool energy tracking. It is designed for complex, multi-step tasks such as software engineering, research, and document analysis.

```bash
ipw run \
  --agent openhands \
  --model gpt-4o \
  --dataset swebench \
  --max-turns 30
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | Any | required | LLM model instance |
| `tools` | `list` | None | OpenHands Tool specs |
| `mcp_tools` | `dict` | None | MCP server instances for sub-queries |
| `max_turns` | `int` | 20 | Maximum iterations per run |

## Terminus

```bash
uv pip install -e 'intelligence-per-watt[terminus]'
```

The Terminus agent uses terminal-bench to run tasks inside Docker containers with tmux, enabling benchmarking of terminal/CLI task execution.

Prerequisites: Docker Engine installed and running; current user in the docker group (or sudo access).

```bash
ipw run \
  --agent terminus \
  --model gpt-4o \
  --dataset terminalbench \
  --max-queries 10
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | `str` | required | Model name (e.g., `"gpt-4o"`) |
| `docker_image` | `str` | `"ubuntu:22.04"` | Docker image for the container |
| `container_name` | `str` | `"terminus-container"` | Name for the Docker container |
| `max_turns` | `int` | -- | Maximum agent turns |

## MCP Tools

Model Context Protocol servers provide tool capabilities to agents. Each MCP server wraps an external service with a standard interface.

Inference servers -- wrap LLM APIs so agents can make sub-queries:

| Server | Backend |
|--------|---------|
| `OpenAIServer` | OpenAI API |
| `AnthropicServer` | Anthropic API |
| `GeminiServer` | Google Gemini API |
| `OllamaServer` | Local Ollama instance |
| `VLLMServer` | Local vLLM instance |
| `OpenRouterServer` | OpenRouter API |
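Because every inference server wraps a different backend behind the same standard interface, an agent can issue sub-queries without caring which provider answers. A hypothetical sketch of that shape, with made-up class and method names and a fake backend in place of a real API client:

```python
# Hypothetical sketch of a uniform inference-server interface; the
# names here are illustrative, not IPW's actual MCP server API.
from abc import ABC, abstractmethod

class InferenceServer(ABC):
    """Wraps one LLM backend behind a common sub-query interface."""
    @abstractmethod
    def query(self, prompt: str) -> str: ...

class FakeServer(InferenceServer):
    """Stand-in backend used here instead of e.g. an OpenAI client."""
    def query(self, prompt: str) -> str:
        return f"response to: {prompt}"

# The agent only sees the common interface, so backends are swappable.
servers: dict[str, InferenceServer] = {"fake": FakeServer()}
print(servers["fake"].query("summarize this"))  # -> response to: summarize this
```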

Retrieval servers -- provide document retrieval for RAG-style agents:

| Server | Method |
|--------|--------|
| `BM25Server` | Sparse BM25 retrieval |
| `DenseServer` | Dense vector retrieval |
| `GrepServer` | Grep-based text search |
| `HybridServer` | Combined BM25 + dense retrieval |
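The sparse method behind `BM25Server` can be sketched in a few lines. This is a generic textbook Okapi BM25 scorer for illustration, not IPW's implementation:

```python
# Minimal Okapi BM25 scoring, illustrating sparse retrieval;
# a textbook sketch, not IPW's BM25Server.
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query; higher is more relevant."""
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)
    avgdl = sum(len(d) for d in tokenized) / n
    df = Counter()                      # document frequency per term
    for d in tokenized:
        df.update(set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)                 # term frequency in this doc
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = ["energy profiling of agents",
        "docker containers and tmux",
        "agents call tools"]
scores = bm25_scores("agents energy", docs)
print(max(range(len(docs)), key=scores.__getitem__))  # -> 0
```

A hybrid retriever like `HybridServer` would combine such sparse scores with dense vector similarity, typically via a weighted sum or rank fusion.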

## Writing a Custom Agent

See Extending IPW for how to implement and register a custom agent.