ipw.compute
Compute estimation utilities for LLM inference.
estimate_flops(model, input_tokens, output_tokens, use_calflops=False)
Estimate FLOPs for a single model inference.
Strategy:

1. If `use_calflops=True`, try the calflops library first.
2. Fall back to the 2PT formula using known parameter counts.
3. Return `(0, 0)` if the model is unknown.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `str` | Model name or path | *required* |
| `input_tokens` | `int` | Number of input tokens | *required* |
| `output_tokens` | `int` | Number of output tokens | *required* |
| `use_calflops` | `bool` | Whether to try the calflops library | `False` |
Returns:
| Type | Description |
|---|---|
| `tuple[float, float]` | Tuple of `(total_flops, flops_per_token)` |
Source code in intelligence-per-watt/src/ipw/compute/flops.py
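The three-step strategy can be sketched as follows. This is an illustration, not the library's actual code: `KNOWN_PARAMS` and `estimate_flops_sketch` are hypothetical stand-ins for the real parameter table and implementation in `flops.py`.

```python
KNOWN_PARAMS = {"llama-3.1-8b": 8.0}  # hypothetical stand-in for the real parameter table


def estimate_flops_sketch(model, input_tokens, output_tokens, use_calflops=False):
    """Illustrates the three-step strategy documented above."""
    if use_calflops:
        try:
            import calflops  # optional dependency; real code would call it here
            # (calflops-based estimation elided in this sketch)
        except ImportError:
            pass  # fall through to the 2PT formula
    params_billions = KNOWN_PARAMS.get(model.lower())
    if params_billions is None:
        return 0.0, 0.0  # unknown model
    total_tokens = input_tokens + output_tokens
    total_flops = 2.0 * params_billions * 1e9 * total_tokens  # 2PT approximation
    return total_flops, total_flops / max(total_tokens, 1)
```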
estimate_flops_calflops(model_name_or_path, input_tokens, output_tokens)
Estimate FLOPs using the calflops library (optional dependency).
Returns None if calflops is not installed or estimation fails.
Source code in intelligence-per-watt/src/ipw/compute/flops.py
estimate_flops_fallback(params_billions, input_tokens, output_tokens)
Estimate FLOPs using the 2PT approximation.
For transformer inference:

- Prefill: ~2 * P * T_input (matrix multiplications)
- Decode: ~2 * P * T_output (autoregressive generation)
- Total: ~2 * P * (T_input + T_output)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `params_billions` | `float` | Model parameter count in billions | *required* |
| `input_tokens` | `int` | Number of input tokens | *required* |
| `output_tokens` | `int` | Number of output tokens | *required* |
Returns:
| Type | Description |
|---|---|
| `tuple[float, float]` | Tuple of `(total_flops, flops_per_token)` |
Source code in intelligence-per-watt/src/ipw/compute/flops.py
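As a worked example of the 2PT arithmetic (independent of the library), take a 7B-parameter model with 1000 input tokens and 200 output tokens:

```python
params_billions = 7.0
input_tokens, output_tokens = 1000, 200

# Total: ~2 * P * (T_input + T_output)
total_flops = 2.0 * params_billions * 1e9 * (input_tokens + output_tokens)  # 1.68e13
flops_per_token = total_flops / (input_tokens + output_tokens)              # 1.4e10
```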
lookup_params(model)
Look up parameter count (in billions) for a model.
Returns None if the model is not in the known list.
Source code in intelligence-per-watt/src/ipw/compute/flops.py
normalize_model_name(model)
Normalize model name for parameter lookup.
Handles common naming patterns like:

- `'meta-llama/Llama-3.1-8B-Instruct'` -> `'llama-3.1-8b'`
- `'llama3.2:1b'` (ollama format) -> `'llama-3.2-1b'`
- `'qwen2.5-7b-instruct'` -> `'qwen-2.5-7b'`
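The patterns above can be reproduced with a few string transforms. This is a hypothetical re-implementation for illustration, not the library's actual `normalize_model_name`:

```python
import re


def normalize_model_name_sketch(model: str) -> str:
    """Hypothetical sketch of the normalization patterns documented above."""
    name = model.lower().split("/")[-1]             # drop org prefix ('meta-llama/...')
    name = name.replace(":", "-")                   # ollama 'model:size' form
    name = re.sub(r"-?instruct$", "", name)         # drop '-instruct' suffix
    name = re.sub(r"([a-z])(\d)", r"\1-\2", name)   # 'llama3.2' -> 'llama-3.2'
    return name
```

Applied to the three documented inputs, this sketch yields `'llama-3.1-8b'`, `'llama-3.2-1b'`, and `'qwen-2.5-7b'` respectively.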