Latency & Performance

HTTP Response Latency

p50 latency: 12ms
p95 latency: 23ms
p99 latency: 41ms

Full standard pipeline: ~3.5s

HTTP response latency is measured as time to first byte. The API responds as soon as the scoring engine completes, so this is the time your agent waits.
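To compare your own client-side measurements with the p50, p95, and p99 figures, you can compute percentiles over a list of observed response times. The following is a minimal sketch using the nearest-rank method. The `percentile` helper and the sample values are illustrative assumptions, not part of the Sgraal SDK, and the percentile method used for the published figures is not stated here.

```python
import math


def percentile(samples_ms, pct):
    """Return the nearest-rank percentile of a non-empty list of latencies.

    Hypothetical helper for illustration; not part of any SDK.
    """
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples_ms)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up response times in milliseconds, for demonstration only.
samples = [10, 11, 12, 12, 13, 14, 15, 20, 23, 41]

for pct in (50, 95, 99):
    print(f"p{pct}: {percentile(samples, pct)}ms")
```

With only a handful of samples, p95 and p99 collapse onto the slowest observation, so collect a few hundred measurements before reading much into the tail.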

Full standard pipeline is the time taken by the full 83-module scoring pipeline, including all analytics modules. It is returned in the pipeline_ms field of every preflight response.

Set response_profile: "compact" to skip heavy analytics modules. Compact pipeline: ~100ms total.

Informational and reversible actions auto-select the compact profile. Irreversible and destructive actions get the standard profile (full pipeline).
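The auto-selection rule above can be sketched as a small lookup. This only illustrates the stated rule and is not the service's implementation: the `select_profile` function, the constant names, and the exact action-type strings are assumptions made for the example.

```python
# Hypothetical illustration of the profile auto-selection rule.
# The action-type strings mirror the wording of the rule and are assumed.
COMPACT_ACTION_TYPES = {"informational", "reversible"}
STANDARD_ACTION_TYPES = {"irreversible", "destructive"}


def select_profile(action_type: str) -> str:
    """Map an action type to the response profile it would auto-select."""
    if action_type in COMPACT_ACTION_TYPES:
        return "compact"   # skips heavy analytics modules
    if action_type in STANDARD_ACTION_TYPES:
        return "standard"  # full pipeline
    raise ValueError(f"unknown action type: {action_type!r}")


for action in ("informational", "reversible", "irreversible", "destructive"):
    print(action, "->", select_profile(action))
```

Raising on unknown action types is a design choice for the sketch. How the real service handles an unrecognized action type is not stated here.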

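from sgraal import SgraalClient

# Constructor arguments are not shown here; pass your own configuration.
client = SgraalClient(...)

result = client.preflight(memory_state=[...])
print(result["_trace"]["duration_ms"])  # pipeline time in ms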