Pipeline Latency & Performance

Real-time latency profile of the Sgraal preflight pipeline.

HTTP Response Latency

p50 latency: 12ms
p95 latency: 23ms
p99 latency: 41ms
Full standard pipeline: ~3.5s
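To compare these figures with your own traffic, you can compute the same percentiles from latencies recorded on the client side. The following is a minimal sketch using the nearest-rank method; the helper name `percentile` is hypothetical (not part of the Sgraal SDK) and the sample values are made up for illustration.

```python
def percentile(samples_ms, p):
    """Return the p-th percentile (integer p, 1..100) using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Nearest rank: ceil(p * n / 100), computed with integer arithmetic.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Illustrative latencies in milliseconds (made-up data, not Sgraal measurements).
samples = [11, 12, 12, 13, 12, 14, 22, 12, 11, 40]
for p in (50, 95, 99):
    print(f"p{p}: {percentile(samples, p)}ms")
```

With only a handful of samples the tail percentiles collapse onto the slowest request, so collect enough measurements before reading much into p95 or p99.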

Two Latency Layers

HTTP response time (p50: 12ms)

Time to first byte. The API responds as soon as the scoring engine completes. This is what your agent waits for.

pipeline_ms (~3,500ms full profile)

Full 83-module scoring pipeline time. Includes all analytics modules. Returned in the pipeline_ms field of every preflight response.
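The difference between the two layers can be observed by timing the call yourself and comparing it with the reported pipeline time. This is a rough sketch only: `fake_preflight` is a stand-in for the real client call, and the response shape (a dict with a top-level `pipeline_ms` key) is an assumption.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_preflight(memory_state):
    # Stand-in for client.preflight(...); the response shape is an assumption.
    return {"pipeline_ms": 3500.0}


result, waited_ms = timed_call(fake_preflight, memory_state=[])
print(f"agent waited {waited_ms:.1f}ms; pipeline reported {result['pipeline_ms']}ms")
```

In real use, `waited_ms` also includes network round-trip time, so expect it to sit above the server-side HTTP figures.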

Pipeline Breakdown

Layer               Modules              Typical latency
Scoring engine      10 core components   0.1ms
Security detection  4 detection layers   0.0ms
Analytics modules   83 total             ~3,400ms
Total                                    ~3,500ms

Optimization Tips

Use compact profile for lower latency

Set response_profile: "compact" to skip heavy analytics modules. Compact pipeline: ~100ms total.
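As an illustration, a request body that opts into the compact profile might be assembled as follows. This is a sketch under assumptions: the helper `build_preflight_body` is hypothetical, and placing `memory_state` and `response_profile` as top-level fields is inferred from this page rather than confirmed by the API reference.

```python
import json


def build_preflight_body(memory_state, response_profile=None):
    """Assemble a preflight request body; response_profile is omitted when None."""
    body = {"memory_state": memory_state}
    if response_profile is not None:
        # Field name taken from this page; its exact placement is an assumption.
        body["response_profile"] = response_profile
    return body


body = build_preflight_body(memory_state=[], response_profile="compact")
print(json.dumps(body))
```

Leaving `response_profile` unset keeps the field out of the body, so the profile is then left to the auto-selection described on this page.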

Auto-profile selection

Actions of type informational or reversible auto-select the compact profile. Actions of type irreversible or destructive get the standard profile (full pipeline).
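The mapping can be mirrored on the client side, for example to predict which latency budget a call will fall under. This is a minimal sketch of the documented rule, not the server's implementation; the function name `auto_profile` is hypothetical, and raising on unknown action types is a choice made here because this page does not say how they are handled.

```python
COMPACT_ACTIONS = {"informational", "reversible"}
STANDARD_ACTIONS = {"irreversible", "destructive"}


def auto_profile(action_type):
    """Mirror the documented mapping from action type to response profile."""
    if action_type in COMPACT_ACTIONS:
        return "compact"
    if action_type in STANDARD_ACTIONS:
        return "standard"
    # Behaviour for other action types is not documented, so fail loudly.
    raise ValueError(f"unknown action type: {action_type!r}")


for action in ("informational", "reversible", "irreversible", "destructive"):
    print(action, "->", auto_profile(action))
```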

Reading pipeline_ms

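from sgraal import SgraalClient

# Assumption: the constructor takes an API key; check the SDK docs for the exact arguments.
client = SgraalClient(api_key="YOUR_API_KEY")

result = client.preflight(memory_state=[...])
print(result["_trace"]["duration_ms"])  # pipeline time in ms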
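This page describes the pipeline time as the pipeline_ms field, while the snippet above reads it from _trace.duration_ms, so a defensive reader can check both locations. This is a minimal sketch, assuming the response behaves like a plain dict; the helper `read_pipeline_ms` is hypothetical and not part of the SDK.

```python
def read_pipeline_ms(result):
    """Return the pipeline duration in ms from a preflight response dict, or None."""
    if "pipeline_ms" in result:
        return result["pipeline_ms"]
    # Fall back to the nested trace location used in the snippet above.
    trace = result.get("_trace") or {}
    return trace.get("duration_ms")


# Example response shapes (illustrative values only).
print(read_pipeline_ms({"pipeline_ms": 3500}))
print(read_pipeline_ms({"_trace": {"duration_ms": 98}}))
print(read_pipeline_ms({}))
```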
