Internal API - Component Communication
CEO → ACS: render_request.v0
json
{
"version": "v0",
"id": "uuid",
"intent": "string",
"budgets": { "tokens_max": 5000, "time_ms": 800 },
"risk_profile": { "level": "low|medium|high" },
"privacy_mode": "allow|redact|block",
"request_id": "string"
}
Fields (summary)
- version (required): "v0"
- id (required): uuid
- intent (required): string
- budgets (required): { tokens_max: number, time_ms: number }
- risk_profile (optional): { level: "low|medium|high" }
- privacy_mode (optional; default: allow): "allow|redact|block"
- request_id (required): string
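The field summary above can be turned into a small structural check. The following is a minimal sketch, not the ACS implementation: the function name `validate_render_request` is hypothetical, and treating empty strings as invalid is an assumption that the summary does not state. It does not verify that `id` is a well-formed UUID.

```python
from typing import Any, Dict, List

PRIVACY_MODES = {"allow", "redact", "block"}
RISK_LEVELS = {"low", "medium", "high"}


def validate_render_request(req: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems: List[str] = []

    if req.get("version") != "v0":
        problems.append("version must be 'v0'")

    # Required string fields (non-empty is an assumption of this sketch).
    for field in ("id", "intent", "request_id"):
        value = req.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"{field} is required and must be a non-empty string")

    budgets = req.get("budgets")
    if not isinstance(budgets, dict):
        problems.append("budgets is required and must be an object")
    else:
        for key in ("tokens_max", "time_ms"):
            value = budgets.get(key)
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"budgets.{key} must be a number")

    # risk_profile is optional; validate it only when present.
    risk = req.get("risk_profile")
    if risk is not None:
        level = risk.get("level") if isinstance(risk, dict) else None
        if not isinstance(level, str) or level not in RISK_LEVELS:
            problems.append("risk_profile.level must be low, medium, or high")

    # privacy_mode is optional and defaults to "allow".
    mode = req.get("privacy_mode", "allow")
    if not isinstance(mode, str) or mode not in PRIVACY_MODES:
        problems.append("privacy_mode must be allow, redact, or block")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.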
ACS → CEO: render_context_reply.v0
json
{
"request_id": "string",
"fragments": [
{ "id": "frag-1", "lod": "macro|micro|atomic", "text": "...", "entities": ["..."], "cost_tokens": 120 }
],
"kv_policy": { "pin": ["k1"], "compress": ["k2"], "evict": ["k3"] },
"metrics": { "used_tokens": 480, "planner_ms": 42, "coverage_entities": 0.66 }
}
Errors & Retry Policy (v0)
Error envelope
json
{ "code": "PROVIDER_TIMEOUT", "message": "L2 timed out", "retriable": true, "attempt": 1, "max_attempts": 2, "options": [
{ "action": "retry_same", "hint": "reduce_top_k" },
{ "action": "expand_l1", "hint": "explicit expansion with warning" },
{ "action": "return_minimal", "hint": "return what we have now" }
]}
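One way a caller might act on this envelope is sketched below. It is an illustration under stated assumptions, not the documented CEO behavior: `next_action` is a hypothetical helper, `attempt` is assumed to count attempts already made (1-based), and `options` is assumed to be ordered by preference.

```python
from typing import Optional, Sequence


def next_action(error: dict, preferred: Sequence[str] = ()) -> Optional[str]:
    """Pick an action from error["options"], or None when the caller should give up."""
    # Non-retriable errors are never retried.
    if not error.get("retriable", False):
        return None

    # Assumption: once attempt reaches max_attempts, no further attempt is made.
    if error.get("attempt", 1) >= error.get("max_attempts", 1):
        return None

    offered = [opt["action"] for opt in error.get("options", [])]

    # Honor the caller's preference order when one of its actions is offered.
    for action in preferred:
        if action in offered:
            return action

    # Otherwise fall back to the first offered option (assumed to be the default).
    return offered[0] if offered else None
```

With the envelope above, `next_action(envelope)` selects `retry_same`, while `next_action(envelope, preferred=("return_minimal",))` selects `return_minimal`.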
Priorities
- Quality > Token budget (strict 0% overshoot) > Latency (comfort)
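The strict 0% overshoot rule can be checked on the receiving side. The sketch below is an assumption-laden illustration: `remaining_token_budget` is a hypothetical name, and taking the larger of `metrics.used_tokens` and the summed `cost_tokens` is a conservative choice made here, not something the contract specifies.

```python
def remaining_token_budget(reply: dict, tokens_max: int) -> int:
    """Return the unused token budget; raise ValueError on any overshoot."""
    reported = reply.get("metrics", {}).get("used_tokens", 0)
    summed = sum(f.get("cost_tokens", 0) for f in reply.get("fragments", []))

    # Conservative assumption: trust whichever count is larger.
    used = max(reported, summed)

    # Strict rule: even a single token over the budget is an error.
    if used > tokens_max:
        raise ValueError(f"token budget exceeded: {used} > {tokens_max}")
    return tokens_max - used
```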
Streaming
- Disabled in v0; future channel: sse|ws
Privacy (v0)
- Field privacy_mode controls handling of potentially sensitive text:
  - allow (default): providers may return minimal necessary snippets; providers must redact clearly sensitive spans.
  - redact: providers must redact sensitive spans; ACS propagates redaction downstream.
  - block: providers must not return raw text; only IDs/metadata allowed; ACS will return portal/navigation only.
- Providers receive the effective privacy_mode via the Provider API request and MUST honor it.
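The three modes could be applied to a single fragment roughly as follows. This is a sketch only: `apply_privacy_mode` and `SECRET_PATTERN` are hypothetical, the regular expression is a stand-in for whatever sensitive-span detector a provider actually uses, and keeping `entities` under `block` assumes they count as metadata. The `[REDACTED: ...]` marker follows the format used in Example 3.

```python
import re

# Hypothetical detector: matches "name=value" or "name: value" for a few secret-like names.
SECRET_PATTERN = re.compile(r"(client_secret|api_key|password)\s*[=:]\s*\S+")

PRIVACY_MODES = {"allow", "redact", "block"}


def apply_privacy_mode(fragment: dict, privacy_mode: str = "allow") -> dict:
    """Return a copy of the fragment with privacy_mode applied; the input is not modified."""
    if privacy_mode not in PRIVACY_MODES:
        raise ValueError(f"unknown privacy_mode: {privacy_mode!r}")

    if privacy_mode == "block":
        # No raw text may be returned: keep IDs/metadata and drop the text field.
        return {k: v for k, v in fragment.items() if k != "text"}

    # "allow" and "redact" both require redaction; this sketch uses one detector
    # for both, although a real provider would likely be stricter under "redact".
    redacted = SECRET_PATTERN.sub(
        lambda m: f"[REDACTED: {m.group(1)}]", fragment.get("text", "")
    )
    return {**fragment, "text": redacted}
```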
Examples (e2e)
Example 1: Multi-layer Complex Query (Happy Path)
Scenario: Developer asks for comprehensive context about implementing hyperbolic search with budget constraints.
CEO → ACS Request
json
{
"version": "v0",
"id": "6f3b6f21-7a5f-4e3f-9af0-1b2c3d4e5f60",
"intent": "implement_hyperbolic_search_with_performance_constraints",
"budgets": {
"tokens_max": 4500,
"time_ms": 800
},
"risk_profile": {
"level": "medium"
},
"privacy_mode": "redact",
"request_id": "req-hyperbolic-001"
}
ACS → CEO Response
json
{
"request_id": "req-hyperbolic-001",
"fragments": [
{
"id": "proj:hyperbolic-impl#architecture",
"lod": "macro",
"text": "Hyperbolic search implementation uses Poincaré ball model with GPU-accelerated distance computation. Core algorithm maintains O(log n) complexity through hierarchical embedding structure...",
"entities": ["hyperbolic_search", "poincare_ball", "gpu_acceleration", "performance"],
"cost_tokens": 340
},
{
"id": "proj:performance-benchmarks#gpu-native",
"lod": "micro",
"text": "GPU-native implementation shows 12x speedup over CPU-based vector similarity. Memory usage: O(d²) where d is embedding dimension. Batch processing optimal at 2048 queries/batch...",
"entities": ["performance", "gpu_native", "benchmarks", "memory_usage"],
"cost_tokens": 280
},
{
"id": "exp:similar-impl-patterns#2025-08",
"lod": "micro",
"text": "Previous implementation attempt in August 2025 revealed critical insight: batch size tuning crucial for GPU memory coalescing. Recommend starting with power-of-2 batch sizes...",
"entities": ["implementation_patterns", "gpu_memory", "batch_tuning"],
"cost_tokens": 195
},
{
"id": "global:hyperbolic-geometry-research#poincare-optimization",
"lod": "atomic",
"text": "Recent research (Nickel et al.) demonstrates that exponential map computation can be approximated with Taylor expansion to 4th order for distance < 0.5, reducing computational overhead by 60%...",
"entities": ["research", "exponential_map", "taylor_expansion", "optimization"],
"cost_tokens": 220
}
],
"metrics": { "used_tokens": 1035, "planner_ms": 12, "coverage_entities": 0.78 }
}
Example 2: Provider Timeout with Intelligent Fallback
Scenario: An L2 provider timeout triggers an adaptive ACS response that offers explicit L1 expansion as one recovery option.
CEO → ACS Request
json
{
"version": "v0",
"id": "b5c6a2fe-5e3b-41a1-b8ec-c7f8b91a7fb2",
"intent": "debug_memory_leak_in_embedding_computation",
"budgets": { "tokens_max": 3000, "time_ms": 500 },
"risk_profile": { "level": "high" },
"privacy_mode": "allow",
"request_id": "req-debug-002"
}
ACS → CEO Error Response
json
{
"request_id": "req-debug-002",
"error": {
"code": "PROVIDER_TIMEOUT",
"message": "L2 provider exceeded deadline (350ms > 300ms budgeted)",
"retriable": true,
"attempt": 1,
"max_attempts": 2,
"options": [
{
"action": "retry_with_reduced_scope",
"hint": "Reduce L2 top_k from 50 to 20; increase time budget to 400ms"
},
{
"action": "explicit_l1_expansion",
"hint": "Skip L2; expand directly to L1 with debug context"
},
{
"action": "return_l4_only",
"hint": "Return experience-based context only"
}
]
}
}
Example 3: Privacy-Redacted Response
ACS → CEO Response with Privacy Controls
json
{
"request_id": "req-privacy-003",
"fragments": [
{
"id": "proj:auth-implementation#oauth-flow",
"lod": "macro",
"text": "OAuth implementation uses [REDACTED: client_secret] for token exchange. Flow: authorization → token → resource access...",
"entities": ["oauth", "authentication", "token_exchange"],
"cost_tokens": 180
}
],
"metrics": {
"used_tokens": 180,
"planner_ms": 15
}
}
References
- KV Policy contract and semantics: ../kv-policy.md
- Retry actions catalog for error.options[]: ../retry-actions.md