HCS Streaming Protocol (Draft v0.2)
Status: draft (planning). Batch v0 (render_context_reply.v0) remains the only stable contract. This draft defines a minimal streaming protocol for experiments.
Transport:
- Server-Sent Events (SSE) or WebSocket. Frame payloads are identical across transports.
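As an illustration of the SSE transport, here is a minimal TypeScript consumer sketch; the endpoint path and query parameter are assumptions, not part of this draft.

    // Minimal SSE consumer sketch. The endpoint path and query parameter
    // are illustrative assumptions; this draft does not define them.
    const source = new EventSource("/hcs/stream?request_id=r-123");

    source.onmessage = (event: MessageEvent<string>) => {
      // Each SSE data payload is one JSON frame; the same shapes apply over WebSocket.
      const frame = JSON.parse(event.data);
      console.log(frame.type, frame.seq, frame);
    };

    source.onerror = () => {
      // EventSource reconnects automatically; close here if retries are not wanted.
      source.close();
    };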
Session lifecycle (high level):
- client → server: start { request_id, user_id?, session_id? }
- server → client: stream init (metadata)
- server → client: multiple frame deliveries (macro → micro → atomic)
- client → server: ack/feedback frames (optional)
- server → client: complete or error
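A TypeScript sketch of this opening exchange over WebSocket; the URL is an assumption, and whether the start message carries an explicit type field is not specified by this draft.

    // Client opens the socket and sends the start message; the server answers
    // with init, then chunk frames, then complete or error. URL is illustrative.
    const ws = new WebSocket("wss://example.invalid/hcs/stream");

    ws.onopen = () => {
      // start: user_id and session_id are optional
      ws.send(JSON.stringify({ request_id: "r-123" }));
    };

    ws.onmessage = (event: MessageEvent<string>) => {
      const frame = JSON.parse(event.data);
      if (frame.type === "complete" || frame.type === "error") {
        ws.close();
      }
    };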
Common fields:
- request_id: string (matches ACS planning request)
- session_id: string (server-assigned if absent)
- seq: integer ≥ 0 (monotonic per session)
- ts: number (epoch ms)
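Expressed as a TypeScript type for reference (the interface name is an assumption; field semantics are as listed above):

    // Fields shared across frames; names mirror the list above.
    interface FrameEnvelope {
      request_id: string; // matches the ACS planning request
      session_id: string; // server-assigned if absent in start
      seq?: number;       // integer >= 0, monotonic per session; present on sequenced frames
      ts: number;         // epoch milliseconds
    }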
Frames (JSON objects):
init { "type": "init", "request_id": "r-123", "session_id": "s-abc", "ts": 1736190000000, "protocol":
{ "version": "0.2", "transport": "sse|ws" }
, "hints":{ "expected_layers": ["macro", "micro", "atomic"], "max_chunk_tokens": 512 }
}chunk { "type": "chunk", "request_id": "r-123", "session_id": "s-abc", "seq": 0, "ts": 1736190000100, "lod": "macro|micro|atomic", "id": "frag:macro:001", "text": "...", "cost_tokens": 128, "kv_policy":
{ "pin": ["frag:macro:001"], "compress": [], "evict": [] }
, "metrics":{ "used_tokens": 128 }
}
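A TypeScript sketch of acting on a chunk's kv_policy lists; the cache interface here is an illustrative assumption, and the actual semantics are defined in the KV Policy document (../kv-policy.md).

    // Applies pin/compress/evict directives from a chunk's kv_policy to a
    // hypothetical fragment cache. The KvCache interface is an assumption.
    interface KvCache {
      pin(id: string): void;
      compress(id: string): void;
      evict(id: string): void;
    }

    function applyKvPolicy(
      cache: KvCache,
      policy: { pin: string[]; compress: string[]; evict: string[] },
    ): void {
      policy.pin.forEach((id) => cache.pin(id));
      policy.compress.forEach((id) => cache.compress(id));
      policy.evict.forEach((id) => cache.evict(id));
    }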
Notes:
- Fields reuse shapes from render_context_reply.v0 when possible (lod, id, text, cost_tokens, kv_policy, metrics.used_tokens)
- A chunk MAY carry partial text for the same id in subsequent frames; the receiver should concatenate or replace by id depending on delivery_mode (see below).
- delivery_mode (optional hint in chunk frames):
  - mode: "append" | "replace" (default "append")
  - If "replace": chunk.text supersedes any prior text for the same id.
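A TypeScript sketch of receiver-side assembly under this rule, assuming delivery_mode carries the mode string directly on the chunk frame:

    // Accumulates chunk text per fragment id, honoring the delivery_mode hint.
    const fragments = new Map<string, string>();

    function applyChunk(chunk: {
      id: string;
      text: string;
      delivery_mode?: "append" | "replace";
    }): void {
      if (chunk.delivery_mode === "replace") {
        // replace: the new text supersedes any prior text for this id
        fragments.set(chunk.id, chunk.text);
      } else {
        // append (default): concatenate onto whatever was received before
        fragments.set(chunk.id, (fragments.get(chunk.id) ?? "") + chunk.text);
      }
    }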
ack (client → server)

    {
      "type": "ack",
      "request_id": "r-123",
      "session_id": "s-abc",
      "seq": 0,
      "status": "received|applied",
      "latency_ms": 42
    }
control (either direction)

    {
      "type": "control",
      "request_id": "r-123",
      "session_id": "s-abc",
      "action": "pause|resume|flush|tighten|relax",
      "reason": "backpressure|user_busy|low_attention|policy_change"
    }
error { "type": "error", "request_id": "r-123", "session_id": "s-abc", "code": "STREAM_TIMEOUT|INTERNAL|INVALID_REQUEST", "message": "...", "retriable": true, "options": [
{ "action": "retry" }
,{ "action": "backoff", "hint": "500-1500ms" }
] }complete { "type": "complete", "request_id": "r-123", "session_id": "s-abc", "summary":
{ "frames": 42, "used_tokens": 4096 }
}
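With all frame types listed, a receiver can dispatch on the type field. A TypeScript sketch follows (only a subset of fields is typed here; handler bodies are placeholders):

    // Discriminated union over the frame types defined above.
    type Frame =
      | { type: "init"; request_id: string; session_id: string; ts: number }
      | { type: "chunk"; request_id: string; session_id: string; seq: number; ts: number;
          lod: "macro" | "micro" | "atomic"; id: string; text: string; cost_tokens: number }
      | { type: "error"; request_id: string; session_id: string; code: string;
          message: string; retriable: boolean }
      | { type: "complete"; request_id: string; session_id: string;
          summary: { frames: number; used_tokens: number } };

    function onFrame(frame: Frame): void {
      switch (frame.type) {
        case "init":
          // record session_id and hints
          break;
        case "chunk":
          // assemble text by id (see the delivery_mode note above) and apply kv_policy
          break;
        case "error":
          // if retriable, consult options[] for retry/backoff actions
          break;
        case "complete":
          console.log(`stream done after ${frame.summary.frames} frames`);
          break;
      }
    }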
Backpressure guidance:
- Clients SHOULD send control { "action": "pause" } if their buffer exceeds thresholds.
- Servers MAY downgrade LOD, reduce chunk size, or increase inter-frame delay upon backpressure.
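A TypeScript sketch of the client side of this guidance over WebSocket; the buffer threshold and identifiers are illustrative assumptions.

    // Sends a pause control frame when the local buffer grows past a threshold,
    // and a resume frame once it drains. Threshold values are illustrative only.
    const MAX_BUFFERED_FRAMES = 64;
    let buffered = 0;
    let paused = false;

    function sendControl(ws: WebSocket, action: "pause" | "resume"): void {
      ws.send(JSON.stringify({
        type: "control",
        request_id: "r-123",
        session_id: "s-abc",
        action,
        reason: "backpressure",
      }));
    }

    function onFrameBuffered(ws: WebSocket): void {
      buffered += 1;
      if (!paused && buffered > MAX_BUFFERED_FRAMES) {
        sendControl(ws, "pause");
        paused = true;
      }
    }

    function onFrameConsumed(ws: WebSocket): void {
      buffered -= 1;
      if (paused && buffered <= MAX_BUFFERED_FRAMES / 2) {
        sendControl(ws, "resume");
        paused = false;
      }
    }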
Compatibility:
- This draft is additive. Batch v0 responses remain valid and MAY be sent as a final snapshot after streaming completes.
Related:
- Architecture: hcs/architecture.md
- Internal API: api/internal.md
- KV Policy: ../kv-policy.md
- Errors/Retry options: ../errors.md, ../retry-actions.md