
HCS Streaming Protocol - Draft v0.2

Status: draft (planning). Batch v0 (render_context_reply.v0) remains the only stable contract. This draft defines a minimal streaming protocol for experiments.

Transport:

  • Server-Sent Events (SSE) or WebSocket. Frame payloads identical across transports.

Session lifecycle (high level):

  1. client → server: start { request_id, user_id?, session_id? }
  2. server → client: stream init (metadata)
  3. server → client: multiple frame deliveries (macro → micro → atomic)
  4. client → server: ack/feedback frames (optional)
  5. server → client: complete or error
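
A sketch of the receiving side of this lifecycle, dispatching on frame type; frame shapes follow the definitions further down, and the applyChunk/sendAck placeholders are assumptions:

```ts
// Sketch: client-side dispatch over the lifecycle frame types.
// Frame shapes follow the definitions below; the helpers are placeholders.
function handleFrame(frame: any): void {
  switch (frame.type) {
    case "init":      // step 2: stream metadata (protocol version, hints)
      console.log("stream opened", frame.session_id, frame.hints);
      break;
    case "chunk":     // step 3: macro -> micro -> atomic deliveries
      applyChunk(frame);
      sendAck(frame); // step 4 (optional)
      break;
    case "error":     // step 5, failure path
      console.error(frame.code, frame.message);
      break;
    case "complete":  // step 5, success path
      console.log("stream complete", frame.summary);
      break;
  }
}

function applyChunk(frame: any): void { /* assemble text per frame.id (see chunk notes) */ }
function sendAck(frame: any): void { /* send an ack frame back over the transport */ }
```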

Common fields:

  • request_id: string (matches ACS planning request)
  • session_id: string (server-assigned if absent)
  • seq: integer β‰₯ 0 (monotonic per session)
  • ts: number (epoch ms)

Frames (JSON objects):

  1. init { "type": "init", "request_id": "r-123", "session_id": "s-abc", "ts": 1736190000000, "protocol": { "version": "0.2", "transport": "sse|ws" }, "hints": { "expected_layers": ["macro", "micro", "atomic"], "max_chunk_tokens": 512 } }

  2. chunk { "type": "chunk", "request_id": "r-123", "session_id": "s-abc", "seq": 0, "ts": 1736190000100, "lod": "macro|micro|atomic", "id": "frag:macro:001", "text": "...", "cost_tokens": 128, "kv_policy": { "pin": ["frag:macro:001"], "compress": [], "evict": [] }, "metrics": { "used_tokens": 128 } }

Notes:

  • Fields reuse shapes from render_context_reply.v0 when possible (lod, id, text, cost_tokens, kv_policy, metrics.used_tokens)
  • A chunk MAY carry partial text for the same id in subsequent frames; the receiver should concatenate or replace by id depending on delivery_mode (see below).
  1. delivery_mode (optional hint in chunk frames)
  • mode: "append" | "replace" (default append)
  • If replace: the chunk.text supersedes any prior text for the same id
  1. ack (client β†’ server) { "type": "ack", "request_id": "r-123", "session_id": "s-abc", "seq": 0, "status": "received|applied", "latency_ms": 42 }

  4. control (either direction) { "type": "control", "request_id": "r-123", "session_id": "s-abc", "action": "pause|resume|flush|tighten|relax", "reason": "backpressure|user_busy|low_attention|policy_change" }

  5. error { "type": "error", "request_id": "r-123", "session_id": "s-abc", "code": "STREAM_TIMEOUT|INTERNAL|INVALID_REQUEST", "message": "...", "retriable": true, "options": [ { "action": "retry" }, { "action": "backoff", "hint": "500-1500ms" } ] }

  6. complete { "type": "complete", "request_id": "r-123", "session_id": "s-abc", "summary": { "frames": 42, "used_tokens": 4096 } }
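
As referenced in the notes above, a minimal sketch of assembling chunk text per fragment id under the two delivery modes; it fills in the applyChunk placeholder from the lifecycle sketch and assumes delivery_mode is carried as a string field on the chunk frame:

```ts
// Sketch: assemble streamed text per fragment id, honoring delivery_mode.
// ChunkFrame lists only the fields used here; the Map-based buffer is illustrative.
interface ChunkFrame {
  type: "chunk";
  id: string;                           // e.g. "frag:macro:001"
  text: string;
  delivery_mode?: "append" | "replace"; // optional hint; default is "append"
}

const fragments = new Map<string, string>();

function applyChunk(chunk: ChunkFrame): void {
  const mode = chunk.delivery_mode ?? "append";
  if (mode === "replace") {
    // "replace": this text supersedes any prior text for the same id.
    fragments.set(chunk.id, chunk.text);
  } else {
    // "append": concatenate partial text for the same id in arrival order.
    fragments.set(chunk.id, (fragments.get(chunk.id) ?? "") + chunk.text);
  }
}
```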

Backpressure guidance:

  • Clients SHOULD send control {action:"pause"} if buffer exceeds thresholds.
  • Servers MAY downgrade LOD, reduce chunk size, or increase inter-frame delay upon backpressure.

Compatibility:

  • This draft is additive. Batch v0 responses remain valid and MAY be sent as a final snapshot after streaming completes.

Related:

  • Architecture: hcs/architecture.md
  • Internal API: api/internal.md
  • KV Policy: ../kv-policy.md
  • Errors/Retry options: ../errors.md, ../retry-actions.md