Engineering · 11 min read

Building a production streaming chat UI: SSE, cancellation, and error recovery

A complete guide to Server-Sent Events for AI chat — buffer management, AbortController, reconnection, and the [DONE] contract.

Nguyen Son Everestt, Founder & Engineering Lead, VeloxAI
#streaming #sse #chat

Streaming is not a performance optimization; it is a UX requirement. Users perceive a response as "fast" when the first token appears in under 500 ms, even if the full response takes 10 seconds. A non-streaming 4-second response feels slower than a streaming 6-second response that shows the first word immediately.

The SSE contract

Every chunk arrives as a data: line carrying a JSON payload, and the stream ends with data: [DONE]. Your parser must handle lines split across TCP segments (and therefore across read() calls), empty keepalive lines, and the termination signal. Never assume that one read() call returns exactly one complete data: line.

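// Shape of each event yielded by the parser; `data` is the parsed JSON payload.
type SSEEvent = { type: "chunk"; data: unknown };

class SSEParser {
  async *parse(response: Response): AsyncGenerator<SSEEvent> {
    if (!response.body) throw new Error("Response has no body");
    const reader = response.body.getReader();
    // Per-call state, so one parser instance can be reused across responses.
    const decoder = new TextDecoder();
    let buffer = "";
    try {
      while (true) {
        const { value, done } = await reader.read();
        if (done) break;
        // stream: true keeps multi-byte characters split across reads intact.
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split("\n");
        // The last element is an incomplete line (or ""); keep it for the next read.
        buffer = lines.pop() ?? "";
        for (const raw of lines) {
          const line = raw.endsWith("\r") ? raw.slice(0, -1) : raw; // tolerate CRLF
          if (!line.startsWith("data:")) continue; // blank lines, comments, keepalives
          const payload = line.slice(5).trimStart(); // the space after "data:" is optional
          if (payload === "[DONE]") return;
          yield { type: "chunk", data: JSON.parse(payload) };
        }
      }
    } finally {
      reader.releaseLock();
    }
  }
}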
SSE parser with buffer management
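
To see why the buffer matters, here is a minimal sketch that feeds three hand-written chunks through the same split-and-keep-the-remainder step. The helper name splitLines and the {"delta": ...} payload shape are assumptions made for illustration, not part of any provider's API.

```typescript
// Hypothetical helper: combine the leftover buffer with a new chunk and
// return the complete lines plus the partial line that must wait.
function splitLines(
  buffer: string,
  chunk: string,
): { lines: string[]; rest: string } {
  const parts = (buffer + chunk).split("\n");
  // The final element is either "" (chunk ended on a newline) or a partial line.
  const rest = parts.pop() ?? "";
  // Drop the blank separator lines between events.
  return { lines: parts.filter((l) => l !== ""), rest };
}

// One JSON event arriving in two pieces, then a [DONE] marker that is also split.
// The payload shape is made up for this example.
const chunks = ['data: {"delta":"Hel', 'lo"}\n\ndata: [DO', "NE]\n\n"];

let rest = "";
const seen: string[] = [];
for (const chunk of chunks) {
  const out = splitLines(rest, chunk);
  seen.push(...out.lines);
  rest = out.rest;
}

// Only complete lines are ever handed to JSON.parse.
console.log(seen);
```

Parsing each chunk on its own would call JSON.parse on the fragment {"delta":"Hel and throw. Carrying the remainder forward means the parser only ever sees whole lines.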

Stop button with AbortController

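Every streaming UI needs a Stop button. Without it, users pay for tokens they no longer want. Use AbortController: create it before calling fetch, pass its signal in the request options, and call abort() when the user clicks Stop. Aborting cancels the request, which lets the server stop generating tokens. Handle three failure modes as well: the connection is never established (show an error and offer a retry), the connection drops mid-stream (show the partial response along with an error), and the server returns an error after some chunks have arrived (show what you have and mark it incomplete).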
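
The Stop flow and the failure cases above can be sketched as one function that returns a tagged outcome for the UI to render. This is a rough illustration under stated assumptions: runStream, OpenStream, StreamOutcome, and fetchTransport are hypothetical names, the URL and request body are placeholders, and the transport yields raw decoded text where a real client would run its SSE parser.

```typescript
// Outcome of one streaming request, one variant per case the UI must render.
type StreamOutcome =
  | { status: "complete"; text: string }
  | { status: "stopped"; text: string } // user pressed Stop
  | { status: "failed"; error: string } // connection never established
  | { status: "incomplete"; text: string; error: string }; // failed after chunks

// Hypothetical transport: opens the stream and yields text as it arrives.
type OpenStream = (signal: AbortSignal) => Promise<AsyncIterable<string>>;

async function runStream(
  open: OpenStream,
  controller: AbortController,
  onText: (textSoFar: string) => void,
): Promise<StreamOutcome> {
  let text = "";
  let connected = false;
  try {
    const stream = await open(controller.signal);
    connected = true;
    for await (const delta of stream) {
      text += delta;
      onText(text);
    }
    if (controller.signal.aborted) return { status: "stopped", text };
    return { status: "complete", text };
  } catch (err) {
    // An abort surfaces as a thrown error; treat it as a normal stop.
    if (controller.signal.aborted) return { status: "stopped", text };
    const error = err instanceof Error ? err.message : String(err);
    return connected
      ? { status: "incomplete", text, error } // keep the partial response
      : { status: "failed", error };
  }
}

// Example transport built on fetch. The URL and body are placeholders.
function fetchTransport(url: string, body: unknown): OpenStream {
  return async (signal) => {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
      signal, // calling abort() on the controller cancels this request
    });
    if (!res.ok || !res.body) throw new Error(`HTTP ${res.status}`);
    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    return (async function* () {
      while (true) {
        const { value, done } = await reader.read();
        if (done) return;
        yield decoder.decode(value, { stream: true });
      }
    })();
  };
}
```

In a UI, create a fresh AbortController for each request, call runStream(fetchTransport(url, body), controller, render), and wire the Stop button's click handler to controller.abort(). Returning a tagged outcome instead of throwing keeps the partial text available in every branch, so the view can show it with a "stopped" or "incomplete" marker.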
