Most chat widgets on the web are heavier than the page they sit on. A typical embeddable chat ships somewhere between 280KB and 1.4MB of JavaScript before a single message is sent, brings its own CSS reset, blocks the main thread for 600ms during hydration, and pings a websocket every second to keep the “online” dot warm.
We did not want to ship that. Live Chat — the chat product in the ByeBloat suite — is the second product we built, and we spent more time on the widget than we did on the inbox. The widget is what your customer sees. It is also the thing that shows up on your marketing page’s Lighthouse score.
Constraints we picked up front
- Total transferred bytes (gzip) must be under 20KB before any message is sent.
- First paint of the launcher must happen in under one frame on a mid-tier Android device.
- The widget must not import a CSS reset, must not register global event listeners during idle, and must not animate when the user has prefers-reduced-motion set.
- Round-trip latency for “send a message and see it on the agent’s screen” must be under 100ms p50 in North America.
The architecture
The widget is a single bundle, served from a long-cache CDN edge, that hydrates a shadow DOM. We use shadow DOM not for encapsulation theater but because it lets us ship a single stylesheet without fighting the host page’s CSS reset. Inside the shadow root, we use plain CSS variables — the same ones that drive ByeBloat’s marketing site — so the widget can be themed by the customer with a four-line config.
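To make the theming idea concrete, here is a minimal sketch of how a small config object could be turned into CSS custom properties on the shadow host. The config keys, the `--bb-` prefix, and the `themeToCss` name are all hypothetical; the widget's real option names are not documented here.

```typescript
// Hypothetical four-key theme config (illustrative names, not the real API).
type WidgetTheme = {
  accent: string;
  background: string;
  text: string;
  radius: string;
};

// Allow only characters that are safe inside a CSS declaration value, so a
// config value cannot close the rule and inject extra CSS.
const SAFE_VALUE = /^[#(),.%\w\s-]+$/;

function themeToCss(theme: WidgetTheme): string {
  const keys = Object.keys(theme) as (keyof WidgetTheme)[];
  const declarations = keys.map((key) => {
    const value = theme[key];
    if (!SAFE_VALUE.test(value)) {
      throw new Error(`Unsafe theme value for "${key}"`);
    }
    return `  --bb-${key}: ${value.trim()};`;
  });
  // :host sets the variables on the widget's own host element.
  return `:host {\n${declarations.join("\n")}\n}`;
}
```

In a browser, the returned string would go into a `<style>` element appended to the shadow root created with `attachShadow({ mode: "open" })`, and the widget's single stylesheet would read the values with `var(--bb-accent)` and friends.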
Real-time uses a single websocket per browser tab. We do not fall back to long-polling unless the websocket fails to upgrade twice in a row, and even then the fallback is server-sent events, not the 2010-era long-poll loop. About 0.4% of sessions end up on SSE; almost none on polling.
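The fallback rule above can be sketched as a small state machine. This is an illustration under stated assumptions, not the shipped code: the event names and the `nextTransportState` function are invented for the example, and the step from SSE down to polling is a guess at how the last resort is reached.

```typescript
type Transport = "websocket" | "sse" | "polling";
type TransportEvent = "connected" | "upgrade_failed" | "sse_failed";

interface TransportState {
  transport: Transport;
  consecutiveUpgradeFailures: number;
}

// From the text: fall back only after the upgrade fails twice in a row.
const MAX_WS_UPGRADE_FAILURES = 2;

const initialTransportState: TransportState = {
  transport: "websocket",
  consecutiveUpgradeFailures: 0,
};

function nextTransportState(state: TransportState, event: TransportEvent): TransportState {
  switch (event) {
    case "connected":
      // A successful connection resets the failure streak.
      return { transport: state.transport, consecutiveUpgradeFailures: 0 };
    case "upgrade_failed": {
      if (state.transport !== "websocket") return state;
      const failures = state.consecutiveUpgradeFailures + 1;
      return failures >= MAX_WS_UPGRADE_FAILURES
        ? { transport: "sse", consecutiveUpgradeFailures: 0 }
        : { transport: "websocket", consecutiveUpgradeFailures: failures };
    }
    case "sse_failed":
      // Assumption: polling is reached only when SSE also fails.
      return state.transport === "sse"
        ? { transport: "polling", consecutiveUpgradeFailures: 0 }
        : state;
  }
}
```

Keeping the decision in a pure function means the "twice in a row" rule can be tested without opening a real socket. A single failed upgrade followed by a successful connection leaves the session on the websocket.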
Presence is hard, and most of it is theater
The classic “agent is online” green dot is, in most chat widgets, a lie. It means “an agent has been logged in within the last 30 minutes” or “the system is enabled.” Customers learn the lie quickly and stop trusting the dot.
We took a hard line on presence:
- The dot is green only when at least one agent has had the inbox tab focused in the last 60 seconds.
- The dot is yellow when an agent is online but their tab is not focused (they are still going to see the message; they are just not staring at the inbox).
- The dot is gray otherwise, and we tell the customer explicitly: “We’ll reply by email.”
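The three rules above fit in one small function. This is a sketch, not the production logic: the `AgentPresence` shape and the `presenceDot` name are assumptions made for the example, as is the choice to ignore agents who are no longer online even if they had the inbox focused recently.

```typescript
type Dot = "green" | "yellow" | "gray";

// Hypothetical per-agent presence record.
interface AgentPresence {
  online: boolean;                 // agent currently has a live session
  lastInboxFocusAt: number | null; // epoch ms when the inbox tab last had focus
}

// From the text: green requires inbox focus within the last 60 seconds.
const FOCUS_WINDOW_MS = 60 * 1000;

function presenceDot(agents: AgentPresence[], nowMs: number): Dot {
  let anyOnline = false;
  for (const agent of agents) {
    if (!agent.online) continue; // assumption: offline agents never count
    anyOnline = true;
    if (
      agent.lastInboxFocusAt !== null &&
      nowMs - agent.lastInboxFocusAt <= FOCUS_WINDOW_MS
    ) {
      return "green"; // at least one agent is looking at the inbox
    }
  }
  // Online but not focused is yellow; nobody online is gray.
  return anyOnline ? "yellow" : "gray";
}
```

With no agents at all, or only offline ones, the function returns "gray", which is the state where the customer is told to expect a reply by email.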
This sounds obvious. It is not how most products do it, because the gray dot loses sales. We think the gray dot wins trust, and trust compounds.
Latency, where it actually comes from
Most chat round-trip latency is not network. It is the time between the websocket frame arriving and the agent’s inbox UI committing a render. We measured ours at p50 and found that, after the network hop, three client-side stages account for the rest:
receive → 4ms  (network, NA → NA)
parse   → 1ms  (single JSON.parse, no zod on the hot path)
state   → 38ms (immer + react state batching)
paint   → 19ms (layout + paint of the new message row)
total   → 62ms
The state-update step was the surprise. We were running the incoming message through an immer-based reducer that did a deep clone of the conversation tree. We replaced it with a shallow append on a flat array and dropped 30ms off p50. Three weeks of work for one decision a senior engineer made in fifteen minutes after lunch.
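For readers who want to see the shape of that change, here is a minimal sketch of a shallow append on a flat array. The `Message` and `InboxState` types and the `appendMessage` name are illustrative assumptions, not the actual inbox reducer.

```typescript
// Hypothetical message shape for the example.
interface Message {
  id: string;
  conversationId: string;
  body: string;
  sentAt: number;
}

// Flat list of messages instead of a nested conversation tree.
interface InboxState {
  messages: readonly Message[];
}

function appendMessage(state: InboxState, incoming: Message): InboxState {
  // concat builds a new outer array but reuses every existing Message
  // object, so nothing in the old state is cloned or mutated.
  return { messages: state.messages.concat([incoming]) };
}
```

Because existing rows keep their object identity, a memoized row component can skip re-rendering everything except the newly appended message.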
What is not in the widget
The widget does not contain a knowledge base, a help center search, an AI agent, a satisfaction survey, a co-browse button, a file picker, a video call launcher, an emoji picker, or a chat-bot greeting that says “Hi! I’m Sparky, your AI assistant.”
It contains a launcher, a thread, a text input, and a send button. That is what a chat widget is. Everything else is a product someone tried to sell you on top of one.