What Is WebSocket Infrastructure?
WebSocket infrastructure is the layer that manages persistent connections at scale — handling auth, routing, fanout, and delivery so your app doesn't have to.
Most developers have used WebSockets. Fewer have thought about what it takes to run them reliably at scale. The protocol is simple — a persistent, bidirectional channel over a single TCP connection between client and server. The infrastructure around it is not.
WebSocket infrastructure is the collection of systems that make WebSockets work in production: connection management, authentication, message fanout across servers, presence tracking, reconnection handling, and delivery guarantees. It's the difference between a WebSocket demo and a WebSocket product.
The Protocol vs The Infrastructure
The WebSocket protocol itself is defined in RFC 6455. It describes an HTTP upgrade handshake and a framing format for messages. That's it. The protocol says nothing about:
- How to authenticate connections
- How to route messages to the right clients
- How to fan a message out to 10,000 subscribers at once
- How to track which users are online
- How to handle reconnections when a server crashes
- How to scale across multiple processes or machines
These are infrastructure problems. Solving them is the bulk of the work.
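One thing RFC 6455 does specify is the upgrade handshake, and it is genuinely small. As a taste, here is the server-side key computation from the RFC, sketched in TypeScript: the server proves it speaks WebSocket by echoing a SHA-1 digest of the client's Sec-WebSocket-Key concatenated with a fixed GUID.

```typescript
import { createHash } from "node:crypto";

// Fixed GUID defined by RFC 6455, section 1.3.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Compute the Sec-WebSocket-Accept header value the server must return
// in the 101 Switching Protocols response.
function secWebSocketAccept(clientKey: string): string {
  return createHash("sha1").update(clientKey + WS_GUID).digest("base64");
}

// Example from RFC 6455:
// secWebSocketAccept("dGhlIHNhbXBsZSBub25jZQ==")
//   → "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

Everything after this handshake — who the client is, what they may subscribe to, how messages reach them — is left to you.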
What WebSocket Infrastructure Actually Does
Connection Management
A WebSocket server must keep track of every open connection — who it belongs to, which channels it's subscribed to, and its current state. At 10 connections this is trivial. At 100,000, you need efficient data structures, memory budgets, and cleanup logic for stale connections.
Connections are stateful. Unlike HTTP, where each request is independent, a WebSocket connection lives for minutes or hours. The server must handle the full lifecycle: open, subscribe, message exchange, heartbeats, graceful close, and ungraceful disconnects.
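A minimal in-memory sketch of such a registry — the names here are illustrative, not from any particular library — might look like this: connections are indexed by id, channel membership is tracked in both directions, and a periodic sweep evicts connections whose heartbeat has gone stale.

```typescript
interface Connection {
  id: string;
  userId: string;
  channels: Set<string>;
  lastSeen: number; // ms timestamp of the last heartbeat
}

class ConnectionRegistry {
  private conns = new Map<string, Connection>();
  private byChannel = new Map<string, Set<string>>(); // channel -> conn ids

  open(id: string, userId: string, now: number): void {
    this.conns.set(id, { id, userId, channels: new Set(), lastSeen: now });
  }

  subscribe(id: string, channel: string): void {
    const conn = this.conns.get(id);
    if (!conn) return;
    conn.channels.add(channel);
    let members = this.byChannel.get(channel);
    if (!members) this.byChannel.set(channel, (members = new Set()));
    members.add(id);
  }

  heartbeat(id: string, now: number): void {
    const conn = this.conns.get(id);
    if (conn) conn.lastSeen = now;
  }

  close(id: string): void {
    const conn = this.conns.get(id);
    if (!conn) return;
    for (const ch of conn.channels) {
      const members = this.byChannel.get(ch);
      members?.delete(id);
      if (members && members.size === 0) this.byChannel.delete(ch);
    }
    this.conns.delete(id);
  }

  // Evict connections whose last heartbeat is older than maxIdleMs —
  // the cleanup logic for ungraceful disconnects.
  sweepStale(now: number, maxIdleMs: number): string[] {
    const dropped: string[] = [];
    for (const conn of this.conns.values()) {
      if (now - conn.lastSeen > maxIdleMs) dropped.push(conn.id);
    }
    for (const id of dropped) this.close(id);
    return dropped;
  }

  subscribers(channel: string): string[] {
    return [...(this.byChannel.get(channel) ?? [])];
  }
}
```

The two-way index is the point: publishing needs channel→connections, cleanup needs connection→channels, and keeping both consistent is exactly the kind of bookkeeping that gets hard at 100,000 connections.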
Authentication
Most useful WebSocket connections need to be authenticated. Who is this client? Which channels are they allowed to join? This requires integrating with your existing auth system — typically by validating a token during the HTTP upgrade request, or by signing channel subscriptions with HMAC.
Public channels are straightforward. Private and presence channels require per-subscription auth checks: the client requests permission, your server signs the authorization, the WebSocket server verifies it.
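A sketch of the signing half, assuming a Pusher-style HMAC-SHA256 scheme over a `socketId:channel` payload (the exact payload format varies by provider): your app server produces the signature, and the WebSocket server recomputes and compares it before admitting the subscription.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Your app server signs the subscription request after checking its own
// auth rules (is this user allowed in this channel?).
function signSubscription(secret: string, socketId: string, channel: string): string {
  return createHmac("sha256", secret).update(`${socketId}:${channel}`).digest("hex");
}

// The WebSocket server verifies the signature with the shared secret.
// timingSafeEqual avoids leaking information via comparison timing.
function verifySubscription(
  secret: string,
  socketId: string,
  channel: string,
  signature: string,
): boolean {
  const expected = Buffer.from(signSubscription(secret, socketId, channel), "hex");
  const given = Buffer.from(signature, "hex");
  return expected.length === given.length && timingSafeEqual(expected, given);
}
```

Binding the socket id into the payload matters: it prevents a signature issued for one connection from being replayed on another.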
Pub/Sub Fanout
When a message is published to a channel, every subscriber must receive it. On a single server this is a local broadcast — iterate over in-memory subscribers, send to each. Across multiple servers it's a distributed fanout problem.
The standard solution is Redis Pub/Sub. Each server node subscribes to a Redis Pub/Sub channel for each application channel that has local subscribers. When a message is published from any node, Redis delivers it to every subscribed node, and each node forwards it to its local connections.
Client A publishes → Server 1 → Redis PUBLISH channel:chat-room
                                        ↓
                     Server 1 receives → sends to local subscribers
                     Server 2 receives → sends to local subscribers
                     Server 3 receives → sends to local subscribers
This is ref-counted: a server subscribes to Redis only when it has at least one local subscriber, and unsubscribes when the last one leaves.
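The ref-counting logic can be sketched with an in-memory bus standing in for Redis — in production the subscribe/unsubscribe/publish calls would go to a Redis client such as ioredis, but the bookkeeping on each node is the same.

```typescript
type Handler = (message: string) => void;

// Stand-in for Redis Pub/Sub so the sketch is self-contained.
class Bus {
  private topics = new Map<string, Set<Handler>>();
  subscribe(topic: string, h: Handler): void {
    let hs = this.topics.get(topic);
    if (!hs) this.topics.set(topic, (hs = new Set()));
    hs.add(h);
  }
  unsubscribe(topic: string, h: Handler): void {
    const hs = this.topics.get(topic);
    hs?.delete(h);
    if (hs && hs.size === 0) this.topics.delete(topic);
  }
  publish(topic: string, message: string): void {
    for (const h of this.topics.get(topic) ?? []) h(message);
  }
}

class ServerNode {
  private local = new Map<string, Set<Handler>>(); // channel -> local client sends
  private busHandlers = new Map<string, Handler>();
  constructor(private bus: Bus) {}

  addLocalSubscriber(channel: string, send: Handler): void {
    let subs = this.local.get(channel);
    if (!subs) {
      this.local.set(channel, (subs = new Set()));
      // First local subscriber: subscribe this node to the shared bus.
      const h: Handler = (msg) => {
        for (const s of this.local.get(channel) ?? []) s(msg);
      };
      this.busHandlers.set(channel, h);
      this.bus.subscribe(channel, h);
    }
    subs.add(send);
  }

  removeLocalSubscriber(channel: string, send: Handler): void {
    const subs = this.local.get(channel);
    if (!subs) return;
    subs.delete(send);
    if (subs.size === 0) {
      // Last local subscriber gone: drop the bus subscription.
      this.local.delete(channel);
      const h = this.busHandlers.get(channel)!;
      this.busHandlers.delete(channel);
      this.bus.unsubscribe(channel, h);
    }
  }

  publish(channel: string, message: string): void {
    // Goes through the bus so every node with subscribers receives it.
    this.bus.publish(channel, message);
  }
}
```

The ref-counting keeps idle nodes from processing traffic for channels nobody on that node cares about — without it, every node sees every message.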
Presence Tracking
Presence is the ability to know who's currently connected to a channel — and to notify members when others join or leave. It's used for typing indicators, viewer counts, online status, and live cursors.
Implementing presence requires a shared, distributed member registry. Each connection registers its user data on join; the registry is queried when a new member wants the current member list; and join/leave events are broadcast when membership changes. Redis hashes and sorted sets are the typical backing store, with heartbeats to detect stale connections.
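A single-process sketch of a presence channel follows; the Map stands in for the Redis-backed registry, and the same operations map onto HSET/HDEL/HGETALL plus a per-member last-seen timestamp for stale detection. Names are illustrative.

```typescript
interface Member {
  userId: string;
  info: Record<string, string>; // user data registered on join
  lastSeen: number;             // ms timestamp of last heartbeat
}

class PresenceChannel {
  private members = new Map<string, Member>();
  readonly events: string[] = []; // stands in for broadcasting to the channel

  join(userId: string, info: Record<string, string>, now: number): void {
    const isNew = !this.members.has(userId);
    this.members.set(userId, { userId, info, lastSeen: now });
    if (isNew) this.events.push(`join:${userId}`);
  }

  heartbeat(userId: string, now: number): void {
    const m = this.members.get(userId);
    if (m) m.lastSeen = now;
  }

  leave(userId: string): void {
    if (this.members.delete(userId)) this.events.push(`leave:${userId}`);
  }

  // Evict members whose heartbeat is stale — e.g. crashed clients that
  // never sent a graceful leave.
  evictStale(now: number, maxIdleMs: number): void {
    for (const [id, m] of this.members) {
      if (now - m.lastSeen > maxIdleMs) this.leave(id);
    }
  }

  // Queried when a new member wants the current member list.
  list(): string[] {
    return [...this.members.keys()];
  }
}
```

The heartbeat-based eviction is what keeps viewer counts honest: without it, a phone that drops off a train's Wi-Fi stays "online" forever.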
Reconnection Handling
Networks are unreliable. Mobile clients go offline. Servers restart. A production WebSocket client needs automatic reconnection with exponential backoff. The server needs to handle reconnecting clients cleanly: restoring subscriptions and, if the use case demands it, replaying missed messages.
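The client-side backoff calculation is small enough to show in full. This sketch uses "full jitter" — a uniformly random delay up to the exponential cap — so that thousands of clients disconnected at the same moment don't all reconnect at the same moment. The random source is injectable for testing.

```typescript
// Delay before reconnect attempt `attempt` (0-based): exponential growth
// capped at maxMs, then scaled by a random factor in [0, 1) ("full jitter").
function backoffDelay(
  attempt: number,
  baseMs = 1_000,
  maxMs = 30_000,
  random: () => number = Math.random,
): number {
  const exp = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(exp * random());
}
```

A reconnect loop would call this after each failed attempt, reset `attempt` to 0 on a successful connection, and re-issue channel subscriptions once the new socket is open.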
Why Building It Yourself Is Hard
The components above are well-understood individually. The difficulty is in the interactions:
- Sticky sessions — if clients must reconnect to the same server, load balancing becomes constrained. If you go stateless (via Redis), you must handle the fanout overhead.
- Memory pressure — each WebSocket connection holds file descriptors and buffers. A single Node.js process can handle ~10k connections before OS limits bite. Scaling requires tuning ulimit, TCP keepalives, and connection draining on deploy.
- Graceful deploys — rolling restarts mean closing connections. Clients see disconnects. You need coordinated draining, client-side reconnection, and ideally zero visible disruption.
- Split-brain scenarios — what happens when your Redis instance goes down? Or a network partition isolates a server node?
These aren't unsolvable problems, but each one takes real engineering time to get right.
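As one illustration, the draining half of a graceful deploy can be sketched as closing connections in small batches, so reconnecting clients trickle onto the remaining nodes instead of stampeding them. Names here are illustrative; in practice `close` would send a WebSocket close frame (e.g. code 1001, "going away").

```typescript
// On SIGTERM: stop accepting new connections, then drain existing ones in
// batches with a pause between batches. Returns the number of batches sent.
async function drain(
  connections: Array<{ close: () => void }>,
  batchSize: number,
  pauseMs: number,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<number> {
  let batches = 0;
  for (let i = 0; i < connections.length; i += batchSize) {
    for (const conn of connections.slice(i, i + batchSize)) conn.close();
    batches++;
    if (i + batchSize < connections.length) await sleep(pauseMs);
  }
  return batches;
}
```

Paired with client-side backoff, this is what turns a rolling restart from a visible outage into a brief, staggered blip.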
Managed WebSocket Infrastructure
Managed infrastructure handles all of this for you. You get an API to publish events, SDKs to subscribe from clients, and the connection management, fanout, presence, and scaling handled by the provider.
With Apinator, for example, publishing a message to all subscribers of a channel is a single HTTP call from your server:
import { ApinatorServer } from '@apinator/server';
const apinator = new ApinatorServer({
appId: process.env.APINATOR_APP_ID,
apiKey: process.env.APINATOR_API_KEY,
apiSecret: process.env.APINATOR_API_SECRET,
});
await apinator.publish('chat-room', 'message', {
text: 'Hello, world!',
userId: '42',
});
And subscribing from the client is equally concise:
import { RealtimeClient } from '@apinator/sdk';
const client = new RealtimeClient({ appKey: 'your-app-key' });
const channel = client.subscribe('chat-room');
channel.bind('message', (data) => {
renderMessage(data);
});
The infrastructure layer — Redis fanout, connection tracking, auth signing, presence, reconnection — is abstracted away.
When You Might Build Your Own
Managed infrastructure makes sense for most applications. You might consider rolling your own if:
- You have strict data residency requirements that prohibit third-party message routing
- You're operating at a scale where the economics favor owning the stack
- Your messaging patterns are unusual enough that generic infrastructure doesn't fit
For the vast majority of teams, the cost of building and maintaining WebSocket infrastructure exceeds the cost of using a managed service. The protocol is simple. The infrastructure is not.