How the WebSocket Handshake Works
The WebSocket connection starts as HTTP and upgrades to a persistent socket. Here's exactly what happens during the handshake — headers, keys, and all.
WebSocket connections don't start as WebSocket connections. They start as HTTP requests. That's by design — and understanding why reveals a lot about how the web's infrastructure actually works.
Why HTTP First?
The internet is built around HTTP. Proxies, load balancers, firewalls, and CDNs all know how to handle HTTP traffic. If WebSocket had invented its own TCP-based protocol from scratch on a new port, much of that infrastructure would have blocked it.
Instead, WebSocket piggybacks on HTTP to get through the existing infrastructure, then upgrades the connection to something more capable. By default the handshake happens on port 80 (or 443 for TLS), over an ordinary TCP connection that nearly every router and firewall already allows. Once the upgrade completes, the two sides speak WebSocket for the rest of the connection's life.
The Client's Upgrade Request
The client sends a standard HTTP/1.1 GET request, but with a specific set of headers that signal its intent to upgrade:
GET /chat HTTP/1.1
Host: api.apinator.io
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Origin: https://example.com
Let's go through the important ones:
- Upgrade: websocket tells the server that this request wants to switch protocols.
- Connection: Upgrade is required alongside Upgrade. It marks Upgrade as a hop-by-hop header that applies to this connection only, so an intermediary should act on it rather than forward it blindly.
- Sec-WebSocket-Key is a base64-encoded random 16-byte value generated by the client. It's not a security credential; it's a handshake nonce used to prove the server actually understands the WebSocket protocol.
- Sec-WebSocket-Version: 13 is the WebSocket protocol version defined in RFC 6455. Version 13 is the only version in wide use today.
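The Origin header is sent by browsers; non-browser clients may omit it or set it to any value. Servers use it to decide which sites are allowed to open WebSocket connections. The check matters because browsers do not apply the same-origin policy to WebSocket connections, so rejecting unwanted origins is the server's job.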
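As a rough illustration of the request above, here is a minimal sketch of how a client might generate the key and assemble the request text. The function names `generateClientKey` and `buildUpgradeRequest` are hypothetical, and a real client would also need to open the socket and parse the response.

```javascript
const crypto = require("crypto");

// Generate a fresh Sec-WebSocket-Key: 16 random bytes, base64-encoded.
function generateClientKey() {
  return crypto.randomBytes(16).toString("base64");
}

// Build the raw upgrade request text (hypothetical helper).
// `host`, `path`, and `origin` are supplied by the caller.
function buildUpgradeRequest({ host, path, origin }) {
  const key = generateClientKey();
  const lines = [
    `GET ${path} HTTP/1.1`,
    `Host: ${host}`,
    "Upgrade: websocket",
    "Connection: Upgrade",
    `Sec-WebSocket-Key: ${key}`,
    "Sec-WebSocket-Version: 13",
  ];
  if (origin) lines.push(`Origin: ${origin}`);
  // HTTP header lines are separated by CRLF and end with a blank line.
  return { key, request: lines.join("\r\n") + "\r\n\r\n" };
}
```

The key is returned alongside the request text because the client needs it again later, when it checks the server's Sec-WebSocket-Accept value.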
The Server's Response
If the server supports WebSocket and accepts the upgrade, it responds with HTTP 101:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
The status code 101 Switching Protocols is the only valid success response here. The critical header is Sec-WebSocket-Accept. The server derives it from the client's key using a specific algorithm:
- Take the Sec-WebSocket-Key value the client sent.
- Append the fixed magic string 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 (defined in the RFC).
- Compute the SHA-1 hash of the combined string.
- Base64-encode the result.
In JavaScript, that looks like this:
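const crypto = require("crypto");

// Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key.
function computeAcceptKey(clientKey) {
  // Fixed GUID defined in RFC 6455.
  const MAGIC = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";
  return crypto
    .createHash("sha1")
    .update(clientKey + MAGIC) // key and GUID concatenated as strings
    .digest("base64");
}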
The client performs the same computation independently and verifies the server's response matches. If it doesn't, the client aborts the connection. This confirms the server genuinely understands WebSocket — a plain HTTP server that just echoes headers back would fail this check.
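The client-side check can be sketched in the same way. This is a minimal illustration rather than a full client; `expectedAccept` and `isValidAccept` are hypothetical names, and the sample values are the key and accept strings from the request and response shown earlier.

```javascript
const crypto = require("crypto");

const MAGIC = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Recompute what the server should have sent for the key we generated.
function expectedAccept(clientKey) {
  return crypto.createHash("sha1").update(clientKey + MAGIC).digest("base64");
}

// True only if the server's Sec-WebSocket-Accept matches our own computation.
function isValidAccept(clientKey, serverAccept) {
  return typeof serverAccept === "string" &&
    serverAccept.trim() === expectedAccept(clientKey);
}

// A correct response passes the check.
console.log(isValidAccept("dGhlIHNhbXBsZSBub25jZQ==", "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=")); // true
// A server that merely echoes the key back fails it.
console.log(isValidAccept("dGhlIHNhbXBsZSBub25jZQ==", "dGhlIHNhbXBsZSBub25jZQ==")); // false
```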
After the Handshake: HTTP Is Gone
Once the client validates the Sec-WebSocket-Accept value, the HTTP exchange is finished. Both sides now treat the TCP connection as a WebSocket connection. No more HTTP request/response cycles. The same underlying socket stays open and both parties can send messages at any time, in either direction.
WebSocket Frames
Everything transmitted over a WebSocket connection after the handshake is wrapped in a frame. The frame format is compact and binary:
- Opcode (4 bits): the frame type. 0x1 for text, 0x2 for binary, 0x8 for close, 0x9 for ping, 0xA for pong.
- Mask bit (1 bit): client-to-server frames are always masked; server-to-client frames are never masked (per the spec).
- Payload length: variable-length encoding. 7 bits cover payloads up to 125 bytes; 7+16 bits cover up to 65535 bytes; 7+64 bits cover larger payloads.
- Masking key (32 bits): present only when the mask bit is set; the payload is XORed with this key.
- Payload data: the actual message content.
You don't need to implement this yourself when using a library or a platform like Apinator, but knowing the structure helps when reading protocol traces or debugging low-level issues.
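To make the layout concrete, here is a minimal sketch of a frame parser. It assumes the buffer holds exactly one complete frame, and `parseFrame` is a hypothetical name; a real implementation would also handle partial reads and fragmented messages. The sample bytes are the masked "Hello" text frame used as an example in RFC 6455.

```javascript
// Parse one complete WebSocket frame from a Buffer (sketch; no fragmentation
// or partial-read handling).
function parseFrame(buf) {
  const opcode = buf[0] & 0x0f;          // low 4 bits of the first byte
  const masked = (buf[1] & 0x80) !== 0;  // top bit of the second byte
  let length = buf[1] & 0x7f;            // 7-bit length, or a 126/127 marker
  let offset = 2;

  if (length === 126) {
    length = buf.readUInt16BE(offset);   // next 16 bits hold the real length
    offset += 2;
  } else if (length === 127) {
    // Next 64 bits hold the real length; Number() is exact up to 2^53 - 1.
    length = Number(buf.readBigUInt64BE(offset));
    offset += 8;
  }

  let maskingKey = null;
  if (masked) {
    maskingKey = buf.subarray(offset, offset + 4);
    offset += 4;
  }

  // Copy the payload so unmasking does not modify the input buffer.
  const payload = Buffer.from(buf.subarray(offset, offset + length));
  if (masked) {
    for (let i = 0; i < payload.length; i++) payload[i] ^= maskingKey[i % 4];
  }
  return { opcode, masked, payload };
}

const sample = parseFrame(
  Buffer.from([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58])
);
console.log(sample.opcode, sample.payload.toString("utf8")); // 1 Hello
```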
Why Masking Matters
The masking requirement on client frames exists to prevent a specific attack against caching proxies. Without masking, a malicious page could craft WebSocket payloads that look like HTTP responses, potentially poisoning a shared proxy's cache with attacker-controlled content for other users.
By requiring clients to XOR every payload with a random 32-bit masking key, chosen fresh for each frame, the WebSocket spec ensures that the bytes on the wire are unpredictable to the page that sent them and cannot be shaped to look like valid HTTP traffic to an intermediary. The server unmasks the payload before processing it. Server-to-client frames don't need masking because the attack depends on an untrusted script choosing the bytes a client sends; the server is assumed to be a trustworthy endpoint.
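The masking step itself is a byte-wise XOR, which can be sketched as follows. The names `applyMask` and `buildClientTextFrame` are hypothetical, and the frame builder is deliberately limited to payloads of up to 125 bytes to keep the length encoding trivial.

```javascript
const crypto = require("crypto");

// XOR each payload byte with the matching byte of the 4-byte key.
// Applying the same key twice restores the original bytes.
function applyMask(payload, key) {
  const out = Buffer.alloc(payload.length);
  for (let i = 0; i < payload.length; i++) out[i] = payload[i] ^ key[i % 4];
  return out;
}

// Build a masked client text frame (sketch: single frame, short payloads only).
function buildClientTextFrame(text, key = crypto.randomBytes(4)) {
  const payload = Buffer.from(text, "utf8");
  if (payload.length > 125) {
    throw new RangeError("this sketch only handles payloads up to 125 bytes");
  }
  // 0x81: final-frame flag plus text opcode; 0x80: mask bit set.
  const header = Buffer.from([0x81, 0x80 | payload.length]);
  return Buffer.concat([header, key, applyMask(payload, key)]);
}
```

Because XOR is its own inverse, the server can reuse the same routine to unmask: `applyMask(maskedPayload, key)` returns the original bytes.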
Subprotocols
If the client wants to negotiate an application-level protocol on top of WebSocket, it can send the Sec-WebSocket-Protocol header:
Sec-WebSocket-Protocol: chat, superchat
The server selects at most one of the offered names and echoes it back in its own Sec-WebSocket-Protocol response header. WebSocket itself attaches no meaning to the name; both sides need to agree on what the protocol means. Common examples include graphql-ws for GraphQL subscriptions and mqtt for IoT messaging.
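One plausible way for a server to make that selection is to take the first offered name it supports. The sketch below assumes that policy; `selectSubprotocol` is a hypothetical helper, and a server is free to rank the offers differently.

```javascript
// Pick the first client-offered subprotocol the server supports, or null.
// `headerValue` is the raw Sec-WebSocket-Protocol header from the client.
function selectSubprotocol(headerValue, supported) {
  if (!headerValue) return null;
  const offered = headerValue
    .split(",")
    .map((name) => name.trim())
    .filter(Boolean);
  return offered.find((name) => supported.includes(name)) ?? null;
}

console.log(selectSubprotocol("chat, superchat", ["superchat", "chat"])); // chat
console.log(selectSubprotocol("chat, superchat", ["mqtt"]));              // null
```

When the result is null, the server simply leaves the header out of its response.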
TLS: wss:// Is Just WebSocket Over TLS
The wss:// scheme is WebSocket over TLS, exactly as https:// is HTTP over TLS. The TLS handshake happens first, establishing an encrypted tunnel, and then the WebSocket HTTP upgrade happens inside that tunnel. From the protocol perspective, nothing changes: same headers, same handshake, same frame format. The TLS layer handles the encryption transparently.
In production, always use wss://. Plain ws:// sends all data in cleartext, including any authentication tokens passed during the connection.
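The only things the scheme changes on the client side are the default port and whether the socket is wrapped in TLS. A small sketch, using the standard URL class, shows the mapping; `resolveEndpoint` is a hypothetical helper and the URLs are illustrative.

```javascript
// Work out where a WebSocket URL points and whether it needs TLS.
function resolveEndpoint(wsUrl) {
  const url = new URL(wsUrl);
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    throw new TypeError(`not a WebSocket URL: ${wsUrl}`);
  }
  const secure = url.protocol === "wss:";
  // url.port is "" when the URL uses the scheme's default port.
  const port = url.port ? Number(url.port) : secure ? 443 : 80;
  return { host: url.hostname, port, secure, path: url.pathname + url.search };
}

console.log(resolveEndpoint("wss://example.com/chat"));
// { host: 'example.com', port: 443, secure: true, path: '/chat' }
```

A client would then open a TLS socket when `secure` is true and a plain TCP socket otherwise, and send the same upgrade request either way.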
What This Means in Practice
Any HTTP/1.1 server can become a WebSocket server by handling the Upgrade header correctly. The existing TCP connection is reused — no new socket, no new port, no firewall rules to update. This is why WebSocket adoption happened quickly: it required almost no infrastructure changes.
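As a sketch of what "handling the Upgrade header" involves, the code below builds the 101 response and hooks it into the `upgrade` event that Node's built-in http server emits. The function names are hypothetical, and the sketch skips Origin checks, subprotocols, and frame handling.

```javascript
const crypto = require("crypto");

const MAGIC = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Build the 101 response for an upgrade request, or return null if the
// request is not a valid WebSocket handshake. `headers` uses lowercase
// names, which is how Node's http module exposes them.
function buildHandshakeResponse(headers) {
  const upgrade = (headers["upgrade"] || "").toLowerCase();
  const key = headers["sec-websocket-key"];
  const version = headers["sec-websocket-version"];
  if (upgrade !== "websocket" || !key || version !== "13") return null;

  const accept = crypto.createHash("sha1").update(key + MAGIC).digest("base64");
  return [
    "HTTP/1.1 101 Switching Protocols",
    "Upgrade: websocket",
    "Connection: Upgrade",
    `Sec-WebSocket-Accept: ${accept}`,
  ].join("\r\n") + "\r\n\r\n";
}

// Attach the handshake to an existing http.Server instance.
// `onConnection` is a caller-supplied callback that takes over the socket.
function attachWebSocketUpgrade(server, onConnection) {
  server.on("upgrade", (req, socket) => {
    const response = buildHandshakeResponse(req.headers);
    if (!response) {
      socket.end("HTTP/1.1 400 Bad Request\r\n\r\n");
      return;
    }
    socket.write(response);
    onConnection(socket); // from here on, the socket carries WebSocket frames
  });
}
```

Calling `attachWebSocketUpgrade(server, handler)` on a server created with `http.createServer()` leaves ordinary HTTP requests untouched; only requests carrying an Upgrade header reach the `upgrade` event.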
When you connect to Apinator, this exact handshake happens on every new connection. The Apinator data plane validates the Sec-WebSocket-Key, responds with the correct Sec-WebSocket-Accept, and from that point forward maintains a persistent, low-latency channel between your server and your users — without either side needing to poll.
The protocol is simple enough to implement from scratch in a weekend, and standard enough that every browser, load balancer, and proxy in wide use today handles it correctly. That combination is what made WebSocket the default choice for real-time applications on the web.