
Terminal Sharing Over WebRTC: Building Term Bridge with Cloudflare Workers and PTY Multiplexing

By Avik Mukherjee · May 3, 2026 · 30 min read · Updated May 3, 2026

Terminal Sharing Over WebRTC#

Term Bridge exists first as a learning project: a hands-on way to understand WebRTC signaling, Cloudflare Workers and Durable Objects, and bidirectional PTY multiplexing by shipping something end-to-end. The product shape and flow are inspired by Skyping, a peer-to-peer terminal sharing tool built by Piyush Garg — pairing code in seconds, no port forwarding, traffic off the relay after setup. This repository and article are independent work, not a fork or endorsement; I will keep extending the implementation and revising this write-up as needed.

I wanted to share my terminal with someone remotely. Not screen sharing — that's blurry and laggy. Not SSH — that requires key management, open ports, and trusting someone with shell access to your machine. Not tmux — both users need accounts on the same server.

I wanted: I type a command. You type a command. We both see the same terminal. Peer-to-peer. No middleman reading our keystrokes. One pairing code.

So I built Term Bridge. A Cloudflare Worker for signaling. A WebRTC DataChannel for data. Two PTY processes bridged bidirectionally. Six-digit code to connect. Done.

What you're reading is the full implementation — how signaling works, how WebRTC negotiation happens through a Durable Object, how terminal data flows in both directions through the same DataChannel, and how a simple prefix protocol keeps control messages separate from terminal output.

The Problem with Terminal Sharing#

Existing options all have trade-offs:

  • SSH — Requires the host to expose port 22, manage keys or passwords, and give the client a real shell account. Fine for servers, terrible for "let me show you something on my laptop." You also need to know the host's IP address or hostname, which means dealing with NAT, dynamic IPs, and firewall rules.

  • tmux attach — Both users need accounts on the same machine. Shared sessions can't have different view modes. If you detach, the other user loses context. It's multiplexing within a single machine, not between two machines.

  • Screen sharing (Zoom, etc.) — The host's terminal is rendered as pixels, compressed, sent over video, and displayed on the viewer's screen. Latency is 200ms+. You can't copy text. You can't type. The resolution is whatever the video codec decides. It's watching, not sharing. You're sending 30 frames per second of a 1920x1080 image when the actual data that changed is a few bytes of text.

  • VS Code Live Share — Close to right, but it's an IDE extension, not a terminal tool. Heavy, requires Microsoft accounts, and tunnels through Microsoft's infrastructure. Your terminal data flows through Microsoft's servers.

  • ngrok + SSH — Expose port 22 through a tunnel, then SSH in. Works, but you've given someone full shell access, you need to manage SSH keys, and ngrok's free tier is limited. Also, ngrok's relay servers sit in the middle of your traffic.

  • teleconsole / tmate — These are the closest to what I wanted, but they relay all traffic through their servers. Your terminal data passes through someone else's infrastructure. If their server goes down, your session dies. If they decide to log traffic, they can.

What I wanted was something with the UX of Apple's AirDrop — one code, no setup, instant connection — but for terminals. And the data should go peer-to-peer, not through a relay.

The key insight: WebRTC gives you a peer-to-peer DataChannel. It's like a WebSocket, but the data goes directly between browsers (or in our case, Node.js processes). No server in the middle can read the traffic. The server is only needed for the initial handshake — exchanging connection metadata — not for forwarding data.

So the architecture is:

  1. A signaling server (Cloudflare Worker) that helps two peers find each other
  2. A 6-digit pairing code for the initial connection
  3. WebRTC DataChannel for all terminal data after connection
  4. Two PTY processes (one on each machine) for bidirectional terminal I/O

Architecture Overview#

Before diving into code, here's the full system architecture. Every component, every data path, every protocol layer.

High-Level System Diagram#

code
┌─────────────────────────────────────────────────────────────────────────────┐
│                             INTERNET / CLOUDFLARE EDGE                      │
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                  Cloudflare Worker (Hono)                           │    │
│  │                                                                     │    │
│  │   Routes:                                                           │    │
│  │   POST /session        → create code + DO                           │    │
│  │   GET  /join/:code     → resolve code → sessionId (single-use)      │    │
│  │   GET  /session/:id/ws → WebSocket → forward to DO                  │    │
│  │   GET  /install        → curl|bash installer                        │    │
│  └───────────┬──────────────────────┬──────────────────────────────────┘    │
│              │                      │                                       │
│              ▼                      ▼                                       │
│  ┌──────────────────┐    ┌─────────────────────────────────────────────┐    │
│  │   KV Namespace   │    │       Durable Object: SessionRoom           │    │
│  │                  │    │                                             │    │
│  │  code → sessionId│    │  ctx.storage:                               │    │
│  │  TTL: 600s       │    │    machine: string                          │    │
│  │  (auto-expires)  │    │    sessionEnded: boolean                    │    │
│  │                  │    │                                             │    │
│  │  Single-use:     │    │  WebSocket Tags: ["host"] | ["client"]      │    │
│  │  delete on GET   │    │                                             │    │
│  └──────────────────┘    │  Role: relay all messages between           │    │
│                          │        host ↔ client WebSocket              │    │
│                          │                                             │    │
│                          │  Lifecycle:                                 │    │
│                          │    init() → store machine                   │    │
│                          │    accept ws(host) → wait                   │    │
│                          │    accept ws(client) → notify host          │    │
│                          │    relay SDP + ICE between both             │    │
│                          │    on disconnect → sessionEnded = true      │    │
│                          └─────────────────────────────────────────────┘    │
│                                                                             │
│  ┌──────────────────┐                                                       │
│  │  STUN Server     │  stun:stun.cloudflare.com:3478                        │
│  │                  │  Helps peers discover public IP + port                │
│  │  Request: "What  │  through NAT. Returns:                                │
│  │  is my IP?"      │  "You are 203.0.113.5:49152"                          │
│  └──────────────────┘                                                       │
└─────────────────────────────────────────────────────────────────────────────┘
         ▲                                     ▲
         │ HTTP (POST /session, GET /join)     │ HTTP (GET /join, WS upgrade)
         │ WebSocket (SDP/ICE relay)           │ WebSocket (SDP/ICE relay)
         │                                     │
    ─────┼─────────────────────────────────────┼─────────────────────────────
         │                                     │
         │    AFTER DataChannel opens:         │
         │    Signaling server is OUT.         │
         │    All data goes P2P directly.      │
         │                                     │
    ─────┼─────────────────────────────────────┼─────────────────────────────
         │  DTLS-encrypted P2P                 │
         │  WebRTC DataChannel                 │
         │  (ordered, reliable, bidirectional) │
         │                                     │
         ▼                                     ▼
         ┌─────────────────────┐              ┌─────────────────────┐
         │    HOST MACHINE     │              │  CLIENT MACHINE     │
         │                     │              │                     │
         │  ┌────────────────┐ │              │  ┌────────────────┐ │
         │  │  Host PTY      │ │   terminal   │  │  Client PTY    │ │
         │  │  (bash/zsh)    │◄─┼─────────────┼─►│  (bash/zsh)    │ │
         │  │                │ │   data +     │  │                │ │
         │  │  stdin ← DC    │ │   ctrl msgs  │  │  stdin ← DC    │ │
         │  │  stdout → DC   │ │              │  │  stdout → DC   │ │
         │  └───────┬────────┘ │              │  └───────┬────────┘ │
         │          │          │              │          │          │
         │          ▼          │              │          ▼          │
         │  ┌────────────────┐ │              │  ┌────────────────┐ │
         │  │  terminal-io   │ │              │  │  client.ts     │ │
         │  │  (input router)│ │              │  │  (input router)│ │
         │  │                │ │              │  │                │ │
         │  │  local mode:   │ │              │  │  remote mode:  │ │
         │  │   stdin→shell  │ │              │  │   stdin→DC     │ │
         │  │  remote mode:  │ │              │  │  local mode:   │ │
         │  │   stdin→DC     │ │              │  │   stdin→shell  │ │
         │  │   (rev_input)  │ │              │  │                │ │
         │  └───────┬────────┘ │              │  └───────┬────────┘ │
         │          │          │              │          │          │
         │          ▼          │              │          ▼          │
         │  ┌────────────────┐ │              │  ┌────────────────┐ │
         │  │  pty.ts        │ │              │  │  client.ts     │ │
         │  │  (WebRTC +     │ │              │  │  (WebRTC +     │ │
         │  │   DataChannel) │◄─┼─────────────┼─►│   DataChannel) │ │
         │  └────────────────┘ │   P2P DTLS   │  └────────────────┘ │
         │                     │              │                     │
         │  term-bridge        │              │                     │
         │                     │              │  connect · 482-913  │
         └─────────────────────┘              └─────────────────────┘

Read this diagram from top to bottom:

  1. Cloudflare Edge — The Worker handles HTTP routes. KV stores the pairing code. The Durable Object holds two WebSocket connections (host and client) and relays signaling messages between them.

  2. STUN Server — Both peers contact the STUN server independently to discover their public IP:port through NAT. The STUN server never sees WebRTC data — it only helps with IP discovery.

  3. P2P DataChannel — After signaling completes, the host and client have a direct DTLS-encrypted connection. The signaling server, KV, and DO are no longer involved in data transfer.

  4. Host Machine — Has a PTY (bash/zsh), the terminal I/O router (terminal-io.ts), and the WebRTC bridge (pty.ts). Default view mode: "local" (host controls their own machine).

  5. Client Machine — Has its own PTY, its own I/O router, and its own WebRTC bridge (client.ts). Default view mode: "remote" (client watches the host's terminal).

Data Flow: Local Mode (Default)#

When both machines connect, the host is in "local" mode and the client is in "remote" mode:

code
 HOST (viewMode = "local")              CLIENT (viewMode = "remote")
 ─────────────────────────              ─────────────────────────────

  User types "ls"                         User watches
       │                                       │
       ▼                                       ▼
  ┌──────────┐                          ┌──────────────┐
  │ stdin    │                          │ DC.onMessage │
  │ (raw)    │                          │              │
  └────┬─────┘                          └──────┬───────┘
       │ (char-by-char)                        │ (terminal output string)
       ▼                                       ▼
  ┌──────────────┐   DC.sendMessage("ls")   ┌──────────────┐
  │ shell.write  │─────────────────────────►│stdout.write  │
  │ ("l")        │                          │ ("ls output")│
  │ shell.write  │                          │              │
  │ ("s")        │                          └──────────────┘
  └──────────────┘
       │
       │ shell produces output
       ▼
  ┌──────────────┐   DC.sendMessage(output) ┌──────────────┐
  │ stdout.write │                          │ (not used    │
  │ (output)     │                          │  in remote)  │
  └──────────────┘                          └──────────────┘
       │
       │ ALSO
       ▼
  sendRemote(output) ──────────────────►  client receives
                                         and renders on stdout

The host controls their own shell. Keystrokes go to the local PTY. Output goes to both the host's stdout AND the DataChannel. The client receives the output passively.

Data Flow: Remote Mode (After /switch)#

After the host types /switch, viewMode flips to "remote":

code
 HOST (viewMode = "remote")              CLIENT (viewMode = "remote")
 ──────────────────────────              ─────────────────────────────

  User types "pwd"                         User watches host's screen
       │                                         │
       ▼                                         ▼
  ┌──────────────┐                        ┌──────────────┐
  │ stdin        │                        │ DC.onMessage │
  │ (raw)        │                        │              │
  └────┬─────────┘                        └──────┬───────┘
       │                                         │
       │ NOT going to local shell                │ host's output
       │ going to CLIENT's PTY via rev_input     │ still renders
       ▼                                         ▼
  encodeCtrl({type:"rev_input",        ┌──────────────┐
    data: "p"})  ─────────────────────►│ clientPty    │
  encodeCtrl({type:"rev_input",        │  .write("p") │
    data: "w"})  ─────────────────────►│ clientPty    │
  encodeCtrl({type:"rev_input",        │  .write("w") │
    data: "d"})  ─────────────────────►│ clientPty    │
                                       └──────┬───────┘
                                              │
                                              │ client PTY output
                                              ▼
                                        encodeCtrl({type:"rev_data",
                                          data: "/home/user"})
                                      ─────────────────────────►
                                                               │
  HOST receives rev_data                              ┌────────┴───────┐
       │                                              │ dc.sendMessage │
       ▼                                              │ (rev_data)     │
  ┌──────────────┐                                    └────────────────┘
  │ stdout.write │
  │ ("/home/user"│
  │  rendered)   │
  └──────────────┘

  HOST'S SCREEN NOW SHOWS CLIENT'S TERMINAL OUTPUT

The host is now controlling the client's machine. Keystrokes go through rev_input control messages to the client's PTY. The client's PTY output comes back through rev_data control messages and renders on the host's stdout.

Meanwhile, the host's own PTY is still running — its output still goes to the DataChannel (the client still sees the host's terminal). But the host's stdout now shows the client's terminal instead of their own.

This is bidirectional terminal multiplexing. Both PTYs are always active. The /switch command only changes what the host sees on their screen and where their keystrokes go.
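
Here's a minimal sketch of that routing decision, assuming viewMode, the host PTY handle shell, the DataChannel dc, and the encodeCtrl helper from the diagrams (names are illustrative, not the exact Term Bridge source):

typescript
process.stdin.on("data", (chunk: Buffer) => {
  const data = chunk.toString("utf8");
  if (viewMode === "local") {
    shell.write(data); // drive the host's own PTY
  } else {
    // remote mode: wrap each keystroke in a rev_input control message
    dc.sendMessage(encodeCtrl({ type: "rev_input", data }));
  }
});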

[Interactive diagram: bidirectional data flow, viewMode = local. Host keystrokes go to the host's own PTY (shell.write); the host's PTY output streams to the client as raw terminal data (dc.sendMessage). Both PTYs are alive, but the client's PTY receives nothing. Output flows host → client over one DataChannel.]

The Protocol Stack#

Everything flows over one DataChannel. Here's what the protocol stack looks like:

code
┌──────────────────────────────────────────────────────────┐
│                    Application Layer                     │
│                                                          │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────┐  │
│  │  Terminal    │  │  Commands    │  │  File Transfer │  │
│  │  I/O         │  │  /switch     │  │  start/chunk/  │  │
│  │  (raw text)  │  │  /kick       │  │  end           │  │
│  │              │  │  /transfer   │  │                │  │
│  └──────┬───────┘  └──────┬───────┘  └────────┬───────┘  │
│         │                 │                   │          │
├─────────┼─────────────────┼───────────────────┼──────────┤
│         ▼                 ▼                   ▼          │
│  ┌─────────────────────────────────────────────────┐     │
│  │          DataChannel Multiplexer                │     │
│  │                                                 │     │
│  │  Regular message:  raw terminal data (string)   │     │
│  │  Control message:  "\x00TB:" + JSON(CtrlMsg)    │     │
│  │                                                 │     │
│  │  isCtrlMsg(msg) = msg.startsWith("\x00TB:")     │     │
│  │                                                 │     │
│  │  CtrlMsg types:                                 │     │
│  │    rev_data      ← client PTY output → host     │     │
│  │    rev_input     ← host keystrokes → client PTY │     │
│  │    rev_resize    ← host resize → client PTY     │     │
│  │    resize        ← client resize → host PTY     │     │
│  │    kick          ← host kicks client            │     │
│  │    cmd_response  ← command output               │     │
│  │    transfer_start│transfer_chunk│transfer_end   │     │
│  └──────────────────────┬──────────────────────────┘     │
│                         │                                │
├─────────────────────────┼────────────────────────────────┤
│                         ▼                                │
│  ┌─────────────────────────────────────────────────┐     │
│  │          WebRTC DataChannel                     │     │
│  │                                                 │     │
│  │  Mode: ordered, reliable (SCTP)                 │     │
│  │  Encryption: DTLS 1.2+                          │     │
│  │  Max message: ~256KB (implementation-dependent) │     │
│  │  Congestion control: SCTP CC                    │     │
│  └──────────────────────┬──────────────────────────┘     │
│                         │                                │
├─────────────────────────┼────────────────────────────────┤
│                         ▼                                │
│  ┌─────────────────────────────────────────────────┐     │
│  │          UDP Transport                          │     │
│  │  (or TCP fallback via ICE-TCP)                  │     │
│  └──────────────────────┬──────────────────────────┘     │
│                         │                                │
├─────────────────────────┼────────────────────────────────┤
│                         ▼                                │
│  ┌─────────────────────────────────────────────────┐     │
│  │          IP Network (P2P direct or TURN relay)  │     │
│  └─────────────────────────────────────────────────┘     │
└──────────────────────────────────────────────────────────┘

The key decision: one DataChannel, not two. Terminal data and control messages share the same channel. They're distinguished by the \x00TB: prefix on control messages. This guarantees ordering — a resize control message arrives after all terminal data that was sent before it. If we used two channels, ordering between channels would be undefined.
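
Here's a sketch of that framing layer, with message shapes taken from the multiplexer diagram above (field names in the real source may differ slightly):

typescript
const CTRL_PREFIX = "\x00TB:";

type CtrlMsg =
  | { type: "rev_data"; data: string }
  | { type: "rev_input"; data: string }
  | { type: "rev_resize"; cols: number; rows: number }
  | { type: "resize"; cols: number; rows: number }
  | { type: "kick" }
  | { type: "cmd_response"; data: string }
  | { type: "transfer_start"; filename: string; size: number }
  | { type: "transfer_chunk"; index: number; data: string }
  | { type: "transfer_end"; filename: string };

function encodeCtrl(msg: CtrlMsg): string {
  return CTRL_PREFIX + JSON.stringify(msg);
}

function isCtrlMsg(msg: string): boolean {
  return msg.startsWith(CTRL_PREFIX);
}

function decodeCtrl(msg: string): CtrlMsg {
  return JSON.parse(msg.slice(CTRL_PREFIX.length)) as CtrlMsg;
}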

Session Lifecycle State Machine#

The Durable Object goes through a well-defined lifecycle:

code
                    POST /session
                         │
                         ▼
                  ┌──────────────┐
                  │   CREATED    │
                  │              │
                  │ KV: code→id  │
                  │ DO: init()   │
                  │ storage:     │
                  │  machine="x" │
                  └──────┬───────┘
                         │
              Host connects WS
              (role="host")
                         │
                         ▼
                  ┌──────────────┐
                  │  HOST_READY  │
                  │              │
                  │ Tags:        │
                  │  ws0=["host"]│
                  │              │
                  │ Waiting for  │
                  │ client...    │
                  └──────┬───────┘
                         │
             Client resolves code
             GET /join/:code
             (code deleted from KV)
                         │
             Client connects WS
             (role="client")
                         │
                         ▼
                  ┌──────────────┐       peer_info sent
                  │  BOTH_READY  │ ────► to host (IP, machine)
                  │              │
                  │ Tags:        │       SDP offer/answer
                  │  ws0=["host"]│       ICE candidates
                  │  ws1=["client│       relayed between
                  │              │       host ↔ client
                  └──────┬───────┘
                         │
              DataChannel opens
              (signaling done)
                         │
                         ▼
                  ┌──────────────┐
                  │  ACTIVE      │
                  │              │
                  │ Signaling    │
                  │ server is    │
                  │ no longer    │
                  │ involved     │
                  │              │
                  │ Terminal     │
                  │ data flows   │
                  │ P2P          │
                  └──────┬───────┘
                         │
            Either peer disconnects
            (ws.close / network drop)
                         │
                         ▼
                  ┌──────────────┐
                  │   ENDED      │
                  │              │
                  │ storage:     │
                  │  sessionEnded│
                  │  = true      │
                  │              │
                  │ peer gets    │
                  │ "peer_       │
                  │  disconnected│
                  │              │
                  │ Any future   │
                  │ ws attempt → │
                  │ HTTP 410     │
                  └──────────────┘

Every state transition is atomic. The Durable Object is single-threaded, so there are no race conditions. Two clients can't both join the same session — the first one deletes the code from KV, and the second one gets a 404.


WebRTC Signaling Sequence#

The most complex part of the system is the signaling dance. Here's every message, in order:

code
 HOST                              SIGNALING SERVER                          CLIENT
  │                                         │                                     │
  │  1. POST /session                       │                                     │
  │  {machine: "avik-mbp"}                  │                                     │
  │────────────────────────────────────────►│                                     │
  │                                         │  KV.put("482913", "uuid-...")       │
  │                                         │  DO.init("avik-mbp")                │
  │  2. {code: "482913", sessionId: "..."}  │                                     │
  │◄────────────────────────────────────────│                                     │
  │                                         │                                     │
  │  3. WS /session/uuid-.../ws?role=host   │                                     │
  │════════════════════════════════════════►│                                     │
  │         (WebSocket kept open,           │                                     │
  │          waiting for client)            │                                     │
  │                                         │                                     │
  │          Host displays:                 │                                     │
  │          "Code: 482-913"                │                                     │
  │          "Waiting for connections..."   │                                     │
  │                                         │                                     │
  │                                         │    4. GET /join/482913              │
  │                                         │◄────────────────────────────────────│
  │                                         │    KV.get("482913") → "uuid-..."    │
  │                                         │    KV.delete("482913") (one use)    │
  │                                         │    {sessionId: "uuid-..."}          │
  │                                         │────────────────────────────────────►│
  │                                         │                                     │
  │                                         │    5. WS /session/uuid-.../ws       │
  │                                         │       ?role=client                  │
  │                                         │◄════════════════════════════════════│
  │                                         │                                     │
  │  6. {type:"peer_info",                  │                                     │
  │       address:"203.0.113.5",            │                                     │
  │       machine:"avik-mbp"}               │                                     │
  │◄────────────────────────────────────────│                                     │
  │                                         │                                     │
  │  ═══════════════════════════════════════════════════════════════════════════  │
  │  WEBRTC NEGOTIATION (via signaling relay)                                     │
  │  ═══════════════════════════════════════════════════════════════════════════  │
  │                                         │                                     │
  │  Host creates PeerConnection +          │                                     │
  │  DataChannel("terminal")                │                                     │
  │                                         │                                     │
  │  7. {type:"offer", sdp:"v=0..."}        │                                     │
  │────────────────────────────────────────►│                                     │
  │                                         │  8. relay                           │
  │                                         │────────────────────────────────────►│
  │                                         │                                     │
  │                                         │   client: RTCPeerConnection()       │
  │                                         │   setRemoteDescription(offer)       │
  │                                         │                                     │
  │                                         │    9. {type:"answer", sdp:"v=0..."} │
  │                                         │◄────────────────────────────────────│
  │  10. relay                              │                                     │
  │◄────────────────────────────────────────│                                     │
  │                                         │                                     │
  │  pc.setRemoteDescription(sdp, "answer") │                                     │
  │                                         │                                     │
  │  ═══════════════════════════════════════════════════════════════════════════  │
  │  ICE CANDIDATE EXCHANGE (trickle ICE)                                         │
  │  ═══════════════════════════════════════════════════════════════════════════  │
  │                                         │                                     │
  │  STUN: "my public IP is 198.51.100.2"   │                                     │
  │                                         │                                     │
  │  11. {type:"ice", candidate:            │                                     │
  │       "candidate:...", mid:"0"}         │                                     │
  │────────────────────────────────────────►│                                     │
  │                                         │  12. relay                          │
  │                                         │────────────────────────────────────►│
  │                                         │   client: addIceCandidate(...)      │
  │                                         │                                     │
  │                                         │    13. {type:"ice", candidate:      │
  │                                         │       "candidate:...", mid:"0"}     │
  │                                         │◄────────────────────────────────────│
  │  14. relay                              │                                     │
  │◄────────────────────────────────────────│                                     │
  │  pc.addRemoteCandidate(...)             │                                     │
  │                                         │                                     │
  │  ... more ICE candidates trickle ...    │                                     │
  │                                         │                                     │
  │  ═══════════════════════════════════════════════════════════════════════════  │
  │  P2P CONNECTION ESTABLISHED                                                   │
  │  ═══════════════════════════════════════════════════════════════════════════  │
  │                                         │                                     │
  │  WebRTC selects best ICE candidate      │                                     │
  │  DTLS handshake completes               │                                     │
  │  DataChannel opens                      │                                     │
  │                                         │                                     │
  │  dc.onOpen → spawnShell()               │   client: dc.onOpen → spawnShell()  │
  │  dc.onOpen → attachHostTerminal()       │   client: wireClientPty()           │
  │                                         │                                     │
  │  ◄═════════════════════════════════════════════════════════════════════════►  │
  │              TERMINAL DATA FLOWS P2P (DTLS ENCRYPTED)                         │
  │              SIGNALING SERVER IS NO LONGER INVOLVED                           │
  │                                         │                                     │

Count the round-trips: From the host starting to the DataChannel opening, it takes 3-4 round-trips through the signaling server (offer, answer, ICE candidates). On a fast connection, this is under 500ms. The signaling server adds minimal latency — it's just relaying JSON strings between two WebSockets.

The STUN requests are parallel. Both peers contact the STUN server independently, before and during the signaling exchange. STUN responses feed into ICE candidates, which are relayed through the signaling server as they arrive (trickle ICE).

After the DataChannel opens, the WebSocket connections are kept alive but idle. If either peer disconnects later, the DO detects it via webSocketClose or webSocketError, sets sessionEnded = true, and notifies the other peer.


File Transfer Protocol#

File transfer is a three-phase protocol built on top of the control message layer:

code
 SENDER                                    RECEIVER
   │                                         │
   │  Phase 1: HEADER                        │
   │  ────────────────────────────────────── │
   │  ctrl {type:"transfer_start",           │
   │        filename:"report.pdf",           │
   │        size: 1048576}                   │
   │────────────────────────────────────────►│
   │                                         │  Creates transferState:
   │                                         │  {filename, size, chunks: Map()}
   │                                         │
   │  Phase 2: CHUNKS                        │
   │  ────────────────────────────────────── │
   │  ctrl {type:"transfer_chunk",           │
   │        index:0, data:"JVBERi0..."}      │  chunks.set(0, "JVBERi0...")
   │────────────────────────────────────────►│
   │                                         │
   │  ctrl {type:"transfer_chunk",           │
   │        index:1, data:"xLjQgMC..."}      │  chunks.set(1, "xLjQgMC...")
   │────────────────────────────────────────►│
   │                                         │
   │  ... (1048576 / 16384 = 64 chunks) ...  │
   │                                         │
   │  ctrl {type:"transfer_chunk",           │
   │        index:63, data:"2Pdfg=="}        │  chunks.set(63, "2Pdfg==")
   │────────────────────────────────────────►│
   │                                         │
   │  Phase 3: END                           │
   │  ────────────────────────────────────── │
   │  ctrl {type:"transfer_end",             │
   │        filename:"report.pdf"}           │
   │────────────────────────────────────────►│
   │                                         │  finishTransfer():
   │                                         │    sort chunks by index
   │                                         │    decode base64
   │                                         │    write to cwd/report.pdf
   │                                         │    fs.closeSync(fd)
   │                                         │
   │                                         │  "✓ Saved: report.pdf (1048576 bytes)"

No acknowledgments. The DataChannel is reliable — if a message is sent, it arrives. No need for ACKs. If the connection drops mid-transfer, both sides detect it via dc.onClosed and clean up.

No flow control. The DataChannel has built-in SCTP congestion control. If the sender is faster than the receiver, SCTP backs off automatically. We don't need to implement our own buffering.

Chunk size math:

  • 16,384 bytes raw → ~21,845 bytes base64 → ~21,860 bytes with \x00TB: prefix + JSON overhead
  • Well under the SCTP message size limit (~256KB)
  • 64 chunks for a 1MB file, 1024 chunks for a 16MB file
  • Each chunk is a separate DataChannel message, so congestion control operates per-chunk
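
A sender-side sketch of the three phases, reusing the encodeCtrl framing from the control-message layer (the chunk-size constant and helper names are illustrative, not the exact source):

typescript
import * as fs from "node:fs";
import * as path from "node:path";

const CHUNK_SIZE = 16_384; // 16 KiB of raw bytes per chunk, per the math above

function sendFile(dc: { sendMessage(s: string): void }, filePath: string): void {
  const buf = fs.readFileSync(filePath);
  const filename = path.basename(filePath);

  // Phase 1: header announces the file
  dc.sendMessage(encodeCtrl({ type: "transfer_start", filename, size: buf.length }));

  // Phase 2: base64-encoded chunks, one DataChannel message each
  for (let index = 0; index * CHUNK_SIZE < buf.length; index++) {
    const chunk = buf.subarray(index * CHUNK_SIZE, (index + 1) * CHUNK_SIZE);
    dc.sendMessage(encodeCtrl({ type: "transfer_chunk", index, data: chunk.toString("base64") }));
  }

  // Phase 3: end marker; the receiver sorts chunks, decodes, and writes to disk
  dc.sendMessage(encodeCtrl({ type: "transfer_end", filename }));
}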

The Three Parts#

Term Bridge needs:

  1. A Signaling Server — Cloudflare Worker + Durable Object + KV for pairing codes
  2. A Host Agent — PTY bridge + WebRTC + terminal I/O routing
  3. A Control Protocol — In-band signaling to separate terminal data from commands

Let's build each one.

Part 1: The Signaling Server#

WebRTC needs a signaling channel — a way for two peers to exchange connection metadata before they can talk directly. This is the SDP (Session Description Protocol) offer/answer exchange and ICE (Interactive Connectivity Establishment) candidate traversal.

The signaling server never touches terminal data. It's a matchmaker: it introduces two peers, then gets out of the way.

Why Cloudflare Workers?#

Three reasons:

  1. Edge deployment. Cloudflare runs Workers in 300+ cities worldwide. The signaling server is close to both peers, minimizing the round-trip time for SDP/ICE exchange. Latency matters here — every millisecond of signaling delay is a millisecond the user waits before the terminal connection opens.

  2. Durable Objects. WebSocket connections need to be held in a single, stateful location. Cloudflare's Durable Objects provide exactly this — a single-threaded actor with persistent storage, accessible by name. No database, no Redis, no external state store.

  3. Free tier. The signaling server handles ~10 requests per session (POST /session, GET /join, 2 WebSocket upgrades, ~6 SDP/ICE relays). At 100K requests/day on the free tier, that's 10,000 sessions/day. Enough for a side project.

The Hono Router#

The Worker uses Hono as its HTTP framework. Hono is like Express but designed for edge runtimes — no http.createServer, no Node.js dependencies, works in Cloudflare Workers out of the box.

typescript
const app = new Hono<{ Bindings: Bindings }>();
 
app.use("*", cors());

CORS is enabled globally because the install script and any future web client will be on different origins.
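
The Bindings type referenced in the Hono generic isn't shown in this post. A plausible shape, matching the env accesses used throughout (c.env.CODES, c.env.SESSION_ROOMS), looks like this:

typescript
// Assumed binding names, inferred from usage elsewhere in the article
type Bindings = {
  CODES: KVNamespace;                    // pairing code → sessionId, with TTL
  SESSION_ROOMS: DurableObjectNamespace; // one SessionRoom DO per session
};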

Generating the Pairing Code#

When the host starts, it calls POST /session:

typescript
app.post("/session", async (c) => {
  const body = await c.req.json<{ machine?: string }>();
  const machine = body.machine ?? "unknown";
 
  const sessionId = crypto.randomUUID();
  const code = generateCode();
 
  await c.env.CODES.put(code, sessionId, { expirationTtl: 600 });
 
  const roomId = c.env.SESSION_ROOMS.idFromName(sessionId);
  const room = c.env.SESSION_ROOMS.get(roomId);
  await room.init(machine);
 
  return c.json({ code, sessionId });
});

What's happening:

crypto.randomUUID() generates a unique session ID — a v4 UUID like f47ac10b-58cc-4372-a567-0e02b2c3d479. This is the internal identifier for the session. It's never shown to the user.

generateCode() produces a 6-digit code:

typescript
function generateCode(): string {
  const arr = new Uint32Array(1);
  crypto.getRandomValues(arr);
  return String(arr[0] % 1_000_000).padStart(6, "0");
}

crypto.getRandomValues() fills the array with cryptographically random values. We mod by 1,000,000 to get a 6-digit number and pad with leading zeros. The result: 482913.

Why not use the UUID directly? Because humans need to type it. 482-913 is readable. f47ac10b-58cc-4372-a567-0e02b2c3d479 is not.

Why 6 digits and not 4? Because 4 digits gives you 10,000 possible codes. If two people start sessions simultaneously, there's a 1 in 10,000 chance of collision. 6 digits gives you 1,000,000 possible codes — collision probability drops to 1 in a million. That's acceptable for a tool where sessions last minutes, not days.

The code is stored in Cloudflare KV with a 600-second TTL (10 minutes). If no one joins within 10 minutes, the code expires automatically. No cleanup needed. No cron job. No garbage collection.

typescript
await c.env.CODES.put(code, sessionId, { expirationTtl: 600 });

KV is perfect for this: it's globally distributed, eventually consistent, and supports TTL natively. The key is the pairing code, the value is the session ID. When the client joins from a different continent, they'll hit a KV edge node that has the data replicated.

Then we create a Durable Object for the session:

typescript
const roomId = c.env.SESSION_ROOMS.idFromName(sessionId);
const room = c.env.SESSION_ROOMS.get(roomId);
await room.init(machine);

idFromName() creates a deterministic ID from the session ID string. This means any request with the same session ID will route to the same Durable Object instance. The DO is a single-threaded, stateful actor — perfect for managing the WebSocket connections for a specific session.

get() returns a stub — a reference to the DO, not the DO itself. When we call room.init(machine), the Worker sends an RPC to the DO. The DO processes it in its own memory space, on its own machine.

Resolving the Code (Single-Use)#

When the client joins, it calls GET /join/:code:

typescript
app.get("/join/:code", async (c) => {
  const code = c.req.param("code").replace("-", "");
  const sessionId = await c.env.CODES.get(code);
 
  if (!sessionId) {
    return c.json({ error: "Code not found or expired" }, 404);
  }
 
  await c.env.CODES.delete(code);
 
  return c.json({ sessionId });
});

The code is single-use. After the first successful lookup, we delete it from KV:

typescript
await c.env.CODES.delete(code);

Why? Three scenarios this prevents:

  1. Eavesdropping. If someone sees the code in a chat and tries to connect after the legitimate client already has, they shouldn't be able to. One code, one connection. After use, it's gone.

  2. Double-join. If the client's network hiccups and they retry, a second join would create a second WebSocket in the DO. The DO only allows one client, so the second would be rejected (HTTP 409), but the first WebSocket would already be there, causing confusion. Deleting the code after first use prevents the second attempt entirely.

  3. Stale codes. Without deletion, a code stays in KV for 10 minutes. If the session ends after 30 seconds, the code is still valid for 9.5 minutes. Anyone with the code could start a new session against the same DO. Deletion eliminates this window.

The hyphen is stripped (replace("-", "")) so 482-913 and 482913 both work. The formatted version (with hyphen) is for human readability. The raw version is for convenience — some people type the hyphen, some don't.

Note on KV eventual consistency: KV is eventually consistent. In rare cases, a delete might not be immediately visible on all edge nodes. This means two clients in different regions could theoretically both resolve the same code. However, the DO prevents this — it only allows one client WebSocket (HTTP 409 for the second). The KV delete plus the DO's one-client check is a belt-and-suspenders approach.

The Durable Object: SessionRoom#

The Durable Object is the heart of the signaling server. It manages WebSocket connections for a specific session, relays WebRTC signaling messages between host and client, and tracks session state.

typescript
export class SessionRoom extends DurableObject {
  constructor(ctx: DurableObjectState, env: Bindings) {
    super(ctx, env);
  }
}

Why a Durable Object and not just a Worker?

Cloudflare Workers are stateless. Each request can hit a different instance on a different machine in a different data center. You can't hold a WebSocket connection in memory between requests because there's no guarantee the next request hits the same machine.

This is a fundamental problem for WebSocket-based applications. The solution usually involves sticky sessions (routing to the same instance) or an external state store (Redis, database). Cloudflare solves it differently — Durable Objects.

Each DO instance is:

  • Single-threaded — No race conditions. Two WebSockets connecting simultaneously are handled sequentially. The DO processes one request at a time, queuing others.
  • Persistent — State survives restarts via ctx.storage. If the DO hibernates (Cloudflare evicts idle DOs from memory to save resources), it can restore state from disk when it wakes up.
  • Named — Accessed by a deterministic ID (idFromName(sessionId)), so both host and client route to the same instance regardless of which Cloudflare edge node they hit.
  • Co-located — Both WebSocket connections live in the same DO, in the same process, on the same machine. Relaying a message from host to client is a function call, not a network hop.

Persistent State with ctx.storage#

typescript
private async getMachine(): Promise<string> {
  return (await this.ctx.storage.get<string>("machine")) ?? "unknown";
}
 
private async isSessionEnded(): Promise<boolean> {
  return (await this.ctx.storage.get<boolean>("sessionEnded")) ?? false;
}

ctx.storage is a transactional key-value store built into every Durable Object. It persists to disk. If the DO hibernates and wakes up, the data is still there. It's like a mini-database that's local to the DO instance.

We store two things:

  • machine — The hostname of the host machine (set during init()). Used in the peer_info message sent to the client so the host can identify who connected.
  • sessionEnded — A boolean flag set to true when either peer disconnects.

Why sessionEnded and not just checking if WebSockets are open? Because after a disconnect, the DO might hibernate. When it wakes up, the WebSocket state is gone (WebSockets don't survive hibernation), but ctx.storage persists. So we need a durable flag.
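
Neither value writes itself. Here's a minimal sketch of the init() RPC the Worker calls at session creation, assuming it's a plain async method on SessionRoom (the real source may differ):

typescript
// Sketch: persist the host's machine name and reset the ended flag
async init(machine: string): Promise<void> {
  await this.ctx.storage.put("machine", machine);
  await this.ctx.storage.put("sessionEnded", false);
}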

When a peer disconnects, the DO sets this flag. If the other peer tries to reconnect (maybe their network dropped momentarily), the DO checks this flag and returns HTTP 410 Gone:

typescript
async handleWebSocketUpgrade(request: Request, role: "host" | "client"): Promise<Response> {
  if (await this.isSessionEnded()) {
    return new Response("This session has ended", { status: 410 });
  }
 
  if (this.getSocketByRole(role)) {
    return new Response(`A ${role} is already connected to this session`, { status: 409 });
  }
  // ...
}

Two safety checks: 410 if the session has ended, 409 if the role is already taken. These prevent every edge case I've thought of.

No reconnection. No zombie sessions. Clean shutdown.

WebSocket Management with Tags#

typescript
private getSocketByRole(role: "host" | "client"): WebSocket | undefined {
  for (const sock of this.ctx.getWebSockets()) {
    const tags = this.ctx.getTags(sock);
    if (tags[0] === role && sock.readyState === WebSocket.OPEN) {
      return sock;
    }
  }
  return undefined;
}

Cloudflare's Durable Object WebSocket API provides ctx.getWebSockets() and ctx.getTags(). When we accept a WebSocket, we tag it with the role:

typescript
this.ctx.acceptWebSocket(serverSocket, [role]);

The tag [role] is an array of strings. We use the first element as the role: "host" or "client".

This is better than storing sockets in a Map<string, WebSocket>. I learned this the hard way. Initially, I used:

typescript
// BAD — doesn't survive hibernation
private sockets = new Map<string, WebSocket>();

This worked in development. Then I deployed to production, and sessions started failing randomly. The DO was hibernating between the host connecting and the client connecting. When it woke up, the Map was empty — the host's WebSocket reference was gone. The client would connect, but the DO couldn't relay messages because it didn't know about the host's socket.

Switching to ctx.getWebSockets() + tags fixed it because:

  1. No memory leaks — ctx.getWebSockets() only returns active sockets. Closed sockets are automatically cleaned up by the runtime.
  2. Survives hibernation — The runtime tracks WebSocket state externally, not in the DO's memory. After hibernation, ctx.getWebSockets() returns the correct list.
  3. No race conditions — The DO is single-threaded, so there's no concurrent access issue. Two peers connecting simultaneously won't corrupt the tag list.

The WebSocket Upgrade#

typescript
async handleWebSocketUpgrade(request: Request, role: "host" | "client"): Promise<Response> {
  if (await this.isSessionEnded()) {
    return new Response("This session has ended", { status: 410 });
  }
 
  if (this.getSocketByRole(role)) {
    return new Response(`A ${role} is already connected to this session`, { status: 409 });
  }
 
  const { 0: clientSocket, 1: serverSocket } = new WebSocketPair();
 
  this.ctx.acceptWebSocket(serverSocket, [role]);
 
  if (this.getSocketByRole("host") && this.getSocketByRole("client")) {
    const hostSocket = this.getSocketByRole("host")!;
    const cf = (request as any).cf as { ip?: string } | undefined;
    this.send(hostSocket, {
      type: "peer_info",
      address: cf?.ip ?? "unknown",
      machine: await this.getMachine(),
    });
  }
 
  return new Response(null, {
    status: 101,
    webSocket: clientSocket,
  });
}

What's happening:

WebSocketPair() creates two sockets — a client-facing one and a server-facing one. This is Cloudflare's WebSocket model: the Worker holds the server socket, the client (the term-bridge agent) holds the client socket. They're connected internally.

ctx.acceptWebSocket(serverSocket, [role]) registers the server socket with the DO's WebSocket management system. The DO will now receive webSocketMessage, webSocketClose, and webSocketError events for this socket.

The critical check: if both host and client are connected, the DO sends peer_info to the host. This is the trigger for the host to start WebRTC negotiation. The host was waiting for this message — it's the signal that says "your peer has arrived, start the SDP dance."

The peer_info includes the client's IP address (from Cloudflare's cf object) and the host's machine name (from ctx.storage). This gives the host enough information to identify who connected.

Response(null, { status: 101, webSocket: clientSocket }) is the HTTP 101 Switching Protocols response that completes the WebSocket upgrade. The agent receives the clientSocket end of the pair.

The Relay#

When both peers are connected, the DO relays messages between them:

typescript
async webSocketMessage(ws: WebSocket, message: string | ArrayBuffer): Promise<void> {
  const tags = this.ctx.getTags(ws);
  const role = tags[0] as "host" | "client";
  const peer: "host" | "client" = role === "host" ? "client" : "host";
 
  const msgStr = message instanceof ArrayBuffer
    ? new TextDecoder().decode(message)
    : message;
 
  const peerSocket = this.getSocketByRole(peer);
  if (peerSocket) {
    peerSocket.send(msgStr);
  }
}

The DO doesn't parse the message. It doesn't know if it's an SDP offer, an ICE candidate, or anything else. It just forwards the payload to the other peer.

This is important for security: the signaling server never sees the WebRTC session keys or the terminal data. It's a dumb relay for connection metadata. Even if someone compromised the signaling server, they couldn't decrypt the DataChannel traffic because the DTLS handshake (which establishes the encryption keys) happens directly between the peers.

The binary handling: WebRTC signaling messages are JSON strings, but Cloudflare WebSocket messages can be string or ArrayBuffer. We handle both:

typescript
const msgStr = message instanceof ArrayBuffer
  ? new TextDecoder().decode(message)
  : message;

In practice, all our messages are strings (JSON-encoded SDP and ICE candidates). But being defensive here prevents silent failures if the client or Cloudflare's infrastructure changes behavior.

The Disconnect#

When a peer disconnects, the DO notifies the other peer and marks the session as ended:

typescript
private async disconnect(role: "host" | "client"): Promise<void> {
  await this.ctx.storage.put("sessionEnded", true);
 
  const peer: "host" | "client" = role === "host" ? "client" : "host";
  const peerSocket = this.getSocketByRole(peer);
  if (peerSocket) {
    this.send(peerSocket, { type: "peer_disconnected", role });
    peerSocket.close(1000, "Peer disconnected");
  }
}

Two things happen:

  1. sessionEnded is persisted — any future connection attempt gets HTTP 410
  2. The other peer receives a peer_disconnected message and its WebSocket is closed

Why close both WebSockets? Because after one peer disconnects, the session is over. There's no reason to keep the other WebSocket alive. It would just consume resources in the DO. The agent on the other side receives peer_disconnected, tears down its WebRTC connection, and exits.
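
disconnect() is called from the DO's WebSocket event hooks. A sketch of that glue inside SessionRoom, with the optional close-code parameters trimmed for brevity (the runtime fires these for any socket accepted via ctx.acceptWebSocket):

typescript
// Sketch: both close and error events route into disconnect(),
// using the role tag attached at accept time
async webSocketClose(ws: WebSocket): Promise<void> {
  const role = this.ctx.getTags(ws)[0] as "host" | "client";
  await this.disconnect(role);
}

async webSocketError(ws: WebSocket): Promise<void> {
  const role = this.ctx.getTags(ws)[0] as "host" | "client";
  await this.disconnect(role);
}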

The session is done. No reconnection possible. This is by design — each session is a fresh pairing code and a fresh DO instance.

Why no reconnection? Reconnection adds complexity: you need to handle partial state (the PTY might have produced output while disconnected), re-establish the DataChannel (full SDP/ICE dance again), and deal with edge cases (what if the "disconnected" peer is actually still running?). Simpler to end the session and let the user start a new one with a fresh code.


Part 2: The Host Agent#

The host agent is the most complex part. It:

  1. Creates a session with the signaling server
  2. Waits for a client to connect
  3. Spawns a PTY (pseudo-terminal)
  4. Creates a WebRTC PeerConnection and DataChannel
  5. Bridges PTY output → DataChannel (to client)
  6. Bridges DataChannel → PTY input (from client)
  7. Handles the /switch command for bidirectional control
  8. Handles file transfers
  9. Cleans up all resources on disconnect

Creating the Session#

typescript
export async function createSession(): Promise<SessionInfo> {
  const signalingBase = getSignalingBase();
  const res = await fetch(`${signalingBase}/session`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ machine: hostname() }),
  });
 
  if (!res.ok) {
    throw new Error(`Signaling server error: ${res.status} ${await res.text()}`);
  }
 
  const data = (await res.json()) as { code: string; sessionId: string };
 
  const wsUrl = new URL(`/session/${data.sessionId}/ws`, signalingBase);
  wsUrl.protocol = wsUrl.protocol.replace("http", "ws");
  wsUrl.searchParams.set("role", "host");
 
  return {
    code: data.code,
    sessionId: data.sessionId,
    signalingUrl: wsUrl.toString(),
  };
}

The agent sends a POST to the signaling server with the machine's hostname. The server responds with the 6-digit code and session ID. Then we construct a WebSocket URL — this is the persistent connection the host maintains with the Durable Object, waiting for the client to arrive.

The URL protocol is swapped from https to wss:

typescript
wsUrl.protocol = wsUrl.protocol.replace("http", "ws");

This is because we're upgrading from HTTP to WebSocket. https:// becomes wss://. http:// becomes ws://. The replace("http", "ws") handles both cases.

The ?role=host query parameter tells the DO which role this WebSocket belongs to. The DO will tag the socket with ["host"] using ctx.acceptWebSocket(serverSocket, ["host"]).

The Config#

typescript
const DEFAULT_SIGNALING_BASE = "https://term-bridge-worker.avikm744.workers.dev";
 
export function getSignalingBase(
  env: Pick<NodeJS.ProcessEnv, "TERM_BRIDGE_SERVER"> = process.env
): string {
  return env.TERM_BRIDGE_SERVER ?? DEFAULT_SIGNALING_BASE;
}

The signaling server URL is configurable via the TERM_BRIDGE_SERVER environment variable. This is useful for:

  • Development (point to a local Worker running with wrangler dev)
  • Self-hosting (point to your own Cloudflare Worker)
  • Testing (point to a staging instance)

If not set, it defaults to the production Worker URL.

Spawning the PTY#

typescript
import * as fs from "node:fs";
import * as pty from "node-pty";

export function spawnShell(cols = 220, rows = 50): pty.IPty {
  const platform = process.platform;
  let shellBin: string;

  if (platform === "win32") {
    shellBin = process.env.COMSPEC ?? "cmd.exe";
  } else {
    shellBin = process.env.SHELL ?? "/bin/bash";
    if (platform === "darwin" && !fs.existsSync("/bin/zsh")) {
      shellBin = "/bin/bash";
    }
  }

  if (!fs.existsSync(shellBin)) {
    throw new Error(`Shell not found: ${shellBin}`);
  }

  const env: Record<string, string> = {};
  for (const [key, value] of Object.entries(process.env)) {
    if (value !== undefined) env[key] = value;
  }
  env.TERM_BRIDGE = "1";
  env.TERM = "xterm-256color";
  env.COLORTERM = "truecolor";

  // Completing the snippet: spawn the shell inside a fresh PTY via node-pty
  return pty.spawn(shellBin, [], { name: "xterm-256color", cols, rows, cwd: process.cwd(), env });
}

A PTY (pseudo-terminal) is a pair of virtual devices: a master and a slave. Programs connected to the slave think they're talking to a real terminal — they see the same terminal capabilities, the same environment variables, the same input/output behavior. The master side lets you read output and write input programmatically.

Think of it like this: the PTY slave is a fake terminal window. The shell (bash, zsh) runs inside it, thinking it's displaying to a real screen. But instead of a screen, the output goes to the PTY master, which is a file descriptor our program can read from. And instead of a keyboard, input comes from writes to the PTY master.

node-pty creates a PTY and spawns a shell inside it. The shell runs as a child process. When the shell writes output (like ls results), it goes to the PTY slave, and we can read it from the PTY master. When we write to the PTY master, the shell receives it as keyboard input.
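
Here's the idea in miniature. This is a hedged sketch, not Term Bridge's code: spawn a shell in a PTY, mirror its output, and type into it programmatically:

typescript
import * as pty from "node-pty";

// Spawn a shell inside a fresh PTY; we hold the master side.
const shell = pty.spawn(process.env.SHELL ?? "/bin/bash", [], {
  name: "xterm-256color",
  cols: 80,
  rows: 24,
  env: process.env as Record<string, string>,
});

// Everything the shell "displays" arrives here as a string.
shell.onData((data) => process.stdout.write(data));

// Writes to the master look like keystrokes to the shell.
shell.write("echo hello from a pty\r");

// Programs inside the PTY learn the new size via SIGWINCH.
shell.resize(120, 40);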

Shell detection:

The shell binary is determined by:

  1. $SHELL environment variable on Unix (set by the login process), with /bin/bash as the fallback
  2. $COMSPEC on Windows (usually C:\Windows\System32\cmd.exe), with cmd.exe as the fallback
  3. macOS quirk: if /bin/zsh doesn't exist (zsh has been macOS's default shell since Catalina), /bin/bash is forced even if $SHELL points elsewhere

Environment variables:

typescript
env.TERM_BRIDGE = "1";
env.TERM = "xterm-256color";
env.COLORTERM = "truecolor";

  • TERM=xterm-256color — Tells programs like vim, top, and ls that the terminal supports 256 colors. Without this, you'd get monochrome output. Programs check this variable to decide whether to use color escape codes.
  • COLORTERM=truecolor — Enables 24-bit color (16 million colors) for programs that support it. Modern terminals support this, and programs like bat, delta, and modern vim use it for richer colors.
  • TERM_BRIDGE=1 — A custom variable that scripts can check to detect they're running inside Term Bridge. Useful for conditional behavior (e.g., disabling interactive prompts).

The macOS binary signing dance:

typescript
function fixPtyNativeBinaries(): void {
  if (ptyFixed) return;
  ptyFixed = true;
  if (process.platform !== "darwin") return;
 
  const ptyDir = path.dirname(require.resolve("node-pty/package.json"));
  const prebuildsDir = path.join(ptyDir, "prebuilds");
  if (!fs.existsSync(prebuildsDir)) return;
 
  try {
    execSync(`chmod +x "${prebuildsDir}"/*/spawn-helper`, { stdio: "ignore" });
  } catch {}
  try {
    execSync(`xattr -dr com.apple.quarantine "${ptyDir}"`, { stdio: "ignore" });
  } catch {}
  try {
    execSync(
      `find "${ptyDir}" -name "*.node" -exec codesign --force --sign - {} \\;`,
      { stdio: "ignore" }
    );
  } catch {}
  try {
    execSync(
      `codesign --force --sign - "${prebuildsDir}"/*/spawn-helper`,
      { stdio: "ignore" }
    );
  } catch {}
}

This is the kind of platform-specific pain you only discover at 2am when your tool works on Linux but silently fails on macOS.

macOS has a security feature called Gatekeeper. When you download a file from the internet, macOS attaches a com.apple.quarantine extended attribute. This attribute prevents the file from being executed until the user explicitly approves it.

node-pty ships a native binary called spawn-helper. When you npm install node-pty, npm downloads the prebuilt binary from GitHub. macOS quarantines it. When node-pty tries to execute spawn-helper to create a PTY, macOS blocks it.

The fix is three steps:

  1. chmod +x — Make the binary executable
  2. xattr -dr com.apple.quarantine — Remove the quarantine attribute recursively
  3. codesign --force --sign - — Ad-hoc sign the binary (macOS requires all native binaries to be signed)

The ptyFixed flag ensures this only runs once. The try/catch blocks silently ignore errors — on Linux, none of these commands are needed, and on macOS, they might fail if the binary isn't quarantined (e.g., installed from a local cache).

The WebRTC PeerConnection#

When the signaling server tells the host that a client has connected, the host creates a WebRTC PeerConnection:

typescript
pc = new nodeDataChannel.PeerConnection("host", {
  iceServers: ["stun:stun.cloudflare.com:3478"],
});

node-datachannel is a Node.js binding for libdatachannel, a C++ WebRTC library. Why not the browser's RTCPeerConnection? Because we're in Node.js, not a browser, and the browser's WebRTC API only exists in browser contexts. node-datachannel provides the same WebRTC functionality as a native addon: a C++ library compiled into a Node.js module.

Why not wrtc? wrtc is another Node.js WebRTC binding, but it has been effectively unmaintained for years and wraps the far heavier libwebrtc. node-datachannel is actively maintained, has a cleaner API, and covers everything a DataChannel-only tool needs.

iceServers specifies the STUN server. STUN (Session Traversal Utilities for NAT) helps peers discover their public IP address when they're behind NAT (which almost everyone is — home routers, corporate firewalls, mobile networks).

The NAT traversal flow:

code
  Peer (192.168.1.5)          NAT Router           STUN Server
       │                    (203.0.113.1)         (1.2.3.4:3478)
       │                         │                       │
       │  "What is my public     │                       │
       │   IP and port?"         │                       │
       │────────────────────────►│──────────────────────►│
       │                         │                       │
       │                         │   "You are            │
       │                         │    203.0.113.1:49152" │
       │  "You are               │                       │
       │   203.0.113.1:49152"    │                       │
       │◄────────────────────────│◄──────────────────────│
       │                         │                       │
       │  ICE candidate:         │                       │
       │  203.0.113.1:49152      │                       │
       │  (send to peer via      │                       │
       │   signaling server)     │                       │

The STUN server sees the source IP:port of the incoming UDP packet (which is the NAT router's public IP:port, not the peer's private IP). It sends this back to the peer. The peer now knows its public address and can share it with the other peer.

Types of ICE candidates:

  1. Host candidate — The peer's local IP address (e.g., 192.168.1.5). Works if both peers are on the same network.
  2. Server-reflexive candidate (srflx) — The peer's public IP:port as seen by the STUN server. Works through most NATs.
  3. Relay candidate (relay) — A TURN server's IP:port. Works through any NAT, but requires a TURN server (not implemented in Term Bridge).

ICE prioritizes candidate pairs roughly in that order: host, then srflx, then relay. If a host pair works (same network), it wins. If not, srflx (different networks). If that fails too, only a relay would help (symmetric NAT). Term Bridge gathers only host and srflx candidates, so if both fail, the connection fails.

The DataChannel#

typescript
dc = pc.createDataChannel("terminal");

A DataChannel is WebRTC's equivalent of a WebSocket — a bidirectional, ordered, reliable byte stream. But unlike WebSocket, the data goes directly between peers. No server in the middle.

Under the hood, DataChannels use SCTP (Stream Control Transmission Protocol) over DTLS (Datagram Transport Layer Security) over UDP. This means:

  • Reliable delivery — SCTP handles retransmission, like TCP
  • Ordered delivery — Messages arrive in the order they were sent
  • Encrypted — DTLS provides encryption (similar to TLS but for UDP)
  • Congestion control — SCTP has built-in congestion control that adapts to network conditions

The label "terminal" is just a name. You can have multiple DataChannels on one PeerConnection (e.g., one for terminal data, one for file transfers, one for clipboard). Term Bridge uses one channel for everything — terminal data, control messages, and file chunks are all multiplexed on the same DataChannel.

SDP and ICE Exchange#

WebRTC connection establishment is a multi-step dance. Here's what happens inside startPtyBridge:

typescript
ws.on("message", (raw) => {
  let msg: SignalMsg;
  try { msg = JSON.parse(raw.toString()) as SignalMsg; }
  catch { return; }
 
  switch (msg.type) {
    case "peer_info": {
      peerAddress = msg.address;
      connectedAt = new Date();
      opts.onConnected(msg.address);
 
      pc = new nodeDataChannel.PeerConnection("host", {
        iceServers: ["stun:stun.cloudflare.com:3478"],
      });
 
      pc.onLocalDescription((sdp, type) => {
        wsSend(ws, { type, sdp });
      });
 
      pc.onLocalCandidate((candidate, mid) => {
        wsSend(ws, { type: "ice", candidate, mid });
      });
 
      dc = pc.createDataChannel("terminal");

Step 1: peer_info triggers negotiation

The host was waiting. When peer_info arrives from the DO, the host knows the client is ready. It creates the PeerConnection and DataChannel.

Step 2: SDP Offer

Creating the DataChannel triggers SDP generation. The PeerConnection gathers its capabilities (supported codecs, security parameters, ICE candidates) and formats them as an SDP offer:

typescript
pc.onLocalDescription((sdp, type) => {
  wsSend(ws, { type, sdp });
});

The SDP offer is something like:

code
v=0
o=- 1234567890 1 IN IP4 0.0.0.0
s=-
t=0 0
a=group:BUNDLE 0
m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=ice-ufrag:abcd
a=ice-pwd:abcdefghijklmnop
a=fingerprint:sha-256 AB:CD:EF:...
a=setup:actpass
a=mid:0
a=sctp-port:5000
a=max-message-size:262144

Key fields:

  • m=application — This is a DataChannel (not audio/video)
  • ice-ufrag / ice-pwd — ICE credentials for authentication
  • fingerprint — The DTLS certificate fingerprint (for encryption)
  • sctp-port — The SCTP port for DataChannel
  • max-message-size — Maximum message size (262144 bytes = 256KB)

This is sent to the signaling server, which relays it to the client.

Step 3: ICE Candidates

ICE candidates are discovered in parallel. Each candidate represents a potential connection path:

typescript
pc.onLocalCandidate((candidate, mid) => {
  wsSend(ws, { type: "ice", candidate, mid });
});

Candidates trickle in as the STUN server responds and local network interfaces are enumerated. Each one is relayed to the client through the signaling server. This is called "trickle ICE" — candidates are sent as they're discovered, rather than waiting for all candidates before connecting.

Step 4: SDP Answer

The client receives the offer, creates its own PeerConnection, sets the remote description (the host's offer), and generates an answer:
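
A minimal sketch of that client-side case — the full client handler appears in Part 4, and the message shape mirrors the host's:

typescript
case "offer":
  if (pc) {
    // Setting the remote offer triggers onLocalDescription to fire
    // with the generated answer, which is relayed back to the host.
    pc.setRemoteDescription(msg.sdp, "offer");
  }
  break;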

On the host side, receiving the answer:

typescript
case "answer":
  if (pc) {
    pc.setRemoteDescription(msg.sdp, "answer");
  }
  break;

The answer SDP is similar to the offer but with the client's ICE credentials and DTLS fingerprint. After both sides have each other's SDP, they have enough information to attempt a direct connection.

Step 5: Connection

WebRTC tries to connect using the ICE candidates. It sends STUN binding requests to the other peer's candidates. If a request gets a response, the connection works. The first successful candidate pair is used.

Once connected, DTLS handshake happens over the established ICE connection. This is the encryption setup — both peers exchange certificates and agree on encryption keys. After DTLS completes, the DataChannel opens.

From this point on, the signaling server is irrelevant. All data flows directly between host and client, encrypted with DTLS.

Bridging PTY ↔ DataChannel#

When the DataChannel opens, the host spawns the shell and wires up the data flow:

typescript
dc.onOpen(() => {
  try {
    shell = spawnShell();
    cleanupHostTerminal = attachHostTerminal({
      shell,
      sendRemote: (data) => {
        if (dc!.isOpen()) dc!.sendMessage(data);
      },
      sendRevInput: (data) => {
        if (dc!.isOpen()) {
          dc!.sendMessage(encodeCtrl({ type: "rev_input", data }));
        }
      },
      getViewMode,
      onSwitchView: () => {
        viewMode = viewMode === "local" ? "remote" : "local";
        if (viewMode === "remote" && dc?.isOpen()) {
          const cols = process.stdout.columns ?? 80;
          const rows = process.stdout.rows ?? 24;
          dc.sendMessage(encodeCtrl({ type: "rev_resize", cols, rows }));
        }
      },
      transferFile: (filepath) => {
        sendFile(dc!, filepath);
      },
      peerAddress,
      connectedAt,
      onDisconnect: () => {
        opts.onDisconnected();
        done();
      },
    });
    shell.onExit(({ exitCode }) => {
      if (dc!.isOpen()) {
        dc!.sendMessage(`\r\n[Process exited with code ${exitCode}]\r\n`);
        dc!.close();
      }
      opts.onDisconnected();
      done();
    });
  } catch (err) {
    dc!.sendMessage("\r\n[Failed to spawn shell]\r\n");
  }
});

Five callbacks are wired up:

  1. sendRemote — Shell output to the client. Raw terminal data (ANSI escape sequences and all) sent as regular DataChannel messages.

  2. sendRevInput — Host keystrokes to the client's PTY (in remote mode). Sent as rev_input control messages.

  3. onSwitchView — When the host types /switch, the view mode toggles. If switching to remote mode, a rev_resize control message is sent so the client's PTY matches the host's terminal dimensions.

  4. transferFile — File transfer initiated by the host.

  5. onDisconnect — Cleanup on session end.

The shell.onExit handler: if the shell exits (user typed exit, or the shell crashed), the host sends a message to the client, closes the DataChannel, and tears down the session.

The Incoming Message Router#

Messages from the client arrive on the DataChannel:

typescript
dc.onMessage((msg) => {
  if (!shell) return;
  const raw = typeof msg === "string" ? msg : Buffer.from(msg).toString();
 
  if (isCtrlMsg(raw)) {
    const ctrl = decodeCtrl(raw);
    if (!ctrl) return;
    handleIncomingCtrl(ctrl);
    return;
  }
 
  try {
    const parsed = JSON.parse(raw) as SignalMsg;
    if (parsed.type === "resize") {
      shell.resize(parsed.cols, parsed.rows);
      return;
    }
  } catch {}
  shell.write(raw);
});

Three types of messages:

  1. Control messages (\x00TB: prefix) — Handled by handleIncomingCtrl, which dispatches to the appropriate handler based on the control message type.

  2. Resize messages — JSON { type: "resize", cols, rows }. The host's PTY is resized to match the client's terminal dimensions. This is important because programs like top, vim, and htop use the terminal dimensions to render their UI. If the host's PTY is 220x50 but the client's terminal is 80x24, the output would be garbled.

  3. Terminal input — Everything else is keystrokes from the client. Written directly to the PTY as if the host typed them.

Why is resize not a control message? It predates the control protocol — it was the first message type implemented. In hindsight, it should be a control message. But it works, and changing it would break backward compatibility.

Terminal I/O Routing#

The attachHostTerminal function is the router. It decides where input goes and where output appears, based on the current view mode.

The host's stdin is set to raw mode:

typescript
if (stdin.isTTY) {
  stdin.setRawMode?.(true);
}
stdin.setEncoding?.("utf8");
stdin.resume();
stdin.on("data", onInput);

setRawMode(true) puts the terminal in raw mode — every keystroke is sent immediately, without buffering. No line editing, no echo, no signal handling (Ctrl+C, Ctrl+Z are captured as characters, not signals). This is essential for terminal sharing because the host's terminal needs to behave like a transparent pipe.
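
The detach half matters just as much: if raw mode isn't restored on exit, the user's shell is left without echo or line editing. Here's a sketch of what the cleanup function returned by attachHostTerminal plausibly does; this is an assumption, since that code isn't shown here:

typescript
// Assumption: approximately what cleanupHostTerminal does.
const detach = () => {
  stdin.off("data", onInput);    // stop routing keystrokes
  if (stdin.isTTY) {
    stdin.setRawMode?.(false);   // back to cooked mode: echo, line editing, signals
  }
  stdin.pause();                 // allow the process to exit naturally
};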

The input handler processes characters one at a time:

typescript
const onInput = (chunk: string) => {
  for (const ch of chunk) {
    if (ch === "/") {
      commandMode = true;
      inputBuffer = "/";
      stdout.write("/");
      continue;
    }
 
    if (commandMode) {
      if (ch === "\r" || ch === "\n") {
        stdout.write("\r\n");
        cmdCtx.viewMode = getViewMode?.() ?? "local";
        const handled = handleCommand(inputBuffer, cmdCtx);
        if (handled) {
          inputBuffer = "";
          commandMode = false;
          continue;
        }
        const mode = getViewMode?.() ?? "local";
        if (mode === "remote") {
          sendRevInput?.(inputBuffer + "\r");
        } else {
          shell.write(inputBuffer + "\r");
        }
        inputBuffer = "";
        commandMode = false;
        continue;
      }
      // ... backspace, tab, ctrl+c handling ...
    }
 
    const mode = getViewMode?.() ?? "local";
    if (mode === "remote") {
      sendRevInput?.(ch);
    } else {
      shell.write(ch);
    }
  }
};

The routing decision:

  • Local mode (default): stdin → shell.write(ch) — keystrokes go to the local PTY
  • Remote mode: stdin → sendRevInput(ch) — keystrokes go to the client's PTY via rev_input control message
  • Command mode (/ prefix): keystrokes are buffered until Enter, then checked against known commands

The / character triggers command mode. This means you can't start a normal input line with / without it being intercepted. That's a trade-off — but in practice, shell commands rarely start with / (paths do, but they're usually relative or start with ~/).

If the command isn't recognized (e.g., /ls), the whole buffer (including the /) is forwarded to the shell:

typescript
if (mode === "remote") {
  sendRevInput?.(inputBuffer + "\r");
} else {
  shell.write(inputBuffer + "\r");
}

Shell output routing:

typescript
const dataSubscription = shell.onData?.((data) => {
  const mode = getViewMode?.() ?? "local";
  if (mode === "local") {
    stdout.write(data);
  }
  sendRemote?.(data);
});

Shell output always goes to the DataChannel — the client always sees the host's terminal, regardless of view mode. But it only goes to the host's stdout in local mode. In remote mode, the host's stdout is reserved for the client's terminal output (received via rev_data control messages).

This is the key to bidirectional terminal sharing. Both machines have PTY processes. Both can type. The /switch command determines which PTY's output appears on the host's screen.

Resource Cleanup#

typescript
const done = (err?: Error) => {
  if (settled) return;
  settled = true;
  cleanupHostTerminal?.();
  closeRtcResources({
    dataChannel: dc,
    peerConnection: pc,
    cleanup: () => nodeDataChannel.cleanup(),
  });
  shell?.kill();
  try { ws.close(); } catch {}
  if (err) reject(err);
  else resolve();
};

The settled flag prevents double-cleanup. Resources are closed in order:

  1. Terminal input handler is detached (cleanupHostTerminal)
  2. DataChannel is closed
  3. PeerConnection is closed
  4. nodeDataChannel.cleanup() releases the C++ library's global state
  5. Shell process is killed
  6. WebSocket to signaling server is closed

The closeRtcResources helper:

typescript
export function closeRtcResources({
  dataChannel,
  peerConnection,
  cleanup,
}: RtcResources): void {
  try {
    if (dataChannel?.isOpen()) {
      dataChannel.close();
    }
  } catch {}
 
  try {
    peerConnection?.close();
  } catch {}
 
  try {
    cleanup?.();
  } catch {}
}

Each close is wrapped in try/catch because any of these might already be closed (the peer disconnected first, or the connection dropped). We don't want cleanup to throw.

Bidirectional data flow · viewMode = local

  • Host: controls its own shell. stdin goes to the host PTY; stdout shows the host PTY's output.
  • Client: watches the host's screen. stdin goes to the client PTY (idle); stdout shows the host PTY's output (always).
  • Flows: shell.write(ch) routes host keystrokes into the host's local PTY; dc.sendMessage(data) streams the host's PTY output to the client as raw terminal data.

Both PTYs are alive. The host drives its own shell; the client's PTY exists but receives nothing. Output flows host → client: one DataChannel, one direction in use.

Part 3: The Control Protocol#

Terminal data and control messages share the same DataChannel. We need a way to distinguish "here's terminal output" from "here's a resize event" or "here's a file chunk."

The Prefix#

typescript
export function isCtrlMsg(raw: string): boolean {
  return raw.startsWith("\x00TB:");
}
 
export function encodeCtrl(msg: CtrlMsg): string {
  return "\x00TB:" + JSON.stringify(msg);
}
 
export function decodeCtrl(raw: string): CtrlMsg | null {
  if (!raw.startsWith("\x00TB:")) return null;
  try {
    return JSON.parse(raw.slice(4)) as CtrlMsg;
  } catch {
    return null;
  }
}

Every control message is prefixed with \x00TB: (null byte + "TB:"). Legitimate terminal output never starts with a null byte. It's one of:

  • Regular text (printable ASCII + newlines)
  • ANSI escape sequences (starting with \x1b[)
  • UTF-8 encoded text

A NUL byte is technically valid UTF-8, but shells and terminal programs don't emit it as output, so the prefix is unambiguous in practice.
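
A quick round-trip with the helpers above:

typescript
const wire = encodeCtrl({ type: "rev_resize", cols: 120, rows: 32 });
// wire === "\x00TB:" + '{"type":"rev_resize","cols":120,"rows":32}'

isCtrlMsg(wire);           // true  — starts with the NUL prefix
isCtrlMsg("\x1b[2J$ ls");  // false — ANSI escape, ordinary terminal data
decodeCtrl(wire);          // { type: "rev_resize", cols: 120, rows: 32 }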

Why not a separate DataChannel for control?

You could create two DataChannels on the same PeerConnection:

typescript
const terminalDC = pc.createDataChannel("terminal");
const controlDC = pc.createDataChannel("control");

This adds complexity:

  1. Two channels to manage, two sets of event handlers
  2. Ordering issues — DataChannels on the same PeerConnection are independent. A resize sent on controlDC might arrive before or after terminal data sent on terminalDC. If the resize arrives late, the terminal output is rendered with the wrong dimensions.
  3. More SCTP streams to negotiate during connection setup

One channel with a prefix is simpler and guarantees ordering — a control message arrives after all terminal data that was sent before it.

Control Message Types#

typescript
export type CtrlMsg =
  | { type: "cmd_response"; text: string }
  | { type: "resize"; cols: number; rows: number }
  | { type: "kick" }
  | { type: "transfer_start"; filename: string; size: number }
  | { type: "transfer_chunk"; index: number; data: string }
  | { type: "transfer_end"; filename: string }
  | { type: "rev_data"; data: string }
  | { type: "rev_input"; data: string }
  | { type: "rev_resize"; cols: number; rows: number };

Here's every message type, who sends it, who receives it, and why:

| Type | Direction | Purpose |
| --- | --- | --- |
| rev_data | Client → Host | Client's PTY output (rendered on host's screen in remote mode) |
| rev_input | Host → Client | Host's keystrokes (written to client's PTY in remote mode) |
| rev_resize | Host → Client | Host resized their terminal in remote mode (client's PTY must match) |
| resize | Client → Host | Client resized their terminal (host's PTY must match) |
| kick | Host → Client | Host kicks the client (closes DataChannel) |
| cmd_response | Either → Either | Response from command handler (status text, errors) |
| transfer_start | Either → Either | File transfer header (filename, size) |
| transfer_chunk | Either → Either | File transfer chunk (index, base64 data) |
| transfer_end | Either → Either | File transfer complete (filename) |

The rev_ prefix stands for "reverse" — these messages carry data in the reverse direction from the normal flow. Normally, the host sends terminal data to the client. rev_data sends client terminal data to the host. rev_input sends host keystrokes to the client's PTY.

The Command System#

Commands are typed with a / prefix. The input handler intercepts them character by character:

typescript
if (ch === "/") {
  commandMode = true;
  inputBuffer = "/";
  stdout.write("/");
  continue;
}

When the user types /, we enter command mode. The / is echoed to the terminal and the character is buffered. Subsequent characters are added to the buffer until Enter is pressed.

The command handler:

typescript
export function handleCommand(input: string, ctx: CommandContext): boolean {
  const trimmed = input.trim();
  if (!trimmed.startsWith("/")) return false;
 
  const parts = trimmed.split(/\s+/);
  const cmd = parts[0].toLowerCase();
  const args = parts.slice(1).join(" ");
 
  switch (cmd) {
    case "/exit":
    case "/quit":
    case "/q":
      ctx.writeToStdout("\r\n\x1b[33m⚡ Disconnecting...\x1b[0m\r\n");
      ctx.disconnect();
      return true;
 
    case "/help":
      ctx.writeToStdout("\r\n\x1b[1mTerm Bridge Commands:\x1b[0m\r\n");
      ctx.writeToStdout("  \x1b[36m/exit\x1b[0m, \x1b[36m/quit\x1b[0m   Disconnect session\r\n");
      ctx.writeToStdout("  \x1b[36m/status\x1b[0m          Show connection info\r\n");
      ctx.writeToStdout("  \x1b[36m/switch\x1b[0m          Switch local ↔ remote terminal\r\n");
      ctx.writeToStdout("  \x1b[36m/help\x1b[0m            Show this help\r\n");
      if (ctx.role === "host") {
        ctx.writeToStdout("  \x1b[36m/kick\x1b[0m            Kick the connected client\r\n");
      }
      ctx.writeToStdout("  \x1b[36m/transfer\x1b[0m <file>  Send a file to peer (tab to complete)\r\n");
      ctx.writeToStdout("\r\n");
      return true;

Each command returns true if handled, false if not. Unknown commands fall through to the shell. The ANSI escape codes (\x1b[33m, etc.) add color to the output — yellow for warnings, cyan for command names, green for success.

The /switch command is the most interesting:

typescript
case "/switch":
  ctx.switchView();
  ctx.writeToStdout(
    `\r\n\x1b[36m→ Switched to ${ctx.viewMode} terminal\x1b[0m\r\n`
  );
  return true;

ctx.switchView() toggles the view mode and updates cmdCtx.viewMode. On the host side, it also sends a rev_resize control message so the client's PTY matches the host's terminal dimensions. Without this, the client's PTY would be the wrong size and programs would render incorrectly.

Tab completion for file paths:

typescript
export function tabComplete(input: string): { completed: string; matches?: string[] } | null {
  const prefix = "/transfer ";
  if (!input.startsWith(prefix)) return null;
 
  const partial = input.slice(prefix.length);
  if (!partial) return null;
 
  const expanded = expandPath(partial);
  const dir = path.dirname(expanded);
  const base = path.basename(expanded);
 
  let entries: string[];
  try {
    entries = fs.readdirSync(dir).filter((f) => f.startsWith(base));
  } catch {
    return null;
  }
 
  if (entries.length === 0) return null;
 
  if (entries.length === 1) {
    const full = path.join(dir, entries[0]);
    try {
      const isDir = fs.statSync(full).isDirectory();
      return { completed: prefix + full + (isDir ? "/" : "") };
    } catch {
      return { completed: prefix + full };
    }
  }
 
  return { completed: input, matches: entries };
}

Tab completion only activates for /transfer. The logic is:

  1. Extract the partial file path after /transfer
  2. Expand ~ to the home directory
  3. List files in the directory that start with the partial name
  4. If one match: complete the path (append / for directories)
  5. If multiple matches: return all matches (displayed on a new line)
  6. If no matches: do nothing
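
For example, assuming a home directory of /Users/avik that contains Documents/ and Downloads/ (hypothetical contents):

typescript
tabComplete("/transfer ~/Doc");
// → { completed: "/transfer /Users/avik/Documents/" }  (unique match; "/" appended for a directory)

tabComplete("/transfer ~/Do");
// → { completed: "/transfer ~/Do", matches: ["Documents", "Downloads"] }

tabComplete("ls -la");
// → null  (completion only activates for /transfer)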

Path expansion:

typescript
export function expandPath(p: string): string {
  if (p.startsWith("~")) {
    return path.join(os.homedir(), p.slice(1));
  }
  return path.resolve(p);
}

~ is expanded to the home directory (/Users/avik on macOS, /home/user on Linux). Relative paths are resolved to absolute paths using path.resolve().


Part 4: The Client Agent#

The client is the mirror image of the host, with some key differences:

  1. It doesn't create a session — it joins one with a code
  2. It doesn't create a DataChannel — it receives one from the host
  3. Its default view mode is "remote" (watching the host's terminal)
  4. It spawns its own PTY for bidirectional control

Joining with a Code#

typescript
export async function connectClient(code: string): Promise<void> {
  const rawCode = code.replace("-", "");
  const signalingBase = getSignalingBase();
 
  const joinRes = await fetch(`${signalingBase}/join/${rawCode}`);
  if (!joinRes.ok) {
    throw new Error(`Invalid or expired code: ${code}`);
  }
  const { sessionId } = (await joinRes.json()) as { sessionId: string };
 
  const wsUrl = new URL(`/session/${sessionId}/ws`, signalingBase);
  wsUrl.protocol = wsUrl.protocol.replace("http", "ws");
  wsUrl.searchParams.set("role", "client");

The client resolves the code to a session ID with one HTTP request (GET /join/482913). The code is deleted from KV (single-use). The session ID is used to construct the WebSocket URL.

This is the client's only HTTP request in the entire flow (the host made one POST to create the session). After it, everything is WebSocket (for signaling) and WebRTC DataChannel (for data).
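
You can watch this step with curl (code and session ID below are illustrative):

bash
curl https://term-bridge-worker.avikm744.workers.dev/join/482913
# → {"sessionId":"3f9c..."}
# Repeating the request with the same code fails: codes are single-use.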

Creating the PeerConnection (Before WebSocket)#

The client creates its PeerConnection before connecting to the WebSocket:

typescript
const pc = new nodeDataChannel.PeerConnection("client", {
  iceServers: ["stun:stun.cloudflare.com:3478"],
});
 
pc.onLocalDescription((sdp, type) => {
  wsSend(ws, { type, sdp });
});
 
pc.onLocalCandidate((candidate, mid) => {
  wsSend(ws, { type: "ice", candidate, mid });
});

Why before? Because as soon as the host's SDP offer arrives, the client needs to process it and generate an answer. The PeerConnection must be ready.

The client's onLocalDescription callback fires when:

  1. The host's offer is set via pc.setRemoteDescription(msg.sdp, "offer")
  2. This triggers the client to generate an answer SDP
  3. The answer is relayed to the host through the signaling server

ICE candidates are generated when:

  1. The PeerConnection contacts the STUN server
  2. Local network interfaces are enumerated
  3. Each discovered path becomes a candidate, relayed to the host

Receiving the DataChannel#

The client doesn't call createDataChannel(). Instead, it listens for the host's channel:

typescript
pc.onDataChannel((channel) => {
  dc = channel;
  channel.onOpen(() => {
    connectedAt = new Date();
 
    const cols = process.stdout.isTTY
      ? process.stdout.columns ?? 80
      : 80;
    const rows = process.stdout.isTTY
      ? process.stdout.rows ?? 24
      : 24;
 
    clientPty = spawnShell(cols, rows);
 
    clientPty.onExit(({ exitCode }) => {
      if (viewMode === "local") {
        process.stdout.write(`\r\n\x1b[33m[Local shell exited with code ${exitCode}]\x1b[0m\r\n`);
      }
      clientPty = spawnShell(cols, rows);
      wireClientPty(clientPty);
    });
 
    wireClientPty(clientPty);
 
    if (process.stdin.isTTY) {
      process.stdin.setRawMode(true);
    }
    process.stdin.setEncoding("utf8");
    process.stdin.resume();
 
    channel.sendMessage(
      JSON.stringify({ type: "resize", cols, rows })
    );

When the host creates a DataChannel, the client's PeerConnection fires the onDataChannel event. This is WebRTC's asymmetry: one side creates, the other side receives. The creating side is the "negotiator" — it controls the channel's configuration (ordered/unordered, reliable/unreliable). The receiving side accepts whatever configuration the creator chose.

The client immediately spawns its own PTY. Why? Because the client isn't just a passive viewer — it can switch to local mode and use its own shell, or the host can switch to remote mode and control the client's machine.

The client sends its terminal dimensions to the host:

typescript
channel.sendMessage(
  JSON.stringify({ type: "resize", cols, rows })
);

This is sent as a regular (non-ctrl) message. The host receives it and resizes its PTY to match. This ensures the host's terminal output fits the client's screen.

The Client's Input Router#

The client processes stdin character by character, just like the host:

typescript
const onInput = (chunk: string) => {
  for (const ch of chunk) {
    if (ch === "/") {
      commandMode = true;
      inputBuffer = "/";
      process.stdout.write("/");
      continue;
    }
 
    if (commandMode) {
      // ... same command mode logic as host ...
    }
 
    if (viewMode === "local") {
      clientPty?.write(ch);
    } else if (channel.isOpen()) {
      channel.sendMessage(ch);
    }
  }
};

The routing decision:

  • Remote mode (default): stdin → channel.sendMessage(ch) — keystrokes go to the host's PTY via DataChannel
  • Local mode: stdin → clientPty.write(ch) — keystrokes go to the local PTY
  • Command mode: same / prefix handling as the host

The Bidirectional Flow#

The client's PTY output goes to the host as rev_data:

typescript
function wireClientPty(cPty: pty.IPty): void {
  cPty.onData((data: string) => {
    if (viewMode === "local") {
      process.stdout.write(data);
    }
    if (dc?.isOpen()) {
      dc.sendMessage(encodeCtrl({ type: "rev_data", data }));
    }
  });
}

This is always sent — the host always receives the client's PTY output, regardless of the client's view mode. Whether the host displays it depends on their view mode. If the host is in local mode, rev_data is ignored. If the host is in remote mode, rev_data is rendered to the host's stdout.

The host's terminal data arrives as regular DataChannel messages:

typescript
channel.onMessage((msg) => {
  const raw = typeof msg === "string" ? msg : Buffer.from(msg).toString();
 
  if (isCtrlMsg(raw)) {
    const ctrl = decodeCtrl(raw);
    if (!ctrl) return;
    handleIncomingCtrl(ctrl);
    return;
  }
 
  if (viewMode === "remote") {
    process.stdout.write(raw);
  }
});

Regular messages are terminal output from the host's PTY. In remote view mode, they're rendered to the client's stdout. Control messages are handled separately.

The client's incoming control handler:

typescript
function handleIncomingCtrl(ctrl: CtrlMsg): void {
  switch (ctrl.type) {
    case "kick":
      process.stdout.write("\r\n\x1b[33m⚡ Host kicked you from the session.\x1b[0m\r\n");
      done();
      break;
    case "cmd_response":
      process.stdout.write(ctrl.text + "\r\n");
      break;
    case "rev_input":
      clientPty?.write(ctrl.data);
      break;
    case "rev_resize":
      clientPty?.resize(ctrl.cols, ctrl.rows);
      break;
    case "transfer_start":
      transferState = {
        filename: ctrl.filename,
        size: ctrl.size,
        chunks: new Map(),
      };
      break;
    case "transfer_chunk":
      if (transferState) {
        transferState.chunks.set(ctrl.index, ctrl.data);
      }
      break;
    case "transfer_end":
      if (transferState) {
        finishTransfer(transferState);
        transferState = null;
      }
      break;
  }
}

rev_input is the reverse input channel — keystrokes from the host when they're in remote mode. Written directly to the client's PTY.

rev_resize synchronizes the client's PTY dimensions with the host's terminal size.

kick is the ejection seat — the host can kick the client at any time. The client's DataChannel is closed and the session ends.

Resize Handling#

Both sides handle terminal resize events:

typescript
if (process.stdout.isTTY) {
  onResize = () => {
    const c = process.stdout.columns ?? 80;
    const r = process.stdout.rows ?? 24;
    if (viewMode === "local" && clientPty) {
      clientPty.resize(c, r);
    } else if (channel.isOpen()) {
      channel.sendMessage(
        JSON.stringify({ type: "resize", cols: c, rows: r })
      );
    }
  };
  process.stdout.on("resize", onResize);
}

In local mode, the client's own PTY is resized. In remote mode, the resize is sent to the host's PTY. This ensures the terminal output always fits the viewer's screen dimensions.


Part 5: File Transfer#

File transfer uses the control protocol. The sender chunks the file and sends it as a series of transfer_chunk messages:

typescript
function sendFile(dataChannel: nodeDataChannel.DataChannel, filepath: string): void {
  if (!dataChannel.isOpen()) return;
  const filename = path.basename(filepath);
  try {
    const stat = fs.statSync(filepath);
    if (!stat.isFile()) {
      process.stdout.write(`\r\n\x1b[31mNot a file: ${filepath}\x1b[0m\r\n`);
      return;
    }
    process.stdout.write(`\r\n\x1b[36mSending ${filename} (${stat.size} bytes)...\x1b[0m\r\n`);
    dataChannel.sendMessage(encodeCtrl({ type: "transfer_start", filename, size: stat.size }));
    const CHUNK = 16384;
    const fd = fs.openSync(filepath, "r");
    const buf = Buffer.alloc(CHUNK);
    let idx = 0;
    while (true) {
      const read = fs.readSync(fd, buf, 0, CHUNK, idx * CHUNK);
      if (read === 0) break;
      dataChannel.sendMessage(encodeCtrl({
        type: "transfer_chunk",
        index: idx,
        data: buf.toString("base64", 0, read),
      }));
      idx++;
    }
    fs.closeSync(fd);
    dataChannel.sendMessage(encodeCtrl({ type: "transfer_end", filename }));
    process.stdout.write(`\x1b[32mSent: ${filename}\x1b[0m\r\n`);
  } catch (err) {
    process.stdout.write(`\r\n\x1b[31mTransfer failed: ${err}\x1b[0m\r\n`);
  }
}

Why 16KB chunks? WebRTC DataChannels have a practical message size limit. The exact ceiling varies by implementation, but 16KB has long been the safe interoperability threshold across WebRTC stacks. Larger messages might be:

  • Fragmented by SCTP into many packets and reassembled, which adds latency
  • Rejected outright by some implementations, erroring the channel or dropping the message
  • Stalled while SCTP reassembles the fragments before delivery

16KB is the sweet spot: small enough for every implementation to accept, large enough to keep per-message overhead low.

Why base64? The control protocol uses JSON. Binary data can't be embedded in JSON directly (JSON is a text format). Options:

  1. Base64 — Simple, reliable, ~33% overhead. Chosen for simplicity.
  2. Hex encoding — 100% overhead. Worse than base64.
  3. Binary framing — Send raw binary after a control header. More efficient but requires changes to the message parser.

For terminal sharing, files are small (configs, scripts, logs). The 33% overhead is negligible.
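
Concretely: a 16,384-byte chunk becomes ⌈16384 / 3⌉ × 4 = 21,848 base64 characters before the JSON envelope and \x00TB: prefix are added, almost exactly the 4/3 ratio.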

Why synchronous I/O? fs.readSync and fs.openSync are synchronous. In a real file server, this would block the event loop. But Term Bridge is a terminal tool — the user is waiting for the transfer to complete. Blocking the event loop is fine because nothing else should be happening during the transfer.

Why indexed chunks? DataChannel guarantees ordered delivery, so in theory we don't need indices. But they provide:

  1. Verification — We can check that no chunks were dropped (compare chunk count to expected)
  2. Reassembly — We sort by index and write sequentially, handling partial last chunks correctly
  3. Future-proofing — If we ever switch to unreliable mode (faster but unordered), indices are required

The receiver reassembles:

typescript
function finishTransfer(state: { filename: string; size: number; chunks: Map<number, string> }): void {
  const outPath = path.join(process.cwd(), state.filename);
  try {
    const fd = fs.openSync(outPath, "w");
    const indices = [...state.chunks.keys()].sort((a, b) => a - b);
    for (const idx of indices) {
      const buf = Buffer.from(state.chunks.get(idx)!, "base64");
      fs.writeSync(fd, buf, 0, buf.length, idx * 16384);
    }
    fs.closeSync(fd);
    process.stdout.write(`\x1b[32mSaved: ${outPath} (${state.size} bytes)\x1b[0m\r\n`);
  } catch (err) {
    process.stdout.write(`\x1b[31mTransfer save failed: ${err}\x1b[0m\r\n`);
  }
}

Chunks are sorted by index and written at the correct file offset (idx * 16384). fs.writeSync with an offset allows sparse writes — we can write chunk 5 before chunk 3 if they arrived out of order (though with ordered DataChannels, they won't).

The file is saved to the current working directory with the original filename.


Part 6: Putting It Together (index.ts)#

The entry point dispatches based on the subcommand:

typescript
async function main() {
  const args = process.argv.slice(2);
  const subcommand = args[0];
 
  if (subcommand === "connect" && args[1]) {
    const code = args[1];
    printConnecting(code);
    await connectClient(code);
    return;
  }
 
  printBanner();
  const { code, sessionId, signalingUrl } = await createSession();
  printCode(code);
  await startPtyBridge({
    sessionId,
    signalingUrl,
    onConnected: (peerAddress) => printConnected(peerAddress),
    onDisconnected: () => {
      printDisconnected();
      process.exit(0);
    },
  });
}

Two modes:

  • term-bridge — Host mode. Creates a session, prints the pairing code, waits for a client.
  • term-bridge connect 482-913 — Client mode. Resolves the code, connects to the session.

The UI functions use chalk for colored output:

typescript
export function printBanner(): void {
  console.clear();
  console.log(chalk.green("●") + " " + chalk.bold("Term Bridge agent running"));
  console.log(chalk.gray("Machine:"), chalk.white(hostname()));
}
 
export function printCode(code: string): void {
  const formatted = `${code.slice(0, 3)}-${code.slice(3)}`;
  console.log(chalk.gray("Code:   "), chalk.bold.cyan(formatted));
  console.log(chalk.gray("Waiting for connections..."));
  console.log();
  console.log(
    chalk.gray("  Peer runs: ") +
      chalk.bold.white(`term-bridge connect ${formatted}`)
  );
  console.log();
}

The output:

code
● Term Bridge agent running
Machine: avik-macbook
Code:    482-913
Waiting for connections...

  Peer runs: term-bridge connect 482-913

Clean, one-command setup. No configuration, no accounts, no port forwarding. The host runs one command, shares the code, and waits. The client runs one command with the code, and they're connected.

The Install Script#

The Worker also serves an install script at GET /install:

bash
#!/usr/bin/env bash
set -euo pipefail

# Helper stubs so this excerpt runs standalone; the served script defines its own.
info()  { printf '%s\n' "$1"; }
error() { printf 'Error: %s\n' "$1" >&2; exit 1; }

command -v node >/dev/null 2>&1 || error "Node.js is required."
 
info "Installing term-bridge-agent..."
 
if npm install -g term-bridge-agent 2>/dev/null; then
  info "Installed. Run term-bridge to share your terminal."
else
  info "No install needed. Just run: npx term-bridge-agent"
fi

Users can install with one command:

bash
curl https://term-bridge-worker.avikm744.workers.dev/install | bash

This checks for Node.js, installs the package globally, and falls back to npx if global install fails (common on systems where npm global requires sudo).


The Security Model#

Let me be explicit about what's secure and what's not.

What's encrypted:

  • The DataChannel is encrypted with DTLS. This is the same encryption used by HTTPS (TLS), but over UDP. The signaling server, Cloudflare, and anyone on the network cannot read terminal data or file transfer contents.

What's NOT encrypted:

  • Signaling messages (SDP offers, ICE candidates) pass through the Cloudflare Worker as plaintext JSON. These contain connection metadata (IP addresses, codec capabilities) but NOT terminal data.

What's trusted:

  • The host trusts the client with full shell access. When the client types a command, it's executed on the host's machine with the host's user permissions. This is the same trust model as SSH — you're giving someone a shell.
  • The client trusts the host to display terminal output faithfully. The host could theoretically modify terminal output before sending it (injecting commands, hiding output).

What's not protected:

  • Man-in-the-middle during signaling. If someone compromises the signaling server, they could intercept SDP offers and inject their own. This would let them establish a DataChannel with both peers, relaying (and reading) all traffic. WebRTC supports a "peer identity" feature to prevent this, but it's not implemented in Term Bridge.
  • Pairing code guessing. A 6-digit code gives 1,000,000 possibilities. If an attacker can brute-force the /join/:code endpoint fast enough, they could guess a valid code. The 10-minute TTL limits the window, and KV's global distribution adds latency to each attempt, but this isn't designed to resist determined attackers.
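
For a rough sense of the numbers: an attacker sustaining 100 guesses per second would cover 60,000 codes in the 10-minute window, about a 6% chance of hitting any single active session, and that assumes nothing throttles the requests.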

The bottom line: Term Bridge is designed for convenience, not for hostile environments. It's perfect for pair programming, helping a friend debug, or demonstrating something in your terminal. It's not designed for sharing terminals with untrusted parties over hostile networks.


What Actually Surprised Me#

Durable Object hibernation. Cloudflare can hibernate idle DOs to save memory. When the DO wakes up, in-memory state is gone — but ctx.storage and ctx.getWebSockets() + tags persist. I initially stored WebSocket references in a Map, which broke when the DO hibernated. Switching to ctx.getWebSockets() + ctx.getTags() fixed it. This is the kind of thing that works in dev (DO never hibernates locally) but breaks in production.

The PTY binary signing issue. On macOS, node-pty ships a native spawn-helper binary. If you install the package through npm, macOS quarantines it. The PTY silently fails to spawn — no error, no crash, just nothing. I spent hours debugging this. The fix: remove the quarantine attribute and ad-hoc codesign the binaries. Runtime patching in fixPtyNativeBinaries().

WebRTC is not just for browsers. node-datachannel is a full WebRTC implementation for Node.js. It supports DataChannels, ICE, STUN/TURN — everything. You don't need a browser to use WebRTC. This makes terminal sharing possible without Electron or a web UI.

Single-use pairing codes matter. Initially, the code persisted in KV until it expired. Then someone joined twice (network hiccup, page reload) and got two client connections to the same session. The host sent data to both. Chaos. Making codes single-use (delete after first GET) fixed this cleanly.

The bidirectional model. Both machines have PTY processes. The /switch command toggles which PTY's output appears on the host's screen and which PTY receives the host's keystrokes. This isn't screen sharing or remote desktop — it's terminal multiplexing. Both users can be productive simultaneously. The host can work locally, then switch to help the client with a command, then switch back.

DataChannel ordered delivery. By default, DataChannels are ordered and reliable — like TCP. This means terminal data arrives in the correct order, and resize events aren't processed before the data they affect. You can configure unreliable unordered delivery (like UDP) for lower latency, but terminal data needs ordering. Imagine your keystrokes arriving out of order: ls becomes sl.

File transfer over DataChannel. I thought I'd need a separate connection for file transfers. But the DataChannel is already there, and the control protocol already multiplexes different message types. Adding file chunks as a new control message type was straightforward. The 16KB chunk size is conservative but safe.

The signaling server is shockingly small. The entire Worker + Durable Object is ~260 lines of TypeScript. It handles session creation, code resolution, WebSocket management, message relay, and session cleanup. Cloudflare's platform makes this absurdly concise — KV for ephemeral codes, DO for stateful WebSocket management, Hono for HTTP routing.


What's Missing vs Real Tools#

tmux-style session persistence. If the host disconnects, the session ends. Real tools like tmux keep the session alive and allow reattachment. This would require the host to persist PTY state somewhere — a screen session, a container, or a remote server. Term Bridge is designed for live collaboration, not session persistence.

Multi-client support. Currently one host, one client. A broadcast mode (one host, many viewers) would need a different architecture: either fan-out, where each viewer opens its own DataChannel to the host, or a relay topology, where all clients connect to a central server that forwards the host's output. The signaling server would need to track multiple clients per session.

Encryption verification. The DataChannel is encrypted (DTLS), but there's no way for users to verify the encryption key fingerprint. Real secure tools like Signal show a safety number that both users can compare out-of-band. This would require displaying the DTLS fingerprint on both terminals and having users verify they match.

TURN relay. For peers behind symmetric NAT (common in corporate networks), STUN alone won't work. A TURN server would relay traffic. This adds a server component but ensures connectivity in all network conditions. Cloudflare doesn't offer TURN, so you'd need a third-party provider.

Clipboard sharing. The DataChannel could carry clipboard data as a new control message type ({ type: "clipboard", data: "..." }). Not implemented, but straightforward with the control protocol.

Audio/Video. WebRTC supports audio and video tracks alongside DataChannels. A future version could add voice chat alongside terminal sharing — both would go over the same PeerConnection.

Scrollback buffer sync. When the client connects, they only see new output. They don't see the host's scrollback buffer (previous commands and output). Syncing scrollback would require the host to maintain a ring buffer of recent output and send it on connection.


The Bottom Line#

Term Bridge is:

  1. A 6-digit pairing code for human-friendly session setup
  2. A Cloudflare Worker + Durable Object for signaling (not data)
  3. WebRTC DataChannel for peer-to-peer terminal streaming
  4. Two PTY processes for bidirectional control
  5. An in-band prefix protocol for control messages
  6. File transfer over the same DataChannel

The signaling server costs nothing on Cloudflare's free tier (100K requests/day). The data never touches the server. The pairing code expires in 10 minutes. Sessions are one-shot.

If you want to understand WebRTC, signaling, or terminal multiplexing — this is worth building. You spend ~900 lines of TypeScript learning how real-time P2P applications work.

The code is on GitHub: github.com/Avik-creator/term-bridge

Install with: curl https://term-bridge-worker.avikm744.workers.dev/install | bash

Or: npm install -g term-bridge-agent


Let's Keep Talking#

Have you built something with WebRTC outside the browser? What surprised you about the signaling dance?

There's something satisfying about making two terminals talk to each other over an encrypted P2P channel with nothing but a 6-digit code. If you've built a terminal tool recently — or you're thinking about it — I'd love to hear about it.

Also: if you spot any oversimplifications in the WebRTC flow or have ideas for multi-client support, call them out. I'd rather be corrected than wrong.

Until next time.

If you want to talk WebRTC, terminal emulators, or peer-to-peer protocols — find me on GitHub, X, or LinkedIn.

Feedback welcome.


Reference: Quick Start#

Host (share your terminal)#

bash
# Install
npm install -g term-bridge-agent
 
# Share
term-bridge
 
# Output:
# ● Term Bridge agent running
# Machine: avik-macbook
# Code:    482-913
# Waiting for connections...
#
#   Peer runs: term-bridge connect 482-913

Client (connect to a shared terminal)#

bash
term-bridge connect 482-913

Commands (available during session)#

| Command | Description |
| --- | --- |
| /switch | Toggle between local and remote terminal view |
| /transfer <file> | Send a file to the peer (tab completes paths) |
| /status | Show session info (peer IP, uptime, view mode) |
| /kick | Host-only: disconnect the client |
| /help | Show available commands |
| /exit | Disconnect from the session |

Architecture Decision Record#

| Decision | Choice | Why |
| --- | --- | --- |
| Signaling transport | WebSocket | Bidirectional, persistent, low overhead for SDP/ICE relay |
| Data transport | WebRTC DataChannel | P2P, encrypted, no server relay needed for data |
| Session state | Durable Object ctx.storage | Survives hibernation, transactional, no external DB |
| Socket tracking | ctx.getWebSockets() + tags | Survives hibernation, auto-cleanup, no memory leaks |
| Pairing codes | KV with TTL | Auto-expire, globally distributed, single-use via delete |
| Control protocol | In-band prefix \x00TB: | One channel, guaranteed ordering, simple parser |
| File transfer | Base64 chunks over control protocol | Reuses existing channel, no separate connection needed |
| PTY library | node-pty | Full terminal emulation, supports all shells, cross-platform |
| WebRTC library | node-datachannel | Active maintenance, full DataChannel support, C++ performance |
| HTTP framework | Hono | Edge-native, lightweight, Cloudflare Workers support |

Glossary#

  • WebRTC: Web Real-Time Communication. A protocol and API for peer-to-peer audio, video, and data streaming between browsers (and Node.js via native bindings).
  • DataChannel: A WebRTC feature that provides a bidirectional, ordered, reliable byte stream between peers. Like a WebSocket but P2P. Built on SCTP over DTLS over UDP.
  • SDP: Session Description Protocol. A text format describing connection parameters — codecs, media types, ICE credentials, DTLS fingerprints. Exchanged during WebRTC negotiation.
  • ICE: Interactive Connectivity Establishment. A framework for finding the best network path between two peers. Tries host candidates (local IP), server-reflexive candidates (STUN-discovered public IP), and relay candidates (TURN server).
  • STUN: Session Traversal Utilities for NAT. A protocol that helps peers discover their public IP address and port through NAT. The peer sends a request to the STUN server, and the server responds with the peer's public address.
  • TURN: Traversal Using Relays around NAT. A relay server for peers that can't connect directly (symmetric NAT, restrictive firewalls). All traffic flows through the TURN server. Not implemented in Term Bridge.
  • NAT: Network Address Translation. What your home router does — maps multiple private IP addresses (192.168.x.x) to one public IP address. Necessary because IPv4 has only 4.3 billion addresses, and there are more devices than that.
  • DTLS: Datagram Transport Layer Security. Like TLS but for UDP. Provides encryption for WebRTC DataChannels. The same cryptographic protocols as HTTPS, but adapted for unreliable transport.
  • SCTP: Stream Control Transmission Protocol. The transport protocol used by DataChannels. Provides reliable, ordered delivery over DTLS/UDP, with built-in congestion control and multi-streaming.
  • Trickle ICE: Sending ICE candidates one at a time as they're discovered, rather than waiting for all candidates before connecting. Reduces connection setup time because the remote peer can start trying candidates immediately.
  • PTY: Pseudo-Terminal. A pair of virtual devices (master and slave) that let programs interact as if connected to a real terminal. The shell runs on the slave side; our program reads/writes the master side.
  • Durable Object: A Cloudflare Workers feature — a single-threaded, stateful actor with persistent storage (ctx.storage). Accessed by name, so all requests for the same session ID route to the same instance.
  • KV: Cloudflare's key-value store. Globally distributed, eventually consistent, supports TTL (time-to-live). Used for ephemeral pairing codes.
  • Signaling: The process of exchanging connection metadata (SDP offers, ICE candidates) before a direct P2P connection is established. The signaling server is the matchmaker — it introduces peers but doesn't handle data.
  • Hibernation: Cloudflare can evict idle Durable Objects from memory to save resources. When a new request arrives, the DO is restored from persistent storage. In-memory state is lost; ctx.storage and WebSocket tags persist.
  • Raw mode: A terminal mode where every keystroke is sent immediately to the program, without buffering or special handling. Ctrl+C, Ctrl+Z are delivered as characters, not signals. Essential for terminal sharing because the terminal must be a transparent pipe.
  • ANSI escape sequences: Special character sequences (starting with \x1b[) that control terminal behavior — colors, cursor movement, screen clearing. Terminal output is full of these; they must be preserved exactly during transmission.
