A production-capable HTTP/1.1 web server built from raw TCP sockets in C++ — no libuv, no Boost.Asio, no frameworks. Every layer written from scratch.
59,009 req/sec Nginx
37,497 req/sec This server ← thread-per-connection, no epoll yet
┌─────────────────────────────────────────┐
│ main.cpp │
│ route registration │
└────────────────┬────────────────────────┘
│
┌────────────────▼────────────────────────┐
│ TcpServer │
│ socket() → bind() → listen() │
│ accept() → std::thread per connection │
└────────────────┬────────────────────────┘
│
┌────────────────▼────────────────────────┐
│ handleClient() │
│ readFullRequest() — partial read loop │
│ parseRequest() — HTTP/1.1 parser │
│ router lookup — path → handler │
└────────────────┬────────────────────────┘
│
┌────────────────▼────────────────────────┐
│ Response │
│ status().body().send() │
│ getContent() — static file serving │
│ MIME detection — html / css / js │
└─────────────────────────────────────────┘
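The `handleClient()` stage in the diagram can be sketched roughly as follows. This is a minimal illustration under assumptions, not the project's actual code: `ParsedRequest`, `readUntilHeadersEnd`, and `parseRequestSketch` are hypothetical names, and the 64 KiB size cap is an arbitrary choice.

```cpp
#include <cerrno>
#include <optional>
#include <sstream>
#include <string>
#include <unordered_map>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical stand-in for the project's HttpRequest struct.
struct ParsedRequest {
    std::string method;
    std::string path;
    std::string version;
    std::unordered_map<std::string, std::string> headers;
};

// Keep calling recv() until the blank line that ends the headers arrives,
// the peer closes the connection, an error occurs, or the size cap is hit.
std::string readUntilHeadersEnd(int fd, std::size_t maxBytes = 64 * 1024) {
    std::string buffer;
    char chunk[4096];
    while (buffer.find("\r\n\r\n") == std::string::npos && buffer.size() < maxBytes) {
        const ssize_t n = ::recv(fd, chunk, sizeof(chunk), 0);
        if (n < 0 && errno == EINTR) continue;  // interrupted by a signal: retry
        if (n <= 0) break;                      // 0 = peer closed, <0 = error
        buffer.append(chunk, static_cast<std::size_t>(n));
    }
    return buffer;
}

// Parse the request line and headers; returns nullopt if the header block
// is incomplete or the request line is malformed.
std::optional<ParsedRequest> parseRequestSketch(const std::string& raw) {
    const auto end = raw.find("\r\n\r\n");
    if (end == std::string::npos) return std::nullopt;

    std::istringstream lines(raw.substr(0, end));
    std::string line;
    if (!std::getline(lines, line)) return std::nullopt;
    if (!line.empty() && line.back() == '\r') line.pop_back();

    ParsedRequest req;
    std::istringstream requestLine(line);
    if (!(requestLine >> req.method >> req.path >> req.version)) return std::nullopt;

    while (std::getline(lines, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;  // skip lines that are not headers
        std::string value = line.substr(colon + 1);
        const auto first = value.find_first_not_of(" \t");
        value = (first == std::string::npos) ? "" : value.substr(first);
        req.headers[line.substr(0, colon)] = value;  // names stored as received
    }
    return req;
}
```

The sketch stores header names exactly as received; since HTTP header names are case-insensitive, a real parser would normalize them before lookup. It also ignores request bodies.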
Threading model: one std::thread per accepted connection, detached. The main loop returns to accept() immediately after spawning.
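The threading model described above might look something like this. It is a sketch under assumptions rather than the server's real code: `ConnectionHandler`, `spawnConnectionThread`, and `acceptLoop` are hypothetical names, and error handling is reduced to the bare minimum.

```cpp
#include <cerrno>
#include <functional>
#include <thread>
#include <utility>
#include <sys/socket.h>
#include <unistd.h>

using ConnectionHandler = std::function<void(int)>;

// Hand one accepted connection to its own detached thread.
// The worker owns the descriptor and closes it when the handler returns.
void spawnConnectionThread(int clientFd, ConnectionHandler handler) {
    std::thread([clientFd, handler = std::move(handler)]() {
        handler(clientFd);
        ::close(clientFd);
    }).detach();
}

// Accept forever. Each iteration goes straight back to accept()
// once the worker thread has been started.
void acceptLoop(int listenFd, const ConnectionHandler& handler) {
    for (;;) {
        const int clientFd = ::accept(listenFd, nullptr, nullptr);
        if (clientFd < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            break;                         // any other error ends the loop
        }
        spawnConnectionThread(clientFd, handler);
    }
}
```

Because the threads are detached, nothing bounds how many exist at once; that is the trade-off the benchmark section discusses.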
- Raw TCP socket engine: `socket()`, `bind()`, `listen()`, `accept()`
- HTTP/1.1 request parser: method, path, and headers from raw bytes
- Partial read loop: accumulates `recv()` chunks until `\r\n\r\n`
- Thread-per-connection: concurrent clients, so one slow client does not block the others
- Trie-ready router: `unordered_map` path → `std::function` handler
- Chainable `Response` class: `res.status(200).body("<h1>hi</h1>").send(fd)`
- Static file serving: reads from disk, correct MIME types, 404 fallback
- Dockerized: runs in a `gcc:13` container
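To illustrate the chainable `Response` idea and MIME detection from the list above, here is a small sketch. It is not the project's `Response` class: `ResponseSketch`, `type()`, `serialize()`, and `mimeTypeFor` are hypothetical, and `serialize()` returns the bytes instead of writing them to a socket the way `send(fd)` presumably does.

```cpp
#include <string>
#include <utility>

// Map a file extension to a MIME type; unknown extensions fall back to a
// generic binary type. Only the three types named in this README are covered.
std::string mimeTypeFor(const std::string& path) {
    const auto dot = path.rfind('.');
    if (dot == std::string::npos) return "application/octet-stream";
    const std::string ext = path.substr(dot + 1);
    if (ext == "html") return "text/html";
    if (ext == "css") return "text/css";
    if (ext == "js") return "text/javascript";
    return "application/octet-stream";
}

class ResponseSketch {
public:
    // Each setter returns *this so calls can be chained.
    ResponseSketch& status(int code) { status_ = code; return *this; }
    ResponseSketch& type(std::string mime) { type_ = std::move(mime); return *this; }
    ResponseSketch& body(std::string content) { body_ = std::move(content); return *this; }

    // Build the full HTTP/1.1 response: status line, headers, blank line, body.
    std::string serialize() const {
        std::string out = "HTTP/1.1 " + std::to_string(status_) + " " +
                          reasonPhrase(status_) + "\r\n";
        out += "Content-Type: " + type_ + "\r\n";
        out += "Content-Length: " + std::to_string(body_.size()) + "\r\n";
        out += "Connection: close\r\n\r\n";
        out += body_;
        return out;
    }

private:
    static const char* reasonPhrase(int code) {
        switch (code) {
            case 200: return "OK";
            case 404: return "Not Found";
            default:  return "Unknown";
        }
    }

    int status_ = 200;
    std::string type_ = "text/html";
    std::string body_;
};
```

A call such as `ResponseSketch().status(404).body("not found").serialize()` yields a complete response string; sending an accurate `Content-Length` lets the client know where the body ends.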
Tested on a Mac with wrk. Both servers served a static HTML file.
wrk -t4 -c100 -d10s http://localhost:PORT/
| Server | Req/sec | Avg Latency | Transfer/sec | Errors |
|---|---|---|---|---|
| Nginx | 59,009 | 1.69ms | 63.82 MB/s | 0 |
| This server | 37,497 | 1.16ms | 5.88 MB/s | 328,888 read |
What the numbers say:
Average per-request latency is actually lower on this server. The throughput gap and read errors most likely come from the thread-per-connection model: every accepted connection gets its own thread, so with 100 concurrent connections around 100 threads are alive at once. Context-switching overhead grows under load, and once the listen backlog fills, connections can be dropped before they're fully read.
Nginx uses an event loop built on kernel event notification (epoll on Linux, kqueue on macOS). Each worker process watches many connections at once, so it avoids per-connection threads and most of the context-switching overhead.
Next: refactoring the socket layer to epoll + a thread pool to close this gap and eliminate read errors.
With Docker:
git clone https://github.com/matheusmurk/webserver
cd webserver
docker compose up --build
Server starts on http://localhost:8080.
Without Docker:
g++ -std=c++17 -O2 -pthread -o build/HttpLinux server_linux.cpp http_tcpServer_linux.cpp -I.
./build/HttpLinux
Requires GCC 13+ and Linux or macOS.
webserver/
http_tcpServer_linux.h TCP server class + HttpRequest struct
http_tcpServer_linux.cpp Socket engine, parser, router, threading
response.h Response class declaration
response.cpp Response serializer + static file serving
server_linux.cpp main() — route registration + server start
Dockerfile
docker-compose.yaml
- Refactor socket layer to `epoll` event loop
- Thread pool: fixed worker threads instead of unbounded spawn
- Re-benchmark after epoll, target > 50k req/sec
- HTTP/1.1 keep-alive (persistent connections)
- Template engine: `{{ var }}` substitution
- TLS via `openssl`
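As a rough idea of what the planned `{{ var }}` substitution could involve, here is a minimal sketch. Nothing here exists in the project yet: `renderTemplate` is a hypothetical function, unknown variables are assumed to render as empty text, and no HTML escaping is performed.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Replace each {{ name }} tag with the matching value from vars.
// Unknown names become empty text; an unterminated "{{" is copied verbatim.
std::string renderTemplate(const std::string& tpl,
                           const std::unordered_map<std::string, std::string>& vars) {
    std::string out;
    std::size_t pos = 0;
    for (;;) {
        const auto open = tpl.find("{{", pos);
        if (open == std::string::npos) break;
        const auto close = tpl.find("}}", open + 2);
        if (close == std::string::npos) break;

        out.append(tpl, pos, open - pos);  // literal text before the tag

        // Extract the name between the braces and trim surrounding whitespace.
        std::string key = tpl.substr(open + 2, close - open - 2);
        const auto first = key.find_first_not_of(" \t");
        const auto last = key.find_last_not_of(" \t");
        key = (first == std::string::npos) ? "" : key.substr(first, last - first + 1);

        const auto it = vars.find(key);
        if (it != vars.end()) out += it->second;
        pos = close + 2;
    }
    out.append(tpl, pos, std::string::npos);  // remaining literal text
    return out;
}
```

Values inserted into HTML would need escaping before this is safe to use with user-supplied data.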
Starting from socket() returning a file descriptor and ending at a router that dispatches lambda handlers makes every abstraction in modern frameworks legible. Express's app.get() is, at heart, an unordered_map. Nginx's performance advantage is its event loop. A "request" is a loop over recv() watching for \r\n\r\n.
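The "router is an unordered_map" point can be made concrete with a few lines. This is a simplified sketch, not the project's router: `RouterSketch`, `get`, and `dispatch` are hypothetical names, handlers here simply return a body string, and only exact path matches are supported.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

class RouterSketch {
public:
    using Handler = std::function<std::string()>;

    // Register a handler for an exact path, similar in spirit to app.get().
    void get(const std::string& path, Handler handler) {
        routes_[path] = std::move(handler);
    }

    // Look the path up and run its handler, or fall back to a 404 body.
    std::string dispatch(const std::string& path) const {
        const auto it = routes_.find(path);
        if (it == routes_.end()) return "404 Not Found";
        return it->second();
    }

private:
    std::unordered_map<std::string, Handler> routes_;
};
```

Registering a route is then `router.get("/", [] { return std::string("home"); });`. Path parameters or prefix matching would need a different structure, such as the trie the feature list alludes to.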
Built by @matheusmurk