Blueprint · 1,356 words · 6 min read

REST: Architecture for the Scalable Web

Roy Fielding's dissertation formalized the architectural style that would become the standard for web APIs.

#TL;DR

By the late 1990s, the web was successful but nobody could explain why. It scaled better than any distributed system before it, yet it had no formal architecture — just conventions that had evolved organically. In 2000, Roy Fielding — one of the principal authors of HTTP itself — published his doctoral dissertation at UC Irvine. Chapter 5 described Representational State Transfer (REST): a set of architectural constraints that explained the web’s design and could guide the design of new systems. REST wasn’t a protocol, a framework, or an API specification. It was a way of thinking about networked applications that made them scalable, evolvable, and simple. When the API economy exploded in the 2000s, REST became the default architectural style — not because it was mandated, but because the alternatives (SOAP, XML-RPC) were so much more painful.

#The Dissertation

In 2000, the web had hundreds of millions of users and was growing fast. Nobody understood its architecture.

Not the surface mechanics — HTTP, HTML, URLs — those were well-specified. But the why: why did the web scale so well? Why could millions of servers and billions of clients interoperate without centralized coordination? Why did it evolve so gracefully, adding new content types and protocols without breaking what already worked?

Roy Fielding was uniquely positioned to answer this. He’d been a central figure in the web’s development for nearly a decade. He co-authored the HTTP/1.0 and HTTP/1.1 specifications. He co-founded the Apache HTTP Server project. He wasn’t designing a new system — he was reverse-engineering the design principles of the system he’d helped build.

His dissertation, “Architectural Styles and the Design of Network-based Software Architectures,” analyzed the web as an architectural style defined by a set of constraints. He called this style REST.

#The Constraints

REST is defined by six constraints. Each one eliminates a class of design choices, and the constraints together produce a system with specific, desirable properties:

1. Client-Server — separate the user interface from the data storage. The client handles presentation; the server handles data. They evolve independently.

2. Stateless — each request from client to server contains all information needed to understand the request. The server stores no session state between requests.

# Stateful (NOT REST): server remembers your context
POST /login          → server stores session
GET  /my-orders      → server looks up session to know who "my" is

# Stateless (REST): every request is self-contained
GET /users/42/orders
Authorization: Bearer eyJhbG...

3. Cacheable — responses must declare whether they can be cached. When a response is cacheable, the client (or an intermediary) can reuse it for later equivalent requests, reducing server load.

4. Uniform Interface — the defining constraint. All resources are accessed through a single, consistent interface:

  • Resources are identified by URIs (/users/42)
  • Resources are manipulated through representations (JSON, HTML, XML — the client doesn’t access the resource directly, it works with a representation of it)
  • Messages are self-descriptive (the Content-Type header tells you how to parse the body)
  • Hypermedia as the engine of application state (HATEOAS) — responses include links that tell the client what it can do next

5. Layered System — the client can’t tell whether it’s talking to the actual server or an intermediary (CDN, load balancer, proxy). This enables transparent scaling.

6. Code on Demand (optional) — the server can send executable code to the client (JavaScript). This is the only optional constraint.
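The hypermedia bullet under constraint 4 is the one most often skipped, so a small illustration may help. The sketch below is not taken from the dissertation: the `_links` field, the relation names (`self`, `pay`, `cancel`), and both helper functions are hypothetical choices made for this example. It shows a server attaching links that depend on the resource's state, and a client discovering its next actions from those links instead of hard-coding URLs.

```python
def order_representation(order):
    """Build a representation of an order with state-dependent links.

    The "_links" key and the relation names are illustrative assumptions,
    not something REST itself mandates.
    """
    order_url = f"/orders/{order['id']}"
    links = {"self": {"href": order_url, "method": "GET"}}
    if order["status"] == "pending":
        # Only a pending order can still be paid for or cancelled.
        links["pay"] = {"href": f"{order_url}/payment", "method": "POST"}
        links["cancel"] = {"href": order_url, "method": "DELETE"}
    return {"id": order["id"], "status": order["status"], "_links": links}


def available_actions(representation):
    """Client side: read the possible next steps from the links alone."""
    return sorted(rel for rel in representation["_links"] if rel != "self")


pending = order_representation({"id": 7, "status": "pending"})
shipped = order_representation({"id": 8, "status": "shipped"})

print(available_actions(pending))  # ['cancel', 'pay']
print(available_actions(shipped))  # []
```

The client never builds `/orders/7/payment` itself. If the server later moves that URL, or stops offering the action, a client written this way keeps working.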

#Resources and Representations

The central abstraction in REST is the resource — any concept that can be named. A user, an order, a search result, a collection of posts. Each resource has a URI:

/users/42              — a specific user
/users/42/orders       — that user's orders
/posts?era=era-3       — a filtered collection
/posts/era-3-dom       — a specific post

A resource is not a database row. It’s not a file. It’s an abstraction. The same underlying data might be exposed as multiple resources, and the same resource might have multiple representations — JSON for an API client, HTML for a browser, CSV for a spreadsheet:

GET /users/42
Accept: application/json

→ {"id": 42, "name": "Alice", "email": "alice@example.com"}

GET /users/42
Accept: text/html

→ <html><body><h1>Alice</h1>...</body></html>

The client asks for a representation of a resource. The server sends back the representation that best matches the client’s preferences. The resource itself is an abstract concept — you never interact with it directly, only with representations of it.
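One way to picture that matching step is a function that reads the Accept header and picks a renderer. This is a simplified sketch under stated assumptions: the function names and the `RENDERERS` table are invented for the example, JSON is assumed to be the server's default for `*/*`, and partial wildcards such as `text/*` are not handled. A production server would normally rely on its framework's negotiation logic.

```python
import html
import json


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0]
        if not media_type:
            continue
        q = 1.0  # a media type without an explicit q-value has weight 1
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so ties keep the order the client listed them in.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def render_json(user):
    return json.dumps(user)


def render_html(user):
    name = html.escape(user["name"])
    return f"<html><body><h1>{name}</h1></body></html>"


# Hypothetical table of representations this server can produce.
RENDERERS = {"application/json": render_json, "text/html": render_html}


def negotiate(user, accept_header):
    """Return (content_type, body) for the best match, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means "explicitly not acceptable"
        if media_type == "*/*":
            media_type = "application/json"  # assumed server default
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](user)
    return None  # a real server would answer 406 Not Acceptable


alice = {"id": 42, "name": "Alice"}
print(negotiate(alice, "text/html"))
print(negotiate(alice, "text/csv, application/json;q=0.8"))
```

The second call asks for CSV first. The sketch cannot produce CSV, so it falls back to the next acceptable type, JSON. The resource is the same in both calls; only the representation differs.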

#HTTP as the Uniform Interface

REST doesn’t require HTTP. But HTTP/1.1 was shaped by the same principles (Fielding defined REST and co-authored the HTTP specifications), so the mapping is natural:

HTTP Method   Meaning                       Safe?   Idempotent?
GET           Read a resource               Yes     Yes
POST          Create a new resource         No      No
PUT           Replace a resource entirely   No      Yes
PATCH         Partially update a resource   No      No
DELETE        Remove a resource             No      Yes

import requests

# Hypothetical host; requests needs an absolute URL with a scheme.
BASE_URL = "https://example.com"

# Create
requests.post(f"{BASE_URL}/api/posts", json={"title": "New Post", "era": "era-4"})
# → 201 Created, Location: /api/posts/17

# Read
requests.get(f"{BASE_URL}/api/posts/17")
# → 200 OK, {"id": 17, "title": "New Post", ...}

# Update
requests.patch(f"{BASE_URL}/api/posts/17", json={"title": "Updated Title"})
# → 200 OK

# Delete
requests.delete(f"{BASE_URL}/api/posts/17")
# → 204 No Content

# List
requests.get(f"{BASE_URL}/api/posts", params={"era": "era-4"})
# → 200 OK, [{"id": 17, ...}, {"id": 18, ...}]

Status codes carry semantics: 201 Created for successful creation, 404 Not Found for missing resources, 409 Conflict when a request clashes with the resource’s current state (a concurrent modification, for example). The response headers carry metadata: Content-Type, Cache-Control, Location. The entire HTTP specification becomes the API’s vocabulary.
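To make the method and status-code semantics concrete without a network, here is a toy in-memory version of the posts collection. Everything in it is an assumption made for illustration: the `PostStore` class, its method names, and the `(status, headers, body)` return shape do not come from any real framework. It is only meant to show which status code each operation would plausibly return, and what idempotence means in practice.

```python
import itertools


class PostStore:
    """A toy in-memory collection, addressed like /api/posts/{id}."""

    def __init__(self):
        self._posts = {}
        self._ids = itertools.count(1)

    def post(self, data):
        """POST /api/posts: create a new post. Not idempotent."""
        post_id = next(self._ids)
        self._posts[post_id] = {**data, "id": post_id}
        headers = {"Location": f"/api/posts/{post_id}"}
        return 201, headers, self._posts[post_id]

    def get(self, post_id):
        """GET /api/posts/{id}: safe, changes nothing."""
        if post_id not in self._posts:
            return 404, {}, None
        return 200, {"Content-Type": "application/json"}, self._posts[post_id]

    def put(self, post_id, data):
        """PUT /api/posts/{id}: replace the whole resource. Idempotent."""
        if post_id not in self._posts:
            return 404, {}, None
        self._posts[post_id] = {**data, "id": post_id}
        return 200, {}, self._posts[post_id]

    def delete(self, post_id):
        """DELETE /api/posts/{id}: idempotent in its effect on state."""
        if self._posts.pop(post_id, None) is None:
            return 404, {}, None
        return 204, {}, None


store = PostStore()

# Two identical POSTs create two distinct resources.
print(store.post({"title": "New Post"})[1])  # {'Location': '/api/posts/1'}
print(store.post({"title": "New Post"})[1])  # {'Location': '/api/posts/2'}

# Two identical PUTs leave the resource in the same state.
store.put(1, {"title": "Updated Title"})
print(store.put(1, {"title": "Updated Title"})[2])

# A repeated DELETE changes nothing further, though the status differs.
print(store.delete(1)[0], store.delete(1)[0])  # 204 404
```

Note the last line: idempotence is a statement about server state, not about the response. The second DELETE returns 404, but the post is gone either way.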

#SOAP: The Road Not Taken

REST became the web API standard partly because of what it replaced. In the early 2000s, the enterprise world was building APIs with SOAP (Simple Object Access Protocol) — an XML-based messaging framework that ran on top of HTTP rather than with it.

<?xml version="1.0"?>
<!-- SOAP: get a user by ID -->
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetUser xmlns="http://example.com/users">
      <UserId>42</UserId>
    </GetUser>
  </soap:Body>
</soap:Envelope>

# REST: get a user by ID
GET /users/42

SOAP used HTTP as a dumb transport — every request was a POST to the same URL, with the actual operation buried inside the XML body. It required WSDL (Web Services Description Language) files to describe the API, and code generators to produce client stubs. Building a SOAP client was a project. Building a REST client was a curl command.
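The difference shows up even before anything is sent. The sketch below builds both requests with Python's standard library and does not transmit them. The endpoint URLs and the two helper functions are hypothetical, and a real SOAP 1.1 call would typically also carry a SOAPAction header, which is omitted here.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
USERS_NS = "http://example.com/users"  # namespace from the SOAP example above


def soap_get_user(endpoint, user_id):
    """Build (but do not send) a SOAP request: the operation lives in the body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    get_user = ET.SubElement(body, f"{{{USERS_NS}}}GetUser")
    ET.SubElement(get_user, f"{{{USERS_NS}}}UserId").text = str(user_id)
    payload = ET.tostring(envelope, encoding="utf-8", xml_declaration=True)
    return urllib.request.Request(
        endpoint,
        data=payload,
        method="POST",  # always POST, whatever the operation is
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


def rest_get_user(base_url, user_id):
    """Build (but do not send) a REST request: the URL and method say it all."""
    return urllib.request.Request(
        f"{base_url}/users/{user_id}",
        method="GET",
        headers={"Accept": "application/json"},
    )


soap = soap_get_user("https://example.com/soap", 42)
rest = rest_get_user("https://example.com", 42)

print(soap.get_method(), soap.full_url, len(soap.data), "bytes of XML")
print(rest.get_method(), rest.full_url)
```

An intermediary looking at the SOAP request sees only a POST to `/soap`. It cannot tell a read from a write, so it cannot safely cache or retry it. The REST request exposes both the resource and the intent in the request line.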

The industry voted with its keyboards. By 2010, REST had won outside of legacy enterprise systems. SOAP didn’t die because it was technically wrong — it died because it was too complex for the web’s pace of development.

#The API Economy

REST gave the industry a shared vocabulary, and that vocabulary enabled an explosion:

Twitter (2006) launched a REST API that let third-party clients post tweets, read timelines, and manage accounts. A large share of early Twitter activity came through apps built on the API rather than through the official website.

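Amazon Web Services (2006) exposed infrastructure over HTTP. S3 offered a REST interface in which every bucket and object had a URL, so you could create, read, and delete storage with plain HTTP requests. EC2’s Query API was action-oriented rather than strictly resource-oriented, but it too was driven entirely by HTTP calls.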

Stripe (2011) showed that a beautifully designed REST API could be a product. Their API was so clean and well-documented that it became the template for an entire generation of developer-facing products.

# The Stripe API: REST at its best
curl https://api.stripe.com/v1/charges \
  -u sk_test_4eC39HqLyjWDarjtT1zdp7dc: \
  -d amount=2000 \
  -d currency=usd \
  -d description="My First Test Charge"

By 2015, “API-first” was a design philosophy. Companies built the API before the website. Mobile apps were thin clients over REST endpoints. Microservices communicated via REST. The web stopped being a network of pages and became a network of APIs.

#What REST Got Right

Fielding didn’t invent anything new. He named what was already working:

  • Resources, not procedures — SOAP and XML-RPC modeled APIs as remote function calls: getUser(42), createOrder(...). REST modeled them as resources: GET /users/42, POST /orders. This noun-oriented approach was simpler to learn, easier to cache, and mapped directly to HTTP’s existing infrastructure.
  • Statelessness enables scaling — because each request is self-contained, any server can handle any request. Load balancers don’t need sticky sessions. Servers can be added or removed freely. This is why REST APIs scale to billions of requests — the same principle that made HTTP scale to billions of users.
  • The uniform interface — by constraining every API to the same small set of operations (GET, POST, PUT, DELETE on resources), REST made APIs predictable. If you’ve used one REST API, you understand the shape of every REST API. That shared understanding is worth more than any individual optimization.
  • The web is the platform — Fielding’s deepest insight was that the web already had a good architecture. You didn’t need to build a new distributed system on top of it (CORBA, DCOM, SOAP). You just needed to use the web as designed: URIs for naming, HTTP for verbs, representations for data, hypermedia for navigation.
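The statelessness point can be demonstrated in a few lines. In this sketch, three hypothetical servers share nothing but a signing key, and a round-robin "load balancer" sends each request to a different one. The token format, the `SECRET` value, and the function names are assumptions made for the example; a real API would more likely use an established token scheme such as signed JWTs.

```python
import hashlib
import hmac
import itertools

SECRET = b"demo-secret"  # hypothetical signing key shared by every server


def issue_token(user_id):
    """Sign the user id so any server holding SECRET can verify it."""
    sig = hmac.new(SECRET, str(user_id).encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{sig}"


def handle_request(server_name, path, token):
    """A stateless handler: everything it needs arrives with the request."""
    user_id, _, sig = token.partition(".")
    expected = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return 401, None  # signature does not match: unauthenticated
    if path != f"/users/{user_id}/orders":
        return 403, None  # authenticated, but not this user's resource
    return 200, {"user": int(user_id), "served_by": server_name}


# Round-robin over three interchangeable servers; no sticky sessions.
servers = itertools.cycle(["server-a", "server-b", "server-c"])
token = issue_token(42)

for _ in range(3):
    print(handle_request(next(servers), "/users/42/orders", token))
```

Each request succeeds regardless of which server receives it, because no server had to remember an earlier login. Adding a fourth server means adding a name to the list, nothing more.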

REST is more often approximated than followed. Most “REST APIs” ignore HATEOAS, overload POST for everything, and use HTTP as a transport rather than an application protocol. Fielding himself has noted that what the industry calls REST is often just “HTTP-based APIs.” But even the approximation — resources with URIs, CRUD over HTTP, JSON representations — is a massive improvement over what came before. The web’s architecture, formalized in a dissertation, became the world’s default way to build APIs.