Practical HTTP API Design

Over the years, I've built a lot of systems that include services accessed via HTTP. When designing an HTTP service API, I try to guide my design decisions entirely in terms of concrete, pragmatic benefits for my system, and so tend to do the following (in no particular order):


  • Read using GET and hyperlinked resources because the built-in caching can offload a lot of work from your servers. Avoid using PUT and DELETE for writing data, for a couple of reasons. First, there is no precedent for their use on the human (as opposed to the service) Web. Support for both verbs as form methods was removed from the HTML5 specification, suggesting this will continue to be the case. More importantly, unlike GET, there is nothing in the network infrastructure that does anything meaningful with PUT or DELETE. PUT is particularly problematic because it raises the question of how to handle partial updates: extra embedded data in the request, sub-resources, full updates but with some data missing, etc.

  • Instead, use POST to write data, but not to individual resources. Send requests to one resource (or a small handful of resources) and include an action or actions in the request body. This approach makes it easier to implement batch operations, like "give all my customers in California who spent more than $100 last month a $10 gift card", which you would never want to do by manipulating individual resources. It also allows you to “boxcar” multiple unrelated actions together into a single request, if desired. This logical-action oriented approach to writing also provides a benefit with untrusted clients, like code running in a browser. In that environment, it’s important for a service to know what action is taking place so it can decide whether it should be allowed. Data updates via PUTs to individual resources do not convey this information and can be difficult to secure.

 Finally, this model avoids the question of how long individual resource URLs are good for and whether you have to remember them over time. If you retrieve a resource with hyperlinks for writing, and you plan to come back later to write data, where and how do you store the links, and for how long? Holding them in memory is often not sufficient, e.g., when you are executing code on a web server responding to requests from a client and different requests may go to different servers. Do you put the resource representation in the database so you’ll have the URLs you need later? Reducing the URLs you write through to a few well-known ones for a given service eliminates this problem. 

This model may seem strange because it makes reads and writes asymmetrical. Don’t let that bother you; most web sites work the same way. Think of the POST requests as creating transaction resources, which you are choosing to model explicitly in your API.
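As a rough illustration of the action-in-the-body idea, here is a minimal Python sketch of what the server side of a single write endpoint might look like. Everything in it is an assumption made for the example: the JSON body shape ({"actions": [...]}), the action names, the handler functions, and the in-memory store are all hypothetical. It shows the three points made above: boxcarring several actions in one request, a batch action expressed as one logical operation, and an authorization check that works because the request names the action.

```python
import json

# Hypothetical action handlers; the names and argument fields are illustrative.
def change_name(store, args):
    store[args["person-id"]]["name"] = args["name"]
    return {"changed": args["person-id"]}

def grant_gift_card(store, args):
    # Batch action: one logical request touches many records.
    matched = [pid for pid, person in store.items()
               if person.get("state") == args["state"]
               and person.get("spent", 0) > args["min-spent"]]
    for pid in matched:
        store[pid]["credit"] = store[pid].get("credit", 0) + args["amount"]
    return {"granted": sorted(matched)}

HANDLERS = {"change-name": change_name, "grant-gift-card": grant_gift_card}

def handle_post(body, store, allowed_actions):
    """Process a body shaped like {"actions": [{"action": ..., "args": {...}}]}.

    Returns (status, response). Every action is validated and authorized
    before any of them is applied. Per-action argument validation is omitted.
    """
    try:
        actions = json.loads(body)["actions"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "malformed request"}
    if not isinstance(actions, list):
        return 400, {"error": "malformed request"}
    for item in actions:
        name = item.get("action") if isinstance(item, dict) else None
        if not isinstance(name, str) or name not in HANDLERS:
            return 400, {"error": "unknown action"}
        # The request names the action, so the service can decide whether
        # this particular caller is allowed to perform it.
        if name not in allowed_actions:
            return 403, {"error": "action not allowed: " + name}
    results = [HANDLERS[item["action"]](store, item.get("args", {}))
               for item in actions]
    return 200, {"results": results}
```

A browser client might be given only {"change-name"} as its allowed set, while an internal back-office tool is also allowed "grant-gift-card"; the endpoint and request format stay the same for both.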

  • Avoid application-specific MIME types as a way to indicate request or response message semantics or version. Code that consumes HTTP requests or responses only cares about the data. If a semantically meaningful MIME type and the data format don't match, the data wins. If they do match, the MIME type is redundant. Stick with the standard MIME types, for example, application/edn or application/json.

  • If you use links in the representations of your resources, and need a way to indicate their semantics, use a simple token that you document as part of your API. I typically use a string or keyword key in a map, where the value for the key is the desired URL. For instance:

{:person {:name "Tim" :age 43 :action/change-name "/person/25"}}

In this case, the :action/change-name key specifies the URL to use to change the name of this person. The HTTP method and request format to use are defined in the API documentation for that key and coded into the client code. The only parameterization is the endpoint to communicate with, which can be changed as needed by the service.
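To make the client side of this concrete, here is a small Python sketch that consumes a JSON rendering of the person representation above. The token is the only thing the client looks up at runtime; the HTTP method and body format are hard-coded. Note that the method (POST), the body shape, the function name, and the base URL are assumptions for the example, standing in for whatever the API documentation for that token would actually specify.

```python
import json
from urllib.parse import urljoin

# The token is part of the documented API; the URL behind it is not.
CHANGE_NAME = "action/change-name"

def change_name_request(representation, new_name, base_url):
    """Build (method, url, body) for the change-name action.

    Returns None if the service did not include the link in this
    representation.
    """
    link = representation["person"].get(CHANGE_NAME)
    if link is None:
        return None
    # Method and body format are assumed here; in practice they come from
    # the API documentation for this token and are coded into the client.
    body = json.dumps({"action": "change-name", "name": new_name})
    return "POST", urljoin(base_url, link), body

person = {"person": {"name": "Tim", "age": 43, CHANGE_NAME: "/person/25"}}
request = change_name_request(person, "Timothy", "https://api.example.com/")
```

If the service later moves the endpoint, only the value under the token changes; the client code above keeps working without modification.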

  • Support one data format, e.g. EDN, JSON, or XML. In my experience, using multiple formats forces you to lowest-common-denominator semantics and the primary format is the one used to do all the testing - the rest are second class.

  • Use only basic HTTP return values. Most consumers are not prepared to deal with all the different HTTP status codes. I use 200, 400, 401, 403, 500 and 30x, and that’s pretty much it. For example, I replace a 201 with a redirect to the given location.

  • Finally, if you are using HTTP services internally, meaning services are calling other services, make sure you manage HTTP connections carefully. While HTTP itself supports pipelining requests, many Web servers serialize server-side processing of requests on a given connection. So if you have a lot of traffic between services, you need a pool of connections. This is especially true if you are using HTTPS internally, since constantly recreating secure connections is expensive. I have profiled services that spent 60% of their time reconnecting to down-stream services because a new HTTP connection was created for each request.
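For the connection management point, here is a minimal sketch of a fixed-size connection pool built on Python's standard http.client module. The class name, its parameters, and the host used in the usage note are hypothetical, and a production pool would also need timeouts, health checks, and retry handling. The sketch only shows the core idea: create a small set of connections once and reuse them across requests instead of reconnecting each time.

```python
import http.client
import queue
from contextlib import contextmanager

class ConnectionPool:
    """A fixed-size pool of persistent HTTP connections to one service."""

    def __init__(self, host, port, size=4, connect=http.client.HTTPConnection):
        # LIFO order keeps recently used (still warm) connections in use.
        self._idle = queue.LifoQueue()
        for _ in range(size):
            # HTTPConnection opens its socket lazily, on the first request.
            self._idle.put(connect(host, port))

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            yield conn
        except Exception:
            # Drop the possibly broken socket; http.client reopens it
            # automatically on the next request.
            conn.close()
            raise
        finally:
            self._idle.put(conn)

    def get(self, path):
        with self.connection() as conn:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be read fully before the connection is reused.
            return response.status, response.read()
```

Usage would look something like pool = ConnectionPool("inventory.internal", 8080) followed by pool.get("/items/42") from any request-handling thread. Passing connect=http.client.HTTPSConnection gives the same reuse for HTTPS, which is where avoiding repeated handshakes matters most.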

My approach to designing HTTP-based services is somewhat at odds with some of the contemporary practices in this space, but I’ve found it to be a simple, practical way to build them. It's worth noting that this model is more or less what Datomic's API does, and it works very well there.
