Understanding the Good in “The Good, the Bad and the Ugly of REST APIs”

With dozens of APIs getting published every month, it has become routine for a seemingly innocent “How to do REST” or “Guidelines for REST APIs” blog post to become a source of controversy around applying REST principles properly.

Last week, it was George Reese’s blog post titled “The Good, the bad, and the ugly of REST APIs” that stirred the controversy.  There were several reactions, and reactions to reactions, on all aspects of his post.  A tweet from George captured the mood.

One of the things George’s blog advocates under “Good” list is:

Supporting both JSON and XML

I know you love {JSON,XML} and you think everyone should be using {JSON,XML} and that the people who use {XML,JSON} are simply stupid. But you’re not building an API for yourself, you are building it for everyone, including those who think {XML,JSON} rocks and {JSON,XML} sucks. I don’t want to get in the technical merits of either language or even the possibility that there might be distinct use cases for JSON vs. XML. The bottom line is that there are people out there who want to consume APIs in both languages and it’s just not hard or complex to support both.

To which William Vambenepe reacts in his blog post:

I disagree: Two versions of a protocol is one too many (the post behind this link doesn’t specifically discuss the JSON/XML dichotomy but its logic applies to that situation, as Tim Bray pointed out in a comment).

Now that confuses me. Supporting JSON and XML, in my mind, is no different from a resource supporting multiple media types and serving the appropriate one using content negotiation.  The post he links to talks about SOAP and REST as multiple protocols; I don’t see the connection.
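For illustration, here is a rough sketch of what that looks like on the wire, using a hypothetical /books/42 resource on api.example.com (not a real API): the same URI serves either representation, driven purely by the Accept header the client sends.

GET /books/42 HTTP/1.1
Host: api.example.com
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8

{ "id" : 42, "title" : "RESTful Web Services" }

GET /books/42 HTTP/1.1
Host: api.example.com
Accept: application/xml

HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8

<book><id>42</id><title>RESTful Web Services</title></book>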

The other item on George’s “Good” list is:

Providing solid API documentation reduces my need for your help

Solid API documentation enables an individual to both understand the simple examples that help them get started as well as the complex ways in which they can leverage the API to accomplish advanced tasks

That doesn’t seem like a statement to be argued with, does it?  Apparently not so. Jan Algermissen posted a comment saying:

If you document an API, you API immediately ceases to have anything to do with REST. The contract in RESTful systems is the media types, *not* the API documentation.

I suggest you move that section to “The Bad”

This comment was met with ridicule by many folks, including George and William, who termed it nothing but silly.

My takeaway from that comment was that the documentation should focus on the media types, as the rest of the behaviour of dealing with them is fairly standard with REST.

Clearly, Jan Algermissen is no newbie to REST. But I am still baffled by the first part of his statement.  How can a little documentation with examples of request and response payloads (even at the risk of duplicating something very obvious in the REST way of doing things) make an API so against REST?

Stu responds with his blog post defending Jan Algermissen’s comment and many other things.

Jan Algermissen’s comment about how when you’ve documented an API, you’re not really doing REST, is actually spot on, fundamentally, but is met with ridicule. I can completely understand how vacuous the comment sounds if you just want to ship an API nownownow, are being short-term practically minded, or are just playing buzzword bingo with architectural styles. But if we actually want to leverage the benefits of the style, we’d work on the core issue of having a generic media type or two for these sorts of use cases.

I am still trying to digest parts of Stu’s post. Most developers learn best practices by looking at what experts in the domain recommend, or by mimicking the real-world APIs of popular web sites (twitter.com, facebook.com etc.). Unfortunately, of late, these are the wrong sources to learn best practices from (remember the hashbang controversy).

Here is something even more basic: try to get a bookmarkable link to a specific tweet on the twitter.com site.

 

(Updated 14th June 2011)

I would like to add one more item to the list of “good” things about REST APIs.  I am sure all REST purists will now cringe at this: try and publish a WADL for your API. The goal is not to be able to do all the weird stuff that tools force you to do with WSDL while consuming a service.  But it definitely helps your API consumers leverage tools that help them understand the API better.  For example, check out this cool API console tool from Apigee. The Apigee API console takes a WADL and provides a nice way to navigate the API, exercise the API (including OAuth), look at the requests/responses and learn iteratively – all with zero coding and based completely on the WADL.

It already supports several public REST APIs as an example.


HTTP Conditional GET in APIs – A Forgotten Art?

The HTTP protocol has this cool feature called “Conditional GET”.  Let us understand it with an example from the Twitter API.

Here is an API request to retrieve the timeline of a Twitter user in JSON representation:

GET /1/statuses/home_timeline.json HTTP/1.1

Authorization: OAuth oauth_consumer_key="XXXXXXXXXXXXX", oauth_signature_method="HMAC-SHA1" …
Host: api.twitter.com
Connection: Keep-Alive
The response looks something like this:

HTTP/1.1 200 OK

ETag: "f50e33f5b45783a3cf81d3c76e50f065"-gzip
Content-Length: 26832
Expires: Tue, 31 Mar 1981 05:00:00 GMT
Last-Modified: Sat, 16 Apr 2011 18:29:15 GMT
Connection: close
Cache-Control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
Pragma: no-cache
Content-Type: application/json; charset=utf-8

[ { "favorited" : false, "text" : "Design Patterns – Progressive actions: http://wp.me/pEZOQ-3t #myblog http://wp.me/pEZOQ-3t", "retweet_count" : 0, "in_reply_to_screen_name" : null, "in_reply_to_status_id_str" : null, "place" : null, …

Assuming that the timeline of the user has not changed, if I make this request repeatedly, I end up getting 26832 bytes transferred every time.  The request could have been to any other Twitter resource, such as tweets, users, lists etc., that probably does not change very frequently.  In fact, with every service, there will most likely be certain resources that do not change very frequently.  And if a client already has a representation of such a resource, downloading the same resource again is a wasteful exercise for the client, the network and the server. This is particularly important for mobile-device-based clients, where network bandwidth is limited.

As the name implies, Conditional GET makes the GET method conditional. That is, the fetching of a resource happens only if certain conditions are met. Let us see what these conditions are by retrying the above request with slight modifications:

GET /1/statuses/home_timeline.json HTTP/1.1

Authorization: OAuth oauth_consumer_key="XXXXXXXXXXXXX", oauth_signature_method="HMAC-SHA1" …
Host: api.twitter.com
Connection: Keep-Alive
If-None-Match: "f50e33f5b45783a3cf81d3c76e50f065"-gzip
If-Modified-Since: Sat, 16 Apr 2011 18:29:15 GMT
I made two changes this time – note the last two headers:
  • I took the “ETag” header value from the previous response and added it as the “If-None-Match” header in the new request.
  • I took the “Last-Modified” header value from the previous response and added it as the “If-Modified-Since” header in the new request.

The “If-Modified-Since” header tells the server to send the resource representation only if the resource has been modified since the date given in the header value.

ETags (entity tags) are server-provided opaque values associated with a resource. ETags are useful as strong validators. That is, ETags are expected to change when the resource is modified. A simple implementation could compute the ETag as a hash value of the resource representation. Given that, a server can compare the resource’s current ETag value with the one presented in the request to decide whether the client holds a stale representation or not.
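As a minimal sketch of that simple implementation (assuming the representation bytes are available in memory and that an MD5 hex digest is an acceptable validator), the ETag could be computed like this:

import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class ETagUtil {
    // Compute a simple ETag as the quoted MD5 hex digest of the representation.
    public static String etagOf(String representation) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(representation.getBytes(StandardCharsets.UTF_8));
        return "\"" + String.format("%032x", new BigInteger(1, digest)) + "\"";
    }
}

Any change to the representation produces a different digest, so the value behaves like the strong validator described above.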

By sending the previously received ETag value in the “If-None-Match” header, the client indicates to the server that it should send the response only if the ETags do not match.

Effectively, with these two new headers, the client indicates to the server that it holds a copy of the resource and that it would like to receive a resource representation in the response only if the server determines that the client is holding a stale copy.
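Here is a minimal client-side sketch of that behaviour using java.net.HttpURLConnection (the resource URL and the in-memory cache are just placeholders for illustration):

import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ConditionalGetClient {
    private final Map<String, String> etags = new HashMap<>();
    private final Map<String, String> bodies = new HashMap<>();

    public String fetch(String resourceUrl) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(resourceUrl).openConnection();
        String cachedEtag = etags.get(resourceUrl);
        if (cachedEtag != null) {
            // Ask the server to send a body only if our cached copy is stale.
            conn.setRequestProperty("If-None-Match", cachedEtag);
        }
        if (conn.getResponseCode() == HttpURLConnection.HTTP_NOT_MODIFIED) {
            // 304 Not Modified: nothing was transferred beyond headers; reuse the cached copy.
            return bodies.get(resourceUrl);
        }
        String body = new String(conn.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        String etag = conn.getHeaderField("ETag");
        if (etag != null) {
            // Remember the validator and the representation for the next request.
            etags.put(resourceUrl, etag);
            bodies.put(resourceUrl, body);
        }
        return body;
    }
}

The same idea applies to If-Modified-Since, using the Last-Modified value instead of the ETag.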

Here is how the new response looks if the client’s copy is still valid. The 304 status code indicates that the requested resource has not been modified and that the response contains no body.

HTTP/1.1 304 Not Modified

ETag: "f50e33f5b45783a3cf81d3c76e50f065"-gzip
Connection: close
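And here is a minimal server-side sketch using the JDK’s built-in com.sun.net.httpserver package (the /timeline path and its JSON body are made up for illustration): the server derives an ETag from the current representation and replies with 304 and no body when it matches the client’s If-None-Match value.

import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.math.BigInteger;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class ConditionalGetServer {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/timeline", exchange -> {
            try {
                String representation = "[ { \"text\" : \"hello\" } ]";   // placeholder resource body
                byte[] body = representation.getBytes(StandardCharsets.UTF_8);
                // ETag derived from the current representation, as in the hash sketch above.
                String etag = "\"" + String.format("%032x", new BigInteger(1,
                        MessageDigest.getInstance("MD5").digest(body))) + "\"";
                exchange.getResponseHeaders().set("ETag", etag);
                if (etag.equals(exchange.getRequestHeaders().getFirst("If-None-Match"))) {
                    // The client's copy is current: 304 with no response body.
                    exchange.sendResponseHeaders(304, -1);
                } else {
                    exchange.getResponseHeaders().set("Content-Type", "application/json; charset=utf-8");
                    exchange.sendResponseHeaders(200, body.length);
                    try (OutputStream out = exchange.getResponseBody()) {
                        out.write(body);
                    }
                }
            } catch (Exception e) {
                exchange.sendResponseHeaders(500, -1);
            } finally {
                exchange.close();
            }
        });
        server.start();
    }
}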

Conditional GET is widely supported by almost all RSS feed servers and RSS clients.  When it comes to APIs, there seems to be complete ignorance. Looking at a Wireshark capture of my Android mobile device traffic, it looks like many apps do not make use of this feature when it is obvious that they should.

While I have not tried out the above requests against the real Twitter API, I am glad to see that it does support ETags. I assume it supports Conditional GET as well.  There is no mention of this in any of the Twitter API docs though. Quick googling indicates that this is the case with pretty much every other API as well.

With API virtualization and API management systems like Apigee, it is possible to implement this feature completely outside the API provider, without changing a single line of their code.

Design Patterns – Progressive actions

Performing an action progressively in increments is a generic design pattern that we can see in many contexts. The goals of the pattern vary depending on the context.

  • Reduce perceived latency
  • Improve the usability
  • Avoid the information clutter
  • Optimize resource usage (CPU, Memory etc)
  • On-demand resource usage (CPU, Memory etc)

Here are a few examples of this pattern:

Progressive Disclosure

This is a UI design pattern. The goals here are typically to avoid information clutter and improve usability. The general approach is to break up the content into smaller chunks/blocks, display one or two chunks to begin with, and show the other blocks progressively as required. The techniques to hide/show content and the events that trigger the progressive display depend on the presentation technology.

The pattern not only makes the content easy to digest for novice users but also makes it possible for advanced users to explore further – in a way helping the transition from novice user to advanced user. A popular avoid-the-clutter example that you may be familiar with is the twitter.com site, which displays controls (Reply, Retweet etc.) when the mouse hovers over a tweet.

Some relevant blog posts that talk about this technique in detail

Progressive Disclosure in User Interfaces

Wikipedia page

http://www.interaction-design.org/encyclopedia/progressive_disclosure.html

http://www.useit.com/alertbox/progressive-disclosure.html

Progressive Rendering

This is another UI design pattern. The goal is to reduce the perceived latency. Like the previous technique, the general approach is to break up the information into smaller chunks (if it is not already broken up in the original content), then process and display each chunk before proceeding with the next.

The browser is a popular example: the various pieces of information in a web page (HTML, JavaScript files, CSS stylesheets, images etc.) are processed in an incremental and parallel fashion. The processing includes downloading the entity, parsing it and updating the DOM. The processed entities are then displayed without waiting for other non-dependent entities.  You also see this pattern at work when a browser renders an image whose resolution improves progressively as the image data is downloaded.

maps.google.com is another example where the tiles comprising the map are separately downloaded, processed and displayed progressively. The end result is that you see parts of the map getting displayed instantly and the rest being filled in progressively.

Progressive Collection of Information

This is again a UI pattern, where the goal is to ask the user for information incrementally, as required based on the context. The traditional approach of displaying a form with dozens of fields and expecting the user to fill in all of them in one shot runs the risk of the user losing interest and not providing the desired information.

A typical user registration form in many sites asks for the following information:

  • Email address
  • Mailing address
  • Payment Details
  • Areas of Interest

Instead, one can ask for just the email address to begin with. When the user navigates the site and purchases an item, ask for payment details then. If the user buys an item that needs to be shipped, ask for the mailing address at that point.

If you think about it, this is no different from how it works in real life. When you go to a brick-and-mortar shop, you do not disclose any of these details up front. Only when you purchase an item are you asked for payment details.

Progressive Processing

This is a software design pattern. The goal is to process a stream of information as it is downloaded, as opposed to waiting until the end of the stream to start processing. In practice, with certain large or never-ending streams, it may not be possible to wait for the end of the stream, and the only option left is this pattern.  Use of this pattern not only helps reduce perceived latency, it can also result in more optimal use of resources (memory, CPU etc.) in some cases.

A familiar example is XML parsing.  In a server environment where the server handles multiple requests containing XML documents asynchronously, in a non-blocking manner, not all of the request data will be available in one shot due to the inherent nature of network delays and sharing across clients.

If the requirement is to build a DOM of this XML document, a progressive XML parser fits the scenario better than a traditional non-progressive XML parser. A non-progressive parser would either block on the stream because the data is not available yet (most Java parsers fall into this category) or expect that the entire document data be provided in one call. A progressive parser, on the other hand, accepts the input in chunks/increments, parses each chunk and then returns control back to the caller. When more chunks of data are available, a progressive parser is able to continue parsing.

If the requirement is instead to use a SAX-style parser to search for and extract certain data from the stream, a progressive XML parser would not only reduce the latency, it would also reduce memory usage (one can discard intermediate data structures and previously received chunks of stream data).

In fact, the only publicly available commercial/open-source C-language XML parser that I know of that is capable of progressive parsing is libXML.  At apigee.com, my previous company, we built a couple of progressive XML parsers (one based on lex/yacc and the other based on a hand-coded state machine) that not only could do wire-speed XML parsing but could also handle never-ending XML streams (used in the financial world).

Use of this pattern influences the API contract of the component using the pattern. In the example of the XML parser above, the progressive parser would likely have appropriate methods to accept chunks of input and be called repeatedly until the last chunk.
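As a purely hypothetical illustration of what such a contract might look like, here is a sketch of a progressive-parser interface; none of these names come from a real library:

import java.nio.ByteBuffer;

// Hypothetical contract of a progressive (feed-style) XML parser, for illustration only.
public interface ProgressiveXmlParser {

    // Feed the next chunk of the document as it arrives off the wire. The parser
    // consumes what it can, emits events for it, and then returns control to the caller.
    void feed(ByteBuffer chunk);

    // Signal that the last chunk has been fed (may never be called for endless streams).
    void endOfInput();

    // Parsed events are pushed to a caller-supplied handler, SAX style.
    void setHandler(XmlEventHandler handler);

    interface XmlEventHandler {
        void startElement(String name);
        void characters(CharSequence text);
        void endElement(String name);
    }
}

A non-blocking server would call feed() from its I/O callback each time a chunk of the request arrives, instead of handing the parser a blocking input stream.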

Progressive building of data structures

This is a pattern typically used in software design. The goal is to delay resource usage until the point of actual need.

An example of this is again popular in the XML parser world: a delayed (deferred) DOM parser.  Apache Xerces implements this pattern. In the deferred DOM building mode, when the Xerces parser is given an input XML document, it doesn’t materialize all the DOM nodes at the beginning. Instead, only the document node (and probably the root node) is materialized. As methods are called on the root node, subsequent nodes are materialized.
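If I remember correctly, Xerces exposes this behaviour through its deferred node expansion feature. A sketch along these lines should toggle it; the feature URI is quoted from memory, so treat it as an assumption to verify against the Xerces documentation:

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DeferredDomExample {
    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Xerces-specific feature (assumed URI): build DOM nodes lazily,
        // materializing them only when they are first touched.
        factory.setFeature("http://apache.org/xml/features/dom/defer-node-expansion", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse("large-document.xml");
        // Children of the root are expanded on first access, not at parse time.
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}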

Unified Models for the Cloud

Today, I came across several posts, starting with Lori MacVittie’s post on “A Unified Model”, where she talks about her dream of coming up with a unified model across the layers of the cloud:

“If we could agree on a unified model to codify the meta-data necessary across the entire spectrum of application and network infrastructure services and then further agreed to standardize on a REST-based API we could, in fact, arrive at a point where the issue of cloud interoperability is resolved with little more than a URL rewrite.”

“If the model is consistent across cloud computing environments – public, private, hybrid, local, external, internal – much in the same way HTML is consistent across web applications, then the differentiation across providers becomes the services and processes that orchestrate those services into an efficient, fast, secure application delivery system. If the model is consistent then an application can be described in a consistent, unified way and the implementation of the policies required to support the application and its infrastructure needs (security, acceleration, QoS, optimization, data integration, backup, storage) remains the demesne of the cloud computing provider.”

In this context, it is also interesting to read her older post “Cloud, Standard, and Pants” with an analogy to the problems women face in correlating the sizes across brands.

In the end she summarizes:

“We need to focus on the models, not the APIs, because it is the models that will provide the interoperability and portability desired both across and within cloud computing environments.”

Great post. Echoing similar thoughts on Cloud API standards: back in September ’09, in my post “Do we need Cloud API Standards”, I argued that we need to have the underlying models compatible before coming up with a standard (a unified API, for example). As it stands, the models vary quite a bit, and unless this is addressed, interoperability will not succeed.

Do we need Cloud API Standards?

Of late there has been a lot of buzz around cloud API standards. For a while there was only one cloud and one set of APIs, and those were Amazon’s. But now we have several public and private clouds. With so many solutions/products coming up, it is common (and expected) for vendors, ISVs and developers to start thinking about a standard set of APIs to access these cloud services.

This seemed to attract a lot of criticism from a section of the blogosphere; here are a few of the arguments I came across.

One of the common criticisms seems to be that standardization is bad for innovation from vendors and that the “lowest common denominator” rules. I don’t think this is something we should worry about.  The DOM and SAX APIs were the only standards for processing XML for several years. That didn’t stop anyone from innovating and coming up with the StAX API.  Nor did it stop several domain-specific languages (DSLs) from innovating on different mechanisms to access and process XML.

Look at the standards for accessing SQL.  Both JDBC and ODBC have been around for several years now. Have you seen anyone writing a SQL application that doesn’t use these standards (or some of the newer ones coming up for PHP/Python/Ruby etc.)?  If you are developing a new Java database application today, would you worry that JDBC exposes only “lowest common denominator” functionality and go with some native API exposed by the RDBMS vendor instead? Nah. I don’t think so.
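To make that concrete, here is what typical JDBC code looks like (the connection URL, credentials and table are made up): moving from one RDBMS vendor to another changes the driver jar and this URL, not the code written against the standard API.

import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class JdbcExample {
    public static void main(String[] args) throws Exception {
        // Only this URL (and the driver on the classpath) is vendor-specific.
        String url = "jdbc:postgresql://localhost:5432/inventory";
        try (Connection conn = DriverManager.getConnection(url, "app", "secret");
             PreparedStatement stmt = conn.prepareStatement(
                     "SELECT name, price FROM products WHERE price < ?")) {
            stmt.setBigDecimal(1, new BigDecimal("100.00"));
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("name") + " : " + rs.getBigDecimal("price"));
                }
            }
        }
    }
}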

And neither of these standards prevented innovation from vendors. In fact, they made life easier for a lot of vendors.

Clay Lovelace’s post mentions that standard APIs tend to leak abstractions. Reading more carefully, the post is pointing out issues with some tools that generate SQL, not with a standard API like JDBC.  Leaky abstractions are more prominent when the underlying models are fundamentally different. That is not the case with JDBC/ODBC-like standards: the underlying model for these standards is common – SQL and the RDBMS.  The same goes for the XML API standards – the underlying model is the same.

If you talk about a standard API covering both Amazon’s EC2 and Rackspace’s cloud, then there is a good chance that abstractions would leak, as the underlying models are different.  Having such a common API is probably OK for vendors like RightScale (and may be required for their survival), but it is no good for a developer.

But if you talk about a standard API for cloud storage, especially for CRUD-like operations – that is very much needed. The underlying models are similar.

I say: this is the right time to start the standardization process. Even if the API addresses only 20% of the functionality, it doesn’t matter. Time and adoption will drive the rest. Let us get going in defining standard models (I didn’t say APIs) for specific cloud services such as cloud storage. Standard APIs will just follow.

Then I came across Tom Nolle’s post on “Multiple standards could spoil cloud computing”.  Interesting. I didn’t know that there are so many self-proclaimed standards bodies around for cloud computing. Clearly, you cannot have a single standard when there is more than one group working on standardization. As he says, it is worse than not having a single standard at all.  We have seen this in the past, where it didn’t succeed. One classic example that comes to mind is the spec on “Web Services Reliability”, where two self-proclaimed groups of vendors went about defining separate standards and ended up finally merging into one.