HTTP Conditional GET in APIs – A Forgotten Art?

The HTTP protocol has a cool feature called “Conditional GET”. Let us understand it with an example from the Twitter API.

Here is an API request to fetch the timeline of a Twitter user in JSON representation:

GET /1/statuses/home_timeline.json HTTP/1.1
Authorization: OAuth oauth_consumer_key="XXXXXXXXXXXXX",oauth_signature_method="HMAC-SHA1" …
Host: api.twitter.com
Connection: Keep-Alive

The response looks something like this:

HTTP/1.1 200 OK
ETag: "f50e33f5b45783a3cf81d3c76e50f065"-gzip
Content-Length: 26832
Expires: Tue, 31 Mar 1981 05:00:00 GMT
Last-Modified: Sat, 16 Apr 2011 18:29:15 GMT
Connection: close
Cache-Control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
Pragma: no-cache
Content-Type: application/json; charset=utf-8

[ { "favorited" : false, "text" : "Design Patterns – Progressive actions: http://wp.me/pEZOQ-3t #myblog http://wp.me/pEZOQ-3t", "retweet_count" : 0, "in_reply_to_screen_name" : null, "in_reply_to_status_id_str" : null, "place" : null, …

Assuming that the user's timeline has not changed, if I make this request repeatedly, I end up transferring 26832 bytes every time. The request could have been for any other Twitter resource, such as tweets, users or lists, that probably does not change very frequently. In fact, every service will most likely have certain resources that do not change very frequently. If a client already holds a representation of such a resource, downloading the same resource again is wasteful for the client, the network and the server. This is particularly important for clients on mobile devices, where network bandwidth is limited.

As the name implies, Conditional GET makes a GET method conditional. That is, the resource is fetched only if certain conditions are met. Let us see what these conditions are by retrying the above request with slight modifications:

GET /1/statuses/home_timeline.json HTTP/1.1
Authorization: OAuth oauth_consumer_key="XXXXXXXXXXXXX",oauth_signature_method="HMAC-SHA1" …
Host: api.twitter.com
Connection: Keep-Alive
If-None-Match: "f50e33f5b45783a3cf81d3c76e50f065"-gzip
If-Modified-Since: Sat, 16 Apr 2011 18:29:15 GMT

I made two changes this time. See the last two headers.
  • I took the “ETag” header value from the previous response and added it as the “If-None-Match” header in the new request.
  • I took the “Last-Modified” header value from the previous response and added it as the “If-Modified-Since” header in the new request.

The “If-Modified-Since” header tells the server to send the resource representation only if the resource has been modified since the date given in the header value.

ETags (entity tags) are opaque, server-provided values associated with a resource. ETags are useful as strong validators; that is, an ETag is expected to change whenever the resource is modified. A simple implementation could compute the ETag as a hash of the resource representation. Given that, a server can compare the resource's current ETag value with the one presented in the request to decide whether the client holds a stale representation.

By sending the previously received ETag value in the “If-None-Match” header, the client is telling the server to send the response body only if the ETags do not match.

Effectively, with these two new headers, the client is telling the server that it holds a copy of the resource and that it would like to receive a representation in the response only if the server determines that the client's copy is stale.

Here is what the new response would look like if the client's copy is still valid. The 304 status code indicates that the requested resource has not been modified and that the response contains no body.

HTTP/1.1 304 Not Modified
ETag: "f50e33f5b45783a3cf81d3c76e50f065"-gzip
Connection: close
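
The client side of this exchange can be sketched in a few lines of Python using the standard urllib module. This is only an illustration: the function names and the shape of the `cached` dictionary are my own assumptions, not part of any Twitter client. Note that urllib reports a 304 by raising HTTPError, so the "not modified" case is handled in the except branch.

```python
import urllib.error
import urllib.request


def conditional_headers(cached):
    """Build validator headers from a previously cached response (hypothetical dict)."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def fetch(url, cached=None):
    """Return a cache entry, reusing the cached one when the server answers 304."""
    cached = cached or {}
    request = urllib.request.Request(url, headers=conditional_headers(cached))
    try:
        with urllib.request.urlopen(request) as response:
            # 200 OK: remember the validators together with the new body.
            return {
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
                "body": response.read(),
            }
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: our copy is still valid
            return cached
        raise
```

A caller would keep the returned dictionary and pass it back as `cached` on the next call to `fetch`, so that only a changed timeline costs a full download.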

Conditional GET is supported by almost all RSS feed servers and RSS clients. When it comes to APIs, there seems to be complete ignorance of it. Looking at a Wireshark capture of the traffic from my Android mobile device, it looks like many apps do not make use of this feature even when it is obvious that they should.

While I have not tried the above requests against the real Twitter API, I am glad to see that it does support ETags. I assume it supports Conditional GET as well, though there is no mention of this in any of the Twitter API docs. Some quick googling indicates that this is the case with pretty much every other API as well.

With API virtualization and API management systems like Apigee, it is possible to implement this feature completely outside the API provider, without changing a single line of their code.
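
To make the idea concrete, here is a minimal sketch of such an outside layer, written as a Python WSGI wrapper rather than as an Apigee policy (I am not showing Apigee's actual mechanism). It assumes a hash of the response body is an acceptable ETag, it buffers the whole response, and it only does an exact match on If-None-Match; `conditional_get` is a name I made up.

```python
import hashlib


def conditional_get(app):
    """Wrap a WSGI app so that unchanged GET responses are answered with 304."""

    def middleware(environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold back the wrapped app's status line until we have seen the body.
            captured["status"] = status
            captured["headers"] = list(headers)

        body = b"".join(app(environ, capture))
        status, headers = captured["status"], captured["headers"]

        if environ.get("REQUEST_METHOD") == "GET" and status.startswith("200"):
            # ETag derived from the body: same bytes give the same tag.
            etag = '"%s"' % hashlib.sha256(body).hexdigest()
            headers = [(k, v) for k, v in headers if k.lower() != "etag"]
            headers.append(("ETag", etag))
            if environ.get("HTTP_IF_NONE_MATCH") == etag:
                # Client copy is current: send headers only, no body.
                headers = [(k, v) for k, v in headers if k.lower() != "content-length"]
                start_response("304 Not Modified", headers)
                return []

        start_response(status, headers)
        return [body]

    return middleware
```

A layer like this saves bandwidth but not backend work, since the wrapped API still produces the full response each time before the comparison is made.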

Design Patterns – Progressive actions

Performing an action progressively in increments is a generic design pattern that we can see in many contexts. The goals of the pattern vary depending on the context.

  • Reduce perceived latency
  • Improve the usability
  • Avoid the information clutter
  • Optimize resource usage (CPU, Memory etc)
  • On-demand resource usage (CPU, Memory etc)

Here are a few examples of this pattern:

Progressive Disclosure

This is a UI design pattern. The goals here are typically to avoid information clutter and improve usability. The general approach is to break up the content into smaller chunks/blocks, display one or two chunks to begin with, and show the other blocks progressively as required. The techniques to hide/show content and the events that trigger progressive display depend on the presentation technology.

The pattern not only makes the content easier to digest for novice users but also makes it possible for advanced users to explore further – in a way helping the transition from novice user to advanced user. A popular avoid-the-clutter example that you may be familiar with is the twitter.com site, which displays controls (Reply, Retweet etc) only when the mouse hovers over a tweet.

Some relevant posts that discuss this technique in detail:

Progressive Disclosure in User Interfaces

Wikipedia page

http://www.interaction-design.org/encyclopedia/progressive_disclosure.html

http://www.useit.com/alertbox/progressive-disclosure.html

Progressive Rendering

This is another UI design pattern. The goal is to reduce the perceived latency. Like the previous technique, the general approach is to break up the information into smaller chunks (if not already broken in the original content), process each chunk and display before proceeding with another chunk.

The browser is a popular example: the various pieces of a web page (HTML, JavaScript files, CSS stylesheets, images etc) are processed in an incremental and parallel fashion. The processing includes downloading the entity, parsing it and updating the DOM. The processed entities are then displayed without waiting for other non-dependent entities. You also see this pattern at work when a browser renders an image whose resolution improves progressively as the image data is downloaded.

maps.google.com is another example where the tiles comprising the map are separately downloaded, processed and displayed progressively. The end result is that you see parts of the map getting displayed instantly and the rest being filled in progressively.

Progressive Collection of Information

This is again a UI pattern, where the goal is to ask the user for information incrementally, as required by the context. The traditional approach of displaying a form with dozens of fields and expecting the user to fill in all of them in one shot runs the risk of the user losing interest and not providing the desired information.

A typical user registration form in many sites asks for the following information:

  • Email address
  • Mailing address
  • Payment Details
  • Areas of Interest

Instead, one can just ask for Email address to begin with. When the user navigates the site and purchases an item,  then ask for Payment details. If the user buys an item that needs to be shipped, ask for mailing address at that point.

If you think about it, this is no different from how it works in real life. When you go to a brick-and-mortar shop, you do not disclose any of these details up front. Only when you purchase an item are you asked for payment details.
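
The incremental prompting described in this section can be sketched as a lookup from the user's current action to the fields that action needs. The action names and field names below are hypothetical, chosen only to mirror the registration example above.

```python
# Hypothetical mapping from a user action to the profile fields it requires.
REQUIRED_FIELDS = {
    "register": ["email"],
    "purchase": ["email", "payment_details"],
    "ship_item": ["email", "payment_details", "mailing_address"],
}


def fields_to_ask(profile, action):
    """Return the fields still missing for this action, in a stable order."""
    return [field for field in REQUIRED_FIELDS[action] if not profile.get(field)]
```

A user who has registered with only an email address is asked for nothing more until a purchase, at which point `fields_to_ask` returns just the payment details.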

Progressive Processing

This is a software design pattern. The goal is to process a stream of information as it is downloaded, as opposed to waiting for the end of the stream before processing. In practice, with certain large or never-ending streams, it may not be possible to wait for the end of the stream, and this pattern is the only option left. Using this pattern not only helps reduce the perceived latency, it can also result in better use of resources (memory, CPU etc) in some cases.

A familiar example is XML parsing. In a server environment where the server handles multiple requests containing XML documents asynchronously, in a non-blocking manner, not all of the request data will be available in one shot, because of network delays and because the server is shared across clients.

If the requirement is to build a DOM of this XML document, a progressive XML parser fits the scenario better than a traditional non-progressive XML parser. A non-progressive parser would either block on the stream because the data is not available yet (most Java parsers fall into this category) or expect the entire document to be provided in one call. A progressive parser, on the other hand, accepts the input in chunks/increments, parses each chunk and then returns control to the caller. When more chunks of data become available, a progressive parser is able to continue parsing.
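
As a small illustration of the chunk-at-a-time contract, here is a sketch using `XMLPullParser` from Python's standard library, which accepts input through `feed()` and hands back whatever elements have been completed so far. The function name and the sample input are my own; this is not the parser discussed later in this post.

```python
import xml.etree.ElementTree as ET


def parse_in_chunks(chunks):
    """Feed XML text chunk by chunk; return element tags in completion order."""
    parser = ET.XMLPullParser(events=("end",))
    completed = []
    for chunk in chunks:
        parser.feed(chunk)  # consumes what is available and returns to the caller
        for _event, elem in parser.read_events():
            completed.append(elem.tag)  # this element is fully parsed
    parser.close()
    # Pick up anything the parser reported only at the end of the input.
    completed.extend(elem.tag for _event, elem in parser.read_events())
    return completed
```

The chunk boundaries can fall anywhere, even in the middle of a tag, which is exactly the situation a non-blocking server faces when request data arrives in pieces.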

If the requirement is instead to use a SAX parser to search for and extract certain data from the stream, a progressive XML parser would not only reduce the latency, it would also reduce memory usage (one can discard intermediate data structures and previously received chunks of stream data).
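
The SAX variant can be sketched with Python's standard `xml.sax` module, whose default parser can be fed incrementally. The element name `price` and the class and function names are assumptions for the sake of the example; only the extracted values are kept, and each chunk can be dropped once it has been fed.

```python
import xml.sax


class PriceCollector(xml.sax.ContentHandler):
    """Keeps only the text of <price> elements; everything else is discarded."""

    def __init__(self):
        super().__init__()
        self.prices = []
        self._in_price = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "price":
            self._in_price = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, especially across chunk boundaries.
        if self._in_price:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "price":
            self.prices.append(float("".join(self._buffer)))
            self._in_price = False


def extract_prices(chunks):
    """Feed chunks to an incremental SAX parser and return the prices found."""
    handler = PriceCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return handler.prices
```

Because the handler holds nothing but the current text buffer and the results, memory use stays flat no matter how long the stream is.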

In fact, the only publicly available commercial/open source C language XML parser that I know to be capable of progressive parsing is libxml. At apigee.com, my previous company, we built a couple of progressive XML parsers (one based on lex/yacc and the other on a hand-coded state machine) that could not only do wire-speed XML parsing but also handle never-ending XML streams (used in the financial world).

Use of this pattern influences the API contract of the component using the pattern. In the example of the XML parser above, the progressive parser would likely have appropriate methods to accept chunks of input and be called repeatedly until the last chunk.

Progressive building of data structures

This is a design pattern typically used in software design. The goal is to delay resource usage until the point where it is actually required.

An example of this is again popular in the XML parser world: a delayed (deferred) DOM parser. Apache Xerces implements this pattern. In the delayed DOM building mode, when the Xerces parser is given an input XML document, it does not materialize all the DOM nodes at the beginning. Instead, only the document and probably the root node are materialized. As methods are called on the root node, subsequent nodes are materialized.
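
The general idea can be sketched in Python with a node class that keeps a cheap description of its children and builds the child objects only on first access. This is an illustration of the pattern, not how Xerces implements it; `LazyNode` and the nested-tuple input format are invented for the example.

```python
class LazyNode:
    """Tree node whose children are materialized only when first requested."""

    created = 0  # number of nodes materialized so far, kept for illustration

    def __init__(self, name, child_specs=()):
        self.name = name
        self._child_specs = child_specs  # raw (name, child_specs) tuples
        self._children = None            # not built yet
        LazyNode.created += 1

    @property
    def children(self):
        if self._children is None:
            # First access: build this level only, leaving deeper levels unbuilt.
            self._children = [LazyNode(n, specs) for n, specs in self._child_specs]
        return self._children
```

Creating the root from a large description costs one node; each further level is paid for only when the caller actually navigates into it.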

Some badly needed improvements to Twitter conversations

Of late, I have been spending a good amount of time on Twitter. I use the desktop version of the Yoono client to unify all my social streams (Twitter, Facebook, LinkedIn, IM) into one stream and one interface. On my mobile, I use TweetDeck, which provides similar functionality.

This generally works out fine as long as you have only a small number of followers with similar interests. But you start realizing the weakness of this interface as your interests vary and the followers fall into non-overlapping or completely disjoint groups.

For example, I have grouped the people that I follow into multiple lists (some public and some private):

SOA, APIs – Folks who talk about SOA, API-related standards and technologies (REST, OAuth etc), service virtualization, API management systems (@apigee, @mashery etc) and so on. This area is close to my heart. I spent the past 5.5 years building products at Apigee (Sonoa Systems).

Identity – Bloggers/experts working in the “Identity” area. I have started spending some time on cloud identity and keenly follow what is happening in this area. I give this list top priority for my attention.

BigData – People related to big data (NoSql, Hadoop, Machine Learning etc )

Cloud –  Folks  talking about various aspects of public/private cloud.

Cisco – A few Cisco folks that I started following recently. Very high bandwidth chit-chat, mainly among @Beaker, @reilyusa, @swardley and @jamesurquhart.

Tech – General tech related (mobile, social, java, dynamic languages etc). Also includes tech news from the likes of techmeme.com

Friends – A few friends and acquaintances. The talk here is not related to any particular area. It could sometimes be about cricket and the IPL, and sometimes about the Lokpal bill, corruption, crooked politics etc. Very different topics, depending on what is happening in India in general.

Startup – Startup advice, folks from interesting startups etc.

Everything Else – All others that do not fall into the above categories.

If I analyze my Twitter stream and how I read it, I see a few problems:

Missing notion of a conversation

Identifying and following an individual conversation or thread of discussion is very difficult on Twitter. There is no fundamental notion of a conversation in Twitter. Email clients, for example, make use of either the subject or the Message-Id, In-Reply-To and References headers to keep track of conversations. Twitter does support an optional field, in_reply_to_status_id, to refer to the tweet that you are replying to, but I wonder if any Twitter client makes use of it (either when replying or when displaying the tweets).

Imagine a mailing list or a discussion forum where posts from all topics are forced into one virtual topic. This is what Twitter does to conversations, taking us several decades back in usability. I think this problem generally goes unnoticed when only two people are tweeting and replying to each other. But the moment a third person joins in, or several other folks are following them, it becomes unwieldy.
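
For illustration, a client could rebuild threads from in_reply_to_status_id roughly as follows. This is a sketch in Python that assumes each tweet is a dictionary with `id` and `in_reply_to_status_id` keys (the field names come from the Twitter API; the function name is made up), and it treats a reply whose parent is not in the stream as the start of its own thread.

```python
from collections import defaultdict


def group_conversations(tweets):
    """Group tweet ids by the root tweet of their reply chain."""
    by_id = {tweet["id"]: tweet for tweet in tweets}

    def root_of(tweet):
        seen = set()
        # Walk up the reply chain while the parent is present in our stream.
        while True:
            parent_id = tweet.get("in_reply_to_status_id")
            if parent_id is None or parent_id not in by_id or parent_id in seen:
                return tweet["id"]
            seen.add(tweet["id"])
            tweet = by_id[parent_id]

    threads = defaultdict(list)
    for tweet in sorted(tweets, key=lambda t: t["id"]):
        threads[root_of(tweet)].append(tweet["id"])
    return dict(threads)
```

A client could then render each group as one collapsible thread instead of interleaving the tweets by time.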

Mix up of conversations

Many tweets from different folks are not related to each other. In some of my lists, like “cloud”, “identity” and “Friends”, I see a lot of conversations. But these separate groups of conversations, when mixed into one universal stream, become seriously hard to make sense of. As I go through the Twitter stream from the bottom (oldest) to the top (latest), I am forced to switch context from one conversation to another.

Incomplete conversations

Assume users @a, @b and @c are exchanging tweets using either “ReplyTo” or “Edit & Retweet”. Say you are following users @a and @b but not @c. As a result, the replies and edited retweets of @c are not visible in your stream. You get to know about the missing tweets of @c only when @a or @b reply to them.

It would be nice to have an option to detect this and include in my stream the tweets of people that I do not follow, provided they are part of the conversation. As I think about this, it shouldn't be hard to achieve even without the support of in_reply_to_status_id.

Grouping of  followers

While the notion of Twitter “lists” is useful for others to find a ready-made list of people to follow, no Twitter client seems to make use of it to enhance the user interface. In addition to lists, it would be nice to group the people I follow into separate streams so that I can easily tell high-quality chit-chat from high-bandwidth chit-chat and give each the appropriate attention.

Read/Unread status of Tweets

(Update 6th June 2011 – See the feature announcement from tweetmarks.net for last-read synching. Nice!!)

I mentioned this in the context of another post. I follow Twitter streams primarily on two devices today (laptop and mobile), and very soon I might add a tablet to the list. I do check tweets once in a while on the TV (Samsung Internet@TV), but this is not very convenient at this point: I have put my twitter/facebook ticker project for Samsung TV on hold halfway through, and until I complete it, the current TV app is not convenient to use.

Basically, the problem is that the “read” status of tweets is not maintained as I switch devices. As a result, you are forced to re-read tweets or scroll down and find a point to skip to. A big pain if you have a lot of tweets in your stream.

Unlike the other problems mentioned above, this cannot be solved by Twitter clients alone. It needs support from the Twitter platform. The “read” status cannot be an attribute of individual tweets. Instead, it has to be an attribute associated with a user-id/status_id combination.
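
As a rough sketch of what such platform-side state could look like, here is a Python class that records read status per (user_id, status_id) pair, so that any device asking about the same user gets the same answer. The class and method names are hypothetical; no such Twitter API exists as far as I know.

```python
class ReadMarkers:
    """Read state keyed by (user_id, status_id), shared by all of a user's devices."""

    def __init__(self):
        self._read = set()

    def mark_read(self, user_id, status_id):
        self._read.add((user_id, status_id))

    def unread(self, user_id, timeline_ids):
        """Return the ids in this user's timeline that the user has not read yet."""
        return [s for s in timeline_ids if (user_id, s) not in self._read]
```

The same tweet can be read for one user and unread for another, which is why the status cannot live on the tweet itself.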

While I have tried a few other Twitter clients, I have never explored the official Twitter client (other than the web version). Since the switch to the new twitter.com UI, I do not remember seeing any new feature that Twitter has added. If Twitter is discouraging others from innovating on Twitter clients, I hope it at least does some innovation itself.

Cloud Services & Uninterrupted User Experience

Continuing on the previous post, but in the context of cloud services…

Here are some simple and everyday examples of what I mean by uninterrupted user experience:

  • Go to any web based email service like gmail.com. Say you have 10 unread emails in your inbox. You read 2 of them on your laptop and log out. You go to the office, open gmail.com again and now you see 8 unread emails (assuming that no new emails came in). A no-brainer. However obvious, there is a continuity of experience. You would feel odd if you saw 10 unread emails again.

  • Extending this to another obvious example: Google Reader. No matter how many different devices you use, the status of read articles is maintained. Note that some of these devices may be accessing the service via the Google Reader API and some may be accessing the web site directly. I personally use Google Reader from at least 3 devices (laptop, mobile, office desktop) on a daily basis.

  • Look at the Amazon Kindle device and the Amazon Kindle app. This is a different and slightly more sophisticated example. Unlike the previous services, Amazon Kindle controls the user experience. If you do not own a Kindle device, Amazon also provides a standalone application for all platforms including Linux, Windows, Mac, Windows Mobile and Android Mobile. The cool thing about this service is that it remembers the page you were last reading and opens it for you the next time you open the app. It doesn't matter if you were last reading a book on your mobile and have now opened the same book on your laptop. It just works. It is so seamless you don't even realize it.

  • Here is a futuristic example. Say I am sitting in the backseat of a car (someone else is driving) and watching a movie streamed to my mobile over 4G (or 5G?). (If you are not a movie buff, replace it with a live streamcast of your favorite rock star.) I reach home. Since my WiFi is always on at home, the movie streaming service detects this and switches to WiFi for better bandwidth. I pause the movie, grab a coffee and relax in front of the big screen home theater system. Once I switch on the TV and select the movie streaming app, it continues from where I left off.

I am sure you see several more examples of this seamless and uninterrupted user experience with cloud services. It is natural to think of and to implement with cloud services. Yet Twitter, one of the biggest cloud services, fails to do this with tweets. Yeah, did you notice that there is no “read” or “unread” status for tweets? Yes, it sucks big time as I switch between mobile and laptop several times a day.

To achieve this uninterrupted user experience, one needs to address it at various levels:

Infrastructure(Network, Location etc) level – The movie streaming example.

API level – Google Reader and Gmail example. Once you have identified the key elements of the user experience, you need to expose the right kind of API so that applications can leverage it.

User Interface level – Amazon Kindle example. Sometimes you need to control the whole user interface aspect to provide the seamless experience. Another interesting example in this category could be cloud based IDE.  If there is ever one such IDE, I would like it to behave exactly like my Eclipse or Visual Studio (with all the files and windows open in the same position) no matter which device I use.

Platform coverage level – With the multitude of connected devices one carries these days, one cannot hope to achieve this uninterrupted user experience by sticking to a single platform/device. And if you combine this with the user interface level, maybe device-specific apps are the way to go. Long live apps.

Instant Messaging with Roaming?

We all  use instant messaging (IM) on our desktops and laptops in our everyday life.  And I am sure some of you may use IM on mobile too.

Today, I was on an IM session where I was chatting with a friend in low throughput mode and not with full attention. I was reading an article, monitoring the twitter/facebook stream on Yoono and responding to IM messages with Adium (Yes, I am on Mac these days, but don’t mistake me for an apple fan boy).  My wife walks in, and asks me to get something quickly for her from a nearby provision store.  That would mean, either I have to sign off from IM or tell my friend to wait up for 15 minutes or so.

But what if it is possible to indicate to the IM provider that:

  • I am going to sign into my mobile and the conversation should continue using mobile IM client (esp, in campuses like Cisco you could move around all the blocks, buildings and cafeterias – pretty much still connected over WiFi)
  • I am going to use the SMS on the mobile as I may be moving into a no network connectivity zone

And do the reverse, when you are back to the laptop/desktop.

And all this should be possible without the user on the other end noticing it – much like your mobile phone.
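
To show what I have in mind, here is a toy sketch in Python of a provider-side router that delivers each message to whichever endpoint the recipient last selected. Everything here is hypothetical (the class, the endpoint names, the in-memory log standing in for real transports); it is not how any existing IM service works.

```python
class RoamingRouter:
    """Delivers each message to the endpoint the recipient last switched to."""

    def __init__(self):
        self._active = {}    # user -> endpoint name, e.g. "desktop", "mobile", "sms"
        self.delivered = []  # (recipient, endpoint, text) log in place of real transports

    def switch(self, user, endpoint):
        """Called by the user's own client when moving to another device."""
        self._active[user] = endpoint

    def send(self, sender, recipient, text):
        endpoint = self._active.get(recipient, "desktop")
        self.delivered.append((recipient, endpoint, text))
        # The sender gets the same acknowledgement whatever the endpoint is.
        return "sent"
```

The point of the design is that `send` returns the same thing regardless of where the message lands, so the person on the other end never sees the switch.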

Is it possible to do this with current IM clients today? Yahoo IM allows you to sign in on your mobile and communicate using SMS, but the other party sees the change in your status. Some IM platforms (especially Google Talk related ones) allow you to sign in using multiple clients, but I noticed that messages sometimes go to one client and sometimes to the other.