Gmail for Mobile HTML5 Series : Cache Pattern For Offline HTML5 Web
Applications
On April 7th, Google
launched a new version of Gmail for mobile for iPhone and Android-powered devices. We shared
the behind-the-scenes story through this blog and decided to share more of our learnings in a
brief series of follow-up blog posts. This week, I'll talk about the cache pattern for
building offline-capable web applications.
I recently gave a talk
(preserved YouTube here)
about the cache pattern and the Web
Storage Portability Layer (WSPL) at Google I/O. It was exciting getting to give a
talk at the Moscone Center as previously I had only ever been one of the audience members. The
conference seemed to go by in a blur for me as I was sleep-deprived from getting the WSPL to
"just good enough" to actually be released. (And some ofyou have already pointed out that I
missed several bugs.) In my talk, I provided a general overview of the cache pattern and this
post expands on the handling of hit determination and merging server and local changes.
The cache pattern is a design pattern for building an offline-capable
web application. We implemented the cache pattern to make Gmail for Mobile tolerant of flaky
wireless connections but the approach is generally applicable. Here's how it works. Consider a
typical AJAX application. As shown in the diagram, we have a web application with a local
model, view and controllers. The user interacts with theapplication and the controller
dispatches XmlHttpRequests (XHRs for short) to the server. The server sends asynchronous
requests to the application which it inserts into the model.
As shown in this next diagram, in the cache pattern, we
insert a cache between the application and the server. Having done so, many requests that
would otherwise require a round-trip to the network.
A software cache like this one shares a great deal
conceptually with hardware caches. When designing the cache used in Gmail for mobile, we used
this similarity to guide our design. For example, to keep our cache as simple as possible, we
implemented a software equivalent to a write-through cache with early forwarding and LRU
eviction. The cache pattern in general (and consequently our implementation) has four
important data flows as shown in the diagram.
- Cached content bound for the
UI.
- Changes made to the cache by the user in the UI. These need to be both
reliably sent to the server and updated locally in the cache so that reads from the cache for
UI updates show the state including user changes.
- The changes recorded in
the cache need to be sent upstream to the server as the network connection is
available.
- Changes made to the server (like email delivery in the case of
Gmail) need to be merged into the contents of the
cache.
As shown in the diagram we also need a
place to actually write the data. We use the
WSPL
library to write a cache implementation portable across both Gears and HTML5 databases.
To actually implement these four data flows, we need to decide on a hit
determination mechanism, a coherency strategy and a refresh approach.
Hit Determination
At
its heart, a cache is a mapping from keys to values: the UI invokes the cache with a key and
the cache returns the corresponding element. While this sounds pretty simple, there is an
additional source of complexity if the application wants to provide the user with summary
listings of some subset of all values available from the server. To provide this feature, the
cache needs to contain not only "real" data values but additional "index" values that list the
keys (and possibly user-visible summaries) for "data" values. For example, in Gmail for
mobile, the cache stores conversations as its "real" data values and lists of conversations
(such as the Inbox in Gmail for Mobile) as its "index" values. Keys for index values are
computed specially to record what subset of the complete index is cached locally. For example,
in Gmail for Mobile, while a user's Inbox may contain thousands of conversations, the cache
might contain an index entry whose data values lists metadata for only conversations 1000
through 1100. Consequently, Gmail for Mobile's cache extends keys with the cached range so
that a request for metadata for conversations 1101 through1110 would be considered a cache
miss.
Coherency and Refresh
Perhaps the most complex
aspect of the cache implementation is deciding how to get updated content from the server and
how to merge server updates with changes made locally. A traditional hardware cache resolves
this problem by only letting one processor modify its a cache at a time and have the memory
broadcast any changes to all the other caches in the system. This approach cannot work here
because the Gmail server can't connect to all of its clients and update their state. Instead,
the approach we took for Gmail for Mobile was for the client device regularly poll the server
for alterations.
Polling the server for
changes such as new email or the archiving of email by the same user from a different device
implies a mechanism for merging local changes with server side changes. As mentioned above,
Gmail for Mobile is a write-through cache. By keeping all of the modifications to the cache in
a separate queue until they have been acknowledged, they can be played back against updates
delivered from the server so that the cache contains the merge of changes from the server and
the local user. The following diagram shows the basic idea:
The green box in the diagram shows the contents of the cache's
write buffer changing over time and the cloud corresponds to the requests in-flight to the
server with time advancing from left to right in the diagram. The function names shown in the
diagram are from the simplenotes.js
example file in the Web Storage Portability
Layer distribution. Here, the user has applied some change [1] and the cache has written it to
the write buffer and has then requested new content resulting in query [Q]. The cache prefixes
the outstanding actions from the write buffer to the query. Action [1] is marked as needing a
resend on some sort of network failure.
Later, the user makes change
[2] to the UI which causes the cache to append it to the write buffer in the applyUIChange
call. Later still, another query is made and so, the cache sends [1][2][Q] to the server. In
the mean time, the user makes yet another change [3]. This is written to the write buffer.
Once changes [1] and [2] are acknowledged by the server along with the new cache contents for
query [Q], changes [1] and [2] are removed from the write buffer. However, to keep the cache's
state reflecting the user's changes, change [3] is applied (again) over top of the result for
[Q].
Simplifying the implementation of this reapplication stage is the
most important benefit of implementing a write-through cache. By separating the changes from
the state, it becomes much easier to reapply the changes to the cache once the server has
delivered new content to the cache. As discussed in a
previous post, the use of
SQL triggers can greatly improve database performance. Whether updating or re-updating,
triggers are a great way to make the application of changes to the cache much more
efficient.
Cached Content To the UI
The first of the four data flows is delivering content
to the UI is reasonably easy: query the cache for the desired content and when the query
completes, forward the request to the UI. If you look at the
getNoteList_ function from
the simplenotes.js example
code included in the WSPL distribution, you'll see that the delivering cached content to the
UI has the following basic steps:perform hit
determination: deciding if the requested cache contents are actually in the
cache.
- create a database transaction, and while in
the transaction
- query the database for the desired
key
- accumulate the results
- then outside of
the transaction, return the result to the UI.
Changes From The UI
The second flow (
applyUiChange) is recording changes made by the user to the
write buffer. It has a very similar
structure- create a database transaction,
and while in the transacation
- write the change to the write
buffer
- wait for a trigger to update the state of the
cache.
Updates Bound For The
Server
As discussed above, once the changes have been written to the write
buffer, they still have to be sent to the server. This happens by prepending them to queries
bound for the server. The fetchFromServer from the example is responsible for this. As might be familiar by
now, the flow is
- create a
database transaction and while in the transaction
- query the
write buffer for all the entries that need to be sent to the
server
- accumulate the entries
- then outside
the transaction, send the combination of changes and query to the
server
Changes From The Server
Finally, we need to merge
the changes from the server into the cache as is done in the insertUpdate
method from the example. Here the flow is as
follows:
- create a database transaction
and while in the transaction
- update the
directory
- write the new content into the cache
- touch
the changes in the write buffer that need to be re-applied to the
cache
- wait for the trigger to complete its
update
- then, outside of the transaction, send the response to
the UI if it was satisfying a cache miss.
That's a
brief intro to the cache architecture as found in Gmail for mobile. We're continuing to improve our implementation
of this basic architecture to improve both the performance and robustness of Gmail for mobile. Please stay
tuned for follow on blog posts.
Previous posts from Gmail for
Mobile HTML5 SeriesHTML5 and Webkit pave the way for mobile web applications
Using AppCache to launch offline - Part 1
Using AppCache to launch offline - Part 2
Using AppCache to launch offline - Part 3
A Common API for Web Storage
Suggestions for better performance
By Robert Kroeger, Software Engineer, Google Mobile
Team