How Path engineers hacked their way to a fast, scalable messaging service

Building an application that delivers data in real time can be complicated, and that is certainly the case for real-time, cross-platform mobile messaging applications. Apple and Facebook have figured out how to do it well, but they’re big companies. Now it’s clear that Path also put a good deal of work into developing its messaging capability, which the social networking company made available to users in its Path 3 release in March.

In a blog post the company was expecting to publish on Monday, Neil Chintomby, a senior server engineer at Path, explains the architecture of the messaging feature and the process of putting it together.

Engineers customized the open-source ejabberd instant-messaging server software, one of many XMPP servers, for Path’s own purposes, Chintomby wrote.

They decided not to use the database that comes with ejabberd out of the box. Instead, they built an application programming interface (API) service to act as a tunnel between ejabberd and MongoDB, a NoSQL database that has become quite popular but has had scaling issues. In the blog post Chintomby justifies the setup:

“We had three main reasons for implementing this API service. First, we wanted to decouple the ejabberd servers from the datastore, giving us a layer of technical flexibility. We had learned from previous experience that having servers talking directly to MongoDB led to difficulties in scaling so we wanted to avoid those pitfalls when bringing up messaging. The second reason was for the API service to provide a layer of caching between ejabberd and MongoDB and give us more consistent timings when fetching data. The third reason was because we could cache many more items for a node with our own API service, instead of relying on ejabberd with mnesia which limits the number of items maintained for each node. Although this limit is configurable, being able to control our caching layer was likely to be less problematic.”

The result of the decision is that when a user hits the send button, ejabberd receives the message and passes it to the API service, which saves it in the database and caches it. Ejabberd then routes the message to all the intended recipients’ connected devices almost immediately. If a device is offline, a push notification is delivered.
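As a rough illustration of that flow, here is a minimal Python sketch. It is not Path’s code: the `MessageApi` class, the `send_message` function and the plain dictionaries standing in for MongoDB and the cache are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MessageApi:
    """Hypothetical stand-in for the API service between ejabberd and the datastore."""
    datastore: dict = field(default_factory=dict)  # stands in for MongoDB
    cache: dict = field(default_factory=dict)      # stands in for the caching layer

    def save(self, node, message):
        # Persist the message first, then keep the cache in step with the datastore.
        self.datastore.setdefault(node, []).append(message)
        if node in self.cache:
            self.cache[node].append(message)
        else:
            self.cache[node] = list(self.datastore[node])

    def recent(self, node):
        # Serve from the cache when possible; otherwise load from the datastore once.
        if node not in self.cache:
            self.cache[node] = list(self.datastore.get(node, []))
        return self.cache[node]


def send_message(api, node, message, recipients, online):
    """Save and cache a message, then split recipients by connection state.

    Returns (delivered, pushed): devices that receive the message directly and
    devices that are offline and would get a push notification.
    """
    api.save(node, message)
    delivered = [device for device in recipients if device in online]
    pushed = [device for device in recipients if device not in online]
    return delivered, pushed
```

Calling `send_message(api, "chat-1", "hi", ["alice-phone", "bob-phone"], online={"alice-phone"})` would route the message to the connected device and mark the other one for a push notification, while later reads of `"chat-1"` come from the cache rather than the datastore.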

The team cut lag time by making sure a device only requests content that is actually new, and isn’t also asking for conversations that haven’t changed in a while:

“We added a new last modified attribute to each subscription in the subscriptions response, indicating when an item was last published to that node, which the client then stores locally. The next time the client asks for subscriptions, if a last published timestamp for a node is newer than the timestamp that the client has, then the client will request new items from that node, which is another request where we can use the ‘since’ attribute that we added to get items requests. This way we optimized the client to only make get items requests for nodes that actually have new content and reduced get items calls to the server.”
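The idea in that passage can be sketched in a few lines of Python. This is an assumption-laden illustration, not Path’s client or server code: the function names, the integer timestamps and the dictionary shapes are hypothetical.

```python
def nodes_to_fetch(subscriptions, local_timestamps):
    """Pick the nodes that have new content since the client last looked.

    subscriptions maps node -> last-published timestamp from the server.
    local_timestamps maps node -> the timestamp the client stored earlier.
    Returns node -> 'since' value to send with the get-items request.
    """
    stale = {}
    for node, last_published in subscriptions.items():
        seen = local_timestamps.get(node, 0)  # 0 means the node was never fetched
        if last_published > seen:
            stale[node] = seen
    return stale


def get_items(server_items, node, since):
    """Hypothetical server side: return only items published after 'since'."""
    return [item for timestamp, item in server_items.get(node, []) if timestamp > since]
```

With subscriptions of `{"a": 100, "b": 50}` and stored timestamps of `{"a": 90, "b": 50}`, only node `"a"` is requested, and the server returns just the items published after 90.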

The engineers also devised a method for delivering data-heavy photos without holding up the messages that follow.

Source: Path

Rather than try to move around a full picture with filters and other edits, the app makes a small thumbnail in color and another in grayscale. The lightweight grayscale thumbnail is transmitted to the recipient first, as a sort of provisional file. Then the full file is processed server-side, and finally it is beamed down to the recipient. An early iteration of sending photo messages took 4 seconds on average to show up in a sender’s conversation on a certain iPhone, but engineers brought the average time down to 250 milliseconds on the same device, Chintomby wrote.
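A simplified Python sketch of the placeholder-then-replace pattern follows. Everything in it is an assumption made for illustration: the `Conversation` class, the state labels and the luminance formula (standard ITU-R BT.601 weights) are not taken from Path’s post.

```python
def to_grayscale(pixels):
    """Collapse (r, g, b) pixels to one luminance value each (BT.601 weights)."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]


class Conversation:
    """Hypothetical recipient-side view of photo messages."""

    def __init__(self):
        self.photos = {}

    def receive_placeholder(self, photo_id, gray_thumbnail):
        # Show the lightweight thumbnail right away as a provisional image.
        self.photos[photo_id] = {"state": "provisional", "data": gray_thumbnail}

    def receive_full(self, photo_id, full_image):
        # Swap in the processed photo once the server has finished with it.
        self.photos[photo_id] = {"state": "final", "data": full_image}
```

A grayscale thumbnail carries one value per pixel instead of three, which is one reason a placeholder like this can be cheaper to send than a color image of the same size.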

These measures and others are designed to make sure the private messaging part of Path keeps working even as the service takes on more and more users. “We had to design the system in such a way that when we have potentially millions of users using it, it would still be performant,” Chintomby told me in a phone interview. The service has passed the 12 million registered user mark, so it makes sense for the company to plan for keeping the experience fast as it continues to grow.

This post was updated at 1:15 p.m. PT to include Path’s most recent user count.