By Deepanshu
In Part-1 and Part-2 of the Courier series, we spoke about our rationale behind building Courier and the quest for a message broker for Courier. Here’s a quick recap:
- We had decided to replace HTTP polling with a long-running persistent connection for sending order updates to our merchants.
- This was done due to multiple issues with polling like high data & battery consumption, non-deterministic behavior, and high infra usage.
- We explored multiple protocols for the long-running connection, including gRPC, WebSockets, and MQTT, and finalized MQTT over TCP.
- A few major reasons for choosing MQTT were its small network footprint, multiple QoS (Quality of Service) levels, and built-in acknowledgment.
- The MQTT broker chosen was VerneMQ which is an open-source, performant, and highly available message broker implementation.
Courier is our in-house solution to provide a realtime, lightweight, and highly efficient messaging highway between mobile apps and server using MQTT.
When we started designing the Courier library for our merchant app, we had a few questions in mind:
- Which library should we use for MQTT client implementation? Or should we create our own library?
- How to solve the Android platform-specific issues?
Let me answer each question individually.
Should we create our own library or use an existing one?
Actually, we did a combination of both. We used an existing library for MQTT protocol implementation and created our own library on top of it.
The library we used for the MQTT implementation is Paho, a very popular open-source library written in Java and maintained by Eclipse.
According to the documentation, the Paho Java client is "an MQTT client library written in Java for developing applications that run on the JVM or other Java compatible platforms such as Android."
Why, then, did we create a new library on top of Paho? There were multiple reasons for doing this.
Paho is a simple MQTT protocol implementation that provides two types of APIs: blocking and non-blocking. It handles all the protocol-specific logic like creating an MQTT connection, subscribing to and unsubscribing from topics, publishing and receiving messages, and disconnecting an existing connection. But on its own it is not the full-fledged library needed for maintaining a long-running persistent connection.
Now let’s have a look at the features which the Courier Android library provides.
Courier library features
Clean API
Courier library provides a Retrofit-inspired interface for Send, Receive, Subscribe and Unsubscribe operations.
For connect and disconnect API, Courier library provides a separate interface — MqttClient.
Message adapters and stream adapters support
Similar to Retrofit, the Courier library supports adding custom message and stream adapters.
By default, the library supports Gson, Moshi, and Protobuf message adapters, and RxJava2 and Coroutines (Flow) stream adapters.
Automatic Reconnect
This is one of the most important features of the Courier library. Whenever the MQTT connection is disconnected due to non-user-initiated actions like network loss or socket read timeout, the library tries to reconnect automatically. The exact retry logic is decided by retry policies, which we will discuss soon.
Subscription store & Automatic Resubscribe
Subscription store keeps track of the subscribed topics and pending unsubscribes.
Subscribed topics are used for automatic resubscription when the client reconnects on a non-persistent connection (cleansession=true), since the broker does not retain subscriptions across such sessions.
Pending unsubscribe topics are maintained to avoid stale subscriptions on the MQTT broker for persistent connections (cleansession=false).
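To make the bookkeeping above concrete, here is a minimal in-memory sketch of such a store. The class and method names (SubscriptionStore, onSubscribed, topicsToResubscribe, and so on) are hypothetical and are not Courier's actual API; a real implementation would presumably also persist this state.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Courier's real class: tracks active subscriptions
// and unsubscribe requests that the broker has not acknowledged yet.
final class SubscriptionStore {
    private final Map<String, Integer> subscribed = new LinkedHashMap<>(); // topic -> QoS
    private final Set<String> pendingUnsubscribes = new LinkedHashSet<>();

    synchronized void onSubscribed(String topic, int qos) {
        pendingUnsubscribes.remove(topic);
        subscribed.put(topic, qos);
    }

    synchronized void onUnsubscribeRequested(String topic) {
        subscribed.remove(topic);
        pendingUnsubscribes.add(topic); // kept until the broker confirms
    }

    synchronized void onUnsubscribeAcked(String topic) {
        pendingUnsubscribes.remove(topic);
    }

    // With a clean session the broker forgets subscriptions on disconnect,
    // so everything must be subscribed again after a reconnect.
    synchronized Map<String, Integer> topicsToResubscribe(boolean cleanSession) {
        if (!cleanSession) {
            return Collections.emptyMap();
        }
        return new LinkedHashMap<>(subscribed);
    }

    // With a persistent session, unacknowledged unsubscribes are replayed
    // after a reconnect so the broker does not keep stale subscriptions.
    synchronized Set<String> pendingUnsubscribes() {
        return new LinkedHashSet<>(pendingUnsubscribes);
    }
}
```

After a reconnect, a client built on this sketch would call topicsToResubscribe(cleanSession) and pendingUnsubscribes() and replay both sets against the broker.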
Retry & Fallback policies
Courier library provides different types of policies: connect retry policy, subscription retry policy, and host fallback policy. These policies decide what action should be performed when a failure occurs and how long to wait before a retry.
Connect retry policy decides whether the connection should be retried when a connection failure happens, and when the next retry should take place.
Subscription retry policy decides whether the subscription should be retried when a subscription request fails, and when the next retry should take place.
Host fallback policy decides which host to try next when multiple hosts are available and the connection fails on one of them.
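As an illustration of what such policies can look like, here is a small sketch of a retry policy with capped exponential backoff and a round-robin host fallback. The interface and class names are hypothetical, and exponential backoff and round-robin are assumptions chosen for the example; the actual strategies used by Courier are not described here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical policy contract: the same shape can serve connect retries
// and subscription retries.
interface RetryPolicy {
    boolean shouldRetry(int attempt);   // attempt is 1-based
    long nextDelayMillis(int attempt);
}

// Assumed strategy: the delay doubles on every attempt, up to a ceiling.
final class ExponentialBackoffRetryPolicy implements RetryPolicy {
    private final int maxAttempts;
    private final long baseDelayMillis;
    private final long maxDelayMillis;

    ExponentialBackoffRetryPolicy(int maxAttempts, long baseDelayMillis, long maxDelayMillis) {
        this.maxAttempts = maxAttempts;
        this.baseDelayMillis = baseDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    @Override
    public boolean shouldRetry(int attempt) {
        return attempt <= maxAttempts;
    }

    @Override
    public long nextDelayMillis(int attempt) {
        // Bound the shift so the multiplication cannot overflow for sane base delays.
        int shift = Math.min(Math.max(attempt - 1, 0), 20);
        return Math.min(baseDelayMillis << shift, maxDelayMillis);
    }
}

// Assumed strategy: move to the next configured host after each failure.
final class RoundRobinHostFallbackPolicy {
    private final List<String> hosts;
    private int index = 0;

    RoundRobinHostFallbackPolicy(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<>(hosts);
    }

    String currentHost() {
        return hosts.get(index);
    }

    String onConnectionFailed() {
        index = (index + 1) % hosts.size();
        return hosts.get(index);
    }
}
```

Keeping the decision ("retry or give up, and after how long") behind an interface lets an app swap strategies without touching the connection logic.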
AlarmPingSender
Paho provides a PingSender abstraction, which is used for sending ping requests (PINGREQ packets) over the MQTT connection in order to keep the connection alive as long as possible.
Courier library makes use of two implementations of PingSender: AlarmPingSender and TimerPingSender.
AlarmPingSender supports ELAPSED_REALTIME_WAKEUP, RTC_WAKEUP, ELAPSED_REALTIME, and RTC alarms for scheduling pings.
The library uses RTC_WAKEUP, which also works in doze mode and other power-saving modes (with some restrictions). This helps in keeping the connection alive when the app is in the background. This is also discussed in detail later.
TimerPingSender is used for cases where the connection does not need to stay alive in the background. It uses Timer for scheduling pings.
Adaptive Keepalive
Adaptive keepalive is a feature in the Courier library which tries to find the optimal keepalive interval for a client on a particular network. This helps in reducing the number of ping requests sent over the network while still keeping the connection alive. This will be discussed in greater detail in a separate blog.
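Until that post, the general idea can be illustrated with a simple step-up search: start from a keepalive that is known to be safe, lengthen it while pings keep succeeding, and settle on the last value that worked once a ping fails. This is only a sketch under that assumption; the class name, the step-up strategy, and the bounds are hypothetical and may differ from what Courier actually does.

```java
// Hypothetical sketch of a step-up keepalive search; not Courier's algorithm.
final class KeepaliveProbe {
    private final int upperBoundSeconds;
    private final int stepSeconds;
    private int lastGoodSeconds;
    private int currentSeconds;
    private boolean settled = false;

    KeepaliveProbe(int lowerBoundSeconds, int upperBoundSeconds, int stepSeconds) {
        this.upperBoundSeconds = upperBoundSeconds;
        this.stepSeconds = stepSeconds;
        this.lastGoodSeconds = lowerBoundSeconds;
        this.currentSeconds = lowerBoundSeconds;
    }

    int currentSeconds() {
        return currentSeconds;
    }

    boolean isSettled() {
        return settled;
    }

    // The connection survived the current interval: remember it and try a longer one.
    void onPingSucceeded() {
        if (settled) {
            return;
        }
        lastGoodSeconds = currentSeconds;
        if (currentSeconds >= upperBoundSeconds) {
            settled = true;
            return;
        }
        currentSeconds = Math.min(currentSeconds + stepSeconds, upperBoundSeconds);
    }

    // The connection dropped at the current interval: fall back to the last good value.
    void onPingFailed() {
        if (settled) {
            return;
        }
        currentSeconds = lastGoodSeconds;
        settled = true;
    }
}
```

A longer keepalive means fewer pings and fewer alarms, so the search trades a few probing reconnects for lower steady-state cost on that network.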
Backpressure Handling
Backpressure is a situation where a producer produces messages faster than the consumer can consume them.
Courier library handles backpressure using internal persistence: it stores every received message first and then delivers it to the library’s client.
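The store-first, deliver-later idea can be sketched as follows. An in-memory queue stands in for the persistent store, and the class and method names are hypothetical rather than Courier's API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Hypothetical sketch: incoming messages are stored immediately and handed to
// the consumer at the consumer's own pace. The in-memory deque is a stand-in
// for persistent storage.
final class StoredInbox {
    private final Deque<String> stored = new ArrayDeque<>();

    // Called from the network side; never blocks on the consumer.
    synchronized void onMessageArrived(String payload) {
        stored.addLast(payload);
    }

    // Called from the consumer side. A message is removed only after the
    // consumer has returned, so a consumer that throws does not lose it.
    synchronized int deliverTo(Consumer<String> consumer, int maxMessages) {
        int delivered = 0;
        while (delivered < maxMessages && !stored.isEmpty()) {
            consumer.accept(stored.peekFirst());
            stored.removeFirst();
            delivered++;
        }
        return delivered;
    }

    synchronized int pendingCount() {
        return stored.size();
    }
}
```

Because arrival and delivery are decoupled, a slow consumer only grows the stored backlog instead of stalling the MQTT connection.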
Database Persistence
Paho supports persistence, but it provides only file-based persistence, which is not very efficient for mobile devices. Courier library provides an SQLite database implementation of the MqttClientPersistence interface provided by Paho.
Event Provider
Courier library emits many events, which can be used for multiple purposes like taking an action when a particular event happens or adding analytics to track system metrics.
For example, the library provides connect events, subscribe events, ping events, message send & receive events, and many more.
MQTT Chuck
MQTT Chuck is similar to HTTP Chuck, which is used for inspecting the HTTP calls of an Android application. MQTT Chuck inspects all the outgoing and incoming packets of an underlying MQTT connection.
Like HTTP Chuck, it uses an interceptor to intercept all the packets, persists them, and provides a UI for accessing all the MQTT packets sent or received. It also provides other features like search, share, and clear data.
Authentication
Clients need to be authenticated before making an MQTT connection so that only authenticated users can access the MQTT broker. MQTT protocol uses a combination of username and password for authentication.
Courier library provides a mechanism to fetch connection details like the JWT token and broker address by making an API call; these are then used for MQTT authentication.
That’s all about the features of our Courier library. Let’s now discuss the platform-specific challenges we faced.
Android platform-specific challenges
Android OS has become very conservative about battery usage by applications, especially when the app is in the background. Applications also have a high chance of getting killed by the OS if they run in the background for a long time without any component visible to the user.
Also, there are multiple power-saving modes on Android like Sleep mode, Doze mode, and App Standby mode. The device enters these modes when it is not charging and the device or app has not been used for a long time.
A long-running connection needs to keep running in both foreground and background. So it becomes important to consider the above cases at the time of implementation.
Courier library handles this by using AlarmPingSender, which schedules RTC_WAKEUP alarms through AlarmManager’s setExactAndAllowWhileIdle() API to send ping requests and keep the connection alive.
Due to the restrictions in doze mode, neither setAndAllowWhileIdle() nor setExactAndAllowWhileIdle() can fire alarms more than once per 9 minutes, per app. So we introduced Adaptive Keepalive, which minimises the number of alarms scheduled by finding the most optimal keepalive time on a particular network.
What’s Next?
We are planning to expand Courier for multiple use-cases like Chat and Realtime Event Tracking. This will also help us to achieve our long-term goal of having a single long-running connection on the Gojek app.
We also plan to open-source the Courier library for Android. Follow here for the open-source projects by Gojek.
Stay tuned! In the next blog, we’ll talk about the iOS Courier library.