Hello there. It's my first blog post in English and I'm going to tell you how we built simple in-memory storage for animated cars. We show animated cars on the main screen of "Namba Taxi for clients" applications. This post is about the completed journey, algorithms and not about Go.

The beginning

The story begins in 2015 with the graduation work of our mobile developer. The work topic was 'Driver application for taxi service'. In the application, he animated the car. It looked like this.

We thought. Why not using this on the screen when the client can track driver location. And our first challenge was a lack of data. We get new driver location every 15 seconds. We can't just decrease update interval, because when driver mobile application sends data to us it gets the current order, next order and if there are any open alerts. Alert works like SOS button. When the driver presses it other drivers are in hurry to help him. And when we decrease update interval we can get more traffic to our systems. We weren't sure that we can handle it.

First steps

Our first try was simple and stupid:

Make the request and save coordinates.
Make another request and animate the car.

As you can guess, there were problems with it. We can't animate a car properly and it moves through fields, forests, lakes and quarters. It looks like this

As the solution of the problem, we used OpenStreetMap Routing Machine (OSRM) for building routes and our algorithm improved. We have the same timeout.

We make a request.
Save coordinates.
Send saved coordinates to the backend.
Build route via OSRM.
Return it to client app and animate marker.

It seems that it works now, but we faced with different problem with one-way roads

For instance, driver stays at the intersection at red point. But his device has bad accuracy and on location update driver stay on the opposite side of the crossroad. At client app we get these coordinates, save them and send to the backend. OSRM build a legit route and returns to the app and it looks ridiculous. Because marker moves very fast.

We solved this problem in a naive way. We check the shortest distance between two points and not build a route for distances less than 20 meters. With that algorithm after few days of testing, we decided to release our application and get feedback from customers.

And you know, we live in not ideal world so we decided to go to second iteration, because several things needed improvements.

First thing was a trip cost calculator. All calculations were on the driver side. In this case we save server resources because we don't send useless requests. On the another hand we have only one trust point. It's a driver's mobile app. So we need to duplicate data and save It on the server side.
Furthermore, we realized that 1 track per 15 seconds it's so little and lots of customers don't get sight of new feature because car starts moving In 15 seconds after the screen is opened. I know not so many people, who can look at the screen where nothing is going on for 15 seconds.
Moreover, we have lots of problems with GPS module on the driver side. GPS problems are associated with driver's device.
Finally, we wanted animated cars on the main screen.

Several issues to solve

Start to collect more tracks from drivers
Show animated cars on the main screen
Store intermediate trip cost on the server side
Save mobile data
Collect each track per one second

I want to tell a few words about saving mobile data. We need it because in our country we have a very cheap taxi receipt. We use taxi like public transport. For instance, you can get from one side of the city to another just for 2 euros. It's like metro in Paris. Also mobile internet cost is too high. If we save 100 bytes per second, then we'll save 2000$ at company scale.

What the data in track?

Driver location (Latitude, Longitude)
Driver session that we give upon login
Order information (OrderID and trip cost)

We decided that one track should be less than 100 bytes. And we started to look transport protocol to solve this issue

As you can see we watched for several protocols:

HTTP
WebSockets
TCP
UDP

And for us ideal option was UDP, because:

We send only datagrams
We don't need guarantees
Minimalism
Save lots of data
We have only 20 bytes overhead
Not blocked in mobile networks in our country

As for data serialization, we watched on:

JSON
MsgPack
Protobuf

📖 Want to know why Uber's app loads so quickly? Discover the strategic battle between Protocol Buffers and JSON, and how it affects everything from battery life to user experience.

We chose protocol buffers because it's so efficient on little data.

And as you can see the nearest competitor is heavier twice.

What do we have in total?

We have 42 bytes of payload
+ 20 bytes of IP headers
= 62 bytes per track.
PROFIT.

But when we get data also we need to store it, right?

Data storage

We need to store this data:

Driver's session to identify driver
Cab number to perform search for client's mobile application
Order id and trip cost to save trip data from driver's mobile application on server
Last location to perform search
N last locations to build route

Used storages

Percona — to store all data. We store drivers, orders, fares and everything.
Redis as key-value storage for caching purposes.
Elasticsearch for geocoding

As mentioned above we have 600 online drivers and using these storages to save data seems non-expedient. So we need geo index

We watched for two geo indices:

KD-tree
R-tree

We had requirements for geo index:

We need to search N nearest points.
We need a balanced tree to provide the best search in the worst scenario

KD-tree

And KD-tree doesn't fit our needs because it's unbalanced and can search only one nearest point. We can implement k-nearest neighbors over kd-tree, but we won't reinvent the wheel because R-tree already solves this issue.

R-tree

It looks like this. We can perform search N nearest points and it's balanced tree, we chose this one.

Also we need an expire mechanism because we need to invalidate drivers and show actual information to operators. For instance, remove driver via 900 seconds of inactivity. And also we need LRU data structure for storing last locations. Because we initialize storage for N items. When we try to add an item, when storage already filled. We remove the least recently used item and we add a new one. So here is our storage architecture

We store all data in-memory.
We use R-tree to perform a search of nearest drivers.
Also we use two maps for storing drivers and perform search by cab number or by session

Final algorithm

Here is the final algorithm on the backend:

Get data by UDP
Try to get driver from storage
If doesn't exist — get driver from redis
Check and validate data
Set driver to storage
If doesn't exist — initialize LRU
Update r-tree

HTTP endpoints

We implemented these endpoints to integrate to our systems

Return nearest drivers
Remove driver from storage(by cab number or session)
Get information about trip
Get information about driver

Conclusion

In the end, I want to give these conclusions that we used in our backend system:

UDP+Protobuf for data savings
In-memory storage
R-tree for nearest drivers
LRU cache for storing last locations
OSRM for map matching and building routes

As an example, you can watch here. It's too simple but implements lots of features described in the article.

Andrew Minkin

ex Team Lead

Latest articles here

Storage of Media Files on Multiple Django or My First Experience in Open Source

In this article, I will try to share my first experience in open source, on how to build a Python package by publishing it in PyPI, as well as to...

Backend development

Belek Abylov Software Engineer

Feb 28, 2017

4 min read

Running Geo Django on Elastic Beanstalk

I want to tell you this story because I wasted about 14 hours to realize how to achieve it correctly.NB❗Sometimes working with Elastic Beanstalk can...

Backend development

Andrew Minkin ex Team Lead

Jun 11, 2020

4 min read

Reduce GPS Data Errors on Android App with Kalman Filter and Accelerometer

One day I got a cool task to improve the accuracy of positioning and distance calculating based on GPS data on Android devices for the taxi service...

Mobile Development

Oleg Katkov Software Engineer

Mar 19, 2018

12 min read

Go to blog

How to Build Backend System for Uber-Like Map