Hello there. It’s my first blog post in English and I’m going to tell you how we built simple in-memory storage for animated cars. We show animated cars on the main screen of “Namba Taxi for clients” applications. This post is about the completed journey, algorithms and not about Go.
The beginning
Copy link
The story begins in 2015 with the graduation work of our mobile developer. The work topic was ‘Driver application for taxi service’. In the application, he animated the car. It looked like this.
We thought. Why not using this on the screen when the client can track driver location. And our first challenge was a lack of data. We get new driver location every 15 seconds. We can’t just decrease update interval, because when driver mobile application sends data to us it gets the current order, next order and if there are any open alerts. Alert works like SOS button. When the driver presses it other drivers are in hurry to help him. And when we decrease update interval we can get more traffic to our systems. We weren’t sure that we can handle it.
First steps
Copy link
Our first try was simple and stupid:
Make the request and save coordinates.
Make another request and animate the car.
As you can guess, there were problems with it. We can’t animate a car properly and it moves through fields, forests, lakes and quarters. It looks like this
As the solution of the problem, we used OpenStreetMap Routing Machine (OSRM) for building routes and our algorithm improved. We have the same timeout.
We make a request.
Save coordinates.
Send saved coordinates to the backend.
Build route via OSRM.
Return it to client app and animate marker.
It seems that it works now, but we faced with different problem with one-way roads
For instance, driver stays at the intersection at red point. But his device has bad accuracy and on location update driver stay on the opposite side of the crossroad. At client app we get these coordinates, save them and send to the backend. OSRM build a legit route and returns to the app and it looks ridiculous. Because marker moves very fast.
We solved this problem in a naive way. We check the shortest distance between two points and not build a route for distances less than 20 meters. With that algorithm after few days of testing, we decided to release our application and get feedback from customers.
And you know, we live in not ideal world so we decided to go to second iteration, because several things needed improvements.
First thing was a trip cost calculator. All calculations were on the driver side. In this case we save server resources because we don’t send useless requests. On the another hand we have only one trust point. It’s a driver’s mobile app. So we need to duplicate data and save It on the server side.
Furthermore, we realized that 1 track per 15 seconds it’s so little and lots of customers don’t get sight of new feature because car starts moving In 15 seconds after the screen is opened. I know not so many people, who can look at the screen where nothing is going on for 15 seconds.
Moreover, we have lots of problems with GPS module on the driver side. GPS problems are associated with driver’s device.
Finally, we wanted animated cars on the main screen.
Several issues to solve
Copy link
Start to collect more tracks from drivers
Show animated cars on the main screen
Store intermediate trip cost on the server side
Save mobile data
Collect each track per one second
I want to tell a few words about saving mobile data. We need it because in our country we have a very cheap taxi receipt. We use taxi like public transport. For instance, you can get from one side of the city to another just for 2 euros. It’s like metro in Paris. Also mobile internet cost is too high. If we save 100 bytes per second, then we’ll save 2000$ at company scale.
What the data in track?
Copy link
Driver location (Latitude, Longitude)
Driver session that we give upon login
Order information (OrderID and trip cost)
We decided that one track should be less than 100 bytes. And we started to look transport protocol to solve this issue
As you can see we watched for several protocols:
HTTP
WebSockets
TCP
UDP
And for us ideal option was UDP, because:
We send only datagrams
We don’t need guarantees
Minimalism
Save lots of data
We have only 20 bytes overhead
Not blocked in mobile networks in our country
As for data serialization, we watched on:
JSON
MsgPack
Protobuf
We chose protocol buffers because it’s so efficient on little data.
And as you can see the nearest competitor is heavier twice.
What do we have in total?
Copy link
We have 42 bytes of payload
+ 20 bytes of IP headers
= 62 bytes per track.
PROFIT.
But when we get data also we need to store it, right?
Data storage
Copy link
We need to store this data:
Driver’s session to identify driver
Cab number to perform search for client’s mobile application
Order id and trip cost to save trip data from driver’s mobile application on server
Last location to perform search
N last locations to build route
Used storages
Copy link
Percona — to store all data. We store drivers, orders, fares and everything.
Redis as key-value storage for caching purposes.
Elasticsearch for geocoding
As mentioned above we have 600 online drivers and using these storages to save data seems non-expedient. So we need geo index
We watched for two geo indices:
KD-tree
R-tree.
We had requirements for geo index:
We need to search N nearest points.
We need a balanced tree to provide the best search in the worst scenario
KD-tree
Copy link
And KD-tree doesn’t fit our needs because it’s unbalanced and can search only one nearest point. We can implement k-nearest neighbors over kd-tree, but we won’t reinvent the wheel because R-tree already solves this issue.
R-tree
Copy link
It looks like this. We can perform search N nearest points and it’s balanced tree. We chose this one
Also we need an expire mechanism because we need to invalidate drivers and show actual information to operators. For instance, remove driver via 900 seconds of inactivity. And also we need LRU data structure for storing last locations. Because we initialize storage for N items. When we try to add an item, when storage already filled. We remove the least recently used item and we add a new one. So here is our storage architecture
We store all data in-memory.
We use R-tree to perform a search of nearest drivers.
Also we use two maps for storing drivers and perform search by cab number or by session
Final algorithm
Copy link
Here is the final algorithm on the backend:
Get data by UDP
Try to get driver from storage
If doesn’t exist — get driver from redis
Check and validate data
Set driver to storage
If doesn’t exist — initialize LRU
Update r-tree
HTTP Endpoints
Copy link
We implemented these endpoints to integrate to our systems
Return nearest drivers
Remove driver from storage(by cab number or session)
Get information about trip
Get information about driver
Conclusion:
Copy link
In the end, I want to give these conclusions that we used in our backend system: