This article is Part 1 of a detailed write-up of my talk at HighLoad++ 2015.
Today Namba Taxi is the leading passenger transportation company in our region. I am going to describe how we built a fault-tolerant architecture and why we can afford to lose any of our physical servers without losing system performance.
What types of taxi services exist today?
The first type is the private taxi, called «bombila» or «bordurshik» in our slang. The driver follows his own rules for surviving in the taxi market, sets his own rates, and works without intermediaries. The second type is the dispatching service. It may or may not be automated, and it may use GPS trackers, portable radios, or other available means. The latest generation of taxi service is the automated intermediary, of which Uber is the best-known example.
What is our company like?
Our company is a service that is keen on automation. With that in mind, we are trying to catch up with Uber, and while working on the project we have acquired about 300,000 satisfied clients. The number may seem small, but it is one-third of the population of Bishkek, the capital of our country. We have 600 drivers online and process more than 8,000 orders per day. Our daily workload looks like this:
The graph shows two rush hours: in the morning (from 6 to 10 am) and in the evening (from 4 to 8 pm). Our servers process up to 3,500 requests per second. We are also able to respond to drivers within 20 milliseconds and to dispatchers within 2.5 milliseconds.
I will explain how we managed to build the fault-tolerant architecture. I will also describe our experience of choosing software for IP telephony and how we managed to carry voice traffic over IP telephony with it.
But let me start from the very beginning.
The story began in 2011, when our CEO came to us and said that he wanted to open a taxi company in Bishkek. We got to work: we explained where to get servers and how to install them, and recommended the only existing player in the taxi automation software market.
The task was to let the client order a taxi either by calling or by sending a message. The old system worked as follows:
- An operator accepts an order and forwards it to a driver,
- The driver accepts the order,
- The driver goes to the pickup point and picks up the client,
- The driver takes the client to the destination, and the manager gets the report in the workflow.
Workflow using the provider's software
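As an illustration only, the order workflow listed above can be modelled as a small state machine. This is a minimal sketch, not the provider's or Namba Taxi's actual code: the names `OrderStatus`, `ALLOWED`, and `Order` are hypothetical, and the statuses are simply one way of naming the steps in the list.

```python
from enum import Enum


class OrderStatus(Enum):
    # Hypothetical status names, one per step of the workflow in the list above.
    NEW = "new"                # operator has accepted the order from the client
    ASSIGNED = "assigned"      # operator has forwarded it to a driver
    ACCEPTED = "accepted"      # driver has accepted the order
    PICKED_UP = "picked_up"    # driver has picked up the client
    COMPLETED = "completed"    # client delivered, report available to the manager


# Each status may only move to the next step of the workflow.
ALLOWED = {
    OrderStatus.NEW: OrderStatus.ASSIGNED,
    OrderStatus.ASSIGNED: OrderStatus.ACCEPTED,
    OrderStatus.ACCEPTED: OrderStatus.PICKED_UP,
    OrderStatus.PICKED_UP: OrderStatus.COMPLETED,
}


class Order:
    def __init__(self, order_id):
        self.order_id = order_id
        self.status = OrderStatus.NEW
        self.history = [OrderStatus.NEW]

    def advance(self, new_status):
        """Move to new_status, rejecting steps that skip or reverse the workflow."""
        if ALLOWED.get(self.status) is not new_status:
            raise ValueError(
                f"cannot go from {self.status.name} to {new_status.name}"
            )
        self.status = new_status
        self.history.append(new_status)


order = Order(order_id=1)
for step in (OrderStatus.ASSIGNED, OrderStatus.ACCEPTED,
             OrderStatus.PICKED_UP, OrderStatus.COMPLETED):
    order.advance(step)
print([status.value for status in order.history])
```

Keeping the allowed transitions in one table makes it easy to hang side effects, such as a notification to the client, on a specific step.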
The key features of our program at that time were the service provider and the active call handling module. Operators could make and answer calls. There were SMS notifications, a partially automated workflow, and plenty of Chinese navigators used by drivers.
SMS notifications were sent in two cases: first when a driver accepted the order, and again when he arrived at the destination. Unfortunately, our partners could not come to an agreement with each other, so we decided to change our provider. The reason was that the service could not run for as long as 4 hours in a row without failures. We wanted to grow and enter new markets and segments, and our provider was not ready for such changes.
Once we decided to create our own software, the work started with the system requirements. By that time the taxi company had already been operating for a year on the provider's software. We tried to keep interface changes to a minimum so that people who had used the previous program could easily adapt to the new one. We also planned to create a flexible product.
Drivers were working with navigators running Windows CE, and we decided to leave them as they were. We also planned to add support for Android devices, real-time updates in the dispatching interface, and IP telephony. We planned to switch to our own software as soon as these modules were ready.
What were the limitations? The first was the extremely high price of mobile internet, so we had to reduce traffic between driver devices and servers. The servers were ordinary home computers. We had a small tech team and six months for development.
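The article does not say how the traffic was reduced. Purely as an illustration of one common approach, the sketch below packs a driver's position into a fixed-size binary record and compares its size with the same data sent as JSON. The wire format, the field names, and the sample values are all assumptions made for this example.

```python
import json
import struct

# Hypothetical wire format (an assumption, not the format used in production):
# driver id (uint32), Unix time (uint32), latitude and longitude stored as
# integers in units of 1e-6 degrees (int32 each), little-endian, no padding.
POSITION = struct.Struct("<IIii")


def encode_position(driver_id, timestamp, lat, lon):
    """Pack one position report into a 16-byte record."""
    return POSITION.pack(driver_id, timestamp, round(lat * 1e6), round(lon * 1e6))


def decode_position(payload):
    """Unpack a record produced by encode_position."""
    driver_id, timestamp, lat_e6, lon_e6 = POSITION.unpack(payload)
    return driver_id, timestamp, lat_e6 / 1e6, lon_e6 / 1e6


# Sample values chosen for the example only.
binary = encode_position(42, 1446336000, 42.874621, 74.569762)
as_json = json.dumps({"driver_id": 42, "timestamp": 1446336000,
                      "lat": 42.874621, "lon": 74.569762}).encode()
print(len(binary), len(as_json))
```

A fixed binary layout trades readability and flexibility for size, which matters most for messages that are sent very often, such as position updates.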
We decided to build a web application, since it can be rolled out to new users quickly and is not tied to any particular software or device. All you need for a web application is a browser.
We decided to implement a single core. It either exposed the API used by drivers and managers and for accepting payments, or automated the work itself. We settled on the following technologies:
- Django for Core;
- Redis for Publish/Subscribe mechanism;
- Node.js for tracking realtime events in the dispatcher interface;
- Twisted as a socket server for drivers;
- Ruby for working with SMS;
- WebRTC for telephony.
Why so many technologies in one project?
Firstly, Ruby has ruby-smpp, which works very well with the SMPP protocol for sending and receiving SMS. We needed Node.js because of Socket.IO, which supports several transport types for real-time messaging.
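The reason several technologies can coexist is the Publish/Subscribe mechanism: the core publishes an event once, and each component subscribes to the channels it cares about. The sketch below shows the pattern with an in-memory stand-in so that it runs without any server. It is not the Redis API, and the class `InMemoryBroker`, the channel name `orders`, and the event fields are assumptions made for this example.

```python
from collections import defaultdict


class InMemoryBroker:
    """Stand-in for a Publish/Subscribe server, so the sketch runs without one."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        """Register callback to be called with every message on channel."""
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        """Deliver message to every subscriber; return how many received it."""
        callbacks = self._subscribers[channel]
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = InMemoryBroker()
dispatcher_screen = []   # what a real-time dispatcher interface would display
sms_queue = []           # order ids an SMS worker would notify clients about

broker.subscribe("orders", dispatcher_screen.append)
broker.subscribe("orders", lambda event: sms_queue.append(event["order_id"]))

# The core publishes the event once; it does not need to know who listens.
delivered = broker.publish("orders", {"order_id": 1, "status": "accepted"})
print(delivered, dispatcher_screen, sms_queue)
```

Because publishers and subscribers only share channel names and message formats, each component can be written in whichever language has the best library for its job.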
What made us choose raw telephone exchanges and build our telephony on top of them?
We wanted to build the system on open source software, without any restrictions on operating systems or devices. Thanks to this, and to the decision to use the web, we could reduce the number of workplaces. Staff could work remotely. We no longer needed to spend money on switching equipment and licenses. We could also attract more customers.
In the upcoming article I will describe the first working version of our taxi software and what we faced while integrating it into an operating business.