How redBus uses BigQuery to master big data
By Pradeep Kumar, redBus
This guest post was written by Pradeep Kumar. Pradeep is a technical architect at redBus, an
online travel agency in India that provides a unified online bus ticketing service. We recently
published a business case study for redBus and wanted to dive into some more technical detail
for the readers of the Google Developers Blog.
Our company has been providing Internet bus ticketing for India since 2006. There are more than
10,000 bus routes available for booking, and we have dozens of machines processing booking
requests. Each step in the booking process produces a lot of data – on search terms, route
availability, server health and more. We needed tools to process this data quickly and easily
so we could determine whether decreases in customer bookings are the result of server problems
or simply less demand.
While we typically use relational databases to store and analyze data, we knew we needed
something more powerful if we wanted to analyze 500 GB or more, so we started to look at open
source frameworks like Hadoop and analysis platforms like Hive and Pig. We found that these
frameworks require considerable in-house expertise and infrastructure investments and wouldn’t
give us answers to our questions as fast as we wanted. We decided to try out Google BigQuery as
a trusted tester, with hopes that it would give us the ability to perform quick iterative
analysis without much up-front investment. Our initial tests went very well, so we started
building our analysis tools on top of BigQuery.
BigQuery allows us to run SQL-like queries to understand the bus routes in highest demand and
what types of searches users are performing. We’ve also used it to build internal dashboards
that give us a snapshot of system health.
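To give a rough idea of what a "routes in highest demand" query can look like, here is a minimal sketch. It is not our production code: the `searches` table, its columns, and the sample rows are all made up for illustration, and the query runs against an in-memory SQLite database as a stand-in so the example is self-contained. BigQuery's SQL dialect differs in its details, but the aggregation has the same general shape.

```python
import sqlite3

# Hypothetical schema: one row per user search. Table and column names
# are illustrative assumptions, not the real redBus tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE searches (source TEXT, destination TEXT, searched_on TEXT)"
)

# Made-up sample data, used only to exercise the query.
conn.executemany(
    "INSERT INTO searches VALUES (?, ?, ?)",
    [
        ("Bangalore", "Chennai", "2012-06-01"),
        ("Bangalore", "Chennai", "2012-06-01"),
        ("Bangalore", "Hyderabad", "2012-06-01"),
        ("Mumbai", "Pune", "2012-06-02"),
        ("Bangalore", "Chennai", "2012-06-02"),
    ],
)

# Count searches per route and keep the most-searched routes.
# The extra ORDER BY columns make ties deterministic.
TOP_ROUTES_SQL = """
    SELECT source, destination, COUNT(*) AS search_count
    FROM searches
    GROUP BY source, destination
    ORDER BY search_count DESC, source, destination
    LIMIT 3
"""


def top_routes(connection):
    """Return (source, destination, search_count) rows, busiest first."""
    return connection.execute(TOP_ROUTES_SQL).fetchall()


top = top_routes(conn)
for source, destination, search_count in top:
    print(f"{source} -> {destination}: {search_count} searches")
```

Grouping by route and ordering by count is the whole technique here; a dashboard could run the same kind of query over a recent time window to spot a sudden change in demand.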
For more information on how we structured our immutable tables and pipelined our data into
BigQuery for analysis using RabbitMQ, and to see example SQL queries we’ve used, check out my
article on developers.google.com.
Pradeep Kumar is a technical architect at redBus.
Posted by Scott Knaster, Editor