How Safari Books Online uses Google BigQuery for business intelligence
This guest post was written by Daniel Peter, Senior Programmer Analyst at Safari Books Online.
Cross-posted from the Google Cloud Platform Blog
Safari Books Online is a subscription service for individuals and organizations to access a
growing library of over 30,000 technology and business books and videos. Our customers browse
and search the library from web browsers and mobile devices, generating powerful usage data
which we can use to improve our service and increase profitability. We wanted to quickly and
easily build dashboards, improve the effectiveness of our sales teams and enable ad-hoc
queries to answer specific business questions. With billions of records, we found it
challenging to get the answers to our questions fast enough with our existing MySQL
databases.
Looking for alternative solutions to build our dashboards and enable interactive ad-hoc
querying, we played with several technologies, including Hadoop. In the end, we decided to use
Google BigQuery.
Here’s how we pipe data into BigQuery:
Our data starts in our CDN and server logs, gets packaged up into compressed files, and runs
through our ETL server before finishing in BigQuery.
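As a rough illustration of the ETL step, here is a minimal sketch that turns a gzip-compressed log file into gzip-compressed newline-delimited JSON, a format BigQuery load jobs accept. The log line layout, the field names and the function names are assumptions made up for this example; the post does not describe Safari's actual log format or ETL code.

```python
import gzip
import json
import re

# Hypothetical log line layout (an assumption, not Safari's real format):
#   <ISO timestamp> <user id> <device> <book id>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<user_id>\S+)\s+(?P<device>desktop|mobile)\s+(?P<book_id>\S+)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def logs_to_ndjson(src_path, dst_path):
    """Convert a gzip-compressed log file to gzip-compressed NDJSON.

    Returns the number of rows written; malformed lines are skipped.
    """
    rows = 0
    with gzip.open(src_path, "rt", encoding="utf-8") as src, \
         gzip.open(dst_path, "wt", encoding="utf-8") as dst:
        for line in src:
            record = parse_line(line)
            if record is None:
                continue  # skip lines that do not fit the assumed layout
            dst.write(json.dumps(record) + "\n")
            rows += 1
    return rows
```

The output file could then be handed to a BigQuery load job. Skipping malformed lines keeps the sketch short; a real pipeline would more likely count or log them.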
Here’s one of the dashboards we built using the data:
You can see that with the help of BigQuery, we can easily categorize our books. This dashboard
shows popular books by desktop and mobile, and with BigQuery, we are able to run quick queries
to dive into other usage patterns as well.
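The kind of aggregation behind a "popular books by device" view can be sketched as a plain GROUP BY query. The table name, column names and function below are hypothetical, and Python's built-in sqlite3 stands in for BigQuery so the example runs locally; it is not the query used for the dashboard above.

```python
import sqlite3

# Hypothetical table and column names; plain SQL that sqlite3 can run locally.
POPULAR_BOOKS_SQL = """
SELECT device, book_id, COUNT(*) AS views
FROM page_views
GROUP BY device, book_id
ORDER BY device, views DESC, book_id
"""


def popular_books(rows):
    """Count views per (device, book_id) from an iterable of (device, book_id) pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE page_views (device TEXT, book_id TEXT)")
        conn.executemany("INSERT INTO page_views VALUES (?, ?)", rows)
        return conn.execute(POPULAR_BOOKS_SQL).fetchall()
    finally:
        conn.close()
```

Each result row is a `(device, book_id, views)` tuple, ordered so the most viewed books come first within each device type.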
BigQuery has been very valuable for our company, and we’re just scratching the surface of what
is possible.
Check out the article for more details on how we manage our import jobs, transform our data,
build our dashboards, detect abuse and improve our sales team's effectiveness.
Posted by Scott Knaster, Editor