Google Cloud Storage: high performance that just works
By Navneet Joneja, Product Manager,
and Ville Aikas, Technical Lead
When evaluating options for cloud storage, customers often wonder, "How can we optimize our
storage to get the highest performance possible?" We believe you shouldn't have to, so we do
all the optimization for you – enabling you to focus on your application instead of the
minutiae of storage optimization.
The performance of cloud storage services (and indeed most web services) depends on two main
factors: the network that moves the data between us and the end user, and the performance of
the storage service itself.
1. Network
When you make a request to Google Cloud Storage, one of the key determinants of performance is
the network path between you and our servers. This path is critical because if the network is
slow or unreliable, it doesn’t really matter how fast the backend is.
There are two main ways to make the network faster:
- Serve the request from as close to the user as possible
- Optimize the network routing between the end-user and the service, including
avoiding pockets of network congestion and minimizing the number of network hops between the
user and the service.
2. Storage
The other component of system performance is how quickly our servers process your request. The
data needs to be managed optimally, and once an end user’s request reaches our servers, we need
to serve the request as fast as possible. In a sense, Google Cloud Storage is a gigantic
filesystem: authorization checks need to happen, the object in question needs to be looked up,
and the data requested needs to be read from the physical storage medium and transferred to
the end user, all as efficiently as possible.
So, how do we make sure your requests are served as fast as possible?
- Google Cloud Storage is built on Google’s proprietary network and
datacenter technology. We’ve spent more than a decade building out proprietary infrastructure
and technology to power Google’s sites (after all, we believe that fast is better than
slow). When you use Google Cloud Storage, the same network goes to work for your
data.
- We replicate data to multiple data centers and serve an end-user’s request from the
nearest data center that holds a copy of the data. We also offer a choice of regions
(currently U.S. and Europe) to allow you to keep your data close to where it’s most needed. We
then take this one step further. When you upload an object and mark it as cacheable (by
setting the standard HTTP Cache-Control
header), we automatically figure out how best to serve it using Google’s broad network
footprint, including caching it closer to the end-user if possible.
- Finally, you don’t need to worry about optimizing your storage layout (as you
would on a physical disk) or your lookups (that is, directory and naming structure), as you would
on most file systems and some other storage services. We take care of all the "file system"
optimizations behind the scenes.
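To make the Cache-Control point above concrete, here is a minimal sketch of the headers you might attach when uploading a cacheable object. The helper name `cacheable_upload_headers` and the one-hour lifetime are illustrative assumptions, not part of any Google library; only the standard `Cache-Control` directives (`public`, `private`, `max-age`) come from HTTP itself.

```python
def cacheable_upload_headers(content_type, max_age_seconds, public=True):
    """Build HTTP headers that mark an uploaded object as cacheable.

    Hypothetical helper for illustration only; it builds a plain dict
    and does not contact any service.
    """
    if max_age_seconds < 0:
        raise ValueError("max_age_seconds must be non-negative")
    # "public" lets shared caches store the object; "private" restricts
    # caching to the end user's own client.
    visibility = "public" if public else "private"
    return {
        "Content-Type": content_type,
        "Cache-Control": f"{visibility}, max-age={max_age_seconds}",
    }


# Example: a PNG image that caches may keep for one hour.
headers = cacheable_upload_headers("image/png", 3600)
print(headers["Cache-Control"])
```

You would send these headers along with the upload request using whatever HTTP client or storage library you prefer. How long to let an object be cached is an application decision: a longer `max-age` allows more requests to be served from a cache, but updates to the object take longer to reach everyone.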
In other words, when you store your data on
Google Cloud Storage, we do all the
background work to make it fast so that you can focus on your application.
Navneet Joneja loves being at the forefront of the next generation of simple and
reliable software infrastructure, the foundation on which next-generation technology is being
built. When not working, he can usually be found dreaming up new ways to entertain his
intensely curious almost-two-year-old.
Ville Aikas likes to work on tools and services that make developers’ lives easier
and "just work". When not busy cranking out code, he loves to play soccer with his kids, build
robots and watch F1.
Posted by Scott Knaster,
Editor