One of the core pieces of infrastructure at Google is something called Protocol Buffers. We are
really pleased to be open sourcing the system, but what are these buffers?

Protocol buffers are a flexible, efficient, automated mechanism for serializing structured data:
think XML, but smaller, faster, and simpler. You define how you want your data to be structured
once, then you use specially generated source code to write and read that structured data to and
from a variety of data streams, in a variety of languages. You can even update your data
structure without breaking deployed programs that are compiled against the "old" format.
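
The last point, updating a format without breaking older programs, rests on every field being written with a numeric tag, so a reader can skip tags it does not recognize. The C++ sketch below illustrates that idea under simplifying assumptions: it handles only varint and length-delimited fields, and OldRecord, ReadVarint, ParseOldRecord, and ReadNewerMessage are hypothetical names made up for this illustration, not part of the Protocol Buffers library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical "old" view of a message: it only knows fields 1 and 2.
struct OldRecord {
  std::string name;      // field 1, length-delimited
  std::uint64_t id = 0;  // field 2, varint
};

// Reads a base-128 varint starting at pos (7 payload bits per byte, low
// group first). Returns false on truncated or overlong input.
bool ReadVarint(const std::string& in, std::size_t& pos, std::uint64_t& value) {
  value = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    if (pos >= in.size()) return false;
    const std::uint8_t byte = static_cast<std::uint8_t>(in[pos++]);
    value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return true;
  }
  return false;
}

// Parses the fields this reader knows about and skips everything else.
bool ParseOldRecord(const std::string& in, OldRecord& record) {
  std::size_t pos = 0;
  while (pos < in.size()) {
    std::uint64_t key = 0;
    if (!ReadVarint(in, pos, key)) return false;
    const std::uint64_t field = key >> 3;       // field number
    const std::uint64_t wire_type = key & 0x7;  // how the value is encoded

    if (wire_type == 0) {  // varint
      std::uint64_t value = 0;
      if (!ReadVarint(in, pos, value)) return false;
      if (field == 2) record.id = value;  // other field numbers are ignored
    } else if (wire_type == 2) {  // length-delimited
      std::uint64_t length = 0;
      if (!ReadVarint(in, pos, length)) return false;
      if (length > in.size() - pos) return false;
      if (field == 1) record.name = in.substr(pos, length);
      pos += length;  // unknown fields are stepped over, not rejected
    } else {
      return false;  // remaining wire types are left out of this sketch
    }
  }
  return true;
}

// Usage: bytes from a newer writer that added a string field numbered 3.
OldRecord ReadNewerMessage() {
  const std::string bytes("\x0a\x03" "Bob" "\x10\x7b" "\x1a\x03" "a@b", 12);
  OldRecord record;
  const bool ok = ParseOldRecord(bytes, record);
  assert(ok);
  (void)ok;
  return record;  // name == "Bob", id == 123; field 3 was skipped
}
```

The old reader never needs to know what field 3 means. The wire type alone tells it how many bytes to step over.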

It is probably best to take a peek at some code. The first thing you need to do is define a
message type in a .proto file, which can look like the following:

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }
}

Once you have defined a message type, you run the protocol buffer compiler on the file to create
data access classes for your platform of choice (Java, C++, or Python in this release). Then you
can easily work with the data through those generated classes, for example in C++.
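
The real classes come out of the compiler, so they cannot be reproduced as a standalone program here. As a rough sketch of what working with them looks like, the block below uses a hand-written stand-in, PersonSketch, whose accessors follow the set_<field> naming of generated C++ code and whose SerializeToString writes the three scalar fields of Person in the binary wire format. PersonSketch, its private helpers, and SerializeExample are assumptions made for this illustration. Actual programs would include the generated header instead, and the stand-in's handling of unset required fields is a simplification.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hand-written stand-in for a generated Person class (hypothetical; real code
// would use the header produced by the protocol buffer compiler).
class PersonSketch {
 public:
  void set_name(const std::string& value) { name_ = value; has_name_ = true; }
  void set_id(std::int32_t value) { id_ = value; has_id_ = true; }
  void set_email(const std::string& value) { email_ = value; has_email_ = true; }

  // Writes the binary encoding into *output, replacing its contents.
  // This sketch simply reports failure if a required field is unset.
  bool SerializeToString(std::string* output) const {
    output->clear();
    if (!has_name_ || !has_id_) return false;
    AppendString(output, 1, name_);  // required string name = 1
    AppendKey(output, 2, 0);         // required int32 id = 2 (varint)
    // Negative int32 values are sign-extended to 64 bits on the wire.
    AppendVarint(output, static_cast<std::uint64_t>(static_cast<std::int64_t>(id_)));
    if (has_email_) AppendString(output, 3, email_);  // optional string email = 3
    return true;
  }

 private:
  // Base-128 varint: 7 payload bits per byte, low group first.
  static void AppendVarint(std::string* out, std::uint64_t value) {
    while (value >= 0x80) {
      out->push_back(static_cast<char>((value & 0x7F) | 0x80));
      value >>= 7;
    }
    out->push_back(static_cast<char>(value));
  }

  // Field key: (field_number << 3) | wire_type.
  static void AppendKey(std::string* out, std::uint32_t field, std::uint32_t wire_type) {
    AppendVarint(out, (static_cast<std::uint64_t>(field) << 3) | wire_type);
  }

  // Wire type 2 (length-delimited): key, byte length, then the raw bytes.
  static void AppendString(std::string* out, std::uint32_t field, const std::string& value) {
    AppendKey(out, field, 2);
    AppendVarint(out, value.size());
    out->append(value);
  }

  std::string name_;
  std::int32_t id_ = 0;
  std::string email_;
  bool has_name_ = false;
  bool has_id_ = false;
  bool has_email_ = false;
};

// Usage, mirroring how generated accessors are typically called.
std::string SerializeExample() {
  PersonSketch person;
  person.set_name("Bob");
  person.set_id(123);
  person.set_email("bob@example.com");

  std::string bytes;
  const bool ok = person.SerializeToString(&bytes);
  assert(ok);
  (void)ok;
  return bytes;  // 24 bytes for these values
}
```

With these example values the whole record takes 24 bytes: each field costs one key byte plus its payload, with no field names or closing tags on the wire.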

We sat down with Kenton Varda, a software engineer who worked on the open source effort, to get
his take on Protocol Buffers, how we ended up with them, how they compare to other solutions,
and more: