Google Open Source Report Card
Originally posted on the Google Open Source Blog
Posted by Josh Simmons, Open Source Programs Office
Open source software enables Google to build things quickly and efficiently without
reinventing the wheel, allowing us to focus on solving new problems. We stand on the shoulders
of giants and we know it. This is why we support open source and make it easy for Googlers to
release the projects they’re working on internally as open source.
Today we’re sharing our first Open Source Report Card, highlighting our most popular projects,
sharing a few statistics and detailing some of the projects we’ve released in 2016.
We’ve open sourced over 20 million lines of code to date and you can find a listing of some of
our best known project releases on our website. Here are some of our most popular projects:
- Android - a software
stack for mobile devices that includes an operating system, middleware and key
applications.
- Chromium - a project encompassing Chromium, the software behind Google Chrome, and
Chromium OS, the software behind Google Chrome OS devices.
- Angular - a web application framework
for JavaScript and Dart focused on developer productivity, speed and testability.
- TensorFlow - a library for numerical computation using data flow graphs with support for
scalable machine learning across platforms from data centers to embedded devices.
- Go - a statically typed and compiled
programming language that is expressive, concise, clean and efficient.
- Kubernetes - a system for automating
deployment, operations and scaling of containerized applications.
- Polymer - a lightweight
library built on top of Web Components APIs for building encapsulated re-usable elements in
web applications.
- Protobuf -
an extensible, language-neutral and platform-neutral mechanism for serializing structured
data.
- Guava - a set of
Java core libraries that includes new collection types (such as multimap and multiset),
immutable collections, a graph library, functional types, an in-memory cache, and
APIs/utilities for concurrency, I/O, hashing, primitives, reflection, string processing and
much more.
- Yeoman - a robust and opinionated set of
scaffolding tools including libraries and a workflow that can help developers quickly build
beautiful and compelling web applications.
While it’s difficult to measure the full scope of open source at Google, we can use the subset
of projects that are on GitHub to gather some interesting data. Today our GitHub footprint
includes over 84 organizations and 3,499 repositories, 773 of which were created this
year.
Googlers use countless languages from Assembly to XSLT, but what are their favorites? GitHub
flags the most heavily used language in a repository and we can use that to find out. A survey
of GitHub repositories shows us these are some of the languages Googlers use most often:
- JavaScript
- Java
- C/C++
- Go
- Python
- TypeScript
- Dart
- PHP
- Objective-C
- C#
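As a rough illustration of how such a ranking can be produced, here is a minimal sketch that tallies the primary language of each repository. The repository records and the `language` key are hypothetical stand-ins for whatever metadata a survey would collect; this is not the tooling used for the list above.

```python
from collections import Counter


def rank_primary_languages(repos):
    """Return primary languages ordered from most to least common.

    Each item in `repos` is a dict with a hypothetical "language" key holding
    the language flagged as most heavily used, or None when nothing was detected.
    """
    counts = Counter(repo["language"] for repo in repos if repo.get("language"))
    return [language for language, _ in counts.most_common()]


# Made-up sample data, for illustration only.
sample_repos = [
    {"name": "repo-a", "language": "JavaScript"},
    {"name": "repo-b", "language": "Java"},
    {"name": "repo-c", "language": "JavaScript"},
    {"name": "repo-d", "language": None},
    {"name": "repo-e", "language": "Go"},
    {"name": "repo-f", "language": "Java"},
    {"name": "repo-g", "language": "JavaScript"},
]

ranking = rank_primary_languages(sample_repos)
print(ranking)
```

Repositories with no detected language are skipped, so they do not distort the ranking.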
Many things can be gleaned using the open source GitHub dataset on BigQuery, like usage of
tabs versus spaces and the most popular Go packages. What about how many times Googlers have
committed to open source projects on GitHub? We can search for Google.com email addresses to
get a baseline number of Googler commits. Here’s our query:
SELECT count(*) as n
FROM [bigquery-public-data:github_repos.commits]
WHERE committer.date > '2016-01-01 00:00'
AND REGEXP_EXTRACT(author.email, r'.*@(.*)') = 'google.com'
With this, we learn that Googlers have made 142,527 commits to open source projects on GitHub
since the start of the year. This dataset goes back to 2011 and we can tweak this query to
find out that Googlers have made 719,012 commits since then. Again, this is just a baseline
number as it doesn’t count commits made with other email addresses.
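To make the query’s logic concrete, here is a small local sketch of the same filter: keep commits after a cutoff date and count those whose author email domain matches. It reuses the query’s regular expression, but the commit records, field names and helper functions are hypothetical and only stand in for the BigQuery table.

```python
import re
from datetime import datetime

# Same pattern as in the query: capture everything after the last "@".
DOMAIN_PATTERN = re.compile(r".*@(.*)")


def email_domain(email):
    """Return the domain part of an email address, or None if there is no "@"."""
    match = DOMAIN_PATTERN.match(email)
    return match.group(1) if match else None


def count_domain_commits(commits, domain, since):
    """Count commits made after `since` whose author email is at `domain`."""
    return sum(
        1
        for commit in commits
        if commit["committer_date"] > since
        and email_domain(commit["author_email"]) == domain
    )


# Made-up sample records, for illustration only.
sample_commits = [
    {"author_email": "alice@google.com", "committer_date": datetime(2016, 3, 1)},
    {"author_email": "bob@example.org", "committer_date": datetime(2016, 5, 9)},
    {"author_email": "carol@google.com", "committer_date": datetime(2015, 12, 31)},
    {"author_email": "dave@google.com", "committer_date": datetime(2016, 1, 1, 0, 1)},
]

n = count_domain_commits(sample_commits, "google.com", datetime(2016, 1, 1))
print(n)
```

As with the query, a commit made from a personal address is not counted, which is why the result is only a baseline.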
Looking back at the projects we’ve open-sourced in 2016, there’s a lot to be excited about. We
have released open source software, hardware and datasets. Let’s take a look at some of this
year’s releases.
Seesaw
Seesaw is a Linux Virtual Server
(LVS) based load balancing platform developed in Go by our Site Reliability Engineers. Seesaw,
like many projects, was built to scratch our own itch.
From our blog post announcing its release: “We needed the ability to handle traffic for unicast
and anycast VIPs, perform load balancing with NAT and DSR (also known as DR), and perform
adequate health checks against the backends. Above all we wanted a platform that allowed for
ease of management, including automated deployment of configuration changes.”
Vendor Security Assessment Questionnaire (VSAQ)
We assess the security of hundreds of vendors every year and have developed a process to
automate much of the initial information gathering with VSAQ. Many vendors found our
questionnaires intuitive and flexible, so we decided to share them. The VSAQ Framework includes
four extensible questionnaire templates covering web applications, privacy programs,
infrastructure as well as physical and data center security. You can learn more about it in our
announcement blog post.
OpenThread
OpenThread, released by Nest, is a complete implementation of the Thread protocol for connected
devices in the home.
This is especially important because of the fragmentation we’re seeing in this space.
Development of OpenThread is supported by ARM, Microsoft, Qualcomm, Texas Instruments and
other major vendors.
Magenta
Can we use machine learning to create compelling art and music? That’s the question that
animates Magenta, a project from the Google Brain team based on TensorFlow. The aim is to
advance the state of the art in machine intelligence for music and art generation and build a
collaborative community of artists, coders and machine learning researchers. Read the release
announcement for more information.
Omnitone
Virtual reality (VR) isn’t nearly as immersive without spatial audio and much of VR
development is taking place on proprietary platforms.
Omnitone is an open library built by members of the Chrome Team that brings spatial audio to
the browser. Omnitone builds on standard Web Audio APIs to deliver an immersive experience and
can be used alongside projects like WebVR. Find out more in our blog post announcing the
project’s release.
Science Journal
Today’s smartphones are packed with sensors that can tell us interesting things about the
world around us. We launched Science Journal to help educators, students and citizen scientists
tap into those sensors. You can learn more about the project in our announcement blog post.
Cartographer
Cartographer is a library for real-time simultaneous localization and mapping (SLAM) in 2D and
3D with Robot Operating System (ROS) support. Combining data from a variety of sensors, this
library computes positioning and maps surroundings. This is a key element of self-driving cars,
UAVs and robotics as well as efforts to map the insides of famous buildings. More information
on Cartographer can be found in our blog post announcing its release.
This is just a small sampling of what we’ve released this year. Follow the Google Open Source
Blog to stay apprised of Google’s open source software, hardware and data releases.