Posted by the Google Fonts team
The Google Fonts catalog now includes Japanese web fonts. Since shipping Korean in February, we have been working to optimize the font slicing system and extend it to support Japanese. The optimization efforts proved fruitful—Korean users now transfer on average over 30% fewer bytes than our previous best solution. This type of on-going optimization is a major goal of Google Fonts.
Japanese presents many of the same core challenges as Korean:
The impact of the large character set made up of complex glyph contours is multiplicative, resulting in very large font files. Meanwhile, the complex writing system and character interactions forced us to refine our analysis process.
To begin supporting Japanese, we gathered character frequency data from millions of Japanese webpages and analyzed them to inform how to slice the fonts. Users download only the slices they need for a page, typically avoiding the majority of the font. Over time, as they visit more pages and cache more slices, their experience becomes ever faster. This approach is compatible with many scripts because it is based on observations of real-world usage.
Frequency of the popular Japanese and Korean characters on the web
As shown above, Korean and Japanese have a relatively small set of characters that are used extremely frequently, and a very long tail of rarely used characters. On any given page most of the characters will be from the high frequency part, often with a few rarer characters mixed in.
We tried fancier segmentation strategies, but the most performant method for Korean turned out to be simple:
A user of Google Fonts viewing a webpage will download only the slices needed for the characters on the page. This yielded great results, as clients downloaded 88% fewer bytes than a naive strategy of sending the whole font. While brainstorming how to make things even faster, we had a bit of a eureka moment, realizing that:
In combination, these features mean we no longer have to worry about queuing delays as we would have under HTTP/1.1, and therefore we can do much more fine-grained slicing.
Our analyses of the Japanese and Korean web shows most pages tend to use mostly common characters, plus a few rarer ones. To optimize for this, we tested a variety of finer-grained strategies on the common characters for both languages.
We concluded that the following is the best strategy for Korean, with clients downloading 38% fewer bytes than our previous best strategy:
For Japanese, we found that segmenting the first 3,000 characters into 20 slices was best, resulting in clients downloading 80% fewer bytes than they would if we just sent the whole font. Having sufficiently reduced transfer sizes, we now feel confident in offering Japanese web fonts for the first time!
Now that both Japanese and Korean are live on Google Fonts, we have even more ideas for further optimization—and we will continue to ship updates to make things faster for our users. We are also looking forward to future collaborations with the W3C to develop new web standards and go beyond what is possible with today's technologies (learn more here).
PS - Google Fonts is hiring :)