Platform Integration Guide: HarfBuzz Manual

GNOME integration, GLib, and GObject
FreeType integration
Cairo integration
Uniscribe integration
Core Text integration
ICU integration
Python bindings

HarfBuzz was first developed for use with the GNOME and GTK software stack commonly found in desktop Linux distributions. Nevertheless, it can be used on other operating systems and platforms, from iOS and macOS to Windows. It can also be used with other application frameworks and components, such as Android, Qt, or application-specific widget libraries.

This chapter will look at how HarfBuzz fits into a typical text-rendering pipeline, and will discuss the APIs available to integrate HarfBuzz with contemporary Linux, Mac, and Windows software. It will also show how HarfBuzz integrates with popular external libraries like FreeType and International Components for Unicode (ICU) and describe the HarfBuzz language bindings for Python.

On a GNOME system, HarfBuzz is designed to tie in with several other common system libraries. The most common architecture uses Pango at the layer directly "above" HarfBuzz; Pango is responsible for text segmentation and for ensuring that each input hb_buffer_t passed to HarfBuzz for shaping contains Unicode code points that share the same segment properties (namely, direction, language, and script, but also higher-level properties like the active font, font style, and so on).

The layer directly "below" HarfBuzz is typically FreeType, which is used to rasterize glyph outlines at the necessary optical size, hinting settings, and pixel resolution. FreeType provides APIs for accessing font and face information, so HarfBuzz includes functions to create hb_face_t and hb_font_t objects directly from FreeType objects. HarfBuzz can also use FreeType's built-in functions to populate the font_funcs vtable in an hb_font_t.

FreeType's output is bitmaps of the rasterized glyphs; on a typical Linux system these will then be drawn by a graphics library like Cairo, but those details are beyond HarfBuzz's control. On the other hand, at the top end of the stack, Pango is part of the larger GNOME framework, and HarfBuzz does include APIs for working with key components of GNOME's higher-level libraries — most notably GLib.

For other operating systems or application frameworks, the critical integration points are where HarfBuzz gets font and face information about the font used for shaping and where HarfBuzz gets Unicode data about the input-buffer code points.

The font and face information is necessary for text shaping because HarfBuzz needs to retrieve the glyph indices for particular code points, and to know the extents and advances of glyphs. Note that, in an OpenType variable font, both of those types of information can change with different variation-axis settings.

The Unicode information is necessary for shaping because the properties of a code point, such as its General Category (gc), Canonical Combining Class (ccc), and decomposition, can directly affect the shaping operations that HarfBuzz performs.

GNOME integration, GLib, and GObject

As mentioned in the preceding section, HarfBuzz offers integration APIs to help client programs using the GNOME and GTK framework commonly found in desktop Linux distributions.

GLib is the main utility library for GNOME applications. It provides basic data types and conversions, file abstractions, string manipulation, and macros, as well as facilities like memory allocation and the main event loop.

Where text shaping is concerned, GLib provides several utilities that HarfBuzz can take advantage of, including a set of Unicode-data functions and a data type for script information. Both are useful when working with HarfBuzz buffers. To make use of them, you will need to include the hb-glib.h header file.

GLib's Unicode manipulation API includes all the functionality necessary to retrieve Unicode data for the unicode_funcs structure of a HarfBuzz hb_buffer_t.

The function hb_glib_get_unicode_funcs() sets up an hb_unicode_funcs_t structure configured with the GLib Unicode functions and returns a pointer to it.

You can attach this Unicode-functions structure to your buffer, and it will be ready for use with GLib:

      #include <hb-glib.h>
      ...
      /* Fetch the GLib-backed Unicode functions and attach them to buf. */
      hb_unicode_funcs_t *glibufunctions;
      glibufunctions = hb_glib_get_unicode_funcs();
      hb_buffer_set_unicode_funcs(buf, glibufunctions);

For script information, GLib uses the GUnicodeScript type. Like HarfBuzz's own hb_script_t, this data type is an enumeration of Unicode scripts, but text segments passed in from GLib code will be tagged with a GUnicodeScript. Therefore, when setting the script property on an hb_buffer_t, you will need to convert between the GUnicodeScript of the input provided by GLib and HarfBuzz's hb_script_t type.

The hb_glib_script_to_script() function takes a GUnicodeScript script identifier as its sole argument and returns the corresponding hb_script_t. The hb_glib_script_from_script() function does the reverse, taking an hb_script_t and returning the GUnicodeScript identifier for GLib.

Finally, GLib also provides a reference-counted object type called GBytes that is used for accessing raw memory segments with the benefits of GLib's lifecycle management. HarfBuzz provides an hb_glib_blob_create() function that lets you create an hb_blob_t directly from a GBytes object. This function takes only the GBytes object as its input; HarfBuzz registers the GLib destroy callback automatically.

The GNOME platform also features an object system called GObject. For HarfBuzz, the main advantage of GObject is a feature called GObject Introspection. This is a middleware facility that can be used to generate language bindings for C libraries. HarfBuzz uses it to build its Python bindings, which we will look at in a separate section.