ICU integration: HarfBuzz Manual

ICU integration

API reference: hb-icu.

Although HarfBuzz includes its own Unicode-data functions, it also provides integration APIs for using the International Components for Unicode (ICU) library as a source of Unicode data on any supported platform.

The principal integration point with ICU is the hb_unicode_funcs_t Unicode-functions structure attached to a buffer. This structure holds the virtual methods used for retrieving Unicode character properties, such as General Category, Script, Combining Class, decomposition mappings, and mirroring information.

To use ICU in your client program, call hb_icu_get_unicode_funcs(), which returns a Unicode-functions structure populated with the ICU implementation of each method. You can then attach that structure to your buffer:

hb_unicode_funcs_t *icufunctions;
icufunctions = hb_icu_get_unicode_funcs();
hb_buffer_set_unicode_funcs(buf, icufunctions);

With the structure attached, HarfBuzz uses ICU for Unicode-data access on that buffer.

HarfBuzz also supplies a pair of functions (hb_icu_script_from_script() and hb_icu_script_to_script()) for converting between ICU's and HarfBuzz's internal enumerations of Unicode scripts. The hb_icu_script_from_script() function converts from a HarfBuzz hb_script_t to an ICU UScriptCode. The hb_icu_script_to_script() function does the reverse: converting from a UScriptCode identifier to a hb_script_t.

By default, HarfBuzz's ICU support is built as a separate shared library (libharfbuzz-icu.so) when compiling HarfBuzz from source. This allows client programs that do not need ICU to link against HarfBuzz without adding ICU as a dependency. Alternatively, you can build ICU support directly into the main HarfBuzz shared library (libharfbuzz.so) by passing the --with-icu=builtin option when configuring the build.