HarfBuzz requires some simple functions for accessing information from the Unicode Character Database (such as the General_Category (gc) and Script (sc) properties) that is useful for shaping, as well as some useful operations like composing and decomposing code points.
HarfBuzz includes its own internal, lightweight set of Unicode functions. At build time, it is also possible to compile support for some other options, such as the Unicode functions provided by GLib or the International Components for Unicode (ICU) library. Generally, this option is only of interest for client programs that have specific integration requirements or that do a significant amount of customization.
If your program has access to other Unicode functions, however, such as through a system library or application framework, you might prefer to use those instead of the built-in options. HarfBuzz supports this by implementing its Unicode functions as a set of virtual methods that you can replace — without otherwise affecting HarfBuzz's functionality.
The Unicode functions are specified in a structure called unicode_funcs, which is attached to each buffer. But even though unicode_funcs is associated with a hb_buffer_t, the functions themselves are called by other HarfBuzz APIs that access buffers, so it would be unwise for you to hook different functions into different buffers.
In addition, you can mark your unicode_funcs as immutable by calling hb_unicode_funcs_make_immutable (ufuncs). This is especially useful if your code is a library or framework that will have its own client programs. By marking your Unicode function choices as immutable, you prevent your own client programs from changing the unicode_funcs configuration and introducing inconsistencies and errors downstream.
You can retrieve the Unicode-functions configuration for your buffer by calling hb_buffer_get_unicode_funcs():

    hb_unicode_funcs_t *ufunctions;
    ufunctions = hb_buffer_get_unicode_funcs (buf);
The current version of unicode_funcs uses six functions:

hb_unicode_combining_class_func_t: returns the Canonical Combining Class of a code point.
hb_unicode_general_category_func_t: returns the General Category (gc) of a code point.
hb_unicode_mirroring_func_t: returns the Mirroring Glyph code point (for bi-directional replacement) of a code point.
hb_unicode_script_func_t: returns the Script (sc) property of a code point.
hb_unicode_compose_func_t: returns the canonical composition of a sequence of two code points.
hb_unicode_decompose_func_t: returns the canonical decomposition of a code point.
Note, however, that future HarfBuzz releases may alter this set.
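Each Unicode function has a corresponding setter, with which you can install your replacement function as the callback. For example, to replace hb_unicode_general_category_func_t, you can call

    hb_unicode_funcs_set_general_category_func (ufuncs, func, user_data, destroy)

where func is your replacement callback, user_data is an optional pointer that is passed back to the callback, and destroy is an optional function that is called to release user_data when it is no longer needed.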
Virtualizing this set of Unicode functions is primarily intended to improve portability. There is no need for every client program to make the effort to replace the default options, so if you are unsure, do not feel any pressure to customize unicode_funcs.