Glossary
- alphabetic character
- Characters between which line breaks are usually not allowed,
unless other characters provide break opportunities
(this term is inaccurate from the point of view of grammatology).
[UAX14] assigns most alphabetic characters to
line breaking class AL.
See also ideographic character.
- ambiguous quotation mark
- To be written
- complex breaking
- Heuristic, dictionary-based line breaking for several scripts
in which breaking positions are not obvious from the individual characters.
[UAX14] assigns the characters of several South East Asian scripts
that need complex breaking to line breaking class SA.
- direct break
- A line break opportunity exists between two adjacent characters.
See also indirect break, mandatory break.
- East_Asian_Width
- Informative property of Unicode characters defined by Unicode Standard
Annex #11 ([UAX11]).
It corresponds to the “width” (glyph spacing) of each character in
implementations of East Asian encodings.
See also number of columns.
- grapheme cluster
- A concept defined by Unicode Standard Annex #29 ([UAX29]).
A grapheme cluster is a sequence of one or more Unicode characters that
consists of one grapheme base and optional grapheme extender and/or
“prepend” characters. It is close to what people consider a
“character”.
- hangul
- A syllabary used for the Korean language.
In the traditional sense, hangul characters behave as
ideographic characters, while each character consists
of a few jamo that represent features of pronunciation.
- ideographic character
- Characters that usually allow line breaks both before and after
themselves
(this term is inaccurate from the point of view of grammatology).
[UAX14] assigns most ideographic characters to
line breaking class ID.
See also alphabetic character.
- indirect break
- A line break opportunity exists between two characters only if
they are separated by one or more spaces.
See also direct break, mandatory break.
- line breaking class
- Classification of Unicode characters defined by Unicode Standard
Annex #14 ([UAX14]).
- mandatory break
- Obligatory line breaking behavior defined by core
rules and performed regardless of surrounding characters.
See also direct break, indirect break.
- non-starter
- A character that cannot be placed at the beginning of a line.
[UAX14] assigns non-starters to line breaking class
NS or CJ.
They include small hiragana, small katakana and some punctuation marks.
- number of columns
- The number of columns of a string is not always equal to the number of
characters it contains:
each character is either wide, narrow or nonspacing,
occupying 2, 1 or 0 columns, respectively.
Several characters may be either wide or narrow depending on the context
in which they are used.
Characters may also be given other widths by customization.
- virama sign
- A sign found in many Brahmi-derived abugidas of South Asia and
South East Asia.
Its primary use is to cancel the inherent vowel of a consonant.
In several writing systems, it is also used to form consonant
clusters.
In the Unicode Standard, some virama characters are also used
to represent transformation of ligated character sequences.
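
The column counting described under "number of columns" can be sketched roughly as follows. This is only an illustration, not the method of any particular library: it uses Python's standard `unicodedata` module, the function name `columns` and the `ambiguous_wide` flag are hypothetical, and it assumes that characters with East_Asian_Width W or F are wide, that nonspacing and enclosing marks (general categories Mn and Me) occupy no column, and that everything else is narrow.

```python
import unicodedata


def columns(text, ambiguous_wide=False):
    """Estimate the number of columns occupied by text.

    Assumptions of this sketch (not taken from any library):
    - East_Asian_Width W or F counts as wide (2 columns);
    - general category Mn or Me counts as nonspacing (0 columns);
    - East_Asian_Width A is wide only when ambiguous_wide is true;
    - every other character is narrow (1 column).
    """
    total = 0
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me"):
            continue  # nonspacing mark: contributes 0 columns
        width_class = unicodedata.east_asian_width(ch)
        if width_class in ("W", "F"):
            total += 2
        elif width_class == "A" and ambiguous_wide:
            total += 2  # context-dependent character in an East Asian context
        else:
            total += 1
    return total
```

Under these assumptions, `columns("日本語")` gives 6, while `columns("e\u0301")` gives 1 because the combining accent is nonspacing. The `ambiguous_wide` flag stands in for the context that decides whether a character such as U+00B1 is treated as wide or narrow.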