Formatted text

The FormattedDocument class maintains style information for individual characters in the text, rather than a single style for the whole document. Styles can be accessed and modified by name, for example:

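import pyglet
# Placeholder document so the calls below have something to act on
document = pyglet.text.document.FormattedDocument('Hello, world')

# Get the font name used at character index 0
font_name = document.get_style('font_name', 0)

# Set the font name and size for the first 5 characters
document.set_style(0, 5, dict(font_name='Arial', font_size=12))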

Internally, character styles are run-length encoded over the document text, so long documents with few style changes do not use excessive memory.
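To make the memory argument concrete, the following is a minimal sketch of run-length encoding for a single style name. The StyleRuns class and its methods are hypothetical names invented for this illustration; they are not pyglet's internal implementation, and the sketch assumes 0 <= start <= end <= length.

```python
class StyleRuns:
    """Hypothetical run-length store for one style name (not a pyglet class)."""

    def __init__(self, length, initial=None):
        # Each run is [character_count, value]; the counts sum to `length`.
        self.runs = [[length, initial]]

    def set_range(self, start, end, value):
        """Assign `value` to the characters in [start, end)."""
        new_runs = []
        pos = 0
        inserted = False
        for count, run_value in self.runs:
            run_start, run_end = pos, pos + count
            pos = run_end
            # Keep the part of this run that lies before the range.
            if run_start < start:
                new_runs.append([min(run_end, start) - run_start, run_value])
            # Insert the new run once, at the first run that reaches the range.
            if not inserted and run_end > start:
                new_runs.append([end - start, value])
                inserted = True
            # Keep the part of this run that lies after the range.
            if run_end > end:
                new_runs.append([run_end - max(run_start, end), run_value])
        # Drop empty runs and merge neighbours that carry the same value.
        merged = []
        for count, run_value in new_runs:
            if count <= 0:
                continue
            if merged and merged[-1][1] == run_value:
                merged[-1][0] += count
            else:
                merged.append([count, run_value])
        self.runs = merged

    def get(self, index):
        """Return the value in effect at character `index`."""
        pos = 0
        for count, value in self.runs:
            pos += count
            if index < pos:
                return value
        raise IndexError(index)


font_names = StyleRuns(10000)
font_names.set_range(0, 5, 'Arial')
print(font_names.runs)
```

In this sketch a 10,000-character document with one styled range is stored as two runs, however long the text is.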

From the document's point of view, there are no predefined style names: it simply maps names and character ranges to arbitrary Python values. It is the TextLayout classes that interpret this style information; for example, by selecting a different font based on the font_name style. Unrecognised style names are ignored by the layout -- you can use this knowledge to store additional data alongside the document text (for example, a URL behind a hyperlink).

Character styles

The following character styles are recognised by all TextLayout classes.

Where an attribute is marked "as a distance", the value is taken to be in pixels if given as an int or float. Otherwise it must be a string of the form "0u", where 0 is the distance and u is the unit: one of "px" (pixels), "pt" (points), "pc" (picas), "cm" (centimeters), "mm" (millimeters) or "in" (inches). For example, "14pt" is the distance covering 14 points, which at the default DPI of 96 is 18 pixels (14 × 96 / 72 ≈ 18.67, truncated to a whole pixel).
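The conversion described above can be sketched as a small helper. The parse_distance function below is a hypothetical illustration, not part of the pyglet API. It assumes that fractional results are truncated to whole pixels, which is consistent with the 18-pixel figure quoted for "14pt", and it uses the standard definitions of 72 points and 6 picas per inch.

```python
import re

# Number of each unit in one inch; pixels are handled separately.
_UNITS_PER_INCH = {'pt': 72.0, 'pc': 6.0, 'in': 1.0, 'cm': 2.54, 'mm': 25.4}
_DISTANCE_RE = re.compile(r'^(-?\d+(?:\.\d+)?)(px|pt|pc|cm|mm|in)$')


def parse_distance(distance, dpi=96):
    """Return `distance` in whole pixels (hypothetical helper)."""
    # Plain numbers are already pixel values.
    if isinstance(distance, (int, float)):
        return int(distance)
    match = _DISTANCE_RE.match(distance)
    if match is None:
        raise ValueError('cannot parse distance %r' % (distance,))
    value, unit = float(match.group(1)), match.group(2)
    if unit == 'px':
        return int(value)
    # Convert to inches, then to pixels; int() truncates (an assumption).
    return int(value * dpi / _UNITS_PER_INCH[unit])


print(parse_distance('14pt'))
print(parse_distance('1in'))
```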

font_name: Font family name, as given to pyglet.font.load.
font_size: Font size, in points.
bold: Boolean.
italic: Boolean.
underline: 4-tuple of ints in range (0, 255) giving RGBA underline color, or None (default) for no underline.
kerning: Additional space to insert between glyphs, as a distance. Defaults to 0.
baseline: Offset of glyph baseline from line baseline, as a distance. Positive values give a superscript, negative values give a subscript. Defaults to 0.
color: 4-tuple of ints in range (0, 255) giving RGBA text color.
background_color: 4-tuple of ints in range (0, 255) giving RGBA text background color; or None for no background fill.

Paragraph styles

Although FormattedDocument does not distinguish between character- and paragraph-level styles, TextLayout interprets the following styles only at the paragraph level. You should take care to set these styles for complete paragraphs only, for example, by using FormattedDocument.set_paragraph_style.

These styles are ignored for layouts without the multiline flag set.

align: "left" (default), "center" or "right".
indent: Additional horizontal space to insert before the first glyph of the first line of a paragraph, as a distance.
leading: Additional space to insert between consecutive lines within a paragraph, as a distance. Defaults to 0.
line_spacing: Distance between consecutive baselines in a paragraph, as a distance. Defaults to None, which automatically calculates the tightest line spacing for each line based on the maximum font ascent and descent.
margin_left: Left paragraph margin, as a distance.
margin_right: Right paragraph margin, as a distance.
margin_top: Margin above paragraph, as a distance.
margin_bottom: Margin below paragraph, as a distance. Adjacent margins do not collapse.
tab_stops: List of horizontal tab stops, as distances, measured from the left edge of the text layout. Defaults to the empty list. When the tab stops are exhausted, they implicitly continue at 50 pixel intervals.
wrap: Boolean. If True (the default), text wraps within the width of the layout.

For the purposes of these attributes, paragraphs are split by the newline character (U+000A) or the paragraph break character (U+2029). Line breaks within a paragraph can be forced with the line separator character (U+2028).
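The splitting rule can be illustrated with a few lines of plain Python. The split_paragraphs function is a hypothetical helper written for this example. It only shows which characters end a paragraph and which force a line break, and does not reproduce what a layout does with the result.

```python
import re

# A paragraph ends at a newline ('\n') or at the paragraph separator.
PARAGRAPH_BREAKS = re.compile('[\n\u2029]')
# The line separator forces a new line without ending the paragraph.
LINE_BREAK = '\u2028'


def split_paragraphs(text):
    """Return a list of paragraphs, each a list of forced lines."""
    return [p.split(LINE_BREAK) for p in PARAGRAPH_BREAKS.split(text)]


print(split_paragraphs('one\ntwo\u2028three\u2029four'))
```

Here 'two' and 'three' share one paragraph, and therefore one set of paragraph styles, while 'one' and 'four' each form their own.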

Attributed text

pyglet provides two formats for decoding formatted documents from plain text. These are useful for loading pre-prepared documents such as help screens. At this time there is no facility for saving (encoding) formatted documents.

The attributed text format is an encoding specific to pyglet that can exactly describe any FormattedDocument. You must use this encoding to access all of the features of pyglet text layout. For a more accessible, yet less featureful encoding, see the HTML encoding, described below.

The following example shows a simple document encoded as attributed text:

Chapter 1

My father's family name being Pirrip, and my Christian name Philip,
my infant tongue could make of both names nothing longer or more
explicit than Pip.  So, I called myself Pip, and came to be called
Pip.

I give Pirrip as my father's family name, on the authority of his
tombstone and my sister - Mrs. Joe Gargery, who married the
blacksmith.  As I never saw my father or my mother, and never saw
any likeness of either of them (for their days were long before the
days of photographs), my first fancies regarding what they were
like, were unreasonably derived from their tombstones.

Newlines are ignored, unless two are made in succession, indicating a paragraph break. Line breaks can be forced with the \\ sequence:

This is the way the world ends \\
This is the way the world ends \\
This is the way the world ends \\
Not with a bang but a whimper.

Line breaks are also forced when the text is indented with one or more spaces or tabs, which is useful for typesetting code:

The following paragraph has hard line breaks for every line of code:

    import pyglet

    window = pyglet.window.Window()
    pyglet.app.run()

Text can be styled using an attribute tag:

This sentence makes a {bold True}bold{bold False} statement.

The attribute tag consists of the attribute name (in this example, bold) followed by a Python bool, int, float, string, tuple or list.

Unlike most structured documents such as HTML, attributed text has no concept of the "end" of a style; styles merely change within the document. This corresponds exactly to the representation used by FormattedDocument internally.
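The "styles merely change" model can be sketched by turning attributed text into plain text plus a list of change points. The parse_style_changes function below is a hypothetical illustration, not pyglet's decoder. It uses ast.literal_eval as a stand-in for reading the Python value, and it ignores newline handling, escapes and numeric code points.

```python
import ast
import re

# Matches {name value}; an optional leading period is kept in the name.
_TAG_RE = re.compile(r'\{(\.?[A-Za-z_]\w*)\s+([^{}]+)\}')


def parse_style_changes(source):
    """Return (plain_text, changes) for a string of attributed text.

    `changes` lists (index, name, value) tuples: from `index` onwards the
    style `name` takes `value`, until a later tag changes it again.
    """
    text_parts = []
    changes = []
    length = 0   # number of plain characters emitted so far
    cursor = 0   # read position in `source`
    for match in _TAG_RE.finditer(source):
        chunk = source[cursor:match.start()]
        text_parts.append(chunk)
        length += len(chunk)
        # The value is a Python literal: bool, int, float, string, tuple or list.
        value = ast.literal_eval(match.group(2))
        changes.append((length, match.group(1), value))
        cursor = match.end()
    text_parts.append(source[cursor:])
    return ''.join(text_parts), changes


text, changes = parse_style_changes(
    'This sentence makes a {bold True}bold{bold False} statement.')
print(text)
print(changes)
```

There is no closing tag in the result: the second entry simply changes bold back to False at the index where the word ends.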

Some more examples follow:

{font_name 'Times New Roman'}{font_size 28}Hello{font_size 12},
{color (255, 0, 0, 255)}world{color (0, 0, 0, 255)}!

(This example uses 28pt Times New Roman for the word "Hello", and 12pt red text for the word "world".)

Paragraph styles can be set by prefixing the style name with a period (.). This ensures the style range exactly encompasses the paragraph:

{.margin_left "12px"}This is a block quote, as the margin is inset.

{.margin_left "24px"}This paragraph is inset yet again.

Attributed text can be loaded as a Unicode string. In addition, any character can be inserted given its Unicode code point in numeric form, either in decimal:

This text is Copyright {#169}.

or hexadecimal:

This text is Copyright {#xa9}.

The characters { and } can be escaped by duplicating them:

Attributed text uses many "{{" and "}}" characters.
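The numeric code point and brace escape rules above can be sketched with a single regular expression. The expand_escapes function is a hypothetical helper for illustration only. It handles just these escape forms and leaves attribute tags untouched.

```python
import re

# Doubled braces, hexadecimal code points, then decimal code points.
_ESCAPE_RE = re.compile(r'\{\{|\}\}|\{#x([0-9A-Fa-f]+)\}|\{#([0-9]+)\}')


def expand_escapes(source):
    """Replace {{, }}, {#N} and {#xN} with the characters they stand for."""
    def replace(match):
        token = match.group(0)
        if token == '{{':
            return '{'
        if token == '}}':
            return '}'
        if match.group(1) is not None:
            # Hexadecimal form, for example {#xa9}
            return chr(int(match.group(1), 16))
        # Decimal form, for example {#169}
        return chr(int(match.group(2)))
    return _ESCAPE_RE.sub(replace, source)


print(expand_escapes('This text is Copyright {#169}.'))
print(expand_escapes('Attributed text uses many "{{" and "}}" characters.'))
```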

Use the decode_attributed function to decode attributed text into a FormattedDocument:

document = pyglet.text.decode_attributed('Hello, {bold True}world')

HTML

While attributed text gives access to all of the features of FormattedDocument and TextLayout, it is quite verbose and difficult to write by hand. For convenience, pyglet provides an HTML 4.01 decoder that can translate a small, commonly used subset of HTML into a FormattedDocument.

Note that the decoder does not preserve the structure of the HTML document -- all notion of element hierarchy is lost in the translation, and only the visible style changes are preserved.
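The loss of hierarchy can be illustrated with Python's standard html.parser module. The FlatteningParser class below is a hypothetical stand-in written for this example, not pyglet's HTML decoder. It recognises only un-nested B and I elements and reduces them to plain text plus a flat list of style changes.

```python
from html.parser import HTMLParser


class FlatteningParser(HTMLParser):
    """Hypothetical sketch: keeps style changes, discards element structure."""

    STYLE_FOR_TAG = {'b': 'bold', 'i': 'italic'}

    def __init__(self):
        super().__init__()
        self.text = ''
        self.changes = []   # (index, style_name, value)

    def handle_starttag(self, tag, attrs):
        # An opening tag becomes "style turns on at this index".
        if tag in self.STYLE_FOR_TAG:
            self.changes.append((len(self.text), self.STYLE_FOR_TAG[tag], True))

    def handle_endtag(self, tag):
        # A closing tag becomes "style turns off at this index".
        if tag in self.STYLE_FOR_TAG:
            self.changes.append((len(self.text), self.STYLE_FOR_TAG[tag], False))

    def handle_data(self, data):
        self.text += data


def flatten(html):
    """Return (plain_text, changes) for a fragment of HTML."""
    parser = FlatteningParser()
    parser.feed(html)
    parser.close()
    return parser.text, parser.changes


print(flatten('Hello, <b>world</b>'))
```

Nothing in the output records that "world" was once a child of a B element; only the two change points remain.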

The following example uses decode_html to create a FormattedDocument from a string of HTML:

document = pyglet.text.decode_html('Hello, <b>world</b>')

The following elements are supported:

B BLOCKQUOTE BR CENTER CODE DD DIR DL EM FONT H1 H2 H3 H4 H5 H6 I IMG KBD
LI MENU OL P PRE Q SAMP STRONG SUB SUP TT U UL VAR

The style attribute is not supported, so font sizes must be given as HTML logical sizes in the range 1 to 7, rather than as point sizes. The corresponding font sizes, and some other stylesheet parameters, can be modified by subclassing HTMLDecoder.