Extending the parser
Modules such as page3 extend the CSS 2.1 parser to add support for CSS 3 syntax. They do so by subclassing css21.CSS21Parser and overriding or extending some of its methods. In fact, the parser is written as a class with methods (rather than a set of functions) solely to enable this kind of subclassing.
tinycss is designed so that parser subclasses can live outside of tinycss, without monkey-patching. If, however, the syntax you add comes from a W3C specification, consider putting your subclass in a new tinycss module and sending a pull request: see Hacking tinycss.
Example: star hack
The star hack uses invalid declarations that are only parsed by some versions of Internet Explorer. By default, tinycss ignores invalid declarations and logs an error.
>>> from tinycss.css21 import CSS21Parser
>>> css = '#elem { width: [W3C Model Width]; *width: [BorderBox Model]; }'
>>> stylesheet = CSS21Parser().parse_stylesheet(css)
>>> stylesheet.errors
[ParseError('Parse error at 1:35, expected a property name, got DELIM',)]
>>> [decl.name for decl in stylesheet.rules[0].declarations]
['width']
If, for example, a minifier based on tinycss wants to support the star hack, it can do so by extending the parser:
>>> class CSSStarHackParser(CSS21Parser):
...     def parse_declaration(self, tokens):
...         has_star_hack = (tokens[0].type == 'DELIM' and tokens[0].value == '*')
...         if has_star_hack:
...             tokens = tokens[1:]
...         declaration = super(CSSStarHackParser, self).parse_declaration(tokens)
...         declaration.has_star_hack = has_star_hack
...         return declaration
...
>>> stylesheet = CSSStarHackParser().parse_stylesheet(css)
>>> stylesheet.errors
[]
>>> [(d.name, d.has_star_hack) for d in stylesheet.rules[0].declarations]
[('width', False), ('width', True)]
This class extends the parse_declaration() method. It removes any * delimiter Token at the start of a declaration and adds a has_star_hack boolean attribute to parsed Declaration objects: True if a * was removed, False for “normal” declarations.
Parser methods
In addition to the methods of the user API (see Parsing a stylesheet), here are the methods of the CSS 2.1 parser that can be overridden or extended:
- CSS21Parser.parse_rules(tokens, context)
Parse a sequence of rules (rulesets and at-rules).
Parameters: - tokens – An iterable of tokens.
- context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)
Returns: A tuple of a list of parsed rules and a list of ParseError.
- CSS21Parser.read_at_rule(at_keyword_token, tokens)
Read an at-rule from a token stream.
Parameters: - at_keyword_token – The ATKEYWORD token that starts this at-rule. You may have read it already to distinguish the rule from a ruleset.
- tokens – An iterator of subsequent tokens. Will be consumed just enough for one at-rule.
Returns: An unparsed AtRule.
Raises : ParseError if the head is invalid for the core grammar. The body is not validated. See AtRule.
- CSS21Parser.parse_at_rule(rule, previous_rules, errors, context)
Parse an at-rule.
Subclasses that override this method must call super() and return its result for at-rules they do not know.
In CSS 2.1, this method handles @charset, @import, @media and @page rules.
Parameters: - rule – An unparsed AtRule.
- previous_rules – The list of at-rules and rulesets that have been parsed so far in this context. This list can be used to decide if the current rule is valid. (For example, @import rules are only allowed before anything but a @charset rule.)
- context – Either 'stylesheet' or an at-keyword such as '@media'. (Most at-rules are only allowed in some contexts.)
Raises : ParseError if the rule is invalid.
Returns: A parsed at-rule
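The delegation rule for parse_at_rule() can be sketched as follows. To stay runnable without tinycss installed, this sketch uses hypothetical stand-ins (StubAtRule, StubParseError, StubBaseParser) in place of the real AtRule, ParseError and CSS21Parser; only the method signature and the super() call mirror the description above. The @font-face handling and the tuples it returns are assumptions made purely for illustration.

```python
from collections import namedtuple

# Hypothetical stand-ins so the sketch runs without tinycss installed.
StubAtRule = namedtuple('StubAtRule', 'at_keyword head body line column')


class StubParseError(ValueError):
    """Stand-in for tinycss's ParseError."""


class StubBaseParser(object):
    """Stand-in for CSS21Parser; in this sketch it only knows @import."""

    def parse_at_rule(self, rule, previous_rules, errors, context):
        if rule.at_keyword == '@import':
            return ('import', rule.head)
        raise StubParseError('unknown at-rule: %s' % rule.at_keyword)


class FontFaceParser(StubBaseParser):
    """Handles one extra at-rule and delegates everything else."""

    def parse_at_rule(self, rule, previous_rules, errors, context):
        if rule.at_keyword == '@font-face':
            if context != 'stylesheet':
                raise StubParseError(
                    '@font-face rule not allowed in %s' % context)
            return ('font-face', rule.body)
        # Not ours: return whatever the parent returns, and let the
        # parent's error propagate for rules nobody knows.
        return super(FontFaceParser, self).parse_at_rule(
            rule, previous_rules, errors, context)


parser = FontFaceParser()
font_face = StubAtRule('@font-face', [], ['font-family: x'], 1, 1)
imported = StubAtRule('@import', ['"a.css"'], None, 2, 1)
print(parser.parse_at_rule(font_face, [], [], 'stylesheet'))
print(parser.parse_at_rule(imported, [], [], 'stylesheet'))
```

With the real parser, the subclass would derive from CSS21Parser and return its own parsed rule object instead of a tuple.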
- CSS21Parser.parse_media(tokens)
For CSS 2.1, parse a list of media types.
Media Queries are expected to override this.
Parameters: tokens – A list of tokens.
Raises : ParseError on invalid media types/queries.
Returns: For CSS 2.1, a list of media types as strings.
- CSS21Parser.parse_page_selector(tokens)
Parse an @page selector.
Parameters: tokens – An iterable of tokens, typically from the head attribute of an unparsed AtRule.
Returns: A page selector. For CSS 2.1, this is 'first', 'left', 'right' or None.
Raises : ParseError on invalid selectors.
- CSS21Parser.parse_declarations_and_at_rules(tokens, context)
Parse a mixed list of declarations and at-rules, as found e.g. in the body of an @page rule.
Note that to add supported at-rules inside @page, CSSPage3Parser extends parse_at_rule(), not this method.
Parameters: - tokens – An iterable of tokens, typically from the body attribute of an unparsed AtRule.
- context – An at-keyword such as '@page'. (Most at-rules are only allowed in some contexts.)
Returns: A tuple of:
- A list of Declaration
- A list of parsed at-rules (empty for CSS 2.1)
- A list of ParseError
- CSS21Parser.parse_ruleset(first_token, tokens)
Parse a ruleset: a selector followed by a declaration block.
Parameters: - first_token – The first token of the ruleset (probably of the selector). You may have read it already to distinguish the rule from an at-rule.
- tokens – an iterator of subsequent tokens. Will be consumed just enough for one ruleset.
Returns: a tuple of a RuleSet and an error list. The errors are recovered ParseError in declarations. (Parsing continues from the next declaration on such errors.)
Raises : ParseError if the selector is invalid for the core grammar. Note that a selector can be valid for the core grammar but not for CSS 2.1 or another level.
- CSS21Parser.parse_declaration_list(tokens)
Parse a ; separated declaration list.
You may want to use parse_declarations_and_at_rules() (or some other method that uses parse_declaration() directly) instead if the same context can contain more than just declarations.
Parameters: tokens – an iterable of tokens. Should stop at (before) the end of the block, as marked by }.
Returns: a tuple of the list of valid Declaration and a list of ParseError
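The return shape of parse_declaration_list() (valid declarations plus a list of recovered errors) and the "continue from the next declaration" behaviour can be pictured with the reduced sketch below. It is not tinycss's implementation: Tok and StubParseError are hypothetical stand-ins, whitespace tokens are left out, and the stand-in parse_declaration() only checks for a property name, a colon and at least one value token.

```python
from collections import namedtuple

# Hypothetical stand-ins; real tokens and errors come from tinycss.
Tok = namedtuple('Tok', 'type value')


class StubParseError(ValueError):
    """Stand-in for tinycss's ParseError."""


def parse_declaration(tokens):
    """Reduced stand-in: expects IDENT, ':' and at least one value token."""
    if tokens[0].type != 'IDENT':
        raise StubParseError(
            'expected a property name, got %s' % tokens[0].type)
    if len(tokens) < 3 or tokens[1].type != ':':
        raise StubParseError("expected ':' and a value")
    return (tokens[0].value, [t.value for t in tokens[2:]])


def parse_declaration_list(tokens):
    """Split on ';', keep valid declarations and collect the errors."""
    declarations, errors = [], []
    chunk = []
    # A trailing ';' sentinel flushes the last declaration.
    for token in list(tokens) + [Tok(';', ';')]:
        if token.type != ';':
            chunk.append(token)
            continue
        if chunk:  # Empty declarations are skipped, not reported.
            try:
                declarations.append(parse_declaration(chunk))
            except StubParseError as exc:
                errors.append(exc)  # Recover with the next declaration.
        chunk = []
    return declarations, errors


# Tokens for "width:10px;*width:5px;;color:red"
tokens = [
    Tok('IDENT', 'width'), Tok(':', ':'), Tok('DIMENSION', '10px'),
    Tok(';', ';'),
    Tok('DELIM', '*'), Tok('IDENT', 'width'), Tok(':', ':'),
    Tok('DIMENSION', '5px'), Tok(';', ';'),
    Tok(';', ';'),
    Tok('IDENT', 'color'), Tok(':', ':'), Tok('IDENT', 'red'),
]
declarations, errors = parse_declaration_list(tokens)
print(declarations)
print([str(error) for error in errors])
```

The invalid *width declaration ends up in the error list while the declarations before and after it are still returned.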
- CSS21Parser.parse_declaration(tokens)
Parse a single declaration.
Parameters: tokens – an iterable of at least one token. Should stop at (before) the end of the declaration, as marked by a ; or }. Empty declarations (i.e. consecutive ; with only white space in between) should be skipped earlier and not passed to this method.
Returns: a Declaration
Raises : ParseError if the tokens do not match the ‘declaration’ production of the core grammar.
Unparsed at-rules
- class tinycss.css21.AtRule(at_keyword, head, body, line, column)
An unparsed at-rule.
- at_keyword
The normalized (lower-case) at-keyword as a string. E.g. '@page'
- head
The part of the at-rule between the at-keyword and the { marking the body, or the ; marking the end of an at-rule without a body. A TokenList.
- body
The content of the body between { and } as a TokenList, or None if there is no body (i.e. if the rule ends with ;).
The head was validated against the core grammar but not the body, as the body might contain declarations. In case of an error in a declaration, parsing should continue from the next declaration. The whole rule should not be ignored as it would be for an error in the head.
These at-rules are expected to be parsed further before reaching the user API.
Parsing helper functions
The tinycss.parsing module contains helper functions for parsing tokens into a more structured form:
- tinycss.parsing.strip_whitespace(tokens)
Remove whitespace at the beginning and end of a token list.
Whitespace tokens in-between other tokens in the list are preserved.
Parameters: tokens – A list of Token or ContainerToken.
Returns: A new sub-sequence of the list.
- tinycss.parsing.split_on_comma(tokens)
Split a list of tokens on commas, i.e. DELIM tokens with the value ','.
Only “top-level” comma tokens are splitting points, not commas inside a function or other ContainerToken.
Parameters: tokens – An iterable of Token or ContainerToken.
Returns: A list of lists of tokens
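The behaviour described for strip_whitespace() and split_on_comma() can be illustrated with the sketch below. It re-implements both helpers on a hypothetical stand-in token type (Tok) so that it runs without tinycss; it is not the library's code. The 'S' type for whitespace tokens and the single FUNCTION token standing for a container are assumptions of this sketch.

```python
from collections import namedtuple

# Hypothetical stand-in for tinycss tokens; only type and value matter here.
Tok = namedtuple('Tok', 'type value')


def strip_whitespace(tokens):
    """Return a new list without leading and trailing whitespace tokens."""
    start, end = 0, len(tokens)
    while start < end and tokens[start].type == 'S':
        start += 1
    while end > start and tokens[end - 1].type == 'S':
        end -= 1
    return tokens[start:end]


def split_on_comma(tokens):
    """Split on top-level ',' DELIM tokens."""
    parts, current = [], []
    for token in tokens:
        if token.type == 'DELIM' and token.value == ',':
            parts.append(current)
            current = []
        else:
            current.append(token)
    parts.append(current)
    return parts


# Tokens for " screen, rgb(0, 0, 0) ". The function is a single container
# token, so the commas inside it never appear at the top level.
tokens = [
    Tok('S', ' '), Tok('IDENT', 'screen'), Tok('DELIM', ','), Tok('S', ' '),
    Tok('FUNCTION', 'rgb(0, 0, 0)'), Tok('S', ' '),
]
parts = [strip_whitespace(part)
         for part in split_on_comma(strip_whitespace(tokens))]
print([[token.value for token in part] for part in parts])
```

Stripping each part after splitting is a common pattern, since the whitespace that follows a comma stays at the start of the next part.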
- tinycss.parsing.validate_value(tokens)
Validate a property value.
Parameters: tokens – an iterable of tokens
Raises : ParseError if there is any invalid token for the ‘value’ production of the core grammar.
- tinycss.parsing.validate_block(tokens, context)
Raises : ParseError if there is any invalid token for the ‘block’ production of the core grammar.
Parameters: - tokens – an iterable of tokens
- context – a string for the ‘unexpected in ...’ message
- tinycss.parsing.validate_any(token, context)
Raises : ParseError if this is an invalid token for the ‘any’ production of the core grammar.
Parameters: - token – a single token
- context – a string for the ‘unexpected in ...’ message