HTML5 Print¶

Module Contents¶

This tool pretty-prints your HTML, CSS and JavaScript files. The package comes in two parts:

a command line tool, html5-print

a python module, html5print

Introduction¶

This module reformats web page code to make it more readable. It is aimed at developers and is therefore not optimized for speed. I started out looking for such a tool and ended up writing this module. Hope it helps you!

Key features:

Pretty print HTML as well as embedded CSS and JavaScript within it

Pretty print pure CSS and JavaScript

Try to fix fragmented HTML5

Try to fix HTML with broken unicode encoding

Try to guess the encoding of the document and, in some cases, convert 8-bit byte code back into correct UTF-8

Support both Python 2 and 3
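
The encoding repair listed above concerns text whose UTF-8 bytes were mistakenly decoded with an 8-bit codec. Below is a minimal sketch of that idea using only the standard library. `repair_mojibake` is a hypothetical helper written for illustration; it is not part of html5print, whose own implementation may work differently.

```python
def repair_mojibake(text, wrong_codec="latin-1"):
    """Undo a wrong 8-bit decode of UTF-8 bytes (hypothetical helper)."""
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode(wrong_codec).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not mis-decoded UTF-8; leave it untouched.
        return text


# Simulate the damage: UTF-8 bytes read as Latin-1.
broken = "halló".encode("utf-8").decode("latin-1")
print(repair_mojibake(broken))
```

Text that cannot be round-tripped this way, such as correctly decoded Chinese or Japanese, is returned unchanged.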

Installation¶

$ [sudo] pip install html5print

Uninstallation¶

$ [sudo] pip uninstall html5print
$ [sudo] pip uninstall bs4 html5lib slimit tinycss2 requests chardet

Command Line Tool¶

Synopsis¶

$ html5-print --help
usage: html5-print [-h] [-o OUTFILE] [-s INDENT_WIDTH] [-e ENCODING]
                    [-t {html,js,css}] [-v]
                    infile

Beautify HTML5, CSS, JavaScript - Version 0.1.2 (By Bernard Yue)
This tool reformat the input and return a beautified version,
in unicode.

positional arguments:
  infile                filename | url | -, a dash, which represents stdin

optional arguments:
  -h, --help            show this help message and exit
  -o OUTFILE, --output OUTFILE
                        filename for formatted HTML, stdout if omitted
  -s INDENT_WIDTH, --indent-width INDENT_WIDTH
                        number of space for indentation, default 2
  -e ENCODING, --encoding ENCODING
                        encoding of input, default UTF-8
  -t {html,js,css}, --filetype {html,js,css}
                        type of file to parse, default "html"
  -v, --version         show program's version number and exit
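
For readers who want to script against the same options, the help text above can be mirrored with `argparse`. This is only a reconstruction from the usage message, not the tool's actual source; the function name `build_parser` and the destination names are assumptions.

```python
import argparse


def build_parser():
    """Rebuild the documented options with argparse (illustrative only)."""
    parser = argparse.ArgumentParser(prog="html5-print")
    # A single dash stands for stdin, as in the help text.
    parser.add_argument("infile", help="filename | url | - (stdin)")
    parser.add_argument("-o", "--output", dest="outfile", default=None)
    parser.add_argument("-s", "--indent-width", type=int, default=2)
    parser.add_argument("-e", "--encoding", default="UTF-8")
    parser.add_argument("-t", "--filetype", choices=["html", "js", "css"],
                        default="html")
    return parser


# Same argument style as the examples in this document: "-s4 -".
args = build_parser().parse_args(["-s4", "-t", "css", "-"])
print(args.indent_width, args.filetype, args.infile)
```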

Example¶

Pretty print HTML:

$ html5-print -s4 -
Press Ctrl-D when finished
<html><head><title>Small HTML page</title>
<style>p { margin: 10px 20px; color: black; }</style>
<script>function myFunction() {
document.getElementById("demo").innerHTML = "Paragraph changed.";
}</script>
</head><body>
<p>Some text for testing</body></html>
^D
<html>
    <head>
        <title>
            Small HTML page
        </title>
        <style>
            p {
                margin              : 10px 20px;
                color               : black;
            }
        </style>
        <script>
            function myFunction() {
                document.getElementById("demo").innerHTML = "Paragraph changed.";
            }
        </script>
    </head>
    <body>
        <p>
            Some text for testing
        </p>
    </body>
</html>
$

Create a valid HTML5 document from an HTML fragment:

$ html5-print -s4 -
Press Ctrl-D when finished
<title>Hello in different language</title>
<p>Here is "hello" in different languages</p>
<ul>
<li>Hello
<li>您好
<li>こんにちは
<li>Dobrý den,
<li>สวัสดี
^D
<html>
    <head>
        <title>
            Hello in different language
        </title>
    </head>
    <body>
        <p>
            Here is "hello" in different languages
        </p>
        <ul>
            <li>
                Hello
            </li>
            <li>
                您好
            </li>
            <li>
                こんにちは
            </li>
            <li>
                Dobrý den,
            </li>
            <li>
                สวัสดี
            </li>
        </ul>
    </body>
</html>
$

Testing¶

The module uses pytest. Use pip to install pytest.

$ [sudo] pip install pytest

Then run the tests as usual.

$ tar zxf html5print-0.1.2.tar.gz
$ cd html5print-0.1.2
$ python setup.py test

License¶

This module is distributed under Apache License Version 2.0.

Python API¶

class html5print.CSSBeautifier¶

Bases: html5print.utils.BeautifierBase

A CSS beautifier that pretty-prints CSS. It loosely supports CSS3.

classmethod beautify(css, indent=2, encoding=None)¶

Prettify css by reindenting it to a width of indent per level. css is expected to be a valid Cascading Style Sheet.

Parameters:
css – valid CSS as a multiline string
indent – width of indentation per level
encoding – expected encoding of css. If None, it will be guessed
Returns:	reindented css

>>> # a single css rule
>>> from html5print import CSSBeautifier
>>> css = ".para { margin: 10px 20px; }"
>>> print(CSSBeautifier.beautify(css))
.para {
  margin              : 10px 20px;
}

>>> # multiple css rules
>>> import os
>>> from html5print import CSSBeautifier
>>> css = ".para { margin: 10px 20px; }"
>>> css += os.linesep + "p { border: 5px solid red; }"
>>> print(CSSBeautifier.beautify(css))
.para {
  margin              : 10px 20px;
}
p {
  border              : 5px solid red;
}

>>> # pseudo-element css rule
>>> from html5print import CSSBeautifier
>>> css = ' /* beginning of css*/\n ::after { margin: 10px 20px; }'
>>> print(CSSBeautifier.beautify(css))
/* beginning of css*/
::after {
  margin              : 10px 20px;
}

>>> # pseudo-element css rule with different indent
>>> from html5print import CSSBeautifier
>>> css = ' /* beginning of css*/\n ::after { margin: 10px 20px; }'
>>> print(CSSBeautifier.beautify(css, 4))
/* beginning of css*/
::after {
    margin              : 10px 20px;
}

>>> # pseudo-element css rules with comments in between
>>> import os
>>> from html5print import CSSBeautifier
>>> css = ' /* beginning of css*/\n ::after { margin: 10px 20px; }'
>>> css += os.linesep + ' /* another comment */p {'
>>> css += 'h1 : color: #36CFFF; font-weight: normal;}'
>>> print(CSSBeautifier.beautify(css, 4))
/* beginning of css*/
::after {
    margin              : 10px 20px;
}
/* another comment */
p {
    h1                  : color: #36CFFF;
    font-weight         : normal;
}

>>> # media query
>>> from html5print import CSSBeautifier
>>> css = '''@media (-webkit-min-device-pixel-ratio:0) {
... h2.collapse { margin: -22px 0 22px 18px;
... }
... ::i-block-chrome, h2.collapse { margin: 0 0 22px 0; } }
... '''
>>> print(CSSBeautifier.beautify(css, 4))
@media (-webkit-min-device-pixel-ratio:0) {
    h2.collapse {
        margin              : -22px 0 22px 18px;
    }
    ::i-block-chrome, h2.collapse {
        margin              : 0 0 22px 0;
    }
}
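
In the outputs above, each declaration is indented by one level per nesting depth and the property name is padded so that the colons line up. The sketch below reproduces that layout for a single declaration. The name `format_declaration` is hypothetical, and the 20-column name width is inferred from the sample output rather than taken from the module's source.

```python
def format_declaration(name, value, indent=2, level=1, name_width=20):
    """Lay out one CSS declaration like the sample output (sketch only)."""
    # One `indent` worth of spaces per nesting level.
    prefix = " " * (indent * level)
    # Pad the property name to a fixed column so the colons align.
    return "{}{}: {};".format(prefix, name.ljust(name_width), value)


print(format_declaration("margin", "10px 20px"))
print(format_declaration("color", "red"))
# A declaration nested inside an @media rule sits one level deeper.
print(format_declaration("margin", "0 0 22px 0", indent=4, level=2))
```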

classmethod beautifyTextInHTML(html, indent=2, encoding=None)¶

Beautify CSS within the <style></style> tag. HTML comment(s) (i.e. `<!-- ... -->`) within the style tag, if any, will be moved to the end of the tag block.

Note: The function assumes that the <style> tag is the first element on its line (except for whitespace). The style block is indented by the indentation of the <style> tag plus one additional indent level.

Parameters:
html – html as string
indent – width of indentation for embedded CSS in HTML
Returns:	html with CSS beautified (i.e. text within `<style>...</style>`)

>>> # pretty print css
>>> from html5print import CSSBeautifier
>>> html = '''<html><body>
...   <style>
...     .para { margin: 10px 20px; }
... <!-- This is what the function is dealing with-->
... p { color: red; font-style: normal; }
...   </style>
... </body></html>'''
>>> print(CSSBeautifier.beautifyTextInHTML(html))
<html><body>
  <style>
    .para {
      margin              : 10px 20px;
    }
    p {
      color               : red;
      font-style          : normal;
    }
    <!-- This is what the function is dealing with-->
  </style>
</body></html>
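
The example above moves the HTML comment to the end of the style block. A rough sketch of that reordering step, using a regular expression on the text inside the tag, is shown here. `move_comments_to_end` is a hypothetical helper, not a function exported by html5print, and it ignores indentation entirely.

```python
import re

# Matches an HTML comment, including comments that span several lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def move_comments_to_end(block_text):
    """Return the block text with HTML comments placed after the rest."""
    comments = HTML_COMMENT.findall(block_text)
    body = HTML_COMMENT.sub("", block_text)
    # Drop the blank lines left behind by the removed comments.
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    return "\n".join(lines + comments)


css = ".para { margin: 10px 20px; }\n<!-- note -->\np { color: red; }"
print(move_comments_to_end(css))
```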

>>> # <style> not the first element, no pretty print
>>> from html5print import CSSBeautifier
>>> html = '''<html><body><style>
...     .para { margin: 10px 20px; }
... <!-- This is what the function is dealing with-->
... p { color: red; font-style: normal; }
...   </style>
... </body></html>'''
>>> print(CSSBeautifier.beautifyTextInHTML(html))
<html><body><style>
    .para { margin: 10px 20px; }
<!-- This is what the function is dealing with-->
p { color: red; font-style: normal; }
  </style>
</body></html>
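
The two examples above differ only in whether `<style>` is the first element on its line. The sketch below shows one way such a check could be written, together with the resulting indentation (tag indentation plus one level). The names `STYLE_LINE` and `css_indent_columns` are assumptions made for illustration and do not come from the module.

```python
import re

# A <style> tag preceded only by spaces or tabs on its line.
STYLE_LINE = re.compile(r"^([ \t]*)<style\b", re.IGNORECASE | re.MULTILINE)


def css_indent_columns(html, indent=2):
    """Return the column where CSS rules would start, or None.

    None means no <style> tag is the first element on its line, the
    case in which the documented method leaves the block untouched.
    """
    match = STYLE_LINE.search(html)
    if match is None:
        return None
    # Indentation of the tag plus one further indent level.
    return len(match.group(1)) + indent


print(css_indent_columns("<html><body>\n  <style>\np { color: red; }\n  </style>"))
print(css_indent_columns("<html><body><style>\np { color: red; }\n  </style>"))
```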

class html5print.JSBeautifier¶

Bases: html5print.utils.BeautifierBase

A JavaScript beautifier that pretty-prints JavaScript.

classmethod beautify(js, indent=2, encoding=None)¶

Prettify js by reindenting it to a width of indent per level. js is expected to be valid JavaScript.

Parameters:
js – valid JavaScript as a multiline string
indent – width of indentation per level
encoding – expected encoding of js. If None, it will be guessed
Returns:	reindented javascript

>>> from html5print import JSBeautifier
>>> js = '''function myFunction() {
... document.getElementById("demo").innerHTML = "Paragraph changed.";
... }'''

>>> # test default indent of 2 spaces
>>> print(JSBeautifier.beautify(js))
function myFunction() {
  document.getElementById("demo").innerHTML = "Paragraph changed.";
}

>>> # test indent of 4 spaces
>>> print(JSBeautifier.beautify(js, 4))
function myFunction() {
    document.getElementById("demo").innerHTML = "Paragraph changed.";
}

classmethod beautifyTextInHTML(html, indent=2, encoding=None)¶

Beautify JavaScript within the <script></script> tag. HTML comment(s) (i.e. `<!-- ... -->`) within the script tag, if any, will be moved to the end of the tag block.

Parameters:
html – html as string
indent – width of indentation for embedded javascript in HTML
Returns:	html with javascript beautified (i.e. text within `<script>...</script>`)

>>> from html5print import JSBeautifier
>>> js = '''<html><body>
...   <script>function myFunction() {
... document.getElementById("demo").innerHTML = "Paragraph changed.";
... }
...   </script>
... </body></html>
... '''
>>> print(JSBeautifier.beautifyTextInHTML(js))
<html><body>
  <script>
    function myFunction() {
      document.getElementById("demo").innerHTML = "Paragraph changed.";
    }
  </script>
</body></html>

class html5print.HTMLBeautifier¶

Bases: html5print.utils.BeautifierBase

HTML Beautifier. Powered by BeautifulSoup 4

classmethod beautify(html, indent=2, encoding=None, formatter=u'html5')¶

Pretty print html with indentation of indent per level

Parameters:
html – html as string
indent – width of indentation
encoding – encoding of html
formatter – formatter to be used by bs4. Use lxml if you want HTML4 output
Returns:	beautified html

>>> # pretty print HTML
>>> from html5print import HTMLBeautifier
>>> html = '<title>Testing</title><body><p>Some Text</p>'
>>> print(HTMLBeautifier.beautify(html))
<html>
  <head>
    <title>
      Testing
    </title>
  </head>
  <body>
    <p>
      Some Text
    </p>
  </body>
</html>

>>> # pretty print HTML with embedded CSS and Javascript
>>> from html5print import HTMLBeautifier
>>> html = '''<html><head><title>Testing</title>
... <style>p { color:red; font-weight:nornal}
... h1{color:green;}
... </style>
... </head>
... <body><p>Some Text</p>
... <script>function myFunction()
... {document.getElementById("demo").innerHTML="changed.";
... }</script>
... </body></html>
... '''
>>> print(HTMLBeautifier.beautify(html))
<html>
  <head>
    <title>
      Testing
    </title>
    <style>
      p {
        color               : red;
        font-weight         : nornal
      }
      h1 {
        color               : green;
      }
    </style>
  </head>
  <body>
    <p>
      Some Text
    </p>
    <script>
      function myFunction() {
        document.getElementById("demo").innerHTML = "changed.";
      }
    </script>
  </body>
</html>

html5print.decodeText(text, encoding=None)¶

Decode text using encoding. If encoding is None, the encoding will be guessed.

Note: the encoding provided will be disregarded if it causes a decoding error.

Parameters:
text – string to be decoded
encoding – encoding scheme of text. Guessed by the system if None
Returns:	new decoded text as unicode

>>> import sys
>>> from html5print import decodeText
>>> s = 'Hello! 您好! こんにちは! halló!'
>>> output = decodeText(s)
>>> print(output)
Hello! 您好! こんにちは! halló!
>>> if sys.version_info[0] >= 3:
...    unicode = str
>>> isinstance(output, unicode)
True
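
The note about disregarding a failing encoding can be pictured as a try-and-fall-back loop. The sketch below targets Python 3 and swaps real encoding detection for a fixed candidate list, so it is a simplification rather than the module's algorithm. `decode_with_fallback` and its `candidates` parameter are hypothetical.

```python
def decode_with_fallback(data, encoding=None, candidates=("utf-8", "latin-1")):
    """Decode bytes, ignoring `encoding` if it fails (hypothetical sketch)."""
    if isinstance(data, str):
        return data  # already text, nothing to decode
    # Try the caller's encoding first, then the fixed candidates.
    tried = ([encoding] if encoding else []) + list(candidates)
    for codec in tried:
        try:
            return data.decode(codec)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong or unknown codec: try the next one
    # Last resort: keep going, replacing undecodable bytes.
    return data.decode("utf-8", errors="replace")


raw = "Hello! 您好!".encode("utf-8")
# The wrong hint "ascii" raises internally and is then disregarded.
print(decode_with_fallback(raw, encoding="ascii"))
```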

html5print.isUnicode(text)¶

Return True if text is unicode, False otherwise. Note that because the function has to work on both Python 2 and Python 3, u'' cannot be used in the doctest.

Parameters:	text – string to check if it is unicode
Returns:	True if text is unicode, False otherwise

>>> import sys
>>> from html5print import isUnicode
>>> if sys.version_info[0] >= 3:
...     isUnicode(bytes('hello', 'ascii'))
... else:
...     isUnicode(bytes('hello'))
False

>>> import sys
>>> if sys.version_info[0] >= 3:
...     unicode = str
>>> isUnicode(unicode('hello'))
True
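
A check of this kind, working on both Python 2 and Python 3, can be written by selecting the text type once at import time. This is a sketch of the general technique, not the module's source; `text_type` and `is_unicode` are illustrative names.

```python
import sys

# On Python 3 the text type is `str`; on Python 2 it is `unicode`.
# The name `unicode` is only evaluated when running on Python 2.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def is_unicode(text):
    """Return True for text strings, False for byte strings and others."""
    return isinstance(text, text_type)


print(is_unicode("hello"))
print(is_unicode(b"hello"))
```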
