Formatting bibliographies
The main purpose of Pybtex is turning machine-readable bibliography data into human-readable bibliographies formatted in a specific style. Pybtex reads bibliography data that looks like this:
@book{graham1989concrete,
    title = "Concrete mathematics: a foundation for computer science",
    author = "Graham, Ronald Lewis and Knuth, Donald Ervin and Patashnik, Oren",
    year = "1989",
    publisher = "Addison-Wesley"
}
and formats it like this:
R. L. Graham, D. E. Knuth, and O. Patashnik. Concrete mathematics: a foundation for computer science. Addison-Wesley, 1989.
Pybtex contains two different formatting engines:
- The BibTeX engine uses BibTeX .bst styles.
- The Python engine uses styles written in Python.
BibTeX engine
The BibTeX engine is fully compatible with BibTeX style files and is used by default.
How it works
When you type pybtex mydocument, the following things happen:
Pybtex reads the file mydocument.aux in the current directory. This file is normally created by LaTeX and contains all sorts of auxiliary information collected during processing of the LaTeX document. Pybtex is interested in these three pieces of information:
- Bibliography style: First, Pybtex searches the .aux file for a \bibstyle command that specifies which formatting style will be used. For example, \bibstyle{unsrt} instructs Pybtex to use the formatting style defined in the file unsrt.bst.
- Bibliography data: Next, Pybtex expects to find at least one \bibdata command in the .aux file that tells it where to look for the bibliography data. For example, \bibdata{mydocument} means "use the bibliography data from mydocument.bib".
- Citations: Finally, Pybtex needs to know which entries to put into the resulting bibliography. Pybtex gets the list of citation keys from \citation commands in the .aux file. For example, \citation{graham1989concrete} means "include the entry with the key graham1989concrete in the resulting bibliography". A wildcard citation \citation{*} tells Pybtex to format the bibliography for all entries from all data files specified by all \bibdata commands.
Pybtex executes the style program in the .bst file specified by the \bibstyle command in the .aux file. As a result, a .bbl file containing the resulting formatted bibliography is created.
A .bst style file is a program in a domain-specific stack-based language. A typical piece of .bst code looks like this:
FUNCTION {format.bvolume}
{ volume empty$
    { "" }
    { "volume" volume tie.or.space.connect
      series empty$
        'skip$
        { " of " * series emphasize * }
      if$
      "volume and number" number either.or.check
    }
  if$
}
The code in a .bst file contains the complete step-by-step instructions on how to create the formatted bibliography from the given bibliography data and citation keys. For example, a READ command tells Pybtex to read the bibliography data from all files specified by \bibdata commands in the .aux file, an ITERATE command tells Pybtex to execute a piece of code for each citation key specified by \citation commands, and so on. The built-in write$ function tells Pybtex to write the given string into the resulting .bbl file. Pybtex implements all these commands and built-in functions and simply executes the .bst program step by step.
A complete reference of the .bst language can be found in the BibTeX hacking guide by Oren Patashnik. It is available by running texdoc btxhak in most TeX distributions.
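The first step above, extracting the style name, the data files and the citation keys from the .aux file, can be illustrated with a small standalone sketch. This is not Pybtex's own parser: the function name parse_aux and the regular expression are assumptions made for illustration, and the sketch ignores details a real .aux reader has to handle (nested .aux files, for instance).

```python
import re

# Matches \bibstyle{...}, \bibdata{...} and \citation{...} commands.
# Hypothetical helper pattern, not taken from Pybtex.
AUX_COMMAND = re.compile(r"\\(bibstyle|bibdata|citation)\{([^}]*)\}")


def parse_aux(aux_text):
    """Collect the style name, data file names and citation keys."""
    style = None
    data_files = []
    citations = []
    for command, argument in AUX_COMMAND.findall(aux_text):
        # Arguments may be comma-separated lists, e.g. \bibdata{a,b}.
        values = [v.strip() for v in argument.split(",") if v.strip()]
        if command == "bibstyle":
            style = argument.strip()
        elif command == "bibdata":
            # \bibdata{mydocument} refers to mydocument.bib.
            data_files.extend(value + ".bib" for value in values)
        else:
            # Keep each citation key once, in order of first appearance.
            for key in values:
                if key not in citations:
                    citations.append(key)
    return style, data_files, citations


aux_text = r"""
\relax
\citation{graham1989concrete}
\bibstyle{unsrt}
\bibdata{mydocument}
"""

style, data_files, citations = parse_aux(aux_text)
print(style, data_files, citations)
```

A wildcard citation simply shows up as the key * in the returned list; deciding that it means "all entries" is left to whatever consumes the result.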
Python engine
The Python engine is enabled by running pybtex with the -l python option.
Differences from the BibTeX engine
- Formatting styles are written in Python instead of the .bst language.
- Formatting styles are not tied to LaTeX and do not use hardcoded LaTeX markup. Instead, they produce format-agnostic pybtex.richtext.Text objects that can be converted to any markup format (LaTeX, Markdown, HTML, etc.).
- Name formatting, label formatting, and sorting styles are defined separately from the main style.
How it works
When you type pybtex -l python mydocument, the following things happen:
Pybtex reads the file mydocument.aux in the current directory and extracts the name of the bibliography style, the list of bibliography data files and the list of citation keys. This step is exactly the same as with the BibTeX engine.
Pybtex reads the bibliography data from all data files specified in the .aux file into a single BibliographyData object.
Then the formatting style is loaded. The formatting style is a Python class with a format_bibliography() method. Pybtex passes the bibliography data (a BibliographyData object) and the list of citation keys to format_bibliography().
The formatting style formats each of the requested bibliography entries in a style-specific way.
When it comes to formatting names, a name formatting style is loaded and used. A name formatting style is also a Python class with a specific interface. Similarly, a label formatting style is used to format entry labels, and a sorting style is used to sort the resulting bibliography. Each formatting style has a default name style, a default label style and a default sorting style. The defaults can be overridden with options passed to the main style class.
Each formatted entry is put into a FormattedEntry object, which is just a container for the formatted label, the formatted entry text (a pybtex.richtext.Text object) and the entry key. The label, the key and the main text are stored separately to give the output backend more flexibility when converting the FormattedEntry object to the actual markup. For example, the HTML backend may want to format the bibliography as a definition list, the LaTeX backend would use \bibitem[label]{key} text constructs, etc.
Formatted entries are put into a FormattedBibliography object, which simply contains a list of FormattedEntry objects and some additional metadata.
The resulting FormattedBibliography is passed to the output backend. The default backend is LaTeX. It can be changed with the pybtex --output-backend option. The output backend converts the formatted bibliography to the specific markup format and writes it to the output file.
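To see why keeping the label, the key and the text apart is useful, consider the following sketch. It does not use Pybtex classes: Entry, render_latex and render_html are hypothetical stand-ins, and the entry text is a plain string here rather than a rich text object.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Entry:
    """Hypothetical stand-in for a formatted entry (not a Pybtex class)."""
    label: str
    key: str
    text: str


def render_latex(entries):
    # A LaTeX-like backend emits one \bibitem[label]{key} per entry.
    return "\n".join(
        "\\bibitem[%s]{%s} %s" % (e.label, e.key, e.text) for e in entries
    )


def render_html(entries):
    # An HTML-like backend may prefer a definition list keyed by label.
    items = "".join(
        "<dt>%s</dt><dd>%s</dd>" % (escape(e.label), escape(e.text))
        for e in entries
    )
    return "<dl>" + items + "</dl>"


entries = [
    Entry(
        label="1",
        key="graham1989concrete",
        text="R. L. Graham, D. E. Knuth, and O. Patashnik. "
             "Concrete mathematics. Addison-Wesley, 1989.",
    )
]

print(render_latex(entries))
print(render_html(entries))
```

Both renderers consume the same list of entries; only the backend decides how the three fields are combined, which is the flexibility the separate storage is meant to provide.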
Python API
The base interface
Both the Python engine and the BibTeX engine use the same interface
defined in pybtex.Engine.
pybtex.Engine has a handful of methods, but most of them are just
convenience wrappers for Engine.format_from_files(), which does the
actual work.
class pybtex.Engine

make_bibliography(aux_filename, style=None, output_encoding=None, bib_format=None, **kwargs)
    Read the given .aux file and produce a formatted bibliography using format_from_files().
    Parameters: style – If not None, use this style instead of the one specified in the .aux file.

format_from_string(bib_string, *args, **kwargs)
    Parse the bibliography data from the given string and produce a formatted bibliography using format_from_files().
    This is a convenience method that calls format_from_strings() with a single string.

format_from_strings(bib_strings, *args, **kwargs)
    Parse the bibliography data from the given strings and produce a formatted bibliography.
    This is a convenience method that wraps each string into a StringIO, then calls format_from_files().

format_from_file(filename, *args, **kwargs)
    Read the bibliography data from the given file and produce a formatted bibliography.
    This is a convenience method that calls format_from_files() with a single file. All extra arguments are passed to format_from_files().

format_from_files(*args, **kwargs)
    Read the bibliography data from the given files and produce a formatted bibliography.
    This is an abstract method overridden by both pybtex.PybtexEngine and pybtex.bibtex.BibTeXEngine.
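The wrapper chain described above (a single string is wrapped into a list, each string into a StringIO, and everything ends up in format_from_files()) can be sketched as follows. SketchEngine and EchoEngine are hypothetical classes written for this illustration; they only mimic the delegation pattern and do not format anything.

```python
from io import StringIO


class SketchEngine:
    """Hypothetical base class mirroring the delegation described above."""

    def format_from_string(self, bib_string, *args, **kwargs):
        # A single string is just a one-element list of strings.
        return self.format_from_strings([bib_string], *args, **kwargs)

    def format_from_strings(self, bib_strings, *args, **kwargs):
        # Each string becomes a file-like object.
        files = [StringIO(bib_string) for bib_string in bib_strings]
        return self.format_from_files(files, *args, **kwargs)

    def format_from_files(self, *args, **kwargs):
        # Concrete engines are expected to override this method.
        raise NotImplementedError


class EchoEngine(SketchEngine):
    """Toy engine: reports what it received instead of formatting it."""

    def format_from_files(self, bib_files, style, **kwargs):
        texts = [bib_file.read() for bib_file in bib_files]
        # Count "@" characters as a crude proxy for the number of entries.
        entries = sum(text.count("@") for text in texts)
        return "entries=%d style=%s" % (entries, style)


result = EchoEngine().format_from_string("@book{a, title = {T}}", "unsrt")
print(result)
```

Because every convenience method funnels into format_from_files(), a concrete engine only has to override that one method to support all the entry points.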
The BibTeXEngine class
The BibTeX engine lives in the pybtex.bibtex module.
The public interface consists of the BibTeXEngine class and a
couple of convenience functions.
class pybtex.bibtex.BibTeXEngine
    The BibTeX formatting engine.
    See pybtex.Engine for inherited methods.

format_from_files(bib_files_or_filenames, style, citations=['*'], bib_format=None, bib_encoding=None, output_encoding=None, bst_encoding=None, min_crossrefs=2, output_filename=None, add_output_suffix=False, **kwargs)
    Read the bibliography data from the given files and produce a formatted bibliography.
    Parameters:
    - bib_files_or_filenames – A list of file names or file objects.
    - style – The name of the formatting style.
    - citations – A list of citation keys.
    - bib_format – The name of the bibliography format. The default format is bibtex.
    - bib_encoding – Encoding of bibliography files.
    - output_encoding – Encoding that will be used by the output backend.
    - bst_encoding – Encoding of the .bst file.
    - min_crossrefs – Include cross-referenced entries after this many crossrefs. See BibTeX manual for details.
    - output_filename – If None, the result will be returned as a string. Otherwise, the result will be written to the specified file.
    - add_output_suffix – Append a .bbl suffix to the output file name.

pybtex.bibtex.make_bibliography(*args, **kwargs)
    A convenience function that calls BibTeXEngine.make_bibliography().

pybtex.bibtex.format_from_string(*args, **kwargs)
    A convenience function that calls BibTeXEngine.format_from_string().

pybtex.bibtex.format_from_strings(*args, **kwargs)
    A convenience function that calls BibTeXEngine.format_from_strings().

pybtex.bibtex.format_from_file(*args, **kwargs)
    A convenience function that calls BibTeXEngine.format_from_file().

pybtex.bibtex.format_from_files(*args, **kwargs)
    A convenience function that calls BibTeXEngine.format_from_files().
The PybtexEngine class
The Python engine resides in the pybtex module
and uses an interface similar to the BibTeX engine.
There is the PybtexEngine class and some convenience functions.
class pybtex.PybtexEngine
    The Python formatting engine.
    See pybtex.Engine for inherited methods.

format_from_files(bib_files_or_filenames, style, citations=['*'], bib_format=None, bib_encoding=None, output_backend=None, output_encoding=None, min_crossrefs=2, output_filename=None, add_output_suffix=False, **kwargs)
    Read the bibliography data from the given files and produce a formatted bibliography.
    Parameters:
    - bib_files_or_filenames – A list of file names or file objects.
    - style – The name of the formatting style.
    - citations – A list of citation keys.
    - bib_format – The name of the bibliography format. The default format is bibtex.
    - bib_encoding – Encoding of bibliography files.
    - output_backend – Which output backend to use. The default is latex.
    - output_encoding – Encoding that will be used by the output backend.
    - bst_encoding – Encoding of the .bst file.
    - min_crossrefs – Include cross-referenced entries after this many crossrefs. See BibTeX manual for details.
    - output_filename – If None, the result will be returned as a string. Otherwise, the result will be written to the specified file.
    - add_output_suffix – Append the default suffix to the output file name (.bbl for LaTeX, .html for HTML, etc.).

pybtex.make_bibliography(*args, **kwargs)
    A convenience function that calls PybtexEngine.make_bibliography().

pybtex.format_from_string(*args, **kwargs)
    A convenience function that calls PybtexEngine.format_from_string().

pybtex.format_from_strings(*args, **kwargs)
    A convenience function that calls PybtexEngine.format_from_strings().

pybtex.format_from_file(*args, **kwargs)
    A convenience function that calls PybtexEngine.format_from_file().

pybtex.format_from_files(*args, **kwargs)
    A convenience function that calls PybtexEngine.format_from_files().