pepxml - pepXML file reader
Summary
pepXML was the first widely accepted format for the output of proteomics search engines. Although the community standard mzIdentML is meant to replace it, pepXML is still in common use.
This module provides a minimalistic infrastructure for accessing data stored in pepXML files. The most important function is read(), which reads peptide-spectrum matches and related information and stores them in human-readable dicts. The rest of the data can be obtained via the get_node() function, which relies on the terminology of the underlying lxml library.
Data access
read() - iterate through peptide-spectrum matches in a pepXML file. The data for each spectrum are converted to an easy-to-use dict.
roc_curve() - get a receiver operating characteristic (ROC) curve (minimum PeptideProphet probability in a sample vs. false discovery rate) from a PeptideProphet analysis.
version_info() - get version information about the pepXML file.
iterfind() - iterate over elements in a pepXML file.
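To make the idea behind read() more concrete, here is a minimal sketch that uses only the standard library to turn the spectrum queries of a small, hand-written pepXML-like fragment into plain dicts. The fragment, the helper names local_name() and iter_psms(), and the layout of the resulting dicts are assumptions made for illustration; the dicts produced by pyteomics.pepxml.read() are richer and may be organized differently.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily trimmed pepXML-like fragment (illustrative only).
SAMPLE = """<msms_pipeline_analysis xmlns="http://regis-web.systemsbiology.net/pepXML">
  <msms_run_summary base_name="example">
    <spectrum_query spectrum="example.00001.00001.2" assumed_charge="2">
      <search_result>
        <search_hit hit_rank="1" peptide="PEPTIDEK" protein="P1">
          <search_score name="expect" value="0.001"/>
        </search_hit>
      </search_result>
    </spectrum_query>
    <spectrum_query spectrum="example.00002.00002.3" assumed_charge="3">
      <search_result>
        <search_hit hit_rank="1" peptide="SAMPLER" protein="P2">
          <search_score name="expect" value="0.5"/>
        </search_hit>
      </search_result>
    </spectrum_query>
  </msms_run_summary>
</msms_pipeline_analysis>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit('}', 1)[-1]


def iter_psms(xml_text):
    """Yield one dict per spectrum_query element (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    for query in root.iter():
        if local_name(query.tag) != 'spectrum_query':
            continue
        psm = dict(query.attrib)
        psm['assumed_charge'] = int(psm['assumed_charge'])
        psm['search_hit'] = []
        for hit in query.iter():
            if local_name(hit.tag) != 'search_hit':
                continue
            hit_info = dict(hit.attrib)
            hit_info['hit_rank'] = int(hit_info['hit_rank'])
            # Collect the named scores of this hit into a small dict.
            hit_info['search_score'] = {
                score.get('name'): float(score.get('value'))
                for score in hit
                if local_name(score.tag) == 'search_score'
            }
            psm['search_hit'].append(hit_info)
        yield psm


for psm in iter_psms(SAMPLE):
    best = psm['search_hit'][0]
    print(psm['spectrum'], best['peptide'], best['search_score']['expect'])
```

The sketch loads the whole document into memory, which is acceptable for a toy fragment but not necessarily for large search results.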
- pyteomics.pepxml.version_info(source, *args, **kwargs)
Provide version information about the pepXML file.
- pyteomics.pepxml.iterfind(source, *args, **kwargs)
Parse source and yield information on elements with the specified local name or matching the specified “XPath”. Only local names separated by slashes are accepted. An asterisk (*) matches any element. You can specify a single condition at the end, such as “/path/to/element[some_value>1.5]”. Note: you can do much more powerful filtering using plain Python. The path can be absolute or “free”. Do not specify namespaces.
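The path rules above (local names only, slashes as separators, an asterisk as a wildcard, absolute vs. “free” paths) can be sketched with the standard library. This is not the implementation used by iterfind(); the helper iter_by_local_path() and the sample fragment are hypothetical, and the trailing condition is deliberately left out in favor of filtering in plain Python, as the note suggests.

```python
import xml.etree.ElementTree as ET

# Hand-written pepXML-like fragment (illustrative only).
SAMPLE = """<msms_pipeline_analysis xmlns="http://regis-web.systemsbiology.net/pepXML">
  <msms_run_summary>
    <spectrum_query spectrum="s1">
      <search_result>
        <search_hit peptide="PEPTIDEK" massdiff="0.2"/>
        <search_hit peptide="SAMPLER" massdiff="2.0"/>
      </search_result>
    </spectrum_query>
  </msms_run_summary>
</msms_pipeline_analysis>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit('}', 1)[-1]


def matches(elem, step):
    """True if the step is a wildcard or equals the element's local name."""
    return step == '*' or local_name(elem.tag) == step


def iter_by_local_path(root, path):
    """Return elements matching a slash-separated path of local names.

    A path starting with '/' is absolute (anchored at the root element);
    any other path is "free" and may start at any depth.
    """
    steps = [step for step in path.split('/') if step]
    if not steps:
        return []
    if path.startswith('/'):
        current = [root] if matches(root, steps[0]) else []
    else:
        current = [elem for elem in root.iter() if matches(elem, steps[0])]
    # Each further step descends exactly one level.
    for step in steps[1:]:
        current = [child for parent in current for child in parent
                   if matches(child, step)]
    return current


root = ET.fromstring(SAMPLE)

# "Free" path: may start anywhere in the tree.
hits = iter_by_local_path(root, 'search_result/search_hit')

# Filtering in plain Python instead of a condition inside the path.
large_diff = [hit.get('peptide') for hit in hits
              if float(hit.get('massdiff')) > 1.5]
print(large_diff)
```

Comparing local names rather than full tags is what lets the caller leave namespaces out of the path.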