Pyteomics documentation v2.1.5

mzid - mzIdentML file reader


Summary

mzIdentML is one of the standards developed by the Proteomics Informatics working group of the HUPO Proteomics Standards Initiative.

This module provides a minimalistic way to extract information from mzIdentML files. The main idea is the same as in pyteomics.pepxml: the top-level function read() iterates over the entries in <SpectrumIdentificationResult> elements, i.e. groups of identifications for a given spectrum. Note that each entry can contain more than one PSM (peptide-spectrum match); the PSMs are accessible under the “SpectrumIdentificationItem” key.
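A minimal sketch of unpacking the PSMs from such entries. The helper name iter_psms and the sample dicts are made up for illustration; only the “SpectrumIdentificationItem” key comes from the description above, and it is assumed here that the value stored under that key is a list of dicts.

```python
def iter_psms(entries):
    """Yield (entry, psm) pairs from an iterable of result dicts."""
    for entry in entries:
        # One entry may hold several PSMs; tolerate a missing key.
        for psm in entry.get("SpectrumIdentificationItem", []):
            yield entry, psm


# Hand-made stand-ins for the dicts that read() yields. Every key except
# "SpectrumIdentificationItem" is an illustrative assumption.
sample_entries = [
    {"spectrumID": "scan=1",
     "SpectrumIdentificationItem": [{"rank": 1}, {"rank": 2}]},
    {"spectrumID": "scan=2",
     "SpectrumIdentificationItem": [{"rank": 1}]},
]

pairs = list(iter_psms(sample_entries))
ranks = [psm["rank"] for _, psm in pairs]
```

With a real file, the same helper could presumably be applied to the iterator returned by pyteomics.mzid.read().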

Data access

read() - iterate through peptide-spectrum matches in an mzIdentML file. Data from a single PSM group are converted to a human-readable dict.

get_by_id() - get an element by its ID and extract the data from it.

version_info() - get information about mzIdentML version and schema.

iterfind() - iterate over elements in an mzIdentML file.


pyteomics.mzid.version_info(source, *args, **kwargs)

Provide version information about the mzIdentML file.

pyteomics.mzid.iterfind(source, *args, **kwargs)

Parse source and yield information on elements with the specified local name or matching the specified “XPath”. Only local names separated by slashes are accepted. An asterisk (*) matches any element. You can specify a single condition at the end, such as “/path/to/element[some_value>1.5]”. Note: much more powerful filtering is possible in plain Python. The path can be absolute or “free”. Do not specify namespaces.
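The remark about filtering in plain Python can be illustrated with a short sketch. It assumes that the yielded items behave like dicts; the helper filter_elements and the sample data are hypothetical, and some_value is simply the placeholder name from the path example above.

```python
def filter_elements(elements, predicate):
    """Lazily keep only the elements for which predicate(element) is true."""
    return (element for element in elements if predicate(element))


# Made-up stand-ins for the items an element iterator might yield.
sample_elements = [
    {"some_value": 0.7},
    {"some_value": 2.3},
    {"other": 5},
    {"some_value": 1.5},
]

# A compound condition that a single "[some_value>1.5]" clause cannot express.
selected = list(filter_elements(
    sample_elements,
    lambda e: 1.5 < e.get("some_value", float("-inf")) < 10,
))
```

Because the helper returns a generator, it can wrap another iterator without loading all elements into memory first.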

pyteomics.mzid.get_by_id(source, *args, **kwargs)

Parse source and return the element with id attribute equal to elem_id. Returns None if no such element is found.

Parameters :

source : str or file

A path to a target mzIdentML file or the file object itself.

elem_id : str

The value of the id attribute to match.

Returns :

out : lxml.etree.Element or None
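Since None is returned when nothing matches, callers may want to turn a missing ID into an explicit error. The sketch below is one possible way to do that; require_element and _fake_lookup are hypothetical names, and the lookup function is passed in as an argument so the example can be tried without an mzIdentML file.

```python
def require_element(lookup, source, elem_id):
    """Call lookup(source, elem_id) and raise if it returns None."""
    element = lookup(source, elem_id)
    # Compare with None explicitly: an lxml element without children
    # evaluates as false, so "if not element" would be misleading.
    if element is None:
        raise LookupError("no element with id %r in %r" % (elem_id, source))
    return element


def _fake_lookup(source, elem_id):
    """Stand-in for an ID lookup; knows a single made-up ID."""
    return {"PEP_1": "<Peptide id='PEP_1'/>"}.get(elem_id)


found = require_element(_fake_lookup, "example.mzid", "PEP_1")

try:
    require_element(_fake_lookup, "example.mzid", "PEP_404")
    missing_raised = False
except LookupError:
    missing_raised = True
```

In real use, pyteomics.mzid.get_by_id would be passed as the lookup argument in place of _fake_lookup.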

pyteomics.mzid.read(*args, **kwargs)

Parse source and iterate through peptide-spectrum matches.

Parameters :

source : str or file

A path to a target mzIdentML file or the file object itself.

recursive : bool, optional

If False, subelements will not be processed when extracting info from elements. Default is True.

retrieve_refs : bool, optional

If True, additional information from references will be automatically added to the results. The file processing time will increase. Default is False.

Returns :

out : iterator

An iterator over the dicts with PSM properties.
