mzid - mzIdentML file reader
Summary
mzIdentML is one of the standards developed by the Proteomics Informatics working group of the HUPO Proteomics Standards Initiative.
This module provides a minimal interface for extracting information from mzIdentML files. The main idea is the same as in pyteomics.pepxml: the top-level function read() iterates over <SpectrumIdentificationResult> elements, i.e. groups of identifications for a single spectrum. Note that each entry can contain more than one PSM (peptide-spectrum match). The PSMs are stored under the “SpectrumIdentificationItem” key.
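Because one entry may hold several PSMs, code that works per PSM usually needs a nested loop. The sketch below shows one way to flatten the entries. The sample dicts are hypothetical stand-ins for what read() yields: only the “SpectrumIdentificationItem” key comes from the description above, and the other keys ('spectrumID', 'rank', 'chargeState', 'passThreshold') are assumptions modelled on mzIdentML attribute names.

```python
def iter_psms(entries):
    """Yield (spectrum_id, psm) pairs, one per PSM, from per-spectrum entries."""
    for entry in entries:
        spectrum_id = entry.get('spectrumID')  # assumed key name
        # Each entry groups all identifications made for one spectrum.
        for psm in entry.get('SpectrumIdentificationItem', []):
            yield spectrum_id, psm


# Hypothetical entries for illustration. With pyteomics installed they would
# come from the reader instead, e.g.:
#     from pyteomics import mzid
#     entries = mzid.read('path/to/file.mzid')  # hypothetical path
sample_entries = [
    {'spectrumID': 'index=0',
     'SpectrumIdentificationItem': [
         {'rank': 1, 'chargeState': 2, 'passThreshold': True},
         {'rank': 2, 'chargeState': 2, 'passThreshold': False},
     ]},
    {'spectrumID': 'index=1',
     'SpectrumIdentificationItem': [
         {'rank': 1, 'chargeState': 3, 'passThreshold': True},
     ]},
]

# Keep only the best-ranked PSM of every spectrum.
top_hits = [(sid, psm) for sid, psm in iter_psms(sample_entries)
            if psm['rank'] == 1]
```

The generator keeps the spectrum identifier next to each PSM, so information from the enclosing group is not lost when the structure is flattened.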
Data access
read() - iterate through peptide-spectrum matches in an mzIdentML file. Data from a single PSM group are converted to a human-readable dict.
get_by_id() - get an element by its ID and extract the data from it.
version_info() - get information about mzIdentML version and schema.
iterfind() - iterate over elements in an mzIdentML file.
- pyteomics.mzid.version_info(source, *args, **kwargs)
Provide version information about the mzIdentML file.
- pyteomics.mzid.iterfind(source, *args, **kwargs)
Parse source and yield info on elements with the specified local name or matching the specified “XPath”. Only local names separated by slashes are accepted. An asterisk (*) matches any element. A single condition can be given at the end of the path, for example “/path/to/element[some_value>1.5]”. Note: much more powerful filtering is possible in plain Python. The path can be absolute or “free”. Do not specify namespaces.
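The idea of matching on local names and then filtering in plain Python can be sketched with the standard library alone. This is an illustration of the technique, not the pyteomics implementation: iter_by_local_name() is a hypothetical helper, it handles a single name rather than a slash-separated path, and the XML sample is a trimmed fragment that is not schema-valid.

```python
import io
import xml.etree.ElementTree as ET

# Trimmed, hypothetical mzIdentML-like fragment (not schema-valid).
SAMPLE = b"""<?xml version="1.0"?>
<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1" version="1.1.0">
  <DataCollection>
    <AnalysisData>
      <SpectrumIdentificationList id="SIL_1">
        <SpectrumIdentificationResult id="SIR_1" spectrumID="index=0">
          <SpectrumIdentificationItem id="SII_1_1" rank="1" calculatedMassToCharge="500.25"/>
          <SpectrumIdentificationItem id="SII_1_2" rank="2" calculatedMassToCharge="500.75"/>
        </SpectrumIdentificationResult>
        <SpectrumIdentificationResult id="SIR_2" spectrumID="index=1">
          <SpectrumIdentificationItem id="SII_2_1" rank="1" calculatedMassToCharge="612.30"/>
        </SpectrumIdentificationResult>
      </SpectrumIdentificationList>
    </AnalysisData>
  </DataCollection>
</MzIdentML>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit('}', 1)[-1]


def iter_by_local_name(source, name):
    """Yield attribute dicts of elements whose local name equals `name`.

    An asterisk matches any element. Namespaces are ignored on purpose.
    """
    for _, elem in ET.iterparse(source, events=('end',)):
        if name == '*' or local_name(elem.tag) == name:
            yield dict(elem.attrib)


items = list(iter_by_local_name(io.BytesIO(SAMPLE), 'SpectrumIdentificationItem'))

# Filtering in plain Python: any expression works, not only one condition.
high_mz_top = [a['id'] for a in items
               if a['rank'] == '1' and float(a['calculatedMassToCharge']) > 500.5]
```

Attribute values stay strings in this sketch, so numeric comparisons need an explicit conversion.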
- pyteomics.mzid.get_by_id(source, *args, **kwargs)
Parse source and return the element with id attribute equal to elem_id. Returns None if no such element is found.
Parameters : source : str or file
A path to a target mzIdentML file or the file object itself.
elem_id : str
The value of the id attribute to match.
Returns : out : lxml.etree.Element or None
- pyteomics.mzid.read(*args, **kwargs)
Parse source and iterate through peptide-spectrum matches.
Parameters : source : str or file
A path to a target mzIdentML file or the file object itself.
recursive : bool, optional
If False, subelements will not be processed when extracting info from elements. Default is True.
retrieve_refs : bool, optional
If True, additional information from references will be automatically added to the results. The file processing time will increase. Default is False.
Returns : out : iterator
An iterator over the dicts with PSM properties.
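To make the effect of retrieve_refs more concrete, the sketch below shows the general idea of resolving references: a PSM carries an attribute such as peptide_ref that points to the id of another element, and resolving it copies that element's data into the PSM dict. This is an assumption-laden illustration, not the library's code: resolve_refs() and the sample data are hypothetical, and the exact keys that pyteomics adds may differ. With pyteomics itself, the documented way to request this behaviour is to pass retrieve_refs=True to read().

```python
def resolve_refs(item, elements_by_id):
    """Return a copy of `item` with data from its '*_ref' targets merged in.

    `elements_by_id` maps an element id to a dict of that element's data.
    Unknown references are left untouched.
    """
    resolved = dict(item)
    for key, value in item.items():
        if key.endswith('_ref'):
            target = elements_by_id.get(value, {})
            for target_key, target_value in target.items():
                if target_key != 'id':
                    # Never overwrite data already present on the item.
                    resolved.setdefault(target_key, target_value)
    return resolved


# Hypothetical lookup table and PSM for illustration.
peptides = {'PEP_1': {'id': 'PEP_1', 'PeptideSequence': 'ELVISLIVESK'}}
psm = {'rank': 1, 'chargeState': 2, 'peptide_ref': 'PEP_1'}

resolved = resolve_refs(psm, peptides)
```

Each lookup adds work per PSM, which is consistent with the note above that processing time increases when references are retrieved.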