Namespace Handling
Universal Feed Parser attempts to expose all possible data in feeds, including elements in extension namespaces.
Some common namespaced elements are mapped to core elements. For further information about these mappings, see Reference.
Other namespaced elements are available as prefix_element.
The namespaces defined in the feed are available in the parsed results as
namespaces, a dictionary of {prefix: namespaceURI}. If the feed defines a
default namespace, it is listed as namespaces[''].
Accessing namespaced elements
>>> import feedparser
>>> d = feedparser.parse('http://feedparser.org/docs/examples/prism.rdf')
>>> d.feed.prism_issn
u'0028-0836'
>>> d.namespaces
{'': u'http://purl.org/rss/1.0/',
'prism': u'http://prismstandard.org/namespaces/1.2/basic/',
'rdf': u'http://www.w3.org/1999/02/22-rdf-syntax-ns#'}
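The {prefix: namespaceURI} shape of this dictionary can be reproduced with the standard library alone. The following is a minimal sketch, not Universal Feed Parser's own implementation: the sample document and the collect_namespaces helper are hypothetical, and the declarations are gathered with xml.etree.ElementTree rather than feedparser.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample feed, loosely modelled on the PRISM example above.
SAMPLE = """<rdf:RDF xmlns="http://purl.org/rss/1.0/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/">
  <channel rdf:about="http://example.org/">
    <title>Example</title>
    <prism:issn>0028-0836</prism:issn>
  </channel>
</rdf:RDF>
"""


def collect_namespaces(xml_text):
    """Return {prefix: namespaceURI}; the default namespace is keyed by ''."""
    namespaces = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "start-ns" events yield a (prefix, uri) pair for each xmlns declaration.
    for _event, (prefix, uri) in ET.iterparse(source, events=["start-ns"]):
        namespaces[prefix] = uri
    return namespaces


namespaces = collect_namespaces(SAMPLE)
print(namespaces[""])  # the default namespace, as in namespaces[''] above
```

Unlike Universal Feed Parser, this sketch records each prefix exactly as the feed declares it and does no remapping of prefixes.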
The prefix used to construct the variable name is not guaranteed to be the same
as the prefix of the namespaced element in the original feed. If
Universal Feed Parser recognizes the namespace, it will use the
namespace’s preferred prefix to construct the variable name. It will also list
the namespace in the namespaces dictionary using the namespace’s preferred
prefix.
In the previous example, the namespace
(http://prismstandard.org/namespaces/1.2/basic/) was defined with the
namespace’s preferred prefix (prism), so the prism:issn element was accessible
as the variable d.feed.prism_issn. However, if the namespace is defined
with a non-standard prefix, Universal Feed Parser will still
construct the variable name using the preferred prefix, not the actual prefix
that is used in the feed.
This will become clear with an example.
Accessing namespaced elements with non-standard prefixes
>>> import feedparser
>>> d = feedparser.parse('http://feedparser.org/docs/examples/nonstandard_prefix.rdf')
>>> d.feed.prism_issn
u'0028-0836'
>>> d.feed.foo_issn
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "feedparser.py", line 158, in __getattr__
    raise AttributeError, "object has no attribute '%s'" % key
AttributeError: object has no attribute 'foo_issn'
>>> d.namespaces
{'': u'http://purl.org/rss/1.0/',
'prism': u'http://prismstandard.org/namespaces/1.2/basic/',
'rdf': u'http://www.w3.org/1999/02/22-rdf-syntax-ns#'}
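The prefix substitution shown above can be pictured as a lookup from namespace URI to preferred prefix. The following is a rough sketch under stated assumptions: PREFERRED_PREFIXES is a hypothetical two-entry table and variable_name is a hypothetical helper, neither taken from Universal Feed Parser's source.

```python
# Hypothetical subset of a namespace-URI-to-preferred-prefix table.
PREFERRED_PREFIXES = {
    "http://prismstandard.org/namespaces/1.2/basic/": "prism",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#": "rdf",
}


def variable_name(feed_prefix, element, declared):
    """Build the variable name for a namespaced element.

    declared maps the prefixes actually used in the feed to namespace URIs.
    """
    uri = declared[feed_prefix]
    # Recognized namespace: use its preferred prefix.
    # Unrecognized namespace: fall back to the prefix used in the feed.
    prefix = PREFERRED_PREFIXES.get(uri, feed_prefix)
    return f"{prefix}_{element}"


# The feed binds the PRISM namespace to the non-standard prefix "foo".
declared = {"foo": "http://prismstandard.org/namespaces/1.2/basic/"}
print(variable_name("foo", "issn", declared))  # prints prism_issn
```

Keying the table by URI rather than by prefix is what makes the feed's own choice of prefix irrelevant for recognized namespaces.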
This is the complete list of namespaces that Universal Feed Parser recognizes and uses to construct the variable names for data in these namespaces:
Note
Universal Feed Parser treats namespaces as case-insensitive to match the behavior of certain versions of iTunes.
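As an illustration of what case-insensitive matching might look like, here is a minimal sketch. It assumes the comparison is made on the namespace URI; same_namespace is a hypothetical helper, not part of Universal Feed Parser's API.

```python
def same_namespace(uri_a, uri_b):
    """Compare two namespace URIs without regard to letter case."""
    return uri_a.casefold() == uri_b.casefold()


# Two spellings of the iTunes namespace URI that differ only in case.
print(same_namespace(
    "http://www.itunes.com/DTDs/PodCast-1.0.dtd",
    "http://www.itunes.com/dtds/podcast-1.0.dtd",
))  # prints True
```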
Warning
Data from namespaced elements is not sanitized (even if it contains HTML markup).
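Because this data is passed through unsanitized, a caller that renders it in a web page may want to escape it first. The following is a minimal sketch using the standard library's html.escape; the safe_for_html helper and the sample value are hypothetical, and escaping is only one possible approach (it displays markup as text rather than filtering it).

```python
import html


def safe_for_html(value):
    """Escape &, <, > and quotes so the value is displayed as plain text."""
    return html.escape(value, quote=True)


# Hypothetical value taken from a namespaced element.
raw = '<script>alert("x")</script>'
print(safe_for_html(raw))
# prints &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```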