Introduction
Universal Feed Parser is a Python module for downloading and parsing syndicated feeds. It can handle RSS 0.90, Netscape RSS 0.91, Userland RSS 0.91, RSS 0.92, RSS 0.93, RSS 0.94, RSS 1.0, RSS 2.0, Atom 0.3, Atom 1.0, and CDF feeds. It also parses several popular extension modules, including Dublin Core and Apple’s iTunes extensions.
To use Universal Feed Parser, you will need Python 2.4 or later (Python 3 is supported). Universal Feed Parser is not meant to run standalone; it is a module for you to use as part of a larger Python program.
Universal Feed Parser is easy to use; the module is self-contained in a single file, feedparser.py, and it has one primary public function, parse. The parse function takes a number of arguments, but only one is required: a URL, a local filename, or a raw string containing feed data in any supported format.
Parsing a feed from a remote URL
>>> import feedparser
>>> d = feedparser.parse('http://feedparser.org/docs/examples/atom10.xml')
>>> d['feed']['title']
u'Sample Feed'
The following example assumes you are on Windows, and that you have saved a feed at c:\incoming\atom10.xml.
Note
Universal Feed Parser works on any platform that can run Python; use the path syntax appropriate for your platform.
Parsing a feed from a local file
>>> import feedparser
>>> d = feedparser.parse(r'c:\incoming\atom10.xml')
>>> d['feed']['title']
u'Sample Feed'
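As the note above says, the path syntax depends on your platform. The following is a minimal sketch of one way to avoid hard-coding a Windows path, by building the filename with the standard library. It assumes Python 3 and an installed feedparser package. The file name sample.xml and the helpers save_sample_feed and local_feed_title are hypothetical names used only for illustration; they are not part of Universal Feed Parser.

```python
import os
import tempfile

# A minimal RSS 2.0 document, used here only as sample data.
SAMPLE_FEED = """<rss version="2.0">
<channel>
<title>Sample Feed</title>
</channel>
</rss>"""


def save_sample_feed(directory, filename="sample.xml"):
    """Write SAMPLE_FEED into `directory` and return the full path.

    os.path.join uses the separator of the current platform, so the same
    code yields a backslash path on Windows and a slash path elsewhere.
    (Hypothetical helper, not part of feedparser.)
    """
    path = os.path.join(directory, filename)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(SAMPLE_FEED)
    return path


def local_feed_title(path):
    """Parse the feed stored at `path` and return its title.

    (Hypothetical helper.) feedparser is imported here rather than at the
    top so that save_sample_feed can be used without it installed.
    """
    import feedparser  # third-party package
    d = feedparser.parse(path)
    return d["feed"]["title"]
```

With feedparser installed, calling local_feed_title(save_sample_feed(tempfile.mkdtemp())) should return the title of the sample feed, whichever platform the code runs on.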
Universal Feed Parser can also parse a feed in memory.
Parsing a feed from a string
>>> import feedparser
>>> rawdata = """<rss version="2.0">
<channel>
<title>Sample Feed</title>
</channel>
</rss>"""
>>> d = feedparser.parse(rawdata)
>>> d['feed']['title']
u'Sample Feed'
Values are returned as Python Unicode strings (except when they’re not – see Character Encoding Detection for all the gory details).