Pafy Documentation¶

This is the documentation for pafy, a Python library to download YouTube content and retrieve metadata.

A quick start intro with usage examples is available in the README.

Development / Source code / Bug reporting: github.com/np1/pafy

API Keys¶

Specifying an API key is optional, as pafy includes one. However, it is preferred that software calling pafy provides its own API key, as the default may be removed in the future.

Information from Google about obtaining an API key

pafy.set_api_key(key)¶

“Sets the API key for pafy to use.”

Parameters:	key – API key to use
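One way to supply your own key is to read it from the environment at start-up. The following is a minimal sketch, not part of pafy: the helper name configure_api_key and the environment variable YOUTUBE_API_KEY are hypothetical, and only pafy.set_api_key() is taken from this documentation.

```python
import os


def configure_api_key(env_var="YOUTUBE_API_KEY"):
    """Pass an API key from the environment to pafy, if one is set.

    Returns True when a key was found and applied, False otherwise.
    The variable name is a hypothetical convention, not part of pafy.
    """
    key = os.environ.get(env_var)
    if not key:
        # No key supplied: pafy keeps using its bundled default key.
        return False
    import pafy  # imported lazily so the helper loads without pafy installed
    pafy.set_api_key(key)
    return True
```

Calling configure_api_key() once, before the first pafy.new(), keeps the key out of the source code.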

Pafy Objects and Stream Objects¶

Pafy objects relate to videos hosted on YouTube. They hold metadata such as title, viewcount, author and video ID.

Stream objects relate to individual streams of a YouTube video. They hold stream-specific data such as resolution, bitrate and url. Each Pafy object contains multiple stream objects.

Pafy Objects¶

Create a Pafy object using the pafy.new() function, giving a YouTube video URL as the argument.

pafy.new(video_url[, basic=True][, gdata=False][, signature=True][, size=False][, callback=None])¶

Creates a new Pafy object. All optional arguments (apart from callback) are used to specify which data items are fetched on initialisation.

Parameters:

url (str) – The YouTube url or 11 character video id of the video
basic (bool) – fetch basic metadata and streams
gdata (bool) – fetch gdata info (upload date, description, category, username, likes, dislikes)
signature (bool) – Note: The signature argument now has no effect and will be removed in a future version
size (bool) – fetch the size of each stream (slow)(decrypts urls if needed)
callback (function) – a callback function to receive status strings

Return type:

pafy.Pafy

If any of basic, gdata or size are False, those data items will be fetched only when first called for.

The defaults are recommended for most cases. If you wish to create many video objects at once, you may want to set all to False, eg:

vid = pafy.new("dQw4w9WgXcQ", basic=False)

This will be quick because no http requests will be made on initialisation.

Setting size to True will override the basic argument and force basic data to be fetched too (basic data is required to obtain Stream objects)
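For bulk creation, the deferred fetching described above might be wrapped as follows. This is a sketch under stated assumptions: make_lazy_videos and its factory parameter are hypothetical names introduced for illustration, and pafy.new is only looked up when no factory is passed in.

```python
def make_lazy_videos(video_ids, factory=None):
    """Create one video object per id without fetching anything up front."""
    if factory is None:
        import pafy  # deferred so the sketch can be exercised without pafy
        factory = pafy.new
    # With basic, gdata and size all False, data is fetched on first access.
    return [factory(vid, basic=False, gdata=False, size=False)
            for vid in video_ids]
```

Each object then triggers its own http requests only when an attribute such as title is first read.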

Example:

import pafy
myvid = pafy.new("http://www.youtube.com/watch?v=dQw4w9WgXcQ")

Pafy Attributes¶

Once you have created a Pafy object using pafy.new(), several data attributes are available

Pafy.author¶: The author of the video (str)

Pafy.bigthumb¶: The url of the video’s display image (not always available) (str)

Pafy.bigthumbhd¶: The url of the video’s larger display image (not always available) (str)

Pafy.category¶: The category of the video (str)

Pafy.description¶: The video description text (str)

Pafy.dislikes¶: The number of dislikes received for the video (int)

Pafy.duration¶: The duration of the stream (string formatted as HH:MM:SS)

Pafy.keywords¶: A list of the video’s keywords (not always available) ([str])

Pafy.length¶: The duration of the streams in seconds (int)

Pafy.likes¶: The number of likes received for the video (int)

Pafy.published¶: The upload date of the video (e.g., 2012-10-02 17:17:24) (str)

Pafy.mix¶: The mix playlist provided by youtube for this video (dict)

Pafy.rating¶: The rating of the video (0-5), (float)

Pafy.thumb¶: The url of the video’s thumbnail image (str)

Pafy.title¶: The title of the video (str)

Pafy.username¶: The username of the uploader (str)

Pafy.videoid¶: The 11-character video id (str)

Pafy.viewcount¶: The viewcount of the video (int)

An example of accessing this video metadata is shown below:

import pafy
v = pafy.new("dQw4w9WgXcQ")
print(v.title)
print(v.duration)
print(v.rating)
print(v.author)
print(v.length)
print(v.keywords)
print(v.thumb)
print(v.videoid)
print(v.viewcount)

Which will result in this output:

Rick Astley - Never Gonna Give You Up
00:03:33
4.75177729422
RickAstleyVEVO
213
['Rick', 'Astley', 'Sony', 'BMG', 'Music', 'UK', 'Pop']
https://i1.ytimg.com/vi/dQw4w9WgXcQ/default.jpg
dQw4w9WgXcQ
69788014
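The duration and length attributes describe the same value in two forms. As an illustration only (duration_to_seconds is a hypothetical helper, not a pafy function), the HH:MM:SS string can be converted to seconds like this:

```python
def duration_to_seconds(duration):
    """Convert an 'HH:MM:SS' string to a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


print(duration_to_seconds("00:03:33"))  # 213, the same value as v.length
```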

Pafy Methods¶

The Pafy.getbest(), Pafy.getbestaudio() and Pafy.getbestvideo() methods are a quick way to access the highest quality streams for a particular video without needing to query the stream lists.

Pafy.getbest([preftype="any"][, ftypestrict=True])¶

Selects the stream with the highest resolution. This will return a “normal” stream (ie. one with video and audio)

Parameters:

preftype (str) – Preferred type, set to mp4, webm, flv, 3gp or any
ftypestrict (boolean) – Set to False to return a type other than that specified in preftype if it has a higher resolution

Return type:

pafy.Stream

Pafy.getbestaudio([preftype="any"][, ftypestrict=True])¶

Selects the audio stream with the highest bitrate.

Parameters:

preftype (str) – Preferred type, set to ogg or m4a or any
ftypestrict (boolean) – Set to False to return a type other than that specified in preftype if that has the highest bitrate

Return type:

pafy.Stream

Pafy.getbestvideo([preftype="any"][, ftypestrict=True])¶

Selects the video-only stream with the highest resolution. This will return a “video” stream (ie. one with no audio)

Parameters:

preftype (str) – Preferred type, set to m4v, webm or any
ftypestrict (boolean) – Set to False to return a type other than that specified in preftype if it has a higher resolution

Return type:

pafy.Stream
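The interaction between preftype and ftypestrict can be illustrated with a small stand-alone sketch. It is not pafy's implementation: FakeStream and pick_best are hypothetical, resolution is compared by pixel count, and returning None when no stream qualifies is an assumption.

```python
from collections import namedtuple

# Stand-in for pafy.Stream with only the two attributes this sketch needs.
FakeStream = namedtuple("FakeStream", ["extension", "dimensions"])


def pick_best(streams, preftype="any", ftypestrict=True):
    """Return the highest-resolution stream, honouring the type preference."""
    candidates = list(streams)
    if preftype != "any" and ftypestrict:
        # Strict mode: only streams of the preferred type are eligible.
        candidates = [s for s in candidates if s.extension == preftype]
    if not candidates:
        return None  # assumption: nothing suitable yields None

    def sort_key(s):
        width, height = s.dimensions
        # Resolution first; the preferred type only breaks ties.
        return (width * height, s.extension == preftype)

    return max(candidates, key=sort_key)


streams = [
    FakeStream("webm", (640, 360)),
    FakeStream("mp4", (640, 360)),
    FakeStream("3gp", (320, 240)),
    FakeStream("webm", (1280, 720)),
]
print(pick_best(streams, preftype="mp4"))                     # the 640x360 mp4
print(pick_best(streams, preftype="mp4", ftypestrict=False))  # the 1280x720 webm
```

With ftypestrict left at True the preferred type wins even though a larger webm stream exists. With ftypestrict=False the larger stream is returned instead.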

Stream Lists¶

A Pafy object provides multiple stream lists. These are:

Pafy.streams¶: A list of regular streams (streams containing both audio and video)

Pafy.audiostreams¶: A list of audio-only streams; aac streams (.m4a) and ogg vorbis streams (.ogg) if available

Pafy.videostreams¶: A list of video-only streams (Note: these streams have no audio data)

Pafy.oggstreams¶: A list of ogg vorbis encoded audio streams (Note: may be empty for some videos)

Pafy.m4astreams¶: A list of aac encoded audio streams

Pafy.allstreams¶: A list of all available streams

An example of accessing stream lists:

>>> import pafy
>>> v = pafy.new("cyMHZVT91Dw")
>>> v.audiostreams
[audio:m4a@48k, audio:m4a@128k, audio:m4a@256k]
>>> v.streams
[normal:webm@640x360, normal:mp4@640x360, normal:flv@320x240, normal:3gp@320x240, normal:3gp@176x144]
>>> v.allstreams
[normal:webm@640x360, normal:mp4@640x360, normal:flv@320x240, normal:3gp@320x240, normal:3gp@176x144, video:m4v@854x480, video:m4v@640x360, video:m4v@426x240, video:m4v@256x144, audio:m4a@48k, audio:m4a@128k, audio:m4a@256k]
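The separate lists are effectively views of allstreams split by kind. A rough sketch of the same grouping, using the mediatype attribute described under Stream Attributes, is shown below. The stand-in objects and the group_by_mediatype helper are hypothetical and do not require pafy.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_mediatype(streams):
    """Map each mediatype ('normal', 'audio', 'video') to its streams."""
    groups = defaultdict(list)
    for stream in streams:
        groups[stream.mediatype].append(stream)
    return dict(groups)


# Stand-in objects; with pafy these would come from v.allstreams.
fake_streams = [
    SimpleNamespace(mediatype="normal", extension="mp4"),
    SimpleNamespace(mediatype="video", extension="m4v"),
    SimpleNamespace(mediatype="audio", extension="m4a"),
    SimpleNamespace(mediatype="audio", extension="m4a"),
]
groups = group_by_mediatype(fake_streams)
print({kind: len(items) for kind, items in groups.items()})
```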

Stream Objects¶

class pafy.Stream¶

After you have created a Pafy object using new(), you can then access the streams using one of the Stream Lists, or by calling Pafy.getbest() or Pafy.getbestaudio() on the object.

Stream Attributes¶

A Stream object can be used to access the following attributes

Stream.url¶: The direct access URL of the stream. This can be used to stream the media in mplayer or vlc, or for downloading with wget or curl. To download directly, use the Stream.download() method.

Stream.url_https¶: The direct access HTTPS URL of the stream.

Stream.bitrate¶: The bitrate of the stream as a string of the form “192k” if it is an audio stream, otherwise None.

Stream.dimensions¶: A 2-tuple (x, y) representing the resolution of a video stream.

Stream.extension¶: The format of the stream, will be one of: 'ogg', 'm4a', 'mp4', 'flv', 'webm', '3gp'

Stream.mediatype¶: A string attribute that is 'normal', 'audio' or 'video', depending on the content of the stream

Stream.quality¶: The resolution or the bitrate of the stream, depending on whether the stream is video or audio respectively

Stream.resolution¶

The resolution of a video as a string, eg: “820x640”. If the stream is 3D, “-3D” is appended, eg: “820x640-3D”.

For audio streams, this will be set to “0x0”
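If the numeric parts are needed, the string can be split as sketched below. parse_resolution is a hypothetical helper written for this illustration. For video streams, Stream.dimensions already provides the width and height as a tuple.

```python
def parse_resolution(resolution):
    """Split a resolution string into (width, height, is_3d)."""
    is_3d = resolution.endswith("-3D")
    if is_3d:
        # Drop the 3D marker before reading the numbers.
        resolution = resolution[:-len("-3D")]
    width, height = (int(n) for n in resolution.split("x"))
    return width, height, is_3d


print(parse_resolution("820x640"))     # (820, 640, False)
print(parse_resolution("820x640-3D"))  # (820, 640, True)
print(parse_resolution("0x0"))         # (0, 0, False), as for audio streams
```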

Stream.rawbitrate¶

The bitrate of an audio stream (int)

For video streams, this will be set to None

Stream.threed¶: True if the stream is a 3D video (boolean)

Stream.title¶: The title of the video, this will be the same as Pafy.title

Stream.notes¶: Any additional notes regarding the stream (eg, 6-channel surround) (str)

An example of accessing Stream attributes:

>>> import pafy
>>> v = pafy.new("cyMHZVT91Dw")
>>> v.audiostreams
[audio:m4a@48k, audio:m4a@128k, audio:m4a@256k]
>>> mystream = v.audiostreams[2]
>>> mystream.rawbitrate
255940
>>> mystream.bitrate
'256k'
>>> mystream.url
'http://r20---sn-aigllnes.c.youtube.com/videoplayback?ipbits=8&clen=1130...

Stream Methods¶

Stream.get_filesize()¶: Returns the filesize of a stream

Stream.download([filepath=""][, quiet=False][, callback=None][, meta=False][, remux_audio=False])¶

Downloads the stream object, returns the path of the downloaded file.

Parameters:

filepath (string) – The filepath to use to save the stream, defaults to (sanitised) title.extension if omitted
quiet (boolean) – If True, suppress output of the download progress
callback (function or None) – Call back function to use for receiving download progress
meta (bool) – If True, video id and itag are appended to filename
remux_audio (bool) – If True, remux audio file downloads (fixes some compatibility issues with file format, requires ffmpeg/avconv)

Return type:

str

If a callback function is provided, it will be called repeatedly for each chunk downloaded. It must be a function that takes the following five arguments:

total bytes in stream, int
total bytes downloaded, int
ratio downloaded (0-1), float
download rate (kbps), float
ETA in seconds, float
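Because the callback fires for every chunk, it can be useful to report only occasionally. The following sketch builds such a callback from the five arguments listed above. make_progress_callback, step and sink are hypothetical names chosen for illustration and are not part of pafy.

```python
def make_progress_callback(step=0.1, sink=print):
    """Build a download callback that reports once per `step` of progress."""
    state = {"next": step}

    def callback(total, recvd, ratio, rate, eta):
        if ratio >= state["next"] or recvd >= total:
            sink("%.0f%% of %d bytes, %.0f kbps, ETA %.0f s"
                 % (ratio * 100, total, rate, eta))
            # Advance the threshold past the current ratio.
            while state["next"] <= ratio:
                state["next"] += step

    return callback
```

It would be passed as callback=make_progress_callback() together with quiet=True, so that only these lines are printed.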

`Stream.download()` example¶

Example of using stream.download():

import pafy
v = pafy.new("cyMHZVT91Dw")
s = v.getbest()
print("Size is %s" % s.get_filesize())
filename = s.download()  # starts download

Will download to the current working directory and output the following progress statistics:

Size is 34775366
1,015,808 Bytes [2.92%] received. Rate: [ 640 kbps].  ETA: [51 secs]

Download using callback example:

import pafy

# callback function, this callback simply prints the bytes received,
# ratio downloaded and eta.
def mycb(total, recvd, ratio, rate, eta):
    print(recvd, ratio, eta)

p = pafy.new("cyMHZVT91Dw")
ba = p.getbestaudio()
filename = ba.download(quiet=True, callback=mycb)

The output of this will appear as follows, while the file is downloading:

(16384, 0.001449549245392125, 20.05230682669207)
(32768, 0.00289909849078425, 16.88200659636641)
(49152, 0.004348647736176375, 15.196503182407469)
(65536, 0.0057981969815685, 14.946467230009146)
(81920, 0.007247746226960625, 15.066431667096913)
(98304, 0.00869729547235275, 14.978577915171627)
(114688, 0.010146844717744874, 14.529802172976945)
(131072, 0.011596393963137, 14.31917945870373)
...

Playlist Retrieval¶

The pafy.get_playlist() function is initialised with similar arguments to pafy.new() and will return a dict containing metadata and Pafy objects as listed in the YouTube playlist.

pafy.get_playlist(playlist_url[, basic=False][, gdata=False][, signature=False][, size=False][, callback=None])¶

Parameters:

playlist_url (str) – The YouTube playlist url
basic (bool) – fetch basic metadata and streams
gdata (bool) – fetch gdata info (upload date, description, category, username, likes, dislikes)
signature (bool) – fetch data required to decrypt urls, if encrypted
size (bool) – fetch the size of each stream (slow)(decrypts urls if needed)
callback (function) – a callback function to receive status strings

Return type:

dict

The returned dict contains the following keys:

playlist_id: the id of the playlist

likes: the number of likes for the playlist

dislikes: the number of dislikes for the playlist

title: the title of the playlist

author: the author of the playlist

description: the description of the playlist

items: a list of dicts with each dict representing a video and containing the following keys:

pafy: The Pafy object for this video, initialised with the arguments given to pafy.get_playlist()

playlist_meta: a dict of various video-specific metadata fetched from the playlist data, including:

added, likes, dislikes, thumbnail, is_cc, is_hd, user_id, cc_license, privacy, category_id
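As an illustration of this structure, the sketch below walks the items list and collects the id and title of each video. The playlist_titles helper and the stand-in dict are hypothetical. A real dict would come from pafy.get_playlist().

```python
from types import SimpleNamespace


def playlist_titles(playlist):
    """Return (video id, title) pairs for every item in a playlist dict."""
    return [(item["pafy"].videoid, item["pafy"].title)
            for item in playlist["items"]]


# Stand-in shaped like the documented return value, with made-up entries.
fake_playlist = {
    "title": "Example playlist",
    "items": [
        {"pafy": SimpleNamespace(videoid="aaaaaaaaaaa", title="First"),
         "playlist_meta": {}},
        {"pafy": SimpleNamespace(videoid="bbbbbbbbbbb", title="Second"),
         "playlist_meta": {}},
    ],
}
print(playlist_titles(fake_playlist))
```

With a real playlist fetched using basic=False, reading title on each Pafy object triggers the deferred fetch for that video.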

`pafy.get_playlist()` example¶

>>> import pafy
>>> plurl = "https://www.youtube.com/playlist?list=PL634F2B56B8C346A2"
>>> playlist = pafy.get_playlist(plurl)
>>>
>>> playlist['title']
u'Rick Astley playlist'
>>>
>>> playlist['author']
u'Deborah Back'
>>>
>>> len(playlist['items'])
43
>>>
>>> playlist['items'][21]['pafy']
Title: Body and Soul - Rick astley
Author: jadiafa
ID: QtHnEJ8UArY
Duration: 00:04:11
Rating: 5.0
Views: 18855
Thumbnail: http://i1.ytimg.com/vi/QtHnEJ8UArY/default.jpg
Keywords: Rick, astely, body, and, soul, pop
>>>
>>> playlist['items'][21]['pafy'].audiostreams
[audio:m4a@128k]
>>>
>>> playlist['items'][21]['pafy'].getbest()
normal:webm@640x360
>>>
>>> playlist['items'][21]['pafy'].getbest().url
u'http://r4---sn-4g57knzr.googlevideo.com/videoplayback?ipbits=0&ratebypas...'

pafy.get_playlist2() serves the same purpose as pafy.get_playlist(), but uses version 3 of the YouTube API, which allows it to retrieve playlists of more than 200 items. It also provides a different interface, returning a pafy.Playlist object instead of a dict.

pafy.get_playlist2(playlist_url[, basic=False][, gdata=False][, signature=False][, size=False][, callback=None])¶

Parameters:

playlist_url (str) – The YouTube playlist url
basic (bool) – fetch basic metadata and streams
gdata (bool) – fetch gdata info (upload date, description, category, username, likes, dislikes)
signature (bool) – fetch data required to decrypt urls, if encrypted
size (bool) – fetch the size of each stream (slow)(decrypts urls if needed)
callback (function) – a callback function to receive status strings

Return type:

pafy.Playlist

Playlist Attributes¶

Once you have retrieved a playlist with pafy.get_playlist2() you can iterate over it to get the Pafy objects for the items in it, or use len(playlist) to get its length. In addition, you can access the following attributes:

Playlist.plid¶: The ID of the playlist (str)

Playlist.title¶: The title of the playlist (str)

Playlist.author¶: The author of the playlist (str)

Playlist.description¶: The description of the playlist (str)
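A short sketch of this interface is given below. It relies only on iteration, len() and the attributes listed above. FakePlaylist and summarise_playlist are hypothetical stand-ins so that the example runs without pafy or network access.

```python
from types import SimpleNamespace


class FakePlaylist:
    """Stand-in offering the interface described for pafy.Playlist."""

    def __init__(self, title, author, videos):
        self.title = title
        self.author = author
        self._videos = list(videos)

    def __len__(self):
        return len(self._videos)

    def __iter__(self):
        return iter(self._videos)


def summarise_playlist(playlist):
    """Return a header line and the list of video titles."""
    header = "%s by %s (%d items)" % (
        playlist.title, playlist.author, len(playlist))
    titles = [video.title for video in playlist]
    return header, titles


demo = FakePlaylist("Example playlist", "someone",
                    [SimpleNamespace(title="First"),
                     SimpleNamespace(title="Second")])
print(summarise_playlist(demo))
```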
