Pafy Documentation

This is the documentation for Pafy, a Python library for retrieving content and metadata from YouTube.

A quick start intro with usage examples is available in the README.

Development / Source code / Bug reporting: github.com/np1/pafy

Homepage: np1.github.io/pafy

Pafy Objects and Stream Objects

Pafy objects relate to videos hosted on YouTube. They hold metadata such as title, viewcount, author and video ID.

Stream objects relate to individual streams of a YouTube video. They hold stream-specific data such as resolution, bitrate and url. Each Pafy object contains multiple stream objects.

Pafy Objects

Create a Pafy object using the pafy.new() function, giving a YouTube video URL as the argument.

pafy.new(video_url[, basic=True][, gdata=False][, signature=True][, size=False][, callback=None])

Creates a new Pafy object. All optional arguments (apart from callback) are used to specify which data items are fetched on initialisation.

Parameters:
  • url (str) – The YouTube url or 11 character video id of the video
  • basic (bool) – fetch basic metadata and streams
  • gdata (bool) – fetch gdata info (upload date, description, category, username, likes, dislikes)
  • signature (bool) – fetch data required to decrypt urls, if encrypted
  • size (bool) – fetch the size of each stream (slow)(decrypts urls if needed)
  • callback (function) – a callback function to receive status strings
Return type:

Pafy object

If any of basic, gdata, signature or size are False, those data items will be fetched only when first called for.

The defaults are recommended for most cases. If you wish to create many video objects at once, you may want to set all of them to False, e.g.:

vid = pafy.new("dQw4w9WgXcQ", basic=False, signature=False)

This will be quick because no HTTP requests will be made on initialisation.

Setting signature or size to True will override the basic argument and force basic data to be fetched too (basic data is required to obtain Stream objects and determine whether signatures are encrypted).

Similarly, setting size to True will force the signature data to be fetched if the videos have encrypted signatures, so will override the value set in the signature argument.
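The override rules above can be summarised in a few lines of Python. This is only a sketch of the behaviour as described here, not Pafy's own code: resolve_fetch_flags is a hypothetical helper, and its encrypted argument is an assumption made for illustration (in practice, whether signatures are encrypted is only known once basic data has been fetched).

```python
def resolve_fetch_flags(basic=True, gdata=False, signature=True,
                        size=False, encrypted=False):
    """Return the flags that effectively apply on initialisation.

    Hypothetical helper: `encrypted` stands in for information that is
    really only available after basic data has been fetched.
    """
    # size needs usable urls, so it forces signature data when the
    # video's signatures are encrypted.
    if size and encrypted:
        signature = True
    # Both signature and size need basic data.
    if signature or size:
        basic = True
    return {"basic": basic, "gdata": gdata,
            "signature": signature, "size": size}
```

With every flag set to False the result is unchanged, which corresponds to the case where no data is fetched until it is first called for.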

Example:

import pafy
myvid = pafy.new("http://www.youtube.com/watch?v=dQw4w9WgXcQ")
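Since pafy.new() accepts either a url or an 11 character video id, it can be handy to normalise your own input before storing it. The helper below is a minimal sketch and is not part of Pafy: extract_video_id is a hypothetical name, it only handles watch urls with a v query parameter, and the accepted id character set (letters, digits, "-" and "_") is an assumption.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed shape of a video id: 11 characters from A-Z, a-z, 0-9, "-" and "_".
_ID_PATTERN = re.compile(r"[0-9A-Za-z_-]{11}")


def extract_video_id(url_or_id):
    """Return the 11 character video id from a watch url or a bare id."""
    if _ID_PATTERN.fullmatch(url_or_id):
        return url_or_id
    # Look for the "v" parameter in the query string of a watch url.
    values = parse_qs(urlparse(url_or_id).query).get("v", [])
    if values and _ID_PATTERN.fullmatch(values[0]):
        return values[0]
    raise ValueError("no 11 character video id found in %r" % url_or_id)
```

A url whose v parameter is not exactly 11 characters long raises ValueError rather than returning a truncated id.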

Pafy Attributes

Once you have created a Pafy object using pafy.new(), several data attributes are available:

Pafy.author

The author of the video (str)

Pafy.bigthumb

The url of the video’s display image (not always available) (str)

Pafy.bigthumbhd

The url of the video’s larger display image (not always available) (str)

Pafy.category

The category of the video (str)

Pafy.description

The video description text (str)

Pafy.dislikes

The number of dislikes received for the video (int)

Pafy.duration

The duration of the stream (string formatted as HH:MM:SS)

Pafy.keywords

A list of the video’s keywords (not always available) ([str])

Pafy.length

The duration of the streams in seconds (int)

Pafy.likes

The number of likes received for the video (int)

Pafy.published

The upload date of the video (e.g., 2012-10-02 17:17:24) (str)

Pafy.rating

The rating of the video (0-5) (float)

Pafy.thumb

The url of the video’s thumbnail image (str)

Pafy.title

The title of the video (str)

Pafy.username

The username of the uploader (str)

Pafy.videoid

The 11-character video id (str)

Pafy.viewcount

The viewcount of the video (int)

An example of accessing this video metadata is shown below:

import pafy
v = pafy.new("dQw4w9WgXcQ")
print(v.title)
print(v.duration)
print(v.rating)
print(v.author)
print(v.length)
print(v.keywords)
print(v.thumb)
print(v.videoid)
print(v.viewcount)

Which will result in this output:

Rick Astley - Never Gonna Give You Up
00:03:33
4.75177729422
RickAstleyVEVO
213
['Rick', 'Astley', 'Sony', 'BMG', 'Music', 'UK', 'Pop']
https://i1.ytimg.com/vi/dQw4w9WgXcQ/default.jpg
dQw4w9WgXcQ
69788014
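In the output above, duration (00:03:33) and length (213) describe the same value in two forms. The conversion can be sketched as follows; format_duration is a hypothetical helper written for illustration and is not how Pafy necessarily computes the attribute.

```python
def format_duration(length):
    """Format a length in seconds as an HH:MM:SS string."""
    hours, remainder = divmod(int(length), 3600)
    minutes, seconds = divmod(remainder, 60)
    return "%02d:%02d:%02d" % (hours, minutes, seconds)


print(format_duration(213))  # prints 00:03:33
```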

Pafy Methods

The Pafy.getbest() and Pafy.getbestaudio() methods are a quick way to access the highest quality streams for a particular video without needing to query the stream lists.

Pafy.getbest([preftype="any"][, ftypestrict=True])

Selects the stream with the highest resolution. This will return a “normal” stream (i.e. one with video and audio).

Parameters:
  • preftype (str) – Preferred type, set to mp4, webm, flv, 3gp or any
  • ftypestrict (boolean) – Set to False to return a type other than that specified in preftype if it has a higher resolution
Return type:

pafy.Stream

Pafy.getbestaudio([preftype="any"][, ftypestrict=True])

Selects the audio stream with the highest bitrate.

Parameters:
  • preftype (str) – Preferred type, set to ogg or m4a or any
  • ftypestrict (boolean) – Set to False to return a type other than that specified in preftype if that has the highest bitrate
Return type:

pafy.Stream
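To make the effect of preftype and ftypestrict concrete, here is a rough sketch of the selection rule as described above. It is not Pafy's implementation: pick_best and FakeStream are hypothetical names, resolution is compared by pixel count, and returning None when nothing matches is a choice made for this sketch.

```python
from collections import namedtuple

# Stand-in for a stream object; only the attributes used below.
FakeStream = namedtuple("FakeStream", ["extension", "dimensions"])


def pick_best(streams, preftype="any", ftypestrict=True):
    """Pick the highest resolution stream, honouring the preferred type."""
    streams = list(streams)

    def area(stream):
        width, height = stream.dimensions
        return width * height

    if ftypestrict:
        # Strict: only streams of the preferred type are candidates.
        candidates = [s for s in streams
                      if preftype == "any" or s.extension == preftype]
        return max(candidates, key=area) if candidates else None
    if not streams:
        return None
    # Non-strict: resolution decides, the preferred type only breaks ties.
    return max(streams, key=lambda s: (area(s), s.extension == preftype))


streams = [FakeStream("webm", (640, 360)), FakeStream("mp4", (640, 360)),
           FakeStream("flv", (320, 240)), FakeStream("3gp", (176, 144))]
print(pick_best(streams, preftype="flv"))                     # the flv stream
print(pick_best(streams, preftype="flv", ftypestrict=False))  # a 640x360 stream
```

With ftypestrict=True the flv stream is chosen even though larger streams exist; with ftypestrict=False a 640x360 stream of another type is returned instead.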

Stream Lists

A Pafy object provides multiple stream lists. These are:

Pafy.streams

A list of regular streams (streams containing both audio and video)

Pafy.audiostreams

A list of audio-only streams (aac streams (.m4a) and ogg vorbis streams (.ogg))

Pafy.videostreams

A list of video-only streams (Note: these streams have no audio data)

Pafy.oggstreams

A list of ogg vorbis encoded audio streams

Pafy.m4astreams

A list of aac encoded audio streams

Pafy.allstreams

A list of all available streams

An example of accessing stream lists:

>>> import pafy
>>> v = pafy.new("cyMHZVT91Dw")
>>> v.audiostreams
[audio:m4a@48k, audio:m4a@128k, audio:m4a@256k]
>>> v.streams
[normal:webm@640x360, normal:mp4@640x360, normal:flv@320x240, normal:3gp@320x240, normal:3gp@176x144]
>>> v.allstreams
[normal:webm@640x360, normal:mp4@640x360, normal:flv@320x240, normal:3gp@320x240, normal:3gp@176x144, video:m4v@854x480, video:m4v@640x360, video:m4v@426x240, video:m4v@256x144, audio:m4a@48k, audio:m4a@128k, audio:m4a@256k]
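The separate lists are views of the same streams, split by content. As a sketch of that relationship, the hypothetical helper below groups any list of stream-like objects by their mediatype attribute ('normal', 'audio' or 'video'). FakeStream is a stand-in used so the example runs without network access; with a real Pafy object you would pass v.allstreams instead.

```python
from collections import defaultdict, namedtuple

# Stand-in for a stream object; only the attributes used below.
FakeStream = namedtuple("FakeStream", ["mediatype", "extension", "quality"])


def group_by_mediatype(streams):
    """Group stream-like objects into lists keyed by their mediatype."""
    groups = defaultdict(list)
    for stream in streams:
        groups[stream.mediatype].append(stream)
    return dict(groups)


sample = [FakeStream("normal", "mp4", "640x360"),
          FakeStream("video", "m4v", "854x480"),
          FakeStream("audio", "m4a", "128k"),
          FakeStream("audio", "m4a", "256k")]
for mediatype, members in group_by_mediatype(sample).items():
    print(mediatype, [s.quality for s in members])
```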

Stream Objects

class pafy.Stream

After you have created a Pafy object using new(), you can then access the streams using one of the Stream Lists, or by calling Pafy.getbest() or Pafy.getbestaudio() on the object.

Stream Attributes

A Stream object can be used to access the following attributes:

Stream.url

The direct access URL of the stream. This can be used to stream the media in mplayer or vlc, or for downloading with wget or curl. To download directly, use the Stream.download() method.

Stream.bitrate

The bitrate of the stream if it is an audio stream, otherwise None. This is a string of the form “192k”.

Stream.dimensions

A 2-tuple (x, y) representing the resolution of a video stream.

Stream.extension

The format of the stream, will be one of: 'ogg', 'm4a', 'mp4', 'flv', 'webm', '3gp'

Stream.mediatype

A string attribute that is 'normal', 'audio' or 'video', depending on the content of the stream

Stream.quality

The resolution or the bitrate of the stream, depending on whether the stream is video or audio respectively

Stream.resolution

The resolution of a video as a string, e.g. “820x640”. If the stream is 3D, “-3D” is appended, e.g. “820x640-3D”.

For audio streams, this will be set to “0x0”.

Stream.rawbitrate

The bitrate of an audio stream (int)

For video streams, this will be set to None.

Stream.threed

Whether the stream is a 3D video (boolean)

Stream.title

The title of the video, this will be the same as Pafy.title

Stream.notes

Any additional notes regarding the stream (e.g. 6-channel surround) (str)

An example of accessing Stream attributes:

>>> import pafy
>>> v = pafy.new("cyMHZVT91Dw")
>>> v.audiostreams
[audio:m4a@48k, audio:m4a@128k, audio:m4a@256k]
>>> mystream = v.audiostreams[2]
>>> mystream.rawbitrate
255940
>>> mystream.bitrate
'256k'
>>> mystream.url
'http://r20---sn-aigllnes.c.youtube.com/videoplayback?ipbits=8&clen=1130...
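The Stream.resolution string combines what Stream.dimensions and Stream.threed expose separately. If you only have the string form (for example, from saved output), it can be split back apart as sketched below. parse_resolution is a hypothetical helper, not part of Pafy, and it assumes the string follows the format described above.

```python
def parse_resolution(resolution):
    """Split a resolution string into (width, height, threed)."""
    threed = resolution.endswith("-3D")
    if threed:
        # Drop the 3D marker before reading the numbers.
        resolution = resolution[:-len("-3D")]
    width, height = (int(part) for part in resolution.split("x"))
    return width, height, threed


print(parse_resolution("820x640-3D"))  # prints (820, 640, True)
print(parse_resolution("0x0"))         # prints (0, 0, False)
```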

Stream Methods

Stream.get_filesize()

Returns the filesize of a stream

Stream.download([filepath=""][, quiet=False][, callback=None])

Downloads the stream object

Parameters:
  • filepath (string) – The filepath to use to save the stream, defaults to title.extension if omitted
  • quiet (boolean) – Whether to suppress output of the download progress
  • callback (function or None) – Callback function to use for receiving download progress

If a callback function is provided, it will be called repeatedly for each chunk downloaded. It must be a function that takes five arguments. These are:

  • total bytes in stream, int
  • total bytes downloaded, int
  • ratio downloaded (0-1), float
  • download rate (kbps), float
  • ETA in seconds, float

Stream.download() example

Example of using stream.download():

import pafy
v = pafy.new("cyMHZVT91Dw")
s = v.getbest()
print("Size is %s" % s.get_filesize())
s.download()

Will download the file to the current working directory with the filename title.extension (eg. “cute kittens.mp4”) and output the following progress statistics:

Size is 34775366
1,015,808 Bytes [2.92%] received. Rate: [ 640 kbps].  ETA: [51 secs]

Download using callback example:

import pafy

# callback function, this callback simply prints the bytes received,
# ratio downloaded and eta.
def mycb(total, recvd, ratio, rate, eta):
    print(recvd, ratio, eta)

p = pafy.new("cyMHZVT91Dw")
ba = p.getbestaudio()
ba.download(quiet=True, callback=mycb)

The output of this will appear as follows, while the file is downloading:

(16384, 0.001449549245392125, 20.05230682669207)
(32768, 0.00289909849078425, 16.88200659636641)
(49152, 0.004348647736176375, 15.196503182407469)
(65536, 0.0057981969815685, 14.946467230009146)
(81920, 0.007247746226960625, 15.066431667096913)
(98304, 0.00869729547235275, 14.978577915171627)
(114688, 0.010146844717744874, 14.529802172976945)
(131072, 0.011596393963137, 14.31917945870373)
...
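A callback can also format its five arguments into a readable status line. The sketch below imitates the layout of the default progress line shown earlier. The format string is an approximation written for this example, not Pafy's own formatting code, and format_progress and progress_callback are hypothetical names.

```python
def format_progress(total, recvd, ratio, rate, eta):
    """Build a one-line status string from the five callback arguments."""
    return ("{:,} Bytes [{:.2%}] received. Rate: [{:4.0f} kbps].  "
            "ETA: [{:.0f} secs]").format(recvd, ratio, rate, eta)


def progress_callback(total, recvd, ratio, rate, eta):
    """Callback with the five-argument signature described above."""
    print(format_progress(total, recvd, ratio, rate, eta))


# Call it by hand with sample values to see the output format.
progress_callback(34775366, 1015808, 1015808 / 34775366, 640.0, 51.0)
```

To use it during a download, pass it as the callback argument of Stream.download() together with quiet=True, as in the example above.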

Playlist Retrieval

The pafy.get_playlist() function is initialised with similar arguments to pafy.new() and will return a dict containing metadata and Pafy objects as listed in the YouTube playlist.

pafy.get_playlist(playlist_url[, basic=False][, gdata=False][, signature=False][, size=False][, callback=None])

Parameters:
  • playlist_url (str) – The YouTube playlist url
  • basic (bool) – fetch basic metadata and streams
  • gdata (bool) – fetch gdata info (upload date, description, category, username, likes, dislikes)
  • signature (bool) – fetch data required to decrypt urls, if encrypted
  • size (bool) – fetch the size of each stream (slow)(decrypts urls if needed)
  • callback (function) – a callback function to receive status strings
Return type:

dict

The returned dict contains the following keys:

playlist_id: the id of the playlist

likes: the number of likes for the playlist

dislikes: the number of dislikes for the playlist

title: the title of the playlist

author: the author of the playlist

description: the description of the playlist

items: a list of dicts with each dict representing a video and containing the following keys:

pafy: The Pafy object for this video, initialised with the arguments given to pafy.get_playlist()

playlist_meta: a dict of various video-specific metadata fetched from the playlist data, including:

added, likes, dislikes, thumbnail, is_cc, is_hd, user_id, cc_license, privacy, category_id

pafy.get_playlist() example

>>> import pafy
>>> plurl = "https://www.youtube.com/playlist?list=PL634F2B56B8C346A2"
>>> playlist = pafy.get_playlist(plurl)
>>>
>>> playlist['title']
u'Rick Astley playlist'
>>>
>>> playlist['author']
u'Deborah Back'
>>>
>>> len(playlist['items'])
43
>>>
>>> playlist['items'][21]['pafy']
Title: Body and Soul - Rick astley
Author: jadiafa
ID: QtHnEJ8UArY
Duration: 00:04:11
Rating: 5.0
Views: 18855
Thumbnail: http://i1.ytimg.com/vi/QtHnEJ8UArY/default.jpg
Keywords: Rick, astely, body, and, soul, pop
>>>
>>> playlist['items'][21]['pafy'].audiostreams
[audio:m4a@128k]
>>>
>>> playlist['items'][21]['pafy'].getbest()
normal:webm@640x360
>>>
>>> playlist['items'][21]['pafy'].getbest().url
u'http://r4---sn-4g57knzr.googlevideo.com/videoplayback?ipbits=0&ratebypas...'
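Because the return value is a plain dict, ordinary Python iteration is enough to walk a playlist. The sketch below collects the title of every video. playlist_titles is a hypothetical helper, and the sample dict is a hand-built stand-in with the keys documented above so that the example runs without network access.

```python
from types import SimpleNamespace


def playlist_titles(playlist):
    """Return the title of each video in a playlist dict."""
    return [item["pafy"].title for item in playlist["items"]]


# Stand-in for the dict returned by pafy.get_playlist().
sample = {
    "title": "Sample playlist",
    "items": [
        {"pafy": SimpleNamespace(title="First video"), "playlist_meta": {}},
        {"pafy": SimpleNamespace(title="Second video"), "playlist_meta": {}},
    ],
}
print(playlist_titles(sample))  # prints ['First video', 'Second video']
```

Note that with the default arguments to pafy.get_playlist() nothing is fetched up front, so reading an attribute on a real Pafy object may trigger a request for that video the first time it is called for.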