seqparse package

Submodules

seqparse.containers module

Classes utilized by the Seqparse class.

class seqparse.containers.FileExtension(name=None, parent=None)[source]

Bases: _abcoll.MutableMapping

Container for frame sequences, indexed by zero-padding.

Parameters:
  • name (str, optional) – The file extension used by the contents of the container (e.g., “exr”, “tif”).
  • parent (FileSequenceContainer, optional) – The container from which this instance was spawned.
name

str – name of the file extension.

output()[source]

Calculate a sorted list of all contained file extensions.

Yields:FrameSequence, sorted by zero-pad length.
parent

FileSequenceContainer – parent of the instance.

class seqparse.containers.FileSequenceContainer(name=None, file_path=None)[source]

Bases: _abcoll.MutableMapping

Container for file sequences, indexed by file extension.

Parameters:
  • name (str, optional) – Base name of the contained files.
  • file_path (str, optional) – Directory in which the contained files reside.
full_name

str – Full (base) name of the file sequence.

name

str – Base name of the file sequence (no containing directory).

output()[source]

Calculate a sorted list of all contained file sequences.

Yields:
FileSequence, sorted (in order) by file path, extension, and zero-padding length.

path

str – directory in which the contained files reside.

class seqparse.containers.SingletonContainer(file_names=None, file_path=None)[source]

Bases: _abcoll.MutableSet

Container for singleton files, indexed alphabetically by file path.

Parameters:
  • file_names (list-like of str, optional) – List of base file names to store in the container.
  • file_path (str, optional) – Directory in which the contained files reside.
add(item)[source]

Defining item addition logic (per standard set).

cache_stat(base_name, input_stat)[source]

Cache file system stat data for the specified file base name.

Input disk stat value will be stored in a new stat_result instance.

Parameters:
  • base_name (str) – Base name of the file for which the supplied disk stats are being cached.
  • input_stat (stat_result) – Value that you’d like to cache.
Returns:

stat_result that was successfully cached.

discard(item)[source]

Defining item discard logic (per standard set).

output()[source]

Calculate a formatted list of all contained singleton files.

Yields:File, sorted alphabetically.
path

Directory in which the contained files are located.

stat(base_name=None)[source]

Individual file system status, indexed by base name.

This method only returns cached disk stats (if any exist). Use the cache_stat method if you’d like to set new values.

Parameters:base_name (str, optional) – Base name of the file for which you’d like to return the disk stats.
Returns:
  • None if a file has been specified but disk stats have not been cached.
  • stat_result if a file has been specified and disk stats have been previously cached.
  • dict of disk stats, indexed by str base name, if no name has been specified.
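The three return cases can be illustrated with a small stand-alone sketch. The StatCache class and the CachedStat record below are hypothetical names invented for this illustration; they are not part of seqparse and only mimic the lookup behaviour described above.

```python
from collections import namedtuple

# Hypothetical stand-in for a cached stat record; not seqparse's own type.
CachedStat = namedtuple("CachedStat", ["st_size", "st_mtime"])


class StatCache:
    """Toy cache keyed by file base name (illustrative only)."""

    def __init__(self):
        self._stats = {}

    def cache_stat(self, base_name, input_stat):
        # Copy the relevant fields into a new record and return what was stored.
        cached = CachedStat(input_stat.st_size, input_stat.st_mtime)
        self._stats[base_name] = cached
        return cached

    def stat(self, base_name=None):
        if base_name is None:
            # No name given: return every cached entry, keyed by base name.
            return dict(self._stats)
        # Name given: the cached record, or None if nothing was cached for it.
        return self._stats.get(base_name)


cache = StatCache()
cache.cache_stat("a.jpg", CachedStat(st_size=10, st_mtime=100))
```

In this sketch the lookup never touches the disk, which matches the note that only cached values are returned.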
update(iterable)[source]

Defining item update logic (per standard set).

seqparse.files module

Singleton file-related data structures utilized by the Seqparse module.

class seqparse.files.File(file_name, stat=None)[source]

Bases: object

Simple representation of files on disk.

Parameters:
  • file_name (str) – Full path to the input file.
  • stat (stat_result, optional) – Disk stats you’d like to cache for the specified file.
full_name

str – Full name of the file, including containing directory.

mtime

int – Modification time of the file.

Returns None if the files have not been stat’d on disk.

name

str – Base name of the file (no containing directory).

path

str – Directory in which the contained files are located.

size

int – Size of the file in bytes.

Returns None if the files have not been stat’d on disk.

stat(force=False, lazy=False)[source]

File system status.

Parameters:
  • force (bool, optional) – Whether to force disk stat query, regardless of caching status.
  • lazy (bool, optional) – Whether to query disk stats should no cached value exist.
Returns:

None if disk stats have not been cached for the file. stat_result if disk stats have been previously cached.
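One plausible reading of the force and lazy flags is sketched below. CachedFile is a hypothetical class written for this illustration, and the exact precedence between the two flags in seqparse itself is an assumption.

```python
import os
import tempfile


class CachedFile:
    """Hypothetical file wrapper; illustrates the force/lazy flags only."""

    def __init__(self, file_name):
        self.file_name = file_name
        self._stat = None

    def stat(self, force=False, lazy=False):
        # force: always query the disk; lazy: query only on a cache miss.
        if force or (lazy and self._stat is None):
            self._stat = os.stat(self.file_name)
        return self._stat


with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"12345")
    tmp_name = handle.name

item = CachedFile(tmp_name)
before = item.stat()                        # nothing cached, no disk query
lazy_size = item.stat(lazy=True).st_size    # cache miss, so the disk is queried

with open(tmp_name, "ab") as handle:
    handle.write(b"678")

stale_size = item.stat(lazy=True).st_size   # cache hit, stale value returned
fresh_size = item.stat(force=True).st_size  # forced refresh from disk
os.remove(tmp_name)
```

The stale value in the third call shows why force exists: a lazy query never refreshes an entry that is already cached.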

seqparse.regex module

Container for all regular expressions used by the seqparse module.

class seqparse.regex.SeqparseRegexMixin[source]

Bases: object

Base for classes that need to perform regular expression matches.

bits_match(val, as_dict=False)[source]

Calculate first, last, step for valid string frame chunks.

Parameters:
  • val (str) – Input chunk of a frame range.
  • as_dict (bool, optional) – Whether to output return values as a dict of regex groups. Defaults to False.
Returns:

None if input is an invalid chunk, tuple consisting of (first frame, last frame, frame step) with as_dict = False, or dict of regex groups with as_dict = True.
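As a rough illustration of the return shapes described above, the sketch below matches a single chunk with a regular expression. The pattern and the chunk_match function are assumptions made for this example (non-negative frames only, groups returned as raw strings); the expression actually used by seqparse may differ.

```python
import re

# Hypothetical pattern for one chunk: "first", "first-last" or "first-lastxstep".
CHUNK_EXPR = re.compile(r"(?P<first>\d+)(?:-(?P<last>\d+)(?:x(?P<step>\d+))?)?")


def chunk_match(val, as_dict=False):
    """Return None, a (first, last, step) tuple, or a dict of regex groups."""
    match = CHUNK_EXPR.fullmatch(val)
    if match is None:
        return None
    if as_dict:
        return match.groupdict()
    return match.group("first"), match.group("last"), match.group("step")
```

Groups that are absent from the input, such as the step in "0001-0004", come back as None in both the tuple and the dict form.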

file_name_match(val, as_dict=False)[source]

Calculate base name, frame, extension for valid string file name.

Parameters:
  • val (str) – Input file name.
  • as_dict (bool, optional) – Whether to output return values as a dict of regex groups. Defaults to False.
Returns:

None if input is an invalid sequence file name, tuple consisting of (base name, frame, extension) with as_dict = False, or dict of regex groups with as_dict = True.
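A minimal sketch of this kind of match, assuming dot-delimited names of the form base.frame.ext. The pattern and the split_file_name function are hypothetical and written only for this example.

```python
import re

# Assumed layout: <base>.<digits>.<extension>, e.g. "TEST_DIR.0001.tif".
FILE_NAME_EXPR = re.compile(r"(?P<base>.+)\.(?P<frame>\d+)\.(?P<ext>[^.]+)")


def split_file_name(val, as_dict=False):
    """Return None, a (base, frame, ext) tuple, or a dict of regex groups."""
    match = FILE_NAME_EXPR.fullmatch(val)
    if match is None:
        return None
    if as_dict:
        return match.groupdict()
    return match.group("base"), match.group("frame"), match.group("ext")
```

A name without a numeric frame field, such as "SINGLETON.jpg", does not match and yields None.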

file_seq_match(val, as_dict=False)[source]

Calculate base name, sequence, extension for valid file sequence.

Parameters:
  • val (str) – Input file sequence.
  • as_dict (bool, optional) – Whether to output return values as a dict of regex groups. Defaults to False.
Returns:

None if input is an invalid file sequence, tuple consisting of (base name, frame sequence, extension) with as_dict = False, or dict of regex groups with as_dict = True.

is_frame_sequence(val)[source]

Whether a string frame sequence is valid.

Parameters:val (str) – Input frame sequence.
Returns:True if frame sequence is valid, False if it is not.

seqparse.seqparse module

The main engine for the seqparse module.

class seqparse.seqparse.Seqparse[source]

Bases: seqparse.regex.SeqparseRegexMixin

Storage and parsing engine for file sequences.

The primary usage for this class is to scan specified locations on disk for file sequences and singletons (any file that cannot be classified as a member of a file sequence).

Examples

With the following file structure ...

test_dir/
    TEST_DIR.0001.tif
    TEST_DIR.0002.tif
    TEST_DIR.0003.tif
    TEST_DIR.0004.tif
    TEST_DIR.0010.tif
    SINGLETON.jpg
>>> from seqparse.seqparse import Seqparse
>>> parser = Seqparse()
>>> parser.scan_path("test_dir")
>>> for item in parser.output():
...     print str(item)
...
test_dir/TEST_DIR.0001-0004,0010.tif
test_dir/SINGLETON.jpg
>>> for item in parser.output(seqs_only=True):
...     print str(item)
...
test_dir/TEST_DIR.0001-0004,0010.tif
>>> for item in parser.output(missing=True):
...     print str(item)
...
test_dir/TEST_DIR.0005-0009.tif
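The split between sequences and singletons shown above can be approximated without touching the disk. The sketch below is not how Seqparse is implemented; group_files and its pattern are hypothetical, and keying on the number of frame digits is a simplification of real padding rules.

```python
import re
from collections import defaultdict

# Assumed layout for sequence members: <base>.<digits>.<extension>.
NAME_EXPR = re.compile(r"(?P<base>.+)\.(?P<frame>\d+)\.(?P<ext>[^.]+)")


def group_files(file_names):
    """Split names into {(base, ext, digits): frames} and a singleton list."""
    sequences = defaultdict(set)
    singletons = []
    for file_name in file_names:
        match = NAME_EXPR.fullmatch(file_name)
        if match is None:
            # No frame field, so the file cannot belong to a sequence.
            singletons.append(file_name)
            continue
        frame = match.group("frame")
        key = (match.group("base"), match.group("ext"), len(frame))
        sequences[key].add(int(frame))
    return dict(sequences), sorted(singletons)


names = ["TEST_DIR.%04d.tif" % frame for frame in (1, 2, 3, 4, 10)]
names.append("SINGLETON.jpg")
sequences, singletons = group_files(names)
```

Note that this toy version treats a lone numbered file as a one-frame sequence, which may not match the parser's own classification.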
add_file(file_name)[source]

Add a file to the parser instance.

Parameters:file_name (str) – The name of the file you’d like to add to the parser.
Returns:None
locations

A dictionary of tracked singletons and file sequences.

output(missing=False, seqs_only=False)[source]

Yield a list of contained singletons and file sequences.

Parameters:
  • missing (bool, optional) – Whether to yield “inverted” file sequences (i.e., the missing files). Defaults to False. NOTE: Using this option implies that seqs_only == True.
  • seqs_only (bool, optional) – Whether to only yield file sequences (if any). Defaults to False.
Yields:

File and/or FileSequence instances, depending on input arguments.

scan_options

A dictionary of options used while scanning disk for files.

scan_path(search_paths, max_levels=-1, min_levels=-1)[source]

Scan supplied path, add all discovered files to the instance.

Parameters:
  • search_paths (str) – The location(s) on disk you’d like to scan for file sequences and singletons.
  • max_levels (int, optional) – Descend at most the specified number (a non-negative integer) of directories below the starting point. max_levels == 0 means only scan the starting-point itself.
  • min_levels (int, optional) – Do not scan at levels less than specified number (a non-negative integer). min_levels == 1 means scan all levels except the starting-point.
Returns:

None
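The max_levels and min_levels semantics resemble the -maxdepth and -mindepth options of find. The following sketch shows one way to get that behaviour with os.walk; walk_levels is a hypothetical helper written for this illustration, not the scanner used by seqparse, and negative values are assumed to mean "no limit".

```python
import os
import shutil
import tempfile


def walk_levels(root, max_levels=-1, min_levels=-1):
    """Yield (directory, file_names) pairs within the requested depth window."""
    root = os.path.abspath(root)
    for current, dir_names, file_names in os.walk(root):
        rel = os.path.relpath(current, root)
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if max_levels >= 0 and level >= max_levels:
            # Prune in place so os.walk does not descend any further.
            dir_names[:] = []
        if min_levels >= 0 and level < min_levels:
            continue
        yield current, sorted(file_names)


# Build a small throwaway tree: top.txt, a/mid.txt, a/b/deep.txt.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "a", "b"))
for rel_path in ("top.txt", os.path.join("a", "mid.txt"),
                 os.path.join("a", "b", "deep.txt")):
    open(os.path.join(demo_root, rel_path), "w").close()

only_root = [names for _, names in walk_levels(demo_root, max_levels=0)]
below_root = [names for _, names in walk_levels(demo_root, min_levels=1)]
shutil.rmtree(demo_root)
```

Here max_levels=0 visits only the starting point, while min_levels=1 visits everything except the starting point.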

sequences

A dictionary of tracked file sequences.

singletons

A dictionary of tracked singleton files.

static validate_frame_sequence(frame_seq)[source]

Whether the supplied frame (not file) sequence is valid.

Parameters:frame_seq (str) – The string representation of the frame sequence to validate.
Returns:None if the supplied sequence is invalid, a (possibly corrected) string file sequence if valid.

Examples

>>> from seqparse.seqparse import Seqparse
>>> parser = Seqparse()
>>> print parser.validate_frame_sequence("0001-0001")
0001
>>> print parser.validate_frame_sequence("0001-")
None
>>> print parser.validate_frame_sequence("3,1,5,7")
1-7x2

seqparse.sequences module

Sequence-related data structures utilized by the Seqparse module.

class seqparse.sequences.FileSequence(name=None, frames=None, ext=None, pad=1)[source]

Bases: seqparse.sequences.FrameSequence

Representative for sequences of files.

You may create a new instance of this class a couple of ways:

  1. Either clone a new instance from an existing one ...

    >>> clone = FileSequence(parent)
    
  2. ... or create a valid instance by providing a file extension and at least one frame (the base name is optional in every sense of the word):

    >>> fseq = FileSequence(frames="0001", ext="exr")
    
Parameters:
  • name (str, optional) – Base name (including containing directory) for the file sequence.
  • frames (many types, optional) – Initial frame range to store in the instance. Acceptable input types include FrameChunk, FrameSequence instances, string representation of a frame sequence, list, set, or tuple of integer frames.
  • ext (str, optional) – File extension for the sequence.
  • pad (int, optional) – Frame padding for the sequence. Defaults to 1.
cache_stat(frame, input_stat)[source]

Cache file system stat data for the specified frame.

Input disk stat value will be stored in a new stat_result instance.

Parameters:
  • frame (int) – Frame for which you’d like to cache the supplied stat data.
  • input_stat (stat_result) – Value that you’d like to cache.
Returns:

stat_result that was successfully cached.

calculate(force=False)[source]

Calculate the output file sequence.

The output string is always recalculated if the instance has been marked as “dirty”, which happens when its contents have been modified by an external process.

Parameters:force (bool, optional) – Whether to force recalculation.
Returns:None
ctime

int – Most recent inode or file change time for a file in the sequence.

Returns None if the files have not been stat’d on disk.

discard(item)[source]

Defining item discard logic (per standard set).

ext

str – File extension for the sequence.

frames

iterator(str) – the file sequence’s padded frames.

full_name

str – Full name of the sequence, including containing directory.

invert()[source]

Calculate file names missing from the sequence.

Returns:FileSequence containing the missing files (if any).
mtime

int – Most recent file modification time for a file in the sequence.

Returns None if the files have not been stat’d on disk.

name

str – Base name of the file sequence (no containing directory).

Note: Setting this property will modify both the full_name and path properties.

path

str – Directory in which the contained files are located.

Note: Setting the name property will reset the contained value.

pretty_frames

str – pretty representation of the file sequence’s print frames.

size

int – Total size of the file sequence in bytes.

Returns None if the files have not been stat’d on disk.

stat(frame=None, force=False, lazy=False)[source]

Individual frame file system status.

Parameters:
  • frame (int, optional) – Frame for which you’d like to return the disk stats.
  • force (bool, optional) – Whether to force disk stat query, regardless of caching status.
  • lazy (bool, optional) – Whether to query disk stats should no cached value exist.
Returns:

  • None if a frame has been specified but disk stats have not been cached.
  • stat_result if a frame has been specified and disk stats have been previously cached.
  • dict of disk stats, indexed by int frame, if no frame has been specified.

update(other)[source]

Defining item update logic (per standard set).

class seqparse.sequences.FrameSequence(frames=None, pad=1)[source]

Bases: _abcoll.MutableSet, seqparse.regex.SeqparseRegexMixin

Representative for zero-padded frame sequences.

Parameters:
  • frames (many types, optional) – Initial frame range to store in the instance. Acceptable input types include FrameChunk, FrameSequence instances, string representation of a frame sequence, list, set, or tuple of integer frames.
  • pad (int, optional) – Initial zero-padding for the instance. Ignored if input iterable is either a FrameChunk, FrameSequence, or string representation of a frame sequence.

Examples

All of the following will result in equivalent output:

>>> FrameSequence(range(1,6), pad=4)
>>> FrameSequence(set([1, 2, 3, 4, 5]), pad=4)
>>> FrameSequence("0001-0005")
>>> FrameSequence(FrameChunk(first=1, last=5, pad=4))
add(item)[source]

Defining item addition logic (per standard set).

calculate(force=False)[source]

Calculate the output file sequence.

The output string is always recalculated if the instance has been marked as “dirty”, which happens when its contents have been modified by an external process.

Parameters:force (bool, optional) – Whether to force recalculation.
Returns:None
discard(item)[source]

Defining item discard logic (per standard set).

invert()[source]

Calculate frames missing from the sequence.

Returns:FrameSequence containing the missing frames (if any).
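Conceptually, inversion amounts to listing the integers between the first and last frame that are absent from the sequence. A minimal sketch on plain integers follows; missing_frames is a hypothetical helper, not the library's method, and it returns a list rather than a FrameSequence.

```python
def missing_frames(frames):
    """Return the sorted frames absent between the lowest and highest frame."""
    present = set(frames)
    if not present:
        return []
    return [frame for frame in range(min(present), max(present) + 1)
            if frame not in present]
```

For frames 1-4 and 10 this gives 5 through 9, and a gap-free input gives an empty list.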
is_dirty

bool – Whether output needs to be recalculated after an update.

is_padded

bool – Whether the FrameSequence contains any zero-padded frames.

pad

int – Zero-padding for the frame sequence.

update(iterable)[source]

Defining item update logic (per standard set).

exception seqparse.sequences.SeqparsePadException(message)[source]

Bases: exceptions.Exception

Exception thrown when unexpected frame padding is encountered.

Module contents

seqparse: A nifty way to list your file sequences.

The seqparse module may be used to ...

  • Scan specified paths for file sequences and “singletons,”
  • Construct frame and file sequence from supplied values, and
  • Query disk for overall footprint of tracked files.

The module also comes supplied with a simple command-line tool named “seqls.”

Frame sequences are broken down into comma-separated chunks of the format

(first frame)-(last frame)x(step)

where the following rules apply:

  • Frame numbers can be zero-padded,
  • Frame step (increment) is always a positive integer,
  • The number of digits in a frame may exceed the padding of a sequence, e.g., “001,010,100,1000”,
  • Frame chunks with a specified step will always consist of three or more frames.

Examples of proper frame sequences:

  • Non-padded sequence, frames == (1, 3, 5, 7): 1-7x2
  • Four-padded sequence, frames == (1, 3, 5, 7): 0001-0007x2
  • Three-padded sequence, frames == (11, 13): 011,013
  • Two-padded sequence, frames == (1, 3, 5, 7, 11, 13, 102): 01-07x2,11,13,102
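The rules and examples above can be reproduced with a short greedy routine. This is only a sketch under stated assumptions: format_frames is a hypothetical function, not the algorithm seqparse uses, and it assumes that two consecutive frames may be written as a step-less "first-last" chunk while stepped chunks need three or more frames.

```python
def format_frames(frames, pad=1):
    """Collapse integer frames into "first-lastxstep" chunks (sketch only)."""
    ordered = sorted(set(frames))
    chunks = []
    index = 0
    while index < len(ordered):
        first = ordered[index]
        end = index
        step = None
        if index + 1 < len(ordered):
            step = ordered[index + 1] - first
            end = index + 1
            # Extend the run while the spacing between frames stays constant.
            while end + 1 < len(ordered) and ordered[end + 1] - ordered[end] == step:
                end += 1
        length = end - index + 1
        if step == 1 and length >= 2:
            # Consecutive frames: no explicit step is written.
            chunks.append("%0*d-%0*d" % (pad, first, pad, ordered[end]))
            index = end + 1
        elif step is not None and length >= 3:
            # Stepped chunks need at least three frames.
            chunks.append("%0*d-%0*dx%d" % (pad, first, pad, ordered[end], step))
            index = end + 1
        else:
            # Too short for a chunk: emit the single frame and move on.
            chunks.append("%0*d" % (pad, first))
            index += 1
    return ",".join(chunks)
```

Frames wider than the padding, such as 102 in a two-padded sequence, are written out in full rather than truncated.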
seqparse.get_parser()[source]

Create a new Seqparse instance.

Returns:Valid Seqparse instance.

Examples

>>> from seqparse import get_parser
>>> get_parser()
Seqparse(sequences=0, singletons=0)
seqparse.get_sequence(frames, pad=1)[source]

Create a new FrameSequence instance.

Parameters:
  • frames (str, list, tuple, or set) – Either a string representation of a valid frame sequence or a list-like iterable of integer frames.
  • pad (int, optional) – Desired zero-padding for the new FrameSequence instance. Defaults to 1.
Returns:

Valid FrameSequence instance.

Examples

>>> from seqparse import get_sequence
>>> get_sequence(range(5))
FrameSequence(pad=1, frames=set([0, 1, 2, 3, 4]))
>>> get_sequence([1, 2, 3])
FrameSequence(pad=1, frames=set([1, 2, 3]))
>>> get_sequence("0001-0005x2")
FrameSequence(pad=4, frames=set([1, 3, 5]))
seqparse.get_version(pretty=False)[source]

Report which version of seqparse you’re using.

Parameters:pretty (bool) –
Returns:str seqparse version.
seqparse.invert(iterable)[source]

Create an iterator representing a sequence’s missing frames or files.

Parameters:iterable (FrameChunk, FrameSequence, or FileSequence) – Iterable that you’d like to invert.
Returns:
Valid (inverted) FrameSequence for input FrameChunk and FrameSequence instances, FileSequence for input FileSequence instances. Note: If the input sequence contains no gaps, an empty sequence instance is returned.

Examples

>>> from seqparse import get_sequence, invert
>>> seq = get_sequence("0001-0005x2")
>>> print repr(seq), str(seq)
FrameSequence(pad=4, frames=set([1, 3, 5])) 0001-0005x2
>>> inverted = invert(seq)
>>> print repr(inverted), str(inverted)
FrameSequence(pad=4, frames=set([2, 4])) 0002,0004
seqparse.validate_frame_sequence(frame_seq)[source]

Whether the supplied string frame (not file) sequence is valid.

Parameters:frame_seq (str) – The frame sequence you’d like to validate.
Returns:None for invalid inputs, corrected/validated str frame sequence for valid input (see below for examples).

Examples

>>> from seqparse import validate_frame_sequence
>>> print validate_frame_sequence("0001-0001")
0001
>>> print validate_frame_sequence("0001-")
None