seqparse package¶
Submodules¶
seqparse.containers module¶
Classes utilized by the Seqparse class.
-
class
seqparse.containers.
FileExtension
(name=None, parent=None)[source]¶ Bases:
_abcoll.MutableMapping
Container for frame sequences, indexed by zero-padding.
Parameters: - name (str, optional) – The file extension used by the contents of the container (e.g., “exr”, “tif”).
- parent (FileSequenceContainer, optional) – The container from which this instance was spawned.
-
name
¶ str – name of the file extension.
-
output
()[source]¶ Calculate a sorted list of all contained file extensions.
Yields: FrameSequence, sorted by zero-pad length.
-
parent
¶ FileSequenceContainer – parent of the instance.
-
class
seqparse.containers.
FileSequenceContainer
(name=None, file_path=None)[source]¶ Bases:
_abcoll.MutableMapping
Container for file sequences, indexed by file extension.
Parameters: - name (str, optional) – Base name of the contained files.
- file_path (str, optional) – Directory in which the contained files reside.
-
full_name
¶ str – Full (base) name of the file sequence.
-
name
¶ str – Base name of the file sequence (no containing directory).
-
output
()[source]¶ Calculate a sorted list of all contained file sequences.
Yields: FileSequence, sorted (in order) by file path, extension, and zero-padding length.
-
path
¶ str – directory in which the contained files reside.
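The two mapping containers above nest: file sequences are indexed by extension, and each extension holds frame sequences indexed by zero-padding. The following is a minimal sketch of that layout using plain dictionaries. It does not import seqparse; the pattern `FRAME_FILE`, the function `group_by_extension_and_pad`, and the leading-zero padding heuristic are all assumptions made for illustration, not the library's implementation.

```python
import re
from collections import defaultdict

# Hypothetical pattern for "<base>.<digits>.<ext>"; not seqparse's own regex.
FRAME_FILE = re.compile(r"^(?P<base>.+)\.(?P<frame>\d+)\.(?P<ext>[^.]+)$")


def group_by_extension_and_pad(file_names):
    """Group names as base name -> extension -> pad length -> set of frames."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(set)))
    for file_name in file_names:
        match = FRAME_FILE.match(file_name)
        if match is None:
            continue  # Not a numbered file; a real scan would track a singleton.
        frame = match.group("frame")
        # Simplification: a leading zero implies padding equal to the digit count.
        pad = len(frame) if frame.startswith("0") else 1
        tree[match.group("base")][match.group("ext")][pad].add(int(frame))
    return tree


names = ["shot.0001.exr", "shot.0002.exr", "shot.0001.tif", "notes.txt"]
tree = group_by_extension_and_pad(names)
```

Here `tree["shot"]` plays the role of a container indexed by extension, and `tree["shot"]["exr"]` the role of a container indexed by padding.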
-
class
seqparse.containers.
SingletonContainer
(file_names=None, file_path=None)[source]¶ Bases:
_abcoll.MutableSet
Container for singleton files, indexed alphabetically by file path.
Parameters: - file_names (list-like of str, optional) – List of base file names to store in the container.
- file_path (str, optional) – Directory in which the contained files reside.
-
cache_stat
(base_name, input_stat)[source]¶ Cache file system stat data for the specified file base name.
Input disk stat value will be stored in a new stat_result instance.
Parameters: - base_name (str) – Base name of the file for which the supplied disk stats are being cached.
- input_stat (stat_result) – Value that you’d like to cache.
Returns: stat_result that was successfully cached.
-
output
()[source]¶ Calculate a formatted list of all contained singleton files.
Yields: File, sorted alphabetically.
-
path
¶ Directory in which the contained files are located.
-
stat
(base_name=None)[source]¶ Individual file system status, indexed by base name.
This method only returns cached disk stats (if any exist). Use the cache_stat method if you’d like to set new values.
Parameters: base_name (str, optional) – Base name of the file for which you’d like to return the disk stats.
Returns: - None if a file has been specified but disk stats have not been cached.
- stat_result if a file has been specified and disk stats have been previously cached.
- dict of disk stats, indexed by str base name, if no name has been specified.
seqparse.files module¶
Singleton file-related data structures utilized by the Seqparse module.
-
class
seqparse.files.
File
(file_name, stat=None)[source]¶ Bases:
object
Simple representation of files on disk.
Parameters: - file_name (str) – Full path to the input file.
- stat (stat_result, optional) – Disk stats you’d like to cache for the specified file.
-
full_name
¶ str – Full name of the file, including containing directory.
-
mtime
¶ int – Modification time of the file.
Returns None if the file has not been stat’d on disk.
-
name
¶ str – Base name of the file (no containing directory).
-
path
¶ str – Directory in which the file is located.
-
size
¶ int – Size of the file in bytes.
Returns None if the file has not been stat’d on disk.
-
stat
(force=False, lazy=False)[source]¶ File system status.
Parameters: - force (bool, optional) – Whether to force disk stat query, regardless of caching status.
- lazy (bool, optional) – Whether to query disk stats should no cached value exist.
Returns: None if disk stats have not been cached, stat_result if disk stats have been previously cached.
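The interplay of the force and lazy flags can be illustrated with a small, self-contained sketch. The class `CachedStat` below is hypothetical and only mimics the documented behaviour (cached value by default, disk query on a cache miss when lazy, disk query always when force); it is not the seqparse implementation.

```python
import os
import tempfile


class CachedStat(object):
    """Hypothetical stand-in for an object that caches os.stat results."""

    def __init__(self, file_name):
        self.file_name = file_name
        self._stat = None  # Nothing is cached until a disk query happens.

    def stat(self, force=False, lazy=False):
        # force: always query the disk; lazy: query only on a cache miss.
        if force or (lazy and self._stat is None):
            self._stat = os.stat(self.file_name)
        return self._stat


with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"abc")
    temp_name = handle.name

cached = CachedStat(temp_name)
before = cached.stat()           # No cached value and no disk query.
after = cached.stat(lazy=True)   # Cache miss, so the disk is queried once.
os.remove(temp_name)
still_cached = cached.stat()     # Served from the cache; the file is gone.
```

With neither flag set, the sketch never touches the disk, which matches the documented case of returning None when nothing has been cached.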
seqparse.regex module¶
Container for all regular expressions used by the seqparse module.
-
class
seqparse.regex.
SeqparseRegexMixin
[source]¶ Bases:
object
Base for classes that need to perform regular expression matches.
-
bits_match
(val, as_dict=False)[source]¶ Calculate first, last, step for valid string frame chunks.
Parameters: - val (str) – Input chunk of a frame range.
- as_dict (bool, optional) – Whether to output return values as a dict of regex groups. Defaults to False.
Returns: None if input is an invalid chunk, tuple consisting of (first frame, last frame, frame step) with as_dict = False, or dict of regex groups with as_dict = True.
-
file_name_match
(val, as_dict=False)[source]¶ Calculate base name, frame, extension for valid string file name.
Parameters: - val (str) – Input file name.
- as_dict (bool, optional) – Whether to output return values as a dict of regex groups. Defaults to False.
Returns: None if input is an invalid sequence file name, tuple consisting of (base name, frame, extension) with as_dict = False, or dict of regex groups with as_dict = True.
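As an illustration of the tuple and dict return shapes described above, here is a minimal sketch built on Python's `re` module. The pattern `FILE_NAME` and the function `split_file_name` are assumptions for demonstration only; the regular expressions shipped with seqparse may differ.

```python
import re

# Hypothetical pattern: optional "<name>." prefix, then "<frame>.<ext>".
FILE_NAME = re.compile(r"^(?:(?P<name>.+)\.)?(?P<frame>\d+)\.(?P<ext>[^.]+)$")


def split_file_name(val, as_dict=False):
    """Return (name, frame, ext), a dict of groups, or None if no match."""
    match = FILE_NAME.match(val)
    if match is None:
        return None
    if as_dict:
        return match.groupdict()
    return match.group("name"), match.group("frame"), match.group("ext")


as_tuple = split_file_name("TEST_DIR.0001.tif")
as_groups = split_file_name("0001.exr", as_dict=True)
not_a_frame = split_file_name("SINGLETON.jpg")
```

The frame is kept as a string so that its zero-padding is not lost.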
-
file_seq_match
(val, as_dict=False)[source]¶ Calculate base name, sequence, extension for valid file sequence.
Parameters: - val (str) – Input file sequence.
- as_dict (bool, optional) – Whether to output return values as a dict of regex groups. Defaults to False.
Returns: None if input is an invalid file sequence, tuple consisting of (base name, frame sequence, extension) with as_dict = False, or dict of regex groups with as_dict = True.
seqparse.seqparse module¶
The main engine for the seqparse module.
-
class
seqparse.seqparse.
Seqparse
[source]¶ Bases:
seqparse.regex.SeqparseRegexMixin
Storage and parsing engine for file sequences.
The primary usage for this class is to scan specified locations on disk for file sequences and singletons (any file that cannot be classified as a member of a file sequence).
Examples
With the following file structure ...
- test_dir/
  - TEST_DIR.0001.tif
  - TEST_DIR.0002.tif
  - TEST_DIR.0003.tif
  - TEST_DIR.0004.tif
  - TEST_DIR.0010.tif
  - SINGLETON.jpg
>>> from seqparse.seqparse import Seqparse
>>> parser = Seqparse()
>>> parser.scan_path("test_dir")
>>> for item in parser.output():
...     print str(item)
...
test_dir/TEST_DIR.0001-0004,0010.tif
test_dir/SINGLETON.jpg
>>> for item in parser.output(seqs_only=True):
...     print str(item)
...
test_dir/TEST_DIR.0001-0004,0010.tif
>>> for item in parser.output(missing=True):
...     print str(item)
...
test_dir/TEST_DIR.0005-0009.tif
-
add_file
(file_name)[source]¶ Add a file to the parser instance.
Parameters: file_name (str) – The name of the file you’d like to add to the parser.
Returns: None
-
locations
¶ A dictionary of tracked singletons and file sequences.
-
output
(missing=False, seqs_only=False)[source]¶ Yield a list of contained singletons and file sequences.
Parameters: - missing (bool, optional) – Whether to yield “inverted” file sequences (ie, the missing files). Defaults to False. NOTE: Using this option implies that seqs_only == True.
- seqs_only (bool, optional) – Whether to only yield file sequences (if any). Defaults to False.
Yields: File and/or FileSequence instances, depending on input arguments.
-
scan_options
¶ A dictionary of options used while scanning disk for files.
-
scan_path
(search_paths, max_levels=-1, min_levels=-1)[source]¶ Scan supplied path, add all discovered files to the instance.
Parameters: - search_paths (str) – The location(s) on disk you’d like to scan for file sequences and singletons.
- max_levels (int, optional) – Descend at most the specified number (a non-negative integer) of directories below the starting point. max_levels == 0 means only scan the starting-point itself.
- min_levels (int, optional) – Do not scan at levels less than specified number (a non-negative integer). min_levels == 1 means scan all levels except the starting-point.
Returns: None
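The max_levels and min_levels semantics resemble the depth options of the Unix find command. A minimal sketch of such a depth-limited walk using `os.walk` follows. The function `walk_levels` is hypothetical, and treating negative values as "no limit" is an assumption based on the default of -1; the library's own traversal may work differently.

```python
import os
import shutil
import tempfile


def walk_levels(root, max_levels=-1, min_levels=-1):
    """Yield (directory, file names) pairs within the requested depth range."""
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    for dir_path, dir_names, file_names in os.walk(root):
        level = dir_path.rstrip(os.sep).count(os.sep) - base_depth
        if max_levels >= 0 and level >= max_levels:
            del dir_names[:]  # Prune in place so os.walk descends no further.
        if min_levels >= 0 and level < min_levels:
            continue  # Too shallow to report, but keep descending.
        yield dir_path, sorted(file_names)


# Build a throwaway tree: top.txt, a/mid.txt, a/b/deep.txt.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, "a", "b"))
for parts in (("top.txt",), ("a", "mid.txt"), ("a", "b", "deep.txt")):
    open(os.path.join(top, *parts), "w").close()

only_top = [files for _, files in walk_levels(top, max_levels=0)]
below_top = [files for _, files in walk_levels(top, min_levels=1)]
one_down = [files for _, files in walk_levels(top, max_levels=1)]
shutil.rmtree(top)
```

Pruning `dir_names` in place is the documented way to stop `os.walk` from descending further in top-down mode.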
-
sequences
¶ A dictionary of tracked file sequences.
-
singletons
¶ A dictionary of tracked singleton files.
-
static
validate_frame_sequence
(frame_seq)[source]¶ Whether the supplied frame (not file) sequence is valid.
Parameters: frame_seq (str) – The string representation of the frame sequence to validate. Returns: None if the supplied sequence is invalid, a (possibly corrected) string file sequence if valid. Examples
>>> from seqparse.seqparse import Seqparse >>> parser = Seqparse() >>> print parser.validate_frame_sequence("0001-0001") 0001 >>> print parser.validate_frame_sequence("0001-") None >>> print parser.validate_frame_sequence("3,1,5,7") 1-7x2
seqparse.sequences module¶
Sequence-related data structures utilized by the Seqparse module.
-
class
seqparse.sequences.
FileSequence
(name=None, frames=None, ext=None, pad=1)[source]¶ Bases:
seqparse.sequences.FrameSequence
Representative for sequences of files.
You may create a new instance of this class a couple of ways:
Either clone a new instance from an existing one ...
>>> clone = FileSequence(parent)
... or create a valid instance by providing a file extension and at least one frame (the base name is optional in every sense of the word):
>>> fseq = FileSequence(frames="0001", ext="exr")
Parameters: - name (str, optional) – Base name (including containing directory) for the file sequence.
- frames (many types, optional) – Initial frame range to store in the instance. Acceptable input types include FrameChunk, FrameSequence instances, string representation of a frame sequence, list, set, or tuple of integer frames.
- ext (str, optional) – File extension for the sequence.
- pad (int, optional) – Frame padding for the sequence. Defaults to 1.
-
cache_stat
(frame, input_stat)[source]¶ Cache file system stat data for the specified frame.
Input disk stat value will be stored in a new stat_result instance.
Parameters: - frame (int) – Frame for which you’d like to cache the supplied stat data.
- input_stat (stat_result) – Value that you’d like to cache.
Returns: stat_result that was successfully cached.
-
calculate
(force=False)[source]¶ Calculate the output file sequence.
The output string is always recalculated if the instance has been marked as “dirty”, which happens when its contents have been modified by an external process.
Parameters: force (bool, optional) – Whether to force recalculation.
Returns: None
-
ctime
¶ int – Most recent inode or file change time for a file in the sequence.
Returns None if the files have not been stat’d on disk.
-
ext
¶ str – File extension for the sequence.
-
frames
¶ iterator(str) – the file sequence’s padded frames.
-
full_name
¶ str – Full name of the sequence, including containing directory.
-
invert
()[source]¶ Calculate file names missing from the sequence.
Returns: FileSequence containing the missing files (if any).
-
mtime
¶ int – Most recent file modification time for a file in the sequence.
Returns None if the files have not been stat’d on disk.
-
name
¶ str – Base name of the file sequence (no containing directory).
Note: Setting this property will modify both the full_name and path properties.
-
path
¶ str – Directory in which the contained files are located.
Note: Setting the name property will reset the contained value.
-
pretty_frames
¶ str – Pretty (print-friendly) representation of the file sequence’s frames.
-
size
¶ int – Total size of the file sequence in bytes.
Returns None if the files have not been stat’d on disk.
-
stat
(frame=None, force=False, lazy=False)[source]¶ Individual frame file system status.
Parameters: - frame (int, optional) – Frame for which you’d like to return the disk stats.
- force (bool, optional) – Whether to force disk stat query, regardless of caching status.
- lazy (bool, optional) – Whether to query disk stats should no cached value exist.
Returns: None if a frame has been specified but disk stats have not been cached. stat_result if a frame has been specified and disk stats have been previously cached. dict of disk stats, indexed by int frame if no frame has been specified.
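The size and mtime properties above are described as aggregates over the per-frame stats (total size, most recent modification time). A minimal sketch of that aggregation over a dict of stats indexed by frame is shown below. The function `summarize_stats` and the demo file names are hypothetical, and returning a pair of None values for an empty cache is an assumption modelled on the "Returns None if the files have not been stat’d on disk" notes.

```python
import os
import shutil
import tempfile


def summarize_stats(stats):
    """Return (total size, latest mtime) for a dict of frame -> stat_result."""
    if not stats:
        return None, None  # Nothing has been stat'd yet.
    total_size = sum(item.st_size for item in stats.values())
    latest_mtime = max(int(item.st_mtime) for item in stats.values())
    return total_size, latest_mtime


# Write three small frame files and stat each one.
folder = tempfile.mkdtemp()
stats = {}
for frame, payload in ((1, b"a"), (2, b"bb"), (3, b"ccc")):
    file_name = os.path.join(folder, "demo.%04d.exr" % frame)
    with open(file_name, "wb") as handle:
        handle.write(payload)
    stats[frame] = os.stat(file_name)
shutil.rmtree(folder)

total_size, latest_mtime = summarize_stats(stats)
```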
-
class
seqparse.sequences.
FrameSequence
(frames=None, pad=1)[source]¶ Bases:
_abcoll.MutableSet, seqparse.regex.SeqparseRegexMixin
Representative for zero-padded frame sequences.
Parameters: - frames (many types, optional) – Initial frame range to store in the instance. Acceptable input types include FrameChunk, FrameSequence instances, string representation of a frame sequence, list, set, or tuple of integer frames.
- pad (int, optional) – Initial zero-padding for the instance. Ignored if the input iterable is either a FrameChunk, a FrameSequence, or a string representation of a frame sequence.
Examples
All of the following will result in equivalent output:
>>> FrameSequence(range(1, 6), pad=4)
>>> FrameSequence(set([1, 2, 3, 4, 5]), pad=4)
>>> FrameSequence("0001-0005")
>>> FrameSequence(FrameChunk(first=1, last=5, pad=4))
-
calculate
(force=False)[source]¶ Calculate the output frame sequence.
The output string is always recalculated if the instance has been marked as “dirty”, which happens when its contents have been modified by an external process.
Parameters: force (bool, optional) – Whether to force recalculation.
Returns: None
-
invert
()[source]¶ Calculate frames missing from the sequence.
Returns: FrameSequence containing the missing frames (if any).
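Conceptually, inverting a sequence means listing the frames that are absent between its first and last frame. A minimal sketch with plain integers follows; `missing_frames` is a hypothetical helper, not part of seqparse, and it returns a list rather than a FrameSequence.

```python
def missing_frames(frames):
    """Return the frames absent between the lowest and highest input frame."""
    present = set(frames)
    if not present:
        return []
    low, high = min(present), max(present)
    return [frame for frame in range(low, high + 1) if frame not in present]


gaps = missing_frames([1, 2, 3, 4, 10])
stepped = missing_frames([1, 3, 5])
no_gaps = missing_frames([1, 2, 3])
```

A sequence with no gaps yields an empty result, in line with the "(if any)" wording above.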
-
is_dirty
¶ bool – Whether output needs to be recalculated after an update.
-
is_padded
¶ bool – Whether the FrameSequence contains any zero-padded frames.
-
pad
¶ int – Zero-padding for the instance.
Module contents¶
seqparse: A nifty way to list your file sequences.
The seqparse module may be used to ...
- Scan specified paths for file sequences and “singletons,”
- Construct frame and file sequence from supplied values, and
- Query disk for overall footprint of tracked files.
The module also comes supplied with a simple command-line tool named “seqls.”
Frame sequences are broken down into comma-separated chunks of the format
(first frame)-(last frame)x(step)
where the following rules apply:
- Frame numbers can be zero-padded,
- Frame step (increment) is always a positive integer,
- The number of digits in a frame may exceed the padding of a sequence, eg “001,010,100,1000”,
- Frame chunks with a specified step will always consist of three or more frames.
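The chunk format and rules above can be turned into a small parser. The sketch below expands a frame sequence string into its integer frames. The pattern `CHUNK` and the function `expand_frame_sequence` are assumptions for illustration, and rejecting reversed ranges is a choice made for this sketch only; seqparse's own parsing may behave differently.

```python
import re

# Hypothetical chunk pattern for "(first)-(last)x(step)"; last and step optional.
CHUNK = re.compile(r"^(?P<first>\d+)(?:-(?P<last>\d+)(?:x(?P<step>\d+))?)?$")


def expand_frame_sequence(text):
    """Return the sorted integer frames described by a frame sequence string."""
    frames = set()
    for chunk in text.split(","):
        match = CHUNK.match(chunk.strip())
        if match is None:
            raise ValueError("invalid frame chunk: %r" % chunk)
        first = int(match.group("first"))
        last = int(match.group("last") or first)
        step = int(match.group("step") or 1)
        # The step must be a positive integer; reversed ranges are rejected here.
        if step < 1 or last < first:
            raise ValueError("invalid frame chunk: %r" % chunk)
        frames.update(range(first, last + 1, step))
    return sorted(frames)


stepped = expand_frame_sequence("0001-0007x2")
mixed = expand_frame_sequence("001,010,100,1000")
```

Note that converting to integers discards the zero-padding, so a full implementation would need to record it separately.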
Examples of proper frame sequences:
- Non-padded sequence, frames == (1, 3, 5, 7): 1-7x2
- Four-padded sequence, frames == (1, 3, 5, 7): 0001-0007x2
- Three-padded sequence, frames == (11, 13): 011,013
- Two-padded sequence (1, 3, 5, 7, 11, 13, 102): 01-07x2,11,13,102
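Going the other way, from integer frames to a string, the examples above can be reproduced with a greedy pass that only folds runs of three or more evenly spaced frames into a chunk. This is a sketch under that assumption; `condense_frames` is a hypothetical helper and the library may chunk frames with a different strategy (for instance, this sketch writes two consecutive frames as "1,2").

```python
def condense_frames(frames, pad=1):
    """Greedily fold integer frames into 'first-last[xstep]' chunks."""
    ordered = sorted(set(frames))
    chunks = []
    index = 0
    while index < len(ordered):
        end = index
        step = 1
        if index + 1 < len(ordered):
            step = ordered[index + 1] - ordered[index]
            # Extend the run while the spacing stays constant.
            while (end + 1 < len(ordered)
                   and ordered[end + 1] - ordered[end] == step):
                end += 1
        if end - index + 1 >= 3:
            # Runs of three or more frames collapse into a single chunk.
            chunk = "%0*d-%0*d" % (pad, ordered[index], pad, ordered[end])
            if step > 1:
                chunk += "x%d" % step
            chunks.append(chunk)
            index = end + 1
        else:
            # Shorter runs are emitted one frame at a time.
            chunks.append("%0*d" % (pad, ordered[index]))
            index += 1
    return ",".join(chunks)


two_padded = condense_frames([1, 3, 5, 7, 11, 13, 102], pad=2)
three_padded = condense_frames([11, 13], pad=3)
```

Frames with more digits than the padding (102 with a padding of 2) are written out in full, in line with the rule that the number of digits may exceed the padding.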
-
seqparse.
get_parser
()[source]¶ Create a new Seqparse instance.
Returns: Valid Seqparse instance.
Examples
>>> from seqparse import get_parser
>>> get_parser()
Seqparse(sequences=0, singletons=0)
-
seqparse.
get_sequence
(frames, pad=1)[source]¶ Create a new FrameSequence instance.
Parameters: - frames (str, list, tuple, or set) – Either a string representation of a valid frame sequence or a list-like iterable of integer frames.
- pad (int, optional) – Desired zero-padding for the new FrameSequence instance. Defaults to 1.
Returns: Valid FrameSequence instance.
Examples
>>> from seqparse import get_sequence
>>> get_sequence(range(5))
FrameSequence(pad=1, frames=set([0, 1, 2, 3, 4]))
>>> get_sequence([1, 2, 3])
FrameSequence(pad=1, frames=set([1, 2, 3]))
>>> get_sequence("0001-0005x2")
FrameSequence(pad=4, frames=set([1, 3, 5]))
-
seqparse.
get_version
(pretty=False)[source]¶ Report which version of seqparse you’re using.
Parameters: pretty (bool) –
Returns: str seqparse version.
-
seqparse.
invert
(iterable)[source]¶ Create an iterator representing a sequence’s missing frames or files.
Parameters: iterable (FrameChunk, FrameSequence, or FileSequence) – Iterable that you’d like to invert.
Returns: Valid (inverted) FrameSequence for input FrameChunk and FrameSequence instances, FileSequence for input FileSequence instances.
Note: Should the input sequence contain no gaps, an empty sequence instance will be returned.
Examples
>>> from seqparse import get_sequence, invert
>>> seq = get_sequence("0001-0005x2")
>>> print repr(seq), str(seq)
FrameSequence(pad=4, frames=set([1, 3, 5])) 0001-0005x2
>>> inverted = invert(seq)
>>> print repr(inverted), str(inverted)
FrameSequence(pad=4, frames=set([2, 4])) 0002,0004
-
seqparse.
validate_frame_sequence
(frame_seq)[source]¶ Whether the supplied string frame (not file) sequence is valid.
Parameters: frame_seq (str) – The frame sequence you’d like to validate.
Returns: None for invalid inputs, corrected/validated str frame sequence for valid input (see below for examples).
Examples
>>> from seqparse import validate_frame_sequence
>>> print validate_frame_sequence("0001-0001")
0001
>>> print validate_frame_sequence("0001-")
None