============================
tl.rename core functionality
============================

The ``tl.rename.core`` module defines those parts of the package's
functionality that are not concerned with the user interface or specific file
name transformation algorithms. It covers reading the new file names from a
file or standard input, applying all known transformations to the original
file names, and renaming the files accordingly.


Reading new file names
======================

While the ``read_names_from_file`` function is counted among tl.rename's core
functionality, it is really just another file name transformation. It is
passed a sequence of old file names and returns a sequence of new ones. If no
processing options are given, the original names are returned:

>>> from tl.rename.core import read_names_from_file
>>> read_names_from_file(['foo', 'bar/baz'])
['foo', 'bar/baz']

The function accepts an optional ``names_file`` parameter, which is assumed to
be a file-like object with each new name listed on a separate line:

>>> from StringIO import StringIO
>>> new_names = """\
... as/df
... fdsa
... """
>>> read_names_from_file(['foo', 'bar/baz'], names_file=StringIO(new_names))
['as/df', 'fdsa']

Whether the multi-line string read from the file-like object ends with a line
break makes no difference:

>>> new_names = """\
... as/df
... fdsa"""
>>> read_names_from_file(['foo', 'bar/baz'], names_file=StringIO(new_names))
['as/df', 'fdsa']

Other whitespace, including empty lines, ends up in the file names, however:

>>> new_names = """\
...
...  as/df
... fdsa """
>>> read_names_from_file(['foo', 'bar/baz'], names_file=StringIO(new_names))
['', ' as/df', 'fdsa ']


Applying all registered transformations
=======================================

How it works
------------

The ``tl.rename.core`` module keeps a list of known file name transformations,
the first of which is ``read_names_from_file``:

>>> from tl.rename.core import transformations
>>> transformations
[<function read_names_from_file at 0x...>, ...]

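The names-file behaviour and the ordered list of transformations described
above can be illustrated with a small stand-alone sketch. It does not
reproduce the package's code: ``read_names_sketch``, ``registry`` and
``apply_all`` are hypothetical names, the use of ``splitlines`` is an
assumption that merely matches the outputs documented above, and
``io.StringIO`` is used so that the sketch runs on Python 3.

```python
from io import StringIO  # Python 3; the doctests in this document use Python 2


def read_names_sketch(old_names, names_file=None, **options):
    """Hypothetical stand-in for read_names_from_file, not the package's code."""
    if names_file is None:
        # No names file given: hand back the original names unchanged.
        return list(old_names)
    # splitlines() ignores one trailing line break but keeps all other
    # whitespace, including empty lines, as part of the names.
    return names_file.read().splitlines()


# Hypothetical registry: an ordered list of transformation callables.
registry = [read_names_sketch]


def apply_all(old_names, **options):
    """Run every registered transformation in order, passing all options on."""
    names = list(old_names)
    for transformation in registry:
        # Each transformation picks the options it cares about and
        # ignores the rest via **options.
        names = transformation(names, **options)
    return names
```

With this sketch, ``apply_all(['foo', 'bar/baz'],
names_file=StringIO('as/df\nfdsa\n'))`` yields ``['as/df', 'fdsa']``, and
without a names file the names pass through unchanged.
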
The module also defines a function ``transform`` that applies the
transformations to a list of file names, in order. This function works just
like any of the individual transformations, taking a sequence of old names and
any number of optional keyword arguments and returning a list of new names.
Without any options, it also returns the original file names:

>>> from tl.rename.core import transform
>>> transform(['foo', 'bar/baz'])
['foo', 'bar/baz']

Options given to ``transform`` are passed to all transformations so each of
them can pick whichever options are of interest to it. We stick with
``read_names_from_file`` in this example:

>>> new_names = """\
... as/df
... fdsa
... """
>>> transform(['foo', 'bar/baz'], names_file=StringIO(new_names))
['as/df', 'fdsa']

Note that it is possible and allowed to change the whole file path, not just
the part after the last path separator.

If an ``applied_slice`` option is passed to ``transform``, all transformations
apply only to the specified slice of each name [#apply-slice]_:

>>> transform(['01 - foo.txt', '02 - bar/baz.ogg'],
...           applied_slice=(5, -4), names_file=StringIO(new_names))
['01 - as/df.txt', '02 - fdsa.ogg']

What it guards against
----------------------

In contrast to the individual transformations, ``transform`` makes sure that
no ambiguities arise. To begin with, old file names passed to ``transform``
must be unique [#assert-unique]_:

>>> transform(['foo', 'foo'])
Traceback (most recent call last):
AssertionError: Original names are not unique.

An exception is also raised if any of the transformations introduces an
ambiguity:

>>> new_names = """\
... as/df
... as/df
... """
>>> transform(['foo', 'bar/baz'], names_file=StringIO(new_names))
Traceback (most recent call last):
AssertionError: Result of transformation is not unique.

When slices are used, ambiguities are considered with respect to whole file
names, so the slices themselves may be ambiguous:

>>> transform(['foo', 'bar/baz'],
...           applied_slice=(1, None), names_file=StringIO(new_names))
['fas/df', 'bas/df']

Another mistake ``transform`` guards against is for a transformation to return
a different number of file names than it was passed. This works both with and
without using slices:

>>> new_names = """\
... as/df
... """
>>> transform(['foo', 'bar/baz'], names_file=StringIO(new_names))
Traceback (most recent call last):
AssertionError: Transformation changed number of names.
>>> transform(['foo', 'bar/baz'],
...           applied_slice=(1, 3), names_file=StringIO(new_names))
Traceback (most recent call last):
AssertionError: Transformation changed number of names.

>>> new_names = """\
... as/df
... fdsa
... asdf
... """
>>> transform(['foo', 'bar/baz'], names_file=StringIO(new_names))
Traceback (most recent call last):
AssertionError: Transformation changed number of names.
>>> transform(['foo', 'bar/baz'],
...           applied_slice=(1, 3), names_file=StringIO(new_names))
Traceback (most recent call last):
AssertionError: Transformation changed number of names.


Renaming files
==============

Dry-run mode
------------

The ``rename`` function finally applies the changes made by the ``transform``
run. It accepts as arguments the lists of old and new file paths, as well as
an option to turn on dry-run mode. In dry-run mode, it just prints the changes
that would be applied:

>>> from tl.rename.core import rename
>>> rename(['foo', 'bar/baz'], ['as/df', 'fdsa'], dry_run=True)
foo -> as/df
bar/baz -> fdsa

Notice how unchanged paths are discarded:

>>> rename(['foo', 'bar/baz'], ['foo', 'fdsa'], dry_run=True)
bar/baz -> fdsa

Basic usage
-----------

In order to demonstrate the ``rename`` function's actions on the file system,
we create and list sandboxes containing sample directories and files:

>>> from tl.testing.fs import new_sandbox, ls
>>> new_sandbox("""\
... d bar
... f bar/baz some content
... f foo other content
... """)
>>> ls()
d bar
f bar/baz some content
f foo other content

>>> rename(['foo', 'bar'], ['asdf', 'fdsa'])
>>> ls()
f asdf other content
d fdsa
f fdsa/baz some content

A file path may be almost any string including whitespace and non-printable
characters [#assert-valid]_:

>>> import os
>>> rename(['asdf'], [' bar\tbaz\n\xff '])
>>> sorted(os.listdir('.'))
[' bar\tbaz\n\xff ', 'fdsa']
>>> rename([' bar\tbaz\n\xff '], ['asdf'])
>>> ls()
f asdf other content
d fdsa
f fdsa/baz some content

Trying to rename an item that no longer exists by the time the transformations
are finished and renaming is undertaken results in an error:

>>> rename(['not-here'], ['whatever'])
Traceback (most recent call last):
OSError: [Errno 2] No such file or directory

The renaming of directories doesn't care about trailing path separators:

>>> rename(['fdsa'], ['foobar/'])
>>> ls()
f asdf other content
d foobar
f foobar/baz some content

>>> rename(['foobar/'], ['fdsa'])
>>> ls()
f asdf other content
d fdsa
f fdsa/baz some content

Moving between directories
--------------------------

Files may be moved between directories by renaming:

>>> rename(['asdf'], ['fdsa/bar'])
>>> ls()
d fdsa
f fdsa/bar other content
f fdsa/baz some content

Renaming a directory with some content works as expected:

>>> rename(['fdsa'], ['foo'])
>>> ls()
d foo
f foo/bar other content
f foo/baz some content

Moving a file to a directory that does not yet exist will create directories
as needed along the new path:

>>> rename(['foo/bar'], ['as/df/bar'])
>>> ls()
d as
d as/df
f as/df/bar other content
d foo
f foo/baz some content

On the other hand, moving the last file out of a directory results in that
directory and any empty parents of it being removed:

>>> rename(['as/df/bar'], ['bar'])
>>> ls()
f bar other content
d foo
f foo/baz some content

An existing empty directory can be moved and renamed without being deleted:

>>> new_sandbox("""\
... d foo
... d foo/bar
... """)
>>> rename(['foo/bar'], ['baz'])
>>> ls()
d baz

Renaming to existing paths
--------------------------

If a file is renamed to a path that is already used by a file, that other file
is replaced. The same goes for two directories if the target is empty:

>>> new_sandbox("""\
... f foo first file
... f bar second file
... """)
>>> rename(['foo'], ['bar'])
>>> ls()
f bar first file

>>> new_sandbox("""\
... d foo
... f foo/baz
... d bar
... """)
>>> rename(['foo'], ['bar'])
>>> ls()
d bar
f bar/baz

If the target directory is not empty, renaming is not possible lest the
directory's content be lost:

>>> new_sandbox("""\
... d foo
... d bar
... f bar/baz
... """)
>>> rename(['foo'], ['bar'])
Traceback (most recent call last):
OSError: [Errno 39] Directory not empty

Renaming a file to an existing directory or a directory to an existing file
does not work either:

>>> new_sandbox("""\
... f foo
... d bar
... """)
>>> rename(['foo'], ['bar'])
Traceback (most recent call last):
OSError: [Errno 21] Is a directory
>>> rename(['bar'], ['foo'])
Traceback (most recent call last):
OSError: [Errno 20] Not a directory

Renaming to paths renamed in turn
---------------------------------

In contrast to the above, it is possible to rename an item to an existing one
without the latter being removed if it is renamed by the same ``rename`` call:

>>> new_sandbox("""\
... f asdf first
... f bar second
... f baz third
... f foo fourth
... """)
>>> rename(['asdf', 'bar'], ['bar', 'baz'])
>>> ls()
f bar first
f baz second
f foo fourth

This also works in circles and between two items:

>>> rename(['bar', 'baz', 'foo'], ['foo', 'bar', 'baz'])
>>> ls()
f bar second
f baz fourth
f foo first

>>> rename(['bar', 'foo'], ['foo', 'bar'])
>>> ls()
f bar first
f baz fourth
f foo second

Handling of symbolic links
--------------------------

Symbolic links are never followed. Renaming a symbolic link gives a new name
to the link, not its target:

>>> new_sandbox("""\
... l bar -> baz
... f foo FOO
... l loo -> foo
... f xyz XYZ
... """)
>>> rename(['loo'], ['goo'])
>>> ls()
l bar -> baz
f foo FOO
l goo -> foo
f xyz XYZ

Renaming a broken link works just fine:

>>> rename(['bar'], ['barr'])
>>> ls()
l barr -> baz
f foo FOO
l goo -> foo
f xyz XYZ

Renaming a file to the name of an existing symbolic link replaces the link,
not its target:

>>> rename(['xyz'], ['goo'])
>>> ls()
l barr -> baz
f foo FOO
f goo XYZ

A directory cannot be renamed to the name of a symbolic link, regardless of
whether the link target is a directory:

>>> new_sandbox("""\
... d bar
... f foo
... l l_bar -> bar
... l l_baz -> baz
... l l_foo -> foo
... d xyz
... """)
>>> rename(['xyz'], ['l_bar'])
Traceback (most recent call last):
OSError: [Errno 20] Not a directory
>>> rename(['xyz'], ['l_baz'])
Traceback (most recent call last):
OSError: [Errno 20] Not a directory
>>> rename(['xyz'], ['l_foo'])
Traceback (most recent call last):
OSError: [Errno 20] Not a directory

On the other hand, renaming a file to the name of a symbolic link to an
existing directory replaces the link (while it would not be possible to
replace a directory):

>>> rename(['foo'], ['l_bar'])
>>> ls()
d bar
f l_bar
l l_baz -> baz
l l_foo -> foo
d xyz


The combined runner
===================

A simple ``run`` function ties together everything demonstrated above. Its
signature is basically that of ``transform``, with the additional ``dry_run``
option passed on to ``rename``:

>>> from tl.rename.core import run
>>> new_sandbox("""\
... f bar BAR
... f baz BAZ
... f foo FOO
... """)
>>> new_names = """\
... foo
... asdf/bsdf
... """
>>> run(['bar', 'baz'], names_file=StringIO(new_names), dry_run=True)
bar -> foo
baz -> asdf/bsdf
>>> ls()
f bar BAR
f baz BAZ
f foo FOO

>>> run(['bar', 'baz'], names_file=StringIO(new_names))
>>> ls()
d asdf
f asdf/bsdf BAZ
f foo BAR

All old and new file names must be valid [#assert-valid]_.
In particular, they must not contain null bytes; empty strings are rejected as
well, so as to avoid ambiguities:

>>> run(['\x00'], names_file=StringIO('bar'))
Traceback (most recent call last):
AssertionError: Invalid old file name: '\x00'

>>> run(['foo'], names_file=StringIO('\x00'))
Traceback (most recent call last):
AssertionError: Invalid new file name: '\x00'


.. rubric:: Footnotes

.. [#apply-slice] **Slicing file names**

   The ``apply_slice`` function takes a sequence of file names and any keyword
   parameters and returns a triple of sequences which contain the left, middle
   and right portions of the names as determined by the ``applied_slice``
   option. If the option is not given, left and right portions are empty, and
   the middle portions are the whole names:

   >>> from tl.rename.core import apply_slice
   >>> apply_slice(['foo', 'bar-baz'])
   (['', ''], ['foo', 'bar-baz'], ['', ''])

   The same result is obtained if the start and stop index of the slice are
   both omitted. Slices are given as tuples of start and stop index:

   >>> apply_slice(['foo', 'bar-baz'], applied_slice=(None, None))
   (['', ''], ['foo', 'bar-baz'], ['', ''])

   Other values produce the results expected from Python's simple slices:

   >>> apply_slice(['foo', 'bar-baz'], applied_slice=(None, 2))
   (['', ''], ['fo', 'ba'], ['o', 'r-baz'])
   >>> apply_slice(['foo', 'bar-baz'], applied_slice=(4, None))
   (['foo', 'bar-'], ['', 'baz'], ['', ''])
   >>> apply_slice(['foo', 'bar-baz'], applied_slice=(1, 5))
   (['f', 'b'], ['oo', 'ar-b'], ['', 'az'])
   >>> apply_slice(['foo', 'bar-baz'], applied_slice=(1, -1))
   (['f', 'b'], ['o', 'ar-ba'], ['o', 'z'])
   >>> apply_slice(['foo', 'bar-baz'], applied_slice=(-2, 100))
   (['f', 'bar-b'], ['oo', 'az'], ['', ''])

.. [#assert-unique] **Ensuring unique file names**

   The ``assert_unique`` function takes a sequence of file names and an error
   message and raises an ``AssertionError`` with that error message if and
   only if the file names are not unique:

   >>> from tl.rename.core import assert_unique
   >>> assert_unique(['foo', 'bar', 'baz'], 'not unique')
   >>> assert_unique(['foo', 'foo', 'baz'], 'not unique')
   Traceback (most recent call last):
   AssertionError: not unique

   File paths are normalised using ``os.path.abspath`` prior to comparison.
   Ambiguities are thus noticed even when mixing absolute and relative paths:

   >>> import os
   >>> assert_unique(['as/df', '%s/as/df' % os.getcwd()], 'not unique')
   Traceback (most recent call last):
   AssertionError: not unique

   As another consequence, paths that differ only by a trailing path separator
   are considered equivalent:

   >>> assert_unique(['as/df', 'as/df/'], 'not unique')
   Traceback (most recent call last):
   AssertionError: not unique

.. [#assert-valid] **Ensuring valid file names**

   A file path is considered invalid if it contains null bytes (in order to
   avoid a ``TypeError`` while renaming) or is an empty string (in order to
   avoid ambiguities). The ``assert_valid`` function takes an iterable of file
   names and an error message and raises an ``AssertionError`` with that error
   message as soon as it encounters an invalid name. The error message is
   supposed to contain exactly one ``%r`` formatting operator to include the
   representation of the invalid name:

   >>> from tl.rename.core import assert_valid
   >>> assert_valid(['fdsa', 'foo\x00bar', ''], 'invalid: %r')
   Traceback (most recent call last):
   AssertionError: invalid: 'foo\x00bar'
   >>> assert_valid(['fdsa/', '', 'foo\x00bar'], 'invalid: %r')
   Traceback (most recent call last):
   AssertionError: invalid: ''


.. Local Variables:
.. mode: rst
.. End: