acidfile quickstart

The acidfile module provides the ACIDFile object. It can be used like a regular file object, but instead of writing a single copy of the data, it writes several copies to disk in an ACID manner.

The algorithm was explained by Elvis Pfützenreuter in his blog post "Achieving ACID transactions with common files".

The latest stable version can be found on PyPI.

acidfile is compatible with Python 2.7 and 3.3.

Installation

The latest version can be installed via pip:

In [1]:
%%bash
pip install --upgrade acidfile
Downloading/unpacking acidfile
  Downloading acidfile-1.1.0.tar.gz
  Running setup.py egg_info for package acidfile
    
Installing collected packages: acidfile
  Running setup.py install for acidfile
    
Successfully installed acidfile
Cleaning up...

Running the tests

Clone this repository and install the development requirements.

In [12]:
%%bash
git clone https://github.com/nilp0inter/acidfile.git
cd acidfile
pip install -r requirements/develop.txt
python setup.py develop
tox
Cloning into 'acidfile'...
Downloading/unpacking tox==1.6.1 (from -r requirements/develop.txt (line 1))
  Running setup.py egg_info for package tox
    
Downloading/unpacking virtualenv>=1.9.1 (from tox==1.6.1->-r requirements/develop.txt (line 1))
  Running setup.py egg_info for package virtualenv
    
    warning: no files found matching '*.egg' under directory 'virtualenv_support'
    warning: no previously-included files matching '*' found under directory 'docs/_templates'
    warning: no previously-included files matching '*' found under directory 'docs/_build'
Downloading/unpacking py>=1.4.15 (from tox==1.6.1->-r requirements/develop.txt (line 1))
  Running setup.py egg_info for package py
    
Installing collected packages: tox, virtualenv, py
  Running setup.py install for tox
    
    Installing tox-quickstart script to /home/nil/Envs/acidfile3/bin
    Installing tox script to /home/nil/Envs/acidfile3/bin
  Running setup.py install for virtualenv
    
    warning: no files found matching '*.egg' under directory 'virtualenv_support'
    warning: no previously-included files matching '*' found under directory 'docs/_templates'
    warning: no previously-included files matching '*' found under directory 'docs/_build'
    Installing virtualenv script to /home/nil/Envs/acidfile3/bin
    Installing virtualenv-3.3 script to /home/nil/Envs/acidfile3/bin
  Running setup.py install for py
    
Successfully installed tox virtualenv py
Cleaning up...
running develop
running egg_info
creating src/acidfile.egg-info
writing dependency_links to src/acidfile.egg-info/dependency_links.txt
writing src/acidfile.egg-info/PKG-INFO
writing top-level names to src/acidfile.egg-info/top_level.txt
writing manifest file 'src/acidfile.egg-info/SOURCES.txt'
reading manifest file 'src/acidfile.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'src/acidfile.egg-info/SOURCES.txt'
running build_ext
Creating /home/nil/Envs/acidfile3/lib/python3.3/site-packages/acidfile.egg-link (link to src)
Adding acidfile 1.1.0 to easy-install.pth file

Installed /home/nil/Projects/acidfile/docs/acidfile/src
Processing dependencies for acidfile==1.1.0
Finished processing dependencies for acidfile==1.1.0
GLOB sdist-make: /home/nil/Projects/acidfile/docs/acidfile/setup.py
py27 create: /home/nil/Projects/acidfile/docs/acidfile/.tox/py27
py27 installdeps: behave
py27 inst: /home/nil/Projects/acidfile/docs/acidfile/.tox/dist/acidfile-1.1.0.zip
py27 runtests: commands[0] | behave tests/features
Feature: Basic file usage # tests/features/basic.feature:3
  In order to use the package as developer I need to write and read
  data from the file.
  Scenario: Read and Write        # tests/features/basic.feature:7
    Given an example acidfile     # tests/features/steps/steps.py:13
    When I write some data        # tests/features/steps/steps.py:17
    And I reopen it               # tests/features/steps/steps.py:25
    Then I can read the same data # tests/features/steps/steps.py:30

Feature: Acidfile must be consistent # tests/features/consistency.feature:3
  The acidfile data must be discarded if the inner-file was modified
  or damaged.
  Scenario: One inner-file damaged       # tests/features/consistency.feature:7
    Given an example acidfile            # tests/features/steps/steps.py:13
    When I write some data               # tests/features/steps/steps.py:17
    And I close the file                 # tests/features/steps/steps.py:34
    And I corrupt one of the inner files # tests/features/steps/steps.py:64
    And I open it again                  # tests/features/steps/steps.py:44
    Then I can read the same data        # tests/features/steps/steps.py:30

  Scenario: All inner-files damaged   # tests/features/consistency.feature:15
    Given an example acidfile         # tests/features/steps/steps.py:13
    When I write some data            # tests/features/steps/steps.py:17
    And I close the file              # tests/features/steps/steps.py:34
    And I corrupt all the inner files # tests/features/steps/steps.py:71
    And I open it again               # tests/features/steps/steps.py:44
    Then I can't read any data        # tests/features/steps/steps.py:55

Feature: Use acidfile as a context manager. # tests/features/context.feature:3
  As a programmer i'd like to use the acidfile as a context
  manager as the open can.
  Scenario: Read in context                                    # tests/features/context.feature:7
    Given an example acidfile                                  # tests/features/steps/steps.py:13
    When I write some data                                     # tests/features/steps/steps.py:17
    And I close the file                                       # tests/features/steps/steps.py:34
    Then I can open in a with statement and read the same data # tests/features/steps/steps.py:97

  Scenario: Write in context                                   # tests/features/context.feature:13
    Given an acidfile written in a with statement              # tests/features/steps/steps.py:102
    Then I can open in a with statement and read the same data # tests/features/steps/steps.py:97

Feature: The number of copies of inner-files must be configurable # tests/features/copies.feature:3
  As a programmer i'd like to configure the number of inner-copies of the data
  that would be written.
  Scenario: One inner file is not possible                      # tests/features/copies.feature:7
    Given an example acidfile with no copies must raise on init # tests/features/steps/steps.py:107

  Scenario Outline: Inner-file copies       # tests/features/copies.feature:10
    Given an example acidfile with 1 copies # tests/features/steps/steps.py:116
    When I write some data                  # tests/features/steps/steps.py:17
    And I close the file                    # tests/features/steps/steps.py:34
    Then I can see 1 inner-files            # tests/features/steps/steps.py:121

  Scenario Outline: Inner-file copies       # tests/features/copies.feature:10
    Given an example acidfile with 2 copies # tests/features/steps/steps.py:116
    When I write some data                  # tests/features/steps/steps.py:17
    And I close the file                    # tests/features/steps/steps.py:34
    Then I can see 2 inner-files            # tests/features/steps/steps.py:121

  Scenario Outline: Inner-file copies       # tests/features/copies.feature:10
    Given an example acidfile with 3 copies # tests/features/steps/steps.py:116
    When I write some data                  # tests/features/steps/steps.py:17
    And I close the file                    # tests/features/steps/steps.py:34
    Then I can see 3 inner-files            # tests/features/steps/steps.py:121

  Scenario Outline: Inner-file copies       # tests/features/copies.feature:10
    Given an example acidfile with 4 copies # tests/features/steps/steps.py:116
    When I write some data                  # tests/features/steps/steps.py:17
    And I close the file                    # tests/features/steps/steps.py:34
    Then I can see 4 inner-files            # tests/features/steps/steps.py:121

Feature: Acidfile must be durable # tests/features/durability.feature:3
  The acidfile data must survive even if one of the inner files that
  support it is deleted.
  Scenario: Inner-file deleted          # tests/features/durability.feature:7
    Given an example acidfile           # tests/features/steps/steps.py:13
    When I write some data              # tests/features/steps/steps.py:17
    And I close the file                # tests/features/steps/steps.py:34
    And I remove one of the inner files # tests/features/steps/steps.py:38
    And I open it again                 # tests/features/steps/steps.py:44
    Then I can read the same data       # tests/features/steps/steps.py:30

  Scenario: All inner-files deleted  # tests/features/durability.feature:15
    Given an example acidfile        # tests/features/steps/steps.py:13
    When I write some data           # tests/features/steps/steps.py:17
    And I close the file             # tests/features/steps/steps.py:34
    And I remove all the inner files # tests/features/steps/steps.py:48
    And I open it again              # tests/features/steps/steps.py:44
    Then I can't read any data       # tests/features/steps/steps.py:55

Feature: Extended file usage # tests/features/extended_file.feature:3
  Acidfile must behave like any other file-like object so all the
  not implemented method must be passed to de inner memory file.
  Scenario: Seek the file             # tests/features/extended_file.feature:7
    Given an example acidfile         # tests/features/steps/steps.py:13
    When I write some data            # tests/features/steps/steps.py:17
    And seek to the start of the file # tests/features/steps/steps.py:93
    Then I can read the same data     # tests/features/steps/steps.py:30

Feature: Acidfile must be isolated # tests/features/isolation.feature:3
  The latest version of the data must be retrieved if two valid inner-files
  were found.
  Scenario Outline: First inner-file not updated               # tests/features/isolation.feature:7
    Given an example acidfile                                  # tests/features/steps/steps.py:13
    And an auxiliary acidfile                                  # tests/features/steps/steps.py:79
    When I write some auxiliary data                           # tests/features/steps/steps.py:21
    And I close the auxiliary file                             # tests/features/steps/steps.py:83
    And I write some data                                      # tests/features/steps/steps.py:17
    And I close the file                                       # tests/features/steps/steps.py:34
    And replace example inner-file number 0 with auxiliary one # tests/features/steps/steps.py:87
    And I open it again                                        # tests/features/steps/steps.py:44
    Then I can read the same data                              # tests/features/steps/steps.py:30

  Scenario Outline: First inner-file not updated               # tests/features/isolation.feature:7
    Given an example acidfile                                  # tests/features/steps/steps.py:13
    And an auxiliary acidfile                                  # tests/features/steps/steps.py:79
    When I write some auxiliary data                           # tests/features/steps/steps.py:21
    And I close the auxiliary file                             # tests/features/steps/steps.py:83
    And I write some data                                      # tests/features/steps/steps.py:17
    And I close the file                                       # tests/features/steps/steps.py:34
    And replace example inner-file number 1 with auxiliary one # tests/features/steps/steps.py:87
    And I open it again                                        # tests/features/steps/steps.py:44
    Then I can read the same data                              # tests/features/steps/steps.py:30

7 features passed, 0 failed, 0 skipped
15 scenarios passed, 0 failed, 0 skipped
73 steps passed, 0 failed, 0 skipped, 0 undefined
Took 0m0.218s
py33 create: /home/nil/Projects/acidfile/docs/acidfile/.tox/py33
py33 installdeps: behave
py33 inst: /home/nil/Projects/acidfile/docs/acidfile/.tox/dist/acidfile-1.1.0.zip
py33 runtests: commands[0] | behave tests/features
Feature: Basic file usage # tests/features/basic.feature:3
  In order to use the package as developer I need to write and read
  data from the file.
  Scenario: Read and Write        # tests/features/basic.feature:7
    Given an example acidfile     # tests/features/steps/steps.py:13
    When I write some data        # tests/features/steps/steps.py:17
    And I reopen it               # tests/features/steps/steps.py:25
    Then I can read the same data # tests/features/steps/steps.py:30

Feature: Acidfile must be consistent # tests/features/consistency.feature:3
  The acidfile data must be discarded if the inner-file was modified
  or damaged.
  Scenario: One inner-file damaged       # tests/features/consistency.feature:7
    Given an example acidfile            # tests/features/steps/steps.py:13
    When I write some data               # tests/features/steps/steps.py:17
    And I close the file                 # tests/features/steps/steps.py:34
    And I corrupt one of the inner files # tests/features/steps/steps.py:64
    And I open it again                  # tests/features/steps/steps.py:44
    Then I can read the same data        # tests/features/steps/steps.py:30

  Scenario: All inner-files damaged   # tests/features/consistency.feature:15
    Given an example acidfile         # tests/features/steps/steps.py:13
    When I write some data            # tests/features/steps/steps.py:17
    And I close the file              # tests/features/steps/steps.py:34
    And I corrupt all the inner files # tests/features/steps/steps.py:71
    And I open it again               # tests/features/steps/steps.py:44
    Then I can't read any data        # tests/features/steps/steps.py:55

Feature: Use acidfile as a context manager. # tests/features/context.feature:3
  As a programmer i'd like to use the acidfile as a context
  manager as the open can.
  Scenario: Read in context                                    # tests/features/context.feature:7
    Given an example acidfile                                  # tests/features/steps/steps.py:13
    When I write some data                                     # tests/features/steps/steps.py:17
    And I close the file                                       # tests/features/steps/steps.py:34
    Then I can open in a with statement and read the same data # tests/features/steps/steps.py:97

  Scenario: Write in context                                   # tests/features/context.feature:13
    Given an acidfile written in a with statement              # tests/features/steps/steps.py:102
    Then I can open in a with statement and read the same data # tests/features/steps/steps.py:97

Feature: The number of copies of inner-files must be configurable # tests/features/copies.feature:3
  As a programmer i'd like to configure the number of inner-copies of the data
  that would be written.
  Scenario: One inner file is not possible                      # tests/features/copies.feature:7
    Given an example acidfile with no copies must raise on init # tests/features/steps/steps.py:107

  Scenario Outline: Inner-file copies       # tests/features/copies.feature:10
    Given an example acidfile with 1 copies # tests/features/steps/steps.py:116
    When I write some data                  # tests/features/steps/steps.py:17
    And I close the file                    # tests/features/steps/steps.py:34
    Then I can see 1 inner-files            # tests/features/steps/steps.py:121

  Scenario Outline: Inner-file copies       # tests/features/copies.feature:10
    Given an example acidfile with 2 copies # tests/features/steps/steps.py:116
    When I write some data                  # tests/features/steps/steps.py:17
    And I close the file                    # tests/features/steps/steps.py:34
    Then I can see 2 inner-files            # tests/features/steps/steps.py:121

  Scenario Outline: Inner-file copies       # tests/features/copies.feature:10
    Given an example acidfile with 3 copies # tests/features/steps/steps.py:116
    When I write some data                  # tests/features/steps/steps.py:17
    And I close the file                    # tests/features/steps/steps.py:34
    Then I can see 3 inner-files            # tests/features/steps/steps.py:121

  Scenario Outline: Inner-file copies       # tests/features/copies.feature:10
    Given an example acidfile with 4 copies # tests/features/steps/steps.py:116
    When I write some data                  # tests/features/steps/steps.py:17
    And I close the file                    # tests/features/steps/steps.py:34
    Then I can see 4 inner-files            # tests/features/steps/steps.py:121

Feature: Acidfile must be durable # tests/features/durability.feature:3
  The acidfile data must survive even if one of the inner files that
  support it is deleted.
  Scenario: Inner-file deleted          # tests/features/durability.feature:7
    Given an example acidfile           # tests/features/steps/steps.py:13
    When I write some data              # tests/features/steps/steps.py:17
    And I close the file                # tests/features/steps/steps.py:34
    And I remove one of the inner files # tests/features/steps/steps.py:38
    And I open it again                 # tests/features/steps/steps.py:44
    Then I can read the same data       # tests/features/steps/steps.py:30

  Scenario: All inner-files deleted  # tests/features/durability.feature:15
    Given an example acidfile        # tests/features/steps/steps.py:13
    When I write some data           # tests/features/steps/steps.py:17
    And I close the file             # tests/features/steps/steps.py:34
    And I remove all the inner files # tests/features/steps/steps.py:48
    And I open it again              # tests/features/steps/steps.py:44
    Then I can't read any data       # tests/features/steps/steps.py:55

Feature: Extended file usage # tests/features/extended_file.feature:3
  Acidfile must behave like any other file-like object so all the
  not implemented method must be passed to de inner memory file.
  Scenario: Seek the file             # tests/features/extended_file.feature:7
    Given an example acidfile         # tests/features/steps/steps.py:13
    When I write some data            # tests/features/steps/steps.py:17
    And seek to the start of the file # tests/features/steps/steps.py:93
    Then I can read the same data     # tests/features/steps/steps.py:30

Feature: Acidfile must be isolated # tests/features/isolation.feature:3
  The latest version of the data must be retrieved if two valid inner-files
  were found.
  Scenario Outline: First inner-file not updated               # tests/features/isolation.feature:7
    Given an example acidfile                                  # tests/features/steps/steps.py:13
    And an auxiliary acidfile                                  # tests/features/steps/steps.py:79
    When I write some auxiliary data                           # tests/features/steps/steps.py:21
    And I close the auxiliary file                             # tests/features/steps/steps.py:83
    And I write some data                                      # tests/features/steps/steps.py:17
    And I close the file                                       # tests/features/steps/steps.py:34
    And replace example inner-file number 0 with auxiliary one # tests/features/steps/steps.py:87
    And I open it again                                        # tests/features/steps/steps.py:44
    Then I can read the same data                              # tests/features/steps/steps.py:30

  Scenario Outline: First inner-file not updated               # tests/features/isolation.feature:7
    Given an example acidfile                                  # tests/features/steps/steps.py:13
    And an auxiliary acidfile                                  # tests/features/steps/steps.py:79
    When I write some auxiliary data                           # tests/features/steps/steps.py:21
    And I close the auxiliary file                             # tests/features/steps/steps.py:83
    And I write some data                                      # tests/features/steps/steps.py:17
    And I close the file                                       # tests/features/steps/steps.py:34
    And replace example inner-file number 1 with auxiliary one # tests/features/steps/steps.py:87
    And I open it again                                        # tests/features/steps/steps.py:44
    Then I can read the same data                              # tests/features/steps/steps.py:30

7 features passed, 0 failed, 0 skipped
15 scenarios passed, 0 failed, 0 skipped
73 steps passed, 0 failed, 0 skipped, 0 undefined
Took 0m0.210s
_______________________________________________ summary ________________________________________________
  py27: commands succeeded
  py33: commands succeeded
  congratulations :)

Usage examples

Basic usage

Writing

In [2]:
from acidfile import ACIDFile
      
myfile = ACIDFile('myfile.txt', 'w')
myfile.write(b'Some important data.')
myfile.close()

When close is called, two copies are written to disk: myfile.txt.0 and myfile.txt.1. Each one carries a creation timestamp and an HMAC signature.

In [4]:
ls myfile.txt*
myfile.txt.0  myfile.txt.1
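
To make the idea of a timestamped, HMAC-signed copy more concrete, here is a minimal sketch using only the standard library. The record layout (a 32-byte HMAC-SHA256 digest, then an 8-byte timestamp, then the payload), the function name write_copies and the key are assumptions made for illustration; they are not acidfile's actual on-disk format.

```python
import hashlib
import hmac
import os
import struct
import tempfile
import time


def write_copies(path, data, copies=2, key=b"illustrative key"):
    """Write `copies` signed records named path.0, path.1, ... (hypothetical layout)."""
    names = []
    for index in range(copies):
        # Each copy gets its own timestamp, packed as a big-endian double.
        stamp = struct.pack(">d", time.time())
        # The signature covers both the timestamp and the payload.
        signature = hmac.new(key, stamp + data, hashlib.sha256).digest()
        name = "%s.%d" % (path, index)
        with open(name, "wb") as inner:
            inner.write(signature + stamp + data)
            inner.flush()
            os.fsync(inner.fileno())  # ask the OS to push the bytes to disk
        names.append(name)
    return names


workdir = tempfile.mkdtemp()
created = write_copies(os.path.join(workdir, "myfile.txt"), b"Some important data.")
print([os.path.basename(name) for name in created])
```

Because every copy is signed independently, a reader can later check each one on its own and ignore any copy whose signature does not match.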

Basic usage

Reading

In [6]:
myfile = ACIDFile('myfile.txt', 'r')
print(myfile.read())
myfile.close()
b'Some important data.'

If one of the files is damaged (for example by a power loss without a proper shutdown, a disk failure or manual tampering), the internal HMAC detects it and the data from the other file is used instead.
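
The recovery rule can be sketched as "keep the most recent copy whose signature verifies". The example below is only an illustration under assumed conventions: the record layout (HMAC-SHA256 digest, 8-byte timestamp, payload), the key and the helper names sign and newest_valid are hypothetical and do not describe acidfile's internals.

```python
import hashlib
import hmac
import struct

KEY = b"illustrative key"  # assumption: not acidfile's default key


def sign(stamp, data, key=KEY):
    """Build a record: HMAC-SHA256 digest + 8-byte timestamp + payload."""
    packed = struct.pack(">d", stamp)
    return hmac.new(key, packed + data, hashlib.sha256).digest() + packed + data


def newest_valid(records, key=KEY):
    """Return the payload of the most recent record whose signature verifies."""
    best = None
    for raw in records:
        if len(raw) < 40:
            continue  # too short to hold a digest and a timestamp
        digest, body = raw[:32], raw[32:]
        expected = hmac.new(key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(digest, expected):
            continue  # damaged or tampered copy: skip it
        stamp = struct.unpack(">d", body[:8])[0]
        if best is None or stamp > best[0]:
            best = (stamp, body[8:])
    if best is None:
        raise IOError("Can't read file")
    return best[1]


good = sign(2.0, b"Some important data.")
stale = sign(1.0, b"Older data.")
damaged = good[:-1] + b"!"  # alter the last byte of a valid record

print(newest_valid([damaged, stale, good]))
```

In this sketch a damaged copy is simply skipped, and an error is raised only when no copy verifies at all.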

To read an acidfile, never pass the path of one of the real inner files. Use the same file name you passed when creating it.

  • ✗ ACIDFile('/tmp/myfile.txt.0', 'r')
  • ✗ ACIDFile('/tmp/myfile.txt.1', 'r')
  • ✓ ACIDFile('/tmp/myfile.txt', 'r')

Context manager

ACIDFile can (and should) be used as a regular context manager:

In [7]:
with ACIDFile('myfile.txt', 'w') as myfile:
    myfile.write(b'Some important data.')
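
As a rough illustration of why the with form is convenient, the sketch below uses a hypothetical BufferedCopies class (not part of acidfile). It buffers writes in memory, forwards any method it does not define to the in-memory buffer, and stores its copies when close runs at the end of the block. How ACIDFile itself behaves when the block raises is not covered here; the sketch only shows the general pattern.

```python
import io


class BufferedCopies(object):
    """Hypothetical stand-in: buffers writes in memory, stores copies on close."""

    def __init__(self, copies=2):
        self._file = io.BytesIO()
        self.copies = copies
        self.stored = []   # stands in for the inner files on disk
        self.closed = False

    def __getattr__(self, name):
        # Anything not defined here (write, read, seek, ...) goes to the buffer.
        return getattr(self._file, name)

    def close(self):
        if not self.closed:
            self.stored = [self._file.getvalue()] * self.copies
            self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions raised inside the block


with BufferedCopies() as myfile:
    myfile.write(b"Some important data.")

print(myfile.closed, len(myfile.stored))
```

With this pattern there is no explicit close call to forget: leaving the block is what triggers the write.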

Number of copies

The number of inner copies of the data can be configured through the copies parameter.

In [8]:
with ACIDFile('myfile.txt', 'w', copies=5) as myfile:
    myfile.write(b'Some super important data.')
In [9]:
ls myfile.txt*
myfile.txt.0  myfile.txt.1  myfile.txt.2  myfile.txt.3  myfile.txt.4  myfile.txt.5

Checksum Key

The key used to compute and check the internal HMAC signature can be set through the key parameter.

Changing the default key is recommended: it makes it harder for an attacker to put a fake file in place of the legitimate one.

In [10]:
with ACIDFile('myotherfile.txt', 'w', key=b'a better key for my file') as myfile:
    myfile.write(b'Other stuff')

Without the valid key, the file can't be verified.

In [11]:
with ACIDFile('myotherfile.txt', 'r', key=b'mismatching key') as myfile:
    print(myfile.read())
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-11-571c3fa9e6d2> in <module>()
      1 with ACIDFile('myotherfile.txt', 'r', key=b'mismatching key') as myfile:
----> 2     print(myfile.read())

/home/nil/Envs/acidfile3/lib/python3.3/site-packages/acidfile/__init__.py in read(self, size)
     84                             continue
     85             if not self.loaded:
---> 86                 raise IOError("Can't read file")
     87         return self._file.read(size)
     88 

OSError: Can't read file
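
The failure above follows from how HMAC works: a signature computed with one key does not verify under another. A minimal standard-library sketch, assuming HMAC-SHA256 (the digest algorithm acidfile uses is not stated here) and a hypothetical helper named verifies:

```python
import hashlib
import hmac

data = b"Other stuff"
# Signature produced at write time with the "real" key.
signature = hmac.new(b"a better key for my file", data, hashlib.sha256).digest()


def verifies(key):
    """Return True if `signature` matches `data` under `key`."""
    candidate = hmac.new(key, data, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(signature, candidate)


print(verifies(b"a better key for my file"))
print(verifies(b"mismatching key"))
```

The first check succeeds and the second fails, which mirrors the behaviour shown in the traceback: with the wrong key no copy can be verified, so nothing can be read.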

Thank you!

Thanks for watching!