The acidfile module provides the ACIDFile object. It can be used like a regular file object, but instead of writing a single copy of the data, it writes several copies to disk in an ACID manner.
This algorithm was explained by Elvis Pfützenreuter in his blog post "Achieving ACID transactions with common files".
The latest stable version can be found on PyPI.
acidfile is compatible with Python 2.7 and 3.3.
The latest version can be installed via pip:
%%bash
pip install --upgrade acidfile
Downloading/unpacking acidfile
  Downloading acidfile-1.1.0.tar.gz
  Running setup.py egg_info for package acidfile
Installing collected packages: acidfile
  Running setup.py install for acidfile
Successfully installed acidfile
Cleaning up...
To work on acidfile, clone this repository, install the development requirements, and run the test suite with tox:
%%bash
git clone https://github.com/nilp0inter/acidfile.git
cd acidfile
pip install -r requirements/develop.txt
python setup.py develop
tox
Cloning into 'acidfile'...
Downloading/unpacking tox==1.6.1 (from -r requirements/develop.txt (line 1))
  Running setup.py egg_info for package tox
Downloading/unpacking virtualenv>=1.9.1 (from tox==1.6.1->-r requirements/develop.txt (line 1))
  Running setup.py egg_info for package virtualenv
    warning: no files found matching '*.egg' under directory 'virtualenv_support'
    warning: no previously-included files matching '*' found under directory 'docs/_templates'
    warning: no previously-included files matching '*' found under directory 'docs/_build'
Downloading/unpacking py>=1.4.15 (from tox==1.6.1->-r requirements/develop.txt (line 1))
  Running setup.py egg_info for package py
Installing collected packages: tox, virtualenv, py
  Running setup.py install for tox
    Installing tox-quickstart script to /home/nil/Envs/acidfile3/bin
    Installing tox script to /home/nil/Envs/acidfile3/bin
  Running setup.py install for virtualenv
    warning: no files found matching '*.egg' under directory 'virtualenv_support'
    warning: no previously-included files matching '*' found under directory 'docs/_templates'
    warning: no previously-included files matching '*' found under directory 'docs/_build'
    Installing virtualenv script to /home/nil/Envs/acidfile3/bin
    Installing virtualenv-3.3 script to /home/nil/Envs/acidfile3/bin
  Running setup.py install for py
Successfully installed tox virtualenv py
Cleaning up...
running develop
running egg_info
creating src/acidfile.egg-info
writing dependency_links to src/acidfile.egg-info/dependency_links.txt
writing src/acidfile.egg-info/PKG-INFO
writing top-level names to src/acidfile.egg-info/top_level.txt
writing manifest file 'src/acidfile.egg-info/SOURCES.txt'
reading manifest file 'src/acidfile.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'src/acidfile.egg-info/SOURCES.txt'
running build_ext
Creating /home/nil/Envs/acidfile3/lib/python3.3/site-packages/acidfile.egg-link (link to src)
Adding acidfile 1.1.0 to easy-install.pth file
Installed /home/nil/Projects/acidfile/docs/acidfile/src
Processing dependencies for acidfile==1.1.0
Finished processing dependencies for acidfile==1.1.0
GLOB sdist-make: /home/nil/Projects/acidfile/docs/acidfile/setup.py
py27 create: /home/nil/Projects/acidfile/docs/acidfile/.tox/py27
py27 installdeps: behave
py27 inst: /home/nil/Projects/acidfile/docs/acidfile/.tox/dist/acidfile-1.1.0.zip
py27 runtests: commands[0] | behave tests/features
Feature: Basic file usage  # tests/features/basic.feature:3
  In order to use the package as developer I need to write and read data from the file.
  Scenario: Read and Write  # tests/features/basic.feature:7
    Given an example acidfile  # tests/features/steps/steps.py:13
    When I write some data  # tests/features/steps/steps.py:17
    And I reopen it  # tests/features/steps/steps.py:25
    Then I can read the same data  # tests/features/steps/steps.py:30

Feature: Acidfile must be consistent  # tests/features/consistency.feature:3
  The acidfile data must be discarded if the inner-file was modified or damaged.
  Scenario: One inner-file damaged  # tests/features/consistency.feature:7
    Given an example acidfile  # tests/features/steps/steps.py:13
    When I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    And I corrupt one of the inner files  # tests/features/steps/steps.py:64
    And I open it again  # tests/features/steps/steps.py:44
    Then I can read the same data  # tests/features/steps/steps.py:30
  Scenario: All inner-files damaged  # tests/features/consistency.feature:15
    Given an example acidfile  # tests/features/steps/steps.py:13
    When I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    And I corrupt all the inner files  # tests/features/steps/steps.py:71
    And I open it again  # tests/features/steps/steps.py:44
    Then I can't read any data  # tests/features/steps/steps.py:55

Feature: Use acidfile as a context manager.  # tests/features/context.feature:3
  As a programmer i'd like to use the acidfile as a context manager as the open can.
  Scenario: Read in context  # tests/features/context.feature:7
    Given an example acidfile  # tests/features/steps/steps.py:13
    When I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    Then I can open in a with statement and read the same data  # tests/features/steps/steps.py:97
  Scenario: Write in context  # tests/features/context.feature:13
    Given an acidfile written in a with statement  # tests/features/steps/steps.py:102
    Then I can open in a with statement and read the same data  # tests/features/steps/steps.py:97

Feature: The number of copies of inner-files must be configurable  # tests/features/copies.feature:3
  As a programmer i'd like to configure the number of inner-copies of the data that would be written.
  Scenario: One inner file is not possible  # tests/features/copies.feature:7
    Given an example acidfile with no copies must raise on init  # tests/features/steps/steps.py:107
  Scenario Outline: Inner-file copies  # tests/features/copies.feature:10
    Given an example acidfile with 1 copies  # tests/features/steps/steps.py:116
    When I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    Then I can see 1 inner-files  # tests/features/steps/steps.py:121
  Scenario Outline: Inner-file copies  # tests/features/copies.feature:10
    Given an example acidfile with 2 copies  # tests/features/steps/steps.py:116
    When I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    Then I can see 2 inner-files  # tests/features/steps/steps.py:121
  Scenario Outline: Inner-file copies  # tests/features/copies.feature:10
    Given an example acidfile with 3 copies  # tests/features/steps/steps.py:116
    When I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    Then I can see 3 inner-files  # tests/features/steps/steps.py:121
  Scenario Outline: Inner-file copies  # tests/features/copies.feature:10
    Given an example acidfile with 4 copies  # tests/features/steps/steps.py:116
    When I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    Then I can see 4 inner-files  # tests/features/steps/steps.py:121

Feature: Acidfile must be durable  # tests/features/durability.feature:3
  The acidfile data must survive even if one of the inner files that support it is deleted.
  Scenario: Inner-file deleted  # tests/features/durability.feature:7
    Given an example acidfile  # tests/features/steps/steps.py:13
    When I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    And I remove one of the inner files  # tests/features/steps/steps.py:38
    And I open it again  # tests/features/steps/steps.py:44
    Then I can read the same data  # tests/features/steps/steps.py:30
  Scenario: All inner-files deleted  # tests/features/durability.feature:15
    Given an example acidfile  # tests/features/steps/steps.py:13
    When I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    And I remove all the inner files  # tests/features/steps/steps.py:48
    And I open it again  # tests/features/steps/steps.py:44
    Then I can't read any data  # tests/features/steps/steps.py:55

Feature: Extended file usage  # tests/features/extended_file.feature:3
  Acidfile must behave like any other file-like object so all the not implemented method must be passed to de inner memory file.
  Scenario: Seek the file  # tests/features/extended_file.feature:7
    Given an example acidfile  # tests/features/steps/steps.py:13
    When I write some data  # tests/features/steps/steps.py:17
    And seek to the start of the file  # tests/features/steps/steps.py:93
    Then I can read the same data  # tests/features/steps/steps.py:30

Feature: Acidfile must be isolated  # tests/features/isolation.feature:3
  The latest version of the data must be retrieved if two valid inner-files were found.
  Scenario Outline: First inner-file not updated  # tests/features/isolation.feature:7
    Given an example acidfile  # tests/features/steps/steps.py:13
    And an auxiliary acidfile  # tests/features/steps/steps.py:79
    When I write some auxiliary data  # tests/features/steps/steps.py:21
    And I close the auxiliary file  # tests/features/steps/steps.py:83
    And I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    And replace example inner-file number 0 with auxiliary one  # tests/features/steps/steps.py:87
    And I open it again  # tests/features/steps/steps.py:44
    Then I can read the same data  # tests/features/steps/steps.py:30
  Scenario Outline: First inner-file not updated  # tests/features/isolation.feature:7
    Given an example acidfile  # tests/features/steps/steps.py:13
    And an auxiliary acidfile  # tests/features/steps/steps.py:79
    When I write some auxiliary data  # tests/features/steps/steps.py:21
    And I close the auxiliary file  # tests/features/steps/steps.py:83
    And I write some data  # tests/features/steps/steps.py:17
    And I close the file  # tests/features/steps/steps.py:34
    And replace example inner-file number 1 with auxiliary one  # tests/features/steps/steps.py:87
    And I open it again  # tests/features/steps/steps.py:44
    Then I can read the same data  # tests/features/steps/steps.py:30

7 features passed, 0 failed, 0 skipped
15 scenarios passed, 0 failed, 0 skipped
73 steps passed, 0 failed, 0 skipped, 0 undefined
Took 0m0.218s
py33 create: /home/nil/Projects/acidfile/docs/acidfile/.tox/py33
py33 installdeps: behave
py33 inst: /home/nil/Projects/acidfile/docs/acidfile/.tox/dist/acidfile-1.1.0.zip
py33 runtests: commands[0] | behave tests/features
[... behave output identical to the py27 run above ...]
7 features passed, 0 failed, 0 skipped
15 scenarios passed, 0 failed, 0 skipped
73 steps passed, 0 failed, 0 skipped, 0 undefined
Took 0m0.210s
_______________________________________________ summary ________________________________________________
  py27: commands succeeded
  py33: commands succeeded
  congratulations :)
from acidfile import ACIDFile
myfile = ACIDFile('myfile.txt', 'w')
myfile.write(b'Some important data.')
myfile.close()
When close is called, two copies are written to disk: myfile.txt.0 and myfile.txt.1. Each one carries a creation timestamp and an HMAC signature.
ls myfile.txt*
myfile.txt.0 myfile.txt.1
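As a rough illustration of what a timestamp plus an HMAC signature makes possible, the following sketch builds and verifies one signed record using only the standard library. It is not acidfile's actual on-disk format: the layout (digest, then an 8-byte timestamp, then the payload), the choice of SHA-256, and the names pack_record and unpack_record are all assumptions made for this example.

```python
import hashlib
import hmac
import struct
import time

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def pack_record(payload, key, timestamp=None):
    """Return digest + timestamp + payload as bytes (hypothetical layout)."""
    if timestamp is None:
        timestamp = time.time()
    body = struct.pack(">d", timestamp) + payload
    signature = hmac.new(key, body, hashlib.sha256).digest()
    return signature + body


def unpack_record(record, key):
    """Return (timestamp, payload); raise ValueError if the check fails."""
    signature, body = record[:DIGEST_SIZE], record[DIGEST_SIZE:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if len(body) < 8 or not hmac.compare_digest(signature, expected):
        raise ValueError("record is truncated or its signature does not match")
    (timestamp,) = struct.unpack(">d", body[:8])
    return timestamp, body[8:]


record = pack_record(b"Some important data.", key=b"example key", timestamp=1.0)
print(unpack_record(record, key=b"example key"))
```

Because the signature covers the timestamp as well as the payload, changing either one makes verification fail in this sketch.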
myfile = ACIDFile('myfile.txt', 'r')
print(myfile.read())
myfile.close()
b'Some important data.'
If any of the files is damaged, for example by a power-off without a proper shutdown, a disk failure, or manipulation, the internal HMAC check detects it and the data from the other file is used instead.
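The recovery idea described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not acidfile's implementation: the record layout, the SHA-256 digest, the KEY constant, and the helpers write_copies and read_newest_valid are hypothetical. The sketch skips copies that are missing or fail the HMAC check and returns the payload of the newest copy that verifies.

```python
import hashlib
import hmac
import os
import struct
import tempfile

KEY = b"example key"  # hypothetical key, for illustration only


def sign(body):
    """Return the HMAC-SHA256 digest of body under KEY."""
    return hmac.new(KEY, body, hashlib.sha256).digest()


def write_copies(base, payload, timestamp, copies=2):
    """Write `copies` signed files named base.0, base.1, ..."""
    body = struct.pack(">d", timestamp) + payload
    for index in range(copies):
        with open("%s.%d" % (base, index), "wb") as handle:
            handle.write(sign(body) + body)


def read_newest_valid(base, copies=2):
    """Return the payload of the newest copy whose signature verifies."""
    best = None
    for index in range(copies):
        try:
            with open("%s.%d" % (base, index), "rb") as handle:
                record = handle.read()
        except IOError:
            continue  # missing or unreadable copy: try the next one
        signature, body = record[:32], record[32:]
        if len(body) < 8 or not hmac.compare_digest(signature, sign(body)):
            continue  # damaged or tampered copy: skip it
        (timestamp,) = struct.unpack(">d", body[:8])
        if best is None or timestamp > best[0]:
            best = (timestamp, body[8:])
    if best is None:
        raise IOError("Can't read file")
    return best[1]


base = os.path.join(tempfile.mkdtemp(), "myfile.txt")
write_copies(base, b"Some important data.", timestamp=1.0)
with open(base + ".0", "r+b") as handle:  # simulate damage to one copy
    handle.write(b"garbage")
print(read_newest_valid(base))
```

Here the first copy is deliberately overwritten, so the data comes from the second one. If every copy were damaged or missing, the sketch would raise IOError.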
When you read an acidfile, never pass the full path of one of the real inner files (such as myfile.txt.0); instead, use the same file name that you used when creating it.
ACIDFile can (and should) be used as a regular context manager:
with ACIDFile('myfile.txt', 'w') as myfile:
    myfile.write(b'Some important data.')
The number of inner copies of the data can be configured through the copies parameter.
with ACIDFile('myfile.txt', 'w', copies=5) as myfile:
    myfile.write(b'Some super important data.')
ls myfile.txt*
myfile.txt.0 myfile.txt.1 myfile.txt.2 myfile.txt.3 myfile.txt.4 myfile.txt.5
The key used to compute and check the internal HMAC signature can be set with the key parameter.
It's recommended to change that key in order to protect against fraud, making it more difficult for a tamperer to put a fake file in place of the legitimate one.
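To see why the key matters, here is a small sketch using the standard hmac module. It assumes SHA-256 as the digest, which the text above does not specify, and the verifies helper is hypothetical. Only the key that produced a signature can reproduce it, so someone without the key cannot forge a matching signature for replaced data.

```python
import hashlib
import hmac

data = b"Other stuff"
# Signature computed with the legitimate key (SHA-256 is an assumption).
signature = hmac.new(b"a better key for my file", data, hashlib.sha256).digest()


def verifies(key):
    """Return True if `key` reproduces the stored signature for `data`."""
    candidate = hmac.new(key, data, hashlib.sha256).digest()
    return hmac.compare_digest(candidate, signature)


print(verifies(b"a better key for my file"))  # True
print(verifies(b"mismatching key"))           # False
```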
with ACIDFile('myotherfile.txt', 'w', key=b'a better key for my file') as myfile:
    myfile.write(b'Other stuff')
Without the valid key, the file can't be verified.
with ACIDFile('myotherfile.txt', 'r', key=b'mismatching key') as myfile:
    print(myfile.read())
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-11-571c3fa9e6d2> in <module>()
      1 with ACIDFile('myotherfile.txt', 'r', key=b'mismatching key') as myfile:
----> 2     print(myfile.read())

/home/nil/Envs/acidfile3/lib/python3.3/site-packages/acidfile/__init__.py in read(self, size)
     84                     continue
     85             if not self.loaded:
---> 86                 raise IOError("Can't read file")
     87         return self._file.read(size)
     88 

OSError: Can't read file
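The traceback shows the library raising IOError, which Python 3.3 reports as OSError because the two names are aliases there. A caller that prefers a fallback value over a crash could wrap the read as in this sketch. The read_or_default helper and the failing_read stand-in are hypothetical and are not part of acidfile.

```python
def read_or_default(read_func, default=None):
    """Call read_func() and return its result, or `default` on IOError."""
    try:
        return read_func()
    except IOError:  # the same class as OSError on Python 3.3 and later
        return default


def failing_read():
    # Stand-in for a read where no inner file could be verified.
    raise IOError("Can't read file")


print(read_or_default(failing_read, default=b""))
```

With acidfile, read_func would be a small function that opens the ACIDFile in a with statement and returns myfile.read().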
Thanks for watching!