data
data is a small Python module that lets you handle input in a uniform way
and leave it up to the caller to supply a byte string, a unicode object, a
file-like object or a filename.
>>> open('helloworld.txt', 'w').write('hello, world from a file')
>>> from data import Data as I
>>> a = I(u'hello, world')
>>> b = I(file='helloworld.txt')
>>> c = I(open('helloworld.txt'))
>>> print unicode(a)
hello, world
>>> print unicode(b)
hello, world from a file
>>> print unicode(c)
hello, world from a file
This can be made even more convenient using the data decorator:
>>> from data.decorators import data
>>> @data('buf')
... def parse_buffer(buf, magic_mode=False):
...     return 'buf passed in as ' + repr(buf)
...
>>> parse_buffer('hello')
"buf passed in as Data(data='hello', encoding='utf8')"
>>> rv = parse_buffer(open('helloworld.txt'))
>>> assert 'file=' in rv
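To make the idea behind the decorator concrete, here is a minimal sketch of how an argument-wrapping decorator could be built with the standard library alone. It is not the library's implementation: Source and wrap_argument are hypothetical names invented for this illustration, and the sketch assumes Python 3 because it relies on inspect.signature.

```python
import functools
import inspect
import io


class Source(object):
    """Hypothetical wrapper used for illustration; NOT the library's Data class."""

    def __init__(self, value, encoding='utf8'):
        self.value = value
        self.encoding = encoding

    def text(self):
        value = self.value
        if hasattr(value, 'read'):      # file-like object: pull its contents
            value = value.read()
        if isinstance(value, bytes):    # byte string: decode to text
            value = value.decode(self.encoding)
        return value


def wrap_argument(name):
    """Hypothetical decorator: wrap the argument called `name` in a Source."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind the call so the argument is found whether it was passed
            # positionally or by keyword.
            bound = signature.bind(*args, **kwargs)
            value = bound.arguments.get(name)
            if value is not None and not isinstance(value, Source):
                bound.arguments[name] = Source(value)
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorator


@wrap_argument('buf')
def parse_buffer(buf, magic_mode=False):
    # The function body only ever sees a Source, whatever the caller passed.
    return buf.text().upper()


print(parse_buffer('hello'))
print(parse_buffer(io.BytesIO(b'from a file-like')))
```

The function body deals with a single type, while callers stay free to pass text, bytes or an open file.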
Fitting in
All instances support methods like read or __str__ that make it easy to
fit them into existing APIs:
>>> d = I('some data')
>>> d.read(4)
u'some'
>>> d.read(4)
u' dat'
>>> d.read(4)
u'a'
>>> e = I(u'more data')
>>> str(e)
'more data'
Note that read returns unicode. To get bytes instead, use readb:
>>> f = I(u'I am \xdcnicode.')
>>> f.readb()
'I am \xc3\x9cnicode.'
Every data object has an encoding attribute which is used for converting
from and to unicode.
>>> g = I(u'I am \xdcnicode.', encoding='latin1')
>>> g.readb()
'I am \xdcnicode.'
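The two byte strings shown above differ only because of the codec. The following sketch reproduces both results with plain str.encode, without using the library at all, and shows why the codec used for decoding has to match the one used for encoding. The variable names are illustrative only.

```python
text = u'I am \xdcnicode.'

utf8_bytes = text.encode('utf8')      # U+00DC becomes the two bytes C3 9C
latin1_bytes = text.encode('latin1')  # U+00DC becomes the single byte DC

assert utf8_bytes == b'I am \xc3\x9cnicode.'
assert latin1_bytes == b'I am \xdcnicode.'

# Decoding with the matching codec round-trips the text.
assert utf8_bytes.decode('utf8') == text
assert latin1_bytes.decode('latin1') == text

# Decoding latin1 bytes as utf8 fails, because DC starts a two-byte
# sequence that is never completed.
try:
    latin1_bytes.decode('utf8')
except UnicodeDecodeError:
    mismatch_detected = True
else:
    mismatch_detected = False
```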
Iteration and line reading are also supported:
>>> h = I('I am\nof many\nlines')
>>> h.readline()
u'I am\n'
>>> h.readlines()
[u'of many\n', u'lines']
>>> i = I('line one\nline two\n')
>>> list(iter(i))
[u'line one\n', u'line two\n']
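Code written against the file-like interface does not need to know what kind of object it receives. As a rough illustration, the helpers below rely only on read(size) and on iteration. They are exercised here with io.StringIO as a stand-in, on the assumption that any object with the same methods would behave alike. read_chunks and numbered_lines are hypothetical names, not part of the library.

```python
import io


def read_chunks(stream, size):
    """Collect chunks from any object with a file-like read(size) method."""
    chunks = []
    while True:
        chunk = stream.read(size)
        if not chunk:  # an empty result signals end of input
            break
        chunks.append(chunk)
    return chunks


def numbered_lines(stream):
    """Pair each line with its 1-based number; relies only on iteration."""
    return list(enumerate(stream, 1))


chunks = read_chunks(io.StringIO(u'some data'), 4)
lines = numbered_lines(io.StringIO(u'line one\nline two\n'))
```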
Extras
save_to
Some useful convenience methods are available:
>>> j = I('example')
>>> j.save_to('example.txt')
The save_to method will use the most efficient way possible to save the
data to a file (copyfileobj or write()). It can also be passed a
file-like object:
>>> k = I('example2')
>>> with open('example2.txt', 'wb') as out:
...     k.save_to(out)
...
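The choice between copyfileobj and write() can be sketched as follows. This is an assumption about the general strategy, not the library's actual code: save and _copy are hypothetical helpers that stream from file-like sources with shutil.copyfileobj and fall back to a single write() for in-memory bytes.

```python
import io
import os
import shutil
import tempfile


def _copy(source, out):
    if hasattr(source, 'read'):
        # Stream in chunks instead of loading everything into memory.
        shutil.copyfileobj(source, out)
    else:
        # Already in memory: a single write() is enough.
        out.write(source)


def save(source, destination):
    """Hypothetical helper: write bytes or a binary file-like object to a
    filename or to an already open binary file-like object."""
    if hasattr(destination, 'write'):
        _copy(source, destination)
    else:
        with open(destination, 'wb') as out:
            _copy(source, out)


# Save in-memory bytes to a named file.
target = os.path.join(tempfile.mkdtemp(), 'example.txt')
save(b'example', target)

# Save a file-like source into a file-like destination.
sink = io.BytesIO()
save(io.BytesIO(b'example2'), sink)
```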
temp_saved
If you need the output inside a secure temporary file, temp_saved is
available:
>>> l = I('goes into tmp')
>>> with l.temp_saved() as tmp:
...     print tmp.name.startswith('/tmp/tmp')
...     print l.read()
...
True
goes into tmp
temp_saved functions almost identically to tempfile.NamedTemporaryFile,
with one difference: there is no delete argument. The file is removed only
when the context manager exits.
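The behaviour described above (a named temporary file that exists for exactly the duration of the with block) can be approximated with the standard library. The sketch below is one possible way to get it, not the library's implementation. The function shares its name with the method only for readability, and the suffix parameter is an assumption made for this example.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temp_saved(payload, suffix=''):
    """Hypothetical sketch: write bytes to a named temp file, remove it on exit."""
    # delete=False hands cleanup to the finally clause below, so the file
    # goes away when the context manager exits, not when it is closed.
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        tmp.write(payload)
        tmp.flush()  # make the bytes visible to readers opening tmp.name
        yield tmp
    finally:
        tmp.close()
        os.remove(tmp.name)


with temp_saved(b'goes into tmp') as tmp:
    path = tmp.name
    existed_inside = os.path.exists(path)
    with open(path, 'rb') as fh:
        content = fh.read()

removed_after = not os.path.exists(path)
```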
Where it is useful
data can be used on both sides of an API, either while passing values in:
>>> import json
>>> from data import Data as I
>>> m = I('{"this": "json"}')
>>> json.load(m)
{u'this': u'json'}
or when receiving values (see the data decorator example above). If necessary, you can also support APIs that allow users to pass in filenames:
>>> class Parser(object):
...     @data('input')
...     def parse(self, input, parser_opt=False):
...         return input
...     def parse_file(self, input_file, *args, **kwargs):
...         return self.parse(I(file=input_file), *args, **kwargs)
...
>>> p = Parser()
>>> p.parse_file('/dev/urandom')
Data(file='/dev/urandom', encoding='utf8')
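The json.load call above works because json.load only needs an object with a read() method. The sketch below shows the same thing with a deliberately tiny stand-in class. TextSource is a hypothetical name used only for this illustration and implements nothing beyond read().

```python
import json


class TextSource(object):
    """Hypothetical minimal file-like object: just a read() method."""

    def __init__(self, text):
        self._text = text

    def read(self):
        # Hand back everything on the first call, '' afterwards.
        text, self._text = self._text, ''
        return text


# json.load never checks the type, it simply calls read().
parsed = json.load(TextSource('{"this": "json"}'))
```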
See the documentation at http://pythonhosted.org/data for an API reference.
Python 2 and 3
data works the same on Python 2 and 3 thanks to six, a few compatibility functions and a
test suite.
Python 3 is supported from 3.3 onwards, Python 2 from 2.6.