Work with big data sheet

Formatting while transcoding a big data file

If you are transcoding a big data set, conventional formatting method would not help unless a on-demand free RAM is available. However, there is a way to minimize the memory footprint of pyexcel while the formatting is performed.

Let’s continue from previous example. Suppose we want to transcode “your_file.csv” to “your_file.xls” but increase each element by 1.

What we can do is to define a row renderer function as the following:

>>> def increment_by_one(row):
...     for element in row:
...         yield element + 1

Then pass it onto save_as function using row_renderer:

>>> pe.isave_as(file_name="your_file.csv",
...             row_renderer=increment_by_one,
...             dest_file_name="your_file.xlsx")

Note

If the data content is from a generator, isave_as has to be used.

We can verify if it was done correctly:

>>> pe.get_sheet(file_name="your_file.xlsx")
your_file.csv:
+---+----+----+
| 2 | 22 | 32 |
+---+----+----+
| 3 | 23 | 33 |
+---+----+----+
| 4 | 24 | 34 |
+---+----+----+
| 5 | 25 | 35 |
+---+----+----+
| 6 | 26 | 36 |
+---+----+----+
| 7 | 27 | 37 |
+---+----+----+