dropcols.py is a Python module and program that removes (or, conversely, extracts) selected columns from a delimited text file, such as a CSV file. It is analogous to the *nix "cut" program, except that it works on CSV files and allows columns to be selected by name (using regular expressions) as well as by number. The columns to keep, the columns to remove, or both can be specified.
Syntax and Options
Usage Notes
- The first line of the file should contain the names of the columns.
- Output is written to stdout.
- Column numbers can be used instead of (or as well as) regular expressions on the column names. Ranges and comma-separated lists of column numbers can be used. The first column is number 1.
- If only a 'drop' list is specified, all columns will be kept except those on the 'drop' list. If only a 'keep' list is specified, all columns will be dropped except those on the 'keep' list. If both lists are specified, the 'drop' list is applied first and the 'keep' specifications are then applied only to the dropped columns, so a 'keep' specification can restore a column that the 'drop' list would otherwise remove.
- Column order in the output file will be the same as in the input file, regardless of the order of items in the 'drop' and 'keep' lists.
- The -s option lets you verify that the correct columns will be selected without processing the entire input file.
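The 'drop' and 'keep' rules described in the notes above can be sketched in Python as follows. This is an illustration only, not the code of dropcols.py: the function names select_columns and filter_rows are hypothetical, only name patterns (not column numbers) are handled, and the use of re.search to match patterns against column names is an assumption.

```python
import csv
import io
import re


def select_columns(headers, drop=None, keep=None):
    """Return 0-based indexes of the columns to write, in input order.

    `drop` and `keep` are lists of regular-expression strings. Matching
    with re.search is an assumption; the real program may match differently.
    """
    def matches(name, patterns):
        return any(re.search(p, name) for p in patterns)

    selected = []
    for i, name in enumerate(headers):
        if drop and keep:
            # 'keep' applies only to columns that 'drop' would remove.
            wanted = not matches(name, drop) or matches(name, keep)
        elif drop:
            wanted = not matches(name, drop)
        elif keep:
            wanted = matches(name, keep)
        else:
            wanted = True
        if wanted:
            selected.append(i)
    return selected


def filter_rows(lines, drop=None, keep=None):
    """Yield each row reduced to the selected columns, header row first."""
    reader = csv.reader(lines)
    headers = next(reader)  # the first line holds the column names
    idx = select_columns(headers, drop, keep)
    yield [headers[i] for i in idx]
    for row in reader:
        yield [row[i] for i in idx]


# Made-up sample data: drop every "tmp_" column, but keep the one ending in "_b".
sample = io.StringIO("id,name,tmp_a,tmp_b,Status\n1,x,9,8,ok\n")
result = list(filter_rows(sample, drop=["^tmp_"], keep=["_b$"]))
```

Because the indexes are collected while walking the header row from left to right, the output keeps the input column order no matter how the patterns are ordered. Only the header row is needed to decide which columns are selected, so a selection can be checked without reading the rest of the file.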
Examples
To keep columns 2, 4, 5, and 6, plus a column headed "Status", any of the following expressions can be used with the 'keep' option.
The range specification can even be reversed.
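As a rough sketch of how a list of column numbers with ranges might be expanded, the hypothetical helper below turns a specification into 1-based column numbers. The function name and the use of a hyphen as the range separator are assumptions made for illustration; they are not taken from dropcols.py.

```python
def parse_column_numbers(spec):
    """Expand a spec such as "2,4-6" into a sorted list of 1-based numbers.

    The hyphen range separator is an assumption for this sketch.
    """
    numbers = set()
    for item in spec.split(","):
        item = item.strip()
        if "-" in item:
            start, end = (int(part) for part in item.split("-", 1))
            # Accept reversed ranges such as "6-4" by ordering the bounds.
            low, high = min(start, end), max(start, end)
            numbers.update(range(low, high + 1))
        else:
            numbers.add(int(item))
    if any(n < 1 for n in numbers):
        raise ValueError("column numbers start at 1")
    return sorted(numbers)


forward = parse_column_numbers("2,4-6")
reverse = parse_column_numbers("2,6-4")
```

Both calls yield columns 2, 4, 5, and 6. Sorting the result mirrors the rule that output order follows the input file rather than the order of the specification.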
Copyright and License
Copyright (c) 2007-2011, R.Dreas Nielsen
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. The GNU General Public License is available at http://www.gnu.org/licenses/.