Extract features from images using a pretrained ConvNet.
Based on Yangqing Jia and Jeff Donahue’s DeCAF. Please make sure you read and accept DeCAF’s license before you use this class.
If classify_direct=False, expects its input X to be a list of image filenames or arrays as produced by np.array(Image.open(filename)).
You’ll need to manually install DeCAF for ConvNetFeatures to work.
You will also need to download a tarball that contains pretrained parameter files from Yangqing Jia’s homepage.
Pass the locations of the two files contained in the tarball when you instantiate ConvNetFeatures, like so:
convnet = ConvNetFeatures(
    pretrained_params='/path/to/imagenet.decafnet.epoch90',
    pretrained_meta='/path/to/imagenet.decafnet.meta',
)
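It can be convenient to confirm that both files exist before instantiating the class. The following is a minimal sketch using only the standard library; the helper name check_decaf_files is hypothetical and not part of nolearn or DeCAF, and it assumes both files sit in the same directory under the names shown above.

```python
import os


def check_decaf_files(imagenet_dir):
    """Return (params_path, meta_path), raising IOError if either is missing.

    Hypothetical helper, not part of nolearn or DeCAF. Assumes both files
    were extracted into imagenet_dir under their original names.
    """
    params = os.path.join(imagenet_dir, 'imagenet.decafnet.epoch90')
    meta = os.path.join(imagenet_dir, 'imagenet.decafnet.meta')
    for path in (params, meta):
        if not os.path.isfile(path):
            raise IOError("Pretrained DeCAF file not found: %s" % path)
    return params, meta
```

The two returned paths can then be passed as pretrained_params and pretrained_meta.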
For more information on how DeCAF works, please refer to [1].
What follows is a simple example that uses ConvNetFeatures and scikit-learn to classify images from the Kaggle Dogs vs. Cats challenge. Before you start, you must download the images from the Kaggle competition page. The train/ folder will be referred to further down as TRAIN_DATA_DIR.
We’ll first define a few imports and the paths to the files that we just downloaded:
import os
from nolearn.decaf import ConvNetFeatures
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.utils import shuffle
DECAF_IMAGENET_DIR = '/path/to/imagenet-files/'
TRAIN_DATA_DIR = '/path/to/dogs-vs-cats-training-images/'
A get_dataset function will return a list of all image filenames and labels, shuffled for our convenience:
def get_dataset():
    cat_dir = TRAIN_DATA_DIR + 'cat/'
    cat_filenames = [cat_dir + fn for fn in os.listdir(cat_dir)]
    dog_dir = TRAIN_DATA_DIR + 'dog/'
    dog_filenames = [dog_dir + fn for fn in os.listdir(dog_dir)]
    labels = [0] * len(cat_filenames) + [1] * len(dog_filenames)
    filenames = cat_filenames + dog_filenames
    return shuffle(filenames, labels, random_state=0)
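The function above builds paths by string concatenation, so it relies on TRAIN_DATA_DIR ending in a slash. As a rough sketch of the same idea that does not depend on the trailing slash, the variant below uses os.path.join and the standard library's random module in place of sklearn.utils.shuffle (so the resulting order will differ from get_dataset). The name list_labeled_images and its seed parameter are hypothetical, and it assumes the same 'cat' and 'dog' subfolders as the example.

```python
import os
import random


def list_labeled_images(train_dir, seed=0):
    """Return (filenames, labels) with cats labeled 0 and dogs labeled 1.

    Hypothetical stand-in for get_dataset; assumes train_dir contains
    'cat' and 'dog' subfolders, as in the example above.
    """
    pairs = []
    for label, name in enumerate(('cat', 'dog')):
        class_dir = os.path.join(train_dir, name)
        # Sort because os.listdir returns entries in arbitrary order.
        for fn in sorted(os.listdir(class_dir)):
            pairs.append((os.path.join(class_dir, fn), label))
    # Shuffle filename/label pairs together so they stay aligned.
    random.Random(seed).shuffle(pairs)
    filenames = [fn for fn, _ in pairs]
    labels = [label for _, label in pairs]
    return filenames, labels
```

Calling list_labeled_images(TRAIN_DATA_DIR) yields two lists of equal length that can be sliced in the same way as the output of get_dataset.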
We can now define our sklearn.pipeline.Pipeline, which merely consists of ConvNetFeatures and a sklearn.linear_model.LogisticRegression classifier.
def main():
    convnet = ConvNetFeatures(
        pretrained_params=DECAF_IMAGENET_DIR + 'imagenet.decafnet.epoch90',
        pretrained_meta=DECAF_IMAGENET_DIR + 'imagenet.decafnet.meta',
    )
    clf = LogisticRegression()
    pl = Pipeline([
        ('convnet', convnet),
        ('clf', clf),
    ])

    X, y = get_dataset()
    # Use the first 100 images for training and the next 200 for testing.
    X_train, y_train = X[:100], y[:100]
    X_test, y_test = X[100:300], y[100:300]

    print("Fitting...")
    pl.fit(X_train, y_train)
    print("Predicting...")
    y_pred = pl.predict(X_test)
    print("Accuracy: %.3f" % accuracy_score(y_test, y_pred))

main()
Note that we use only 100 images to train our classifier (and 200 for testing). Even so, thanks to the magic of pretrained convolutional nets, we’re able to reach an accuracy of around 94%, which is roughly 11 percentage points higher than the classifier described in [2].
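With a training set this small, it may be worth checking that both classes are reasonably represented in each slice. The sketch below reproduces the slicing used in the example and counts the labels per part; the function name split_and_count and its parameters are hypothetical, and it assumes X and y have already been shuffled together.

```python
from collections import Counter


def split_and_count(X, y, n_train=100, n_test=200):
    """Slice a shuffled dataset and count the labels in each part.

    Hypothetical helper; mirrors the X[:100] / X[100:300] slicing
    used in the example above.
    """
    stop = n_train + n_test
    X_train, y_train = X[:n_train], y[:n_train]
    X_test, y_test = X[n_train:stop], y[n_train:stop]
    return (X_train, y_train), (X_test, y_test), Counter(y_train), Counter(y_test)
```

A heavily skewed count in either Counter would suggest reshuffling or enlarging the slices before reading much into the accuracy figure.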
[1] Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, and Trevor Darrell. DeCAF: A deep convolutional activation feature for generic visual recognition. arXiv preprint arXiv:1310.1531, 2013.
[2] P. Golle. Machine learning attacks against the Asirra CAPTCHA. In ACM CCS 2008, 2008.