Note: check out the TFeat descriptor from BMVC 2016; its results are much better than those of the PN-Net descriptor introduced below.
Code for the arXiv paper PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors.
The network extracts feature descriptors from grayscale local patches
of size 32x32.
```
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> output]
  (1): cudnn.SpatialConvolution(1 -> 32, 7x7)
  (2): cudnn.Tanh
  (3): cudnn.SpatialMaxPooling(2,2,2,2)
  (4): cudnn.SpatialConvolution(32 -> 64, 6x6)
  (5): cudnn.Tanh
  (6): nn.View
  (7): nn.Linear(4096 -> 128)
  (8): cudnn.Tanh
}
```

For optimization details refer to the arXiv publication. Training code is now also available.
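To compute descriptors with a trained network, a minimal sketch along the following lines should work. Note that the model file name `pnnet.t7` is a hypothetical placeholder, not a file shipped with this repo; substitute your own trained or downloaded model.

```lua
-- Minimal usage sketch (not from the repo): load a trained network
-- and extract 128-D descriptors for a batch of patches.
require 'torch'
require 'nn'
require 'cudnn'

local net = torch.load('pnnet.t7')  -- hypothetical model file
net:evaluate()                      -- disable training-only behaviour

-- a batch of 16 grayscale 32x32 patches (random here, for illustration)
local patches = torch.rand(16, 1, 32, 32):cuda()
local desc = net:forward(patches)   -- 16x128 descriptor matrix

-- L2 distance between the descriptors of the first two patches
print(torch.dist(desc[1], desc[2]))
```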
Get the Phototour datasets in .t7 format from UBC-Phototour-Patches-Torch and extract liberty, yosemite, and notredame.
Run `th eval.lua`.
The script will print a series of evaluation results for the patch pairs from the notredame 100k dataset, e.g.
```
1 4.3915
0 10.6367
1 5.6122
0 10.5520
0 10.4561
0 10.1167
0 9.8624
1 3.2972
1 2.1507
1 2.9709
1 3.4437
1 7.6362
```

The first column contains the patch-pair label (0 negative, 1 positive); the second column contains the L2 distance between the two patches of the pair, based on the features extracted from the last layer of our CNN.
From this output, one can compute the ROC curve (e.g. with cppROC).
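As an illustration of what such an evaluation involves, here is a minimal sketch (not part of the repo) that computes the false positive rate at 95% recall from (label, distance) pairs like the ones printed above; in practice you would parse all 100k output lines rather than the handful hard-coded here.

```lua
-- Sketch: FPR at 95% recall from (label, distance) pairs.
-- The few pairs below are taken from the sample output above.
local results = {
  {1, 4.3915}, {0, 10.6367}, {1, 5.6122}, {0, 10.5520},
  {1, 3.2972}, {0, 9.8624},
}
-- sort by ascending distance: small distance = predicted positive
table.sort(results, function(a, b) return a[2] < b[2] end)

local numPos, numNeg = 0, 0
for _, r in ipairs(results) do
  if r[1] == 1 then numPos = numPos + 1 else numNeg = numNeg + 1 end
end

-- sweep the decision threshold over the sorted list
local tp, fp = 0, 0
for _, r in ipairs(results) do
  if r[1] == 1 then tp = tp + 1 else fp = fp + 1 end
  if tp / numPos >= 0.95 then
    print(string.format('FPR at 95%% recall: %.4f', fp / numNeg))
    break
  end
end
```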
In the train folder, run `th run.lua`.
When training with 1.2M triplets on a GTX TITAN X, each epoch takes approximately 2 minutes.
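The core idea of the triplet supervision can be sketched with a plain hinge loss, as below. This is an illustrative stand-in only: the objective actually optimized in run.lua (the paper's SoftPN loss) differs, so refer to the arXiv paper for its exact form.

```lua
-- Illustrative stand-in only, NOT the paper's SoftPN objective:
-- a plain triplet hinge loss. d_pos is the descriptor distance of a
-- matching pair, d_neg of a non-matching pair.
local function tripletHinge(d_pos, d_neg, margin)
  return math.max(0, margin + d_pos - d_neg)
end

print(tripletHinge(2.1, 9.8, 1.0))  -- well-separated triplet -> 0
print(tripletHinge(5.6, 5.9, 1.0))  -- hard triplet -> 0.7
```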
Examples of the training triplets used.
Examples of positive nearest-neighbour patch matching using the PN-Net descriptor on the Oxford matching dataset.
Efficiency comparison with MatchNet and DeepCompare, both from CVPR 2015. For more results refer to the arXiv paper.

