Image fingerprinting system combining gradient-based feature extraction (C) with metric learning (Python / Keras).
This codebase comes from an undergraduate ML research internship. The original system was developed by the research lab; this repository preserves the implementation I worked with, for portfolio reference. The project explores image fingerprinting via two complementary pipelines:
- Classical pipeline (C + Python) — Uses nabla (gradient) operators on bitmap images to derive feature histograms.
- Deep learning pipeline — MobileNet backbone + Dense embedding layer trained with triplet loss to produce a 64-dimensional image fingerprint ("DNA").
Two embeddings can then be compared via cosine similarity for image retrieval and matching tasks.
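
As a rough illustration of that comparison step, the sketch below computes cosine similarity between two 64-dimensional vectors in plain Python. The function name and the toy vectors are made up for this example; the repository's own implementation lives in `dl_dna_model.py` and may differ in detail.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Two toy 64-dimensional "fingerprints" (made-up values, not model output).
dna_a = [1.0] * 64
dna_b = [1.0] * 32 + [-1.0] * 32

sim_same = cosine_similarity(dna_a, dna_a)  # identical vectors
sim_orth = cosine_similarity(dna_a, dna_b)  # orthogonal vectors
```

Values near 1 indicate closely matching fingerprints, while values near 0 indicate unrelated ones.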
| Layer | Technology |
|---|---|
| Native processing | C, CMake |
| Numerical / image | NumPy, Pillow, SciPy |
| Deep learning | TensorFlow / Keras, MobileNet |
| Metric learning | Triplet loss (margin = 0.4, embedding dim = 64) |
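
To make the metric-learning row concrete, here is a minimal sketch of a triplet loss with the margin of 0.4 listed in the table. It assumes squared Euclidean distance between embeddings, which is a common choice but not confirmed for this codebase; `squared_distance`, `triplet_loss`, and the 2-dimensional toy vectors are illustrative only (the real embeddings have 64 dimensions).

```python
MARGIN = 0.4  # margin value from the table above

def squared_distance(u, v):
    """Squared Euclidean distance (assumed metric, see lead-in)."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=MARGIN):
    """Hinge-style triplet loss: the positive should be closer to the
    anchor than the negative by at least `margin`."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)

# Toy 2-dimensional embeddings for readability.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negative = [0.0, 1.0]

easy = triplet_loss(anchor, positive, negative)  # negative is already far away
hard = triplet_loss(anchor, positive, positive)  # "negative" as close as the positive
```

A triplet contributes zero loss once the negative is at least the margin farther from the anchor than the positive, so training effort concentrates on the harder triplets.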
nabla-dna-master/
├── CMakeLists.txt # Builds the `mkdna` C executable
├── src/mkdna/ # C source: bitmap → DNA fingerprint
├── python/ # Python toolchain (BMP, histogram, distance, sobel, ...)
│ ├── mkdna.py # Python driver
│ ├── nbmp.py, sobel.py, histo.py
│ ├── dist.py, getdist.py, dist_histo.py
│ └── ...
└── DL-DNA/ # Deep learning pipeline
    ├── dl_dna.py # CLI entry point
    ├── dl_dna_model.py # Abstract model base + cosine similarity
    ├── triplet_model.py # Triplet-loss MobileNet (anchor / positive / negative)
    ├── mobilenet.py # Plain MobileNet baseline
    └── lineEnumerator.py
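
For orientation on what a gradient-based feature histogram looks like, the sketch below builds a magnitude-weighted histogram of gradient orientations from a small grayscale image. It uses simple central differences rather than the Sobel or nabla operators in `sobel.py` and `src/mkdna/`, and the function name, bin count, and toy image are assumptions for illustration, not the repository's algorithm.

```python
import math

def gradient_orientation_histogram(image, bins=8):
    """Histogram of gradient orientations, weighted by gradient magnitude.

    `image` is a list of rows of grayscale values. Border pixels are skipped
    because central differences need both neighbours.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0  # horizontal derivative
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0  # vertical derivative
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat region: no orientation to record
            angle = math.atan2(gy, gx) % (2 * math.pi)
            index = min(int(angle / (2 * math.pi) * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    # Normalise so histograms of differently sized images are comparable.
    return [h / total for h in hist] if total else hist

# Toy 5x5 image whose brightness increases from left to right.
ramp = [[float(x) for x in range(5)] for _ in range(5)]
hist = gradient_orientation_histogram(ramp)
```

For the left-to-right ramp, every interior gradient points in the same direction, so all of the weight falls into the first bin.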
cd nabla-dna-master
mkdir build && cd build
cmake ..
make # produces the `mkdna` executable

cd nabla-dna-master/DL-DNA
pip install -r requirements.txt

# Train with triplet loss, save the trained model
python dl_dna.py -m triplet_loss -t triples.txt -s model.keras -f images/
# Get the embedding for a single image
python dl_dna.py -l model.keras some_image_name
# Compute cosine similarity between two images
python dl_dna.py -l model.keras image_a image_b
# Compute similarities for a batch (pairs file)
python dl_dna.py -l model.keras pairs.txt

Environment variables: `N_UNITS`, `IMAGE_FOLDER`, `EPOCHS`.
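
One plausible way to consume these variables is sketched below. The variable names come from this README, but the helper `read_settings`, the default values, and the interpretation of `N_UNITS` as an integer unit count are assumptions; check `dl_dna.py` for the actual behaviour.

```python
import os

def read_settings(env=None):
    """Read training settings from environment variables.

    All defaults here are hypothetical placeholders, not the project's values.
    """
    env = os.environ if env is None else env
    return {
        "n_units": int(env.get("N_UNITS", "64")),          # assumed default
        "image_folder": env.get("IMAGE_FOLDER", "images/"),  # assumed default
        "epochs": int(env.get("EPOCHS", "10")),            # assumed default
    }

# Passing a dict instead of os.environ keeps the example deterministic.
settings = read_settings({"EPOCHS": "3"})
```

In a shell, the equivalent override would be set before the command, for example `EPOCHS=3 python dl_dna.py ...` with the usual arguments.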
This repository is a snapshot of the internship codebase; module naming and architecture reflect the original lab conventions.