An image captioning model that pairs a ResNet-50 encoder with a Transformer decoder, trained on MS COCO. Served via FastAPI with a drag-and-drop frontend, with Docker support for CPU deployment.
Updated Mar 13, 2026 - Jupyter Notebook