# LaMP-Cap: Personalized Scientific Figure Captioning Dataset


This is the GitHub repository for the arXiv preprint, *LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles*.

LaMP-Cap builds on the SciCap Challenge dataset and is designed specifically for personalized, context-aware scientific figure caption generation. Unlike traditional captioning datasets, LaMP-Cap provides not only target figures and their metadata but also profile figures from the same scientific paper, enabling research into leveraging multimodal context for improved captioning.

LaMP-Cap is intended for non-commercial research only and is released under the CC BY-NC-SA 4.0 license. By using LaMP-Cap, you agree to the terms in the license.

*Overview figure from the paper.*

## How to Cite


```bibtex
@inproceedings{ng2025lamp,
  title = "LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles",
  author = "Ng, 'Sam' Ho Yin and Hsu, Ting-Yao and Anantha Ramakrishnan, Aashish and Kveton, Branislav and Lipka, Nedim and Dernoncourt, Franck and Lee, Dongwon and Yu, Tong and Kim, Sungchul and Rossi, Ryan A. and Huang, 'Kenneth' Ting-Hao",
  booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
  month = nov,
  year = "2025"
}
```

Note: The paper has been accepted to EMNLP 2025 Findings. The final BibTeX entry will be available upon publication; in the meantime, please cite the preprint version.

## Dataset Description


LaMP-Cap is curated from the SciCap Challenge Dataset and focuses on personalized captioning, where profile figures (related images and captions from the same paper) provide rich context for the target figure. This design supports the study of context-aware and user-personalized caption generation in scientific domains.

*Profile figures distribution.*

## Download the SciCap Challenge Dataset


You can download the SciCap Challenge dataset from Hugging Face here: Download Link.

Our LaMP-Cap data is built from the metadata files of the SciCap Challenge dataset. Figures are grouped by their source arXiv paper. For each paper, we randomly selected one figure as the target; the remaining figures became its profile figures. Our metadata includes only papers with at least two figures, so every target has at least one profile figure. As a result, only part of the original SciCap dataset appears in our data. The metadata files filter and organize figures for personalized caption generation in our use case.
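The pairing procedure described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the repository's actual build script, and the shape of `papers` (a dict mapping arXiv ids to lists of figure records) is an assumption:

```python
import random

def build_pairs(papers, seed=42):
    """Sketch of the target/profile pairing described above.

    `papers` maps an arXiv id to a list of figure records; the field
    names of those records are illustrative, not the dataset's own.
    """
    rng = random.Random(seed)
    pairs = []
    for arxiv_id, figures in papers.items():
        if len(figures) < 2:          # papers with a single figure are dropped
            continue
        target = rng.choice(figures)  # one randomly selected target figure
        profile = [f for f in figures if f is not target]
        pairs.append({"arXiv_id": arxiv_id, "target": target, "profile": profile})
    return pairs
```

Fixing the random seed makes the target selection reproducible across runs.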

This repository contains the following:

### Folder Structure


```
.
├── README.md
├── img                          # Related tables/figures from our arXiv paper
└── metadata                     # Annotations for the dataset splits
    ├── train-metadata.json      # Target-profile pairing metadata for the training set
    ├── test-metadata.json       # Target-profile pairing metadata for the test set
    └── val-metadata.json        # Target-profile pairing metadata for the validation set
```

### Example Data Instance

An actual JSON object from LaMP-Cap:

```json
{
  "arXiv_id": 1707.05196,
  "categories": "physics.acc-ph",
  "target": {
    "image_id": 757913,
    "caption_id": 1092105,
    "caption_length": 35,
    "figure_type": "Graph Plot"
  },
  "profile": [
    {
      "image_id": 501525,
      "caption_id": 835717,
      "caption_length": 42,
      "figure_type": "Graph Plot"
    },
    {
      "image_id": 519953,
      "caption_id": 854145,
      "caption_length": 33,
      "figure_type": "Graph Plot"
    },
    {
      "image_id": 586922,
      "caption_id": 921114,
      "caption_length": 14,
      "figure_type": "Graph Plot"
    }
  ]
}
```

### JSON Schema

- `arXiv_id`: Unique identifier of the paper (from arXiv).
- `categories`: arXiv primary category of the source paper (e.g., `"physics.acc-ph"`).
- `target`: Metadata for the target figure to be captioned:
  - `image_id`: Figure image ID (matches SciCap).
  - `caption_id`: Caption ID (matches SciCap).
  - `caption_length`: Number of tokens in the caption.
  - `figure_type`: Type of figure (e.g., `"Graph Plot"`).
- `profile`: List of profile figures providing context for personalized captioning.
  - Each entry contains the same fields as `target`.
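
A small loader can sanity-check these fields when reading a split. This sketch assumes each split file holds a JSON array of objects shaped like the example above; check the actual files, as the layout may differ:

```python
import json

# Fields every figure record (target or profile entry) is expected to carry.
REQUIRED_FIELDS = {"image_id", "caption_id", "caption_length", "figure_type"}

def load_split(path):
    """Load one metadata split file and verify the schema listed above."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    for rec in records:
        assert REQUIRED_FIELDS <= rec["target"].keys()
        assert rec["profile"], "every target should have >= 1 profile figure"
        for fig in rec["profile"]:
            assert REQUIRED_FIELDS <= fig.keys()
    return records
```

The non-empty `profile` check mirrors the dataset's guarantee that every included paper has at least two figures.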

## Baseline Performance

Caption quality is measured with BLEU and ROUGE, comparing each generated caption against the original author-written caption on the corresponding test set. We also report performance under different profile settings: no profile, one profile, and all profiles.
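To make the overlap-based evaluation concrete, here is a minimal pure-Python ROUGE-1 F1 between a generated and a reference caption. This is a simplified stand-in for illustration only; the reported scores presumably come from standard BLEU/ROUGE implementations, not this sketch:

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """Unigram-overlap (ROUGE-1) F1 between two whitespace-tokenized captions."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

An identical candidate and reference score 1.0; disjoint captions score 0.0.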

*Profile figures distribution.*


## License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License and inherits the licensing terms of the SciCap Challenge Dataset.


The original SciCap dataset is based on the arXiv dataset, whose metadata is released under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication, which grants permission to remix, remake, annotate, and publish the metadata.
