Paper Summary Agent for Obsidian

paper-summary-agent is a local skill package for Codex-style runtimes. It fetches a paper, analyzes it, and saves a Korean Markdown summary inside an Obsidian vault, so that the note and its extracted figure assets are connected by relative links.

This repository is a skill package, not a standalone app. The main entry point is SKILL.md, and the helper scripts under scripts/ support fetching, text extraction, PDF figure extraction, and final Markdown writing.

What This Repository Does

resolves a paper from title, DOI, URL, or direct PDF URL
fetches source files and extracts text
analyzes methods, experiments, figures, and tables
writes a Korean Markdown summary into an Obsidian-friendly folder
stores extracted figure assets next to the Markdown note with relative paths

Install The Skill

Place this folder under the local skills directory that your runtime scans and keep the folder name as paper-summary-agent.

Typical layout:

<skills-root>/paper-summary-agent/

For Codex, <skills-root> is commonly ~/.codex/skills. For other runtimes such as OpenClaw, use that runtime's local skills directory.

If you want to clone it directly:

git clone https://github.com/ch040602/Skills_Paper-summary.git <skills-root>/paper-summary-agent

If the skill is already installed:

git -C <skills-root>/paper-summary-agent pull

Restart or reload the runtime if it caches the available skills.

Install Python Dependencies

Install the helper-script dependencies before running the skill:

pip install requests beautifulsoup4 lxml pypdf python-slugify pymupdf pillow

PDF-related packages:

pypdf: PDF text extraction
pymupdf: figure and table extraction from PDFs
pillow: image post-processing for extracted figures

HTML-related packages:

requests: fetching paper pages and PDFs
beautifulsoup4 and lxml: HTML parsing
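
Several of these packages are imported under a different name than the one passed to pip (for example, beautifulsoup4 is imported as bs4 and pymupdf as fitz). The following is a minimal sketch of a pre-flight check, not part of the repository; the helper name missing_packages is hypothetical.

```python
import importlib.util

# pip distribution name -> top-level module name used in an import statement
IMPORT_NAMES = {
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "lxml": "lxml",
    "pypdf": "pypdf",
    "python-slugify": "slugify",
    "pymupdf": "fitz",
    "pillow": "PIL",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the pip names whose import module cannot be found (hypothetical helper)."""
    return [
        pip_name
        for pip_name, module in import_names.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

Running the check with the same interpreter the runtime uses helps confirm that pip installed the packages into the right environment.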

Configure Download And Obsidian Summary Paths

Path configuration lives in scripts/paths.py. The helper scripts support these environment variables:

PAPER_SUMMARY_AGENT_BASE_DIR
PAPER_SUMMARY_AGENT_DOWNLOAD_DIR
PAPER_SUMMARY_AGENT_SUMMARY_DIR

If PAPER_SUMMARY_AGENT_DOWNLOAD_DIR or PAPER_SUMMARY_AGENT_SUMMARY_DIR is set with a relative path, it is resolved relative to this repository root. That keeps configuration portable and avoids hard-coding user-specific absolute paths.

Defaults:

downloaded source files: ~/Documents/paper-summary-agent/downloaded/
final Markdown notes for Obsidian: ~/Desktop/obsidian/summary_paper/
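
The resolution rule described above (relative values are anchored at the repository root, unset values fall back to the defaults) can be sketched as follows. This is an illustration under stated assumptions, not the actual contents of scripts/paths.py: the function name resolve_dir and its parameters are hypothetical, and the handling of PAPER_SUMMARY_AGENT_BASE_DIR is not shown because the text above does not describe it.

```python
import os
from pathlib import Path


def resolve_dir(env_name, default, repo_root, environ=None):
    """Resolve a configured directory (hypothetical helper, not the real paths.py).

    - variable unset or empty: use the default, expanding "~"
    - absolute value: use it as-is
    - relative value: anchor it at the repository root
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(env_name)
    if not raw:
        return Path(default).expanduser()
    path = Path(raw).expanduser()
    if not path.is_absolute():
        path = Path(repo_root) / path
    return path


if __name__ == "__main__":
    # Assumption for the demo: the current directory stands in for the repository root.
    summary_dir = resolve_dir(
        "PAPER_SUMMARY_AGENT_SUMMARY_DIR",
        "~/Desktop/obsidian/summary_paper",
        repo_root=Path.cwd(),
    )
    print(summary_dir)
```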

Recommended setup for Obsidian:

point PAPER_SUMMARY_AGENT_SUMMARY_DIR to a folder inside your Obsidian vault
keep the summary Markdown file and extracted figure asset folder under the same parent so image links stay relative

PowerShell example:

$env:PAPER_SUMMARY_AGENT_DOWNLOAD_DIR = "./data/downloaded"
$env:PAPER_SUMMARY_AGENT_SUMMARY_DIR = "../my-obsidian-vault/summary_paper"

Bash example:

export PAPER_SUMMARY_AGENT_DOWNLOAD_DIR=./data/downloaded
export PAPER_SUMMARY_AGENT_SUMMARY_DIR=../my-obsidian-vault/summary_paper

With that setup, the skill will write:

../my-obsidian-vault/summary_paper/YYYYMMDD_<slugified_title>.md
../my-obsidian-vault/summary_paper/<slugified_title>/figure_*.png

The Markdown generated by scripts/save_summary.py embeds figures with relative paths, which is the intended behavior for Obsidian.
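
To make the naming scheme and the relative links concrete, here is a small sketch of how a note path, its sibling asset folder, and an image link could be derived. It is an assumption-laden illustration rather than the logic of scripts/save_summary.py: simple_slug is a stand-in for python-slugify (whose output may differ, for example in the separator it uses), and the standard Markdown image syntax is assumed for the embed.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def simple_slug(title):
    """Stand-in slugifier: lowercase, runs of non-alphanumerics become '_'."""
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


def output_paths(summary_dir, title, day):
    """Return (note path, asset folder) following YYYYMMDD_<slug>.md and <slug>/."""
    slug = simple_slug(title)
    base = PurePosixPath(summary_dir)
    return base / f"{day:%Y%m%d}_{slug}.md", base / slug


def figure_embed(note, figure):
    """Build a Markdown image link that is relative to the note's own folder."""
    rel = figure.relative_to(note.parent)
    return f"![{figure.stem}]({rel.as_posix()})"


if __name__ == "__main__":
    note, assets = output_paths(
        "../my-obsidian-vault/summary_paper", "Attention Is All You Need", date.today()
    )
    print(note)
    print(figure_embed(note, assets / "figure_1.png"))
```

Because the link is computed relative to the note's parent folder, it contains no vault-specific or user-specific prefix.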

Execute The Skill

Once the folder is registered, invoke the skill through the agent runtime rather than launching this repository directly.

Typical triggers:

a paper title
a DOI
an arXiv / project / publisher URL
a direct PDF URL

Example prompts:

Summarize this paper: Attention Is All You Need

Read this arXiv paper: https://arxiv.org/pdf/1706.03762.pdf

Use paper-summary-agent on DOI 10.48550/arXiv.1706.03762

The skill instructions in SKILL.md tell the runtime to resolve the source, fetch files, analyze the paper, and save a Korean Markdown report.
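
The four trigger types listed above can be told apart with a few string checks. The sketch below is a rough illustration only: it does not reproduce scripts/resolve_paper.py, and the function name classify_input, the labels it returns, and the DOI pattern are all assumptions.

```python
import re
from urllib.parse import urlparse

# Loose DOI shape: "10.", a registrant code of 4 to 9 digits, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def classify_input(text):
    """Classify user input as 'pdf_url', 'url', 'doi', or 'title' (hypothetical helper)."""
    value = text.strip()
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # Only an explicit .pdf suffix counts as a direct PDF link in this sketch;
        # a PDF served from a path without that suffix is reported as a plain URL.
        if parsed.path.lower().endswith(".pdf"):
            return "pdf_url"
        return "url"
    if DOI_PATTERN.match(value):
        return "doi"
    # Anything else is treated as a free-text title.
    return "title"


if __name__ == "__main__":
    for sample in (
        "Attention Is All You Need",
        "https://arxiv.org/pdf/1706.03762.pdf",
        "10.48550/arXiv.1706.03762",
    ):
        print(sample, "->", classify_input(sample))
```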

Runtime Outputs

During execution, the helper scripts usually create:

downloaded source files under the configured download directory
fetch/extract/normalize provenance manifests next to downloaded text/source files
a final Markdown note under the configured Obsidian summary directory
an asset folder named after the paper slug beside that note when PDF figures are extracted
a .paper_summary_assets.json manifest inside that asset folder, used to reuse extracted figures and tables when the same PDF is saved again

Because the note and asset folder are siblings, the generated image links remain relative and portable across machines as long as the vault structure stays the same.

Source fetching and text extraction are cached with local manifest files. A repeated fetch_paper.py call for the same URL reuses the previously downloaded file when its recorded hash still matches. extract_text.py and normalize_text.py likewise skip unchanged work based on source/output signatures.
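
The reuse rule for downloads (keep the file if its recorded hash still matches) can be sketched like this. The manifest layout, the SHA-256 choice, and the function names cached_file and record_download are assumptions made for illustration; the manifests written by fetch_paper.py may be structured differently.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large PDFs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_file(manifest_path, url):
    """Return the cached file for url if its recorded hash still matches, else None."""
    manifest_path = Path(manifest_path)
    if not manifest_path.exists():
        return None
    entry = json.loads(manifest_path.read_text(encoding="utf-8")).get(url)
    if not entry:
        return None
    path = Path(entry["path"])
    if path.exists() and sha256_of(path) == entry["sha256"]:
        return path
    return None  # file is missing or was modified: the caller should fetch again


def record_download(manifest_path, url, path):
    """Store the downloaded file's location and hash under its source URL."""
    manifest_path = Path(manifest_path)
    data = {}
    if manifest_path.exists():
        data = json.loads(manifest_path.read_text(encoding="utf-8"))
    data[url] = {"path": str(path), "sha256": sha256_of(path)}
    manifest_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
```

A caller would try cached_file first and only download, then call record_download, when it returns None.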

PDF figure and table extraction is cached by source file path, size, modification time, content hash, and extractor cache version. If the PDF changes, the cache manifest is incomplete, or a cached figure image is missing, the helper scripts regenerate the figure/table assets and refresh the manifest.
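
A cache check along those lines could look like the sketch below. It is not the repository's implementation: the manifest keys ("signature", "figures"), the EXTRACTOR_CACHE_VERSION constant, and the function names are hypothetical, and the real .paper_summary_assets.json may record more than this.

```python
import hashlib
import os

EXTRACTOR_CACHE_VERSION = 1  # hypothetical; bumped when extraction logic changes


def pdf_signature(pdf_path, version=EXTRACTOR_CACHE_VERSION):
    """Collect the fields the cache is keyed on: path, size, mtime, hash, version."""
    stat = os.stat(pdf_path)
    with open(pdf_path, "rb") as handle:
        content_hash = hashlib.sha256(handle.read()).hexdigest()
    return {
        "path": os.path.abspath(pdf_path),
        "size": stat.st_size,
        "mtime_ns": stat.st_mtime_ns,
        "sha256": content_hash,
        "cache_version": version,
    }


def cache_is_valid(manifest, pdf_path, asset_dir):
    """Return True only if the PDF is unchanged and every cached figure still exists."""
    if manifest.get("signature") != pdf_signature(pdf_path):
        return False  # the PDF or the extractor version changed
    figures = manifest.get("figures")
    if not isinstance(figures, list):
        return False  # incomplete manifest
    return all(os.path.isfile(os.path.join(asset_dir, name)) for name in figures)
```

When cache_is_valid returns False, the caller would regenerate the assets and write a fresh manifest, matching the behavior described above.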

Repository Layout

Important files:

SKILL.md: runtime-facing workflow instructions
resources/agents_append.md: extra prompt content used by the workflow
scripts/resolve_paper.py: classifies input as title, DOI, URL, or PDF URL
scripts/fetch_paper.py: downloads the source page or PDF
scripts/extract_text.py: extracts text from downloaded HTML or PDF
scripts/normalize_text.py: cleans extracted text
scripts/context_router.py: builds a role-scoped context manifest so paper-analysis subagents do not all receive the full paper text
scripts/extract_figures.py: extracts PDF figures
scripts/save_summary.py: writes the final Markdown summary and embeds figures when available

Updating The Skill

If you already installed the repository as a local skill, update it in place:

git -C <skills-root>/paper-summary-agent pull

Then restart or reload your runtime if skill definitions are cached.

Notes

Keep the folder structure intact when copying the skill.
Do not move SKILL.md out of the repository root.
resolve_paper.py is only a lightweight classifier; full source resolution is expected to be handled by the agent workflow.
fetch_paper.py requires network access.
