Paper Summary Agent for Obsidian

paper-summary-agent is a local skill package for Codex-style runtimes. It is designed to fetch papers, analyze them, and save the final Korean Markdown output inside an Obsidian vault so the note and extracted figure assets work with relative links.

This repository is a skill package, not a standalone app. The main entry point is SKILL.md, and the helper scripts under scripts/ support fetching, text extraction, PDF figure extraction, and final Markdown writing.

What This Repository Does

  • resolves a paper from title, DOI, URL, or direct PDF URL
  • fetches source files and extracts text
  • analyzes methods, experiments, figures, and tables
  • writes a Korean Markdown summary into an Obsidian-friendly folder
  • stores extracted figure assets next to the Markdown note with relative paths

Install The Skill

Place this folder under the local skills directory that your runtime scans and keep the folder name as paper-summary-agent.

Typical layout:

<skills-root>/paper-summary-agent/

For Codex, <skills-root> is commonly ~/.codex/skills. For other runtimes such as OpenClaw, use that runtime's local skills directory.

If you want to clone it directly:

git clone https://github.com/ch040602/Skills_Paper-summary.git <skills-root>/paper-summary-agent

If the skill is already installed:

git -C <skills-root>/paper-summary-agent pull

Restart or reload the runtime if it caches the available skills.

Install Python Dependencies

Install the helper-script dependencies before running the skill:

pip install requests beautifulsoup4 lxml pypdf python-slugify pymupdf pillow

PDF-related packages:

  • pypdf: PDF text extraction
  • pymupdf: figure and table extraction from PDFs
  • pillow: image post-processing for extracted figures

HTML-related packages:

  • requests: fetching paper pages and PDFs
  • beautifulsoup4 and lxml: HTML parsing
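
A quick way to confirm the install worked is to check that each package's module can be found. The sketch below is not part of the repository. The mapping uses the usual import names for these packages (for example `bs4` for beautifulsoup4 and `fitz` for pymupdf), and it is an assumption that the helper scripts import them under those names.

```python
from importlib.util import find_spec

# pip package name -> module name that is normally imported for it.
# Assumption: the helper scripts use these standard import names.
PACKAGE_TO_MODULE = {
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "lxml": "lxml",
    "pypdf": "pypdf",
    "python-slugify": "slugify",
    "pymupdf": "fitz",
    "pillow": "PIL",
}


def missing_packages(package_to_module=PACKAGE_TO_MODULE):
    """Return the pip package names whose modules cannot be located."""
    return [
        package
        for package, module in package_to_module.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All helper-script dependencies can be found.")
```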

Configure Download And Obsidian Summary Paths

Path configuration lives in scripts/paths.py. The helper scripts support these environment variables:

  • PAPER_SUMMARY_AGENT_BASE_DIR
  • PAPER_SUMMARY_AGENT_DOWNLOAD_DIR
  • PAPER_SUMMARY_AGENT_SUMMARY_DIR

If PAPER_SUMMARY_AGENT_DOWNLOAD_DIR or PAPER_SUMMARY_AGENT_SUMMARY_DIR is set to a relative path, that path is resolved against this repository's root. This keeps configuration portable and avoids hard-coding user-specific absolute paths.

Defaults:

  • downloaded source files: ~/Documents/paper-summary-agent/downloaded/
  • final Markdown notes for Obsidian: ~/Desktop/obsidian/summary_paper/
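
The resolution rule described above can be sketched roughly as follows. This is an illustration, not the contents of scripts/paths.py. The function name `resolve_dir` and its parameters are hypothetical, and the sketch does not model PAPER_SUMMARY_AGENT_BASE_DIR.

```python
import os
from pathlib import Path


def resolve_dir(env_name, default, repo_root, environ=os.environ):
    """Resolve a configured directory (hypothetical helper).

    - variable unset or empty -> the documented default, with ~ expanded
    - absolute path           -> used as-is
    - relative path           -> resolved against the repository root
    """
    raw = environ.get(env_name, "").strip()
    if not raw:
        return Path(default).expanduser()
    path = Path(raw).expanduser()
    if path.is_absolute():
        return path
    return (Path(repo_root) / path).resolve()


if __name__ == "__main__":
    # Assumption: the repository root is the current working directory.
    root = Path.cwd()
    print(resolve_dir("PAPER_SUMMARY_AGENT_DOWNLOAD_DIR",
                      "~/Documents/paper-summary-agent/downloaded", root))
    print(resolve_dir("PAPER_SUMMARY_AGENT_SUMMARY_DIR",
                      "~/Desktop/obsidian/summary_paper", root))
```

Under this rule a relative value such as `../my-obsidian-vault/summary_paper` lands next to the repository folder, whatever the current working directory is.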

Recommended setup for Obsidian:

  • point PAPER_SUMMARY_AGENT_SUMMARY_DIR to a folder inside your Obsidian vault
  • keep the summary Markdown file and extracted figure asset folder under the same parent so image links stay relative

PowerShell example:

$env:PAPER_SUMMARY_AGENT_DOWNLOAD_DIR = "./data/downloaded"
$env:PAPER_SUMMARY_AGENT_SUMMARY_DIR = "../my-obsidian-vault/summary_paper"

Bash example:

export PAPER_SUMMARY_AGENT_DOWNLOAD_DIR=./data/downloaded
export PAPER_SUMMARY_AGENT_SUMMARY_DIR=../my-obsidian-vault/summary_paper

With that setup, the skill will write:

  • ../my-obsidian-vault/summary_paper/YYYYMMDD_<slugified_title>.md
  • ../my-obsidian-vault/summary_paper/<slugified_title>/figure_*.png

The Markdown generated by scripts/save_summary.py embeds figures with relative paths, which is the intended behavior for Obsidian.
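
To make the naming and linking scheme concrete, here is a minimal sketch of how a note path, its sibling asset folder, and a relative image link could be derived. It does not reproduce scripts/save_summary.py. `simple_slug` is a hypothetical stand-in for the real slugifier, and the real slug format (separator, casing) may differ.

```python
import os
import re
from datetime import date
from pathlib import Path


def simple_slug(title):
    """Hypothetical stand-in slugifier: lowercase, underscore-separated."""
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


def note_and_asset_paths(summary_dir, title, day):
    """Return (note path, asset folder) as siblings under summary_dir."""
    slug = simple_slug(title)
    note = Path(summary_dir) / f"{day:%Y%m%d}_{slug}.md"
    assets = Path(summary_dir) / slug
    return note, assets


def image_link(note_path, image_path):
    """Build a Markdown image link relative to the note's own folder."""
    rel = os.path.relpath(image_path, start=Path(note_path).parent)
    # Use forward slashes so the link also works when generated on Windows.
    return f"![{Path(image_path).stem}]({Path(rel).as_posix()})"


if __name__ == "__main__":
    note, assets = note_and_asset_paths(
        "../my-obsidian-vault/summary_paper",
        "Attention Is All You Need",
        date.today(),
    )
    print(note)
    print(image_link(note, assets / "figure_1.png"))
```

Because the link is computed relative to the note's folder, it contains no machine-specific prefix.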

Execute The Skill

Once the folder is registered, run it through the agent runtime rather than launching this repository directly.

Typical triggers:

  • a paper title
  • a DOI
  • an arXiv / project / publisher URL
  • a direct PDF URL

Example prompts:

Summarize this paper: Attention Is All You Need
Read this arXiv paper: https://arxiv.org/pdf/1706.03762.pdf
Use paper-summary-agent on DOI 10.48550/arXiv.1706.03762

The skill instructions in SKILL.md tell the runtime to resolve the source, fetch files, analyze the paper, and save a Korean Markdown report.

Runtime Outputs

During execution, the helper scripts usually create:

  • downloaded source files under the configured download directory
  • fetch/extract/normalize provenance manifests next to downloaded text/source files
  • a final Markdown note under the configured Obsidian summary directory
  • an asset folder named after the paper slug beside that note when PDF figures are extracted
  • a .paper_summary_assets.json manifest inside that asset folder, used to reuse extracted figures and tables when the same PDF is saved again

Because the note and asset folder are siblings, the generated image links remain relative and portable across machines as long as the vault structure stays the same.

Source fetching and text extraction are cached with local manifest files. A repeated fetch_paper.py call for the same URL reuses the previously downloaded file when its recorded hash still matches. extract_text.py and normalize_text.py likewise skip unchanged work based on source/output signatures.
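
The hash-based reuse described above can be illustrated with a small manifest keyed by URL. This is a sketch under stated assumptions: the manifest layout, the function names, and the choice of SHA-256 are hypothetical and not taken from fetch_paper.py.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large PDFs are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_manifest(manifest_path):
    """Return the manifest dict, or an empty dict if missing or unreadable."""
    try:
        return json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return {}


def cached_download(url, manifest_path):
    """Return the cached file for url if its recorded hash still matches."""
    entry = load_manifest(manifest_path).get(url)
    if not entry:
        return None
    path = Path(entry.get("path", ""))
    if path.is_file() and sha256_of(path) == entry.get("sha256"):
        return path
    return None  # file missing or changed: the caller should fetch again


def record_download(url, path, manifest_path):
    """Store the file path and content hash for url in the manifest."""
    manifest = load_manifest(manifest_path)
    manifest[url] = {"path": str(path), "sha256": sha256_of(path)}
    Path(manifest_path).write_text(json.dumps(manifest, indent=2), encoding="utf-8")
```

A caller would try `cached_download` first and only download, then call `record_download`, when it returns `None`.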

PDF figure and table extraction is cached by source file path, size, modification time, content hash, and extractor cache version. If the PDF changes, the cache manifest is incomplete, or a cached figure image is missing, the helper scripts regenerate the figure/table assets and refresh the manifest.
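
As a rough illustration of that invalidation rule, the sketch below builds a signature from the listed fields and reuses assets only when the signature matches and every listed figure still exists. The manifest file name comes from this README. The field names (`signature`, `figures`), the function names, and the version constant are hypothetical.

```python
import hashlib
import json
from pathlib import Path

EXTRACTOR_CACHE_VERSION = 1  # hypothetical; bump when extraction logic changes
MANIFEST_NAME = ".paper_summary_assets.json"


def pdf_signature(pdf_path):
    """Collect the fields that identify one exact version of the PDF."""
    pdf_path = Path(pdf_path).resolve()
    stat = pdf_path.stat()
    return {
        "path": str(pdf_path),
        "size": stat.st_size,
        "mtime_ns": stat.st_mtime_ns,
        "sha256": hashlib.sha256(pdf_path.read_bytes()).hexdigest(),
        "cache_version": EXTRACTOR_CACHE_VERSION,
    }


def can_reuse_assets(pdf_path, asset_dir):
    """True only if the manifest is complete, current, and all figures exist."""
    asset_dir = Path(asset_dir)
    try:
        manifest = json.loads((asset_dir / MANIFEST_NAME).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return False  # no manifest, or it is unreadable
    if manifest.get("signature") != pdf_signature(pdf_path):
        return False  # the PDF or the extractor version changed
    figures = manifest.get("figures")
    if not isinstance(figures, list):
        return False  # incomplete manifest
    return all((asset_dir / name).is_file() for name in figures)


def write_manifest(pdf_path, asset_dir, figure_names):
    """Record the signature and figure list after a fresh extraction."""
    payload = {"signature": pdf_signature(pdf_path), "figures": list(figure_names)}
    (Path(asset_dir) / MANIFEST_NAME).write_text(json.dumps(payload, indent=2), encoding="utf-8")
```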

Repository Layout

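Important files, as referenced elsewhere in this README:

  • SKILL.md: main entry point with the skill instructions
  • scripts/paths.py: download and summary path configuration
  • scripts/save_summary.py: writes the final Markdown note with relative figure links
  • resolve_paper.py: lightweight input classifier
  • fetch_paper.py: downloads source files
  • extract_text.py and normalize_text.py: text extraction and normalization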

Updating The Skill

If you already installed the repository as a local skill, update it in place:

git -C <skills-root>/paper-summary-agent pull

Then restart or reload your runtime if skill definitions are cached.

Notes

  • Keep the folder structure intact when copying the skill.
  • Do not move SKILL.md out of the repository root.
  • resolve_paper.py is only a lightweight classifier; full source resolution is expected to be handled by the agent workflow.
  • fetch_paper.py requires network access.
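
For a sense of what a lightweight classifier over the supported inputs (title, DOI, URL, direct PDF URL) might look like, here is a minimal sketch. It is not the code in resolve_paper.py. The function name, the returned labels, and the DOI pattern are assumptions for illustration.

```python
import re
from urllib.parse import urlparse

# Loose DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
DOI_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def classify_input(text):
    """Classify a paper reference as 'doi', 'pdf_url', 'url', or 'title'."""
    value = text.strip()
    lowered = value.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            value = value[len(prefix):].strip()
            break
    if DOI_PATTERN.match(value):
        return "doi"
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # Only the path suffix is checked; no network request is made.
        if parsed.path.lower().endswith(".pdf"):
            return "pdf_url"
        return "url"
    return "title"


if __name__ == "__main__":
    for sample in (
        "Attention Is All You Need",
        "https://arxiv.org/pdf/1706.03762.pdf",
        "10.48550/arXiv.1706.03762",
    ):
        print(sample, "->", classify_input(sample))
```

A classifier like this only labels the input. Finding the actual paper for a bare title still has to happen in the agent workflow, as the note above says.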
