[Feature]: Auto-generate update infrastructure for llms.txt skills #7

@pgesiak

Feature Type

Plugin Enhancement

Related Plugin

llmstxt

Problem Statement

When /llmstxt-to-skill creates a skill from llms.txt, it provides no mechanism to keep downloaded documentation in sync with upstream sources.

Current Limitation:

Generated:

  • ✅ Skill directory structure
  • ✅ SKILL.md with frontmatter
  • ✅ Downloaded documentation files
  • ✅ llms.txt manifest copy

NOT Generated:

  • ❌ Update scripts to re-sync from source
  • ❌ Commands to check for updates
  • ❌ Documentation on update process
  • ❌ Frontmatter timestamps for staleness detection
  • ❌ Protection logic for SKILL-* files

User Impact

Users must implement update infrastructure from scratch for each skill:

Real Example (openclaw-docs):
Required manual implementation of:

  • update-openclaw-docs.sh (9.2KB script)
  • generate-skill-index.sh (1.5KB script)
  • update-docs.md command (1.8KB)
  • Updated README-SKILL-FILES.md with instructions
  • Protection logic for SKILL-* files

This took significant effort and should be automatic.

Why This Matters

  1. Stale Documentation - No mechanism to keep skills updated
  2. Manual Work - Users reinvent the wheel for each skill
  3. Inconsistency - Each skill has different update patterns
  4. Error-Prone - Easy to miss files or introduce bugs
  5. Maintenance Burden - Updates require manual intervention
  6. Poor UX - "Batteries included" should mean update capability

Proposed Solution

Auto-Generate Update Infrastructure

When creating a skill from llms.txt, automatically generate:

  1. Update script (scripts/update-docs.sh)
  2. Index generator (scripts/generate-index.sh)
  3. Update command (commands/update-docs.md)
  4. Documentation (references/README.md)
  5. Frontmatter in all downloaded files
  6. Protection logic for SKILL-* files

User Experience

Before (Current):

/llmstxt-to-skill https://docs.example.com/llms.txt

✓ Skill created at .claude/skills/example-docs
  212 files downloaded

# User must implement updates manually

After (Proposed):

/llmstxt-to-skill https://docs.example.com/llms.txt

✓ Skill created at .claude/skills/example-docs
  212 files downloaded
  Update infrastructure included:
    • scripts/update-docs.sh
    • scripts/generate-index.sh
    • commands/update-docs.md
    • Frontmatter added to all files

To update documentation later:
  cd .claude/skills/example-docs
  ./scripts/update-docs.sh

Updating the skill:

cd .claude/skills/example-docs
./scripts/update-docs.sh

════════════════════════════════════════════════════════════
  Documentation Update
════════════════════════════════════════════════════════════

STEP 1: Manifest-Based Update
────────────────────────────────────────────────────────────

→ Downloading latest llms.txt...
  ✓ Downloaded

→ Comparing manifests...

📊 Manifest changes detected:
   • Added entries: 3
   • Removed entries: 1

→ Downloading new files...
   • new-feature.md ... ✓
   • api-v2.md ... ✓
   • deployment.md ... ✓

→ Removing deleted files...
   • deprecated-api.md ✓ removed

→ Updated llms.txt
→ Regenerating SKILL-INDEX.md...

📦 Summary:
   • Files added: 3
   • Files removed: 1

════════════════════════════════════════════════════════════
✅ Update complete
════════════════════════════════════════════════════════════

Implementation Design

Generated File Structure

.claude/skills/{skill-name}/
├── SKILL.md                          # Main skill file
├── commands/
│   └── update-docs.md                # Generated: Update command
├── references/
│   ├── llms.txt                      # Downloaded manifest
│   ├── SKILL-INDEX.md                # Generated: Auto-updated index
│   ├── README.md                     # Generated: Documentation
│   └── *.md                          # Downloaded files (with frontmatter)
└── scripts/
    ├── update-docs.sh                # Generated: Main update script
    └── generate-index.sh             # Generated: Index generator

Component 1: Update Script

Location: scripts/update-docs.sh (auto-generated)

Template variables:

SKILL_NAME="{{SKILL_NAME}}"           # e.g., "example-docs"
REFS_DIR="{{REFS_DIR}}"               # e.g., "references"
LLMS_URL="{{LLMS_URL}}"               # e.g., "https://docs.example.com/llms.txt"

Core functionality:

  1. Download fresh llms.txt
  2. Compare with local manifest
  3. Identify added/removed/changed entries
  4. Download new files with frontmatter
  5. Remove deleted files (skip SKILL-* files)
  6. Regenerate index
  7. Report changes

Example script:

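#!/bin/bash
# Auto-generated by llmstxt-to-skill
# Source: https://docs.example.com/llms.txt
set -euo pipefail

# Run from the skill root so relative paths work from any directory
cd "$(dirname "$0")/.."

REFS_DIR="references"
LLMS_URL="https://docs.example.com/llms.txt"
NEW_MANIFEST="$(mktemp)"
trap 'rm -f "$NEW_MANIFEST"' EXIT

# List the unique link targets of a manifest, sorted for comm.
# Comparing URLs instead of whole lines means an entry whose description
# changed is not treated as "added" and then deleted as "removed".
list_urls() {
    grep -oE '\(https?://[^)]+\)' "$1" | sed -E 's/^\(//; s/\)$//' | sort -u
}

# Minimal placeholder helper: fetch a URL and prepend frontmatter
download_with_metadata() {
    local url="$1" dest="$2" body
    body="$(curl -fsSL "$url")" || return 1
    printf -- '---\nsource: %s\ntitle: %s\nfetched: %s\n---\n\n%s\n' \
        "$url" "$(basename "$dest" .md)" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$body" > "$dest"
}

# Download fresh manifest
curl -fsSL "$LLMS_URL" -o "$NEW_MANIFEST"

# Compare manifests
if diff -q "$REFS_DIR/llms.txt" "$NEW_MANIFEST" > /dev/null; then
    echo "✓ No changes"
    exit 0
fi

# Process added entries
comm -13 <(list_urls "$REFS_DIR/llms.txt") <(list_urls "$NEW_MANIFEST") | \
while IFS= read -r url; do
    filename=$(basename "$url")

    # Skip SKILL-* files (never overwrite)
    [[ "$filename" =~ ^SKILL- ]] && continue

    # Download with metadata
    if download_with_metadata "$url" "$REFS_DIR/$filename"; then
        echo "  ✓ Downloaded: $filename"
    else
        echo "  ✗ Failed: $filename" >&2
    fi
done

# Process removed entries
comm -23 <(list_urls "$REFS_DIR/llms.txt") <(list_urls "$NEW_MANIFEST") | \
while IFS= read -r url; do
    filename=$(basename "$url")

    # Skip SKILL-* files (never remove)
    [[ "$filename" =~ ^SKILL- ]] && continue

    rm -f "$REFS_DIR/$filename"
    echo "  ✓ Removed: $filename"
done

# Update manifest (cp keeps the existing file's permissions)
cp "$NEW_MANIFEST" "$REFS_DIR/llms.txt"

# Regenerate index
./scripts/generate-index.sh

echo "✓ Updated manifest"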
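
The update script calls scripts/generate-index.sh, which this issue names but does not show. The following is a minimal sketch of what such a generator might look like, not the openclaw-docs implementation. The function names (`frontmatter_title`, `generate_index`), the index layout, and the fallback to the filename when a file has no `title:` field are all assumptions.

```bash
# Hypothetical sketch of scripts/generate-index.sh (names and layout assumed)

# Print the value of the first "title:" line in a file's leading frontmatter.
# Prints nothing if the file has no frontmatter or no title field.
frontmatter_title() {
    awk '
        NR == 1 && $0 != "---" { exit }   # file does not start with frontmatter
        NR > 1 && $0 == "---"  { exit }   # reached the end of the frontmatter
        /^title:[[:space:]]*/  { sub(/^title:[[:space:]]*/, ""); print; exit }
    ' "$1"
}

# Write SKILL-INDEX.md listing every downloaded doc in the references directory.
# SKILL-* and README* files are excluded, matching the protection rules.
generate_index() {
    local refs_dir="$1" index="$1/SKILL-INDEX.md" file name title count=0
    {
        echo "# Documentation Index"
        echo
        while IFS= read -r file; do
            name=$(basename "$file")
            title=$(frontmatter_title "$file")
            # Fall back to the filename without .md when there is no title
            echo "- [${title:-${name%.md}}]($name)"
            count=$((count + 1))
        done < <(find "$refs_dir" -maxdepth 1 -type f -name '*.md' \
                     ! -name 'SKILL-*' ! -name 'README*' | sort)
        echo
        echo "Total documents: $count"
    } > "$index.tmp"
    # Replace the index only after it has been written completely
    mv "$index.tmp" "$index"
}
```

Under these assumptions the script body would end with a call such as `generate_index references`. Writing to a temporary file first means an interrupted run does not leave a half-written index behind.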

Component 2: Update Command

Location: commands/update-docs.md (auto-generated)

Template:

---
name: update-docs
description: Update {{SKILL_NAME}} documentation from {{DOCS_DOMAIN}}
allowed-tools: [Bash, Read, Write]
---

# Update {{SKILL_NAME}} Documentation

Run the update script to sync documentation from {{LLMS_URL}}.

## Usage

```bash
cd {{SKILL_DIR}}
./scripts/update-docs.sh
```

## Process

### Step 1: Manifest-Based Update (Automatic)

- Downloads latest llms.txt
- Identifies changes
- Updates files

### Step 2: File-by-File Verification (Optional)

- Compares each file with remote
- Prompts for selective updates

## Protected Files

All SKILL-*.md files are never modified (manually maintained).


Component 3: Frontmatter Generation

For each downloaded file, add YAML frontmatter:

---
source: https://docs.example.com/path/to/file.md
title: simplified-filename
description: Human Readable Title
fetched: 2026-02-01T18:30:00.000Z
---

[Original file content follows...]

Functionality:

  • Enables staleness detection
  • Tracks source URL for re-downloading
  • Records fetch timestamp
  • Preserves title/description

Implementation:

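import * as path from 'node:path';

// Minimal helper (not defined in the original proposal): set fields inside an
// existing leading frontmatter block, replacing or appending as needed.
function updateFrontmatter(content: string, fields: Record<string, string>): string {
  const match = content.match(/^---\n([\s\S]*?)\n---\n/);
  if (!match) return content;
  let block = match[1];
  for (const [key, value] of Object.entries(fields)) {
    const line = `${key}: ${value}`;
    const existing = new RegExp(`^${key}:.*$`, 'm');
    block = existing.test(block) ? block.replace(existing, () => line) : `${block}\n${line}`;
  }
  return `---\n${block}\n---\n${content.slice(match[0].length)}`;
}

function addFrontmatter(url: string, content: string, filename: string): string {
  const title = path.basename(filename, '.md');
  const timestamp = new Date().toISOString();

  // Require a complete frontmatter block; a leading "---" alone may be a horizontal rule
  if (/^---\n[\s\S]*?\n---\n/.test(content)) {
    return updateFrontmatter(content, { source: url, fetched: timestamp });
  }

  // Add new frontmatter
  return `---
source: ${url}
title: ${title}
fetched: ${timestamp}
---

${content}`;
}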
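
The `fetched` field is what makes staleness detection possible, but the issue does not show how a check would read it. Below is a minimal sketch under stated assumptions: the function names (`fetched_timestamp`, `timestamp_key`, `is_stale`) are hypothetical, timestamps are assumed to be ISO 8601 in UTC as in the examples above, and a file with no `fetched` field is treated as stale.

```bash
# Print the "fetched:" value from a file's leading frontmatter (empty if absent)
fetched_timestamp() {
    awk '
        NR == 1 && $0 != "---" { exit }   # file does not start with frontmatter
        NR > 1 && $0 == "---"  { exit }   # reached the end of the frontmatter
        /^fetched:[[:space:]]*/ { sub(/^fetched:[[:space:]]*/, ""); print; exit }
    ' "$1"
}

# Reduce an ISO 8601 UTC timestamp to a sortable integer, for example
# 2026-02-01T18:30:00.000Z becomes 20260201183000 (fractional seconds dropped)
timestamp_key() {
    local ts="${1:0:19}"
    echo "${ts//[^0-9]/}"
}

# Succeed (exit 0) if the file was fetched before the cutoff timestamp.
# Files without a fetched field count as stale.
is_stale() {
    local file="$1" cutoff="$2" fetched fetched_key cutoff_key
    fetched=$(fetched_timestamp "$file")
    [[ -z "$fetched" ]] && return 0
    fetched_key=$(timestamp_key "$fetched")
    cutoff_key=$(timestamp_key "$cutoff")
    # 10# forces base 10 so the comparison is purely numeric
    (( 10#$fetched_key < 10#$cutoff_key ))
}
```

A caller would pass a cutoff such as `is_stale references/setup.md "2026-01-01T00:00:00Z"`. Computing "30 days ago" differs between platforms: GNU date accepts `date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%SZ`, while BSD/macOS date uses `date -u -v-30d +%Y-%m-%dT%H:%M:%SZ`. Taking the cutoff as an argument keeps that difference out of the comparison itself.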

Component 4: SKILL-* File Protection

Pattern matching in all scripts:

# Skip SKILL-* files (skill-generated, protected)
if [[ "$filename" =~ ^SKILL- ]]; then
    continue
fi

Applied to:

  • Download operations (don't overwrite)
  • Remove operations (don't delete)
  • Verify operations (don't check)
  • Count operations (exclude from totals)

Find command exclusions:

find "$REFS_DIR" -name "*.md" ! -name "SKILL-*" ! -name "README*"

Template System

Template Variables

Global variables (available in all templates):

{
  // Skill metadata
  SKILL_NAME: "example-docs",
  SKILL_DIR: ".claude/skills/example-docs",

  // Source information
  LLMS_URL: "https://docs.example.com/llms.txt",
  DOCS_URL: "https://docs.example.com",
  DOCS_DOMAIN: "docs.example.com",

  // Paths
  REFS_DIR: "references",
  SCRIPTS_DIR: "scripts",
  COMMANDS_DIR: "commands",

  // Statistics
  DOC_COUNT: 241,
  TOTAL_SIZE: "2.5MB",

  // Timestamps
  TIMESTAMP: "2026-02-01T18:30:00.000Z",
  DATE: "2026-02-01"
}

Template Rendering

Simple placeholder replacement:

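function renderTemplate(template: string, vars: Record<string, string | number>): string {
  // Test for key presence so that empty strings and 0 still render;
  // unknown placeholders are left untouched so validation can flag them.
  return template.replace(/\{\{(\w+)\}\}/g, (match, key: string) =>
    Object.prototype.hasOwnProperty.call(vars, key) ? String(vars[key]) : match
  );
}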

Generation Workflow

Phase 1: Skill Creation (Existing)

Current behavior (no changes):

  1. Download llms.txt from URL
  2. Parse manifest for file list
  3. Create skill directory structure
  4. Download all documentation files
  5. Generate SKILL.md with frontmatter
  6. Save llms.txt locally

Phase 2: Infrastructure Generation (NEW)

After skill creation, automatically:

  1. Render templates with skill-specific variables
  2. Create scripts directory and generate:
    • update-docs.sh (executable)
    • generate-index.sh (executable)
  3. Create commands directory and generate:
    • update-docs.md
  4. Add frontmatter to all downloaded files
  5. Generate initial index (SKILL-INDEX.md)
  6. Create README in references directory
  7. Set permissions (scripts executable)

Phase 3: Validation (NEW)

Verify generated infrastructure:

  1. ✅ All template files rendered correctly
  2. ✅ Scripts are executable (755 permissions)
  3. ✅ Frontmatter valid YAML in all files
  4. ✅ Index contains all files from manifest
  5. ✅ Update script can run successfully (dry-run)
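
Several of the validation checks above can be expressed directly in shell. The following is a sketch only: `validate_skill` is a hypothetical name, the messages are illustrative, and it covers the executable-bit, required-frontmatter-field, and unrendered-placeholder checks, not the dry run.

```bash
# Check the generated infrastructure of one skill directory.
# Prints one line per problem and returns non-zero if any check fails.
validate_skill() {
    local skill_dir="$1" status=0 script file field

    # Scripts must be executable
    for script in update-docs.sh generate-index.sh; do
        if [[ ! -x "$skill_dir/scripts/$script" ]]; then
            echo "not executable: scripts/$script"
            status=1
        fi
    done

    # Every downloaded doc needs the required frontmatter fields
    while IFS= read -r file; do
        if [[ "$(head -n 1 "$file")" != "---" ]]; then
            echo "missing frontmatter: ${file#"$skill_dir"/}"
            status=1
            continue
        fi
        for field in source title fetched; do
            # Look for "field:" at the start of a line inside the frontmatter
            if ! awk -v f="$field" '
                    NR > 1 && $0 == "---"  { exit }
                    index($0, f ":") == 1  { found = 1; exit }
                    END                    { exit !found }
                ' "$file"; then
                echo "missing $field: ${file#"$skill_dir"/}"
                status=1
            fi
        done
    done < <(find "$skill_dir/references" -maxdepth 1 -type f -name '*.md' \
                 ! -name 'SKILL-*' ! -name 'README*' | sort)

    # Rendered templates must not contain leftover {{PLACEHOLDER}} markers
    if grep -rlE '\{\{[A-Za-z_]+\}\}' "$skill_dir/scripts" "$skill_dir/commands" \
            > /dev/null 2>&1; then
        echo "unrendered template placeholder found"
        status=1
    fi

    return "$status"
}
```

Collecting every problem before returning, instead of stopping at the first one, lets the generator report everything that needs fixing in a single pass.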

Phase 4: User Notification (NEW)

✓ Skill created: example-docs
  Location: .claude/skills/example-docs

  Documentation:
    • 241 files downloaded
    • Frontmatter added to all files
    • Index generated: SKILL-INDEX.md

  Update infrastructure:
    • scripts/update-docs.sh
    • scripts/generate-index.sh
    • commands/update-docs.md
    • references/README.md

  Usage:
    Update docs: ./scripts/update-docs.sh
    Or use command: /example-docs:update-docs

Update Script Features

Step 1: Manifest Comparison

Functionality:

  • Download latest llms.txt
  • Compare with local copy using diff/comm
  • Identify added entries
  • Identify removed entries
  • Calculate change summary

Output:

→ Comparing manifests...

📊 Manifest changes detected:
   • Added entries: 5
   • Removed entries: 2
   • Total entries: 244 (was 241)
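
The counts in that summary can be derived with `comm` on the two manifests. This sketch compares link targets, not raw lines, on the assumption that an entry whose description was edited should not count as one removal plus one addition. The function names `manifest_urls` and `manifest_summary` are hypothetical.

```bash
# List the unique link targets in an llms.txt manifest, one per line, sorted
manifest_urls() {
    grep -oE '\(https?://[^)]+\)' "$1" | sed -E 's/^\(//; s/\)$//' | sort -u
}

# Print added/removed/total counts for an old and a new manifest
manifest_summary() {
    local old="$1" new="$2" added removed total_old total_new
    # comm -13 keeps lines only in the new list, comm -23 lines only in the old
    added=$(comm -13 <(manifest_urls "$old") <(manifest_urls "$new") | wc -l)
    removed=$(comm -23 <(manifest_urls "$old") <(manifest_urls "$new") | wc -l)
    total_old=$(manifest_urls "$old" | wc -l)
    total_new=$(manifest_urls "$new" | wc -l)
    # $(( )) strips the padding some wc implementations add
    echo "Added entries: $((added))"
    echo "Removed entries: $((removed))"
    echo "Total entries: $((total_new)) (was $((total_old)))"
}
```

Called as `manifest_summary references/llms.txt /path/to/new-llms.txt`, it prints the three lines that the update script would wrap in its own formatting.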

Step 2: File Operations

Download new files:

→ Downloading new files...
   • advanced-features.md ... ✓
   • api-reference-v2.md ... ✓
   • migration-guide.md ... ✓

Remove deleted files:

→ Removing deleted files...
   • deprecated-api-v1.md ✓ removed
   • legacy-config.md ✓ removed

Protection in action:

→ Removing deleted files...
   • SKILL-ARCHITECTURE.md ⊗ skipped (protected)
   • deprecated-api-v1.md ✓ removed
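
A removal loop that produces output like the above might look as follows. This is a sketch, not the generated script: `remove_deleted` is a hypothetical name, it reads the filenames to remove from stdin, and the rejection of names containing a slash is an added safety assumption.

```bash
# Remove docs that disappeared from the manifest, never touching SKILL-* files.
# Reads bare filenames (one per line) on stdin; $1 is the references directory.
remove_deleted() {
    local refs_dir="$1" filename
    while IFS= read -r filename; do
        # Ignore empty names and anything containing a path separator
        [[ -z "$filename" || "$filename" == */* ]] && continue

        # Protected files are reported but left in place
        if [[ "$filename" =~ ^SKILL- ]]; then
            echo "   • $filename ⊗ skipped (protected)"
            continue
        fi

        rm -f -- "$refs_dir/$filename"
        echo "   • $filename ✓ removed"
    done
}
```

For example, `printf '%s\n' SKILL-ARCHITECTURE.md deprecated-api-v1.md | remove_deleted references` would leave the SKILL file in place and delete only the second file.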

Step 3: Optional File-by-File Verification

Interactive mode:

STEP 2: File-by-File Verification (Optional)

Compare all individual files? (y/N) y

→ Comparing 241 files...

✓ [1/241] setup.md
✓ [2/241] installation.md
🔄 [3/241] configuration.md - CHANGED
   Lines different: 47
   Update this file? (y/N) y
   ✓ Updated
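
Because local files carry generated frontmatter, a direct `diff` against the upstream copy would always report a difference. One way to handle this, sketched below under the assumption that frontmatter is followed by a single blank line (as in the examples in this issue), is to strip the frontmatter before comparing. The function names `strip_frontmatter` and `lines_different` are hypothetical.

```bash
# Print a file's content with any leading frontmatter block, and the single
# blank line after it, removed so it can be compared with the upstream copy.
strip_frontmatter() {
    awk '
        NR == 1 && $0 == "---" { in_fm = 1; next }                  # opening ---
        in_fm && $0 == "---"   { in_fm = 0; skip_blank = 1; next }  # closing ---
        in_fm                  { next }                             # metadata line
        skip_blank             { skip_blank = 0; if ($0 == "") next }
        { print }
    ' "$1"
}

# Print how many lines differ between the local doc (minus frontmatter) and a
# freshly downloaded copy. Prints 0 when the two match.
lines_different() {
    local local_file="$1" remote_file="$2"
    # Count diff lines starting with < or >; grep -c exits 1 on a zero count,
    # so "|| true" keeps the function's exit status at 0.
    diff <(strip_frontmatter "$local_file") "$remote_file" | grep -c '^[<>]' || true
}
```

A verification loop could then download each `source` URL to a temporary file, call `lines_different`, and prompt only when the result is non-zero.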

Step 4: Summary Report

📦 Summary:
   • Files added: 5
   • Files removed: 2
   • Files updated: 3
   • Total documents: 244

✅ Update complete

Frontmatter Schema

Standard Frontmatter

All downloaded files:

---
source: https://docs.example.com/path/to/file.md
title: file-name
description: Human Readable Title (optional)
fetched: 2026-02-01T18:30:00.000Z
---

Extended Frontmatter (Optional)

For tracking updates:

---
source: https://docs.example.com/path/to/file.md
title: file-name
description: Human Readable Title
fetched: 2026-02-01T18:30:00.000Z
updated: 2026-02-15T10:45:00.000Z
updates: 2
checksum: abc123...
---

Fields:

  • source - Original URL (required)
  • title - Simplified filename (required)
  • description - Human-readable title (optional)
  • fetched - Initial download timestamp (required)
  • updated - Last update timestamp (optional)
  • updates - Number of times updated (optional)
  • checksum - Content hash for change detection (optional)
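
The `checksum` field is described only as a content hash for change detection. A minimal sketch of how it might be used, assuming the hash is SHA-256 of the upstream file as downloaded (before frontmatter is added) and that either `sha256sum` or `shasum` is installed. The function names are hypothetical.

```bash
# Print the SHA-256 hex digest of a file using whichever common tool exists
sha256_of() {
    if command -v sha256sum > /dev/null 2>&1; then
        sha256sum < "$1" | awk '{ print $1 }'
    else
        shasum -a 256 < "$1" | awk '{ print $1 }'
    fi
}

# Print the "checksum:" value from a file's leading frontmatter (empty if absent)
stored_checksum() {
    awk '
        NR == 1 && $0 != "---" { exit }   # file does not start with frontmatter
        NR > 1 && $0 == "---"  { exit }   # reached the end of the frontmatter
        /^checksum:[[:space:]]*/ { sub(/^checksum:[[:space:]]*/, ""); print; exit }
    ' "$1"
}

# Succeed (exit 0) if the upstream copy differs from what was recorded when
# the local file was downloaded. A missing checksum counts as changed.
upstream_changed() {
    local local_file="$1" remote_file="$2"
    [[ "$(stored_checksum "$local_file")" != "$(sha256_of "$remote_file")" ]]
}
```

Hashing the upstream bytes and not the local file means that refreshing `fetched` or `updates` in the frontmatter never changes the checksum.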

Benefits

User Experience

Before (Manual):

  • ❌ Users implement updates from scratch
  • ❌ Inconsistent patterns across skills
  • ❌ No standard commands
  • ❌ Documentation goes stale
  • ❌ Time-consuming maintenance

After (Auto-Generated):

  • ✅ Update infrastructure included by default
  • ✅ Consistent patterns across all llms.txt skills
  • ✅ Standard commands work everywhere
  • ✅ Documentation stays fresh
  • ✅ One-command updates

Developer Experience

Skill creators:

  • Don't write update scripts
  • Don't implement frontmatter logic
  • Don't protect SKILL-* files manually
  • Focus on skill content, not infrastructure

Skill maintainers:

  • Simple update command
  • Clear status reporting
  • Safe operations (protection built-in)
  • Automated index regeneration

Consistency

Same structure everywhere:

All llms.txt skills have:
  ✓ scripts/update-docs.sh
  ✓ scripts/generate-index.sh
  ✓ commands/update-docs.md
  ✓ references/README.md
  ✓ Frontmatter in all files
  ✓ SKILL-* protection

Same commands work:

# Works for ANY llms.txt skill
cd .claude/skills/{any-skill}
./scripts/update-docs.sh

Success Criteria

Functional

Auto-generation works:

  • Update scripts created automatically
  • Templates render correctly
  • Scripts are executable
  • Protection logic included

Updates work:

  • Can detect manifest changes
  • Downloads new files correctly
  • Removes deleted files safely
  • Preserves SKILL-* files

Frontmatter works:

  • Added to all downloaded files
  • Valid YAML format
  • Includes required fields
  • Updates on re-download

Quality

Code quality:

  • Scripts well-commented
  • Error handling included
  • Edge cases covered
  • Platform-compatible (macOS, Linux)

Documentation quality:

  • Clear usage instructions
  • Examples provided
  • Troubleshooting guide
  • Migration path documented

User Experience

Ease of use:

  • No manual scripting required
  • Simple update command
  • Clear progress reporting
  • Helpful error messages

Consistency:

  • Same structure across skills
  • Predictable behavior
  • Standard commands

Dependencies

Required Tools

System dependencies:

  • bash (4.0+; note that macOS ships bash 3.2 by default)
  • curl or wget
  • diff / comm
  • awk / sed
  • find / grep

Checked at generation:

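# Verify dependencies before generating
for cmd in bash diff comm awk sed find grep; do
    if ! command -v "$cmd" > /dev/null 2>&1; then
        echo "❌ Required: $cmd" >&2
        exit 1
    fi
done

# Either downloader is acceptable
if ! command -v curl > /dev/null 2>&1 && ! command -v wget > /dev/null 2>&1; then
    echo "❌ Required: curl or wget" >&2
    exit 1
fi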

Real-World Reference

Proof of Concept

openclaw-docs skill demonstrates this pattern:

Files created (manually, should be auto-generated):

.claude/skills/openclaw-docs/
├── scripts/
│   ├── update-openclaw-docs.sh (9.2KB)
│   └── generate-skill-index.sh (1.5KB)
├── commands/
│   └── update-docs.md (1.8KB)
└── references/
    ├── README-SKILL-FILES.md
    └── SKILL-INDEX.md (241 docs)

Functionality demonstrated:

  • ✅ Manifest comparison
  • ✅ File download with frontmatter
  • ✅ SKILL-* file protection
  • ✅ Index generation
  • ✅ Optional file-by-file verification
  • ✅ Clear progress reporting

This should be the default for ALL llms.txt skills.


Alternatives Considered

1. External Update Tool

Idea: Separate CLI tool for updates

Rejected:

  • ❌ Extra dependency to install
  • ❌ Not "batteries included"
  • ❌ Requires external maintenance
  • ❌ Skill not self-contained

2. Manual Process Only

Idea: Document how to update, let users implement

Rejected:

  • ❌ Users reinvent the wheel
  • ❌ Inconsistent implementations
  • ❌ Error-prone
  • ❌ Poor developer experience

3. Cloud-Based Updates

Idea: Skills check cloud service for updates

Rejected:

  • ❌ Requires network/cloud dependency
  • ❌ Privacy concerns
  • ❌ No offline capability
  • ❌ Centralization issues

4. Git-Based Updates

Idea: Use git submodules or sparse checkout

Rejected:

  • ❌ Requires git knowledge
  • ❌ llms.txt sources aren't always git repos
  • ❌ Complex for simple use case
  • ❌ Frontmatter wouldn't be preserved

The chosen approach (auto-generated scripts) is the simplest and most reliable of these options.


Submitted: 2026-02-01
Category: Feature Enhancement, Developer Experience
Complexity: Medium (template system + generation logic)
Impact: High (affects all llms.txt skills, major UX improvement)


Labels: feature, plugin:llmstxt, priority:high, triage
