[Feedback]: llmstxt-to-skill generates skeletal skills requiring hours of manual work

## Feedback Category
Developer Experience

## Summary
The `llmstxt-to-skill` plugin generates skeletal skills that need 2-3 hours of manual work to become production-ready. It should also generate real content, update scripts, and maintenance tooling automatically.

## Current State

**What's Generated:**
```
skills/openclaw/
├── SKILL.md              # Lists 241 references, no content
└── references/
    ├── llms.txt          # Manifest
    └── *.md              # 241 downloaded docs
```

**SKILL.md example:**
```markdown
---
name: openclaw
description: OpenClaw documentation and reference. Use when asking about this documentation.
---

# OpenClaw

This skill provides access to OpenClaw documentation with 241 reference documents.

## Reference Documents
- [auth-monitoring](references/auth-monitoring.md): Auth Monitoring
- [cron-jobs](references/cron-jobs.md): Cron Jobs
[... 239 more lines ...]
```

## Problems

❌ **Can't answer questions** - No actual content, just file lists
❌ **Generic triggers** - "OpenClaw documentation" won't trigger on real queries
❌ **No updates** - No mechanism to sync with upstream
❌ **Wrong naming** - Should be "openclaw-docs"
❌ **No tooling** - No scripts, validation, or maintenance commands

## User Impact

After running `/llmstxt-to-skill https://docs.openclaw.ai/llms.txt`, I spent 2-3 hours on:
1. Extracting key content from 241 docs → SKILL.md
2. Writing update scripts from scratch (9.2KB)
3. Building index generators (1.5KB)
4. Adding frontmatter to track staleness
5. Creating protection logic for SKILL-* files
6. Documenting update process

**All of this should be automatic.**

---

## 8 Required Improvements

### 1. Generate Actual Content (Not Lists)

**Current:**
```markdown
## Reference Documents
- [file1.md](references/file1.md): Title 1
- [file2.md](references/file2.md): Title 2
```

**Should Generate:**
````markdown
## Installation (macOS)

Three methods:

1. **Curl (Recommended):**
   ```bash
   curl -fsSL https://openclaw.ai/install.sh | bash
   openclaw onboard --install-daemon
   ```

2. **npm:**
   ```bash
   npm install -g openclaw@latest
   ```

3. **From source:**
   ```bash
   git clone && pnpm install && pnpm build
   ```

For detailed troubleshooting: [SKILL-INSTALLATION.md](references/SKILL-INSTALLATION.md)

## Core Concepts

- **Gateway**: WebSocket server managing channels (WhatsApp, Telegram, Discord)
- **Agent**: Isolated AI instance with workspace and sessions
- **Workspace**: `~/.openclaw/workspace` with AGENTS.md, SOUL.md, memory/
- **Skills**: Extensible tools loaded from workspace/skills/

[... 800 more words of useful content ...]
````

**Target:** 1,200-2,000 words of actionable content extracted from key docs.

**Implementation approach:**
```typescript
async function generateSkillContent(llmstxt: ParsedLlmsTxt): Promise<string> {
  // 1. Identify key docs
  const keyDocs = {
    installation: findDocs(llmstxt, ['setup', 'install', 'getting-started']),
    architecture: findDocs(llmstxt, ['architecture', 'overview', 'concepts']),
    configuration: findDocs(llmstxt, ['config', 'configuration']),
  };

  // 2. Fetch and summarize
  const sections = await Promise.all([
    summarizeInstallation(keyDocs.installation),
    summarizeArchitecture(keyDocs.architecture),
    summarizeConfiguration(keyDocs.configuration),
  ]);

  // 3. Build SKILL.md with content
  return buildSkillMd({
    quickStart: sections[0],
    coreConcepts: sections[1],
    configuration: sections[2],
    references: organizeReferences(llmstxt),
  });
}
```
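The approach above leaves `findDocs` and `ParsedLlmsTxt` undefined. A minimal sketch of both follows, assuming the manifest uses markdown link lines of the form `- [Title](https://...)`. The `LlmsTxtEntry` shape and the `parseLlmsTxt` helper are hypothetical names introduced here for illustration; the plugin's real types may differ.

```typescript
// Hypothetical entry shape; not taken from the plugin's source.
interface LlmsTxtEntry {
  title: string;
  url: string;
}

interface ParsedLlmsTxt {
  entries: LlmsTxtEntry[];
}

// Parse manifest lines of the form "- [Title](https://...)".
// Headings, blank lines, and prose are ignored.
function parseLlmsTxt(text: string): ParsedLlmsTxt {
  const linkLine = /^[-*]\s+\[([^\]]+)\]\(([^)\s]+)\)/;
  const entries: LlmsTxtEntry[] = [];
  for (const rawLine of text.split("\n")) {
    const match = linkLine.exec(rawLine.trim());
    if (!match) continue;
    const title = match[1];
    const url = match[2];
    if (!title || !url) continue;
    entries.push({ title, url });
  }
  return { entries };
}

// Return entries whose title or URL contains any keyword (case-insensitive).
function findDocs(llmstxt: ParsedLlmsTxt, keywords: string[]): LlmsTxtEntry[] {
  const needles = keywords.map((kw) => kw.toLowerCase());
  return llmstxt.entries.filter((entry) => {
    const haystack = `${entry.title} ${entry.url}`.toLowerCase();
    return needles.some((kw) => haystack.includes(kw));
  });
}
```

Substring matching is deliberately loose, so `'config'` also matches `configuration`. A real implementation would probably rank matches rather than return all of them.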

---

### 2. Concrete Trigger Terms (Not Meta-Descriptions)

**Current:**
```yaml
description: OpenClaw documentation and reference. Use when asking about this documentation.
```

**Should Be:**
```yaml
description: OpenClaw personal assistant gateway WebSocket architecture agent runtime workspace configuration memory system skills loading AGENTS.md SOUL.md USER.md IDENTITY.md BOOT.md session management channel routing macOS setup permissions TCC iMessage WhatsApp Telegram Discord Signal Slack CLI commands hooks system browser automation vector search pairing authentication
```

**Rule:** Extract concrete terms from doc titles, skip meta-words like "documentation", "reference", "guide".

```typescript
function generateTriggerDescription(llmstxt: ParsedLlmsTxt): string {
  const terms = llmstxt.entries
    .flatMap(entry => extractKeywords(entry.title))
    .filter(term => !isMetaWord(term)); // Skip "documentation", "guide", etc.

  // Drop repeated terms so the 80-term budget is not wasted on duplicates
  const uniqueTerms = Array.from(new Set(terms));

  return uniqueTerms.slice(0, 80).join(' ');
}

function isMetaWord(word: string): boolean {
  const meta = ['documentation', 'reference', 'guide', 'docs', 'manual'];
  return meta.includes(word.toLowerCase());
}
```
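`extractKeywords` is not defined above. One possible sketch, assuming titles are short English phrases: split on anything that is not a letter, digit, or dot (so file names such as `AGENTS.md` survive), then drop one-character tokens and common stop words. The stop-word list is an illustrative assumption, not something the plugin ships.

```typescript
// Illustrative stop-word list (an assumption; tune for real titles).
const STOP_WORDS = new Set([
  "a", "an", "and", "the", "of", "to", "for", "in", "on", "with",
]);

// Split a doc title into candidate trigger terms.
function extractKeywords(title: string): string[] {
  return title
    .split(/[^A-Za-z0-9.]+/) // keep dots so "AGENTS.md" stays one token
    .map((token) => token.replace(/^\.+|\.+$/g, "")) // trim stray leading/trailing dots
    .filter((token) => token.length > 1 && !STOP_WORDS.has(token.toLowerCase()));
}
```

For example, `extractKeywords("Setup of the AGENTS.md file.")` yields `["Setup", "AGENTS.md", "file"]`.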

---

### 3. Auto-Generate Update Scripts

**Should Create:** `scripts/update-docs.sh`

```bash
#!/bin/bash
# Auto-generated by llmstxt-to-skill
# Source: https://docs.openclaw.ai/llms.txt

REFS_DIR="references"
LLMS_URL="https://docs.openclaw.ai/llms.txt"

# Download fresh manifest
curl -fsSL "$LLMS_URL" -o /tmp/llms-new.txt

# Compare manifests
if diff "$REFS_DIR/llms.txt" /tmp/llms-new.txt > /dev/null; then
    echo "✓ No changes"
    exit 0
fi

# Process added entries
comm -13 <(sort "$REFS_DIR/llms.txt") <(sort /tmp/llms-new.txt) | \
while IFS= read -r line; do
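    # Extract the URL from a markdown link line: "- [Title](https://...)"
    url=$(printf '%s\n' "$line" | sed -nE 's|.*\((https://[^)]+)\).*|\1|p')
    [ -z "$url" ] && continue   # Not a link line
    filename=$(basename "$url")

    # Skip SKILL-* files (never overwrite)
    [[ "$filename" =~ ^SKILL- ]] && continue

    # Download with metadata (helper to be emitted alongside this script)
    download_with_metadata "$url" "$REFS_DIR/$filename"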
    echo "  ✓ Downloaded: $filename"
done

# Update manifest
mv /tmp/llms-new.txt "$REFS_DIR/llms.txt"
echo "✓ Updated manifest"
```

**Critical:** Protect SKILL-* files from overwrites.
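If the plugin computes the update plan itself rather than in shell, the same logic might look like the sketch below. `planUpdate`, `UpdatePlan`, and the helper names are hypothetical. Unlike the script above, this sketch also reports entries that disappeared upstream, which is an extension and not part of the original proposal.

```typescript
interface UpdatePlan {
  download: string[]; // new URLs that are safe to fetch
  skipped: string[];  // new URLs whose file name is protected (SKILL-*)
  removed: string[];  // URLs present locally but gone upstream
}

// Collect every https URL that appears inside markdown link parentheses.
function extractUrls(manifest: string): string[] {
  const pattern = /\((https:\/\/[^)\s]+)\)/g;
  const urls: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(manifest)) !== null) {
    if (match[1]) urls.push(match[1]);
  }
  return urls;
}

// Last path segment of a URL, ignoring any query string or fragment.
function fileNameOf(url: string): string {
  const path = url.split(/[?#]/)[0] ?? url;
  return path.substring(path.lastIndexOf("/") + 1);
}

// Compare the old and new manifests and decide what to do with each URL.
function planUpdate(oldManifest: string, newManifest: string): UpdatePlan {
  const oldUrls = extractUrls(oldManifest);
  const newUrls = extractUrls(newManifest);
  const oldSet = new Set(oldUrls);
  const newSet = new Set(newUrls);
  const plan: UpdatePlan = { download: [], skipped: [], removed: [] };

  for (const url of newUrls) {
    if (oldSet.has(url)) continue; // already downloaded
    if (fileNameOf(url).startsWith("SKILL-")) {
      plan.skipped.push(url); // never overwrite SKILL-* files
    } else {
      plan.download.push(url);
    }
  }
  plan.removed = oldUrls.filter((url) => !newSet.has(url));
  return plan;
}
```

Keeping the plan separate from the downloads makes a `--check` mode cheap: print the plan and exit without fetching anything.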

---

### 4. Fix Naming (Add "-docs" Suffix)

**Current:** `https://docs.openclaw.ai/llms.txt` → skill name: `openclaw`

**Should Be:** `https://docs.openclaw.ai/llms.txt` → skill name: `openclaw-docs`

```typescript
function generateSkillName(llmsTxtUrl: string): string {
  const url = new URL(llmsTxtUrl);
  const hostname = url.hostname;
  const pathname = url.pathname;

  const baseName = extractBaseName(hostname);

  // Detect documentation patterns
  const isDocsSite =
    hostname.startsWith('docs.') ||
    hostname.startsWith('developer.') ||
    hostname.includes('.readthedocs.io') ||
    hostname.includes('.github.io') ||
    pathname.includes('/docs/') ||
    pathname.includes('/documentation/');

  return isDocsSite ? `${baseName}-docs` : baseName;
}
```
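`extractBaseName` is left undefined above. A rough sketch, under the assumption that the project name is the label just before the top-level domain, or the subdomain on hosted doc platforms. It does not handle multi-part suffixes such as `.co.uk`, and for `*.github.io` the project name often lives in the path rather than the host, so treat it as a starting point only. `slugify` is a hypothetical helper.

```typescript
// Hosted doc platforms where the project name is the subdomain.
const HOSTED_SUFFIXES = [".readthedocs.io", ".github.io"];

// Lowercase, replace runs of other characters with "-", trim dashes.
function slugify(value: string): string {
  return value
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function extractBaseName(hostname: string): string {
  const host = hostname.toLowerCase();

  // "myproject.readthedocs.io" -> "myproject"
  for (const suffix of HOSTED_SUFFIXES) {
    if (host.endsWith(suffix)) {
      return slugify(host.slice(0, -suffix.length));
    }
  }

  // "docs.openclaw.ai" -> "openclaw" (label just before the TLD)
  const labels = host.split(".");
  const base = labels.length >= 2 ? labels[labels.length - 2] : labels[0];
  return slugify(base ?? "");
}
```

With this helper, `docs.openclaw.ai` maps to the base name `openclaw`, which `generateSkillName` then turns into `openclaw-docs`.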

---

### 5. Auto-Compact Large Skills

If SKILL.md exceeds 2,500 words, prompt the user:

```
⚠️  SKILL.md is 4,231 words (recommended: <2,500)
Auto-compact for better performance? (y/N)
```

If yes:
- Extract sections >400 words into `SKILL-*.md` files
- Replace with summaries + links
- Result: SKILL.md = 1,200 words, extracted 4 files
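The extraction step could be sketched as follows, assuming sections are delimited by `## ` headings and that a link stands in for the summary (generating the summary itself is out of scope here). `compactSkill` and the `SKILL-<HEADING>.md` naming are hypothetical. The fence tracking is simplistic and would be confused by nested fences of different lengths.

```typescript
interface Section {
  heading: string;
  body: string;
}

interface CompactionResult {
  skillMd: string;
  extracted: { fileName: string; content: string }[];
}

function countWords(text: string): number {
  const words = text.trim().split(/\s+/);
  return words[0] === "" ? 0 : words.length;
}

// Split on "## " headings, ignoring headings inside fenced code blocks.
function splitSections(markdown: string): { preamble: string; sections: Section[] } {
  const preamble: string[] = [];
  const sections: Section[] = [];
  let current: { heading: string; lines: string[] } | null = null;
  let inFence = false;
  for (const line of markdown.split("\n")) {
    if (/^\s*(```|~~~)/.test(line)) inFence = !inFence;
    if (!inFence && line.startsWith("## ")) {
      if (current) sections.push({ heading: current.heading, body: current.lines.join("\n") });
      current = { heading: line.slice(3).trim(), lines: [] };
    } else if (current) {
      current.lines.push(line);
    } else {
      preamble.push(line);
    }
  }
  if (current) sections.push({ heading: current.heading, body: current.lines.join("\n") });
  return { preamble: preamble.join("\n"), sections };
}

// Move sections longer than maxSectionWords into SKILL-*.md files.
function compactSkill(markdown: string, maxSectionWords = 400): CompactionResult {
  const { preamble, sections } = splitSections(markdown);
  const extracted: CompactionResult["extracted"] = [];
  const parts: string[] = [preamble];
  for (const section of sections) {
    if (countWords(section.body) > maxSectionWords) {
      const slug = section.heading.toUpperCase().replace(/[^A-Z0-9]+/g, "-").replace(/^-+|-+$/g, "");
      const fileName = `SKILL-${slug}.md`;
      extracted.push({ fileName, content: `# ${section.heading}\n${section.body}` });
      parts.push(`## ${section.heading}\n\nSee [${fileName}](references/${fileName}).\n`);
    } else {
      parts.push(`## ${section.heading}\n${section.body}`);
    }
  }
  return { skillMd: parts.join("\n"), extracted };
}
```

Short sections pass through unchanged, so running the compaction twice on its own output extracts nothing new.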

---

### 6. Generate Usage Documentation

**Create:** `references/README.md`

````markdown
# OpenClaw Docs Skill

Generated from: https://docs.openclaw.ai/llms.txt

## Directory Structure

- `SKILL.md` - Main file (auto-loads)
- `scripts/update-docs.sh` - Update from source
- `references/SKILL-*.md` - Extracted sections (manual edit)
- `references/*.md` - Downloaded docs (auto-update)

## Updating Documentation

```bash
./scripts/update-docs.sh
```

## File Types

1. **SKILL-*.md** - Never auto-updated (skill-generated)
2. **\*.md** - Auto-updatable from source

## Maintenance

Check for updates: `./scripts/update-docs.sh --check`
````

---

### 7. Organize References by Topic (Not Alphabetically)

**Current:**
```markdown
## Reference Documents (Alphabetical)
- [agent.md](references/agent.md)
- [architecture.md](references/architecture.md)
- [auth.md](references/auth.md)
```

**Should Be:**
```markdown
## Reference Documentation

### Getting Started
- [SKILL-INSTALLATION.md](references/SKILL-INSTALLATION.md) - Complete walkthrough
- [setup.md](references/setup.md) - Initial setup
- [onboarding.md](references/onboarding.md) - Onboarding wizard

### Architecture
- [SKILL-ARCHITECTURE.md](references/SKILL-ARCHITECTURE.md) - System overview
- [gateway.md](references/gateway.md) - Gateway details
- [agent.md](references/agent.md) - Agent system

### Channels
- [whatsapp.md](references/whatsapp.md) - WhatsApp integration
- [telegram.md](references/telegram.md) - Telegram setup
- [discord.md](references/discord.md) - Discord integration
```

```typescript
function organizeReferences(llmstxt: ParsedLlmsTxt): ReferencesByCategory {
  const categories = {
    'Getting Started': ['setup', 'install', 'getting-started', 'onboard'],
    'Architecture': ['architecture', 'gateway', 'agent', 'session'],
    'Channels': ['whatsapp', 'telegram', 'discord', 'slack'],
    'Configuration': ['config', 'environment', 'settings'],
  };

  return categorizeEntries(llmstxt.entries, categories);
}
```
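`categorizeEntries` is not shown above. A minimal sketch, assuming the first category whose keyword appears in the title or URL wins, and that unmatched entries fall into an `Other` bucket. The fallback bucket and the `LlmsTxtEntry` shape are assumptions made for illustration.

```typescript
// Hypothetical entry shape; not taken from the plugin's source.
interface LlmsTxtEntry {
  title: string;
  url: string;
}

type CategoryKeywords = Record<string, string[]>;
type ReferencesByCategory = Record<string, LlmsTxtEntry[]>;

// Assign each entry to the first category (in declaration order)
// that has a keyword contained in the entry's title or URL.
function categorizeEntries(
  entries: LlmsTxtEntry[],
  categories: CategoryKeywords,
  fallback = "Other",
): ReferencesByCategory {
  const names = Object.keys(categories);
  const result: ReferencesByCategory = {};
  for (const entry of entries) {
    const haystack = `${entry.title} ${entry.url}`.toLowerCase();
    const matched = names.find((name) =>
      (categories[name] ?? []).some((kw) => haystack.includes(kw)),
    );
    const bucket = matched ?? fallback;
    const list = result[bucket] ?? [];
    list.push(entry);
    result[bucket] = list;
  }
  return result;
}
```

Because matching is by substring, a short keyword such as `agent` also matches `user-agent`. Ordering the categories from most to least specific reduces such misfiles.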

---

### 8. Add Validation Tooling

**Create:** `scripts/validate-skill.sh`

```bash
#!/bin/bash

echo "→ Validating skill..."

# Check SKILL.md size
words=$(wc -w < SKILL.md)
if [ "$words" -gt 2500 ]; then
    echo "⚠️  SKILL.md is $words words (consider compaction)"
fi

# Check frontmatter
if ! grep -q "^name:" SKILL.md; then
    echo "❌ Missing frontmatter"
    exit 1
fi

# Check reference count
refs=$(find references -name "*.md" | wc -l)
manifest=$(grep -c "^- \[" references/llms.txt)
echo "✓ References: $refs files ($manifest in manifest)"

# Check stale docs (>30 days old)
stale=$(find references -name "*.md" ! -name "SKILL-*" -mtime +30 | wc -l)
[ "$stale" -gt 0 ] && echo "⚠️  $stale docs >30 days old"

echo "✓ Skill health: OK"
```

---

## Implementation Phases

### Phase 1: Critical (Do First)
1. ✅ Content generation - Extract key docs into SKILL.md
2. ✅ Concrete triggers - Build from topic keywords
3. ✅ Fix naming - Add "-docs" suffix

### Phase 2: Essential (Next)
4. ✅ Update scripts - Generate automatically
5. ✅ Usage docs - Create README
6. ✅ Organize refs - Group by topic

### Phase 3: Nice-to-Have
7. ✅ Auto-compact - Prompt if >2500 words
8. ✅ Validation - Generate health check script

---

## Expected Impact

**Before:**
- Skill generated: 30 seconds
- Manual work: 2-3 hours
- No update mechanism
- Stale over time

**After:**
- Skill generated: 45 seconds (complete)
- Manual work: 10 minutes (over 90% reduction)
- One-command updates
- Stays fresh

---

## Testing Checklist

Test with:
- [ ] Small site (~20 pages)
- [ ] Medium site (~50 pages)  
- [ ] Large site (>200 pages)

Verify:
- [ ] SKILL.md has content (not lists)
- [ ] Trigger uses concrete terms
- [ ] Name has "-docs" suffix (if docs site)
- [ ] Update scripts generated & executable
- [ ] Large skills auto-compact or prompt
- [ ] References organized by topic
- [ ] README and validation present

---

## Questions for Developers

1. **LLM summarization?** Use LLM to extract content or parse/template?
2. **Compaction threshold?** Auto at 2500 words or only prompt?
3. **Update frequency?** Should scripts check periodically (cron)?
4. **Version tracking?** Track llms.txt hash for smarter updates?
5. **Multi-source?** Support combining multiple llms.txt files?

---

**Reported:** 2026-02-01
**Plugin:** llmstxt v1.0.0
**Reference:** OpenClaw docs skill (manual implementation)

