Skip to content

Latest commit

 

History

History
1259 lines (1014 loc) · 38.5 KB

File metadata and controls

1259 lines (1014 loc) · 38.5 KB

Product Requirements Document (PRD)

Smart Network Automation and Troubleshooting Tool (SNATT)


Document Control

Item Details
Project Name Smart Network Automation and Troubleshooting Tool (SNATT)
Version 1.0
Date October 5, 2025
Status Draft
Owner Network Automation Team
Document Type Product Requirements Document

1. Executive Summary

1.1 Product Vision

SNATT is a Python-based GUI application that revolutionizes network management by automating discovery, monitoring, and troubleshooting of enterprise network devices. It eliminates repetitive CLI-based operations through an intelligent, user-friendly dashboard interface.

1.2 Business Objective

  • Reduce network troubleshooting time by 70%
  • Minimize human error in network operations
  • Centralize network device management
  • Enable non-expert staff to perform basic network diagnostics
  • Improve network documentation and configuration backup processes

1.3 Target Users

  • Primary: Network Engineers and Network Administrators
  • Secondary: IT Support Teams, NOC Operators
  • Tertiary: Students and Network Learners

1.4 Success Metrics

  • Time to troubleshoot common issues reduced from 30 min to <5 min
  • 95% accuracy in device discovery
  • Support for 100+ concurrent device connections
  • Report generation in <10 seconds
  • User satisfaction score >4.5/5

2. Product Overview

2.1 Problem Statement

Network engineers spend 60-80% of their time on repetitive tasks:

  • Manually connecting to devices via CLI
  • Running the same diagnostic commands repeatedly
  • Tracking device configurations manually
  • Creating reports from CLI outputs
  • Troubleshooting connectivity issues one device at a time

Impact: High operational costs, slow response times, increased human error, poor documentation.

2.2 Solution

SNATT provides an automated, GUI-driven platform that:

  • Auto-discovers network devices across subnets
  • Executes troubleshooting workflows automatically
  • Provides real-time health monitoring
  • Generates professional reports with one click
  • Maintains configuration backups with version control

2.3 Product Positioning

SNATT is NOT a replacement for enterprise monitoring tools like SolarWinds or PRTG.

SNATT IS a lightweight, automation-first troubleshooting assistant for engineers who need quick insights and automated diagnostics without complex enterprise deployments.


3. User Personas

Persona 1: Senior Network Engineer (Sarah)

  • Experience: 8+ years in networking
  • Goals: Automate routine tasks, quickly identify issues, maintain compliance
  • Pain Points: Spends hours on repetitive CLI commands, manual report creation
  • SNATT Usage: Uses automation features for bulk diagnostics and reporting

Persona 2: Junior NOC Operator (Mike)

  • Experience: 1-2 years in IT support
  • Goals: Learn networking, respond to alerts efficiently
  • Pain Points: Struggles with CLI, needs guidance on troubleshooting steps
  • SNATT Usage: Uses GUI dashboard for guided troubleshooting workflows

Persona 3: Network Student (Priya)

  • Experience: Learning networking fundamentals
  • Goals: Practice automation, understand device behavior
  • Pain Points: Limited access to physical equipment, needs hands-on experience
  • SNATT Usage: Uses with lab environments (GNS3, EVE-NG) to learn automation

4. Functional Requirements

4.1 Network Discovery Module

FR-1.1: Subnet Scanning

Priority: P0 (Critical)

  • Description: System shall scan user-specified IP ranges/subnets to identify active devices
  • Input: IP range (e.g., 192.168.1.0/24) or custom range (192.168.1.1-192.168.1.50)
  • Process:
    • ICMP ping sweep for host discovery
    • Port scanning (22, 23, 161) to identify network devices
    • Optional ARP table analysis
  • Output: List of responsive IP addresses
  • Performance: Scan 254 hosts in <30 seconds

FR-1.2: Device Type Identification

Priority: P0 (Critical)

  • Description: System shall identify device vendor and type
  • Methods:
    • SNMP sysDescr polling
    • SSH banner analysis
    • MAC address OUI lookup
  • Output: Device vendor (Cisco, Juniper, HP, etc.), model if available
  • Supported Vendors: Cisco, Juniper, HP/Aruba, Huawei, MikroTik

FR-1.3: Device Information Display

Priority: P0 (Critical)

  • Description: Display discovered devices in sortable table
  • Columns: IP Address, Hostname, Vendor, Device Type, Status, Last Seen
  • Features:
    • Sort by any column
    • Filter by vendor/status
    • Export list to CSV
    • Save/Load device inventory

4.2 Connection Management Module

FR-2.1: Credential Management

Priority: P0 (Critical)

  • Description: Secure storage and retrieval of device credentials
  • Features:
    • Support for multiple credential sets
    • Encrypted storage (using keyring library)
    • Credential templates per vendor
    • SSH key-based authentication support
  • Security: Passwords encrypted at rest, never logged in plaintext

FR-2.2: Multi-Device Connection

Priority: P0 (Critical)

  • Description: Establish SSH connections to multiple devices concurrently
  • Protocols: SSH (primary), Telnet (optional, with warning)
  • Features:
    • Connection pooling for efficiency
    • Automatic retry on transient failures (3 attempts)
    • Timeout configuration (default 10 seconds)
    • Connection status indicators (Connected/Failed/Timeout)
  • Performance: Handle 50+ concurrent connections

FR-2.3: API Integration

Priority: P1 (High)

  • Description: Support for REST API connections for modern devices
  • Supported APIs:
    • Cisco DNA Center
    • Meraki Dashboard API
    • Generic REST API endpoints
  • Features: API key management, rate limiting

4.3 Troubleshooting & Diagnostics Module

FR-3.1: Standard Diagnostic Workflows

Priority: P0 (Critical)

  • Description: Execute predefined troubleshooting command sequences
  • Workflow Examples:
    • Interface Health Check:
      • show ip interface brief
      • show interface status
      • Identify down/error-disabled ports
    • CPU/Memory Check:
      • show processes cpu sorted
      • show memory statistics
      • Alert if CPU >80% or memory >90%
    • Connectivity Check:
      • show ip route
      • show arp
      • ping <gateway>
    • Log Analysis:
      • show logging | include ERROR|CRITICAL
      • Parse and categorize errors

FR-3.2: Issue Categorization

Priority: P0 (Critical)

  • Description: Automatically categorize detected issues
  • Categories:
    • Critical: Device unreachable, >90% CPU, all interfaces down
    • Warning: Interface down, high CPU (>80%), authentication failures
    • Info: Informational logs, normal status
  • Visual Indicators:
    • Red icon/text for Critical
    • Yellow icon/text for Warning
    • Green icon/text for Normal

FR-3.3: Custom Command Execution

Priority: P1 (High)

  • Description: Allow users to define custom diagnostic commands
  • Features:
    • Command template library
    • Save custom workflows
    • Regex-based output parsing
    • Variable substitution (e.g., {interface_name})

FR-3.4: Results Display

Priority: P0 (Critical)

  • Description: Present diagnostic results in organized format
  • Views:
    • Summary dashboard with health indicators
    • Detailed per-device results
    • Raw command output (collapsible)
    • Issue highlighting with color coding

4.4 Configuration Backup Module

FR-4.1: Configuration Retrieval

Priority: P0 (Critical)

  • Description: Retrieve running and startup configurations
  • Commands by Vendor:
    • Cisco: show running-config, show startup-config
    • Juniper: show configuration
    • HP: display current-configuration
  • Storage: Plain text files with device hostname prefix

FR-4.2: Batch Backup

Priority: P0 (Critical)

  • Description: Backup configurations from multiple devices simultaneously
  • Features:
    • Select all/specific devices
    • Progress indicator
    • Error handling (continue on failure)
    • Summary report of successful/failed backups

FR-4.3: Version Control

Priority: P1 (High)

  • Description: Maintain configuration history with timestamps
  • Naming Convention: {hostname}_{YYYYMMDD_HHMMSS}.cfg
  • Features:
    • Automatic timestamping
    • Configuration diff viewer (compare versions)
    • Rollback capability (future enhancement)
  • Storage: Organized by device in backups/ directory

FR-4.4: Scheduled Backups

Priority: P2 (Medium)

  • Description: Automate recurring backups
  • Schedule Options: Daily, Weekly, Monthly
  • Features: Email notification on completion/failure

4.5 Reporting Module

FR-5.1: Report Generation

Priority: P0 (Critical)

  • Description: Generate comprehensive network health reports
  • Report Sections:
    • Executive Summary (health score, critical issues count)
    • Device Inventory (discovered devices table)
    • Health Status (per-device breakdown)
    • Interface Status (all interfaces, up/down counts)
    • Critical Issues (detailed list with timestamps)
    • Recommendations (automated suggestions)

FR-5.2: Export Formats

Priority: P0 (Critical)

  • Supported Formats:
    • Excel (.xlsx): Multi-sheet workbook with formatting
    • PDF: Professional layout with logo, charts
    • CSV: Raw data export
    • HTML: Web-viewable report
  • Features:
    • Auto-formatting (color coding, borders)
    • Charts (health distribution pie chart, uptime graphs)

FR-5.3: Report Customization

Priority: P2 (Medium)

  • Description: Allow users to customize report content
  • Options:
    • Select report sections to include
    • Add custom notes/comments
    • Company logo upload
    • Template selection

FR-5.4: Report History

Priority: P2 (Medium)

  • Description: Maintain archive of generated reports
  • Features:
    • Timestamped storage
    • Quick access to recent reports
    • Search by date/device

4.6 Graphical User Interface (GUI)

FR-6.1: Dashboard Layout

Priority: P0 (Critical)

  • Description: Modern, intuitive main dashboard
  • Layout Sections:
    • Top Menu Bar: File, Edit, Tools, Help
    • Side Navigation: Discovery, Diagnostics, Backup, Reports, Settings
    • Main Content Area: Context-sensitive view
    • Status Bar: Connection status, last action timestamp
  • Design: Clean, professional, high contrast

FR-6.2: Discovery Interface

Priority: P0 (Critical)

  • Components:
    • IP range input field with validation
    • "Scan Network" button (prominent)
    • Progress bar during scanning
    • Results table (sortable, filterable)
    • Device selection checkboxes
    • "Connect Selected" button

FR-6.3: Diagnostics Interface

Priority: P0 (Critical)

  • Components:
    • Device selection dropdown/multi-select
    • Workflow selection (predefined templates)
    • "Run Diagnostics" button
    • Real-time status updates
    • Results panel with color-coded issues
    • "Export Results" button

FR-6.4: Backup Interface

Priority: P0 (Critical)

  • Components:
    • Device selection (all/specific)
    • Backup type selection (running/startup/both)
    • "Backup Now" button
    • Progress indicator
    • Backup history table
    • "View Diff" button (compare configs)

FR-6.5: Reports Interface

Priority: P0 (Critical)

  • Components:
    • Report type selection
    • Date range picker
    • Device filter
    • Format selection (PDF/Excel/CSV)
    • "Generate Report" button
    • Report preview pane
    • "Export" button

FR-6.6: Settings Interface

Priority: P1 (High)

  • Sections:
    • Credentials Management (add/edit/delete)
    • Connection Settings (timeouts, retries)
    • Backup Settings (directory, retention)
    • Report Settings (logo, templates)
    • Appearance (theme, font size)

FR-6.7: Visual Feedback

Priority: P0 (Critical)

  • Features:
    • Loading spinners during operations
    • Toast notifications (success/error messages)
    • Color-coded status indicators
    • Tooltips on hover
    • Progress bars for long operations

4.7 Data Management

FR-7.1: Device Inventory Storage

Priority: P0 (Critical)

  • Format: JSON or SQLite database
  • Data Stored: IP, hostname, vendor, device type, credentials reference, last scan time
  • Features: Import/Export inventory

FR-7.2: Application Settings

Priority: P1 (High)

  • Storage: Configuration file (config.ini or JSON)
  • Settings: Default timeouts, backup paths, report templates

FR-7.3: Logging

Priority: P1 (High)

  • Description: Application activity logging for troubleshooting
  • Log Levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
  • Storage: Rotating log files (max 10MB, keep 5 files)
  • Content: User actions, connection attempts, errors, command executions

5. Non-Functional Requirements

5.1 Performance

NFR-1.1: Response Time

  • Network scan (254 hosts): <30 seconds
  • Device connection: <5 seconds per device
  • Command execution: <10 seconds per command
  • Report generation: <10 seconds
  • GUI responsiveness: <100ms for user interactions

NFR-1.2: Scalability

  • Support up to 500 devices in inventory
  • Handle 50 concurrent SSH connections
  • Process 1000+ log lines per second

NFR-1.3: Resource Usage

  • Memory usage: <500MB under normal load
  • CPU usage: <50% during scanning operations
  • Disk space: Configurable, default 1GB for backups

5.2 Security

NFR-2.1: Credential Security

  • Passwords encrypted using AES-256 or OS keyring
  • No plaintext credential storage
  • Secure credential transmission (SSH/TLS only)

NFR-2.2: Communication Security

  • SSH protocol for device connections
  • Certificate validation for API connections
  • No support for insecure protocols (Telnet disabled by default)

NFR-2.3: Audit Trail

  • Log all user actions with timestamps
  • Log all device connections and commands executed
  • Tamper-proof logging mechanism

NFR-2.4: Access Control

  • Password protection for application (optional)
  • Role-based features (future enhancement)

5.3 Reliability

NFR-3.1: Availability

  • Application uptime: 99.9% (excluding maintenance)
  • Graceful handling of device unavailability
  • No data loss on unexpected shutdown

NFR-3.2: Error Handling

  • Graceful degradation on partial failures
  • Clear error messages with resolution guidance
  • Automatic recovery from transient network errors

NFR-3.3: Data Integrity

  • Configuration backups verified (checksum)
  • Database transactions for inventory updates
  • Backup of application data before updates

5.4 Usability

NFR-4.1: Ease of Use

  • Intuitive interface requiring no training for basic operations
  • Tooltips and help text throughout UI
  • Guided workflows for common tasks

NFR-4.2: Accessibility

  • Keyboard shortcuts for all major functions
  • High contrast mode support
  • Resizable UI elements

NFR-4.3: Documentation

  • User manual with screenshots
  • Tooltips on all UI elements
  • Video tutorials (future)

5.5 Compatibility

NFR-5.1: Platform Support

  • Primary: Windows 10/11
  • Secondary: Linux (Ubuntu 20.04+, RHEL 8+)
  • Tertiary: macOS (10.15+)

NFR-5.2: Network Device Support

  • Tier 1 (Full Support): Cisco IOS, Cisco IOS-XE, Cisco NX-OS
  • Tier 2 (Tested): Juniper Junos, HP/Aruba, Huawei
  • Tier 3 (Community): MikroTik, Ubiquiti, Others

NFR-5.3: Python Version

  • Python 3.8 - 3.11 (primary support)
  • Python 3.12+ (testing)

5.6 Maintainability

NFR-6.1: Code Quality

  • PEP 8 compliance
  • Type hints throughout codebase
  • Unit test coverage >80%
  • Documentation strings for all public functions

NFR-6.2: Modularity

  • Clear separation of concerns (MVC/MVVM pattern)
  • Plugin architecture for device drivers (future)
  • Extensible command templates

NFR-6.3: Updates

  • Auto-update check on startup
  • One-click update mechanism (future)
  • Backward compatibility for data files

6. Technical Architecture

6.1 System Architecture

┌─────────────────────────────────────────────────────┐
│                   GUI Layer (View)                   │
│            CustomTkinter / PyQt5                     │
│  ┌──────────┬──────────┬──────────┬──────────┐    │
│  │Discovery │Diagnostic│ Backup   │ Reports   │    │
│  │  Panel   │  Panel   │  Panel   │  Panel    │    │
│  └──────────┴──────────┴──────────┴──────────┘    │
└────────────────────┬────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────┐
│            Application Layer (Controller)            │
│  ┌──────────────────────────────────────────────┐  │
│  │         Main Controller / Orchestrator        │  │
│  └──────────────────────────────────────────────┘  │
└─────┬──────────┬──────────┬──────────┬─────────────┘
      │          │          │          │
┌─────▼──┐  ┌───▼────┐ ┌───▼────┐ ┌──▼──────┐
│Discovery│  │Troubl- │ │Backup  │ │Reporting│
│ Engine │  │shooting │ │Manager │ │ Engine  │
│        │  │ Engine  │ │        │ │         │
└─────┬──┘  └───┬────┘ └───┬────┘ └──┬──────┘
      │         │          │         │
      └─────────┴──────────┴─────────┘
                     │
            ┌────────▼──────────┐
            │ Connection Manager │
            │   (Netmiko/NAPALM) │
            └────────┬───────────┘
                     │
        ┌────────────┴────────────┐
        │                         │
   ┌────▼─────┐           ┌──────▼────┐
   │  Data    │           │  External │
   │  Layer   │           │  Systems  │
   │(SQLite/  │           │ (Network  │
   │ JSON)    │           │ Devices)  │
   └──────────┘           └───────────┘

6.2 Module Breakdown

6.2.1 Discovery Engine

Responsibilities:

  • IP range parsing and validation
  • ICMP ping sweep
  • Port scanning
  • SNMP polling
  • Device identification

Key Classes:

  • NetworkScanner: Main scanner orchestrator
  • PingScanner: ICMP ping operations
  • SNMPDiscovery: SNMP-based discovery
  • DeviceIdentifier: Vendor/type detection

Dependencies: scapy, pysnmp, concurrent.futures


6.2.2 Connection Manager

Responsibilities:

  • Credential storage/retrieval
  • SSH connection establishment
  • Connection pooling
  • Error handling and retries
  • API client management

Key Classes:

  • ConnectionPool: Manages active connections
  • SSHConnector: SSH connection wrapper (Netmiko)
  • APIConnector: REST API client
  • CredentialManager: Secure credential handling

Dependencies: netmiko, paramiko, requests, keyring


6.2.3 Troubleshooting Engine

Responsibilities:

  • Command execution workflows
  • Output parsing
  • Issue detection and categorization
  • Result aggregation

Key Classes:

  • DiagnosticRunner: Workflow executor
  • CommandParser: Output parsing with regex
  • IssueClassifier: Categorizes problems
  • WorkflowTemplate: Command sequence definitions

Dependencies: netmiko, textfsm, re


6.2.4 Backup Manager

Responsibilities:

  • Configuration retrieval
  • File storage with timestamps
  • Configuration diffing
  • Backup scheduling

Key Classes:

  • ConfigBackup: Backup orchestrator
  • ConfigRetriever: Device-specific config commands
  • ConfigDiffer: Configuration comparison
  • BackupScheduler: Automated backup jobs

Dependencies: netmiko, difflib, apscheduler


6.2.5 Reporting Engine

Responsibilities:

  • Data aggregation
  • Report generation
  • Chart creation
  • Export to multiple formats

Key Classes:

  • ReportGenerator: Main report creator
  • ExcelExporter: Excel file generation
  • PDFExporter: PDF report creation
  • ChartBuilder: Graph/chart generation

Dependencies: pandas, openpyxl, reportlab, matplotlib


6.2.6 GUI Layer

Responsibilities:

  • User interface rendering
  • Event handling
  • Data binding
  • Visual feedback

Key Classes:

  • MainWindow: Application main window
  • DiscoveryPanel: Network discovery UI
  • DiagnosticPanel: Troubleshooting UI
  • BackupPanel: Backup management UI
  • ReportPanel: Report generation UI
  • SettingsPanel: Application settings

Dependencies: customtkinter or PyQt5


6.3 Data Models

Device Model

{
  "id": "uuid",
  "ip_address": "192.168.1.1",
  "hostname": "router-01",
  "vendor": "Cisco",
  "device_type": "router",
  "model": "ISR 4331",
  "os_version": "IOS 15.6",
  "credential_id": "cred_uuid",
  "last_seen": "2025-10-05T10:30:00Z",
  "status": "reachable",
  "tags": ["production", "edge"]
}

Diagnostic Result Model

{
  "id": "uuid",
  "device_id": "device_uuid",
  "timestamp": "2025-10-05T10:35:00Z",
  "workflow": "interface_health",
  "severity": "warning",
  "issues": [
    {
      "type": "interface_down",
      "interface": "GigabitEthernet0/0/1",
      "description": "Interface is administratively down"
    }
  ],
  "raw_output": "..."
}

Configuration Backup Model

{
  "id": "uuid",
  "device_id": "device_uuid",
  "timestamp": "2025-10-05T10:40:00Z",
  "config_type": "running-config",
  "file_path": "backups/router-01_20251005_104000.cfg",
  "size_bytes": 45620,
  "checksum": "sha256_hash"
}

6.4 Technology Stack

Layer Technology Purpose
GUI CustomTkinter Modern, themed UI framework
Alternative GUI PyQt5 Robust, feature-rich (if CustomTkinter insufficient)
Network Automation Netmiko Multi-vendor SSH automation
Network Automation NAPALM Unified API for network devices
SSH Paramiko Low-level SSH protocol
SNMP PySNMP SNMP polling and traps
Network Scanning Scapy Packet crafting and scanning
Data Processing Pandas Data manipulation and analysis
Excel Export OpenPyXL Excel file generation
PDF Export ReportLab PDF report creation
Charts Matplotlib Graph and chart generation
Database SQLite Lightweight embedded database
Scheduling APScheduler Background job scheduling
Credential Storage Keyring OS-level secure credential storage
Logging Python Logging Application logging
Testing Pytest Unit and integration testing
Code Quality Black, Flake8 Code formatting and linting

7. User Stories & Acceptance Criteria

Epic 1: Network Discovery

US-1.1: Scan Subnet for Devices

As a network engineer
I want to scan a subnet to discover active devices
So that I can quickly identify all network equipment without manual IP tracking

Acceptance Criteria:

  • User can enter IP range in CIDR notation (e.g., 192.168.1.0/24)
  • User can enter custom IP range (e.g., 192.168.1.1-192.168.1.50)
  • System validates IP input format
  • Scan completes in <30 seconds for /24 subnet
  • Results display: IP, hostname (if resolvable), vendor, status
  • User can export results to CSV

US-1.2: Identify Device Vendor

As a network engineer
I want the system to automatically identify device vendors
So that I can apply vendor-specific configurations and commands

Acceptance Criteria:

  • System attempts SNMP sysDescr query
  • System attempts SSH banner analysis
  • System performs MAC OUI lookup
  • Vendor displayed in results table
  • Accuracy >90% for Cisco, Juniper, HP

Epic 2: Troubleshooting

US-2.1: Run Interface Health Check

As a network engineer
I want to check interface status across multiple devices
So that I can quickly identify connectivity issues

Acceptance Criteria:

  • User selects one or more devices
  • User clicks "Run Interface Check"
  • System executes show ip interface brief or equivalent
  • Results show all interfaces with status (up/down/admin down)
  • Down interfaces highlighted in red
  • Summary shows: X interfaces down, Y interfaces up

US-2.2: Detect High CPU Usage

As a NOC operator
I want to be alerted to high CPU usage
So that I can investigate performance issues proactively

Acceptance Criteria:

  • System executes show processes cpu or equivalent
  • CPU usage parsed and displayed as percentage
  • Warning if CPU >80% (yellow indicator)
  • Critical if CPU >90% (red indicator)
  • Top 3 processes shown

Epic 3: Configuration Backup

US-3.1: Backup Single Device Configuration

As a network engineer
I want to backup a device configuration
So that I have a recovery point before making changes

Acceptance Criteria:

  • User selects device and clicks "Backup Now"
  • System retrieves running-config
  • Config saved as {hostname}_{timestamp}.cfg
  • Success notification displayed
  • File accessible in backup directory

US-3.2: Batch Backup Multiple Devices

As a network administrator
I want to backup all device configurations at once
So that I can maintain compliance and disaster recovery readiness

Acceptance Criteria:

  • User clicks "Backup All Devices"
  • Progress bar shows completion percentage
  • System continues on individual failures
  • Summary report: X successful, Y failed
  • Failed devices listed with error reasons

Epic 4: Reporting

US-4.1: Generate Network Health Report

As a network manager
I want to generate a professional network health report
So that I can share status with management and stakeholders

Acceptance Criteria:

  • User clicks "Generate Report"
  • Report includes: device inventory, health summary, critical issues
  • Report generated in <10 seconds
  • Available in Excel and PDF formats
  • Charts show health distribution (pie chart)
  • Professional formatting with headers/footers

Epic 5: User Interface

US-5.1: Navigate Application Easily

As a user
I want clear navigation between modules
So that I can efficiently perform different tasks

Acceptance Criteria:

  • Side navigation menu always visible
  • Active section highlighted
  • Clicking section loads corresponding panel
  • Breadcrumb trail for nested views

8. System Workflows

Workflow 1: Complete Discovery to Report

  1. User launches SNATT application
  2. User navigates to Discovery panel
  3. User enters subnet: 192.168.1.0/24
  4. User clicks "Scan Network"
  5. System performs ping sweep (progress shown)
  6. System identifies 15 devices
  7. Results displayed in table
  8. User selects all devices and clicks "Connect"
  9. System establishes SSH connections (credentials from settings)
  10. User navigates to Diagnostics panel
  11. User selects "Full Health Check" workflow
  12. User clicks "Run Diagnostics"
  13. System executes commands on all devices
  14. Results displayed with 2 critical issues, 3 warnings
  15. User navigates to Reports panel
  16. User clicks "Generate Report"
  17. System creates Excel report
  18. User clicks "Open Report"
  19. Excel opens with detailed findings

Total Time: ~5 minutes (vs. 60+ minutes manually)


Workflow 2: Scheduled Backup

  1. User navigates to Settings > Backup
  2. User enables "Scheduled Backups"
  3. User selects frequency: Daily at 2:00 AM
  4. User saves settings
  5. System schedules background job
  6. At 2:00 AM, system:
    • Connects to all devices in inventory
    • Retrieves running-config
    • Saves with timestamp
    • Generates summary report
    • Sends email notification (if configured)

9. Security & Compliance

9.1 Security Requirements

  • Authentication: Optional password protection for application
  • Encryption: AES-256 for stored credentials
  • Secure Communication: SSH/TLS only, no plaintext protocols
  • Audit Logging: All actions logged with user ID and timestamp
  • Input Validation: All user inputs validated and sanitized
  • No Hardcoded Credentials: All credentials user-provided or keyring-stored

9.2 Compliance Considerations

  • Data Privacy: No PII collected or stored
  • Configuration Data: Device configs may contain sensitive data—encrypted storage recommended
  • Access Control: Single-user application (multi-user future enhancement)

10. Testing Strategy

10.1 Unit Testing

  • Coverage Target: >80%
  • Framework: Pytest
  • Focus Areas:
    • Data parsing functions
    • Device identification logic
    • Report generation logic
    • Credential encryption/decryption

10.2 Integration Testing

  • Environment: GNS3/EVE-NG lab with virtual devices
  • Test Scenarios:
    • Discovery across multi-vendor environment
    • Concurrent connections to 20+ devices
    • Backup retrieval and storage
    • Report generation with large datasets

10.3 User Acceptance Testing (UAT)

  • Participants: 3-5 network engineers
  • Duration: 2 weeks
  • Scenarios:
    • Daily troubleshooting workflows
    • Emergency issue detection
    • Report generation for management
  • Feedback Collection: Survey + interview

10.4 Performance Testing

  • Load Test: 100 devices in inventory, 50 concurrent connections
  • Stress Test: 500 devices, rapid scanning
  • Metrics: Response times, memory usage, CPU utilization

11. Deployment & Installation

11.1 Installation Methods

Method 1: Executable (Recommended for End Users)

  • Package: PyInstaller-generated .exe (Windows) or .app (macOS)
  • Size: ~150MB (includes Python runtime and dependencies)
  • Installation: Double-click installer, follow wizard
  • Updates: Built-in update checker (future)

Method 2: Python Source (Recommended for Developers)

  • Requirements: Python 3.8+, pip
  • Installation:
    pip install -r requirements.txt
    python snatt.py
  • Updates: git pull or re-download source

11.2 System Requirements

  • OS: Windows 10+, Linux (Ubuntu 20.04+), macOS 10.15+
  • RAM: 4GB minimum, 8GB recommended
  • Storage: 500MB for application, 1GB+ for backups
  • Network: Direct access to managed network segments

11.3 First-Time Setup

  1. Launch application
  2. Welcome wizard appears
  3. User configures:
    • Default credential set
    • Backup directory
    • Report template
  4. User performs test scan to verify connectivity
  5. Setup complete

12. Release Plan

Phase 1: MVP (Minimum Viable Product) - v0.1

Timeline: 8 weeks
Features:

  • Network discovery (ICMP ping + basic vendor ID)
  • SSH connection to Cisco devices
  • Basic interface health check
  • Configuration backup (single device)
  • CSV export

Success Criteria: Successfully discover and backup 10 Cisco devices


Phase 2: Core Features - v1.0

Timeline: 12 weeks from Phase 1
Features:

  • Multi-vendor support (Cisco, Juniper, HP)
  • Full troubleshooting workflows (CPU, memory, logs)
  • Batch operations (multi-device backup)
  • Excel and PDF reports
  • GUI with all panels
  • Credential management

Success Criteria: 20+ users in beta, >4.0 satisfaction score


Phase 3: Advanced Features - v1.5

Timeline: 16 weeks from Phase 2
Features:

  • Configuration diff viewer
  • Scheduled backups
  • Custom command workflows
  • API integration (Meraki, DNA Center)
  • Enhanced reporting (charts, trends)
  • Application logging and diagnostics

Success Criteria: 100+ active users, <5% bug rate


Phase 4: Enterprise Features - v2.0

Timeline: 24 weeks from Phase 3
Features:

  • Multi-user support with RBAC
  • Web-based dashboard option
  • SNMP trap monitoring
  • Alerting (email, Slack, Teams)
  • Machine learning anomaly detection
  • Compliance checking

Success Criteria: Enterprise adoption, commercial viability


13. Risk Assessment

Risk 1: Device Compatibility

Description: Not all vendors/models may be supported
Impact: High
Probability: Medium
Mitigation:

  • Focus on top 3 vendors (Cisco, Juniper, HP) initially
  • Community contributions for additional vendors
  • Graceful fallback to raw CLI mode

Risk 2: Performance with Large Networks

Description: Application may slow with 500+ devices
Impact: Medium
Probability: Low
Mitigation:

  • Implement pagination in UI
  • Connection pooling and async operations
  • Database indexing
  • Performance testing with large datasets

Risk 3: Security Vulnerabilities

Description: Credential theft or unauthorized access
Impact: Critical
Probability: Low
Mitigation:

  • Use OS keyring for credential storage
  • Regular security audits
  • No network exposure (local application)
  • Input validation throughout

Risk 4: User Adoption

Description: Users may resist new tools
Impact: High
Probability: Medium
Mitigation:

  • Intuitive UI requiring minimal training
  • Video tutorials and documentation
  • Free tier for individual users
  • Community building (forums, Discord)

14. Success Metrics (KPIs)

Product Metrics

  • Adoption: 500 downloads in first 6 months
  • Active Users: 100 weekly active users by month 6
  • Retention: 60% users return after 30 days
  • Feature Usage: Discovery and Backup used by >80% of users

Technical Metrics

  • Performance: 95th percentile response time <5s
  • Reliability: <1% crash rate
  • Compatibility: Support for 95% of Cisco, Juniper, HP devices

Business Metrics

  • Time Savings: Average 60% reduction in troubleshooting time
  • User Satisfaction: NPS score >40
  • Support Tickets: <10 tickets per 100 users per month

15. Future Enhancements (Post v2.0)

  1. Real-Time Monitoring Dashboard

    • Live SNMP-based monitoring
    • Threshold alerts
    • Historical performance graphs
  2. Configuration Compliance Checking

    • Template-based config validation
    • Automatic remediation suggestions
    • Compliance reports (PCI-DSS, HIPAA, etc.)
  3. Network Topology Mapping

    • Auto-generate network diagrams
    • Visual representation of device connections
    • Interactive topology exploration
  4. AI-Powered Insights

    • Machine learning for anomaly detection
    • Predictive failure analysis
    • Automated root cause analysis
  5. Mobile Application

    • iOS/Android app for on-the-go management
    • Push notifications for critical alerts
  6. Cloud-Based Central Management

    • Multi-site management from single pane
    • Centralized reporting across locations
  7. Integration Marketplace

    • Third-party plugin support
    • ServiceNow, Jira integration
    • Webhook support for custom integrations

16. Open Questions

  1. GUI Framework Decision: CustomTkinter (modern, lightweight) vs. PyQt (feature-rich, complex)?

    • Recommendation: Start with CustomTkinter for MVP, migrate to PyQt if needed
  2. Licensing Model: Free open-source, freemium, or commercial?

    • Recommendation: Open-source (MIT) for community version, commercial enterprise version
  3. Telnet Support: Should we support insecure Telnet protocol?

    • Recommendation: Optional with warning, disabled by default
  4. Configuration Storage: SQLite (embedded) vs. PostgreSQL (server)?

    • Recommendation: SQLite for MVP, PostgreSQL for enterprise version

17. Glossary

Term Definition
ARP Address Resolution Protocol - maps IP to MAC addresses
CLI Command Line Interface
ICMP Internet Control Message Protocol - used for ping
NAPALM Network Automation and Programmability Abstraction Layer with Multivendor support
Netmiko Multi-vendor library for SSH connections to network devices
NOC Network Operations Center
OUI Organizationally Unique Identifier - vendor-specific MAC prefix
SNMP Simple Network Management Protocol
SDN Software-Defined Networking
SSH Secure Shell - encrypted network protocol

18. Appendices

Appendix A: Supported Vendors & Commands

Vendor Device Type show ip interface brief show processes cpu show running-config
Cisco IOS Router/Switch
Cisco NX-OS Switch
Juniper Junos Router/Switch show interfaces terse show system processes show configuration
HP/Aruba Switch display interface brief display cpu-usage display current-configuration

Appendix B: Sample Report Output

[Screenshot placeholder: Excel report with tabs for Summary, Device Inventory, Issues, Recommendations]

Appendix C: Configuration File Example

{
  "application": {
    "theme": "dark",
    "auto_save_interval": 300
  },
  "network": {
    "default_timeout": 10,
    "max_concurrent_connections": 50,
    "retry_attempts": 3
  },
  "backup": {
    "directory": "E:/automa/backups",
    "retention_days": 90,
    "auto_backup_enabled": false
  },
  "reporting": {
    "company_name": "Network Engineering Team",
    "logo_path": "",
    "default_format": "excel"
  }
}

19. Document Revision History

Version Date Author Changes
1.0 2025-10-05 GitHub Copilot Initial PRD creation

20. Approvals

Role Name Date Signature
Product Owner
Tech Lead
QA Lead
Stakeholder

END OF DOCUMENT


This PRD is a living document and will be updated as the project evolves. All stakeholders should review and provide feedback before development begins.