# Development Guide
- Introduction
- Project Structure
- Core Components
- Architecture Overview
- Detailed Component Analysis
- Dependency Analysis
- Performance Considerations
- Testing Strategies
- Debugging Techniques
- Build and Deployment
- Contribution Guidelines
- Code Review Process
- Development Environment Setup
- Troubleshooting Guide
- Conclusion
## Introduction
InfraWatch is a real-time Solana infrastructure monitoring dashboard with a modern full-stack architecture. The backend is a Node.js/Express server with Socket.io for real-time updates, cron-based polling for network metrics, and optional integrations with PostgreSQL and Redis for persistence and caching. The frontend is a React application using Vite for development, Zustand for state management, and Recharts for data visualization.
## Project Structure
The repository follows a clear separation of concerns:
- backend: Express server, routing, services, models, jobs, and WebSocket setup
- frontend: React application with components, stores, hooks, services, and pages
- Shared configuration and environment variables managed centrally
```mermaid
graph TB
    subgraph "Backend"
        S["server.js"]
        C["src/config/index.js"]
        R["src/routes/index.js"]
        WS["src/websocket/index.js"]
        DB["src/models/db.js"]
        RD["src/models/redis.js"]
        SRV["src/services/solanaRpc.js"]
        H["src/services/helius.js"]
        CP["src/jobs/criticalPoller.js"]
    end
    subgraph "Frontend"
        APP["src/App.jsx"]
        MAIN["src/main.jsx"]
        API["src/services/api.js"]
        STORE["src/stores/networkStore.js"]
        WS_HOOK["src/hooks/useWebSocket.js"]
        VITE["frontend/vite.config.js"]
    end
    S --> R
    S --> WS
    S --> DB
    S --> RD
    S --> CP
    CP --> SRV
    CP --> H
    APP --> STORE
    APP --> WS_HOOK
    WS_HOOK --> API
    API --> VITE
```
Diagram sources
- backend/server.js:1-128
- backend/src/routes/index.js:1-24
- backend/src/websocket/index.js:1-81
- backend/src/models/db.js:1-98
- backend/src/models/redis.js:1-161
- backend/src/services/solanaRpc.js:1-340
- backend/src/services/helius.js:1-188
- backend/src/jobs/criticalPoller.js:1-108
- frontend/src/App.jsx:1-31
- frontend/src/main.jsx:1-12
- frontend/src/services/api.js:1-43
- frontend/src/stores/networkStore.js:1-25
- frontend/src/hooks/useWebSocket.js:1-30
- frontend/vite.config.js:1-18
Section sources
- backend/server.js:1-128
- frontend/src/App.jsx:1-31
## Core Components
This section documents the primary building blocks of InfraWatch and their responsibilities.
- **Backend server and middleware stack**
  - Express server with Helmet, compression, CORS, JSON parsing, and global error handling
  - Health check endpoint and centralized configuration loading
  - Graceful shutdown handling for SIGTERM/SIGINT signals
- **Routing and API surface**
  - Route aggregator mounting network, RPC, validators, epoch, and alerts sub-routers
  - Standardized 404 and error handling middleware
- **Data collection services**
  - Solana RPC service for network health, TPS, slot info, epoch info, delinquent validators, and congestion scoring
  - Helius service for priority fee estimates and enhanced TPS data
  - RPC prober service (referenced by the critical poller) for provider health checks
- **Persistence and caching**
  - PostgreSQL connection pool with lazy initialization and error handling
  - Redis client with retry strategy, JSON serialization, and TTL support
- **Real-time communication**
  - Socket.io server with connection tracking and broadcast utilities
  - Frontend WebSocket hook for connecting and receiving network updates
- **Background jobs**
  - Critical poller job running every 30 seconds to collect snapshots, probe RPC providers, persist data, update the cache, and emit WebSocket events
Section sources
- backend/server.js:1-128
- backend/src/routes/index.js:1-24
- backend/src/services/solanaRpc.js:1-340
- backend/src/services/helius.js:1-188
- backend/src/models/db.js:1-98
- backend/src/models/redis.js:1-161
- backend/src/websocket/index.js:1-81
- backend/src/jobs/criticalPoller.js:1-108
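The graceful-shutdown behavior listed above can be sketched as a small helper. This is an illustrative pattern, not the repository's actual code: the `gracefulShutdown` name and the injected `closers` array are assumptions, standing in for closing the HTTP server, the PostgreSQL pool, and the Redis client.

```javascript
// Hypothetical sketch of the SIGTERM/SIGINT graceful-shutdown pattern.
// `server` is anything exposing close(cb); `closers` are async cleanup steps.
function gracefulShutdown(server, closers, exit = process.exit) {
  let shuttingDown = false;
  return async function shutdown(signal) {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`${signal} received, shutting down...`);
    // Stop accepting new connections before tearing anything down.
    await new Promise((resolve) => server.close(resolve));
    for (const close of closers) {
      try {
        await close();
      } catch (err) {
        console.error('cleanup failed:', err.message);
      }
    }
    exit(0);
  };
}

// Wiring (illustrative):
// const handler = gracefulShutdown(httpServer, [() => pool.end(), () => redis.quit()]);
// process.on('SIGTERM', () => handler('SIGTERM'));
// process.on('SIGINT', () => handler('SIGINT'));
```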
## Architecture Overview
The system architecture combines a reactive backend with a real-time frontend:
```mermaid
sequenceDiagram
    participant Cron as "Critical Poller"
    participant Solana as "Solana RPC Service"
    participant Helius as "Helius Service"
    participant DB as "PostgreSQL"
    participant Redis as "Redis"
    participant IO as "Socket.io Server"
    participant FE as "Frontend"
    Cron->>Solana : collectNetworkSnapshot()
    Cron->>Helius : getPriorityFeeEstimate()
    Solana-->>Cron : Network snapshot
    Helius-->>Cron : Priority fees
    Cron->>DB : Insert snapshots and RPC checks
    Cron->>Redis : Cache current and RPC latest
    Cron->>IO : Emit network.update and rpc.update
    IO-->>FE : Real-time updates
```
Diagram sources
- backend/src/jobs/criticalPoller.js:1-108
- backend/src/services/solanaRpc.js:1-340
- backend/src/services/helius.js:1-188
- backend/src/models/db.js:1-98
- backend/src/models/redis.js:1-161
- backend/src/websocket/index.js:1-81
## Detailed Component Analysis
### Server and Configuration
- Centralized configuration module loads environment variables with sensible defaults and constructs Helius RPC URLs from API keys
- Express server applies security middleware, compression, CORS, JSON parsing, and mounts health check, routes, and error handlers
- Socket.io server configured with CORS settings from configuration and exposed globally for use by other modules
- Data stores initialized during startup with graceful failure handling; pollers started after initialization
Section sources
- backend/src/config/index.js:1-68
- backend/server.js:1-128
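The configuration pattern above can be sketched as a pure loader function. The key names PORT, NODE_ENV, DATABASE_URL, REDIS_URL, and the Helius API key come from this guide; the exact export shape, the CORS_ORIGIN name, and the Helius URL format are assumptions about src/config/index.js, not its actual contents.

```javascript
// Illustrative config loader: environment variables with sensible defaults,
// and a Helius RPC URL derived from the API key when one is provided.
function loadConfig(env = process.env) {
  const heliusApiKey = env.HELIUS_API_KEY || null;
  return {
    port: parseInt(env.PORT || '3001', 10),
    nodeEnv: env.NODE_ENV || 'development',
    corsOrigin: env.CORS_ORIGIN || 'http://localhost:5173',
    databaseUrl: env.DATABASE_URL || null, // persistence is optional
    redisUrl: env.REDIS_URL || null,       // caching is optional
    heliusApiKey,
    // Assumed URL format; the real module may construct it differently.
    heliusRpcUrl: heliusApiKey
      ? `https://mainnet.helius-rpc.com/?api-key=${heliusApiKey}`
      : null,
  };
}
```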
### Routing and API Surface
- Route aggregator mounts sub-routers for network, RPC, validators, epoch, and alerts
- Global error handler and 404 handler ensure consistent error responses
- Health check endpoint provides operational status
Section sources
- backend/src/routes/index.js:1-24
- backend/server.js:70-79
### Solana RPC Service
- Provides network health, TPS, slot progression, epoch info, delinquent validators, and confirmation time
- Calculates congestion score using weighted components (TPS, priority fees, slot latency)
- Collects a complete network snapshot and handles errors gracefully
```mermaid
flowchart TD
    Start(["collectNetworkSnapshot"]) --> Health["getNetworkHealth()"]
    Start --> TPS["getCurrentTps()"]
    Start --> Slot["getSlotInfo()"]
    Start --> Epoch["getEpochInfo()"]
    Start --> Delinquent["getDelinquentValidators()"]
    Start --> Conf["getConfirmationTime()"]
    Health --> Merge["Merge metrics"]
    TPS --> Merge
    Slot --> Merge
    Epoch --> Merge
    Delinquent --> Merge
    Conf --> Merge
    Merge --> Congestion{"Priority fees available?"}
    Congestion --> |Yes| Score["calculateCongestionScore()"]
    Congestion --> |No| Skip["Skip congestion score"]
    Score --> Return["Return snapshot"]
    Skip --> Return
```
Diagram sources
- backend/src/services/solanaRpc.js:275-328
Section sources
- backend/src/services/solanaRpc.js:1-340
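A weighted congestion score of the kind described above could look like the following. The weights, normalization bounds, and 0..100 scale are illustrative assumptions; the real service's constants in solanaRpc.js may differ.

```javascript
// Illustrative congestion score: a weighted blend of normalized TPS,
// priority-fee, and slot-latency components, clamped to 0..100.
function calculateCongestionScore({ tps, priorityFee, slotLatencyMs }) {
  const clamp = (v) => Math.min(1, Math.max(0, v));
  // Lower TPS relative to an assumed healthy baseline => more congestion.
  const tpsLoad = clamp(1 - tps / 3000);
  // Higher priority fees (micro-lamports, assumed ceiling) => more congestion.
  const feeLoad = clamp(priorityFee / 100000);
  // Slots slower than the ~400 ms target => more congestion.
  const slotLoad = clamp((slotLatencyMs - 400) / 400);
  const score = 0.4 * tpsLoad + 0.4 * feeLoad + 0.2 * slotLoad;
  return Math.round(score * 100); // 0 (idle) .. 100 (heavily congested)
}
```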
### Helius Service
- Fetches priority fee estimates and enhanced TPS data via Helius RPC
- Includes robust error handling and timeout configuration
- Returns null when API key is not configured
Section sources
- backend/src/services/helius.js:1-188
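The "return null when unconfigured" and timeout behaviors can be sketched as below. The endpoint URL, JSON-RPC payload shape, and error-swallowing policy are assumptions modeled on Helius's public getPriorityFeeEstimate method, not a copy of helius.js.

```javascript
// Sketch: degrade gracefully without an API key, enforce a timeout via
// AbortController, and return null on any failure so callers can continue.
async function getPriorityFeeEstimate(apiKey, { timeoutMs = 5000 } = {}) {
  if (!apiKey) return null; // no key configured: feature is simply off
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(`https://mainnet.helius-rpc.com/?api-key=${apiKey}`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        jsonrpc: '2.0',
        id: 1,
        method: 'getPriorityFeeEstimate',
        params: [{ options: { includeAllPriorityFeeLevels: true } }],
      }),
      signal: controller.signal,
    });
    if (!res.ok) return null;
    const body = await res.json();
    return body.result ?? null;
  } catch (err) {
    console.error('helius request failed:', err.message);
    return null; // errors are swallowed so the poller keeps running
  } finally {
    clearTimeout(timer);
  }
}
```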
### PostgreSQL Model
- Lazy-initialized connection pool with connection limits and timeouts
- Query wrapper ensures proper client lifecycle and error logging
- Graceful handling when DATABASE_URL is not configured
Section sources
- backend/src/models/db.js:1-98
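The query-wrapper lifecycle described above can be sketched with an injected pool. Passing `pool` as a parameter is a simplification for illustration; db.js manages its own pool instance.

```javascript
// Sketch: acquire a client, run the query, always release in finally,
// and no-op gracefully when DATABASE_URL was never configured (pool is null).
async function query(pool, text, params = []) {
  if (!pool) return null; // persistence disabled: degrade gracefully
  const client = await pool.connect();
  try {
    return await client.query(text, params);
  } catch (err) {
    console.error('query failed:', text, err.message);
    throw err;
  } finally {
    client.release(); // always return the client to the pool
  }
}
```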
### Redis Model
- Lazy-initialized client with exponential backoff retry strategy
- JSON serialization/deserialization helpers with TTL support
- Robust error handling and connection state tracking
Section sources
- backend/src/models/redis.js:1-161
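The retry strategy and JSON/TTL helpers can be sketched as follows. The backoff constants and helper names are assumptions; the injected `client` stands in for a node-redis v4 client (whose set() accepts an `{ EX: seconds }` options object).

```javascript
// Reconnect backoff in the shape node-redis accepts: return a delay in ms,
// or an Error to stop retrying. Constants here are illustrative.
function retryStrategy(attempt) {
  if (attempt > 10) return new Error('redis: too many reconnect attempts');
  return Math.min(attempt * 100, 3000); // growing backoff, capped at 3 s
}

// JSON serialization with optional TTL on write...
async function setJson(client, key, value, ttlSeconds) {
  const payload = JSON.stringify(value);
  if (ttlSeconds) return client.set(key, payload, { EX: ttlSeconds });
  return client.set(key, payload);
}

// ...and deserialization on read, mapping a cache miss to null.
async function getJson(client, key) {
  const raw = await client.get(key);
  return raw == null ? null : JSON.parse(raw);
}
```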
### WebSocket Server
- Tracks connected clients and logs connection/disconnection events
- Provides broadcast utilities for network and RPC updates
- Exposes connection count for monitoring
Section sources
- backend/src/websocket/index.js:1-81
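The connection tracking and broadcast utilities can be sketched over an injected Socket.io server. The network.update and rpc.update event names come from this guide; the factory name and counter are illustrative.

```javascript
// Sketch: count connections as clients come and go, and expose broadcast
// helpers that emit the events the critical poller uses.
function createBroadcaster(io) {
  let connections = 0;
  io.on('connection', (socket) => {
    connections++;
    console.log(`client connected (${connections} total)`);
    socket.on('disconnect', () => { connections--; });
  });
  return {
    broadcastNetworkUpdate: (snapshot) => io.emit('network.update', snapshot),
    broadcastRpcUpdate: (results) => io.emit('rpc.update', results),
    getConnectionCount: () => connections, // for monitoring
  };
}
```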
### Frontend WebSocket Hook
- Connects to Socket.io with fallback transports
- Updates Zustand store with real-time network data
- Handles connection state changes
Section sources
- frontend/src/hooks/useWebSocket.js:1-30
- frontend/src/stores/networkStore.js:1-25
### Critical Poller
- Runs every 30 seconds using node-cron
- Coordinates data collection, persistence, caching, and real-time broadcasting
- Implements concurrency guard to prevent overlapping executions
```mermaid
sequenceDiagram
    participant Scheduler as "node-cron"
    participant CP as "Critical Poller"
    participant SRV as "Solana RPC"
    participant H as "Helius"
    participant Q as "Queries"
    participant R as "Redis"
    participant IO as "Socket.io"
    Scheduler->>CP : Schedule tick
    CP->>CP : Check running flag
    CP->>SRV : collectNetworkSnapshot()
    CP->>H : getPriorityFeeEstimate()
    CP->>Q : Insert snapshots and RPC checks
    CP->>R : Cache current and RPC latest
    CP->>IO : Emit network.update and rpc.update
```
Diagram sources
- backend/src/jobs/criticalPoller.js:21-103
Section sources
- backend/src/jobs/criticalPoller.js:1-108
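The tick sequence above, including the concurrency guard, can be sketched with injected collaborators. The factory name and collaborator signatures are illustrative; only the control flow (guard, concurrent fetch, persist, cache, broadcast) reflects the guide.

```javascript
// Sketch of a poller tick guarded by a running flag so overlapping
// executions are skipped rather than stacked.
function createCriticalPoller({ collectSnapshot, getPriorityFees, persist, cache, broadcast }) {
  let running = false;
  return async function tick() {
    if (running) {
      console.warn('poller tick skipped: previous run still in progress');
      return false;
    }
    running = true;
    try {
      // Independent sources are fetched concurrently.
      const [snapshot, fees] = await Promise.all([collectSnapshot(), getPriorityFees()]);
      const merged = { ...snapshot, priorityFees: fees, polledAt: Date.now() };
      await persist(merged);  // insert into PostgreSQL
      await cache(merged);    // refresh Redis "current" keys
      broadcast(merged);      // emit network.update / rpc.update
      return true;
    } finally {
      running = false; // release the guard even on failure
    }
  };
}
// Scheduling every 30 s with node-cron (illustrative):
// cron.schedule('*/30 * * * * *', tick);
```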
### App Component
- React Router-based routing with nested routes under AppShell
- Centralized App component orchestrating page-level routes
Section sources
- frontend/src/App.jsx:1-31
### Network Store
- Minimalist store managing current network state, history, epoch info, connection status, and update timestamps
- Actions for updating state and managing history range
Section sources
- frontend/src/stores/networkStore.js:1-25
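The real store is built with Zustand's create(); this dependency-free sketch mimics the same state/actions split so the shape is visible. Field and action names, and the history cap, are assumptions rather than the store's actual API.

```javascript
// Vanilla sketch of a Zustand-style store: closed-over state, an internal
// setState merge, and named actions for updates.
function createNetworkStore() {
  let state = {
    current: null,    // latest network snapshot
    history: [],      // recent snapshots for charts
    epoch: null,
    connected: false,
    lastUpdate: null,
  };
  const setState = (partial) => { state = { ...state, ...partial }; };
  return {
    getState: () => state,
    setNetwork: (snapshot) =>
      setState({
        current: snapshot,
        history: [...state.history, snapshot].slice(-120), // illustrative cap
        lastUpdate: Date.now(),
      }),
    setEpoch: (epoch) => setState({ epoch }),
    setConnected: (connected) => setState({ connected }),
  };
}
```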
### API Service
- Axios instance with base URL pointing to /api, request/response interceptors, and error logging
- Centralized configuration for API requests
Section sources
- frontend/src/services/api.js:1-43
### Vite Configuration
- Vite dev server with proxy configuration for /api and /socket.io
- Frontend runs on port 5173, backend on port 3001
Section sources
- frontend/vite.config.js:1-18
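A proxy setup matching the description above could look like this. It is a sketch, not the repository's exact vite.config.js; only the proxied paths and ports come from this guide.

```javascript
// Illustrative vite.config.js: dev server on 5173, proxying REST and
// Socket.io traffic to the Express backend on 3001.
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';

export default defineConfig({
  plugins: [react()],
  server: {
    port: 5173,
    proxy: {
      // REST calls go to the backend
      '/api': { target: 'http://localhost:3001', changeOrigin: true },
      // Socket.io needs WebSocket proxying enabled
      '/socket.io': { target: 'http://localhost:3001', ws: true },
    },
  },
});
```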
## Dependency Analysis
The backend maintains clear boundaries between layers:
```mermaid
graph LR
    Server["server.js"] --> Routes["routes/index.js"]
    Server --> WS["websocket/index.js"]
    Server --> DB["models/db.js"]
    Server --> Redis["models/redis.js"]
    Server --> Poller["jobs/criticalPoller.js"]
    Poller --> Solana["services/solanaRpc.js"]
    Poller --> Helius["services/helius.js"]
    Poller --> Queries["models/queries.js"]
    Poller --> CacheKeys["models/cacheKeys.js"]
    Frontend["Frontend"] --> API["services/api.js"]
    API --> Vite["vite.config.js"]
```
Diagram sources
- backend/server.js:1-128
- backend/src/routes/index.js:1-24
- backend/src/websocket/index.js:1-81
- backend/src/models/db.js:1-98
- backend/src/models/redis.js:1-161
- backend/src/jobs/criticalPoller.js:1-108
- backend/src/services/solanaRpc.js:1-340
- backend/src/services/helius.js:1-188
- frontend/src/services/api.js:1-43
- frontend/vite.config.js:1-18
Section sources
- backend/package.json:1-36
- frontend/package.json:1-39
## Performance Considerations
- Concurrency control: Critical poller uses a running flag to prevent overlapping executions
- Asynchronous operations: Services use Promise.all for concurrent data fetching
- Caching: Redis cache reduces database load and improves response times
- Connection pooling: PostgreSQL pool manages connections efficiently
- Retry strategy: Redis client implements exponential backoff for resilience
- Compression and security: Express compression and Helmet reduce payload sizes and improve security posture
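The concurrent-fetching point above can be illustrated with a small merge helper. The guide says the services use Promise.all; this sketch deliberately uses Promise.allSettled instead, so one failing source degrades to a null field rather than sinking the whole snapshot. The function and fetcher names are illustrative.

```javascript
// Sketch: request independent metrics in parallel and merge the results,
// mapping individual failures to null so partial snapshots remain useful.
async function collectMetrics(fetchers) {
  const names = Object.keys(fetchers);
  const results = await Promise.allSettled(names.map((n) => fetchers[n]()));
  const snapshot = {};
  results.forEach((result, i) => {
    if (result.status === 'fulfilled') {
      snapshot[names[i]] = result.value;
    } else {
      console.error(`${names[i]} failed:`, result.reason.message);
      snapshot[names[i]] = null;
    }
  });
  return snapshot;
}
```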
## Testing Strategies
- **Backend**
  - Unit tests for individual services (RPC, Helius, DB, Redis) focusing on error handling and edge cases
  - Integration tests for the critical poller workflow and data persistence
  - Mock external services (RPC, Helius) for deterministic testing
- **Frontend**
  - Component tests for UI elements and state transitions
  - WebSocket integration tests verifying real-time updates
  - API service tests with mocked interceptors
[No sources needed since this section provides general guidance]
## Debugging Techniques
- **Backend**
  - Enable verbose logging in development mode
  - Use structured error logging with error handlers
  - Monitor Socket.io connection counts and events
  - Verify database and Redis connectivity during startup
- **Frontend**
  - Utilize browser developer tools for network inspection
  - Monitor WebSocket connection status in the store
  - Inspect API request/response cycles with interceptors
Section sources
- backend/server.js:109-124
- backend/src/websocket/index.js:13-33
## Build and Deployment
- **Backend**
  - Production start script uses Node.js without watch mode
  - Requires Node.js version 20+ as specified in engines
- **Frontend**
  - Vite build generates optimized production assets
  - Preview command for local testing of built assets
- **Environment**
  - Centralized configuration via environment variables with sensible defaults
  - CORS origin configurable for development and production
Section sources
- backend/package.json:6-21
- frontend/package.json:6-11
- backend/src/config/index.js:27-65
## Contribution Guidelines
- Fork and branch from the main branch for features and fixes
- Follow existing code style and patterns
- Include unit/integration tests for new functionality
- Update documentation for significant changes
- Keep commits focused and well-documented
[No sources needed since this section provides general guidance]
## Code Review Process
- All changes require at least one reviewer from maintainers
- Focus areas: correctness, performance, security, maintainability
- Ensure tests pass and code adheres to established patterns
- Verify environment variable usage and configuration safety
[No sources needed since this section provides general guidance]
## Development Environment Setup
- **Backend**
  - Install dependencies with npm ci
  - Configure environment variables (.env) with required keys
  - Start with npm run dev for hot reloading
- **Frontend**
  - Install dependencies with npm ci
  - Start the Vite dev server with npm run dev
  - Proxy configuration automatically forwards /api and /socket.io to the backend
- **Database and Redis**
  - Configure DATABASE_URL and REDIS_URL for persistent features
  - Optional Helius API key for enhanced metrics
Section sources
- backend/package.json:6-9
- frontend/package.json:6-11
- frontend/vite.config.js:7-16
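An example .env for local development might look like the fragment below. Only DATABASE_URL, REDIS_URL, PORT, NODE_ENV, the CORS origin, and the Helius API key are mentioned in this guide; the exact variable names CORS_ORIGIN and HELIUS_API_KEY, and all values, are assumptions to check against src/config/index.js.

```shell
# Example backend .env (illustrative values; verify key names against src/config/index.js)
NODE_ENV=development
PORT=3001
CORS_ORIGIN=http://localhost:5173
# Optional: enables persistence
DATABASE_URL=postgres://user:password@localhost:5432/infrawatch
# Optional: enables caching
REDIS_URL=redis://localhost:6379
# Optional: enables priority-fee and enhanced TPS metrics
HELIUS_API_KEY=your-helius-api-key
```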
## Troubleshooting Guide
- **Backend startup issues**
  - Check NODE_ENV and PORT configuration
  - Verify DATABASE_URL and REDIS_URL are accessible
  - Review error logs for initialization failures
- **Frontend connectivity**
  - Confirm Vite proxy settings match the backend port
  - Check the browser console for WebSocket connection errors
  - Validate CORS origin configuration
- **Data freshness**
  - Monitor critical poller logs for execution timing
  - Verify Redis cache keys and TTL values
  - Check database insertion logs for failures
Section sources
- backend/src/config/index.js:8-13
- backend/server.js:89-102
- frontend/vite.config.js:9-15
## Conclusion
InfraWatch provides a robust foundation for Solana infrastructure monitoring with clear separation of concerns, real-time capabilities, and extensible architecture. The documented patterns for services, state management, and background jobs enable contributors to develop features efficiently while maintaining system reliability and performance.