43 changes: 43 additions & 0 deletions .gitignore
@@ -0,0 +1,43 @@
# Binaries
*.exe
*.exe~
*.dll
*.so
*.dylib
/bin/
/dist/
/asset-cache
/cmd/*/asset-cache

# Test + coverage output
*.test
*.out
coverage.txt
coverage.html

# Go workspace + vendor
go.work
go.work.sum
/vendor/

# Dependency / module cache
/.go/

# Editor / OS
.idea/
.vscode/
*.swp
*.swo
.DS_Store
Thumbs.db

# Local config + secrets
.env
.env.local
*.local.yaml
/config/*.local.*

# Logs + runtime data
*.log
/tmp/
/var/
300 changes: 138 additions & 162 deletions BRIEF.md
@@ -12,200 +12,176 @@
## Technology Stack
> **You MUST use the exact stack(s) listed below. Do not substitute or add alternative languages/runtimes.**

- **Language:** javascript
- **Framework:** Express.js
- **Runtime:** Node.js
- **Package manager:** npm
- **Rationale:** The reference files explicitly mention Firebase Authentication, JWT, and libraries like speakeasy (Node.js ecosystem). The attendance system's security features (MFA, RBAC) align with JavaScript/Node.js patterns. No other language/framework is explicitly called out in the provided files.
- **Language:** go
- **Framework:** Gin
- **Runtime:** Go
- **Package manager:** go modules
- **Rationale:** The project focuses on backend features like authentication, access control, and reporting with no explicit frontend requirements. Go is well-suited for building secure, high-performance APIs required for asset/attendance management. Gin provides efficient routing and middleware support for implementing MFA, RBAC, and fraud prevention. The absence of UI/UX technical notes in the spec further confirms this is a single-stack backend service.

## Full Project Specification
> This specification was generated by AutoForge before any code was written.
> Follow it precisely: every feature, data model, and API listed here MUST be implemented.

```markdown
# Asset-Cache Project Specification
# asset-cache Project Specification

## Executive Summary
Asset-Cache is a high-performance static asset management server built with Node.js and Express.js. It provides caching, versioning, and access control for static files (images, CSS, JS, etc.) to optimize delivery for web applications. The system targets developers and DevOps teams needing fast, secure, and scalable asset storage with cache invalidation capabilities. Key benefits include reduced latency through in-memory caching, bandwidth optimization via smart compression, and granular access controls for multi-tenant environments.
### Executive Summary
Asset-cache is a high-performance digital asset management system designed for media companies, game studios, and design teams. It provides centralized storage, version control, and fast retrieval of media files with intelligent caching. The platform enables secure collaboration through role-based access control while optimizing storage costs through smart metadata tagging and format conversion. Key users include content creators, DevOps engineers, and enterprise IT departments requiring scalable asset management.

## Core Features
### Core Features
1. **Asset Upload & Storage**
- Accepts binary file uploads with metadata
- Input: File (multipart/form-data), metadata JSON
- Output: Asset ID, storage URL, checksum
- Edge cases: File size limits (5GB max), MIME type validation, duplicate detection via hash

1. **Asset Upload API**
- Accepts file uploads with metadata
- Inputs: Multipart/form-data file, MIME type, optional tags
- Outputs: Asset ID, signed URL, version number
- Edge cases: File size limits (5GB max), MIME type validation, duplicate detection
2. **Metadata Extraction**
- Automatically extracts EXIF, ID3, and format-specific metadata
- Input: File content, MIME type
- Output: Structured metadata JSON
- Edge cases: Corrupted files, unsupported formats, oversized metadata

2. **Versioned Asset Storage**
- Maintains multiple versions of assets
- Inputs: Asset ID, version tag (semantic versioning)
- Outputs: Versioned asset URL, diff between versions
- Edge cases: Rollback to previous versions, version conflict resolution

3. **Cache Management System**
- In-memory cache with TTL and LRU eviction
- Inputs: Cache key, max age (seconds), priority level
- Outputs: Cache hit/miss status, cached asset stream
- Edge cases: Cache stampeding prevention, cache warming support
3. **Version Control**
- Tracks asset revisions with diff capabilities
- Input: Asset ID, new file version
- Output: Version history timeline
- Edge cases: Rollback to previous versions, version retention policies

4. **Access Control System**
- Role-based permissions for assets
- Inputs: User token, requested asset ID, operation type
- Outputs: Access granted/denied status, audit log entry
- Edge cases: IP whitelisting, temporary signed URLs

5. **Cache Invalidation API**
- Programmatic cache clearing
- Inputs: Asset ID, version range, tag pattern
- Outputs: Invalidated cache keys list, success status
- Edge cases: Wildcard invalidation, invalidation queue backpressure

6. **Analytics & Metrics**
- Tracks cache hit rates and bandwidth usage
- Inputs: Sampling rate, time window (last N hours)
- Outputs: JSON metrics payload, Prometheus metrics endpoint
- Edge cases: High cardinality metrics, real-time dashboards

## Architecture Overview
3-tier architecture with:
1. **Express.js API Layer** - Routes and middleware for HTTP requests
2. **Service Layer** - Business logic for caching, storage, and access control
3. **Persistence Layer** - Redis for in-memory cache + S3-compatible storage backend

Key patterns:
- Role-based permissions with granular asset-level controls
- Input: User ID, asset ID, requested operation
- Output: Permission granted/denied status
- Edge cases: Inheritance from parent folders, permission conflicts

5. **Smart Caching Layer**
- Implements multi-tier caching (memory + disk)
- Input: Asset ID, cache priority level
- Output: Cached file stream or cache miss
- Edge cases: Cache eviction policies, stale content handling

6. **Content Delivery Network Integration**
- Signed URL generation for secure asset delivery
- Input: Asset ID, expiration time, allowed domains
- Output: Time-limited access URL
- Edge cases: URL signature validation, domain whitelisting enforcement

7. **Search & Discovery API**
- Full-text search with metadata filters
- Input: Search query, filter parameters (tags, date ranges)
- Output: Paginated asset list with relevance scores
- Edge cases: Fuzzy search handling, performance under complex queries

8. **Usage Analytics**
- Tracks access patterns and storage metrics
- Input: Asset access events
- Output: Usage statistics, trend reports
- Edge cases: Anonymized reporting, data retention policies

### Architecture Overview
The system follows a layered architecture with:
- **Presentation Layer**: Gin-based REST API with JWT authentication middleware
- **Business Logic Layer**: Services for asset processing, versioning, and access control
- **Data Layer**: PostgreSQL for metadata + S3-compatible object storage for binaries
- **Cache Layer**: Redis for metadata caching + in-memory LRU cache for hot assets

Key patterns include:
- Command-Query Responsibility Segregation (CQRS) for read/write operations
- Circuit breaker pattern for storage backend failures
- Decorator pattern for adding caching to storage operations
- Observer pattern for metrics collection

Data flow:
Client → Express Router → Auth Middleware → Service Layer → Cache/Storage → Response

## Data Models

### Asset
```json
{
"id": "string (UUID)",
"filename": "string",
"mimeType": "string",
"storagePath": "string",
"uploadDate": "ISO 8601",
"sizeBytes": "number",
"tags": "string[]",
"versions": "Version[]"
- Circuit breaker pattern for storage service resilience
- Observer pattern for analytics event tracking
- Decorator pattern for applying caching layers

### Data Models
```go
// Asset represents a stored digital asset
type Asset struct {
ID string `json:"id" gorm:"primary_key"`
Name string `json:"name"`
Type string `json:"type"` // image, video, document
Checksum string `json:"checksum"`
StoragePath string `json:"storage_path"`
Metadata JSONB `json:"metadata"`
CreatedAt time.Time `json:"created_at"`
UpdatedAt time.Time `json:"updated_at"`
UserID string `json:"user_id" gorm:"index"`
}
```

### CacheEntry
```json
{
"key": "string (hash of asset ID + version)",
"assetId": "string",
"version": "string",
"expiresAt": "ISO 8601",
"priority": "number (1-10)",
"hitCount": "number"
// Version tracks asset revisions
type Version struct {
AssetID string `json:"asset_id" gorm:"primary_key"`
VersionNum int `json:"version_num" gorm:"primary_key"`
Timestamp time.Time `json:"timestamp"`
Diff JSONB `json:"diff"` // delta from previous version
}
```

### AccessLog
```json
{
"userId": "string",
"assetId": "string",
"action": "enum('read','write','delete')",
"timestamp": "ISO 8601",
"ipAddress": "string",
"success": "boolean"
// AccessControl stores permission rules
type AccessControl struct {
AssetID string `json:"asset_id" gorm:"primary_key"`
UserID string `json:"user_id" gorm:"primary_key"`
Permission string `json:"permission"` // view, edit, admin
}
```

### User
```json
{
"id": "string",
"username": "string",
"roles": "string[]",
"permissions": "Permission[]",
"lastLogin": "ISO 8601"
}
### API / Interface Design
**REST Endpoints:**
```

## API / Interface Design

### HTTP Endpoints
```http
POST /api/assets
POST /api/v1/assets
Request: multipart/form-data with file and metadata
Response: 201 Created with { id, version, url }
Response: 201 Created with { "id": "asset-123", "url": "https://..." }

GET /api/assets/:id
Query: ?version=1.0.0
Response: 200 OK with asset stream
GET /api/v1/assets/{id}
Response: 200 OK with asset metadata and download URL

DELETE /api/assets/:id
Response: 204 No Content
PUT /api/v1/assets/{id}/version
Request: { "file": <binary> }
Response: 200 OK with new version metadata

POST /api/cache/invalidate
Body: { "assetId": "string", "versionRange": ">=1.0.0 <2.0.0" }
Response: 200 OK with { invalidatedKeys: string[] }
GET /api/v1/assets/search
Query: { "query": "logo", "tags": "branding,2023" }
Response: 200 OK with paginated results

GET /api/metrics
Query: ?window=24h
Response: 200 OK with JSON metrics object
POST /api/v1/assets/{id}/sign
Request: { "expiration": "1h", "allowed_domains": ["example.com"] }
Response: 200 OK with signed URL
```

### CLI Commands
```bash
# Cache invalidation
asset-cache invalidate --asset-id=123 --version=1.0.0

# Cache stats
asset-cache stats --cache=redis --format=json

# Asset upload
asset-cache upload --file=/path/to/file --tags=prod,images
**CLI Interface:**
```
asset-cache upload --file=logo.png --tags=branding,logo
asset-cache search "logo" --tags=branding --limit=20
asset-cache version history --asset-id=abc123
```

## Module & File Structure
### Module & File Structure
```
.
├── config/ # Environment-specific configs
├── controllers/ # Express route handlers
├── middleware/ # Auth, logging, error handling
├── models/ # Data access objects
├── services/ # Business logic (cache, storage)
├── utils/ # Helper functions
├── routes/ # Express route definitions
├── storage/ # S3/FS adapters
├── cache/ # Redis client and strategies
├── validation/ # Schema validation rules
├── metrics/ # Prometheus integration
├── index.js # Server entry point
└── cli.js # Command-line interface
/cmd/ # CLI and API entry points
/internal/api/ # Gin route handlers
/internal/service/ # Business logic implementations
/internal/storage/ # Object storage integrations
/internal/cache/ # Cache layer implementations
/internal/model/ # Data models and database interactions
/pkg/utils/ # Shared utilities (hashing, validation)
/config/ # Configuration loading and defaults
/docs/ # API documentation specs
/test/ # Integration test suite
```

## Non-Functional Requirements
- **Performance**: Serve 10,000+ assets/second with <100ms latency under 500 concurrent connections
- **Security**: JWT-based auth, rate limiting (1000 req/min), HTTPS enforcement
- **Error Handling**: Detailed error codes (e.g., 422 for validation errors), circuit breakers for storage failures
- **Logging**: Structured JSON logs with correlation IDs, audit logging for all access operations
- **Testing**: 85%+ unit test coverage, 100% of public APIs tested, chaos testing for cache/storage failures

## Acceptance Criteria

1. Upload endpoint returns 201 with valid asset URL when valid file is provided
2. Cache hit rate metrics must show ≥90% for frequently accessed assets (tested with 10k request load test)
3. Invalid API tokens must return 401 Unauthorized within 50ms
4. Cache invalidation must remove all matching keys within 200ms
5. System must handle 100 concurrent uploads without exceeding 80% CPU usage
6. Version history must show all changes for an asset with 100% accuracy
7. CLI commands must display usage help when called without required parameters
8. Asset deletion must cascade to all versions and cache entries
9. Bandwidth metrics must track total data transferred with ±1% accuracy
10. All endpoints must return proper CORS headers for cross-origin requests
```
### Non-Functional Requirements
- **Performance**: 95% of requests under 200ms, 99% under 500ms
- **Security**: JWT with RSA-256, TLS 1.3 minimum, rate limiting (1000 req/min)
- **Error Handling**: Detailed error codes with human-readable messages
- **Logging**: Structured JSON logging with context propagation
- **Testing**: 85% unit test coverage, 50% integration tests for core features

### Acceptance Criteria
1. Asset upload endpoint returns 201 with valid metadata for successful uploads
2. System correctly rejects uploads exceeding 5GB with 413 status
3. Version history returns chronological list of revisions with diffs
4. Cache hit rate reaches 70% under normal load (measured via Prometheus metrics)
5. Search API returns relevant results with <500ms latency for 95% of queries
6. All endpoints include proper authentication validation
7. Storage system automatically converts images to WebP format when requested
8. Audit logs capture all access events with user context
9. CLI tools provide tab-completion support for common commands
10. System maintains <1% data inconsistency during failover scenarios

## Goal
Build a complete, production-quality project following the technology stack above.
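
The revised spec's cache layer names an "in-memory LRU cache for hot assets" but does not say how it should behave. A minimal sketch of one possible shape, using only the Go standard library, might look like the following. The `LRU` type, `NewLRU`, its methods, and the asset IDs are hypothetical names chosen for illustration; none of them come from this PR, and a real implementation would probably bound the cache by bytes rather than by entry count.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// entry is the payload stored in each list element.
type entry struct {
	key  string
	data []byte
}

// LRU is a fixed-capacity, concurrency-safe cache keyed by asset ID.
// Hypothetical type for illustration; not defined anywhere in the spec.
type LRU struct {
	mu       sync.Mutex
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // asset ID -> element in order
}

// NewLRU returns a cache holding at most capacity entries (minimum 1).
func NewLRU(capacity int) *LRU {
	if capacity < 1 {
		capacity = 1
	}
	return &LRU{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the cached bytes and marks the entry as recently used.
func (c *LRU) Get(key string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).data, true
}

// Put inserts or updates an entry, evicting the least recently used
// entry when the cache grows past its capacity.
func (c *LRU) Put(key string, data []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).data = data
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, data: data})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

// Len reports the number of entries currently cached.
func (c *LRU) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.order.Len()
}

func main() {
	cache := NewLRU(2)
	cache.Put("asset-1", []byte("logo"))
	cache.Put("asset-2", []byte("banner"))
	cache.Get("asset-1")                 // asset-1 becomes most recently used
	cache.Put("asset-3", []byte("icon")) // evicts asset-2, the least recently used
	_, ok := cache.Get("asset-2")
	fmt.Println("asset-2 cached:", ok) // prints "asset-2 cached: false"
}
```

A doubly linked list plus a map keeps both `Get` and `Put` at constant time. The single mutex is the simplest way to make the sketch safe under concurrent Gin handlers, and sharding it would be a later optimisation if contention showed up in profiles.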