docscanner/
│
├── cmd/
│   └── scanner/
│       └── main.go
│
├── internal/
│   ├── analyzer/
│   │   ├── analyzer.go
│   │   ├── word.go
│   │   └── pdf.go
│   │
│   ├── scanner/
│   │   ├── walker.go
│   │   └── workerpool.go
│   │
│   └── model/
│       └── result.go
│
└── go.mod

Why this layout?
- cmd/ → entrypoints
- internal/ → private business logic
- analyzer/ → pluggable detection modules
- scanner/ → orchestration + concurrency
- model/ → shared data types
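This is a widely used layout for Go services. The Go toolchain also enforces the internal/ boundary: those packages can only be imported by code inside docscanner/.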
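
To make "pluggable detection modules" concrete, here is a minimal sketch of how the analyzer/ package might expose a shared contract. Only the names Analyzer, WordAnalyzer and PDFAnalyzer come from the layout; the methods Name and Supports, the helper pick, and the extension checks are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Analyzer is a hypothetical version of the contract in analyzer.go.
// The method set is an assumption; the layout names only the interface.
type Analyzer interface {
	Name() string
	Supports(path string) bool
}

// WordAnalyzer and PDFAnalyzer reuse the struct names from the layout.
type WordAnalyzer struct{}
type PDFAnalyzer struct{}

func (WordAnalyzer) Name() string { return "word" }

// Supports claims OOXML Word files by extension (an illustrative rule).
func (WordAnalyzer) Supports(path string) bool {
	ext := strings.ToLower(filepath.Ext(path))
	return ext == ".docx" || ext == ".docm"
}

func (PDFAnalyzer) Name() string { return "pdf" }

// Supports claims PDF files by extension (an illustrative rule).
func (PDFAnalyzer) Supports(path string) bool {
	return strings.ToLower(filepath.Ext(path)) == ".pdf"
}

// pick (hypothetical helper) returns the names of analyzers that claim a file.
func pick(analyzers []Analyzer, path string) []string {
	var names []string
	for _, a := range analyzers {
		if a.Supports(path) {
			names = append(names, a.Name())
		}
	}
	return names
}

func main() {
	analyzers := []Analyzer{WordAnalyzer{}, PDFAnalyzer{}}
	for _, p := range []string{"report.docm", "invoice.PDF", "notes.txt"} {
		fmt.Println(p, pick(analyzers, p))
	}
}
```

Adding a new detector under this design would mean adding one file to analyzer/ and one entry to the slice, with no change to the scanner package.
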
- cmd/scanner/main.go: func main() – CLI entry point; wires all components together.
- internal/scanner/walker.go: func WalkDirectory(root string, fileChan chan<- string) error – recursive directory traversal.
- internal/scanner/workerpool.go: func StartWorkerPool(numWorkers int, files <-chan string, analyzers []analyzer.Analyzer, results chan<- *model.ScanResult, wg *sync.WaitGroup) – concurrent analysis workers.
- internal/analyzer/analyzer.go: type Analyzer interface – contract for all analyzers.
- internal/analyzer/word.go: type WordAnalyzer struct{} – OOXML / macro detection.
- internal/analyzer/pdf.go: type PDFAnalyzer struct{} – heuristic PDF keyword detection.
- internal/model/result.go: type ScanResult struct – core data model serialized as JSON.