diff --git a/pkg/analyzers/shotness/README.md b/pkg/analyzers/shotness/README.md index fbf8a3d..8f3f5c2 100644 --- a/pkg/analyzers/shotness/README.md +++ b/pkg/analyzers/shotness/README.md @@ -7,24 +7,41 @@ Knowing *that* a file changed is good. Knowing *what part* of the file changed i File-level statistics are too coarse. A "utils.go" file might be huge and change constantly, but are those changes in the same function or scattered everywhere? We need fine-grained resolution. ## How analyzer solves it -"Shotness" (Structural Hotness) tracks changes to specific structural elements—like functions or classes—defined by a User DSL. It tells you which functions are modified most frequently. +"Shotness" (Structural Hotness) tracks changes to specific structural elements—like functions or classes—defined by a User DSL. It tells you which functions are modified most frequently and which ones change together. ## Historical context -This is an evolution of "hotspot" analysis, moving from file-granularity to logical-unit-granularity. +This is an evolution of "hotspot" analysis (Adam Tornhill, *Your Code as a Crime Scene*), moving from file-granularity to logical-unit-granularity. The concept originated in Hercules (src-d/hercules) and has been refined here with normalized coupling strength metrics. ## Real world examples - **Testing Strategy:** If `ProcessPayment()` changes in 50% of commits, it needs extremely robust tests. - **Volatility Analysis:** Identifying unstable functions that might need refactoring to adhere to the Open/Closed Principle. +- **Team Assessment:** Functions with high coupling strength (> 0.8) are candidates for extraction into shared modules. +- **Risk Prioritization:** HIGH risk nodes (≥ 20 changes) should be reviewed for design flaws, not just bugs. ## How analyzer works here 1. **Configuration:** User defines a DSL query (e.g., `filter(.roles has "Function")`) to select nodes of interest. -2. 
**Node Tracking:** As files change, the analyzer tracks these specific named nodes. +2. **Node Tracking:** As files change, the analyzer tracks these specific named nodes via diff hunk mapping. 3. **Renames:** It handles function renames (if supported by UAST diffing) to maintain history. 4. **Co-occurrence:** It also tracks which functions change together (Structural Coupling). +5. **Normalization:** Coupling strength is normalized to [0, 1] using the formula: `co_changes / max(co_changes, changes_a, changes_b)`. + +## Output Formats +- **JSON/YAML:** Structured metrics with `node_hotness`, `node_coupling`, `hotspot_nodes`, and `aggregate` sections. +- **Text:** Terminal-friendly output with colored progress bars, risk classification, and coupling arrows. +- **Plot:** Interactive HTML dashboard with TreeMap, HeatMap, and Bar Chart visualizations. + +## Metrics +- **Hotness Score:** Normalized [0, 1] relative to the most changed function. +- **Coupling Strength:** Normalized [0, 1] confidence metric for co-change pairs. +- **Risk Level:** HIGH (≥ 20), MEDIUM (≥ 10), LOW (< 10) change count thresholds. +- **Aggregate:** Summary statistics including average coupling strength across all pairs. ## Limitations - **Performance:** Fine-grained UAST diffing is more expensive than file-level diffing. - **DSL Complexity:** Requires understanding the UAST structure to write effective queries. +- **Large Functions:** Any change within a function's line range counts as a change to that function. ## Further plans - Pre-defined queries for common languages. +- Temporal decay: weight recent changes higher than old ones. +- Cross-repository coupling analysis. 
diff --git a/pkg/analyzers/shotness/analyzer_test.go b/pkg/analyzers/shotness/analyzer_test.go index 42134df..8dc3af4 100644 --- a/pkg/analyzers/shotness/analyzer_test.go +++ b/pkg/analyzers/shotness/analyzer_test.go @@ -3,12 +3,15 @@ package shotness import ( "bytes" "encoding/json" + "errors" "testing" "github.com/stretchr/testify/assert" "github.com/stretchr/testify/require" "github.com/Sumatoshi-tech/codefang/pkg/analyzers/analyze" + "github.com/Sumatoshi-tech/codefang/pkg/gitlib" + "github.com/Sumatoshi-tech/codefang/pkg/uast" "github.com/Sumatoshi-tech/codefang/pkg/uast/pkg/node" ) @@ -454,3 +457,537 @@ func TestAnalyzer_Serialize_YAML_UsesComputedMetrics(t *testing.T) { assert.Contains(t, output, "hotspot_nodes:") assert.Contains(t, output, "aggregate:") } + +func TestNodeSummary_String(t *testing.T) { + t.Parallel() + + ns := NodeSummary{Type: "Function", Name: "foo", File: "main.go"} + assert.Equal(t, "Function_foo_main.go", ns.String()) +} + +func TestAnalyzer_CPUHeavy(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + assert.True(t, s.CPUHeavy()) +} + +func TestAnalyzer_SequentialOnly(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + assert.False(t, s.SequentialOnly()) +} + +func TestAnalyzer_NeedsUAST(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + assert.True(t, s.NeedsUAST()) +} + +func TestShouldConsumeCommit_SingleParent(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + require.NoError(t, s.Initialize(nil)) + + commit := &mockCommit{hash: [20]byte{1}, parents: 1} + assert.True(t, s.shouldConsumeCommit(commit)) +} + +func TestShouldConsumeCommit_FirstMerge(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + require.NoError(t, s.Initialize(nil)) + + commit := &mockCommit{hash: [20]byte{1}, parents: 2} + assert.True(t, s.shouldConsumeCommit(commit)) +} + +func TestShouldConsumeCommit_DuplicateMerge(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + require.NoError(t, s.Initialize(nil)) + + commit := &mockCommit{hash: 
[20]byte{1}, parents: 2} + assert.True(t, s.shouldConsumeCommit(commit)) + assert.False(t, s.shouldConsumeCommit(commit)) +} + +func TestAddNode_NewNode(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + require.NoError(t, s.Initialize(nil)) + + n := &node.Node{Type: "Function", Token: "test"} + allNodes := map[string]bool{} + + s.addNode("testFunc", n, "main.go", allNodes) + + assert.True(t, allNodes["Function_testFunc_main.go"]) + assert.NotNil(t, s.nodes["Function_testFunc_main.go"]) + assert.Equal(t, 1, s.nodes["Function_testFunc_main.go"].Count) +} + +func TestAddNode_ExistingNode_DifferentCommit(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + require.NoError(t, s.Initialize(nil)) + + n := &node.Node{Type: "Function", Token: "test"} + allNodes := map[string]bool{} + + s.addNode("testFunc", n, "main.go", allNodes) + + // Second time with fresh allNodes. + allNodes2 := map[string]bool{} + s.addNode("testFunc", n, "main.go", allNodes2) + + assert.Equal(t, 2, s.nodes["Function_testFunc_main.go"].Count) +} + +func TestAddNode_ExistingNode_SameCommit(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + require.NoError(t, s.Initialize(nil)) + + n := &node.Node{Type: "Function", Token: "test"} + allNodes := map[string]bool{} + + s.addNode("testFunc", n, "main.go", allNodes) + s.addNode("testFunc", n, "main.go", allNodes) + + // Same commit: should not increment count. 
+ assert.Equal(t, 1, s.nodes["Function_testFunc_main.go"].Count) +} + +func TestUpdateCouplings(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + require.NoError(t, s.Initialize(nil)) + + s.nodes["a"] = &nodeShotness{Count: 1, Couples: map[string]int{}} + s.nodes["b"] = &nodeShotness{Count: 1, Couples: map[string]int{}} + + allNodes := map[string]bool{"a": true, "b": true} + s.updateCouplings(allNodes) + + assert.Equal(t, 1, s.nodes["a"].Couples["b"]) + assert.Equal(t, 1, s.nodes["b"].Couples["a"]) +} + +func TestApplyRename(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + require.NoError(t, s.Initialize(nil)) + + key := "Function_foo_old.go" + s.nodes[key] = &nodeShotness{ + Summary: NodeSummary{Type: "Function", Name: "foo", File: "old.go"}, + Count: 5, + Couples: map[string]int{}, + } + s.files["old.go"] = map[string]*nodeShotness{key: s.nodes[key]} + + s.applyRename("old.go", "new.go") + + newKey := "Function_foo_new.go" + + assert.Nil(t, s.nodes[key]) + assert.NotNil(t, s.nodes[newKey]) + assert.Equal(t, "new.go", s.nodes[newKey].Summary.File) + assert.NotNil(t, s.files["new.go"]) + assert.Nil(t, s.files["old.go"]) +} + +func TestGenLine2Node(t *testing.T) { + t.Parallel() + + n := &node.Node{ + Type: "Function", + Pos: &node.Positions{StartLine: 2, EndLine: 4}, + } + + result := genLine2Node(map[string]*node.Node{"fn": n}, 5) + require.Len(t, result, 5) + assert.Nil(t, result[0]) // Line 1. + assert.Len(t, result[1], 1) // Line 2. + assert.Len(t, result[2], 1) // Line 3. + assert.Len(t, result[3], 1) // Line 4. + assert.Nil(t, result[4]) // Line 5. 
+} + +func TestGenLine2Node_NilPos(t *testing.T) { + t.Parallel() + + n := &node.Node{Type: "Function", Pos: nil} + + result := genLine2Node(map[string]*node.Node{"fn": n}, 3) + require.Len(t, result, 3) + assert.Nil(t, result[0]) + assert.Nil(t, result[1]) + assert.Nil(t, result[2]) +} + +func TestResolveEndLine_WithEndLine(t *testing.T) { + t.Parallel() + + n := &node.Node{ + Pos: &node.Positions{StartLine: 5, EndLine: 10}, + } + + assert.Equal(t, 10, resolveEndLine(n, n.Pos)) +} + +func TestResolveEndLine_WalksChildren(t *testing.T) { + t.Parallel() + + child := &node.Node{ + Pos: &node.Positions{StartLine: 8, EndLine: 15}, + } + parent := &node.Node{ + Pos: &node.Positions{StartLine: 5, EndLine: 5}, + Children: []*node.Node{child}, + } + + assert.Equal(t, 15, resolveEndLine(parent, parent.Pos)) +} + +func TestReverseNodeMap(t *testing.T) { + t.Parallel() + + n1 := &node.Node{ID: "id1"} + n2 := &node.Node{ID: "id2"} + + result := reverseNodeMap(map[string]*node.Node{"name1": n1, "name2": n2}) + assert.Equal(t, "name1", result["id1"]) + assert.Equal(t, "name2", result["id2"]) +} + +func TestRebuildFilesMap(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + require.NoError(t, s.Initialize(nil)) + + s.nodes["Function_foo_a.go"] = &nodeShotness{ + Summary: NodeSummary{Type: "Function", Name: "foo", File: "a.go"}, + Count: 1, + Couples: map[string]int{}, + } + s.nodes["Function_bar_b.go"] = &nodeShotness{ + Summary: NodeSummary{Type: "Function", Name: "bar", File: "b.go"}, + Count: 2, + Couples: map[string]int{}, + } + + s.rebuildFilesMap() + + assert.Len(t, s.files, 2) + assert.NotNil(t, s.files["a.go"]["Function_foo_a.go"]) + assert.NotNil(t, s.files["b.go"]["Function_bar_b.go"]) +} + +func TestExtractTC(t *testing.T) { + t.Parallel() + + byTick := make(map[int]*TickData) + + cd := &CommitData{ + NodesTouched: map[string]NodeDelta{ + "Function_foo_a.go": { + Summary: NodeSummary{Type: "Function", Name: "foo", File: "a.go"}, + CountDelta: 1, + }, + }, + 
Couples: []CouplingPair{}, + } + + tc := analyze.TC{Tick: 0, Data: cd} + + err := extractTC(tc, byTick) + require.NoError(t, err) + require.Contains(t, byTick, 0) + assert.Equal(t, 1, byTick[0].Nodes["Function_foo_a.go"].Count) +} + +func TestExtractTC_NilData(t *testing.T) { + t.Parallel() + + byTick := make(map[int]*TickData) + tc := analyze.TC{Tick: 0, Data: nil} + + err := extractTC(tc, byTick) + require.NoError(t, err) + assert.Empty(t, byTick) +} + +func TestExtractTC_WrongDataType(t *testing.T) { + t.Parallel() + + byTick := make(map[int]*TickData) + tc := analyze.TC{Tick: 0, Data: "not_commit_data"} + + err := extractTC(tc, byTick) + require.NoError(t, err) + assert.Empty(t, byTick) +} + +func TestExtractTC_WithCouples(t *testing.T) { + t.Parallel() + + byTick := make(map[int]*TickData) + + cd := &CommitData{ + NodesTouched: map[string]NodeDelta{ + "a": {Summary: NodeSummary{Name: "foo"}, CountDelta: 1}, + "b": {Summary: NodeSummary{Name: "bar"}, CountDelta: 1}, + }, + Couples: []CouplingPair{{Key1: "a", Key2: "b"}}, + } + + err := extractTC(analyze.TC{Tick: 0, Data: cd}, byTick) + require.NoError(t, err) + + assert.Equal(t, 1, byTick[0].Nodes["a"].Couples["b"]) + assert.Equal(t, 1, byTick[0].Nodes["b"].Couples["a"]) +} + +func TestMergeState_BothNil(t *testing.T) { + t.Parallel() + + result := mergeState(nil, nil) + assert.Nil(t, result) +} + +func TestMergeState_ExistingNil(t *testing.T) { + t.Parallel() + + incoming := &TickData{Nodes: map[string]*nodeShotnessData{"a": {Count: 1}}} + result := mergeState(nil, incoming) + assert.Equal(t, incoming, result) +} + +func TestMergeState_IncomingNil(t *testing.T) { + t.Parallel() + + existing := &TickData{Nodes: map[string]*nodeShotnessData{"a": {Count: 1}}} + result := mergeState(existing, nil) + assert.Equal(t, existing, result) +} + +func TestMergeState_BothPresent(t *testing.T) { + t.Parallel() + + existing := &TickData{Nodes: map[string]*nodeShotnessData{ + "a": {Count: 5, Couples: map[string]int{"b": 2}}, 
+ }} + incoming := &TickData{Nodes: map[string]*nodeShotnessData{ + "a": {Count: 3, Couples: map[string]int{"b": 1, "c": 4}}, + "d": {Count: 7, Couples: map[string]int{}}, + }} + + result := mergeState(existing, incoming) + assert.Equal(t, 8, result.Nodes["a"].Count) + assert.Equal(t, 3, result.Nodes["a"].Couples["b"]) + assert.Equal(t, 4, result.Nodes["a"].Couples["c"]) + assert.Equal(t, 7, result.Nodes["d"].Count) +} + +func TestMergeState_NilNodesMap(t *testing.T) { + t.Parallel() + + existing := &TickData{Nodes: nil} + incoming := &TickData{Nodes: map[string]*nodeShotnessData{ + "a": {Count: 1, Couples: map[string]int{}}, + }} + + result := mergeState(existing, incoming) + assert.NotNil(t, result.Nodes) + assert.Equal(t, 1, result.Nodes["a"].Count) +} + +func TestSizeState_Nil(t *testing.T) { + t.Parallel() + + assert.Equal(t, int64(0), sizeState(nil)) +} + +func TestSizeState_WithData(t *testing.T) { + t.Parallel() + + state := &TickData{Nodes: map[string]*nodeShotnessData{ + "a": {Count: 1, Couples: map[string]int{"b": 1, "c": 2}}, + }} + + size := sizeState(state) + assert.Positive(t, size) +} + +func TestBuildTick_Nil(t *testing.T) { + t.Parallel() + + tick, err := buildTick(5, nil) + require.NoError(t, err) + assert.Equal(t, 5, tick.Tick) +} + +func TestBuildTick_WithData(t *testing.T) { + t.Parallel() + + state := &TickData{Nodes: map[string]*nodeShotnessData{ + "a": {Count: 1}, + }} + + tick, err := buildTick(3, state) + require.NoError(t, err) + assert.Equal(t, 3, tick.Tick) + assert.Equal(t, state, tick.Data) +} + +func TestCopyIntMap(t *testing.T) { + t.Parallel() + + src := map[string]int{"a": 1, "b": 2} + dst := copyIntMap(src) + + assert.Equal(t, src, dst) + + dst["c"] = 3 + + assert.NotContains(t, src, "c") +} + +func TestComputeMetricsSafe_EmptyReport(t *testing.T) { + t.Parallel() + + result, err := computeMetricsSafe(analyze.Report{}) + require.NoError(t, err) + require.NotNil(t, result) +} + +func TestComputeMetricsSafe_WithData(t *testing.T) 
{ + t.Parallel() + + report := analyze.Report{ + "Nodes": []NodeSummary{ + {Type: "Function", Name: "foo", File: "a.go"}, + }, + "Counters": []map[int]int{ + {0: 10}, + }, + } + + result, err := computeMetricsSafe(report) + require.NoError(t, err) + assert.Len(t, result.NodeHotness, 1) +} + +func TestComputedMetrics_AnalyzerName(t *testing.T) { + t.Parallel() + + m := &ComputedMetrics{} + assert.Equal(t, "shotness", m.AnalyzerName()) +} + +func TestComputedMetrics_ToJSON(t *testing.T) { + t.Parallel() + + m := &ComputedMetrics{} + assert.Equal(t, m, m.ToJSON()) +} + +func TestComputedMetrics_ToYAML(t *testing.T) { + t.Parallel() + + m := &ComputedMetrics{} + assert.Equal(t, m, m.ToYAML()) +} + +func TestConfigure_WithFacts(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + err := s.Configure(map[string]any{ + ConfigShotnessDSLStruct: "filter(.roles has \"Class\")", + ConfigShotnessDSLName: ".props.className", + }) + + require.NoError(t, err) + assert.Equal(t, "filter(.roles has \"Class\")", s.DSLStruct) + assert.Equal(t, ".props.className", s.DSLName) +} + +func TestConfigure_Defaults(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + err := s.Configure(map[string]any{}) + require.NoError(t, err) + assert.Equal(t, DefaultShotnessDSLStruct, s.DSLStruct) + assert.Equal(t, DefaultShotnessDSLName, s.DSLName) +} + +func TestHandleDeletion(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + require.NoError(t, s.Initialize(nil)) + + key := "Function_foo_deleted.go" + s.nodes[key] = &nodeShotness{ + Summary: NodeSummary{Type: "Function", Name: "foo", File: "deleted.go"}, + Count: 3, + Couples: map[string]int{}, + } + s.files["deleted.go"] = map[string]*nodeShotness{key: s.nodes[key]} + + change := uast.Change{ + Change: &gitlib.Change{ + From: gitlib.ChangeEntry{Name: "deleted.go"}, + }, + } + + s.handleDeletion(change) + + assert.Nil(t, s.nodes[key]) + assert.Nil(t, s.files["deleted.go"]) +} + +// errMockNotImpl is returned by mock methods that are not 
implemented. +var errMockNotImpl = errors.New("mock: not implemented") + +type mockCommit struct { + hash gitlib.Hash + parents int +} + +func (m *mockCommit) Hash() gitlib.Hash { return m.hash } +func (m *mockCommit) NumParents() int { return m.parents } +func (m *mockCommit) Author() gitlib.Signature { return gitlib.Signature{} } +func (m *mockCommit) Committer() gitlib.Signature { return gitlib.Signature{} } +func (m *mockCommit) Message() string { return "" } + +func (m *mockCommit) Parent(_ int) (*gitlib.Commit, error) { + return nil, errMockNotImpl +} + +func (m *mockCommit) Tree() (*gitlib.Tree, error) { + return nil, errMockNotImpl +} + +func (m *mockCommit) Files() (*gitlib.FileIter, error) { + return nil, errMockNotImpl +} + +func (m *mockCommit) File(_ string) (*gitlib.File, error) { + return nil, errMockNotImpl +} diff --git a/pkg/analyzers/shotness/metrics.go b/pkg/analyzers/shotness/metrics.go index d449ca4..19ae1df 100644 --- a/pkg/analyzers/shotness/metrics.go +++ b/pkg/analyzers/shotness/metrics.go @@ -43,11 +43,12 @@ type NodeHotnessData struct { // NodeCouplingData contains coupling between code nodes. type NodeCouplingData struct { - Node1Name string `json:"node1_name" yaml:"node1_name"` - Node1File string `json:"node1_file" yaml:"node1_file"` - Node2Name string `json:"node2_name" yaml:"node2_name"` - Node2File string `json:"node2_file" yaml:"node2_file"` - CoChanges int `json:"co_changes" yaml:"co_changes"` + Node1Name string `json:"node1_name" yaml:"node1_name"` + Node1File string `json:"node1_file" yaml:"node1_file"` + Node2Name string `json:"node2_name" yaml:"node2_name"` + Node2File string `json:"node2_file" yaml:"node2_file"` + CoChanges int `json:"co_changes" yaml:"co_changes"` + Strength float64 `json:"coupling_strength" yaml:"coupling_strength"` } // HotspotNodeData identifies hot nodes that change frequently. @@ -61,20 +62,18 @@ type HotspotNodeData struct { // AggregateData contains summary statistics. 
type AggregateData struct { - TotalNodes int `json:"total_nodes" yaml:"total_nodes"` - TotalChanges int `json:"total_changes" yaml:"total_changes"` - TotalCouplings int `json:"total_couplings" yaml:"total_couplings"` - AvgChangesPerNode float64 `json:"avg_changes_per_node" yaml:"avg_changes_per_node"` - HotNodes int `json:"hot_nodes" yaml:"hot_nodes"` + TotalNodes int `json:"total_nodes" yaml:"total_nodes"` + TotalChanges int `json:"total_changes" yaml:"total_changes"` + TotalCouplings int `json:"total_couplings" yaml:"total_couplings"` + AvgChangesPerNode float64 `json:"avg_changes_per_node" yaml:"avg_changes_per_node"` + AvgCouplingStrength float64 `json:"avg_coupling_strength" yaml:"avg_coupling_strength"` + HotNodes int `json:"hot_nodes" yaml:"hot_nodes"` } // Hotspot thresholds. const ( HotspotThresholdHigh = 20 HotspotThresholdMedium = 10 - - // Coupling divisor for strength calculation. - couplingDivisor = 2 ) // --- Pure Metric Functions ---. @@ -128,7 +127,8 @@ func computeNodeHotness(input *ReportData) []NodeHotnessData { return result } -// computeNodeCoupling calculates node coupling data. +// computeNodeCoupling calculates node coupling data with normalized strength. +// Strength formula: co_changes(A,B) / max(co_changes(A,B), changes(A), changes(B)). func computeNodeCoupling(input *ReportData) []NodeCouplingData { var result []NodeCouplingData @@ -138,10 +138,11 @@ func computeNodeCoupling(input *ReportData) []NodeCouplingData { } node1 := input.Nodes[i] + selfChangesI := counters[i] for j, coChanges := range counters { if j <= i || j >= len(input.Nodes) { - continue // Skip self and lower triangle. 
+ continue } if coChanges == 0 { @@ -150,17 +151,24 @@ func computeNodeCoupling(input *ReportData) []NodeCouplingData { node2 := input.Nodes[j] + selfChangesJ := 0 + if j < len(input.Counters) { + selfChangesJ = input.Counters[j][j] + } + + strength := computeCouplingStrength(coChanges, selfChangesI, selfChangesJ) + result = append(result, NodeCouplingData{ Node1Name: node1.Name, Node1File: node1.File, Node2Name: node2.Name, Node2File: node2.File, CoChanges: coChanges, + Strength: strength, }) } } - // Sort by co-changes descending. sort.Slice(result, func(i, j int) bool { return result[i].CoChanges > result[j].CoChanges }) @@ -168,22 +176,41 @@ func computeNodeCoupling(input *ReportData) []NodeCouplingData { return result } +// computeCouplingStrength returns normalized coupling confidence in [0, 1]. +// Formula: co_changes / max(co_changes, changes_a, changes_b). +// Including co_changes in the denominator guarantees the result never exceeds 1. +func computeCouplingStrength(coChanges, changesA, changesB int) float64 { + maxChanges := max(coChanges, max(changesA, changesB)) + if maxChanges <= 0 { + return 0 + } + + return float64(coChanges) / float64(maxChanges) +} + +// Risk level constants. +const ( + RiskLevelHigh = "HIGH" + RiskLevelMedium = "MEDIUM" + RiskLevelLow = "LOW" +) + func classifyChangeRisk(changeCount int) string { switch { case changeCount >= HotspotThresholdHigh: - return "HIGH" + return RiskLevelHigh case changeCount >= HotspotThresholdMedium: - return "MEDIUM" + return RiskLevelMedium default: - return "" + return RiskLevelLow } } -// computeHotspotNodes identifies hotspot nodes. +// computeHotspotNodes identifies hotspot nodes (MEDIUM and HIGH risk only). 
func computeHotspotNodes(input *ReportData) []HotspotNodeData { var result []HotspotNodeData - for i, node := range input.Nodes { + for i, n := range input.Nodes { if i >= len(input.Counters) { continue } @@ -196,20 +223,19 @@ func computeHotspotNodes(input *ReportData) []HotspotNodeData { } riskLevel := classifyChangeRisk(changeCount) - if riskLevel == "" { - continue // Skip low-risk nodes. + if riskLevel == RiskLevelLow { + continue } result = append(result, HotspotNodeData{ - Name: node.Name, - Type: node.Type, - File: node.File, + Name: n.Name, + Type: n.Type, + File: n.File, ChangeCount: changeCount, RiskLevel: riskLevel, }) } - // Sort by change count descending. sort.Slice(result, func(i, j int) bool { return result[i].ChangeCount > result[j].ChangeCount }) @@ -225,30 +251,47 @@ func computeAggregate(input *ReportData) AggregateData { var totalChanges, totalCouplings, hotNodes int + var strengthSum float64 + + var pairCount int + for i, counters := range input.Counters { - if selfCount, ok := counters[i]; ok { - totalChanges += selfCount - if selfCount >= HotspotThresholdMedium { - hotNodes++ - } + selfI := counters[i] + totalChanges += selfI + + if selfI >= HotspotThresholdMedium { + hotNodes++ } - // Count couplings (non-self entries). - for j := range counters { - if j != i { - totalCouplings++ + for j, coChanges := range counters { + if j <= i || coChanges == 0 { + continue + } + + totalCouplings++ + pairCount++ + + selfJ := 0 + if j < len(input.Counters) { + selfJ = input.Counters[j][j] } + + strengthSum += computeCouplingStrength(coChanges, selfI, selfJ) } } agg.TotalChanges = totalChanges - agg.TotalCouplings = totalCouplings / couplingDivisor // Divide by 2 since counted twice. 
+ agg.TotalCouplings = totalCouplings agg.HotNodes = hotNodes if agg.TotalNodes > 0 { agg.AvgChangesPerNode = float64(totalChanges) / float64(agg.TotalNodes) } + if pairCount > 0 { + agg.AvgCouplingStrength = strengthSum / float64(pairCount) + } + return agg } diff --git a/pkg/analyzers/shotness/metrics_test.go b/pkg/analyzers/shotness/metrics_test.go index 88884df..8f10a64 100644 --- a/pkg/analyzers/shotness/metrics_test.go +++ b/pkg/analyzers/shotness/metrics_test.go @@ -227,11 +227,11 @@ func TestHotspotNodeMetric_ValidData(t *testing.T) { // Should be sorted by change count descending. assert.Equal(t, testNodeName2, result[0].Name) - assert.Equal(t, "HIGH", result[0].RiskLevel) + assert.Equal(t, RiskLevelHigh, result[0].RiskLevel) assert.Equal(t, HotspotThresholdHigh, result[0].ChangeCount) assert.Equal(t, testNodeName3, result[1].Name) - assert.Equal(t, "MEDIUM", result[1].RiskLevel) + assert.Equal(t, RiskLevelMedium, result[1].RiskLevel) assert.Equal(t, HotspotThresholdMedium, result[1].ChangeCount) } @@ -269,9 +269,10 @@ func TestAggregateMetric_ValidData(t *testing.T) { assert.Equal(t, 2, result.TotalNodes) assert.Equal(t, HotspotThresholdHigh+10, result.TotalChanges) // Total is 30. - assert.Equal(t, 1, result.TotalCouplings) // Computed as half of 2. + assert.Equal(t, 1, result.TotalCouplings) // Upper triangle only. assert.InDelta(t, 15.0, result.AvgChangesPerNode, floatDelta) // Average is 15.0. assert.Equal(t, 2, result.HotNodes) // Both nodes are hot. 
+ assert.InDelta(t, 0.25, result.AvgCouplingStrength, floatDelta) } func TestClassifyChangeRisk(t *testing.T) { @@ -282,11 +283,11 @@ func TestClassifyChangeRisk(t *testing.T) { count int expected string }{ - {"Low Risk", HotspotThresholdMedium - 1, ""}, - {"Medium Risk Min", HotspotThresholdMedium, "MEDIUM"}, - {"Medium Risk Max", HotspotThresholdHigh - 1, "MEDIUM"}, - {"High Risk Min", HotspotThresholdHigh, "HIGH"}, - {"High Risk Max", HotspotThresholdHigh + 100, "HIGH"}, + {"Low Risk", HotspotThresholdMedium - 1, RiskLevelLow}, + {"Medium Risk Min", HotspotThresholdMedium, RiskLevelMedium}, + {"Medium Risk Max", HotspotThresholdHigh - 1, RiskLevelMedium}, + {"High Risk Min", HotspotThresholdHigh, RiskLevelHigh}, + {"High Risk Max", HotspotThresholdHigh + 100, RiskLevelHigh}, } for _, tt := range tests { @@ -297,3 +298,72 @@ func TestClassifyChangeRisk(t *testing.T) { }) } } + +// --- Coupling Strength Tests ---. + +func TestComputeCouplingStrength_Basic(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + co int + a int + b int + expected float64 + }{ + {"equal changes", 5, 5, 5, 1.0}, + {"half coupled", 5, 10, 10, 0.5}, + {"asymmetric", 3, 3, 10, 0.3}, + {"zero max", 0, 0, 0, 0.0}, + {"co exceeds self", 5, 3, 4, 1.0}, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + result := computeCouplingStrength(tt.co, tt.a, tt.b) + assert.InDelta(t, tt.expected, result, floatDelta) + }) + } +} + +func TestNodeCouplingMetric_IncludesStrength(t *testing.T) { + t.Parallel() + + input := &ReportData{ + Nodes: []NodeSummary{ + {Name: testNodeName1, Type: testNodeType, File: testFile1}, + {Name: testNodeName2, Type: testNodeType, File: testFile2}, + }, + Counters: []map[int]int{ + {0: 10, 1: 5}, + {0: 5, 1: 20}, + }, + } + + result := computeNodeCoupling(input) + + require.Len(t, result, 1) + assert.Equal(t, 5, result[0].CoChanges) + assert.InDelta(t, 0.25, result[0].Strength, floatDelta) +} + +func 
TestAggregateMetric_IncludesAvgCouplingStrength(t *testing.T) { + t.Parallel() + + input := &ReportData{ + Nodes: []NodeSummary{ + {Name: testNodeName1}, + {Name: testNodeName2}, + }, + Counters: []map[int]int{ + {0: 10, 1: 5}, + {0: 5, 1: 10}, + }, + } + + result := computeAggregate(input) + + assert.InDelta(t, 0.5, result.AvgCouplingStrength, floatDelta) +} diff --git a/pkg/analyzers/shotness/plot.go b/pkg/analyzers/shotness/plot.go index 14eeb52..ac3f5f4 100644 --- a/pkg/analyzers/shotness/plot.go +++ b/pkg/analyzers/shotness/plot.go @@ -2,6 +2,8 @@ package shotness import ( "errors" + "fmt" + "io" "path/filepath" "sort" @@ -41,6 +43,18 @@ func RegisterPlotSections() { }) } +func (s *Analyzer) generatePlot(report analyze.Report, writer io.Writer) error { + sections, err := s.GenerateSections(report) + if err != nil { + return fmt.Errorf("generate sections: %w", err) + } + + page := plotpage.NewPage("Shotness Analysis", "Function-level change frequency and coupling") + page.Add(sections...) + + return page.Render(writer) +} + // GenerateSections returns the sections for combined reports. 
func (s *Analyzer) GenerateSections(report analyze.Report) ([]plotpage.Section, error) { nodes, counters, err := extractShotnessData(report) diff --git a/pkg/analyzers/shotness/plot_test.go b/pkg/analyzers/shotness/plot_test.go new file mode 100644 index 0000000..d15fe15 --- /dev/null +++ b/pkg/analyzers/shotness/plot_test.go @@ -0,0 +1,432 @@ +package shotness + +import ( + "bytes" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/Sumatoshi-tech/codefang/pkg/analyzers/analyze" +) + +func TestExtractShotnessData_TypedInput(t *testing.T) { + t.Parallel() + + report := analyze.Report{ + "Nodes": []NodeSummary{ + {Type: "Function", Name: "foo", File: "a.go"}, + }, + "Counters": []map[int]int{ + {0: 5}, + }, + } + + nodes, counters, err := extractShotnessData(report) + require.NoError(t, err) + require.Len(t, nodes, 1) + require.Len(t, counters, 1) + assert.Equal(t, "foo", nodes[0].Name) + assert.Equal(t, 5, counters[0][0]) +} + +func TestExtractShotnessData_MissingNodes(t *testing.T) { + t.Parallel() + + report := analyze.Report{} + + _, _, err := extractShotnessData(report) + require.ErrorIs(t, err, ErrInvalidNodes) +} + +func TestExtractShotnessData_MissingCounters(t *testing.T) { + t.Parallel() + + report := analyze.Report{ + "Nodes": []NodeSummary{{Name: "f"}}, + } + + _, _, err := extractShotnessData(report) + require.ErrorIs(t, err, ErrInvalidCounters) +} + +func TestExtractShotnessFromJSON_HotnessPath(t *testing.T) { + t.Parallel() + + report := analyze.Report{ + "node_hotness": []any{ + map[string]any{ + "name": "funcA", + "type": "Function", + "file": "a.go", + "change_count": float64(10), + }, + map[string]any{ + "name": "funcB", + "type": "Function", + "file": "b.go", + "change_count": float64(5), + }, + }, + "node_coupling": []any{ + map[string]any{ + "node1_name": "funcA", + "node2_name": "funcB", + "co_changes": float64(3), + }, + }, + } + + nodes, counters, err := extractShotnessData(report) + 
require.NoError(t, err) + require.Len(t, nodes, 2) + require.Len(t, counters, 2) + assert.Equal(t, "funcA", nodes[0].Name) + assert.Equal(t, 10, counters[0][0]) + assert.Equal(t, 3, counters[0][1]) + assert.Equal(t, 3, counters[1][0]) +} + +func TestExtractShotnessFromJSON_NilHotness(t *testing.T) { + t.Parallel() + + report := analyze.Report{ + "node_hotness": nil, + } + + nodes, counters, err := extractShotnessData(report) + require.NoError(t, err) + assert.Nil(t, nodes) + assert.Nil(t, counters) +} + +func TestExtractShotnessFromJSON_EmptyHotness(t *testing.T) { + t.Parallel() + + report := analyze.Report{ + "node_hotness": []any{}, + } + + nodes, counters, err := extractShotnessData(report) + require.NoError(t, err) + assert.Nil(t, nodes) + assert.Nil(t, counters) +} + +func TestExtractShotnessFromJSON_InvalidHotnessType(t *testing.T) { + t.Parallel() + + report := analyze.Report{ + "node_hotness": "not_a_list", + } + + _, _, err := extractShotnessData(report) + require.ErrorIs(t, err, ErrInvalidNodes) +} + +func TestShotnessToInt_Float64(t *testing.T) { + t.Parallel() + + assert.Equal(t, 42, shotnessToInt(float64(42))) +} + +func TestShotnessToInt_Int(t *testing.T) { + t.Parallel() + + assert.Equal(t, 7, shotnessToInt(7)) +} + +func TestShotnessToInt_Int64(t *testing.T) { + t.Parallel() + + assert.Equal(t, 99, shotnessToInt(int64(99))) +} + +func TestShotnessToInt_Unknown(t *testing.T) { + t.Parallel() + + assert.Equal(t, 0, shotnessToInt("not_a_number")) +} + +func TestAssertString_Valid(t *testing.T) { + t.Parallel() + + val, ok := assertString(map[string]any{"key": "value"}, "key") + assert.True(t, ok) + assert.Equal(t, "value", val) +} + +func TestAssertString_Missing(t *testing.T) { + t.Parallel() + + val, ok := assertString(map[string]any{}, "key") + assert.False(t, ok) + assert.Empty(t, val) +} + +func TestBuildFileHierarchy(t *testing.T) { + t.Parallel() + + nodes := []NodeSummary{ + {Name: "f1", File: "a.go"}, + {Name: "f2", File: "a.go"}, + {Name: 
"f3", File: "b.go"}, + } + counters := []map[int]int{ + {0: 10}, + {1: 5}, + {2: 3}, + } + + fm, ft := buildFileHierarchy(nodes, counters) + require.Len(t, fm, 2) + assert.Equal(t, 15, ft["a.go"]) + assert.Equal(t, 3, ft["b.go"]) +} + +func TestBuildRootNodes_LimitMaxFiles(t *testing.T) { + t.Parallel() + + fileMap := make(map[string][]NodeSummary) + fileTotals := make(map[string]int) + + for i := range maxFiles + 5 { + fname := "file_" + string(rune('a'+i)) + fileMap[fname] = nil + fileTotals[fname] = i + } + + result := buildRootNodes(nil, fileTotals) + assert.LessOrEqual(t, len(result), maxFiles) +} + +func TestGetActiveNodes(t *testing.T) { + t.Parallel() + + nodes := []NodeSummary{ + {Name: "active"}, + {Name: "inactive"}, + } + counters := []map[int]int{ + {0: 5}, + {1: 0}, + } + + actives := getActiveNodes(nodes, counters) + require.Len(t, actives, 1) + assert.Equal(t, "active", actives[0].name) +} + +func TestExtractNames(t *testing.T) { + t.Parallel() + + actives := []activeNode{ + {name: "a"}, + {name: "b"}, + } + + names := extractNames(actives) + assert.Equal(t, []string{"a", "b"}, names) +} + +func TestBuildHeatMapData(t *testing.T) { + t.Parallel() + + actives := []activeNode{ + {idx: 0, name: "a", count: 5}, + {idx: 1, name: "b", count: 3}, + } + counters := []map[int]int{ + {0: 5, 1: 2}, + {0: 2, 1: 3}, + } + + data, maxVal := buildHeatMapData(actives, counters) + assert.Len(t, data, 4) // 2x2 matrix. 
+ assert.InDelta(t, 5.0, maxVal, 0.01) +} + +func TestComputeScores(t *testing.T) { + t.Parallel() + + nodes := []NodeSummary{ + {Name: "hot"}, + {Name: "cold"}, + } + counters := []map[int]int{ + {0: 20, 1: 5}, + {0: 5, 1: 2}, + } + + scores := computeScores(nodes, counters) + require.Len(t, scores, 2) + assert.Equal(t, "hot", scores[0].name) + assert.Equal(t, 20, scores[0].self) +} + +func TestBuildBarData(t *testing.T) { + t.Parallel() + + scores := []nodeScore{ + {name: "a", self: 10, coupled: 5}, + {name: "b", self: 3, coupled: 1}, + } + + labels, selfData, coupledData := buildBarData(scores) + assert.Equal(t, []string{"a", "b"}, labels) + assert.Equal(t, []int{10, 3}, selfData) + assert.Equal(t, []int{5, 1}, coupledData) +} + +func TestCreateEmptyChart(t *testing.T) { + t.Parallel() + + chart := createEmptyChart() + require.NotNil(t, chart) +} + +func TestGenerateChart_WithData(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + report := analyze.Report{ + "Nodes": []NodeSummary{ + {Type: "Function", Name: "foo", File: "a.go"}, + {Type: "Function", Name: "bar", File: "b.go"}, + }, + "Counters": []map[int]int{ + {0: 10, 1: 3}, + {0: 3, 1: 5}, + }, + } + + chart, err := s.GenerateChart(report) + require.NoError(t, err) + require.NotNil(t, chart) +} + +func TestGenerateChart_Empty(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + report := analyze.Report{ + "Nodes": []NodeSummary{}, + "Counters": []map[int]int{}, + } + + chart, err := s.GenerateChart(report) + require.NoError(t, err) + require.NotNil(t, chart) // Returns empty chart. 
+} + +func TestGenerateSections_WithData(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + report := analyze.Report{ + "Nodes": []NodeSummary{ + {Type: "Function", Name: "foo", File: "a.go"}, + {Type: "Function", Name: "bar", File: "b.go"}, + {Type: "Function", Name: "baz", File: "c.go"}, + }, + "Counters": []map[int]int{ + {0: 10, 1: 3, 2: 1}, + {0: 3, 1: 5, 2: 2}, + {0: 1, 1: 2, 2: 8}, + }, + } + + sections, err := s.GenerateSections(report) + require.NoError(t, err) + require.Len(t, sections, 3) + assert.Equal(t, "Code Hotness TreeMap", sections[0].Title) + assert.Equal(t, "Function Coupling Matrix", sections[1].Title) + assert.Equal(t, "Top Hot Functions", sections[2].Title) +} + +func TestGenerateSections_Empty(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + report := analyze.Report{ + "Nodes": []NodeSummary{}, + "Counters": []map[int]int{}, + } + + sections, err := s.GenerateSections(report) + require.NoError(t, err) + assert.Nil(t, sections) +} + +func TestGeneratePlot_WithData(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + report := analyze.Report{ + "Nodes": []NodeSummary{ + {Type: "Function", Name: "foo", File: "a.go"}, + {Type: "Function", Name: "bar", File: "b.go"}, + {Type: "Function", Name: "baz", File: "c.go"}, + }, + "Counters": []map[int]int{ + {0: 10, 1: 3, 2: 1}, + {0: 3, 1: 5, 2: 2}, + {0: 1, 1: 2, 2: 8}, + }, + } + + var buf bytes.Buffer + + err := s.generatePlot(report, &buf) + require.NoError(t, err) + assert.Contains(t, buf.String(), "Shotness Analysis") +} + +func TestSerialize_PlotFormat(t *testing.T) { + t.Parallel() + + s := NewAnalyzer() + report := analyze.Report{ + "Nodes": []NodeSummary{ + {Type: "Function", Name: "foo", File: "a.go"}, + {Type: "Function", Name: "bar", File: "b.go"}, + {Type: "Function", Name: "baz", File: "c.go"}, + }, + "Counters": []map[int]int{ + {0: 10, 1: 3, 2: 1}, + {0: 3, 1: 5, 2: 2}, + {0: 1, 1: 2, 2: 8}, + }, + } + + var buf bytes.Buffer + + err := s.Serialize(report, 
analyze.FormatPlot, &buf) + require.NoError(t, err) + assert.Positive(t, buf.Len()) +} + +func TestApplyCouplingData_NoCouplingField(t *testing.T) { + t.Parallel() + + counters := []map[int]int{{0: 5}, {1: 3}} + nameToIdx := map[string]int{"a": 0, "b": 1} + + applyCouplingData(analyze.Report{}, counters, nameToIdx) + + // No coupling data -> counters unchanged. + assert.Equal(t, 5, counters[0][0]) + assert.Equal(t, 3, counters[1][1]) +} + +func TestApplyCouplingData_InvalidCouplingType(t *testing.T) { + t.Parallel() + + counters := []map[int]int{{0: 5}} + nameToIdx := map[string]int{"a": 0} + + report := analyze.Report{"node_coupling": "not_a_list"} + applyCouplingData(report, counters, nameToIdx) + + assert.Equal(t, 5, counters[0][0]) +} diff --git a/pkg/analyzers/shotness/text.go b/pkg/analyzers/shotness/text.go new file mode 100644 index 0000000..bd8727c --- /dev/null +++ b/pkg/analyzers/shotness/text.go @@ -0,0 +1,244 @@ +package shotness + +import ( + "context" + "fmt" + "io" + "path/filepath" + "strconv" + + "github.com/Sumatoshi-tech/codefang/pkg/analyzers/analyze" + "github.com/Sumatoshi-tech/codefang/pkg/analyzers/common/terminal" +) + +const ( + textBarWidth = 20 + textLabelWidth = 24 + textHalfLabel = textLabelWidth / 2 + textIndent = " " + textMaxHot = 10 + textMaxCouplings = 10 + textMaxHotspots = 10 + percentFactor = 100 + summaryLabelWidth = 22 +) + +// Serialize dispatches to text, plot, or base format serialization. +func (s *Analyzer) Serialize(result analyze.Report, format string, writer io.Writer) error { + if format == analyze.FormatPlot { + return s.generatePlot(result, writer) + } + + if format == analyze.FormatText { + return s.generateText(result, writer) + } + + if s.BaseHistoryAnalyzer != nil { + return s.BaseHistoryAnalyzer.Serialize(result, format, writer) + } + + return fmt.Errorf("%w: %s", analyze.ErrUnsupportedFormat, format) +} + +// SerializeTICKs converts TICKs to report and serializes. 
+func (s *Analyzer) SerializeTICKs(ticks []analyze.TICK, format string, writer io.Writer) error { + if format == analyze.FormatPlot || format == analyze.FormatText { + report, err := s.ReportFromTICKs(context.Background(), ticks) + if err != nil { + return err + } + + if format == analyze.FormatPlot { + return s.generatePlot(report, writer) + } + + return s.generateText(report, writer) + } + + if s.BaseHistoryAnalyzer != nil { + return s.BaseHistoryAnalyzer.SerializeTICKs(ticks, format, writer) + } + + return fmt.Errorf("%w: %s", analyze.ErrUnsupportedFormat, format) +} + +// generateText writes a human-readable shotness summary to the writer. +func (s *Analyzer) generateText(report analyze.Report, writer io.Writer) error { + metrics, err := ComputeAllMetrics(report) + if err != nil { + return fmt.Errorf("compute metrics: %w", err) + } + + cfg := terminal.NewConfig() + width := cfg.Width + + header := terminal.DrawHeader( + "Shotness Analysis", + fmt.Sprintf("%d nodes", metrics.Aggregate.TotalNodes), + width, + ) + fmt.Fprintln(writer, header) + fmt.Fprintln(writer) + + writeSummarySection(writer, cfg, metrics.Aggregate) + + if len(metrics.NodeHotness) > 0 { + fmt.Fprintln(writer) + writeHottestFunctions(writer, cfg, metrics.NodeHotness) + } + + if len(metrics.HotspotNodes) > 0 { + fmt.Fprintln(writer) + writeRiskNodes(writer, cfg, metrics.HotspotNodes) + } + + if len(metrics.NodeCoupling) > 0 { + fmt.Fprintln(writer) + writeStrongestCouplings(writer, cfg, metrics.NodeCoupling) + } + + fmt.Fprintln(writer) + + return nil +} + +func writeSummarySection(writer io.Writer, cfg terminal.Config, agg AggregateData) { + fmt.Fprintf(writer, "%s%s\n", textIndent, + cfg.Colorize("Summary", terminal.ColorBlue)) + fmt.Fprintf(writer, "%s%s\n", textIndent, + terminal.DrawSeparator(cfg.Width-len(textIndent)*2)) + + fmt.Fprintf(writer, "%s%-*s %d\n", textIndent, summaryLabelWidth, "Total Nodes", agg.TotalNodes) + fmt.Fprintf(writer, "%s%-*s %d\n", textIndent, summaryLabelWidth, 
"Total Changes", agg.TotalChanges) + fmt.Fprintf(writer, "%s%-*s %.1f\n", textIndent, summaryLabelWidth, "Avg Changes/Node", agg.AvgChangesPerNode) + fmt.Fprintf(writer, "%s%-*s %d\n", textIndent, summaryLabelWidth, "Total Couplings", agg.TotalCouplings) + + strengthColor := terminal.ColorForScore(1.0 - agg.AvgCouplingStrength) + fmt.Fprintf(writer, "%s%-*s %s\n", textIndent, summaryLabelWidth, "Avg Coupling Strength", + cfg.Colorize(fmt.Sprintf("%.0f%%", agg.AvgCouplingStrength*percentFactor), strengthColor)) + + hotColor := terminal.ColorNone + if agg.HotNodes > 0 { + hotColor = terminal.ColorRed + } + + fmt.Fprintf(writer, "%s%-*s %s\n", textIndent, summaryLabelWidth, "Hot Nodes", + cfg.Colorize(strconv.Itoa(agg.HotNodes), hotColor)) +} + +func writeHottestFunctions(writer io.Writer, cfg terminal.Config, nodes []NodeHotnessData) { + fmt.Fprintf(writer, "%s%s\n", textIndent, + cfg.Colorize("Hottest Functions", terminal.ColorBlue)) + fmt.Fprintf(writer, "%s%s\n", textIndent, + terminal.DrawSeparator(cfg.Width-len(textIndent)*2)) + + shown := min(len(nodes), textMaxHot) + + for _, n := range nodes[:shown] { + label := formatNodeLabel(n.Name, n.File) + label = terminal.TruncateWithEllipsis(label, textLabelWidth) + + bar := terminal.DrawProgressBar(n.HotnessScore, textBarWidth) + scoreColor := hotnessColor(n.HotnessScore) + + fmt.Fprintf(writer, "%s%-*s [%s] %s (%d changes)\n", + textIndent, + textLabelWidth, label, + bar, + cfg.Colorize(fmt.Sprintf("%.1f", n.HotnessScore), scoreColor), + n.ChangeCount) + } + + if len(nodes) > textMaxHot { + fmt.Fprintf(writer, "%s%s\n", textIndent, + cfg.Colorize(fmt.Sprintf(" ... 
and %d more", len(nodes)-textMaxHot), terminal.ColorGray)) + } +} + +func writeRiskNodes(writer io.Writer, cfg terminal.Config, nodes []HotspotNodeData) { + fmt.Fprintf(writer, "%s%s\n", textIndent, + cfg.Colorize("Risk Assessment", terminal.ColorBlue)) + fmt.Fprintf(writer, "%s%s\n", textIndent, + terminal.DrawSeparator(cfg.Width-len(textIndent)*2)) + + shown := min(len(nodes), textMaxHotspots) + + for _, n := range nodes[:shown] { + label := formatNodeLabel(n.Name, n.File) + label = terminal.TruncateWithEllipsis(label, textLabelWidth) + + riskColor := riskLevelColor(n.RiskLevel) + + fmt.Fprintf(writer, "%s%-*s %s (%d changes)\n", + textIndent, + textLabelWidth, label, + cfg.Colorize(fmt.Sprintf("%-6s", n.RiskLevel), riskColor), + n.ChangeCount) + } + + if len(nodes) > textMaxHotspots { + fmt.Fprintf(writer, "%s%s\n", textIndent, + cfg.Colorize(fmt.Sprintf(" ... and %d more", len(nodes)-textMaxHotspots), terminal.ColorGray)) + } +} + +func writeStrongestCouplings(writer io.Writer, cfg terminal.Config, couplings []NodeCouplingData) { + fmt.Fprintf(writer, "%s%s\n", textIndent, + cfg.Colorize("Strongest Couplings", terminal.ColorBlue)) + fmt.Fprintf(writer, "%s%s\n", textIndent, + terminal.DrawSeparator(cfg.Width-len(textIndent)*2)) + + shown := min(len(couplings), textMaxCouplings) + + for _, c := range couplings[:shown] { + left := terminal.TruncateWithEllipsis(c.Node1Name, textHalfLabel) + right := terminal.TruncateWithEllipsis(c.Node2Name, textHalfLabel) + + strengthPct := c.Strength * percentFactor + strengthColor := couplingStrengthColor(c.Strength) + + fmt.Fprintf(writer, "%s%-*s %s %-*s %s (%d co-changes)\n", + textIndent, + textHalfLabel, left, + cfg.Colorize("↔", terminal.ColorGray), + textHalfLabel, right, + cfg.Colorize(fmt.Sprintf("%3.0f%%", strengthPct), strengthColor), + c.CoChanges) + } + + if len(couplings) > textMaxCouplings { + fmt.Fprintf(writer, "%s%s\n", textIndent, + cfg.Colorize(fmt.Sprintf(" ... 
and %d more", len(couplings)-textMaxCouplings), terminal.ColorGray)) + } +} + +// formatNodeLabel builds "name (file)" from the node name and file path. +func formatNodeLabel(name, file string) string { + if file == "" { + return name + } + + return fmt.Sprintf("%s (%s)", name, filepath.Base(file)) +} + +// hotnessColor returns a color based on hotness score (inverted: high = red). +func hotnessColor(score float64) terminal.Color { + return terminal.ColorForScore(1.0 - score) +} + +// riskLevelColor maps risk level to terminal color. +func riskLevelColor(level string) terminal.Color { + switch level { + case RiskLevelHigh: + return terminal.ColorRed + case RiskLevelMedium: + return terminal.ColorYellow + default: + return terminal.ColorGreen + } +} + +// couplingStrengthColor maps coupling strength to terminal color (high = concerning). +func couplingStrengthColor(strength float64) terminal.Color { + return terminal.ColorForScore(1.0 - strength) +} diff --git a/pkg/analyzers/shotness/text_test.go b/pkg/analyzers/shotness/text_test.go new file mode 100644 index 0000000..f93b9c9 --- /dev/null +++ b/pkg/analyzers/shotness/text_test.go @@ -0,0 +1,196 @@ +package shotness + +import ( + "bytes" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/Sumatoshi-tech/codefang/pkg/analyzers/analyze" +) + +func TestGenerateText_Summary(t *testing.T) { + t.Parallel() + + report := buildTestReport() + s := NewAnalyzer() + + var buf bytes.Buffer + + err := s.generateText(report, &buf) + require.NoError(t, err) + + output := buf.String() + assert.Contains(t, output, "Shotness Analysis") + assert.Contains(t, output, "Summary") + assert.Contains(t, output, "Total Nodes") + assert.Contains(t, output, "Total Changes") + assert.Contains(t, output, "Avg Changes/Node") + assert.Contains(t, output, "Avg Coupling Strength") +} + +func TestGenerateText_HottestFunctions(t *testing.T) { + t.Parallel() + + report := buildTestReport() + s := 
NewAnalyzer() + + var buf bytes.Buffer + + err := s.generateText(report, &buf) + require.NoError(t, err) + + output := buf.String() + assert.Contains(t, output, "Hottest Functions") + assert.Contains(t, output, "processPayment") + assert.Contains(t, output, "changes") +} + +func TestGenerateText_StrongestCouplings(t *testing.T) { + t.Parallel() + + report := buildTestReport() + s := NewAnalyzer() + + var buf bytes.Buffer + + err := s.generateText(report, &buf) + require.NoError(t, err) + + output := buf.String() + assert.Contains(t, output, "Strongest Couplings") + assert.Contains(t, output, "↔") + assert.Contains(t, output, "co-changes") +} + +func TestGenerateText_EmptyReport(t *testing.T) { + t.Parallel() + + report := analyze.Report{} + s := NewAnalyzer() + + var buf bytes.Buffer + + err := s.generateText(report, &buf) + require.NoError(t, err) + + output := buf.String() + assert.Contains(t, output, "Shotness Analysis") + assert.Contains(t, output, "0 nodes") +} + +func TestSerialize_Text(t *testing.T) { + t.Parallel() + + report := buildTestReport() + s := NewAnalyzer() + + var buf bytes.Buffer + + err := s.Serialize(report, analyze.FormatText, &buf) + require.NoError(t, err) + assert.Positive(t, buf.Len()) +} + +func TestSerialize_JSON_Passthrough(t *testing.T) { + t.Parallel() + + report := buildTestReport() + s := NewAnalyzer() + + var buf bytes.Buffer + + err := s.Serialize(report, analyze.FormatJSON, &buf) + require.NoError(t, err) + assert.Positive(t, buf.Len()) +} + +func TestSerialize_UnsupportedFormat(t *testing.T) { + t.Parallel() + + report := buildTestReport() + s := NewAnalyzer() + + var buf bytes.Buffer + + err := s.Serialize(report, "invalid_format", &buf) + require.Error(t, err) + assert.ErrorIs(t, err, analyze.ErrUnsupportedFormat) +} + +func TestFormatNodeLabel_WithFile(t *testing.T) { + t.Parallel() + + label := formatNodeLabel("processPayment", "pkg/core/engine.go") + assert.Equal(t, "processPayment (engine.go)", label) +} + +func 
TestFormatNodeLabel_NoFile(t *testing.T) { + t.Parallel() + + label := formatNodeLabel("processPayment", "") + assert.Equal(t, "processPayment", label) +} + +func TestHotnessColor(t *testing.T) { + t.Parallel() + + assert.NotEqual(t, hotnessColor(0.0), hotnessColor(1.0)) +} + +func TestRiskLevelColor(t *testing.T) { + t.Parallel() + + assert.NotEqual(t, riskLevelColor(RiskLevelHigh), riskLevelColor(RiskLevelLow)) +} + +func TestCouplingStrengthColor(t *testing.T) { + t.Parallel() + + assert.NotEqual(t, couplingStrengthColor(0.0), couplingStrengthColor(1.0)) +} + +func TestGenerateText_RiskAssessment(t *testing.T) { + t.Parallel() + + report := buildHotReport() + s := NewAnalyzer() + + var buf bytes.Buffer + + err := s.generateText(report, &buf) + require.NoError(t, err) + + output := buf.String() + assert.Contains(t, output, "Risk Assessment") + assert.Contains(t, output, RiskLevelHigh) +} + +func buildTestReport() analyze.Report { + return analyze.Report{ + "Nodes": []NodeSummary{ + {Type: "Function", Name: "processPayment", File: "pkg/core/engine.go"}, + {Type: "Function", Name: "validateInput", File: "pkg/core/engine.go"}, + {Type: "Function", Name: "handleRequest", File: "pkg/api/handler.go"}, + }, + "Counters": []map[int]int{ + {0: 15, 1: 8, 2: 3}, + {0: 8, 1: 10, 2: 2}, + {0: 3, 1: 2, 2: 5}, + }, + } +} + +func buildHotReport() analyze.Report { + return analyze.Report{ + "Nodes": []NodeSummary{ + {Type: "Function", Name: "hotFunc", File: "hot.go"}, + {Type: "Function", Name: "coldFunc", File: "cold.go"}, + }, + "Counters": []map[int]int{ + {0: HotspotThresholdHigh + 5, 1: 12}, + {0: 12, 1: 3}, + }, + } +} diff --git a/site/analyzers/shotness.md b/site/analyzers/shotness.md index 2ce719b..ab55f41 100644 --- a/site/analyzers/shotness.md +++ b/site/analyzers/shotness.md @@ -34,6 +34,26 @@ For each code entity matched by the DSL query (functions by default), the analyz When two code entities are modified in the same commit, their coupling counter is incremented. 
This produces a fine-grained coupling matrix at the function level, which is more precise than file-level coupling from the couples analyzer. +### Coupling Strength + +Coupling strength is normalized to a 0-1 scale using the formula: + +``` +strength(A, B) = co_changes(A, B) / max(co_changes(A, B), changes(A), changes(B)) +``` + +This ensures the result is always in [0, 1] and provides a meaningful confidence metric. A strength of 1.0 means functions always change together; 0.5 means they co-change half the time relative to the most active function. + +### Risk Classification + +Nodes are classified into risk levels based on absolute change counts: + +| Risk Level | Threshold | Meaning | +|---|---|---| +| **HIGH** | ≥ 20 changes | Requires immediate attention and robust test coverage | +| **MEDIUM** | ≥ 10 changes | Should be monitored and potentially refactored | +| **LOW** | < 10 changes | Normal change frequency | + ### How It Works For each commit: @@ -58,6 +78,146 @@ The `nodes` map remains in the analyzer as working state because `handleDeletion --- +## Output Formats + +The shotness analyzer supports four output formats: JSON, YAML, text, and plot. + +=== "Text" + + ```bash + codefang run -a history/shotness -f text . + ``` + + Terminal output with color-coded sections: + + ``` + ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ + ┃ Shotness Analysis 42 nodes ┃ + ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ + + Summary + ────────────────────────────────────────────────────────── + Total Nodes 42 + Total Changes 385 + Avg Changes/Node 9.2 + Total Couplings 156 + Avg Coupling Strength 34% + Hot Nodes 8 + + Hottest Functions + ────────────────────────────────────────────────────────── + processPayment (engine [████████████████████░] 1.0 (42 changes) + validateInput (engine. 
[████████████████░░░░░] 0.8 (34 changes) + + Risk Assessment + ────────────────────────────────────────────────────────── + processPayment (engine HIGH (42 changes) + validateInput (engine. HIGH (34 changes) + + Strongest Couplings + ────────────────────────────────────────────────────────── + processPayment ↔ validateInput 85% (12 co-changes) + handleRequest ↔ parseBody 72% (8 co-changes) + ``` + +=== "JSON" + + ```bash + codefang run -a history/shotness -f json . + ``` + + ```json + { + "node_hotness": [ + { + "name": "processFile", + "type": "Function", + "file": "pkg/core/engine.go", + "change_count": 42, + "coupled_nodes": 3, + "hotness_score": 1.0 + } + ], + "node_coupling": [ + { + "node1_name": "processFile", + "node1_file": "pkg/core/engine.go", + "node2_name": "validate", + "node2_file": "pkg/core/engine.go", + "co_changes": 15, + "coupling_strength": 0.36 + } + ], + "hotspot_nodes": [ + { + "name": "processFile", + "type": "Function", + "file": "pkg/core/engine.go", + "change_count": 42, + "risk_level": "HIGH" + } + ], + "aggregate": { + "total_nodes": 3, + "total_changes": 105, + "total_couplings": 3, + "avg_changes_per_node": 35.0, + "avg_coupling_strength": 0.42, + "hot_nodes": 2 + } + } + ``` + +=== "YAML" + + ```bash + codefang run -a history/shotness -f yaml . + ``` + + ```yaml + node_hotness: + - name: processFile + type: Function + file: pkg/core/engine.go + change_count: 42 + coupled_nodes: 3 + hotness_score: 1.0 + node_coupling: + - node1_name: processFile + node1_file: pkg/core/engine.go + node2_name: validate + node2_file: pkg/core/engine.go + co_changes: 15 + coupling_strength: 0.36 + hotspot_nodes: + - name: processFile + type: Function + file: pkg/core/engine.go + change_count: 42 + risk_level: HIGH + aggregate: + total_nodes: 3 + total_changes: 105 + total_couplings: 3 + avg_changes_per_node: 35.0 + avg_coupling_strength: 0.42 + hot_nodes: 2 + ``` + +=== "Plot" + + ```bash + codefang run -a history/shotness -f plot -o shotness.html . 
+ ``` + + Generates an interactive HTML dashboard with three visualizations: + + 1. **Code Hotness TreeMap**: Hierarchical file → function view sized by change frequency + 2. **Function Coupling Matrix**: Heatmap showing co-change frequency between functions + 3. **Top Hot Functions**: Bar chart comparing self-changes vs coupled changes + +--- + ## Configuration Options | Option | Type | Default | Description | @@ -98,55 +258,38 @@ history: --- -## Example Output +## Metrics Reference -=== "JSON" +### Node Hotness - ```json - { - "nodes": [ - { - "type": "Function", - "name": "processFile", - "file": "pkg/core/engine.go" - }, - { - "type": "Function", - "name": "validate", - "file": "pkg/core/engine.go" - }, - { - "type": "Function", - "name": "handleRequest", - "file": "pkg/api/handler.go" - } - ], - "counters": [ - {"0": 42, "1": 15, "2": 8}, - {"0": 15, "1": 28, "2": 3}, - {"0": 8, "1": 3, "2": 35} - ] - } - ``` +| Field | Type | Description | +|---|---|---| +| `name` | string | Function/method name | +| `type` | string | UAST node type (e.g., "Function") | +| `file` | string | Source file path | +| `change_count` | int | Number of commits that modified this node | +| `coupled_nodes` | int | Number of other nodes that co-changed with this node | +| `hotness_score` | float | Normalized score [0, 1] relative to the hottest node | - The `counters` array is a sparse co-change matrix. `counters[i][i]` is the total change count for node `i`. `counters[i][j]` (where `i != j`) is the co-change count between nodes `i` and `j`. 
+### Node Coupling -=== "YAML" +| Field | Type | Description | +|---|---|---| +| `node1_name` / `node2_name` | string | Names of the coupled nodes | +| `node1_file` / `node2_file` | string | File paths of the coupled nodes | +| `co_changes` | int | Number of commits where both nodes changed | +| `coupling_strength` | float | Normalized strength [0, 1] | - ```yaml - nodes: - - type: Function - name: processFile - file: pkg/core/engine.go - - type: Function - name: validate - file: pkg/core/engine.go - counters: - - 0: 42 - 1: 15 - - 0: 15 - 1: 28 - ``` +### Aggregate + +| Field | Type | Description | +|---|---|---| +| `total_nodes` | int | Total tracked nodes | +| `total_changes` | int | Sum of all node change counts | +| `total_couplings` | int | Number of unique coupling pairs | +| `avg_changes_per_node` | float | Mean changes per node | +| `avg_coupling_strength` | float | Mean coupling strength across all pairs | +| `hot_nodes` | int | Nodes with change count ≥ 10 (MEDIUM or HIGH risk) | --- @@ -160,6 +303,25 @@ history: --- +## Interpreting Results + +### Reading the Coupling Strength + +| Strength | Interpretation | +|---|---| +| 0.8 - 1.0 | Very tight coupling. Functions almost always change together. Consider merging or extracting shared logic. | +| 0.5 - 0.8 | Moderate coupling. There is a significant shared dependency. Review if coupling is intentional. | +| 0.2 - 0.5 | Loose coupling. Occasional co-changes, likely due to shared APIs or data structures. | +| < 0.2 | Minimal coupling. Co-changes are incidental. | + +### Actionable Insights + +1. **High hotness + High coupling**: Core function that drives many changes. Candidate for splitting or stabilizing the interface. +2. **High hotness + Low coupling**: Isolated function that changes often, typically for bug fixes. Needs better tests and potentially a redesign. +3. **Low hotness + High coupling**: Rarely changed function that, when it does change, changes together with others. Check if coupling is necessary or indicates a design smell. 
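The coupling strength formula, the risk thresholds, and the strength bands documented above can be sketched in a few lines of Go. This is a minimal illustration, not the analyzer's actual code: the function and constant names (`CouplingStrength`, `RiskLevel`, `StrengthBand`, `riskHighMin`, `riskMediumMin`) are hypothetical, and treating each band's lower bound as inclusive is an assumption, since the table does not say which side a boundary value falls on.

```go
package main

import "fmt"

// Thresholds mirror the documented risk table. The constant names are
// hypothetical and are not taken from the analyzer's source.
const (
	riskHighMin   = 20
	riskMediumMin = 10
)

// CouplingStrength applies the documented normalization:
// co_changes / max(co_changes, changes_a, changes_b).
// It returns 0 when every count is zero to avoid dividing by zero.
func CouplingStrength(coChanges, changesA, changesB int) float64 {
	denom := coChanges
	if changesA > denom {
		denom = changesA
	}
	if changesB > denom {
		denom = changesB
	}
	if denom <= 0 {
		return 0
	}
	return float64(coChanges) / float64(denom)
}

// RiskLevel maps an absolute change count to the documented risk levels.
func RiskLevel(changes int) string {
	switch {
	case changes >= riskHighMin:
		return "HIGH"
	case changes >= riskMediumMin:
		return "MEDIUM"
	default:
		return "LOW"
	}
}

// StrengthBand maps a normalized strength to the interpretation bands.
// Assumption: the lower bound of each band is inclusive.
func StrengthBand(strength float64) string {
	switch {
	case strength >= 0.8:
		return "very tight"
	case strength >= 0.5:
		return "moderate"
	case strength >= 0.2:
		return "loose"
	default:
		return "minimal"
	}
}

func main() {
	// 15 co-changes and 42 changes match the JSON example;
	// 28 changes for the second node is a made-up value.
	s := CouplingStrength(15, 42, 28)
	fmt.Printf("strength=%.2f band=%s risk=%s\n", s, StrengthBand(s), RiskLevel(42))
}
```

With 15 co-changes against 42 changes on the busier node, the ratio is 15/42, which rounds to the `coupling_strength` of 0.36 shown in the JSON example. Because the denominator is the largest of the three counts, a pair only reaches 1.0 when neither function ever changes without the other.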
+ +--- + ## Limitations - **UAST required**: Only languages with UAST parser support are analyzed. Files in unsupported languages are skipped entirely.