⚡️ Speed up function find_last_node by 17,056%
#188
📄 17,056% (170.56x) speedup for `find_last_node` in `src/algorithms/graph.py`
⏱️ Runtime: 81.8 milliseconds → 477 microseconds (best of 114 runs)
📝 Explanation and details
The optimization transforms an O(N*M) algorithm into an O(N+M) algorithm by replacing repeated linear searches with a single set-based lookup.
Key Changes:
- Build a set `{e["source"] for e in edges}` containing all edge source IDs (O(M) time).
- Change from checking `all(e["source"] != n["id"] for e in edges)` for each node to a simple `n["id"] not in edge_sources` check (O(1) per node vs O(M) per node).
- Return the first matching node with `next()`.

Why It's Faster:
The original code had O(N*M) complexity: for each of the N nodes, it scanned the M edges to check whether the node appears as a source, which takes up to N*M comparisons. The optimized version builds the set of edge sources once (M operations) and then performs N lookups that each take constant time on average, for N+M operations in total.
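The change described above can be sketched as follows. This is a reconstruction from the expressions quoted in the explanation, not the actual diff: the function names `find_last_node_original` and `find_last_node_optimized`, the `None` default passed to `next()`, and the sample graph are all assumptions made for illustration.

```python
def find_last_node_original(nodes, edges):
    # O(N*M): for every node, scan the edges looking for it as a source.
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,  # assumed default when every node has an outgoing edge
    )


def find_last_node_optimized(nodes, edges):
    # O(M): collect every edge source ID once.
    edge_sources = {e["source"] for e in edges}
    # O(N): one average constant-time membership test per node.
    return next((n for n in nodes if n["id"] not in edge_sources), None)


# Hypothetical sample graph: a -> b -> c, so "c" is the only sink.
nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}]

assert find_last_node_original(nodes, edges) == {"id": "c"}
assert find_last_node_optimized(nodes, edges) == {"id": "c"}
```

One difference to keep in mind: the set-based version needs the source IDs to be hashable, whereas the scan only needs them to support equality comparison.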
Performance Impact:
The 170x speedup (from 81.8 ms to 477 µs) demonstrates the dramatic improvement, especially evident in the large-scale test cases. The optimization excels when the graph has many nodes and many edges, because the work saved grows with N*M.
This optimization is particularly valuable for graph analysis workloads where finding sink nodes (nodes with no outgoing edges) is a common operation in larger datasets.
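To get a feel for how the gap grows with graph size, a rough timing sketch along these lines could be used. It is not the benchmark behind the figures above: the helper `make_chain`, the function names `scan_version` and `set_version`, and the chain-shaped test graph are assumptions, and the printed timings will vary by machine.

```python
import timeit


def scan_version(nodes, edges):
    # Per-node scan over the edges, as quoted in the explanation.
    return next((n for n in nodes if all(e["source"] != n["id"] for e in edges)), None)


def set_version(nodes, edges):
    # Build the source set once, then do one membership test per node.
    edge_sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in edge_sources), None)


def make_chain(size):
    # Hypothetical test graph: a chain 0 -> 1 -> ... -> size-1,
    # where only the final node has no outgoing edge.
    nodes = [{"id": i} for i in range(size)]
    edges = [{"source": i, "target": i + 1} for i in range(size - 1)]
    return nodes, edges


nodes, edges = make_chain(1000)
# Both versions must agree before their timings are worth comparing.
assert scan_version(nodes, edges) == set_version(nodes, edges) == {"id": 999}

scan_time = min(timeit.repeat(lambda: scan_version(nodes, edges), number=1, repeat=5))
set_time = min(timeit.repeat(lambda: set_version(nodes, edges), number=1, repeat=5))
print(f"scan: {scan_time:.6f}s  set: {set_time:.6f}s")
```

Taking the minimum over several repeats mirrors the "best of" reporting style used above and reduces noise from other processes.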
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-find_last_node-mjarp9cm` and push.