Adding script to calculate metrics for no. of verifying functions in a C file#131
Conversation
@CodeRabbit review

✅ Actions performed: Review triggered.
🚥 Pre-merge checks: 2 passed, 1 failed (1 warning).
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@eval/__init__.py`:
- Around line 1-3: The import style in eval/__init__.py currently uses an
aliased module import (eval.get_verification_summary) which triggers Ruff
PLR0402; either import the module as a top-level symbol with "from eval import
get_verification_summary" to match Ruff's preferred style, or if you intended to
expose only the entry function, import the function explicitly with "from
eval.get_verification_summary import main as get_verification_summary"; if you
want to keep the current module import, suppress the lint with a "# noqa:
PLR0402" on the import line.
In `@eval/get_verification_summary.py`:
- Around line 96-100: The JSON dump fails because asdict(vsummary) includes
CacheLookupError exception instances (via lookup_errors) which are not JSON
serializable; update the serialization step to convert exception objects into
JSON-safe representations (e.g., map each exception in the lookup_errors list to
a string or a dict with type/message/args) before writing. Locate where vsummary
is converted to a dict (verification_summary_for_file[function.name] =
asdict(vsummary)) and replace that assignment with logic that calls
asdict(vsummary), then transforms any entries under "lookup_errors" (or similar)
by iterating and replacing Exception/CacheLookupError instances with their str()
or a small dict { "type": e.__class__.__name__, "message": str(e) } so
json.dump() succeeds.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: eef242c3-6bd3-46cb-a035-354bcbd80c20
📒 Files selected for processing (3)
- eval/__init__.py
- eval/get_verification_summary.py
- run.sh
eval/__init__.py
Outdated
```python
import eval.get_verification_summary as get_verification_summary

__all__ = ["get_verification_summary"]
```
🧹 Nitpick | 🔵 Trivial
Consider Ruff's import style suggestion (PLR0402).
Ruff suggests using from eval import get_verification_summary instead of the alias form. However, since this imports the module (not a specific function), the current form works correctly to expose the module.
If the intent is to expose only the main function rather than the entire module, consider:

```python
from eval.get_verification_summary import main as get_verification_summary
```

Otherwise, if exposing the module is intentional, the current approach is acceptable; you can suppress the warning with a `# noqa: PLR0402` comment if desired.
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 1-1: Use from eval import get_verification_summary in lieu of alias
Replace with from eval import get_verification_summary
(PLR0402)
```python
vsummary = _get_verification_summary(function, lookup_result)
verification_summary_for_file[function.name] = asdict(vsummary)

with _get_result_json_name(args.file).open(mode="w") as f:
    json.dump(verification_summary_for_file, f, indent=4)
```
CacheLookupError exceptions are not JSON serializable.
asdict(vsummary) will include CacheLookupError instances in the lookup_errors list. Since CacheLookupError extends Exception, it won't be serialized by json.dump(), causing a TypeError.
Consider converting exceptions to a serializable format:
Proposed fix
```diff
+def _serialize_summary(vsummary: VerificationSummary) -> dict:
+    """Convert VerificationSummary to a JSON-serializable dict."""
+    result = asdict(vsummary)
+    result["lookup_errors"] = [str(e) for e in vsummary.lookup_errors]
+    return result
+
+
 def main() -> None:
     ...
     for function, lookup_result in function_to_lookup_results.items():
         vsummary = _get_verification_summary(function, lookup_result)
-        verification_summary_for_file[function.name] = asdict(vsummary)
+        verification_summary_for_file[function.name] = _serialize_summary(vsummary)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def _serialize_summary(vsummary: VerificationSummary) -> dict:
    """Convert VerificationSummary to a JSON-serializable dict."""
    result = asdict(vsummary)
    result["lookup_errors"] = [str(e) for e in vsummary.lookup_errors]
    return result


vsummary = _get_verification_summary(function, lookup_result)
verification_summary_for_file[function.name] = _serialize_summary(vsummary)

with _get_result_json_name(args.file).open(mode="w") as f:
    json.dump(verification_summary_for_file, f, indent=4)
```
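To illustrate the failure mode: `json` has no encoder for `Exception` subclasses, so any error object left in the dict raises `TypeError`, while the stringified form serializes cleanly. A self-contained sketch (the `CacheLookupError` here is a stand-in, not the repo's class):

```python
import json

class CacheLookupError(Exception):
    """Stand-in for the repo's CacheLookupError."""

summary = {"lookup_errors": [CacheLookupError("dangling cache file")]}

# Raw exception objects are not JSON-serializable.
try:
    json.dumps(summary)
    raised = False
except TypeError:
    raised = True

# Converting each error to a string, as the proposed fix does, succeeds.
safe = {"lookup_errors": [str(e) for e in summary["lookup_errors"]]}
encoded = json.dumps(safe)
```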
Sample output:

{
"file_name": "data/qsort.c",
"functions": [
{
"function_name": "swap",
"verifying_specs": [
{
"spec": {
"preconditions": [
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))"
],
"postconditions": [
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))"
],
"postconditions": [
"__CPROVER_assigns(*a, *b)",
"__CPROVER_ensures(*a == __CPROVER_old(*b) && *b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_assigns(*a, *b)",
"error": "Cannot compute complexity for unexpected clause '__CPROVER_assigns(*a, *b)'"
},
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b) && *b == __CPROVER_old(*a))",
"num_atoms": 2,
"num_unique_atoms": 2,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))"
],
"postconditions": [
"__CPROVER_assigns(*a, *b)",
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_assigns(*a, *b)",
"error": "Cannot compute complexity for unexpected clause '__CPROVER_assigns(*a, *b)'"
},
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))"
],
"postconditions": [
"__CPROVER_assigns(*a, *b)",
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_assigns(*a, *b)",
"error": "Cannot compute complexity for unexpected clause '__CPROVER_assigns(*a, *b)'"
},
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
}
],
"failing_specs": [
{
"spec": {
"preconditions": [
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"__CPROVER_requires(__CPROVER_pointer_equals(a, b) == false)"
],
"postconditions": [
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_pointer_equals(a, b) == false)",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))"
],
"postconditions": [
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"__CPROVER_requires(a != NULL && b != NULL)"
],
"postconditions": [
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(a != NULL && b != NULL)",
"num_atoms": 2,
"num_unique_atoms": 2,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"__CPROVER_requires(a != NULL && b != NULL)"
],
"postconditions": [
"__CPROVER_assigns(*a, *b)",
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(a != NULL && b != NULL)",
"num_atoms": 2,
"num_unique_atoms": 2,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_assigns(*a, *b)",
"error": "Cannot compute complexity for unexpected clause '__CPROVER_assigns(*a, *b)'"
},
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
}
],
"lookup_errors": []
}
]
}
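As a hedged sketch of how a consumer might aggregate this output (field names taken from the sample above; the snippet is illustrative, not part of the script):

```python
# A trimmed stand-in for the JSON structure shown above.
summary = {
    "file_name": "data/qsort.c",
    "functions": [
        {
            "function_name": "swap",
            "verifying_specs": [{"spec": {}}, {"spec": {}}],
            "failing_specs": [{"spec": {}}],
            "lookup_errors": [],
        }
    ],
}

# Per-function (verifying, failing) spec counts.
counts = {
    fn["function_name"]: (len(fn["verifying_specs"]), len(fn["failing_specs"]))
    for fn in summary["functions"]
}
```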
Actionable comments posted: 7
♻️ Duplicate comments (1)
eval/get_verification_summary.py (1)
Lines 113-117: ⚠️ Potential issue | 🟠 Major

CacheLookupError instances are not JSON-serializable.

asdict(vsummary) includes CacheLookupError instances in lookup_errors. Since CacheLookupError extends Exception, json.dump() will raise TypeError: Object of type CacheLookupError is not JSON serializable.

Proposed fix: convert errors to strings before serialization

```diff
+def _serialize_summary(vsummary: VerificationSummary) -> dict:
+    """Convert VerificationSummary to a JSON-serializable dict."""
+    result = asdict(vsummary)
+    result["lookup_errors"] = [str(e) for e in vsummary.lookup_errors]
+    return result
+
+
 def main() -> None:
     ...
     for function, lookup_result in function_to_lookup_results.items():
         vsummary = _get_verification_summary(function, lookup_result)
-        verification_summary_for_file["functions"].append(asdict(vsummary))
+        verification_summary_for_file["functions"].append(_serialize_summary(vsummary))
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@eval/get_verification_summary.py` around lines 113-117: The dict produced by asdict(vsummary) includes lookup_errors containing CacheLookupError (an Exception) which json.dump cannot serialize; update _get_verification_summary, or immediately after vsummary is created (before appending to verification_summary_for_file["functions"]), to convert any entries in vsummary.lookup_errors (or the asdict result's "lookup_errors") to their string form (e.g., map str() over each error) so that verification_summary_for_file only contains JSON-serializable primitives when calling json.dump.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@eval/get_verification_summary.py`:
- Around line 166-171: The current logic builds verifying_set and then filters
failing_specs with "spec not in verifying_set", which omits specs that both
verify and fail across different vresults; update the VerificationSummary
documentation (docstring of the VerificationSummary class or the module) to
clearly state that a spec present as verifying in any cache entry is treated as
verified and therefore excluded from failing_specs, mentioning the relevant
symbols verifying_set, failing_specs, vresults and SpecWithComplexity so readers
understand the behavior and its implications.
- Around line 131-137: The catch-all around the cache lookup (the loop using
cache.iterkeys(), the access cache[vinput], and vresult.get_function()) should
be narrowed: only catch expected lookup-related exceptions (e.g., KeyError,
FileNotFoundError, AttributeError, TypeError) and wrap those in
CacheLookupError, and ensure the original exception is preserved/logged (include
the original exception message when constructing CacheLookupError or call the
logger). Update the try/except to list the specific exception types instead of
Exception and include the original exception details when appending
CacheLookupError(vinput, original_exception) or logging it.
In `@eval/spec_complexity_util.py`:
- Line 78: Replace the O(n²) comprehension used to dedupe while preserving order
(unique_atoms = [atom for i, atom in enumerate(atoms) if atom not in atoms[:i]])
with an O(n) approach: use dict.fromkeys (or collections.OrderedDict.fromkeys
for older Python versions) to remove duplicates while preserving order, e.g.,
rebuild unique_atoms from the keys of that dict; update the variable
unique_atoms in spec_complexity_util.py accordingly.
- Around line 107-108: The pattern match for Quantifier uses a misleading
variable name `body_expr`; update the match to use the actual field name `expr`
(e.g., change `case Quantifier(_, _, body_expr, _):` to `case Quantifier(_, _,
expr, _):` or use a named pattern `case Quantifier(expr=expr):`) and then call
`get_atoms_in_expression(expr)` so the variable name matches the dataclass field
`expr` and avoids confusion in the `get_atoms_in_expression` handling.
- Line 76: The pattern match in the case arm using Assigns is using the wrong
field name—change the pattern from Assigns(conditions=e, targets=_) to use the
actual dataclass field name Assigns(condition=e, targets=_) so Assigns clauses
are correctly matched; update the case in spec_complexity_util.py where the
match handles RequiresClause/EnsuresClause/Assigns to reference
Assigns(condition=...) instead of conditions.
- Around line 102-111: Add a clarifying docstring to get_atoms_in_expression
that states an "atom" is an atomic proposition without top-level logical
operators: explain the function recursively traverses AndOp, OrOp, NotOp and
Quantifier to collect atoms but treats comparison/arithmetic nodes (e.g., EqOp,
LtOp, AddOp and similar) as indivisible atoms and does not descend into them;
include parameter and return descriptions as in the suggested example to make
intended behavior explicit.
In `@test/eval/test_spec_complexity_util.py`:
- Around line 24-30: Replace uses of "assert False, (...)" in the pattern-match
failure branches with pytest.fail(...) to avoid issues when Python is run with
-O; specifically update the failure in the match for ClauseComplexity (the block
that currently does "case _: assert False, f\"...\"") to call pytest.fail with
the same formatted message, and make the same change in the
test_get_complexity_tautology failure branch; also ensure pytest is imported in
test_spec_complexity_util.py if it isn’t already.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7d47d219-812e-480c-814c-0d6a504d3ac8
📒 Files selected for processing (5)
- eval/__init__.py
- eval/get_verification_summary.py
- eval/spec_complexity_util.py
- test/eval/__init__.py
- test/eval/test_spec_complexity_util.py
```python
for vinput in cache.iterkeys():
    try:
        if (vresult := cache[vinput]) and vresult.get_function() == function:
            results.append(vresult)
    except Exception as e:
        results.append(CacheLookupError(e))
        continue
```
Broad Exception catch may obscure specific errors.
Catching all exceptions makes debugging harder when unexpected errors occur (e.g., KeyError, AttributeError, TypeError). The comment at line 27-29 mentions "dangling pointer to a file," suggesting FileNotFoundError or similar is expected.
Consider narrowing to specific exception types that represent cache lookup failures, or at minimum log the original exception for debugging.
Proposed improvement

```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 def _get_lookup_result(cache: Cache, function: CFunction) -> CacheLookupResult:
     ...
     for vinput in cache.iterkeys():
         try:
             if (vresult := cache[vinput]) and vresult.get_function() == function:
                 results.append(vresult)
-        except Exception as e:
+        except (KeyError, FileNotFoundError, OSError) as e:
+            logger.debug("Cache lookup error for %s: %s", vinput, e)
             results.append(CacheLookupError(e))
             continue
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
for vinput in cache.iterkeys():
    try:
        if (vresult := cache[vinput]) and vresult.get_function() == function:
            results.append(vresult)
    except (KeyError, FileNotFoundError, OSError) as e:
        logger.debug("Cache lookup error for %s: %s", vinput, e)
        results.append(CacheLookupError(e))
        continue
```
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 135-135: Do not catch blind exception: Exception
(BLE001)
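A minimal sketch of the narrowed handler, using a plain dict as a stand-in cache and a stand-in `CacheLookupError` (the repo's `Cache` type is assumed, not shown). Unexpected exceptions now propagate instead of being swallowed, and the original error survives in the wrapper's args:

```python
import logging

logger = logging.getLogger(__name__)

class CacheLookupError(Exception):
    """Stand-in for the repo's CacheLookupError."""

def lookup(cache, key):
    # Only expected lookup failures are wrapped; anything else propagates.
    try:
        return cache[key]
    except (KeyError, FileNotFoundError, OSError) as e:
        logger.debug("Cache lookup error for %r: %r", key, e)
        return CacheLookupError(e)

hit = lookup({"a": 1}, "a")
miss = lookup({"a": 1}, "b")
```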
```python
verifying_set = set(verifying_raw)
failing_specs = [
    SpecWithComplexity(spec, *_get_complexity_for_clauses(spec))
    for vresult in vresults
    if (spec := vresult.get_spec()) and spec not in verifying_set
]
```
🧹 Nitpick | 🔵 Trivial
Specs that both verify and fail in different cache entries are excluded from failing_specs.
If the same spec verifies in one cache entry but fails in another (e.g., different invocations), it will only appear in verifying_specs due to the spec not in verifying_set filter. This might be intentional (verified = success), but could lead to confusion if users expect all failure instances to be reported.
Consider documenting this behavior in the docstring or VerificationSummary class.
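The set-difference behavior described above can be illustrated with toy data: a spec that fails in one cache entry but verifies in another never reaches `failing_specs`.

```python
# Toy stand-ins: each tuple is (spec, verified?).
vresults = [("spec_A", True), ("spec_B", False), ("spec_A", False)]

verifying_set = {spec for spec, ok in vresults if ok}
failing_specs = [spec for spec, ok in vresults
                 if not ok and spec not in verifying_set]

# spec_A failed in one entry but verified in another, so it is
# excluded from failing_specs entirely.
```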
```python
return ClauseComplexityError(clause=clause, error=str(e))

match ast:
    case RequiresClause(_, e) | EnsuresClause(_, e) | Assigns(conditions=e, targets=_):
```
Incorrect field name in Assigns pattern match causes failure.
The Assigns dataclass has a field named condition (singular), but the pattern uses conditions=e (plural). This mismatch will cause the pattern to never match for assigns clauses, resulting in them incorrectly falling through to the error case.
Per the context snippet from translation/ast/cbmc_ast.py:39-67, the field is condition: Any | None.
Proposed fix
```diff
-    case RequiresClause(_, e) | EnsuresClause(_, e) | Assigns(conditions=e, targets=_):
+    case RequiresClause(_, e) | EnsuresClause(_, e) | Assigns(condition=e, targets=_):
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
case RequiresClause(_, e) | EnsuresClause(_, e) | Assigns(condition=e, targets=_):
```
```python
match ast:
    case RequiresClause(_, e) | EnsuresClause(_, e) | Assigns(conditions=e, targets=_):
        atoms = get_atoms_in_expression(e)
        unique_atoms = [atom for i, atom in enumerate(atoms) if atom not in atoms[:i]]
```
🧹 Nitpick | 🔵 Trivial
O(n²) unique computation can be simplified.
The current approach iterates atoms and checks membership in slices, resulting in O(n²) complexity.
Use dict.fromkeys for O(n) uniqueness while preserving order
```diff
-    unique_atoms = [atom for i, atom in enumerate(atoms) if atom not in atoms[:i]]
+    unique_atoms = list(dict.fromkeys(atoms))
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
unique_atoms = list(dict.fromkeys(atoms))
```
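`dict.fromkeys` works here because dict keys are unique and, since Python 3.7, preserve insertion order; both forms below produce the same result, but the second is a single pass:

```python
atoms = ["a == b", "x < y", "a == b", "x < y", "z"]

# O(n^2): membership test against a growing slice
quadratic = [atom for i, atom in enumerate(atoms) if atom not in atoms[:i]]

# O(n): dict keys are unique and ordered by first insertion
linear = list(dict.fromkeys(atoms))
```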
```python
match expr:
    case AndOp(left=left, right=right) | OrOp(left=left, right=right):
        result = [*result, *get_atoms_in_expression(left), *get_atoms_in_expression(right)]
    case NotOp(e):
        result = [*result, *get_atoms_in_expression(e)]
    case Quantifier(_, _, body_expr, _):
        result = [*result, *get_atoms_in_expression(body_expr)]
    case e:
        result.append(e)
return result
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Check if there are test cases that verify atoms inside comparison operators are not traversed
rg -n "get_atoms_in_expression" test/
```

Repository: vikramnitin9/rust_verification

Length of output: 816

🏁 Script executed:

```shell
cat -n test/eval/test_spec_complexity_util.py
```

Repository: vikramnitin9/rust_verification

Length of output: 4021

🏁 Script executed:

```shell
# Get the complete implementation of get_atoms_in_expression
rg -A 20 "def get_atoms_in_expression" eval/spec_complexity_util.py
```

Repository: vikramnitin9/rust_verification

Length of output: 835

🏁 Script executed:

```shell
# Check for any docstring in the function
rg -B 5 "def get_atoms_in_expression" eval/spec_complexity_util.py
```

Repository: vikramnitin9/rust_verification

Length of output: 266
Add a clarifying docstring to explain what constitutes an "atom" in this function.
The current docstring does not specify that atoms are propositions without top-level logical operators. The function recursively traverses AndOp, OrOp, NotOp, and Quantifier, but treats comparison and arithmetic operators (EqOp, LtOp, AddOp, etc.) as indivisible atoms. This is the intended behavior per the test cases (e.g., test_count_atoms_in_clause_eq expects EqOp(LtOp(...), AddOp(...)) to return 1 atom), but the docstring should make this explicit:
```python
def get_atoms_in_expression(expr: CBMCAst) -> list[CBMCAst]:
    """Return the atomic propositions in the expression.

    Recursively traverses logical operators (And, Or, Not) and quantifiers to find all
    atomic propositions. Comparison and arithmetic operators are treated as indivisible
    atoms and are not traversed further.

    Args:
        expr (CBMCAst): The expression from which to obtain atoms.

    Returns:
        list[CBMCAst]: The atomic propositions in the given expression.
    """
```
```python
        case Quantifier(_, _, body_expr, _):
            result = [*result, *get_atoms_in_expression(body_expr)]
```
🧹 Nitpick | 🔵 Trivial
Misleading variable name in Quantifier pattern (works today, but fragile).

The pattern uses body_expr as the third positional parameter, but the Quantifier dataclass defines the field as expr (per translation/ast/cbmc_ast.py:360-366). Since structural pattern matching uses positional arguments here, and the actual field order is (decl, range_expr, expr, kind), the third position captures expr correctly into body_expr.

However, this is fragile and misleading. The variable name body_expr doesn't match the field name expr, which could cause confusion and maintenance issues, and any future reordering of the dataclass fields would silently change which field is captured.

Suggested improvement for clarity:

```diff
-        case Quantifier(_, _, body_expr, _):
-            result = [*result, *get_atoms_in_expression(body_expr)]
+        case Quantifier(decl=_, range_expr=_, expr=body_expr, kind=_):
+            result = [*result, *get_atoms_in_expression(body_expr)]
```

Or use named parameters consistently:

```diff
-        case Quantifier(_, _, body_expr, _):
-            result = [*result, *get_atoms_in_expression(body_expr)]
+        case Quantifier(expr=expr):
+            result = [*result, *get_atoms_in_expression(expr)]
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
        case Quantifier(decl=_, range_expr=_, expr=body_expr, kind=_):
            result = [*result, *get_atoms_in_expression(body_expr)]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@eval/spec_complexity_util.py` around lines 107 - 108, The pattern match for
Quantifier uses a misleading variable name `body_expr`; update the match to use
the actual field name `expr` (e.g., change `case Quantifier(_, _, body_expr,
_):` to `case Quantifier(_, _, expr, _):` or use a named pattern `case
Quantifier(expr=expr):`) and then call `get_atoms_in_expression(expr)` so the
variable name matches the dataclass field `expr` and avoids confusion in the
`get_atoms_in_expression` handling.
```python
    match complexity:
        case ClauseComplexity(num_atoms=3, is_tautology=False):
            pass
        case _:
            assert False, (
                f"'{clause}' should be reported to have 3 atoms and not be a tautology, but got {complexity}"
            )
```
🧹 Nitpick | 🔵 Trivial
Consider using pytest.fail() instead of assert False.

Static analysis (B011, PT015) flags assert False because it's removed when Python runs with -O (optimized mode). Using pytest.fail() ensures the failure always occurs.

Proposed fix:

```diff
+import pytest
+
 def test_get_complexity_simple() -> None:
     clause = "__CPROVER_requires(a < b || c && 1 + 2 == d)"
     complexity = get_complexity(clause)
     match complexity:
         case ClauseComplexity(num_atoms=3, is_tautology=False):
             pass
         case _:
-            assert False, (
+            pytest.fail(
                 f"'{clause}' should be reported to have 3 atoms and not be a tautology, but got {complexity}"
             )
```

Apply the same pattern to test_get_complexity_tautology at lines 38-41.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
    match complexity:
        case ClauseComplexity(num_atoms=3, is_tautology=False):
            pass
        case _:
            pytest.fail(
                f"'{clause}' should be reported to have 3 atoms and not be a tautology, but got {complexity}"
            )
```
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 28-30: Assertion always fails, replace with pytest.fail()
(PT015)
[warning] 28-28: Do not assert False (python -O removes these calls), raise AssertionError()
Replace assert False
(B011)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/eval/test_spec_complexity_util.py` around lines 24 - 30, Replace uses of
"assert False, (...)" in the pattern-match failure branches with
pytest.fail(...) to avoid issues when Python is run with -O; specifically update
the failure in the match for ClauseComplexity (the block that currently does
"case _: assert False, f\"...\"") to call pytest.fail with the same formatted
message, and make the same change in the test_get_complexity_tautology failure
branch; also ensure pytest is imported in test_spec_complexity_util.py if it
isn’t already.
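The -O behavior this comment relies on can be observed directly via the `optimize` parameter of the built-in `compile()` (a standalone sketch, not part of the PR; `pytest.fail()` instead raises its own Failed exception unconditionally, so it survives optimization):

```python
# Under `python -O` (optimize=1), assert statements are compiled away
# entirely, so a failing `assert False` branch silently does nothing.
src = (
    "executed = []\n"
    "assert False, 'boom'\n"
    "executed.append('after assert')\n"
)

# optimize=1 corresponds to -O: the assert is stripped, execution continues.
ns_optimized = {}
exec(compile(src, "<snippet>", "exec", optimize=1), ns_optimized)

# optimize=0 keeps asserts: AssertionError is raised as expected.
raised = False
try:
    exec(compile(src, "<snippet>", "exec", optimize=0), {})
except AssertionError:
    raised = True

print(ns_optimized["executed"], raised)  # ['after assert'] True
```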
Closes #130