Adding script to calculate metrics for no. of verifying functions in a C file#131
Conversation
@CodeRabbit review

✅ Actions performed: Review triggered.
🚥 Pre-merge checks: 2 passed, 1 failed (1 warning).
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@eval/__init__.py`:
- Around line 1-3: The import style in eval/__init__.py currently uses an
aliased module import (eval.get_verification_summary) which triggers Ruff
PLR0402; either import the module as a top-level symbol with "from eval import
get_verification_summary" to match Ruff's preferred style, or if you intended to
expose only the entry function, import the function explicitly with "from
eval.get_verification_summary import main as get_verification_summary"; if you
want to keep the current module import, suppress the lint with a "# noqa:
PLR0402" on the import line.
In `@eval/get_verification_summary.py`:
- Around line 96-100: The JSON dump fails because asdict(vsummary) includes
CacheLookupError exception instances (via lookup_errors) which are not JSON
serializable; update the serialization step to convert exception objects into
JSON-safe representations (e.g., map each exception in the lookup_errors list to
a string or a dict with type/message/args) before writing. Locate where vsummary
is converted to a dict (verification_summary_for_file[function.name] =
asdict(vsummary)) and replace that assignment with logic that calls
asdict(vsummary), then transforms any entries under "lookup_errors" (or similar)
by iterating and replacing Exception/CacheLookupError instances with their str()
or a small dict { "type": e.__class__.__name__, "message": str(e) } so
json.dump() succeeds.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: eef242c3-6bd3-46cb-a035-354bcbd80c20
📒 Files selected for processing (3)
- eval/__init__.py
- eval/get_verification_summary.py
- run.sh
eval/__init__.py
Outdated
```python
import eval.get_verification_summary as get_verification_summary

__all__ = ["get_verification_summary"]
```
🧹 Nitpick | 🔵 Trivial
Consider Ruff's import style suggestion (PLR0402).
Ruff suggests using from eval import get_verification_summary instead of the alias form. However, since this imports the module (not a specific function), the current form works correctly to expose the module.
If the intent is to expose only the main function rather than the entire module, consider:

```python
from eval.get_verification_summary import main as get_verification_summary
```

Otherwise, if exposing the module is intentional, the current approach is acceptable; you can suppress the warning with a `# noqa: PLR0402` comment if desired.
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 1-1: Use from eval import get_verification_summary in lieu of alias
Replace with from eval import get_verification_summary
(PLR0402)
```python
vsummary = _get_verification_summary(function, lookup_result)
verification_summary_for_file[function.name] = asdict(vsummary)

with _get_result_json_name(args.file).open(mode="w") as f:
    json.dump(verification_summary_for_file, f, indent=4)
```
CacheLookupError exceptions are not JSON serializable.
asdict(vsummary) will include CacheLookupError instances in the lookup_errors list. Since CacheLookupError extends Exception, it won't be serialized by json.dump(), causing a TypeError.
Consider converting exceptions to a serializable format:
Proposed fix
```diff
+def _serialize_summary(vsummary: VerificationSummary) -> dict:
+    """Convert VerificationSummary to a JSON-serializable dict."""
+    result = asdict(vsummary)
+    result["lookup_errors"] = [str(e) for e in vsummary.lookup_errors]
+    return result
+
+
 def main() -> None:
     ...
     for function, lookup_result in function_to_lookup_results.items():
         vsummary = _get_verification_summary(function, lookup_result)
-        verification_summary_for_file[function.name] = asdict(vsummary)
+        verification_summary_for_file[function.name] = _serialize_summary(vsummary)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def _serialize_summary(vsummary: VerificationSummary) -> dict:
    """Convert VerificationSummary to a JSON-serializable dict."""
    result = asdict(vsummary)
    result["lookup_errors"] = [str(e) for e in vsummary.lookup_errors]
    return result


vsummary = _get_verification_summary(function, lookup_result)
verification_summary_for_file[function.name] = _serialize_summary(vsummary)

with _get_result_json_name(args.file).open(mode="w") as f:
    json.dump(verification_summary_for_file, f, indent=4)
```
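To illustrate the failure mode: `json` has no encoder for `Exception` subclasses, so any error object left in the dict raises `TypeError`, while the stringified form serializes cleanly. A self-contained sketch (the `CacheLookupError` here is a stand-in, not the repo's class):

```python
import json

class CacheLookupError(Exception):
    """Stand-in for the repo's CacheLookupError."""

summary = {"lookup_errors": [CacheLookupError("dangling cache file")]}

# Raw exception objects are not JSON-serializable.
try:
    json.dumps(summary)
    raised = False
except TypeError:
    raised = True

# Converting each error to a string, as the proposed fix does, succeeds.
safe = {"lookup_errors": [str(e) for e in summary["lookup_errors"]]}
encoded = json.dumps(safe)
```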
Sample output:

{
"file_name": "data/qsort.c",
"functions": [
{
"function_name": "swap",
"verifying_specs": [
{
"spec": {
"preconditions": [
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))"
],
"postconditions": [
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))"
],
"postconditions": [
"__CPROVER_assigns(*a, *b)",
"__CPROVER_ensures(*a == __CPROVER_old(*b) && *b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_assigns(*a, *b)",
"error": "Cannot compute complexity for unexpected clause '__CPROVER_assigns(*a, *b)'"
},
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b) && *b == __CPROVER_old(*a))",
"num_atoms": 2,
"num_unique_atoms": 2,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))"
],
"postconditions": [
"__CPROVER_assigns(*a, *b)",
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_assigns(*a, *b)",
"error": "Cannot compute complexity for unexpected clause '__CPROVER_assigns(*a, *b)'"
},
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))"
],
"postconditions": [
"__CPROVER_assigns(*a, *b)",
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_assigns(*a, *b)",
"error": "Cannot compute complexity for unexpected clause '__CPROVER_assigns(*a, *b)'"
},
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
}
],
"failing_specs": [
{
"spec": {
"preconditions": [
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"__CPROVER_requires(__CPROVER_pointer_equals(a, b) == false)"
],
"postconditions": [
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_pointer_equals(a, b) == false)",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))"
],
"postconditions": [
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(a, sizeof(*a)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(__CPROVER_is_fresh(b, sizeof(*b)))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"__CPROVER_requires(a != NULL && b != NULL)"
],
"postconditions": [
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(a != NULL && b != NULL)",
"num_atoms": 2,
"num_unique_atoms": 2,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
},
{
"spec": {
"preconditions": [
"__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"__CPROVER_requires(a != NULL && b != NULL)"
],
"postconditions": [
"__CPROVER_assigns(*a, *b)",
"__CPROVER_ensures(*a == __CPROVER_old(*b))",
"__CPROVER_ensures(*b == __CPROVER_old(*a))"
]
},
"precondition_complexity": [
{
"clause": "__CPROVER_requires(!__CPROVER_pointer_equals(a, b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_requires(a != NULL && b != NULL)",
"num_atoms": 2,
"num_unique_atoms": 2,
"is_tautology": false
}
],
"postcondition_complexity": [
{
"clause": "__CPROVER_assigns(*a, *b)",
"error": "Cannot compute complexity for unexpected clause '__CPROVER_assigns(*a, *b)'"
},
{
"clause": "__CPROVER_ensures(*a == __CPROVER_old(*b))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
},
{
"clause": "__CPROVER_ensures(*b == __CPROVER_old(*a))",
"num_atoms": 1,
"num_unique_atoms": 1,
"is_tautology": false
}
]
}
],
"lookup_errors": []
}
]
}
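As a hedged sketch of how a consumer might aggregate this output (field names taken from the sample above; the snippet is illustrative, not part of the script):

```python
# A trimmed stand-in for the JSON structure shown above.
summary = {
    "file_name": "data/qsort.c",
    "functions": [
        {
            "function_name": "swap",
            "verifying_specs": [{"spec": {}}, {"spec": {}}],
            "failing_specs": [{"spec": {}}],
            "lookup_errors": [],
        }
    ],
}

# Per-function (verifying, failing) spec counts.
counts = {
    fn["function_name"]: (len(fn["verifying_specs"]), len(fn["failing_specs"]))
    for fn in summary["functions"]
}
```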
Actionable comments posted: 7
♻️ Duplicate comments (1)
eval/get_verification_summary.py (1)
Lines 113-117: ⚠️ Potential issue | 🟠 Major

CacheLookupError instances are not JSON-serializable.

asdict(vsummary) includes CacheLookupError instances in lookup_errors. Since CacheLookupError extends Exception, json.dump() will raise TypeError: Object of type CacheLookupError is not JSON serializable.

Proposed fix: convert errors to strings before serialization

```diff
+def _serialize_summary(vsummary: VerificationSummary) -> dict:
+    """Convert VerificationSummary to a JSON-serializable dict."""
+    result = asdict(vsummary)
+    result["lookup_errors"] = [str(e) for e in vsummary.lookup_errors]
+    return result
+
+
 def main() -> None:
     ...
     for function, lookup_result in function_to_lookup_results.items():
         vsummary = _get_verification_summary(function, lookup_result)
-        verification_summary_for_file["functions"].append(asdict(vsummary))
+        verification_summary_for_file["functions"].append(_serialize_summary(vsummary))
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@eval/get_verification_summary.py` around lines 113-117: The dict produced by asdict(vsummary) includes lookup_errors containing CacheLookupError (an Exception) which json.dump cannot serialize; update _get_verification_summary, or immediately after vsummary is created (before appending to verification_summary_for_file["functions"]), to convert any entries in vsummary.lookup_errors (or the asdict result's "lookup_errors") to their string form (e.g., map str() over each error) so that verification_summary_for_file only contains JSON-serializable primitives when calling json.dump.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@eval/get_verification_summary.py`:
- Around line 166-171: The current logic builds verifying_set and then filters
failing_specs with "spec not in verifying_set", which omits specs that both
verify and fail across different vresults; update the VerificationSummary
documentation (docstring of the VerificationSummary class or the module) to
clearly state that a spec present as verifying in any cache entry is treated as
verified and therefore excluded from failing_specs, mentioning the relevant
symbols verifying_set, failing_specs, vresults and SpecWithComplexity so readers
understand the behavior and its implications.
- Around line 131-137: The catch-all around the cache lookup (the loop using
cache.iterkeys(), the access cache[vinput], and vresult.get_function()) should
be narrowed: only catch expected lookup-related exceptions (e.g., KeyError,
FileNotFoundError, AttributeError, TypeError) and wrap those in
CacheLookupError, and ensure the original exception is preserved/logged (include
the original exception message when constructing CacheLookupError or call the
logger). Update the try/except to list the specific exception types instead of
Exception and include the original exception details when appending
CacheLookupError(vinput, original_exception) or logging it.
In `@eval/spec_complexity_util.py`:
- Line 78: Replace the O(n²) comprehension used to dedupe while preserving order
(unique_atoms = [atom for i, atom in enumerate(atoms) if atom not in atoms[:i]])
with an O(n) approach: use dict.fromkeys (or collections.OrderedDict.fromkeys
for older Python versions) to remove duplicates while preserving order, e.g.,
rebuild unique_atoms from the keys of that dict; update the variable
unique_atoms in spec_complexity_util.py accordingly.
- Around line 107-108: The pattern match for Quantifier uses a misleading
variable name `body_expr`; update the match to use the actual field name `expr`
(e.g., change `case Quantifier(_, _, body_expr, _):` to `case Quantifier(_, _,
expr, _):` or use a named pattern `case Quantifier(expr=expr):`) and then call
`get_atoms_in_expression(expr)` so the variable name matches the dataclass field
`expr` and avoids confusion in the `get_atoms_in_expression` handling.
- Line 76: The pattern match in the case arm using Assigns is using the wrong
field name—change the pattern from Assigns(conditions=e, targets=_) to use the
actual dataclass field name Assigns(condition=e, targets=_) so Assigns clauses
are correctly matched; update the case in spec_complexity_util.py where the
match handles RequiresClause/EnsuresClause/Assigns to reference
Assigns(condition=...) instead of conditions.
- Around line 102-111: Add a clarifying docstring to get_atoms_in_expression
that states an "atom" is an atomic proposition without top-level logical
operators: explain the function recursively traverses AndOp, OrOp, NotOp and
Quantifier to collect atoms but treats comparison/arithmetic nodes (e.g., EqOp,
LtOp, AddOp and similar) as indivisible atoms and does not descend into them;
include parameter and return descriptions as in the suggested example to make
intended behavior explicit.
In `@test/eval/test_spec_complexity_util.py`:
- Around line 24-30: Replace uses of "assert False, (...)" in the pattern-match
failure branches with pytest.fail(...) to avoid issues when Python is run with
-O; specifically update the failure in the match for ClauseComplexity (the block
that currently does "case _: assert False, f\"...\"") to call pytest.fail with
the same formatted message, and make the same change in the
test_get_complexity_tautology failure branch; also ensure pytest is imported in
test_spec_complexity_util.py if it isn’t already.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7d47d219-812e-480c-814c-0d6a504d3ac8
📒 Files selected for processing (5)
- eval/__init__.py
- eval/get_verification_summary.py
- eval/spec_complexity_util.py
- test/eval/__init__.py
- test/eval/test_spec_complexity_util.py
```python
for vinput in cache.iterkeys():
    try:
        if (vresult := cache[vinput]) and vresult.get_function() == function:
            results.append(vresult)
    except Exception as e:
        results.append(CacheLookupError(e))
        continue
```
Broad Exception catch may obscure specific errors.
Catching all exceptions makes debugging harder when unexpected errors occur (e.g., KeyError, AttributeError, TypeError). The comment at line 27-29 mentions "dangling pointer to a file," suggesting FileNotFoundError or similar is expected.
Consider narrowing to specific exception types that represent cache lookup failures, or at minimum log the original exception for debugging.
Proposed improvement

```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 def _get_lookup_result(cache: Cache, function: CFunction) -> CacheLookupResult:
     ...
     for vinput in cache.iterkeys():
         try:
             if (vresult := cache[vinput]) and vresult.get_function() == function:
                 results.append(vresult)
-        except Exception as e:
+        except (KeyError, FileNotFoundError, OSError) as e:
+            logger.debug("Cache lookup error for %s: %s", vinput, e)
             results.append(CacheLookupError(e))
             continue
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
for vinput in cache.iterkeys():
    try:
        if (vresult := cache[vinput]) and vresult.get_function() == function:
            results.append(vresult)
    except (KeyError, FileNotFoundError, OSError) as e:
        logger.debug("Cache lookup error for %s: %s", vinput, e)
        results.append(CacheLookupError(e))
        continue
```
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 135-135: Do not catch blind exception: Exception
(BLE001)
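A minimal sketch of the narrowed handler, using a plain dict as a stand-in cache and a stand-in `CacheLookupError` (the repo's `Cache` type is assumed, not shown). Unexpected exceptions now propagate instead of being swallowed, and the original error survives in the wrapper's args:

```python
import logging

logger = logging.getLogger(__name__)

class CacheLookupError(Exception):
    """Stand-in for the repo's CacheLookupError."""

def lookup(cache, key):
    # Only expected lookup failures are wrapped; anything else propagates.
    try:
        return cache[key]
    except (KeyError, FileNotFoundError, OSError) as e:
        logger.debug("Cache lookup error for %r: %r", key, e)
        return CacheLookupError(e)

hit = lookup({"a": 1}, "a")
miss = lookup({"a": 1}, "b")
```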
```python
verifying_set = set(verifying_raw)
failing_specs = [
    SpecWithComplexity(spec, *_get_complexity_for_clauses(spec))
    for vresult in vresults
    if (spec := vresult.get_spec()) and spec not in verifying_set
]
```
🧹 Nitpick | 🔵 Trivial
Specs that both verify and fail in different cache entries are excluded from failing_specs.
If the same spec verifies in one cache entry but fails in another (e.g., different invocations), it will only appear in verifying_specs due to the spec not in verifying_set filter. This might be intentional (verified = success), but could lead to confusion if users expect all failure instances to be reported.
Consider documenting this behavior in the docstring or VerificationSummary class.
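The set-difference behavior described above can be illustrated with toy data: a spec that fails in one cache entry but verifies in another never reaches `failing_specs`.

```python
# Toy stand-ins: each tuple is (spec, verified?).
vresults = [("spec_A", True), ("spec_B", False), ("spec_A", False)]

verifying_set = {spec for spec, ok in vresults if ok}
failing_specs = [spec for spec, ok in vresults
                 if not ok and spec not in verifying_set]

# spec_A failed in one entry but verified in another, so it is
# excluded from failing_specs entirely.
```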
```python
return ClauseComplexityError(clause=clause, error=str(e))

match ast:
    case RequiresClause(_, e) | EnsuresClause(_, e) | Assigns(conditions=e, targets=_):
```
Incorrect field name in Assigns pattern match causes failure.
The Assigns dataclass has a field named condition (singular), but the pattern uses conditions=e (plural). This mismatch will cause the pattern to never match for assigns clauses, resulting in them incorrectly falling through to the error case.
Per the context snippet from translation/ast/cbmc_ast.py:39-67, the field is condition: Any | None.
Proposed fix
```diff
-    case RequiresClause(_, e) | EnsuresClause(_, e) | Assigns(conditions=e, targets=_):
+    case RequiresClause(_, e) | EnsuresClause(_, e) | Assigns(condition=e, targets=_):
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
case RequiresClause(_, e) | EnsuresClause(_, e) | Assigns(condition=e, targets=_):
```
```python
match ast:
    case RequiresClause(_, e) | EnsuresClause(_, e) | Assigns(conditions=e, targets=_):
        atoms = get_atoms_in_expression(e)
        unique_atoms = [atom for i, atom in enumerate(atoms) if atom not in atoms[:i]]
```
🧹 Nitpick | 🔵 Trivial
O(n²) unique computation can be simplified.
The current approach iterates atoms and checks membership in slices, resulting in O(n²) complexity.
Use dict.fromkeys for O(n) uniqueness while preserving order
```diff
-    unique_atoms = [atom for i, atom in enumerate(atoms) if atom not in atoms[:i]]
+    unique_atoms = list(dict.fromkeys(atoms))
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
unique_atoms = list(dict.fromkeys(atoms))
```
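`dict.fromkeys` works here because dict keys are unique and, since Python 3.7, preserve insertion order; both forms below produce the same result, but the second is a single pass:

```python
atoms = ["a == b", "x < y", "a == b", "x < y", "z"]

# O(n^2): membership test against a growing slice
quadratic = [atom for i, atom in enumerate(atoms) if atom not in atoms[:i]]

# O(n): dict keys are unique and ordered by first insertion
linear = list(dict.fromkeys(atoms))
```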
```python
match expr:
    case AndOp(left=left, right=right) | OrOp(left=left, right=right):
        result = [*result, *get_atoms_in_expression(left), *get_atoms_in_expression(right)]
    case NotOp(e):
        result = [*result, *get_atoms_in_expression(e)]
    case Quantifier(_, _, body_expr, _):
        result = [*result, *get_atoms_in_expression(body_expr)]
    case e:
        result.append(e)
return result
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Check if there are test cases that verify atoms inside comparison operators are not traversed
rg -n "get_atoms_in_expression" test/
```

Repository: vikramnitin9/rust_verification

Length of output: 816

🏁 Script executed:

```shell
cat -n test/eval/test_spec_complexity_util.py
```

Repository: vikramnitin9/rust_verification

Length of output: 4021

🏁 Script executed:

```shell
# Get the complete implementation of get_atoms_in_expression
rg -A 20 "def get_atoms_in_expression" eval/spec_complexity_util.py
```

Repository: vikramnitin9/rust_verification

Length of output: 835

🏁 Script executed:

```shell
# Check for any docstring in the function
rg -B 5 "def get_atoms_in_expression" eval/spec_complexity_util.py
```

Repository: vikramnitin9/rust_verification

Length of output: 266
Add a clarifying docstring to explain what constitutes an "atom" in this function.
The current docstring does not specify that atoms are propositions without top-level logical operators. The function recursively traverses AndOp, OrOp, NotOp, and Quantifier, but treats comparison and arithmetic operators (EqOp, LtOp, AddOp, etc.) as indivisible atoms. This is the intended behavior per the test cases (e.g., test_count_atoms_in_clause_eq expects EqOp(LtOp(...), AddOp(...)) to return 1 atom), but the docstring should make this explicit:
```python
def get_atoms_in_expression(expr: CBMCAst) -> list[CBMCAst]:
    """Return the atomic propositions in the expression.

    Recursively traverses logical operators (And, Or, Not) and quantifiers to find all
    atomic propositions. Comparison and arithmetic operators are treated as indivisible
    atoms and are not traversed further.

    Args:
        expr (CBMCAst): The expression from which to obtain atoms.

    Returns:
        list[CBMCAst]: The atomic propositions in the given expression.
    """
```
```python
        case Quantifier(_, _, body_expr, _):
            result = [*result, *get_atoms_in_expression(body_expr)]
```
🧹 Nitpick | 🔵 Trivial
Misleading variable name in Quantifier pattern (works today, but fragile).

The pattern uses body_expr as the third positional parameter, but the Quantifier dataclass defines the field as expr (per translation/ast/cbmc_ast.py:360-366). Since structural pattern matching uses positional arguments here, and the actual field order is (decl, range_expr, expr, kind), the third position captures expr correctly into body_expr.

However, this is fragile and misleading. The variable name body_expr doesn't match the field name expr, which could cause confusion and maintenance issues, and any future reordering of the dataclass fields would silently change which field is captured.

Suggested improvement for clarity:

```diff
-        case Quantifier(_, _, body_expr, _):
-            result = [*result, *get_atoms_in_expression(body_expr)]
+        case Quantifier(decl=_, range_expr=_, expr=body_expr, kind=_):
+            result = [*result, *get_atoms_in_expression(body_expr)]
```

Or use named parameters consistently:

```diff
-        case Quantifier(_, _, body_expr, _):
-            result = [*result, *get_atoms_in_expression(body_expr)]
+        case Quantifier(expr=expr):
+            result = [*result, *get_atoms_in_expression(expr)]
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
        case Quantifier(decl=_, range_expr=_, expr=body_expr, kind=_):
            result = [*result, *get_atoms_in_expression(body_expr)]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@eval/spec_complexity_util.py` around lines 107 - 108, The pattern match for
Quantifier uses a misleading variable name `body_expr`; update the match to use
the actual field name `expr` (e.g., change `case Quantifier(_, _, body_expr,
_):` to `case Quantifier(_, _, expr, _):` or use a named pattern `case
Quantifier(expr=expr):`) and then call `get_atoms_in_expression(expr)` so the
variable name matches the dataclass field `expr` and avoids confusion in the
`get_atoms_in_expression` handling.
```python
    match complexity:
        case ClauseComplexity(num_atoms=3, is_tautology=False):
            pass
        case _:
            assert False, (
                f"'{clause}' should be reported to have 3 atoms and not be a tautology, but got {complexity}"
            )
```
🧹 Nitpick | 🔵 Trivial
Consider using pytest.fail() instead of assert False.

Static analysis (B011, PT015) flags assert False because it's removed when Python runs with -O (optimized mode). Using pytest.fail() ensures the failure always occurs.

Proposed fix:

```diff
+import pytest
+
 def test_get_complexity_simple() -> None:
     clause = "__CPROVER_requires(a < b || c && 1 + 2 == d)"
     complexity = get_complexity(clause)
     match complexity:
         case ClauseComplexity(num_atoms=3, is_tautology=False):
             pass
         case _:
-            assert False, (
+            pytest.fail(
                 f"'{clause}' should be reported to have 3 atoms and not be a tautology, but got {complexity}"
             )
```

Apply the same pattern to test_get_complexity_tautology at lines 38-41.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
    match complexity:
        case ClauseComplexity(num_atoms=3, is_tautology=False):
            pass
        case _:
            pytest.fail(
                f"'{clause}' should be reported to have 3 atoms and not be a tautology, but got {complexity}"
            )
```
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 28-30: Assertion always fails, replace with pytest.fail()
(PT015)
[warning] 28-28: Do not assert False (python -O removes these calls), raise AssertionError()
Replace assert False
(B011)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/eval/test_spec_complexity_util.py` around lines 24 - 30, Replace uses of
"assert False, (...)" in the pattern-match failure branches with
pytest.fail(...) to avoid issues when Python is run with -O; specifically update
the failure in the match for ClauseComplexity (the block that currently does
"case _: assert False, f\"...\"") to call pytest.fail with the same formatted
message, and make the same change in the test_get_complexity_tautology failure
branch; also ensure pytest is imported in test_spec_complexity_util.py if it
isn’t already.
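The -O behavior this comment relies on can be observed directly via the `optimize` parameter of the built-in `compile()` (a standalone sketch, not part of the PR; `pytest.fail()` instead raises its own Failed exception unconditionally, so it survives optimization):

```python
# Under `python -O` (optimize=1), assert statements are compiled away
# entirely, so a failing `assert False` branch silently does nothing.
src = (
    "executed = []\n"
    "assert False, 'boom'\n"
    "executed.append('after assert')\n"
)

# optimize=1 corresponds to -O: the assert is stripped, execution continues.
ns_optimized = {}
exec(compile(src, "<snippet>", "exec", optimize=1), ns_optimized)

# optimize=0 keeps asserts: AssertionError is raised as expected.
raised = False
try:
    exec(compile(src, "<snippet>", "exec", optimize=0), {})
except AssertionError:
    raised = True

print(ns_optimized["executed"], raised)  # ['after assert'] True
```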
Closes #130