Skip to content

fix GHI# 206#207

Open
nnuk wants to merge 2 commits intodevfrom
fix_MolecularInorganics_leak
Open

fix GHI# 206#207
nnuk wants to merge 2 commits intodevfrom
fix_MolecularInorganics_leak

Conversation

@nnuk
Copy link
Copy Markdown
Collaborator

@nnuk nnuk commented Apr 13, 2026

This PR fixes memory leaks in the Molecular Inorganics pipeline that were exposed by unit tests as described in the GHI#206.

The fix controls the flow in ProcessOneStructure() to prevent duplicate INChI generation when Molecular Inorganics mode is enabled.

Copy link
Copy Markdown
Collaborator

@JanCBrammer JanCBrammer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked out one commit prior to the fix (3a730b5) and compiled the tests with address sanitization (by setting -DCMAKE_BUILD_TYPE=Debug).
Running the unit tests resulted in the following (abbreviated):

SUMMARY: AddressSanitizer: 7879 byte(s) leaked in 185 allocation(s).
The following tests FAILED:
 11 - test_molecularInorganics (Failed)

I then checked out the fix (e273081) and re-compiled (again with address sanitization). All unit tests pass now and run without any output from the address sanitizer:
100% tests passed, 0 tests failed out of 15

Therefore, I can confirm that the PR fixes the memory leak.
I had Claude (Sonnet 4.6) review the implementation. It seems to me that its three remarks are worth following up. What follows is AI-generated.

Overview

The PR correctly identifies that ProcessOneStructure() was falling through from the Molecular Inorganics branch into the standard InChI generation path. This caused CreateOneStructureINChI() to be called a second time on the already-preprocessed structure, allocating data that was never freed. The fix — adding goto after_structure_generation — skips the duplicate work and is consistent with the existing heavy use of goto exit_function throughout the file.


What the goto skips

Before reaching after_structure_generation:, the standard path does four things that Molecular Inorganics mode now bypasses:

Skipped code Safe to skip?
Polymer frame-shift housekeeping (is_polymer block) Yes — not applicable to Molecular Inorganics
OrigAtData_SaveMolfile() Needs confirmation (see below)
OrigAtData_StoreNativeInput() / pOrigStruct setup Needs documentation (see below)
Standard CreateOneStructureINChI() calls Yes — this is the fix

Concerns

1. OrigAtData_SaveMolfile is silently skipped

// runichi.c ~396 — now unreachable when ip->bMolecularInorganics is true
ret1 = OrigAtData_SaveMolfile(orig_inp_data, sd, ip, num_inp, out_file);

OrigAtData_SaveMolfile is a no-op unless INCHI_OUT_SDFILE_ONLY is set. If a user combines -MolecularInorganics with -OutputSDF, SDF output will be silently skipped. Please confirm whether that combination is intentionally unsupported, or move the OrigAtData_SaveMolfile call to before the goto if SDF output should work in this mode.

2. pOrigStruct remains NULL — AuxInfo reversibility layers are lost

// runichi.c ~403–406 — also now unreachable
pOrigStruct = &OrigStruct;
memset(pOrigStruct, 0, sizeof(*pOrigStruct));
OrigAtData_StoreNativeInput(pCG, &nRet, sd, ip, orig_inp_data, pOrigStruct);

pOrigStruct is initialised to NULL at line 246 and only assigned here. With the goto, it stays NULL for Molecular Inorganics structures. There is no crash riskOrigStruct_Free, OutputAUXINFO_ReversibilityInfo, and OutputAUXINFO_PolymerInfo all guard against NULL before dereferencing. However, the /rA, /rB, /rC AuxInfo reversibility layers will be absent from Molecular Inorganics output. The comment in OrigAtData_StoreNativeInput says "v. 1.05 always create and fill OrigStruc". Is silently omitting AuxInfo the intended behaviour for this mode? If yes, document it.

3. No comment on the goto

The jump is non-obvious — a reader has to mentally trace what is being skipped and why. A one-liner like:

/* InChI already generated via Molecular Inorganics path; skip standard generation to avoid duplicate calls and leaks */
goto after_structure_generation;

would significantly help future maintainers.


Positives

  • The else if (nRet != _IS_FATAL && nRet != _IS_ERROR) { maxINChI = 1; } branch is correct and was genuinely missing — without it maxINChI would stay 0 in the non-reconnected Molecular Inorganics case.
  • Error paths (goto exit_function inside the reconnected-InChI block) interact cleanly with the new label.
  • The trailing-whitespace removal in CreateOneComponentINChI is harmless.

Summary

The core fix is correct and stops the duplicate CreateOneStructureINChI() call that was causing the leaks. Before merging, please address:

  1. Confirm or fix the OrigAtData_SaveMolfile skip — is -OutputSDF + -MolecularInorganics a supported combination?
  2. Document (comment or PR description) the decision to leave pOrigStruct = NULL, which silently omits AuxInfo reversibility for Molecular Inorganics output.
  3. Add a short comment at goto after_structure_generation explaining why the skip is needed.

@nnuk
Copy link
Copy Markdown
Collaborator Author

nnuk commented Apr 16, 2026

Thanks @JanCBrammer for the review.

  1. The OrigAtData_SaveMolfile() functionality is now preserved in the MI code flow also.

  2. The molecular inorganics preprocessing flow modifies the input structure (e.g., bond disconnections or transformations required for handling molecular inorganics). As a result, the structure used for INChI generation is no longer strictly equivalent to the original molfile representation. For this reason, OrigAtData_StoreNativeInput() is intentionally not invoked in the Molecular Inorganics path, and pOrigStruct remains NULL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants