
Feat/modular julia pardiso #310

Open
QG-phy wants to merge 20 commits into deepmodeling:main from QG-phy:feat/modular-julia-pardiso

Conversation

@QG-phy
Collaborator

@QG-phy QG-phy commented Jan 26, 2026

Summary by CodeRabbit

  • New Features

    • Added Pardiso backend for high-performance eigenvalue solving in band structure and DOS calculations.
    • Introduced dptb pdso CLI command to run Pardiso-based postprocessing workflows.
    • Added to_pardiso_json() method for structured data export to the Julia backend.
    • Included automated installation scripts for Julia and Pardiso dependencies on Linux.
  • Documentation

    • Added comprehensive tutorials, examples, and architecture guides for Pardiso integration.

Review Change Stack

YJQ-7 and others added 13 commits January 22, 2026 19:34
…o/Julia band structure calculations and include related tests.
- Remove redundant solve_eigen_dense function and fixed band window logic
- Replace custom eigenvalue solving with existing solve_eigen_at_k function
- Update example notebook with new imports and execution outputs
- Fix root_dir path from Pardiso_teach to To_pardiso
- Add sys.path manipulation for proper module imports
- Expand Julia installation instructions with detailed steps
- Add MKL and Pardiso environment variable configuration
- Include troubleshooting section for common issues
- Clear unnecessary cell outputs and update execution counts
…y check

Replace generic PardisoSolver with MKLPardisoSolver for consistent performance and add explicit architecture check for Apple Silicon systems to prevent runtime failures with helpful guidance to use numpy solver instead.
- Implement `dos_calculation.jl` for Density of States calculations
- Update `main.jl` to switch between `band` and `dos` tasks
- Refactor `band_calculation.jl` to export `bandstructure.h5` using native HDF5
- Add incremental `bands.dat` text output for real-time tracking
- Add verification notebook `examples/To_pardiso/dptb_to_Pardiso_new.ipynb`
- Refactored  to use direct ASE integration in .
- Implemented modular Julia backend structure in .
- Optimized : compressed species, removed redundant orbital arrays, aligned formatting.
- Updated example notebook  with new API usage.
- Improved logging: Output logs to both console and .
- Renamed output directory to .
…fix spin handling

- Renamed `dptb/postprocess/julia` to `dptb/postprocess/pardiso`.
- Updated `io.jl` to support legacy `.dat` files via `load_structure_dat`.
- Fixed `load_structure_json` to correctly account for spin degeneracy.
- Updated `pdso.py` and tests to reflect directory rename.
…hod and rename the legacy text export to `to_pardiso_debug`.
…pdso` entrypoint with ill-conditioned state projection parameters.
…o backend, including platform support details.
@QG-phy QG-phy requested a review from floatingCatty January 26, 2026 01:58
@coderabbitai
Contributor

coderabbitai Bot commented Jan 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0d66d44d-2a25-4109-9bde-5683cda030c5

📥 Commits

Reviewing files that changed from the base of the PR and between 8778096 and 95caee6.

📒 Files selected for processing (5)
  • .gitignore
  • dptb/entrypoints/pdso.py
  • dptb/postprocess/pardiso/solvers/pardiso_solver.jl
  • dptb/postprocess/unified/system.py
  • examples/To_pardiso/test_pardiso_new.py
✅ Files skipped from review due to trivial changes (1)
  • .gitignore

📝 Walkthrough

Walkthrough

This PR integrates a high-performance Julia-based Pardiso eigensolver backend into DeePTB. The addition includes a modular Julia backend with separate I/O, solver, and task modules; a new pdso Python CLI entrypoint that orchestrates TBSystem JSON export and subprocess-based Julia execution; enhanced TBSystem export methods generating structure.json and HDF5 files; comprehensive installation automation for Linux; and extensive examples and tests demonstrating the workflow.

Changes

Pardiso Backend Integration

| Layer | File(s) | Summary |
|---|---|---|
| Python CLI and TBSystem export | dptb/entrypoints/pdso.py, dptb/postprocess/unified/system.py, dptb/postprocess/unified/calculator.py | New pdso entrypoint supports full-export mode (init_model + structure → TBSystem.to_pardiso_json) and run-only mode (pre-exported data_dir). TBSystem methods export Hamiltonian/overlap HDF5 and structure.json with geometry, basis metadata, and version info. |
| Julia I/O module (DataIO) | dptb/postprocess/pardiso/io/io.jl | DataIO loads structure via JSON dispatcher with legacy .dat fallback; parses basis info and derives orbital counts; loads HDF5 Hamiltonian/overlap matrices; expands chemical formulas to element symbols. |
| Julia solver modules | dptb/postprocess/pardiso/solvers/pardiso_solver.jl, dptb/postprocess/pardiso/solvers/dense_solver.jl | PardisoSolver implements shift-invert Pardiso eigensolver with optional ill-conditioning projection and iterative refinement; DenseSolver provides dense generalized eigensolver fallback for small systems and testing. |
| Julia utilities and tasks | dptb/postprocess/pardiso/utils/hamiltonian.jl, dptb/postprocess/pardiso/utils/kpoints.jl, dptb/postprocess/pardiso/tasks/band_calculation.jl, dptb/postprocess/pardiso/tasks/dos_calculation.jl | Hamiltonian constructs sparse H_R/S_R from HDF5 blocks and Fourier-transforms to k-space; KPoints parses k-paths and generates k-meshes; BandCalculation and DosCalculation tasks orchestrate eigensolves and write results to HDF5 and ASCII files. |
| Julia main entry and legacy implementation | dptb/postprocess/pardiso/main.jl, dptb/postprocess/pardiso/sparse_calc_npy_print.jl | main.jl parses CLI args, loads config/structure/matrices, selects solver/task, and runs band or DOS calculations; sparse_calc_npy_print.jl provides comprehensive legacy implementation with NPy export for backward compatibility. |
| Installation automation | install_julia.sh, install_julia_packages.jl, test_mkl_pardiso.jl | install_julia.sh automates Julia setup on Linux with OS checks; install_julia_packages.jl installs Pardiso and dependencies; test_mkl_pardiso.jl verifies MKL runtime loading and Pardiso solver availability. |
| Tests and integration | dptb/tests/test_to_pardiso.py, examples/To_pardiso/test_pardiso_new.py | test_to_pardiso.py adds test_to_pardiso_json to validate JSON export schema; test_pardiso_new.py exercises full TBSystem export and Julia backend via subprocess, verifying structure.json content and output files. |
| Example notebooks and scripts | examples/To_pardiso/dptb_to_Pardiso_new.ipynb, examples/To_pardiso/pardiso_tutorial.ipynb, examples/To_pardiso/dptb_to_Pardiso.ipynb | dptb_to_Pardiso_new.ipynb demonstrates JSON export and modular backend execution; pardiso_tutorial.ipynb provides complete workflow (Python export, manual Julia, CLI invocation, result visualization, backward compatibility); dptb_to_Pardiso.ipynb updates legacy paths and includes example config. |
| Documentation | README.md, examples/To_pardiso/README.md, dptb/postprocess/pardiso/README.md, band_new.json, examples/To_pardiso/band.json | README.md adds Julia Backend section; examples/To_pardiso/README.md documents directory, CLI/Python APIs, configuration, troubleshooting; Pardiso backend README details modular architecture; band_new.json and band.json provide example configs for band calculations. |

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 64.29% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Feat/modular julia pardiso' is directly related to the main objective of the PR: introducing a modular Julia Pardiso integration. It clearly references the primary feature being added. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 18

🤖 Fix all issues with AI agents
In `@dptb/entrypoints/pdso.py`:
- Around line 18-22: The parameters log_level, log_path, and **kwargs on the
function signature in dptb/entrypoints/pdso.py are currently unused; either
remove them or configure logging at the start of that function: import logging,
call logging.basicConfig(level=log_level, **kwargs) (or pass specific kwargs
through), get a logger (e.g., log = logging.getLogger(__name__)), and if
log_path is set add a FileHandler with file_handler.setLevel(log_level) and
log.addHandler(file_handler); if you prefer to drop them, remove log_level,
log_path and **kwargs from the function signature and any docs/comments
referring to them.
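
A minimal sketch of the "configure logging" option described above, using only the standard library. The helper name `configure_logging` and the logger name `pdso_sketch` are hypothetical and not taken from the PR; the actual entrypoint may wire this differently.

```python
import logging
import os
import tempfile
from typing import Optional


def configure_logging(log_level: int = logging.INFO,
                      log_path: Optional[str] = None) -> logging.Logger:
    """Hypothetical helper: console logging plus an optional log file."""
    log = logging.getLogger("pdso_sketch")  # illustrative logger name
    log.setLevel(log_level)
    log.handlers.clear()  # avoid duplicate handlers on repeated calls

    console = logging.StreamHandler()
    console.setLevel(log_level)
    log.addHandler(console)

    # Only attach a file handler when a path was actually supplied.
    if log_path is not None:
        file_handler = logging.FileHandler(log_path)
        file_handler.setLevel(log_level)
        log.addHandler(file_handler)
    return log


# Usage: write to a temporary file so the sketch leaves no stray logs behind.
log_file = os.path.join(tempfile.mkdtemp(), "pdso.log")
log = configure_logging(logging.INFO, log_file)
log.info("export started")
```

With this shape, `log_level` and `log_path` are both consumed, which is the point of the comment; `**kwargs` would still need to be either forwarded deliberately or removed.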

In `@dptb/postprocess/pardiso/io/io.jl`:
- Around line 116-124: The z_to_symbol Dict is missing atomic numbers 57–71
(lanthanides) and a few others so lookups like z_to_symbol[z] (used later in
this module) will throw a KeyError; fix by either populating z_to_symbol with the full
mapping for 1–~84 (add 57=>"La", 58=>"Ce", 59=>"Pr", 60=>"Nd", 61=>"Pm",
62=>"Sm", 63=>"Eu", 64=>"Gd", 65=>"Tb", 66=>"Dy", 67=>"Ho", 68=>"Er", 69=>"Tm",
70=>"Yb", 71=>"Lu" and any other missing entries) or replace direct lookups with
a safe lookup using get(z_to_symbol, z, fallback_symbol) or integrate a
periodic-table package (e.g. PeriodicTable.jl) and use its lookup function
instead.

In `@dptb/postprocess/pardiso/main.jl`:
- Around line 118-130: Refactor pardiso_solver.jl to declare a module (e.g.,
module PardisoSolver) and export/define solve_eigen_k inside it (mirroring
DenseSolver), then update main.jl to select the implementation by qualifying the
function or importing selectively: when eig_solver == "dense" use
DenseSolver.solve_eigen_k (or import DenseSolver: solve_eigen_k), and when
eig_solver == "pardiso" use PardisoSolver.solve_eigen_k (or import
PardisoSolver: solve_eigen_k); ensure main.jl no longer relies on an unqualified
global solve_eigen_k so the Pardiso implementation is not shadowed by
DenseSolver.

In `@dptb/postprocess/pardiso/solvers/pardiso_solver.jl`:
- Around line 1-13: This file is missing a module wrapper so the export
statements (used later around the exports near the end of the file) have no
effect; fix it by wrapping the file contents in a module declaration (e.g.,
module PardisoSolver) and a matching end at the bottom of pardiso_solver.jl,
moving the using statements (using Pardiso, Arpack, LinearMaps, LinearAlgebra)
and const default_dtype inside that module so exported symbols work correctly;
ensure the module name you choose matches the intended public API referenced by
the export statements.

In `@dptb/postprocess/pardiso/sparse_calc_npy_print.jl`:
- Around line 326-334: The z_to_symbol Dict in sparse_calc_npy_print.jl is
incomplete (missing Z=57–71, 85–88, 89+) and will raise KeyError for those
atoms; update the z_to_symbol mapping to include the full periodic table or
replace its usage with a reliable lookup (e.g., using a package like
PeriodicTable.jl or a complete array/map) and add a safe fallback (e.g., return
"X" or string(z) when an unknown Z is requested) wherever z_to_symbol is
accessed (search for z_to_symbol in this file to locate call sites).
- Around line 398-400: Remove the duplicated save call: there are two identical
invocations of save(sparse_file, "H_R", H_R, "S_R", S_R) back-to-back; keep a
single save(...) to avoid redundant I/O and leave the subsequent
tee_info("Sparse matrices constructed and cached", log_path) unchanged so the
caching message still logs after the one save completes.

In `@dptb/postprocess/pardiso/tasks/dos_calculation.jl`:
- Around line 94-98: The format string in the `@printf` call within the open(...,
"w") do f block is using an escaped backslash sequence ("\\n") which will write
a literal backslash and 'n' instead of a newline; update the `@printf` call in the
loop (for (ω, d) in zip(ωlist, dos)) to use a normal newline escape ("\n") so
each record is written on its own line and the output file dos.dat is correctly
formatted.

In `@dptb/tests/test_to_pardiso.py`:
- Around line 118-126: total_orbitals and expected_orbitals are assigned but
never asserted; compute expected_orbitals from the per-atom orbital count (e.g.
data["basis_info"]["orbitals_per_atom"] or similar field) multiplied by the
number of atoms in the test system, then if has_soc (tbsys.model.soc_param
present) adjust expected_orbitals for spin doubling and assert that
data["basis_info"]["total_orbitals"] == expected_orbitals; add a clear assertion
message referencing total_orbitals, expected_orbitals and the spinful flag to
fail loudly if mismatch.

In `@examples/To_pardiso/dptb_to_Pardiso_new.ipynb`:
- Around line 133-135: The julia_script path is wrong and will raise
FileNotFoundError; update the string used to build the Julia script path (the
julia_script variable where parent_path is joined) from
"dptb/postprocess/julia/main.jl" to the correct
"dptb/postprocess/pardiso/main.jl" so it matches other references (see pdso.py
usage) and ensure parent_path/julia_script points to the actual main.jl; verify
the related variables parent_path and config_path remain unchanged.
- Around line 107-108: Replace the non-existent method call tbsys.to_pardiso_new
with the correct existing method tbsys.to_pardiso throughout the notebook and
its documentation: update the call at the shown cell (lines calling
to_pardiso_new), any example/test references that mention to_pardiso_new, and
any explanatory text or docstrings that describe to_pardiso_new so they
reference to_pardiso instead; verify the notebook now invokes the actual
function name to_pardiso (the implementation is the existing to_pardiso method).

In `@examples/To_pardiso/dptb_to_Pardiso.ipynb`:
- Around line 263-265: The notebook sets julia_script using parent_path and the
outdated subpath "dptb/postprocess/julia/sparse_calc_npy_print.jl"; update the
julia_script assignment to point to the new location under
"dptb/postprocess/pardiso/" (replace the "dptb/postprocess/julia/..." segment
with "dptb/postprocess/pardiso/...") so the variable julia_script resolves to
the moved script while still constructing the path from parent_path.

In `@examples/To_pardiso/README.md`:
- Around line 85-89: The README k-point example uses "klabels": ["Γ", "Z"] which
is inconsistent with the project's config files that use "G"; update the
documentation to match the actual configs by replacing the "Γ" label with "G"
(or alternatively update the config files to use "Γ" if you prefer the Unicode
label) so that the "klabels" entries in the README and the "klabels" in
band.json / band_new.json are identical; locate the "klabels" array in the
README example and make the label change accordingly.

In `@examples/To_pardiso/test_pardiso_new.py`:
- Around line 15-16: The current sys.path insertion uses
os.path.dirname(os.path.dirname(os.getcwd())) which is fragile; change the
insertion to derive the project root from the script file location (use
__file__) instead — e.g. compute the absolute path via
os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..')) or
Path(__file__).resolve().parents[2] and pass that to sys.path.insert(0, ...);
update the call site where sys.path.insert is invoked and remove the
os.getcwd()-based os.path.dirname(os.path.dirname(os.getcwd())) expression.
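
To illustrate the `parents[2]` variant suggested above, here is a small sketch that resolves the repository root from a script path. The checkout location `/home/user/DeePTB` is a made-up example; a `PurePosixPath` is used so the result does not depend on where the sketch is run.

```python
from pathlib import PurePosixPath

# Hypothetical checkout location, used only for illustration.
script = PurePosixPath(
    "/home/user/DeePTB/examples/To_pardiso/test_pardiso_new.py"
)

# parents[0] is the script's directory (examples/To_pardiso),
# parents[1] is examples/, and parents[2] is the repository root.
repo_root = script.parents[2]
print(repo_root)

# In the real script the equivalent lines would be:
#   repo_root = Path(__file__).resolve().parents[2]
#   sys.path.insert(0, str(repo_root))
```

Unlike the `os.getcwd()` version, the result depends only on where the file lives, not on the directory the user launches it from.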

In `@install_julia_packages.jl`:
- Around line 29-46: The loop that installs packages (iterating over packages
and calling Pkg.add) currently swallows failures and proceeds to Pkg.precompile,
so change it to collect failed package names when a Pkg.add throws (e.g., push!
to a failed_packages Vector), print a summary of failed packages after the loop
and abort immediately (exit(1) or throw an error) if failed_packages is
non-empty instead of calling Pkg.precompile or printing "Installation complete";
keep the success/failed prints per-package but ensure Pkg.precompile only runs
when failed_packages is empty.

In `@install_julia.sh`:
- Around line 68-74: The current call julia install_julia_packages.jl is
relative to the current working directory; make it path-independent by resolving
the script's directory and invoking Julia with the absolute path to
install_julia_packages.jl (i.e., compute the script folder from $0/BASH_SOURCE
and call Julia with "$SCRIPT_DIR/install_julia_packages.jl"). Update
install_julia.sh to resolve the script directory and use that absolute path when
running the install_julia_packages.jl command so the install works regardless of
where the user runs the shell script.

In `@README.md`:
- Around line 197-202: Update the README's "Manual Installation" bash snippet
comment that currently reads "# Linux/macOS" to clarify macOS support: change
the comment to explicitly state that the installer command works on macOS but
the Pardiso backend is not supported (e.g., "# Linux (macOS: Julia installs but
Pardiso backend won't work)") or remove "macOS" entirely; modify the comment
above the curl command in the Manual Installation section so readers know Julia
installs on macOS but Pardiso is unsupported, referencing the existing "#
Linux/macOS" comment to locate the change.

In `@test_mkl_pardiso.jl`:
- Around line 37-39: The condition uses the Ref{Bool} flag MKL_PARDISO_LOADED
but fails to dereference it; change the conditional and any access to
MKL_PARDISO_LOADED to use MKL_PARDISO_LOADED[] so the Bool value is read (e.g.,
if !MKL_PARDISO_LOADED[] ...), and update any other occurrences in this test
(such as prints or branches) to consistently use the [] dereference when
checking or reading the flag.
- Around line 10-13: Replace the hard-coded macOS library extension when
building mkl_path in test_mkl_pardiso.jl: instead of appending ".dylib"
directly, use a platform-aware extension (e.g. Libdl.dlext or conditional on
Sys.islinux()/Sys.isapple()/Sys.iswindows()) and construct mkl_path with
joinpath(MKL_jll.LIBPATH[], "libmkl_rt"*Libdl.dlext) so the test finds the
correct MKL runtime on Linux (.so), macOS (.dylib) or Windows (.dll).
🧹 Nitpick comments (34)
CLAUDE.md (1)

51-89: Consider documenting the new pdso CLI command.

The PR introduces a new pdso entrypoint for the Pardiso workflow, but it's not documented in the "Running DeePTB" section. For completeness, consider adding:

# Run Pardiso workflow for postprocessing
uv run dptb pdso INPUT -i INIT_MODEL [-stu STRUCTURE] [-o OUTPUT]
examples/To_pardiso/band.json (2)

32-37: String booleans instead of native JSON booleans.

Multiple fields use string "false" instead of JSON boolean false. This is consistent with band_new.json, but native booleans would be more idiomatic and avoid potential parsing issues.

     "device": "cpu",
-    "out_wfc": "false",
+    "out_wfc": false,
     "which_k": 0,
     "max_iter": 400,
     "num_band": 30,
-    "gamma_only": "false",
-    "isspinful": "false"
+    "gamma_only": false,
+    "isspinful": false
 }
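
The "potential parsing issues" can be made concrete with a short sketch. It shows how a naive truthiness check misreads the string `"false"`, and one possible normaliser; `as_bool` is a hypothetical helper, not something the PR defines.

```python
import json

config = json.loads('{"gamma_only": "false", "isspinful": false}')

# A non-empty string is truthy, so a naive check misreads "false".
naive = bool(config["gamma_only"])


def as_bool(value) -> bool:
    """Hypothetical normaliser for native booleans and 'true'/'false' strings."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value.strip().lower() in ("true", "false"):
        return value.strip().lower() == "true"
    raise ValueError(f"not a boolean: {value!r}")


print(naive, as_bool(config["gamma_only"]), as_bool(config["isspinful"]))
```

Native JSON booleans avoid the need for such a helper on every consumer of the config.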

37-38: Minor: Missing trailing newline.

The file doesn't end with a newline character, which is a POSIX convention. Most editors and linters prefer files to end with a newline.

examples/To_pardiso/README.md (1)

63-69: Hardcoded relative path to Julia script may be fragile.

The Python API example uses a relative path ../../dptb/postprocess/pardiso/main.jl which will break if the script is run from a different directory. Consider documenting the need to adjust the path or suggesting an absolute path approach.

Suggested improvement
import subprocess
import os

# Get path relative to dptb package installation
import dptb
pardiso_main = os.path.join(os.path.dirname(dptb.__file__), "postprocess/pardiso/main.jl")

subprocess.run([
    "julia", pardiso_main,
    "--input_dir", "pardiso_data",
    "--output_dir", "results",
    "--config", "band_new.json"
])
install_julia_packages.jl (1)

7-10: Activate a project-local environment for reproducible installs.
Installing into the default environment can cause version drift across machines. Consider activating a local env (and committing Project/Manifest) so this script is deterministic.

♻️ Suggested adjustment
 using Pkg
+
+# Use a project-local environment for reproducible installs
+Pkg.activate(@__DIR__)
dptb/postprocess/pardiso/README.md (1)

161-166: Minor wording polish in Performance Tips.
Consider removing “very” or using a more precise phrasing.

✏️ Suggested tweak
-4. **Memory**: For very large systems (>10000 orbitals), consider reducing `num_band`
+4. **Memory**: For large systems (>10,000 orbitals), consider reducing `num_band`
examples/To_pardiso/pardiso_tutorial.ipynb (1)

318-326: Small notebook cleanups (label name + f-string).
Rename the list-comprehension variable for clarity and drop the f-string with no placeholders.

✏️ Suggested tweak
-    labels = [l.decode('utf-8') for l in labels_bytes]
+    labels = [label.decode('utf-8') for label in labels_bytes]
 ...
-print(f"Successfully loaded bandstructure.h5")
+print("Successfully loaded bandstructure.h5")
dptb/postprocess/pardiso/utils/kpoints.jl (1)

21-56: klabels_vec is returned but never populated from kpath_config.

The function returns klabels_vec which is either empty or a copy of the labels parameter, but the k-path configuration rows likely contain label information that isn't being extracted. If labels are expected to come from kpath_config (e.g., as a 5th element), they should be parsed; otherwise, the docstring should clarify that labels must be passed explicitly.

Additionally, when n_segment == 0 for a segment, no intermediate k-points are added, but high_sym_kpts still accumulates the distance. This may cause a mismatch between high_sym_kpts indices and actual k-point positions in klist_vec.

dptb/postprocess/pardiso/solvers/dense_solver.jl (2)

70-74: Empty conditional block is dead code.

The if block at lines 71-74 contains only a comment and no executable code. Either implement the padding logic or remove this block to reduce confusion.

♻️ Suggested fix
-    # Check if we have enough
-    if length(closest_indices) < num_band
-        # Pad with very large values if not enough states (unlikely for dense)
-        # But here we just return what we have
-    end
+    # Note: If length(closest_indices) < num_band, we return what we have.
+    # This is unlikely for dense diagonalization but handled gracefully.

46-68: Band selection strategy may cause discontinuities in band structure plots.

Selecting the num_band eigenvalues closest to fermi_level independently at each k-point can cause band switching/discontinuities when bands cross the selection boundary. The inline comments acknowledge this concern.

For production use, consider either:

  1. Selecting a fixed index range (e.g., bands n to n+num_band-1) determined at the first k-point
  2. Using an energy window [fermi_level - delta, fermi_level + delta]

This is acceptable for initial implementation but worth revisiting if visual artifacts appear.
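
A toy sketch of the band-switching concern, written in Python for brevity rather than in the solver's Julia. The eigenvalues are made-up numbers and both function names are hypothetical; the sketch only contrasts "closest to the Fermi level" with a fixed index window.

```python
def closest_to_fermi(eigs, fermi, num_band):
    """Indices of the num_band eigenvalues nearest to fermi, in ascending order."""
    by_distance = sorted(range(len(eigs)), key=lambda i: abs(eigs[i] - fermi))
    return sorted(by_distance[:num_band])


def fixed_window(start, num_band):
    """Indices start .. start + num_band - 1, identical at every k-point."""
    return list(range(start, start + num_band))


# Sorted toy spectra at two neighbouring k-points (illustrative values only).
eigs_k1 = [-3.0, -1.0, 0.5, 2.5]
eigs_k2 = [-3.0, -1.8, 1.2, 1.6]
fermi = 0.0

sel_k1 = closest_to_fermi(eigs_k1, fermi, 2)
sel_k2 = closest_to_fermi(eigs_k2, fermi, 2)
window = fixed_window(sel_k1[0], 2)  # window fixed at the first k-point

print(sel_k1, sel_k2, window)
```

Here the per-k-point selection moves from bands (1, 2) to bands (2, 3), which would show up as a jump in a plot, while the fixed window keeps following the same band indices.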

dptb/postprocess/pardiso/io/io.jl (2)

78-84: Potential inconsistency between calculated site_norbits and total_orbitals from JSON.

site_norbits is calculated locally (line 78) while norbits is read directly from basis["total_orbitals"] (line 84). If these don't match due to data inconsistency, downstream code may fail silently. Consider adding a validation check.

♻️ Suggested validation
     site_norbits = [orb_counts[sym] * (1 + spinful) for sym in symbols]
+    
+    # Validate consistency
+    calculated_total = sum(site_norbits)
+    if calculated_total != basis["total_orbitals"]
+        @warn "Mismatch: calculated orbitals ($calculated_total) != JSON total ($(basis["total_orbitals"]))"
+    end

     structure = Dict{String, Any}(

183-184: Missing blank line before docstring.

There's no blank line between the closing of load_matrix_hdf5 (line 183) and the docstring for expand_species (line 184). This affects readability and may cause documentation generation issues.

♻️ Suggested fix
     end
 end
+
 """
 Expand chemical formula string (e.g. "C2H2") or list to list of symbols.
 """
dptb/entrypoints/main.py (2)

511-517: Argument naming inconsistency: --init_model vs --init-model.

Other subparsers (train, test, run, export) use --init-model with a hyphen, but pdso uses --init_model with an underscore. This inconsistency may confuse users. Argparse converts hyphens to underscores internally, so --init-model would still map to init_model in the namespace.

♻️ Suggested fix for consistency
     parser_pdso.add_argument(
         "-i",
-        "--init_model",
+        "--init-model",
         type=str,
         default=None,
         help="Path to model checkpoint (triggers Export + Run mode)."
     )

Note: Because argparse derives the destination init_model from --init-model, the init_model parameter in dptb/entrypoints/pdso.py does not need to change; passing dest="init_model" explicitly is optional and only documents the mapping.
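
A quick check of the hyphen-to-underscore behaviour, as a standalone sketch. The program name `pdso-sketch` and the file name `model.pth` are placeholders.

```python
import argparse

parser = argparse.ArgumentParser(prog="pdso-sketch")
parser.add_argument(
    "-i",
    "--init-model",
    type=str,
    default=None,
    help="Path to model checkpoint.",
)

# argparse derives the attribute name from the first long option and
# replaces hyphens with underscores, so the value lands in args.init_model.
args = parser.parse_args(["--init-model", "model.pth"])
print(args.init_model)
```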


543-548: Non-standard boolean argument parsing.

Using type=lambda x: x.lower() == 'true' requires users to type --ill_project true or --ill_project false. The more common CLI pattern uses --flag / --no-flag with store_true/store_false actions, or BooleanOptionalAction in Python 3.9+.

Current approach works but may surprise users expecting standard flag behavior.

♻️ Alternative using BooleanOptionalAction (Python 3.9+)
     parser_pdso.add_argument(
-        "--ill_project",
-        type=lambda x: x.lower() == 'true',
-        default=True,
-        help="Enable ill-conditioned state projection (default: True)."
+        "--ill-project",
+        action=argparse.BooleanOptionalAction,
+        default=True,
+        help="Enable/disable ill-conditioned state projection (default: enabled)."
     )

This allows --ill-project (enable) or --no-ill-project (disable).
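
For reference, a minimal sketch of how `argparse.BooleanOptionalAction` behaves (Python 3.9+). Only the positive option string is registered; the `--no-` variant is generated by the action. The program name is a placeholder.

```python
import argparse

parser = argparse.ArgumentParser(prog="pdso-sketch")
parser.add_argument(
    "--ill-project",
    action=argparse.BooleanOptionalAction,
    default=True,
    help="Enable/disable ill-conditioned state projection.",
)

default_args = parser.parse_args([])              # flag omitted
enabled = parser.parse_args(["--ill-project"])    # explicit enable
disabled = parser.parse_args(["--no-ill-project"])  # generated negation

print(default_args.ill_project, enabled.ill_project, disabled.ill_project)
```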

examples/To_pardiso/dptb_to_Pardiso_new.ipynb (2)

47-50: Incorrect use of "__file__" string literal.

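Line 47 uses os.path.abspath("__file__"), which resolves the literal string "__file__" against the current working directory rather than locating the notebook. Taking the dirname of that result therefore always yields the current working directory, so the fallback to os.getcwd() at line 50 is effectively dead code. Since __file__ is not defined in Jupyter notebooks anyway, consider simplifying: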

♻️ Suggested fix
-"root_dir = os.path.dirname(os.path.abspath(\"__file__\")) \n",
-"# Note: In Jupyter __file__ might not exist, using current dir if needed\n",
-"if not os.path.exists(root_dir) or root_dir == '':\n",
-"    root_dir = os.getcwd()\n",
+"# In Jupyter notebooks, use the current working directory\n",
+"root_dir = os.getcwd()\n",
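
The behaviour of the string literal can be verified directly; this sketch assumes nothing beyond the standard library.

```python
import os

# "__file__" in quotes is just a relative file name, so abspath() joins it
# onto the current working directory.
root_dir = os.path.dirname(os.path.abspath("__file__"))

same_as_cwd = root_dir == os.getcwd()
exists = os.path.exists(root_dir)
print(same_as_cwd, exists)
```

Because the directory exists and is non-empty, the `if not os.path.exists(root_dir) or root_dir == ''` branch in the notebook is never taken.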

191-193: Ambiguous variable name l.

The variable l (lowercase L) can be confused with 1 (one) or I (uppercase i) in many fonts. Use a more descriptive name like label or lbl.

♻️ Suggested fix
-"    labels = [l.decode('utf-8') for l in labels_bytes]\n",
+"    labels = [label.decode('utf-8') for label in labels_bytes]\n",
examples/To_pardiso/test_pardiso_new.py (1)

13-13: Remove unused import.

numpy is imported but never used in this file.

Proposed fix
 import os
 import sys
 import json
-import numpy as np
dptb/postprocess/pardiso/tasks/dos_calculation.jl (1)

89-92: Consider vectorizing the Gaussian broadening loop for performance.

The nested triple loop iterates over nk_total × num_band × length(ωlist), which can be expensive for large systems. Julia's broadcasting could significantly improve performance:

Vectorized alternative
# Vectorized approach using broadcasting
for ik in 1:nk_total
    for ib in 1:num_band
        diff = egvals_all[ib,ik] .- ωlist .- fermi_level
        dos .+= exp.(-(diff.^2 / ϵ^2)) .* factor
    end
end
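
When rewriting the broadening loop, a small reference implementation can help confirm that the vectorized version gives the same numbers. The sketch below is in plain Python and mirrors the expression in the snippet above; the function name `gaussian_dos` is hypothetical, and the default `factor` is a placeholder because the actual normalization used in dos_calculation.jl is not shown here.

```python
import math


def gaussian_dos(eigenvalues, omegas, fermi_level, eps, factor=1.0):
    """Reference (non-vectorized) Gaussian broadening on an energy grid."""
    dos = [0.0] * len(omegas)
    for e in eigenvalues:
        for i, w in enumerate(omegas):
            # Same argument as in the Julia snippet: E - omega - E_F.
            diff = e - w - fermi_level
            dos[i] += math.exp(-(diff * diff) / eps**2) * factor
    return dos


# One level at 1.0 eV with E_F = 0.5 eV should peak at omega = 0.5.
dos = gaussian_dos([1.0], [0.0, 0.5, 1.0], fermi_level=0.5, eps=0.1)
print(dos)
```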
dptb/tests/test_to_pardiso.py (1)

7-7: Remove unused import.

ast is imported but never used in this file.

Proposed fix
 import json
-import ast
 import numpy as np
dptb/postprocess/pardiso/main.jl (1)

100-101: Fragile boolean parsing from config.

The expression config["isspinful"] in [true, "true"] mixes boolean and string comparisons. This could miss valid inputs such as "True" or "1".

Proposed fix
-    spinful = haskey(config, "isspinful") ? (config["isspinful"] in [true, "true"]) : false
+    spinful = haskey(config, "isspinful") ? (lowercase(string(config["isspinful"])) in ["true", "1"]) : false
docs/pardiso_architecture.md (1)

18-30: Add language specifier to fenced code block.

The directory structure code block lacks a language specifier. Use text or plaintext for better markdown compatibility.

Proposed fix
-```
+```text
 dptb/postprocess/julia/
 ├── io/
dptb/entrypoints/pdso.py (2)

12-23: Use explicit Optional type hints per PEP 484.

Parameters with = None default should use Optional[T] for clarity and static analysis compliance.

Proposed fix
+from typing import Optional
+
 def pdso(
         INPUT: str,
-        init_model: str = None,
-        structure: str = None,
-        data_dir: str = None,
+        init_model: Optional[str] = None,
+        structure: Optional[str] = None,
+        data_dir: Optional[str] = None,
         output_dir: str = "./",
         log_level: int = 20,
-        log_path: str = None,
+        log_path: Optional[str] = None,
         ill_project: bool = True,
         ill_threshold: float = 5e-4,
         **kwargs
         ):

81-83: Simplify exception logging.

log.exception() already includes the exception information automatically; passing e explicitly is redundant.

Proposed fix
         except Exception as e:
-            log.exception(f"Export failed: {e}")
+            log.exception("Export failed")
             sys.exit(1)
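
A short sketch showing that `log.exception()` already attaches the traceback, so the message itself can stay static. The logger name `export_sketch` and the failing expression are stand-ins.

```python
import io
import logging

stream = io.StringIO()
log = logging.getLogger("export_sketch")  # illustrative logger name
log.setLevel(logging.INFO)
log.handlers.clear()
log.addHandler(logging.StreamHandler(stream))
log.propagate = False  # keep the demo output confined to `stream`

try:
    1 / 0  # stand-in for a failing export step
except Exception:
    # No need to interpolate the exception: the traceback is appended.
    log.exception("Export failed")

output = stream.getvalue()
print(output)
```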
dptb/postprocess/pardiso/utils/hamiltonian.jl (2)

14-44: Consider using JLD2 instead of Serialization for caching.

The Serialization module in Julia is not safe for loading untrusted data and can execute arbitrary code during deserialization. For a cache file that might be shared or persisted, JLD2.jl provides a safer and more portable alternative.

Additionally, the cache file extension .jld is misleading since it's actually using Julia's native Serialization format, not the JLD format.

♻️ Suggested improvement using JLD2
-using Serialization
+using JLD2

 function get_HR_SR_sparse(input_dir::String, structure::Dict, matrix_loader::Function, use_cache::Bool=true)
-    cache_file = joinpath(input_dir, "sparse_matrices.jld")
+    cache_file = joinpath(input_dir, "sparse_matrices.jld2")

     if use_cache && isfile(cache_file)
         println("Loading H/S matrices from cache: $cache_file")
         try
-            data = deserialize(cache_file)
-            if isa(data, Tuple) && length(data) == 2
-                 return data[1], data[2]
-            else
-                 println("Cache format invalid. Rebuilding...")
-            end
+            @load cache_file H_R S_R
+            return H_R, S_R
         catch e
             println("Failed to load cache: $e. Rebuilding...")
         end
     end
     # ... rest of function ...
     if use_cache
         println("Saving sparse H/S matrices to cache: $cache_file")
-        serialize(cache_file, (H_R, S_R))
+        @save cache_file H_R S_R
     end

101-102: Consider using Tuple{Int,Int,Int} instead of Vector{Int} for dictionary keys.

Using Vector{Int} as dictionary keys is less efficient because vectors are mutable and require content-based hashing. Tuples are immutable and have faster hash/equality operations.

♻️ Suggested improvement
-    H_data = Dict{Vector{Int}, Tuple{Vector{Int}, Vector{Int}, Vector{ComplexF64}}}()
-    S_data = Dict{Vector{Int}, Tuple{Vector{Int}, Vector{Int}, Vector{ComplexF64}}}()
+    H_data = Dict{Tuple{Int,Int,Int}, Tuple{Vector{Int}, Vector{Int}, Vector{ComplexF64}}}()
+    S_data = Dict{Tuple{Int,Int,Int}, Tuple{Vector{Int}, Vector{Int}, Vector{ComplexF64}}}()

# And later:
-            R = [rx, ry, rz]
+            R = (rx, ry, rz)

This would also require updating the return type and HR2HK to use tuples for R vectors.

dptb/postprocess/pardiso/tasks/band_calculation.jl (2)

92-120: Inefficient file I/O: opening/closing file on every k-point.

The bands.dat file is opened and closed for each k-point in the loop (lines 112-120). For large k-point calculations, this creates significant I/O overhead.

♻️ Suggested improvement: Keep file handle open during calculation
     # Initialize text output (bands.dat)
     txt_path = joinpath(output_dir, "bands.dat")
-    open(txt_path, "w") do f
-        @printf(f, "# %4s %10s %10s %10s %12s %s\n", "Idx", "Kx", "Ky", "Kz", "Dist", "Eigenvalues(eV, shifted by Fermi)")
-    end
 
     all_egvals = Vector{Vector{Float64}}()
     start_time = time()
 
-    # Main calculation loop
-    for (ik, kpt) in enumerate(klist)
+    # Main calculation loop - keep file open for efficiency
+    open(txt_path, "w") do f
+        @printf(f, "# %4s %10s %10s %10s %12s %s\n", "Idx", "Kx", "Ky", "Kz", "Dist", "Eigenvalues(eV, shifted by Fermi)")
+        
+        for (ik, kpt) in enumerate(klist)
             # Construct H(k) and S(k)
             H_k, S_k = HR2HK(kpt, H_R, S_R, norbits)
 
             # Solve eigenvalue problem using provided solver
             egvals, _, _ = solver_func(H_k, S_k, fermi_level, num_band, max_iter,
                                             false, solver_opts.ill_project, solver_opts.ill_threshold)
             push!(all_egvals, egvals)
 
-            # Append to text file incrementally
-            open(txt_path, "a") do f
-                # Write K-point info
-                @printf(f, "%6d %10.6f %10.6f %10.6f %12.6f", ik, kpt[1], kpt[2], kpt[3], xlist[ik])
-                # Write eigenvalues (shifted by Fermi level for consistency with plot)
-                for e in egvals
-                    @printf(f, " %12.6f", e - fermi_level)
-                end
-                @printf(f, "\n")
+            # Write K-point info
+            @printf(f, "%6d %10.6f %10.6f %10.6f %12.6f", ik, kpt[1], kpt[2], kpt[3], xlist[ik])
+            for e in egvals
+                @printf(f, " %12.6f", e - fermi_level)
             end
+            @printf(f, "\n")
 
             # Progress logging
             if ik % 10 == 0 || ik == length(klist)
                 elapsed = (time() - start_time) / 60
                 log_message(@sprintf("K-point %4d/%d done | Elapsed: %.2f min", ik, length(klist), elapsed))
             end
+        end
     end

59-68: Consider logging file errors instead of silently ignoring them.

The empty catch block makes debugging difficult if file writes consistently fail. At minimum, log a warning.

♻️ Suggested improvement
         catch e
-            # Ignore file errors to prevent crash
+            @debug "Failed to write to log file: $e"
         end
dptb/postprocess/pardiso/sparse_calc_npy_print.jl (1)

559-584: Executing embedded Python scripts is fragile and potentially insecure.

This approach has several concerns:

  1. Assumes python3 is available in PATH
  2. Creates temporary files that could fail to clean up on errors
  3. The Python script path contains the output directory which could have special characters

Consider using Julia's NPZ.jl package for direct NPY file writing, or at minimum add proper error handling and path escaping.

♻️ Alternative using NPZ.jl
using NPZ

function save_bandstructure_npy(klist, xlist, eigenvalues, e_fermi, high_sym, labels, output_dir, log_path)
    try
        # Convert data to arrays
        klist_arr = hcat(klist...)
        eig_arr = hcat(eigenvalues...)
        
        npzwrite(joinpath(output_dir, "bandstructure.npz"), Dict(
            "klist" => klist_arr,
            "xlist" => collect(xlist),
            "eigenvalues" => eig_arr,
            "E_fermi" => e_fermi,
            "high_sym_kpoints" => collect(high_sym),
            # Note: NPZ doesn't handle string arrays well, save separately if needed
        ))
        tee_info("Generated bandstructure.npz", log_path)
    catch e
        @warn "Failed to generate bandstructure.npz: $e"
        tee_log("[Warn] Failed to generate bandstructure.npz: $e", log_path)
    end
end
docs/phase1_summary.md (1)

18-29: Add language specifier to fenced code block.

The code block starting at line 18 is missing a language identifier. Based on static analysis hint, this should specify the language for proper syntax highlighting.

♻️ Proposed fix
-```
+```text
 dptb/postprocess/julia/
 ├── io/
 │   ├── structure_io.jl      # Load JSON structure
dptb/postprocess/pardiso/solvers/pardiso_solver.jl (1)

59-114: Code duplication: make_shift_invert_map call is identical in both branches.

Lines 61 and 101 create the same shift-invert map. The map creation and Pardiso setup can be hoisted before the if statement.

♻️ Suggested refactor
 function solve_eigen_k(H_k, S_k, fermi_level, num_band, max_iter, out_wfc, ill_project, ill_threshold)
+    # Common setup for both branches
+    lm, ps = make_shift_invert_map(Hermitian(H_k) - fermi_level * Hermitian(S_k), Hermitian(S_k))
+    
     if ill_project
-        lm, ps = make_shift_invert_map(Hermitian(H_k) - fermi_level * Hermitian(S_k), Hermitian(S_k))
-
         if out_wfc
             egval_inv, egvec_sub = eigs(lm, nev=num_band, which=:LM, ritzvec=true, maxiter=max_iter)
         # ... rest of ill_project branch ...
-
-        set_phase!(ps, Pardiso.RELEASE_ALL)
-        pardiso(ps)
     else
-        lm, ps = make_shift_invert_map(Hermitian(H_k) - fermi_level * Hermitian(S_k), Hermitian(S_k))
-
         if out_wfc
             egval_inv, egvec = eigs(lm, nev=num_band, which=:LM, ritzvec=true, maxiter=max_iter)
         # ... rest of non-ill_project branch ...
-
-        set_phase!(ps, Pardiso.RELEASE_ALL)
-        pardiso(ps)
     end
+    
+    # Common cleanup
+    set_phase!(ps, Pardiso.RELEASE_ALL)
+    pardiso(ps)
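
For context, the shift-invert map that both branches build can be sketched as below. This is a reading of the calls quoted in this review (`make_shift_invert_map(H - fermi_level * S, S)` followed by `eigs(..., which=:LM)`), not a description of the rest of the solver. The symbols σ (the shift, here `fermi_level`) and θ (the values returned by `eigs`) are notation introduced for this note.

```latex
% Generalized eigenproblem at one k-point
H x = \lambda S x
% Subtract \sigma S x from both sides
(H - \sigma S)\, x = (\lambda - \sigma)\, S x
% Apply (H - \sigma S)^{-1}, i.e. the sparse solve
(H - \sigma S)^{-1} S\, x = \theta\, x, \qquad \theta = \frac{1}{\lambda - \sigma}
% Recover the original eigenvalue
\lambda = \sigma + \frac{1}{\theta}
```

The largest-magnitude θ correspond to the λ closest to σ, which is why `which=:LM` returns the bands nearest the Fermi level. Since the map depends only on `H_k`, `S_k`, and `fermi_level`, building it once before the `if` is equivalent to building it in each branch.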
dptb/postprocess/unified/system.py (4)

536-544: Unused variable orb_type and redundant inner loop.

The variable orb_type is assigned but never used (Ruff F841). The inner loop already matches each orbital label against the l_map keys through t, so the assignment can simply be removed.

♻️ Proposed fix
         for elem, orbs in basis.items():
             norb = 0
             for orb in orbs:
-                orb_type = orb[-1] 
                 for t in l_map:
                     if t in orb:
                         norb += l_map[t]
                         break
             orbital_counts[elem] = norb

571-574: Remove extraneous f prefix from strings without placeholders.

Per Ruff F541, these f-strings contain no placeholders and should be regular strings.

♻️ Proposed fix
-        log.info(f"Successfully saved all Pardiso data (NEW format).")
+        log.info("Successfully saved all Pardiso data (NEW format).")
         log.info(f"  - Hamiltonian blocks: {len(hr)}")
         log.info(f"  - Structure: {len(self.atoms)} atoms, {basis_info['total_orbitals']} orbitals")
-        log.info(f"  - Files: predicted_hamiltonians.h5, predicted_overlaps.h5, structure.json")
+        log.info("  - Files: predicted_hamiltonians.h5, predicted_overlaps.h5, structure.json")

588-606: Complex regex-based JSON formatting may be fragile.

The regex approach for collapsing JSON arrays and objects is clever but can be brittle with edge cases (e.g., strings containing brackets, deeply nested structures). Consider using Python's json module with a custom encoder or the simplejson library's formatting options.

Also, line 594 has multiple statements on one line (Ruff E701).

♻️ Minor fix for line 594
             try:
-                if not tokens: return "[]"
+                if not tokens:
+                    return "[]"
                 # Check if numbers
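
One way to avoid the regex pass entirely is to decide the layout while serializing. The following is a minimal sketch, assuming the only formatting goal is to keep flat lists of scalars on one line; `dumps_compact` and the sample field names are hypothetical and are not taken from `system.py`.

```python
import json


def dumps_compact(obj, indent=2, _level=0):
    """Serialize obj to JSON, keeping lists of scalars on a single line."""
    pad = " " * (indent * (_level + 1))
    end_pad = " " * (indent * _level)
    if isinstance(obj, dict):
        if not obj:
            return "{}"
        # Keys are stringified with str(); this sketch assumes string keys.
        items = [
            f"{pad}{json.dumps(str(k))}: {dumps_compact(v, indent, _level + 1)}"
            for k, v in obj.items()
        ]
        return "{\n" + ",\n".join(items) + "\n" + end_pad + "}"
    if isinstance(obj, (list, tuple)):
        if not obj:
            return "[]"
        if all(not isinstance(v, (dict, list, tuple)) for v in obj):
            # Flat list of scalars: emit on one line.
            return "[" + ", ".join(json.dumps(v) for v in obj) + "]"
        items = [pad + dumps_compact(v, indent, _level + 1) for v in obj]
        return "[\n" + ",\n".join(items) + "\n" + end_pad + "]"
    # Scalars go through the json module, so quoting and escaping stay correct.
    return json.dumps(obj)


data = {
    "cell": [[5.43, 0.0, 0.0], [0.0, 5.43, 0.0], [0.0, 0.0, 5.43]],
    "symbols": ["Si", "Si"],
    "note": "brackets [inside] a {string} are safe",
    "empty": [],
}
text = dumps_compact(data)
print(text)
```

Because every scalar is emitted by `json.dumps`, strings that contain brackets or braces cannot confuse the formatter, which is the edge case the regex approach is exposed to.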

519-520: Move import dptb to top of file.

Importing inside a method is unconventional and can mask import errors until the method is called. Since dptb is the package this file is part of, it should always be available.

♻️ Suggested improvement

Add at top of file:

import dptb

And remove from line 519.

@QG-phy
Collaborator Author

QG-phy commented Jan 26, 2026

@YJQ-7 Please take a look; I refactored the code. The previous version is still supported for now. We can go through it later and clean it up.

@YJQ-7
Contributor

YJQ-7 commented Jan 26, 2026

> @YJQ-7 Please take a look; I refactored the code. The previous version is still supported for now. We can go through it later and clean it up.

OK, professor.

QG-phy and others added 2 commits January 26, 2026 14:34
…lver in a module, and add ill-conditioned projection and improved eigenvector handling to the dense solver. (#2)
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@dptb/postprocess/pardiso/solvers/pardiso_solver.jl`:
- Around line 60-115: The solve_eigen_k_pardiso function can leak MKL/Pardiso
resources if eigs (or other linear algebra) throws because set_phase!(ps,
Pardiso.RELEASE_ALL) and pardiso(ps) are only called on the happy path; wrap the
creation/use of the Pardiso state (lm, ps from make_shift_invert_map) and all
eigs/linear-algebra work in a try/finally block and move set_phase!(ps,
Pardiso.RELEASE_ALL) and pardiso(ps) into the finally so they always run; apply
this to both branches (ill_project true and false) and ensure variables (egval,
egvec, etc.) are assigned before rethrowing or returning so callers still see
deterministic state.
🧹 Nitpick comments (4)
dptb/postprocess/pardiso/solvers/dense_solver.jl (3)

13-15: default_dtype is not exported but may be intended as public API.

The constant default_dtype is defined but not included in the export statement. If this constant is intended for external use (e.g., by callers needing to match the expected data type), consider exporting it:

-export solve_eigen_k_dense
+export solve_eigen_k_dense, default_dtype

Otherwise, if it's internal-only, prefix with underscore per Julia convention (_default_dtype).


17-32: Docstring is incomplete—missing 4 parameters.

The docstring documents only 4 of the 8 function parameters. Consider adding documentation for:

  • max_iter: Unused in dense solver (kept for API compatibility)
  • out_wfc: Whether to compute and return eigenvectors
  • ill_project: Enable ill-conditioning handling via S-eigenspace projection
  • ill_threshold: Threshold for filtering small S-eigenvalues

33-36: Consider underscore prefix for unused max_iter parameter.

The max_iter parameter is never used in this dense solver (expected, since dense diagonalization doesn't iterate). Julia convention suggests prefixing unused parameters with underscore to signal intent:

-function solve_eigen_k_dense(H_k, S_k, fermi_level, num_band, max_iter, out_wfc, ill_project, ill_threshold)
+function solve_eigen_k_dense(H_k, S_k, fermi_level, num_band, _max_iter, out_wfc, ill_project, ill_threshold)
dptb/postprocess/pardiso/solvers/pardiso_solver.jl (1)

43-47: Reduce per-iteration allocation inside LinearMap apply.

S * x allocates every multiply; Arpack will call this many times. Consider a preallocated buffer with mul! to cut allocations and improve solver throughput.

♻️ Suggested refactor (preallocate & mul!)
-    lm = LinearMap{ComplexF64}(
-        (y, x) -> begin
-            set_phase!(ps, Pardiso.SOLVE_ITERATIVE_REFINE)
-            pardiso(ps, y, H_pardiso, S * x)
-        end,
-        size(H, 1);
-        ismutating=true
-    )
+    tmp = similar(b)
+    lm = LinearMap{ComplexF64}(
+        (y, x) -> begin
+            mul!(tmp, S, x)
+            set_phase!(ps, Pardiso.SOLVE_ITERATIVE_REFINE)
+            pardiso(ps, y, H_pardiso, tmp)
+        end,
+        size(H, 1);
+        ismutating=true
+    )

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dptb/entrypoints/pdso.py`:
- Around line 54-55: Resolve and validate the INPUT-derived path before calling
Julia: after computing config_path = os.path.abspath(INPUT) (and before any
Julia invocation), verify the file exists and is readable (e.g.,
os.path.exists/config_path and os.path.isfile(config_path) or equivalent) and
raise a clear exception or exit with an informative error if not; update error
message to reference config_path so failures are fast and not opaque. Also
ensure output_path resolution remains unchanged and only proceed to invoke the
Julia logic when config_path validation passes.

In `@dptb/postprocess/unified/system.py`:
- Around line 588-607: The l_map used to compute orbital counts is missing the
'g' orbital mapping, causing incorrect orbital_counts, site_norbits, and
total_orbitals for bases that include g functions; update the mapping in the
block that defines l_map so it includes "g": 9 (alongside
"s":1,"p":3,"d":5,"f":7), then regenerate
orbital_counts/site_norbits/total_orbitals (the code that populates
orbital_counts, computes site_norbits, and sets total_orbitals in basis_info) so
g orbitals are counted correctly.

In `@examples/To_pardiso/test_pardiso_new.py`:
- Around line 93-99: The test is swallowing backend failures by catching
subprocess.CalledProcessError and FileNotFoundError and only printing messages;
modify the test around the subprocess.run(cmd, check=True) call so failures
cause the test to fail or be explicitly skipped: for FileNotFoundError call
pytest.skip with a clear message, and for CalledProcessError either re-raise the
exception (remove the except block) or assert the subprocess return code / let
check=True propagate the error so the test fails; reference the subprocess.run
call and the surrounding test_* function to locate and update the handling.
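
To make the `l_map` finding above concrete, here is a minimal sketch of the orbital count with the `g` shell included. It assumes basis labels of the form `"3s"` or `"d*"` and mirrors the substring match quoted in this review; `count_orbitals` is a hypothetical helper, and the `ValueError` on an unknown label is an addition for illustration rather than existing behavior.

```python
# Number of orbitals per angular-momentum shell: 2l + 1 for l = 0..4.
L_MAP = {"s": 1, "p": 3, "d": 5, "f": 7, "g": 9}


def count_orbitals(basis):
    """Return {element: orbital count} for a basis like {"Si": ["3s", "3p", "d*"]}."""
    counts = {}
    for elem, orbs in basis.items():
        norb = 0
        for orb in orbs:
            # Mirror the reviewed loop: the first shell letter found wins.
            for letter, n in L_MAP.items():
                if letter in orb:
                    norb += n
                    break
            else:
                # Assumption: failing loudly is preferable to undercounting.
                raise ValueError(f"Unrecognized orbital label: {orb!r}")
        counts[elem] = norb
    return counts


# "X" is a made-up element used only to exercise the g shell.
basis = {"C": ["2s", "2p"], "Si": ["3s", "3p", "d*"], "X": ["1s", "5g"]}
orbital_counts = count_orbitals(basis)
print(orbital_counts)
```

Without the `"g": 9` entry, the `5g` label would contribute nothing in the original loop, so every quantity derived from the per-element counts would be too small.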

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 265088cf-af47-49e9-ac82-e1883be17b9f

📥 Commits

Reviewing files that changed from the base of the PR and between e44674a and e4b36e9.

📒 Files selected for processing (6)
  • dptb/entrypoints/pdso.py
  • dptb/postprocess/pardiso/io/io.jl
  • dptb/postprocess/unified/calculator.py
  • dptb/postprocess/unified/system.py
  • dptb/tests/test_to_pardiso.py
  • examples/To_pardiso/test_pardiso_new.py
✅ Files skipped from review due to trivial changes (1)
  • dptb/postprocess/unified/calculator.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • dptb/postprocess/pardiso/io/io.jl
  • dptb/tests/test_to_pardiso.py

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
dptb/postprocess/pardiso/solvers/pardiso_solver.jl (1)

60-106: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Variable scoping error: egval_inv and egvec_sub used outside the try block where they are assigned.

In Julia, variables assigned inside a try block are local to that block and cannot be accessed afterward. Lines 66–69 assign egval_inv and egvec_sub inside the try block, but lines 80 and 82 attempt to use them outside, causing UndefVarError at runtime.

Initialize these variables before the try block:

Fix: Declare variables before try block
     if ill_project
         lm, ps = make_shift_invert_map(Hermitian(H_k) - fermi_level * Hermitian(S_k), Hermitian(S_k))
 
+        egval_inv = nothing
+        egvec_sub = nothing
         try
             if out_wfc
                 egval_inv, egvec_sub = eigs(lm, nev=num_band, which=:LM, ritzvec=true, maxiter=max_iter)
             else
                 egval_inv = eigs(lm, nev=num_band, which=:LM, ritzvec=false, maxiter=max_iter)[1]
                 egvec_sub = zeros(default_dtype, size(H_k, 1), 0)
             end
         finally
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dptb/postprocess/pardiso/solvers/pardiso_solver.jl` around lines 60 - 106,
The variables egval_inv and egvec_sub are assigned only inside the try in
solve_eigen_k_pardiso, causing a scoping error when used later; to fix, declare
and initialize egval_inv and egvec_sub just before the try (e.g. egval_inv =
similar(...) or egval_inv = zeros(...); egvec_sub = zeros(default_dtype,
size(H_k,1), 0)) so they exist in the outer scope regardless of which branch
inside the try runs, then keep the existing assignments inside the try (where
eigs is called) and leave the resource-release code (set_phase!/pardiso)
unchanged.
♻️ Duplicate comments (2)
dptb/entrypoints/pdso.py (1)

57-58: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate config_path before invoking Julia.

Fail fast if the config file is missing/unreadable to avoid opaque backend errors.

Proposed fix
     config_path = os.path.abspath(INPUT)
+    if not (os.path.isfile(config_path) and os.access(config_path, os.R_OK)):
+        log.error(f"Configuration file not found or not readable: {config_path}")
+        sys.exit(1)
     output_path = os.path.abspath(output_dir)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dptb/entrypoints/pdso.py` around lines 57 - 58, Validate that the resolved
config_path (created from INPUT) exists and is readable before calling into
Julia: after computing config_path = os.path.abspath(INPUT) check
os.path.isfile(config_path) and os.access(config_path, os.R_OK) (or equivalent)
and if the check fails log a clear error and exit/raise so the process fails
fast instead of letting the backend produce opaque errors; update the code
around the config_path/output_path assignment to perform this validation and use
INPUT, config_path, output_dir, and output_path names to locate where to add the
check.
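
A variation on the proposed fix, sketched under the assumption that raising is acceptable in place of `log.error` plus `sys.exit(1)`: the check lives in a small helper that can be tested on its own. `resolve_config` and the file name `band.json` are hypothetical.

```python
import os
import tempfile


def resolve_config(input_path: str) -> str:
    """Return the absolute path of a readable config file, or raise early."""
    config_path = os.path.abspath(input_path)
    if not os.path.isfile(config_path):
        raise FileNotFoundError(f"Configuration file not found: {config_path}")
    if not os.access(config_path, os.R_OK):
        raise PermissionError(f"Configuration file not readable: {config_path}")
    return config_path


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "band.json")
    with open(good, "w") as fh:
        fh.write("{}")
    resolved = resolve_config(good)

    try:
        resolve_config(os.path.join(tmp, "missing.json"))
        missing_error = None
    except FileNotFoundError as exc:
        missing_error = str(exc)

print(resolved)
print(missing_error)
```

Splitting the two conditions gives a distinct message for a missing file and for an unreadable one, and the error text always names the resolved path.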
examples/To_pardiso/test_pardiso_new.py (1)

94-100: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not swallow backend failures in this test flow.

This block only prints on failure, so the test can appear successful when Julia execution fails or is missing.

Proposed fix
     print(f"Running: {' '.join(cmd)}")
-    try:
-        subprocess.run(cmd, check=True)
-        print("Julia backend run successfully!")
-    except subprocess.CalledProcessError as e:
-        print(f"Julia execution failed with code {e.returncode}")
-    except FileNotFoundError:
-        print("Julia executable not found. Skipping execution.")
+    try:
+        subprocess.run(cmd, check=True)
+        print("Julia backend run successfully!")
+    except FileNotFoundError as e:
+        raise RuntimeError("Julia executable not found in PATH.") from e
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/To_pardiso/test_pardiso_new.py` around lines 94 - 100, The test
currently swallows subprocess.run failures by only printing in the except
blocks; update the try/except around subprocess.run (the block handling
CalledProcessError and FileNotFoundError) to not hide failures — either remove
the except for CalledProcessError so the exception propagates and fails the
test, or re-raise the CalledProcessError after logging; for FileNotFoundError,
replace the print with a proper test skip (e.g., call pytest.skip) so missing
Julia marks the test as skipped rather than passing. Ensure references to
subprocess.run, CalledProcessError, and FileNotFoundError are updated
accordingly.
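
The handling described above can be sketched as follows. To stay runnable without third-party packages, this version raises the standard library's `unittest.SkipTest` where the review suggests `pytest.skip`; `run_backend` is a hypothetical helper and the commands are stand-ins, not the real Julia invocation.

```python
import subprocess
import sys
import unittest


def run_backend(cmd):
    """Run an external command; fail on non-zero exit, skip if it is missing."""
    try:
        subprocess.run(cmd, check=True)
    except FileNotFoundError as exc:
        # Missing executable: report a skip instead of a silent pass.
        raise unittest.SkipTest(f"{cmd[0]} not found in PATH") from exc
    # CalledProcessError is deliberately not caught, so it fails the test.


# A command that succeeds returns normally.
run_backend([sys.executable, "-c", "pass"])

# A non-zero exit status propagates as CalledProcessError.
try:
    run_backend([sys.executable, "-c", "import sys; sys.exit(3)"])
    failure_code = None
except subprocess.CalledProcessError as exc:
    failure_code = exc.returncode

# A missing executable is converted into a skip.
try:
    run_backend(["definitely-not-a-real-executable-xyz"])
    skipped = False
except unittest.SkipTest:
    skipped = True

print(failure_code, skipped)
```

The design point is that only the "backend not installed" case is downgraded; a backend that runs and fails still fails the test.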
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dptb/postprocess/pardiso/solvers/pardiso_solver.jl`:
- Around line 107-127: The variables egval_inv, egval, and egvec are only
assigned inside the try block but are referenced after the finally; to fix,
predeclare and initialize egval_inv, egval, and egvec before the try so they are
in scope for later use (use appropriate shapes/types matching later code, e.g.,
egvec = zeros(default_dtype, size(H_k,1), 0) or egval_inv = similar(empty array)
so the later logic that computes egval = real(1 ./ egval_inv) .+ fermi_level and
uses egvec will always have defined values); update the else branch around
make_shift_invert_map, try, and finally to declare these variables before
entering try and only assign them inside try.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c0103fc8-b753-415d-9ee4-656157ebdcf3

📥 Commits

Reviewing files that changed from the base of the PR and between e4b36e9 and 8778096.

📒 Files selected for processing (15)
  • CLAUDE.md
  • README.md
  • dptb/entrypoints/pdso.py
  • dptb/postprocess/pardiso/io/io.jl
  • dptb/postprocess/pardiso/solvers/pardiso_solver.jl
  • dptb/postprocess/pardiso/sparse_calc_npy_print.jl
  • dptb/postprocess/pardiso/tasks/dos_calculation.jl
  • examples/To_pardiso/README.md
  • examples/To_pardiso/dptb_to_Pardiso.ipynb
  • examples/To_pardiso/dptb_to_Pardiso_new.ipynb
  • examples/To_pardiso/pardiso_tutorial.ipynb
  • examples/To_pardiso/test_pardiso_new.py
  • install_julia.sh
  • install_julia_packages.jl
  • test_mkl_pardiso.jl
✅ Files skipped from review due to trivial changes (3)
  • README.md
  • examples/To_pardiso/README.md
  • CLAUDE.md
