Update instrumentControl branch #4

Open
avcarr2 wants to merge 285 commits into avcarr2:InstrumentControlOld from smith-chem-wisc:master


avcarr2 commented May 21, 2022

No description provided.

trishorts and others added 30 commits November 15, 2022 12:59
* correct Within calculation

* update unit tests

* this is the spot

* readPrecursorScanNumber

* return reported scan and precursor scan numbers when reading mzml

* unit test exception check for no scan number in mzml

* no real change

* delete empty line

Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
Co-authored-by: Lei Lu <lonelu@users.noreply.github.com>
* Edited normalization methods within SpectralSimilarity to remove side effects, made these methods public

* Made normalization methods static
Co-authored-by: trishorts <mshort@chem.wisc.edu>
#667)

* Added RtHypothesis and RtStdDev fields to ChromatographicPeak in FlashLFQ

* Undid accidental formatting changes

* Undid accidental comment changes

* Added RtInterquartileRange field to ChromPeak and associated tests

* added requested comments

* Added comment

* Enhanced information on fields

* Rt information is now internal set

Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Changed x64;AnyCPU to only AnyCPU in all project files

* One more change

* Nuget - 3.
Me - 0

* Moved over all classes from the spec averaging codebase

* Implemented MRS Noise Estimation as a weighting type

* SpecAveragingExtensions now creates binned spectra to be used for downstream processing

* Implemented SpectraWeighting

* Combined both options classes

* Replaced all references to MzLibSpecAveragingOptions

* Made some changes to tests

* Added ignore case to spectra file handler

* Restructured Austin's objects to fit within the current structure and made all tests pass

* Removed normalization class and implemented ability to open mzml and raw regardless of extension case

* Tested Weighting, removed MRS Noise Estimation from SpectraMergingType

* Revised structure to eliminate redundant objects

* Removed noise and fixed tests

* Made final adjustments to tests and algorithms

* Removed extra methods and cleaned up a test where I was timing them

* Forgot to save a file on the last commit. Whoops

* Fixed an error

* Expanded test coverage

* Added single spectra and scan normalization methods and renamed a few enums to match the revised structure

* Added one more test

* Renamed project to SpectralAveraging and cleaned up the classes

* Revert "Added one more test"

This reverts commit 3d32bca.

* Renamed the project to SpectralAveraging

* Updated nuspec to current version

* Changed line in tests.csproj per Alex

* Made it so that the Averaging modifies an Ms2 in place instead of constructing a new object

* Actually renamed the writer this time

* Cleaned up the code

* Parallelized Bin Averaging

Co-authored-by: Nic Bollis <nbollis@wisc.edu>
* Changed x64;AnyCPU to only AnyCPU in all project files

* One more change

* Nuget - 3.
Me - 0

* Made it so that default deconvolution is the entire spectra, with an optional range parameter to get from a specific region of the spectra

* Added some comments
* Added infrastructure for MRS noise estimation

* Added basic tests for mrs noise estimation. Will add more complex tests once I see the code coverage results.

* Increased test coverage a little more

* Fixed failing build due to not updating a constructor.

* removed BasicStatistics.CalculateMedian and substituted it with Array.Median() at all places CalculateMedian was called.

Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Changed x64;AnyCPU to only AnyCPU in all project files

* One more change

* Nuget - 3.
Me - 0

* Started initial structure

* Revert "Started initial structure"

This reverts commit f06cbc0.

* Attempted to solve random test fails by restructuring the HTTP response
* correct Within calculation

* update unit tests

* this is the spot

* added median polish test cases

* all tests all day

* unused code

* test updates

---------

Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
* weighted mean polish

* more unit tests
* correct Within calculation

* update unit tests

* this is the spot

* fixed with unit tests

---------

Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
* Changed x64;AnyCPU to only AnyCPU in all project files

* One more change

* beginning refactoring

* Removed ExtractIonChromatogram from MsDataFile

* removed Deconvolute from MsDataFile

* Updated ThermoRawFileReader

* fixing tests and code after overhauling the scan readers

* mgf tests are fully fixed

* Completed test fixes.

* Fixed remaining tests.

* Deleted folders that were hanging around

* Increased code coverage to >90%.

* Removed class. Added a test.

* Added more tests to Readers

* Added more tests

* Nuget - 3.
Me - 0

* Deleted unused DynamicDataConnection from MassSpectrometry

* beginning refactoring

* Removed ExtractIonChromatogram from MsDataFile

* removed Deconvolute from MsDataFile

* Updated ThermoRawFileReader

* fixing tests and code after overhauling the scan readers

* mgf tests are fully fixed

* Completed test fixes.

* Fixed remaining tests.

* Deleted folders that were hanging around

* Increased code coverage to >90%.

* Removed class. Added a test.

* Added more tests to Readers

* Added more tests

* Deleted unused DynamicDataConnection from MassSpectrometry

* Delete SummedMsDataFile.cs

* Started implementing Agilent Reader

* ASDF

* adg

* Reset to previous commit

* Restructured Tests

* removed some superfluous stuff from the core readers.

* Removed IDataReader Interface

* Renamed MsDataFileHelpes to MsDataFileExtensions
Added extension method to write as mzml
Removed GenericPeak and GenericMzSpectrum
Restructured MsDataFile classes to order from public to private
Added in static LoadAllStaticData() to ensure backwards compatibility

* Consolidated file reading tests

* Tested new export code extension

* Fixed a non-build breaking error due to using internal in interface definition for IReaderFactory. Kinda odd tbh.

* Added a few tests. Excluded SystemInfo.cs from code coverage because it's 1) impossible to create a test case that will work with the CI and developer machines and 2) it's stock Microsoft Code that shouldn't break.

* Commented out a few methods that have been in the code base since 2018, but are unused and have no unit test coverage. Methods kept as comments for posterity.

* Added two tests to FlashLFQ that covered a 0% test coverage file.

* Removed x86 configuration

---------

Co-authored-by: nbollis <96196865+nbollis@users.noreply.github.com>
Co-authored-by: nbollis <nbollis@comcast.net>
* Changed x64;AnyCPU to only AnyCPU in all project files

* One more change

* Nuget - 3.
Me - 0

* Started initial structure

* Revert "Started initial structure"

This reverts commit f06cbc0.

* Changed the within in Old deconvolution tests as most abundant mass did not take into account the proton mass, making the results off by a bit more than previously

Added tests for deconvolution with good data

* Changed how namespace was assigned in single test

* Moved Deconvolution Test to a development project

* comments

* Uncommented lines in test fixture setup

* Updates tests to have a tolerance limit

* Fixed test I broke by IsotopicEnvelope change

* Reverted changes to new branch to see if that fixes all the tests crashing on build

* Reimplemented almost all changes from previous branch

* Implemented the final changes from last pull request

* Excluded a test from code coverage

* Removed unused namespaces

* Coin flip

* DeconTestWithout Namespace Changes

* Commit

* Update SinglePeakDeconvolutionTestCase.cs

* Update WholeSpectrumDeconvolutionTestCase.cs

* Update SinglePeakDeconvolutionTestCase.cs

* Update WholeSpectrumDeconvolutionTestCase.cs

* Added a reference to the readers project

---------

Co-authored-by: Nic Bollis <nbollis@wisc.edu>
Co-authored-by: Alexander-Sol <41119316+Alexander-Sol@users.noreply.github.com>
Co-authored-by: Anthony <anthony.cesnik@gmail.com>
* Added option to quantify ambiguous peptides in FlashLFQ

* Reverted .nuspec

* Removed Proteomics.AminoAcidPolymer ref in flashlfqresults
…e Development Dependency (#699)

* Changed x64;AnyCPU to only AnyCPU in all project files

* One more change

* Nuget - 3.
Me - 0

* Started initial structure

* Revert "Started initial structure"

This reverts commit f06cbc0.

* Removed Test as a dependency of Development

* Individual reader factories will not grow in the future, only more factories will be added.
For this reason, they were combined into a single .cs file and all marked as sealed classes

Uncommented out stuff in Variant Application as it was used in MetaMorpheus

* ThermoRawFileReaderLiscence added to backwards support

* Updated nuspec to be final 5.0.539

* Eliminated factory layer from Readers

* forgot to save one file

* Removed extra constructors and expanded test coverage

* Expanded test coverage

* Added SequenceVariant tests

---------

Co-authored-by: Alex <AlexSolivais@gmail.com>
* Changed x64;AnyCPU to only AnyCPU in all project files

* One more change

* Nuget - 3.
Me - 0

* Fix crash caused by all values in a bin being rejected

* Lets try that again

* Changed Name of DeconvolutionTypes to DeconvolutionType to fit convention

* Added method to generate an IEnumerable of SpectralAveragingParameters

* Fixed winsorized and averaged sigma clipping rejection methods and their relevant tests.

* Eliminated all of our own versions of median and standard deviation and went with those in MathNet.Numerics

* Added enums as parameters to GenerateSpectralAveragingParameters method

* Moved Noise Estimation to MzLibUtil so it can be accessed by other projects

* Adjusted averaged spectra writer to be able to output multiple averaged spectra from the same original file
Added additional normalization extension methods

* Commented Magic Numbers

* Expanded test coverage, fixed an output bug, enabled output to custom locations with custom names, removed outputting as csv

* Change MrsNoiseEstimation namespace to NoiseEstimation

* Updated namespace reference

* Update mzLib.nuspec

* SR changes

* Updated variable names in Averaging MzBinning Algorithm

* Resolved Merge Conflict

* Added comment for magic number.

---------

Co-authored-by: avcarr2 <64652734+avcarr2@users.noreply.github.com>
Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Added Bruker data handling files to the most updated version of master.

* Renamed Bruker classes to BrukerFileReader

* Added tests for loading Bruker data

* Added .dll required to run Bruker sourced methods

* Added test data references.

* Added all test data files from the requisite .d directories.

* Added documentation.

* Added more documentation.

* Added dependencies of baf2sql_c.dll

* Added more unit tests

* Anonymized lockinfo files.

* Changed namespace in from IO.* to Readers for the Bruker file readers.
* correct Within calculation

* update unit tests

* this is the spot

* add space

* unbroke

* now with fastaGz

* wrong name

---------

Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
…Engine (#714)

Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Changed x64;AnyCPU to only AnyCPU in all project files

* One more change

* Nuget - 3.
Me - 0

* Started initial structure

* Revert "Started initial structure"

This reverts commit f06cbc0.

* Combined commits to trick github after file size reduction

* Fixed broken test and class diagram

* Expanded Test Coverage
Added IEquatable<MzSpectrum> to MzSpectrum

* Eliminated IResult interface

* Added exclude from code coverage to test files

* One more exclude from code coverage

* Used only filepath to generate hash of resultfile

* Added a few new comments

* Unified some variable names and reduced test file size

* Changed test file names to show version

* Unified variable names

* Update TestMsFeature.cs

* Updated Supported File Types to be cleaner

* Adjusted Test file names to have file type before software and version

---------

Co-authored-by: Alexander-Sol <41119316+Alexander-Sol@users.noreply.github.com>
* Created extensions for binary array searching + associated tests

* Deleted unused lines

* Implemented new search extensions across mzLib, edited tests

* Added summary comments
…ons (#722)

* Added extension method to check valid base sequences to AminoAcidPolymerExtensions

* Changed name
* correct Within calculation

* update unit tests

* this is the spot

* add space

* works for base sequence but don't know about full sequence

* test flashlfq silent

* test flashlfq bayesian

* not quite there

* works yo

* works works

* mo better explanation yo

* style baby

---------

Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
* Fixed bug in GetClosestIndex where arrays of length 1 would break closest search

* Added tests and comments

---------

Co-authored-by: Edwin Laboy <63374885+elaboy@users.noreply.github.com>
Co-authored-by: trishorts <mshort@chem.wisc.edu>
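
The GetClosestIndex fix above concerns a boundary case in a sorted-array search. A minimal Python sketch of a closest-index lookup that handles arrays of length 1 follows. It is an illustration only: `get_closest_index` is a hypothetical name, the tie-breaking rule (prefer the lower index) is an assumption, and none of this is the mzLib C# implementation.

```python
from bisect import bisect_left


def get_closest_index(sorted_values, target):
    """Return the index of the element closest to target (hypothetical helper).

    Assumes sorted_values is sorted ascending and non-empty.
    Ties go to the lower index (an assumption made for this sketch).
    """
    if not sorted_values:
        raise ValueError("sorted_values must not be empty")

    pos = bisect_left(sorted_values, target)

    # Target is at or before the first element. This also covers the
    # length-1 case when the target is not above the only element.
    if pos == 0:
        return 0
    # Target is beyond the last element. This covers the length-1 case
    # when the target is above the only element.
    if pos == len(sorted_values):
        return len(sorted_values) - 1

    before = sorted_values[pos - 1]
    after = sorted_values[pos]
    return pos - 1 if target - before <= after - target else pos
```

Both boundary branches return before any neighbour comparison, so a single-element array never indexes out of range.
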
* Added support for negative mode deconvolution

* Changed name of ExampleNewDeconvolution to include the word Template

---------

Co-authored-by: trishorts <mshort@chem.wisc.edu>
* correct Within calculation

* update unit tests

* this is the spot

* add space

* comment update UniProt URL where column names for downloads can be found

---------

Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
Co-authored-by: Anthony <anthony.cesnik@gmail.com>
* correct Within calculation

* update unit tests

* this is the spot

* add space

* added comment explaining the semicolon in the tostring of proteingroups for quant

* space the final frontier

---------

Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
* Reduces complexity of GetOneBasedScan from O(logN) to constant

* Removed excess changes

* Added a comment and removed extra line

---------

Co-authored-by: Edwin Laboy <63374885+elaboy@users.noreply.github.com>
Co-authored-by: trishorts <mshort@chem.wisc.edu>
Alexander-Sol and others added 30 commits March 3, 2026 13:25
* Created SingletonDoublePool, edited PWSM to use new pool

* Added new test to TestFragments

* Improved benchmarking test

* Changed PWSM to use a list pool for neutral losses

* Deleted test with local file reference, deleted unused usings

* went back to hashset, switched to threadlocal implementation

* Reverted changes to listpool

* Reverted unintended changes

---------

Co-authored-by: Nic Bollis <nbollis@comcast.net>
* Added to reader

* Clean Up and fix tests

* Updated to optional method

* Made parser check for it in scan description

* changed where we get from
* updating batching for detectability model (low batch size)

* MESSY BROKEN IN PROGRESS

* still broken

* almost there. fragment intensity model all up to date. Can't test bc other models are unbuildable. just gotta clean up and test.

* just need tests now

* step

* All but intensity model tested. Everything compiles. One step left.

* TESTS DONE

* first benchmark test on 4M peptides/predictions. three to go

* benchies got me acting up. All four done.

* oops. new dynamic timing was broken for cases of no valid inputs (client timeout would set to 0). Now only create client and get requests if there are valid inputs.

* small stuff

* test expansion for coverage. Improvements to fragment intensity models' parameter validation.

* Updated fragmentation model to have a parameter specifying what MZs (original input sequence or validated sequence) to record predictions for. Updated LibrarySpectrum generation to take into account fragment ion MZ choice. This will better map the MZs we will typically want when creating/comparing spectral libraries.

* Made MapToValidatedFullSequence The default for hcd model to err on the side of prediction validity.

---------

Co-authored-by: trishorts <mshort@chem.wisc.edu>
Co-authored-by: Nic Bollis <nbollis@comcast.net>
* Added to reader

* opencode gitignore

* Sequence conversion base

* Removed converter structure and outlined canonical schematic structure

* Schema and conversion warnings

* Interface Design

* Modification Lookup

* Moved warnings

* Base Class Adjustment: Canonical mod is better and schema is an abstract.

* First implementations.

* Design.md

* Serializers and Parsers

* gitignore

* Start of testing

* project restructure

* ugh

* Mass shift seq serializer base

* Global mod lookup

* Mod lookup refactor

* Chronologer initial

* revised chronologer

* Refactor Serializers: ShouldResolveMod

* Schema construction refactor

* Added in unimod and essential sequence serializer

* Tested Conversion Service with Essential Seq

* Koina Initial Integration. Still failing TMT

* Multiplex Mod mapping

* Uniprot start

* Lookup Refactor: SearchFromCandidatesOnly

* ModLookup Structural refactor

* Koina. Finally fix that shit

* Lookup: Cumulative filtering

* Uniprot revisions

* Koina: Added ValidatedFullSequence to all prediction outputs

* Uniprot conversion works, but is hardcoded for many things

* Chronologer and Uniprot work

* Sequence Converter

* Koina conversion refactor

* Koina conversion refactor

* Sequence Converter

* Uniprot serialization fix and peptide and protein extensions

* Moved base classes and added no new namespaces file

* Testing Command Round 1.

* Testing Command Round 2

* Testing Command Round 3

* removed testflashlfq

* lookup base Tests

* Expanded test coverage

* Revised wiki

* Removed uniprot
* Added to reader

* feat: add entrapment database support

Add IsEntrapment property to IBioPolymer interface and propagate through Protein,
NucleicAcid, and RNA classes. Entrapment proteins are synthetic decoys
designed to catch false identifications.

- Add entrapmentIdentifier parameter (default 'NTRAP') for auto-detection via
  accession.StartsWith(entrapmentIdentifier) in ProteinDbLoader and RnaDbLoader
- Update Merge functions to use 5-tuple keys including IsEntrapment
- Generated decoys preserve IsEntrapment from source proteins
- BioPolymerGroup outputs 'D/C/E/T' (Entrapment checked first in priority)
- TSV results include IsEntrapment via DecoyContamTarget.Contains('E')
- Add test data file TwoHumanHistone_mimic_Retain3.fasta
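
The auto-detection and labelling rules listed above can be sketched in Python as follows. This is an illustration, not the mzLib code: the function names are hypothetical, the prefix check mirrors the `accession.StartsWith(entrapmentIdentifier)` call named in the commit, and the relative order of the decoy and contaminant checks after the entrapment check is an assumption.

```python
def is_entrapment(accession: str, entrapment_identifier: str = "NTRAP") -> bool:
    """Flag an entry as entrapment when its accession starts with the identifier."""
    return accession.startswith(entrapment_identifier)


def group_label(is_decoy: bool, is_contaminant: bool, entrapment: bool) -> str:
    """Return one of 'D', 'C', 'E', 'T' for a group.

    Entrapment is checked first, as the commit states. Checking decoy
    before contaminant is an assumption made for this sketch.
    """
    if entrapment:
        return "E"
    if is_decoy:
        return "D"
    if is_contaminant:
        return "C"
    return "T"
```

With this ordering, a decoy generated from an entrapment protein that preserves the flag is still labelled 'E'.
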

* Move isEntrapment to the end of the constructors and made it optional

* better biopolymergroup output

* Testing

* more tests

* Replace NTrap with Random

* more tests
* Changed ProteinXML writer to no longer produce empty position tags (e.g., <begin position="" />) in the .xml database

* No longer write malformed entries

* Small modifications to protein db writer of questionable quality

* Updated .gitignore

* Added UniprotEntry class, modified ProteinDbWriter for ProSight Compatibility

* Added test for fasta -> xml writing

* Updated proteinDbWriter, tests, add new protein constructor

* Added new flag to WriteProteinDb

* Fixed non variant protein assignment issue

* Addressed Shortreed PR comments
* New clean repo with ptm_stoch contents. The methods for occupancy calculation in mzlibutils were copied from the previous branch onto this one. Need to add/remake the tests next.

* Added TestMzLibUtils tests for quantified mods, peptides, and proteins. Need tests for the protein groups and the occupancy set up (currently called CalculateOccupancies).

* Added PG and Quant object setup tests. Need to finish these tests, though

* Finished TestSetUpQuantificationObjects. Removed Peptides field (and its population) from SetUpQuantificationObjects method for now.

* Refactored quantification util classes

* improving quantprot exception throw.

* Extended commenting. Added a peptide record class that stores the peptide input for setting up the protein groups and the quantifications.

* delayed test fixes....

* Adding GeneName and Organism fields to QuantifiedProteinGroup. FIXED bug when merging QuantifiedPeptides that caused the resulting mods to have greater intensity than total base peptide intensity.

* Apply suggestions from code review

Copilot suggestions for PFA class

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

copilot suggestions for creating deep copy on peptide mod stoich method as well as cleaner AND/OR conditional priority

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

comment fix

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

comment fix

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Static occupancy methods and integration into Omics.BioPolymerGroup. Note: the util occupancy code will be kept for now in the event it can be useful in the future due to its simpler code structure.

* cleaning docs and small bug risks

* removing occupancy from sequence coverage. Adding sample group class for occupancy reporting. Updating quant to always have per group psm count, psm occupancy, and additionally intensity and intensity occupancy if lfq provided. NEEDS TESTS.

* Fixing tests. Still need to add tests for new classes. Updated some properties to ensure string and headers are always writeable.

* remove property setter calls to populatesamplegroupresults. method should be called before writing quantification columns.

* coverage improvement and small bug fixes.

* minor corrections from claude

* temp. added ScanMetadata

* temp save, but test run works

* cleaning biopolymergroup

* make sure psms with multiple pwsm matches do not inflate the psm count denominator.

* restored accidentally deleted code.

* preventative maintenance

* unit tests to promote understanding

* Bug fix for inflated occupancies due to Full sequences from the same PSM recounting the psm intensity for the unambiguous mods in that PSM.

* fix counting and only report unambiguous mods.

* Cleaning code and implementing some suggestions. Renamed BioPolymerGroupType categories. Occupancy is now only reported from unambiguous PSMs (will be enhanced later).

* revert nuspec

* final fix. output seems correct.

* final fix. output seems correct.

* nuspec?

* test file instead of folder referencing in dotnet.yaml to see if that fixes the integration issue.

---------

Co-authored-by: Alexander-Sol <41119316+Alexander-Sol@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
Co-authored-by: Nic Bollis <nbollis@comcast.net>
Co-authored-by: trishorts <mshort@chem.wisc.edu>
* slim fit

* increased test coverage plus handle negative mode
* Averatide revisions

* Empirical

* Empirical Testing

* Added missing elements.
* updating fragment intensity model input mods handling default and adding safeguards to pwsm fragmentation output.

* Make every models' default behavior be to return null predictions on incompatible inputs.

---------

Co-authored-by: Nic Bollis <nbollis@comcast.net>
* Object Pool

* ope

* Split: TestProteinDigestion

* Split: TestPeptides

* Split: TestModifications

* Split: TestFragments

* Split: TestChemicalFormula

* Split: TestPeptideWithSetMods

* Move: First 8 tests

* Move: MzlibUtil

* Rename: Chemistry Tests

* Move Loose: Database Tests

* Move Loose: IsoTracker

* Move Loose: MassSpectrometry Decon and loose

* Move Loose: Spectral Library Tests

* Structure: FileReadingTests

* Move Loose: Proteomics

* Adjust: Transcriptomics

* Move Loose: Omics-Fragmentation

* Mass Spec Oopsie

* Move Loose: Omics-Modification

* Move Loose: Omics-BioPolymerGroup

* Move Loose: Omics-SampleInfo

* Move Loose: Omics-Occupancy

* Move Loose: Omics-SpectralMatch

* Test.csproj changes

* Fix tests.

* refactor a few more namespaces
… Generalized Naming (#1048)

* batch mode

* batch tests

* IDisposable left to individual predictors

* best of both worlds

* eliminate record struct

* threaded with split by range

* obsolete predictretentiontime

* internal code review

* fix broken tests

* retention time predictor exception tests

* bye bye ienumerable

* eliminate exception

* delete chronologer override

* throw caution to the wind

* some tests

* Drop the IsConcurrentPredictionSafe gate concept entirely: predictors are
  expected to handle whatever maxThreads they receive. Chronologer serializes
  on _modelLock; SSRCalc3 and CZE are pure functions over readonly state.
  maxThreads < 1 clamps to 1 rather than throws. No streaming API.

  Production code:
  - Producer faults in ProduceResults are now captured into the existing
    ConcurrentQueue<Exception> and surfaced as AggregateException after the
    consumer drains, honoring the documented contract.
  - peptides.ToList() and the empty-input early return hoisted out of
    Task.Run to the caller's thread, so source-enumeration faults surface
    synchronously and thread-affine sources behave correctly.
  - PredictRetentionTimeEquivalents XML doc on base class and interface now
    explicitly states result order is not guaranteed; the tuple's Peptide
    element is the binding to inputs.
  - SSRCalc3.ValidateBasicConstraints calls base first, so null/empty
    BaseSequence yields EmptySequence instead of NRE.
  - Removed orphaned IsConcurrentPredictionSafe XML doc, broken
    <see cref="StreamRetentionTimeEquivalents"/>, redundant IDisposable on
    the base class declaration, and the obsolete PredictRetentionTime DIM
    on the interface (kept the concrete shim on the base class).
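
The production-code points above (clamping maxThreads, enumerating the source on the caller's thread, and capturing producer faults for surfacing after the drain) can be sketched in Python. This is a rough illustration under stated assumptions, not the C# implementation: `predict_all` and `AggregateError` are hypothetical names, and the strided work split is chosen only for brevity.

```python
import queue
import threading


class AggregateError(Exception):
    """Hypothetical stand-in for .NET's AggregateException."""

    def __init__(self, errors):
        super().__init__(f"{len(errors)} producer fault(s)")
        self.errors = list(errors)


def predict_all(peptides, predict, max_threads):
    # Enumerate the source on the caller's thread so that enumeration
    # faults surface synchronously, before any worker starts.
    items = list(peptides)
    if not items:
        return []

    workers = max(1, max_threads)  # clamp instead of throwing
    results = queue.Queue()
    errors = queue.Queue()

    def produce(chunk):
        try:
            for peptide in chunk:
                results.put((peptide, predict(peptide)))
        except Exception as exc:  # capture the fault instead of losing it
            errors.put(exc)

    threads = [
        threading.Thread(target=produce, args=(items[i::workers],))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Drain everything that was produced, then surface captured faults.
    drained = []
    while not results.empty():
        drained.append(results.get())
    faults = []
    while not errors.empty():
        faults.append(errors.get())
    if faults:
        raise AggregateError(faults)
    return drained
```

As in the commit's description of the result contract, the returned order is not guaranteed; each tuple's first element is what binds a result to its input.
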

  Tests:
  - Multi-threaded parity test now uses real amino-acid sequences instead of
    PEPTIDE1..PEPTIDE20, which were all rejected as InvalidAminoAcid and
    reduced the test to "null equals null".
  - ContainsAllInputPeptides asserts Count == peptides.Count and multiset
    equality (no-collapse, unordered).
  - Cross-predictor failure-reason assertions in MixedBatch test weakened
    to the universal "value null ⇔ reason not null" contract.
  - DelegatesToProduceResults wired through SpyRetentionTimePredictor with
    real spy assertions.
  - Deleted broken Diagnostic_ParallelExecution_Behavior, unused
    AggregateExceptionThrowingPredictor and ThrowingEnumerable helpers,
    and two empty AggregateException placeholder regions.
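
The count-plus-multiset check described for ContainsAllInputPeptides can be illustrated with a short Python sketch. The function name and the `(peptide, value)` tuple shape are assumptions for illustration; the point is that a set-based comparison would hide collapsed duplicates, while a multiset comparison would not.

```python
from collections import Counter


def contains_all_input_peptides(inputs, results):
    """Check that results cover the inputs exactly, ignoring order.

    inputs  : list of peptide sequences (duplicates allowed)
    results : list of (peptide, value) tuples in any order
    """
    result_peptides = [peptide for peptide, _ in results]
    same_count = len(result_peptides) == len(inputs)
    # Counter equality is multiset equality: each sequence must appear
    # the same number of times on both sides.
    return same_count and Counter(result_peptides) == Counter(inputs)
```

For inputs containing the same sequence twice, a result list that reports it once fails this check even though the two sides are equal as sets.
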

  Cosmetic reverts: undid 9 stylistic changes flagged by the review
  (EOF newlines, trailing whitespace, blank lines between interface
  members, comment rewording, an unused-using removal that was actually
  used). No behavioural impact.

  Verified: dotnet build clean; 129/129 RetentionTimePrediction tests pass.
* protease dictionary embedded resource new default behavior

* internal review

* Address PR review: restore deprecated APIs, fix CNBr test, update stale docs

Blocking
- ProteaseDictionary: restore LoadProteaseDictionary(path, mods) and
  ResetToDefaults(mods) as [Obsolete] thin wrappers. The parser wrapper
  does not mutate global state (matches pre-PR behavior); ResetToDefaults
  rebuilds Dictionary from the embedded resource.
- RnaseDictionary: restore LoadRnaseDictionary(path) and ResetToDefaults()
  with the same [Obsolete] treatment.
- TestProteinDigestion.CNBrProteinDigestion: rename the three colliding
  rows in DoubleProtease.tsv (CNBr, CNBr_old, CNBr_N) to CNBr_custom,
  CNBr_old_custom, CNBr_N_custom, update the test's dictionary lookups,
  and add result.Added assertions so the test now exercises custom-file
  loading instead of silently falling back to the embedded definitions.
  The Test1/Test2/Test3 rows used by TestSeqCoverage.ReadCustomFile are
  untouched.

Non-blocking
- proteases.tsv, rnases.tsv: remove dead ResetToDefaults references,
  rewrite merge-rules comments to reflect skip-on-collision semantics,
  and replace the misleading 'trypsin' example in proteases.tsv with a
  non-colliding name.
- ProteaseDictionary, RnaseDictionary: soften 'immutable' /
  'cannot be overridden' language in class docstrings to
  'protected baseline', and note that dictionary contents remain
  mutable via the indexer.

Deferred to follow-up (non-blocking suggestions): standalone parser
test in TestSeqCoverage and splitting atomicity tests out of
TestProteaseDictionaryEmbeddedMods.cs.

* fix(digestion): address PR #1040 review

  Restore the public no-args RnaseDictionary.LoadRnaseDictionary() as
  an [Obsolete] shim so external callers compile with a deprecation
  message pointing them at RnaseDictionary.Dictionary.

  Enforce CustomDigestionAgentLoadResult's advertised immutability:
  Added and Skipped now wrap a defensive copy in a ReadOnlyCollection
  so callers can't mutate reported results via a retained reference.

  Validate path and reject null elements in
  RnaseDictionary.LoadAndMergeCustomRnases before any I/O, so the
  resulting ArgumentNullException identifies the bad input clearly.

  Clarify in XML docs that:
  - CustomDigestionAgentLoadResult.Skipped intentionally conflates
    embedded-baseline, cross-file, and cross-call collisions.
  - ProteaseDictionary.LoadAndMergeCustomProteases is not thread-safe
    and must run during single-threaded initialization.
  - Custom proteases referencing an unresolvable cleavage modification
    throw MzLibException; entries with no mod are accepted silently.
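
The behaviours described in this commit (input validation before any work, skip-on-collision merging, and a load result that callers cannot mutate) can be sketched together in Python. All names here are hypothetical and the entry values are opaque placeholders; the C# code uses a ReadOnlyCollection around a defensive copy, for which a tuple is the closest simple Python analogue.

```python
class CustomLoadResult:
    """Reports which custom names were added and which were skipped."""

    def __init__(self, added, skipped):
        # tuple(...) takes a defensive copy, so later changes to the
        # caller's lists cannot alter the reported result.
        self._added = tuple(added)
        self._skipped = tuple(skipped)

    @property
    def added(self):
        return self._added

    @property
    def skipped(self):
        return self._skipped


def load_and_merge(dictionary, custom_entries):
    """Merge (name, definition) pairs into dictionary, skipping collisions."""
    if custom_entries is None:
        raise ValueError("custom_entries must not be None")
    entries = list(custom_entries)
    # Reject bad input before touching the dictionary.
    if any(entry is None for entry in entries):
        raise ValueError("custom_entries must not contain None")

    added, skipped = [], []
    for name, definition in entries:
        if name in dictionary:
            # Baseline, same-file, and earlier-call collisions all land
            # here, which is why Skipped conflates them.
            skipped.append(name)
        else:
            dictionary[name] = definition
            added.append(name)
    return CustomLoadResult(added, skipped)
```

Because existing keys are never overwritten, a custom row that reuses a baseline name is reported in `skipped` rather than silently replacing the baseline definition.
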

  Addresses 6 of 19 findings from the PR #1040 review; the remaining
  were already applied or out of scope. See fixes_summary.md.

* fix broken unit tests
* minimal decon scorer

* test shallow clone

* isotope ratio consistency

* new scorer weights

* nic changes

* fix one unit test

* internal code review

* test classic decon with decoys

* Add XML documentation to ClassicDeconvolutionParameters

Added XML documentation comments for the ClassicDeconvolutionParameters class and its constructor.

* Add XML documentation for Deconvoluter methods

* Add XML documentation to CreateAlgorithm method

Added XML documentation for CreateAlgorithm method.

* convert virtual to abstract

* fix(decon): eagerly validate ScoreEnvelopes args; skip null envelopes

  ScoreEnvelopes was a single yield-return method, so arg validation was
  deferred until first MoveNext. That hid two bugs:

  - Passing model=null with an empty envelopes sequence threw nothing at
    all (the iterator body never ran).
  - Passing envelopes=null surfaced as a NullReferenceException from the
    foreach rather than as ArgumentNullException("envelopes").

  Split into a non-iterator wrapper that null-checks both args eagerly
  and a private iterator helper. The iterator now skips null elements
  silently instead of letting them propagate into ComputeFeatures, which
  matches the intent of the batch API (one bad envelope shouldn't
  abort the rest of the run).
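
Python generators defer their body in the same way as a C# iterator method, so the pattern can be illustrated with a small sketch. The function names below are hypothetical and `model` is assumed to be a plain callable; this shows the wrapper-plus-helper split described above, not the mzLib code.

```python
def score_envelopes_lazy(envelopes, model):
    """Single generator: nothing in this body runs until the first next()."""
    for envelope in envelopes:   # envelopes=None fails here, but only when iterated
        yield model(envelope)    # model=None fails here, but only if an element exists


def score_envelopes(envelopes, model):
    """Non-generator wrapper: both arguments are checked at call time."""
    if envelopes is None:
        raise ValueError("envelopes must not be None")
    if model is None:
        raise ValueError("model must not be None")
    return _score_envelopes_iter(envelopes, model)


def _score_envelopes_iter(envelopes, model):
    """Private generator helper: skips None elements instead of failing."""
    for envelope in envelopes:
        if envelope is None:
            continue
        yield model(envelope)
```

With the lazy version, `list(score_envelopes_lazy([], None))` returns an empty list without any error, which mirrors the first bug described above; the wrapper rejects the same call immediately.
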

* fix(decon): anchor ComputeFeatures m/z on signed envelope charge

  ComputeFeatures was passing absCharge to ToMz when computing the
  monoisotopic m/z anchor for the theoretical isotope grid. ToMz is
  asymmetric in the proton sign -- mass/|z| + sign(z)*proton -- so
  ToMz(mass, +5) and ToMz(mass, -5) differ by 2 protons (~2.015 Da),
  far beyond the 10 ppm match tolerance.

  For negative-mode envelopes that meant the theoretical grid landed
  on the positive-mode mirror, no observed peaks matched, and
  cosine/completeness/ratio-consistency collapsed to ~0.

  Pass envelope.Charge (signed) so ToMz embeds the correct polarity
  sign. The rest of the function still uses absCharge for isotope
  step and per-position math, which is correct.
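
The size of the polarity error can be checked with a short Python sketch of the formula quoted above, mass/|z| + sign(z)*proton. The function name `to_mz` and the rounded proton mass constant are assumptions for illustration; this is not the mzLib ToMz implementation.

```python
PROTON_MASS = 1.007276466  # Da, approximate value used for this sketch


def to_mz(mass: float, charge: int) -> float:
    """Convert a neutral mass to m/z, keeping the sign of the charge."""
    if charge == 0:
        raise ValueError("charge must be non-zero")
    sign = 1 if charge > 0 else -1
    return mass / abs(charge) + sign * PROTON_MASS


def ppm_window(mz: float, ppm: float = 10.0) -> float:
    """Absolute width in m/z units of a ppm tolerance at the given m/z."""
    return abs(mz) * ppm / 1e6


mass = 10000.0
positive = to_mz(mass, 5)
negative = to_mz(mass, -5)
gap = positive - negative  # two proton masses, independent of mass and |z|
```

For a 10 kDa species at |z| = 5 the gap is about 2.015 in m/z, while a 10 ppm window at m/z near 2000 is only about 0.02 wide, so a grid anchored on the wrong polarity cannot match any observed peak.
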

* test(decon): pin q-value, decoy-averagine, and cache-reset contracts

  The deconScorer branch shipped several tests that passed trivially
  without exercising the contract they claimed to verify:

  - G4_AssignQValues_MonotonicallyEnforced asserted only non-decreasing
    order, which any constant array satisfies. Now asserts the four exact
    post-sweep values (all 0.5 for the documented input), so a regression
    that drops the +1 correction or removes the monotone sweep fails.
  - C1_MonoisotopicPeak_IsNotShifted iterated only idx 0..19 where apex_n
    is always 0, so the peak-shift logic was never exercised. Renamed to
    C1_MonoisotopicRecovery_HoldsAtLowMassBreaksAtHighMass and now anchors
    on both a low-mass index (where apex IS monoisotopic) and a high-mass
    index (apex_n > 0). The latter exposes a real contract violation in
    DecoyAveragine: GetDiffToMonoisotopic forwards unchanged from the real
    model while the apex peak is shifted, so apex - DiffToMono = realMono
    + apex_n * perPeakOffset rather than realMono. Test pins current
    actual behaviour; doc-block flags a follow-up to fix the prod side.
  - IsoDec_ToDecoyParameters_DecoySpacingReachesDll built a fresh original
    with no _isoSettings cache, so the cache-reset claim was untested.
    Now warms original.ToIsoSettings() before cloning.
  - D2 in TestDecoyAveragine asserted finiteness of e.Score (the raw
    algorithm score this PR replaces) instead of the new generic scorer.
    Now scores every envelope via DeconvolutionScorer.ScoreEnvelope.
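
  For the q-value contract in the first bullet, a rough Python sketch may help. The exact formula is not shown in this log; the sketch assumes the common conservative form (decoys + 1) / targets followed by a worst-to-best running-minimum sweep, and `assign_q_values` is a hypothetical name.

```python
from typing import List


def assign_q_values(is_decoy_best_first: List[bool]) -> List[float]:
    """q-values for hits sorted best score first (assumed formula)."""
    targets = 0
    decoys = 0
    raw = []
    for is_decoy in is_decoy_best_first:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # The +1 keeps the estimate conservative; max() avoids dividing by zero.
        raw.append((decoys + 1) / max(targets, 1))

    # Monotone sweep: walk from worst to best and carry the running minimum,
    # so a better-scoring hit never gets a larger q-value than a worse one.
    q = list(raw)
    running = float("inf")
    for i in range(len(q) - 1, -1, -1):
        running = min(running, q[i])
        q[i] = running
    return q


example = assign_q_values([False, False, True, False])
```

  A test that asserts only non-decreasing order would still pass if the +1 were dropped; asserting the exact values is what pins the formula.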

  Also added CosineOfAlignedVectors_ZeroNorm_ReturnsZero to pin the helper
  contract that A6_ComputeFeatures_EmptyPeakList_DoesNotThrow implicitly
  relies on.

  Restored standard logistic-regression sign convention in
  DeconvolutionScorer (positive coefficient => feature increases P(true)),
  with sigmoid back to 1/(1+exp(-linear)). Algebraically identical to the
  flipped form, but the standard convention lets weights be replaced
  directly with sklearn / glm output.
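
  The algebraic equivalence of the two sign conventions can be checked numerically. This is a Python sketch with made-up weights and features; none of the values come from DeconvolutionScorer.

```python
import math
from typing import Sequence


def p_true_standard(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    # Standard convention: a positive coefficient raises P(true).
    linear = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-linear))


def p_true_flipped(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    # Flipped convention: exp(+linear), so the stored coefficients are negated.
    linear = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(linear))


# Hypothetical coefficients, for illustration only.
weights = [2.0, -0.5, 1.25]
bias = -0.75
features = [0.9, 0.2, 0.6]

standard = p_true_standard(features, weights, bias)
flipped = p_true_flipped(features, [-w for w in weights], -bias)
```

  Both calls give the same probability; only the standard form accepts coefficients from a typical logistic-regression fit without negation.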

  Removed bogus meanTarget > meanDecoy assertion in
  TestDecoyClassicRealData (contradicted documented Classic AUC=0.50) and
  corrected E2's "identical envelopes" comment to match the empirical
  empty-decoy-list reality.

  Renamed TestShallowClone class to TestToDecoyParameters; file rename
  done via `git mv`.

* filename change

* test(decon): pin scorer contracts and expose decoy-averagine API gap

  Strengthens the deconScorer branch's regression coverage. Several tests
  on the branch passed trivially without exercising the contract they
  claimed to verify; this commit replaces them with assertions that
  actually fail on the regressions they're meant to catch, plus one
  contract test for a helper that previously had none.

  Substantive test changes:

  - G4_AssignQValues_MonotonicallyEnforced now asserts the four exact
    post-sweep q-values (all 0.5 for the documented input) instead of
    only non-decreasing order, so a regression that drops the +1
    correction or removes the monotone sweep fails. The old XML comment
    documented the formula without the +1 correction; that's corrected.
  - C1_MonoisotopicPeak_IsNotShifted (renamed to
    C1_MonoisotopicRecovery_HoldsAtLowMassBreaksAtHighMass) iterated
    only idx 0..19 where apex_n is always 0, so the peak-shift logic
    was never exercised. Now anchors on both a low-mass and a high-mass
    entry. The high-mass case exposes a real DecoyAveragine API gap:
    GetDiffToMonoisotopic forwards unchanged from the real model while
    the apex peak is shifted, so apex - DiffToMono = realMono +
    apex_n * perPeakOffset, not realMono. Test pins current actual
    behaviour; doc-block flags a follow-up to fix the prod side.
  - IsoDec_ToDecoyParameters_DecoySpacingReachesDll now warms the
    original's _isoSettings cache before cloning. Previously the cache
    was always null at clone time, so any future "cache leaks to decoy"
    regression would have been invisible.
  - D2_DecoyEnvelopes_ReturnedFromDeconvoluteWithDecoys_OnCleanSynthetic
    now scores every envelope via DeconvolutionScorer.ScoreEnvelope (the
    new generic scorer this PR introduces) rather than asserting on the
    algorithm's raw e.Score field that the PR sets out to replace.
  - New CosineOfAlignedVectors_ZeroNorm_ReturnsZero pins the helper's
    zero-norm-returns-0 contract that ComputeFeatures implicitly relies
    on for envelopes with no observed signal at theoretical positions.
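
  The zero-norm contract in the last bullet is easy to state in code. Below is a minimal Python sketch of a cosine over two aligned vectors that returns 0 when either vector has zero norm; it mirrors the described behaviour but is not the mzLib helper.

```python
import math
from typing import Sequence


def cosine_of_aligned_vectors(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # No observed (or no theoretical) signal: define the similarity as 0
        # rather than dividing by zero.
        return 0.0
    return dot / (norm_a * norm_b)
```

  An envelope with no observed peaks at the theoretical positions yields an all-zero observed vector, so this branch is what keeps the feature finite.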

  Production-code changes:

  - Restored standard logistic-regression sign convention in
    DeconvolutionScorer (positive coefficient => feature increases
    P(true)), with sigmoid back to 1/(1+exp(-linear)) and the explanatory
    comment re-attached. Algebraically identical to the flipped form,
    but the standard convention lets weights drop in directly from
    sklearn / glm output.
  - Documented the readonly-struct choice on EnvelopeScoreFeatures with
    a <remarks> block: 32-byte struct, ComputeFeatures is the
    batch-scoring hot path (thousands of envelopes per scan), value
    semantics avoid per-envelope GC pressure.

  Other corrections:

  - Removed bogus meanTarget > meanDecoy assertion in
    TestDecoyClassicRealData (contradicted documented Classic AUC=0.50).
  - Corrected E2's "identical envelopes" comment in
    TestDeconvolutionScorerIntegration: empirically decoys is empty for
    Classic on the synthetic spectrum, not identical to targets.
  - Renamed TestShallowClone -> TestToDecoyParameters (file + class) to
    match the actual API surface under test.
  - Restored a few comments and a brace pair removed during the branch's
    earlier cosmetic churn (style choices kept where the user signed off
    on them; reverts limited to noise).

* Pass in expected spacing to decoy

---------

Co-authored-by: nbollis <nbollis@comcast.net>
* Initial param equality

* Tests

---------

Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Bump OpenMcdf from 2.3.1 to 3.1.3

---
updated-dependencies:
- dependency-name: OpenMcdf
  dependency-version: 3.1.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update nuspec: bump OpenMcdf to 3.1.3 and OpenMcdf.Extensions to 2.3.1

Agent-Logs-Url: https://github.com/smith-chem-wisc/mzLib/sessions/ff64eb83-51bd-4e68-8cd9-eb7781945b78

Co-authored-by: trishorts <16841846+trishorts@users.noreply.github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nic Bollis <nbollis@comcast.net>
Co-authored-by: trishorts <mshort@chem.wisc.edu>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: trishorts <16841846+trishorts@users.noreply.github.com>
* Add MS:1000516 charge array support to MsDataScan + mzML I/O

Adds an optional 'ChargeArray' (int[]) property to MsDataScan, parallel
to the m/z and intensity arrays. The mzML writer emits a third
binaryDataArray (PSI-MS MS:1000516, 32-bit float, no compression) when
ChargeArray is non-null. The mzML reader recognizes the same accession
and round-trips the array.
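
As a rough illustration of the encoding described above, the Python sketch below packs integer charges as little-endian 32-bit floats, base64-encodes them without compression, and rounds back to integers on read. The function names are hypothetical and the sketch covers only the binary payload, not the surrounding mzML XML.

```python
import base64
import struct
from typing import List


def encode_charges(charges: List[int]) -> str:
    """Pack charges as little-endian 32-bit floats, then base64 (no compression)."""
    raw = struct.pack(f"<{len(charges)}f", *[float(c) for c in charges])
    return base64.b64encode(raw).decode("ascii")


def decode_charges(text: str) -> List[int]:
    """Decode the base64 payload and round each 32-bit float back to an int."""
    raw = base64.b64decode(text)
    floats = struct.unpack(f"<{len(raw) // 4}f", raw)
    return [int(round(v)) for v in floats]


payload = encode_charges([1, 2, -3, 0, 14])
restored = decode_charges(payload)
```

Small integers are exactly representable in 32-bit floats, so the rounding step is a safeguard rather than a lossy conversion.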

Existing callers are unaffected — the constructor parameter defaults
to null and writers/readers behave identically when no charge data is
present. Conformant readers that don't recognize MS:1000516 silently
ignore the array per spec.

Use case: deisotopers (e.g. YADA's annotate mode) want to publish
per-peak charge state inline in the mzML rather than via a side-car
file. PSI-MS has defined the charge array as a first-class spec feature
since mzML 1.0; ProteoWizard, Skyline, MZmine, mzR, and pymzML all
consume it. mzLib's writer just didn't expose the API.

* Add round-trip + null-default regression tests for charge array

Per @nbollis's review on #1060 — exercises both the writer branch in
MzmlMethods (third <binaryDataArray> with MS:1000516) and the reader
branch in Mzml.cs (cvParam recognition + 32-bit-float decode rounded
back to int[]). The second test pins the null-default behavior so a
future change can't quietly start emitting an empty array.

Both tests use GenericMsDataFile as the concrete MsDataFile, matching
the existing test patterns in this file.

* yada 31

* Address review feedback from trishorts

- Wire MS:1000516 charge-array decoding into the dynamic-reader path
  (GetOneBasedScanFromDynamicConnection) so InitiateDynamicConnection
  callers no longer silently drop per-peak charges. Mirrors the existing
  static-reader path: new readingCharges flag + chargeArray local, an
  MS:1000516 CVPARAM branch, BINARY decodes & rounds the 32-bit floats
  back to int, and the constructed MsDataScan receives chargeArray.
- Correct ChargeArray's XML doc-comment ("zlib-compressed" was wrong;
  the writer emits MS:1000576 "no compression"), with a note explaining
  the deferred-compression rationale.
- Tighten ChargeArrayRoundTrip to also assert MassSpectrum.XArray and
  YArray round-trip element-for-element. Guards against writer
  sizing/indexing bugs that would corrupt the existing arrays while
  leaving the new charge array alone.
- Add ChargeArrayRoundTripWithNoiseData exercising the 6-array writer
  branch (NoiseData + ChargeArray). Asserts m/z, intensity, and charge
  all round-trip, and inspects the on-disk XML for
  binaryDataArrayList count="6" + an MS:1000516 cvParam.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Nic Bollis <nbollis@comcast.net>
* Move ISingleChargeMs1Feature to MassSpectrometry

The interface describes an MS1 deconvolution result and is the input
contract for the upcoming Deconvoluter.PairPrecursorsToMs2 join. That
method lives in MassSpectrometry.Deconvolution; Readers references
MassSpectrometry rather than the reverse, so the interface had to live
in MassSpectrometry for the dependency direction to work.

Relocate the file via `git mv` (history preserved), re-namespace to
MassSpectrometry, and add `using MassSpectrometry;` to the five Readers
files that implement or reference it. No behavior change. No external
consumer references the interface today (grep-checked on MetaMorpheus,
ProteaseGuru, ProteoformExplorer), so the move is invisible outside
this PR.

Doc tightened to reflect that pairing requires both an RT-window check
and an isolation-window check, not just an RT-apex check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add Deconvoluter.PairPrecursorsToMs2 with FlashDeconv / TopFD coverage

A new join between MS1 deconvolution features (any
ISingleChargeMs1Feature source — Ms1FeatureFile from TopFD/FlashDeconv,
DinosaurTsvFile, or future in-house whole-file deconvolution) and the
MS2 scans that selected them for fragmentation. One method on the
existing Deconvoluter:

    public static IEnumerable<(MsDataScan Ms2Scan, IsotopicEnvelope PrecursorEnvelope)>
        PairPrecursorsToMs2(IEnumerable<ISingleChargeMs1Feature> ms1Features,
                            MsDataFile msDataFile);

Match rule: MS2.RetentionTime ∈ [feat.RetentionTimeStart,
feat.RetentionTimeEnd] AND feat.Mz ∈ MS2.IsolationRange. Pure join —
no cross-charge consensus, no off-by-one correction, no harmonic
filtering, no isobaric-tag reporter extraction. Those concerns belong
to the deconvolution producer upstream or to the search engine
downstream. Pairing is restricted to MsnOrder == 2 (MS3 isolation
targets an MS2 fragment, not an MS1 precursor); MS2 scans with a null
IsolationRange are skipped. Chimeras emit one pair per matching
feature; a single feature spanning multiple MS2 scans likewise emits
one pair per scan.
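
The match rule above can be sketched as a plain nested-loop join. This is a Python illustration with hypothetical `Feature` and `Ms2Scan` records, not the mzLib types; both bounds are treated as inclusive, as the boundary tests listed below suggest.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Feature:
    mz: float
    rt_start: float
    rt_end: float


@dataclass
class Ms2Scan:
    retention_time: float
    msn_order: int
    isolation: Optional[Tuple[float, float]]  # (min m/z, max m/z) or None


def pair_precursors(features: List[Feature], scans: List[Ms2Scan]):
    pairs = []
    for scan in scans:
        # Only MS2 scans with a known isolation window take part in the join.
        if scan.msn_order != 2 or scan.isolation is None:
            continue
        low, high = scan.isolation
        for feat in features:
            in_rt = feat.rt_start <= scan.retention_time <= feat.rt_end
            in_mz = low <= feat.mz <= high
            if in_rt and in_mz:
                pairs.append((scan, feat))  # chimeras emit one pair per feature
    return pairs
```

A feature that overlaps several MS2 scans appears once per scan, and a scan whose window covers several features appears once per feature.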

The returned IsotopicEnvelope is the type MetaMorpheus already speaks
via EngineLayer.Util.Precursor's Precursor(IsotopicEnvelope, ...)
constructor, so downstream wrapping into Ms2ScanWithSpecificMass is a
direct two-line adapter at the call site (no mzLib-side wrapper class
needed):

    var pairs = Deconvoluter.PairPrecursorsToMs2(features, dataFile);
    foreach (var (ms2, envelope) in pairs)
    {
        var precursor = new Precursor(envelope, ...);
        var scan = new Ms2ScanWithSpecificMass(ms2, precursor.MonoisotopicPeakMz,
            precursor.Charge, fullFilePath, commonParameters,
            neutralExperimentalFragments, precursor.Intensity,
            precursor.EnvelopePeakCount, precursor.FractionalIntensity);
        ...
    }

For input from external readers (.ms1.feature etc.), the envelope is
built via the existing 3-arg IsotopicEnvelope(mass, intensity, charge)
constructor — Peaks is a single synthetic entry; NumberOfIsotopes from
the input is not surfaced because per-peak m/z and intensity are
unknown. When the producer is mzLib's own in-house deconvolution, it can
hand PairPrecursorsToMs2 real envelopes; the contract is lossless in
that case.

Tests in TestPrecursorPairing.cs cover:
- 14 synthetic-input unit tests: happy path, RT-out-of-window (both
  sides), m/z-out-of-isolation (both sides), chimera, one-feature-
  spanning-many-MS2s, MS2 with null isolation, MS1 ignored, MS3 ignored,
  RT/isolation boundary inclusivity, empty inputs, MS2-free file.
- 3 end-to-end fixture tests: load Ms1Feature_FlashDeconvOpenMs3.0.0
  and Ms1Feature_TopFDv1.6.2 fixtures through the production
  Ms1FeatureFile reader, pair against a synthetic MS2 scan tuned to
  catch only one charge state of feature row #1 (≈10835.9 Da at z=10),
  assert one pair with the expected charge, mass, and format-
  appropriate intensity (FlashDeconv → 0; TopFD → positive).

17/17 pass; 774/774 MassSpectrometryTests + FileReadingTests
regression remains green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Remove unused MassSpectrometry namespace

* Remove unused MassSpectrometry namespace import

* Update Ms1Feature.cs

* Remove unused MassSpectrometry import

Removed unused MassSpectrometry namespace import.

* Remove unused MassSpectrometry import

* Restore using MassSpectrometry; in 5 Readers files (build fix)

These five files reference ISingleChargeMs1Feature in method signatures
or interface implementations, and the interface now lives in the
MassSpectrometry namespace (per the move two commits earlier). Readers
has ImplicitUsings=enable but the auto-generated global usings only
cover stdlib namespaces (System, System.Linq, etc.), so MassSpectrometry
isn't imported implicitly.

The previous round of cleanup commits removed these using directives as
"unused imports" — but each is required for the file to compile.
Restoring them undoes those five build-breaking deletions.

Verified: full Test.csproj suite — 3413/3413 pass (5 benchmarks skipped
as designed) on .NET 8.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Replace PairPrecursorsToMs2 with FromFile deconvolution algorithm

Adopts the pattern from Nic's PR #1065 draft: instead of a bespoke
join method on Deconvoluter, "deconvolution loaded from disk" becomes
another DeconvolutionAlgorithm that plugs into the existing
Deconvoluter factory. MetaMorpheus's per-MS2 precursor-deconvolution
loop already calls MsDataScan.GetIsolatedMassesAndCharges(precursorSpectrum,
deconParams); swapping ClassicDeconvolutionParameters for
FromFileDeconvolutionParameters transparently switches the entire
pipeline to consume pre-deconvoluted features.

New types:
- DeconvolutionType.FromFile (enum entry)
- FromFileDeconvolutionParameters (DeconvolutionParameters subclass —
  wraps a pre-loaded IEnumerable<ISingleChargeMs1Feature>; the caller
  chooses which Readers parser to use, so the Readers → MassSpectrometry
  dependency arrow is preserved)
- FromFileDeconvolutionAlgorithm (DeconvolutionAlgorithm subclass —
  filters the stored features by m/z range, RT-window overlap, and
  [MinAssumedChargeState, MaxAssumedChargeState], yields synthetic
  IsotopicEnvelopes via the existing 3-arg constructor)
- MzLibUtil.MzRtRange (MzRange subclass adding RT bounds — reusable
  primitive for any future algorithm needing joint m/z + RT filtering;
  adopted verbatim from #1065)
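
To make the MzRtRange idea concrete, here is a minimal Python sketch of an m/z range subclass that adds retention-time bounds. The class and method names are illustrative and the inclusive-bounds behaviour is an assumption; the sketch does not reproduce the mzLib or #1065 implementation.

```python
class MzRange:
    """Closed interval on the m/z axis."""

    def __init__(self, minimum: float, maximum: float):
        if minimum > maximum:
            raise ValueError("minimum must not exceed maximum")
        self.minimum = minimum
        self.maximum = maximum

    def contains_mz(self, mz: float) -> bool:
        return self.minimum <= mz <= self.maximum


class MzRtRange(MzRange):
    """m/z interval plus a closed retention-time interval."""

    def __init__(self, min_mz: float, max_mz: float, min_rt: float, max_rt: float):
        super().__init__(min_mz, max_mz)
        if min_rt > max_rt:
            raise ValueError("min_rt must not exceed max_rt")
        self.min_rt = min_rt
        self.max_rt = max_rt

    def contains_rt(self, rt: float) -> bool:
        return self.min_rt <= rt <= self.max_rt

    def contains(self, mz: float, rt: float) -> bool:
        # A point is inside only when both axes agree.
        return self.contains_mz(mz) and self.contains_rt(rt)


window = MzRtRange(499.0, 501.0, 10.0, 12.0)
```

Because the subclass is still an MzRange, code that only needs the m/z bounds can accept it unchanged.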

Modified:
- Deconvoluter.Deconvolute(scan, params, range): auto-upgrades a plain
  MzRange to MzRtRange when params are FromFile, using scan.RetentionTime
  as the anchor. The MzSpectrum overload validates that an MzRtRange
  was supplied (no anchor available there) and raises ArgumentException
  otherwise. The branch only fires for FromFile — existing in-memory
  algorithm callers are entirely unaffected.
- Deconvoluter.CreateAlgorithm: factory case for FromFile.
- MsDataScan.GetIsolatedMassesAndCharges (both overloads): same
  auto-upgrade with rtTolerance=0.1 (matches #1065's value). Also
  extracts the 8.5 magic number into a const, mirroring #1065.

Deleted:
- The previous PairPrecursorsToMs2 method on Deconvoluter is gone.
  Callers wanting the bulk-pairing shape can write the 4-line loop
  themselves; the algorithm pattern is the canonical entry point.

Tests (renamed Test/MassSpectrometryTests/TestPrecursorPairing.cs →
TestFromFileDeconvolution.cs via git mv):
- 8 direct-algorithm tests via Deconvoluter.Deconvolute(spectrum,
  params, MzRtRange).
- 6 integration tests via MsDataScan.GetIsolatedMassesAndCharges
  (the production MM entry point).
- 3 end-to-end tests load real Ms1Feature_FlashDeconvOpenMs3.0.0 and
  Ms1Feature_TopFDv1.6.2 fixtures through Ms1FeatureFile, pair against
  a synthetic MS2 tuned to catch one charge state, assert the
  envelope's mass / charge / format-appropriate intensity.

17/17 pass. Full Test.csproj suite: 3413/3413 pass (5 benchmarks
skipped as designed), 2 min 11 s on .NET 8.

Adjustments vs Nic's #1065 sketch:
- Params class takes IEnumerable<ISingleChargeMs1Feature> instead of a
  resultPath string. Keeps file-format coupling out of MassSpectrometry.
- Algorithm body is implemented (sketch was NotImplementedException).
- MsDataScan.GetIsolatedMassesAndCharges(MsDataScan precursorScan, ...)
  uses this.RetentionTime (the MS2's RT) rather than
  precursorScan.RetentionTime. The canonical question is "what features
  were eluting when this MS2 was fired?"; the difference in practice
  is sub-second but the MS2-anchored answer is semantically right.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Move FromFile to Readers; params takes file path (PR #1064 review)

Addresses Nic's review comments on #1064:
- "FromFileDecon parameters and algorithm will need to be moved to the
  readers project. The parameters will then take in the file path and
  load up the objects that are needed."  (file-level comment on params)
- "This one should throw just like inside of Deconvoluter."  (line 192
  of MsDataScan)

Changes:

MassSpectrometry:
- DeconvolutionAlgorithm.Deconvolute promoted from internal abstract to
  protected internal abstract so subclasses in other assemblies (Readers)
  can override. Existing in-project overrides (Classic, IsoDec, Example)
  updated to match.
- DeconvolutionParameters: add virtual CreateAlgorithm() returning null.
  Subclasses living outside MassSpectrometry can override to inject
  their own DeconvolutionAlgorithm without the central enum-switch in
  Deconvoluter needing to know about them.
- Deconvoluter.CreateAlgorithm: try parameters.CreateAlgorithm() first,
  fall back to the enum-based switch for in-project algorithms. The
  FromFile case in the switch now throws an explanatory MzLibException
  rather than instantiating an in-project type — instantiation is the
  params-side's job.
- MsDataScan.GetIsolatedMassesAndCharges (both overloads): remove the
  FromFile auto-upgrade branch. The MzSpectrum overload of
  Deconvoluter.Deconvolute already throws when given FromFile params
  with a plain MzRange — let that error surface instead of silently
  upgrading. Callers wanting from-file decon use the MsDataScan
  overload of GetIsolatedMassesAndCharges, which routes through
  Deconvoluter's MsDataScan overload (still auto-upgrades using the
  precursor scan's RetentionTime).
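
The "ask the parameters first, then fall back to the switch" factory described above can be sketched as follows. This is a Python illustration with hypothetical class names; `RuntimeError` stands in for MzLibException.

```python
class ClassicAlgorithm:
    pass


class FromFileAlgorithm:
    pass


class DeconvolutionParameters:
    kind = "classic"

    def create_algorithm(self):
        # Default hook: no opinion, let the central factory decide.
        return None


class FromFileParameters(DeconvolutionParameters):
    kind = "from_file"

    def create_algorithm(self):
        # Lives in another package, so it supplies its own algorithm.
        return FromFileAlgorithm()


class BrokenFromFileParameters(DeconvolutionParameters):
    kind = "from_file"  # claims the type but does not override the hook


def create_algorithm(params: DeconvolutionParameters):
    algorithm = params.create_algorithm()
    if algorithm is not None:
        return algorithm
    if params.kind == "classic":
        return ClassicAlgorithm()
    if params.kind == "from_file":
        raise RuntimeError("from-file parameters must supply their own algorithm")
    raise RuntimeError(f"unknown deconvolution type: {params.kind}")
```

The central factory never needs to import the from-file types, which keeps the dependency arrow pointing one way.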

Old files deleted:
- mzLib/MassSpectrometry/Deconvolution/Algorithms/FromFileDeconvolutionAlgorithm.cs
- mzLib/MassSpectrometry/Deconvolution/Parameters/FromFileDeconvolutionParameters.cs

New files in Readers:
- mzLib/Readers/Deconvolution/FromFileDeconvolutionParameters.cs
  - Public ctor takes a string filePath; loads features via
    FileReader.ReadResultFile (auto-detects FlashDeconv / TopFD / Dinosaur
    via SupportedFileType), then calls LoadResults and materializes the
    per-charge expansion.
  - Internal ctor takes IEnumerable<ISingleChargeMs1Feature> for unit
    tests (uses existing InternalsVisibleTo("Test") in Readers).
  - CreateAlgorithm() override returns FromFileDeconvolutionAlgorithm.
- mzLib/Readers/Deconvolution/FromFileDeconvolutionAlgorithm.cs
  - Same algorithm body; override modifier is now `protected override`
    to satisfy C#'s cross-assembly `protected internal` rule.

Tests reshaped (TestFromFileDeconvolution.cs):
- Integration tests now use the MsDataScan overload of
  GetIsolatedMassesAndCharges (the MzSpectrum overload throws for
  FromFile after this change; that contract is locked by a new test
  MzSpectrumOverloadOfGimc_FromFileWithoutMzRtRange_Throws).
- End-to-end fixture tests now drive the public file-path ctor —
  exercising FileReader.ReadResultFile + SupportedFileType detection +
  Ms1FeatureFile.LoadResults.

18/18 FromFile tests pass. Full Test.csproj suite: 3414/3414 pass,
5 benchmarks skipped, 2 min 12 s on .NET 8.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Sort features by m/z and binary-search the lower bound (PR #1064 review)

Addresses Nic's efficiency suggestion at FromFileDeconvolutionAlgorithm.cs
line 47 — "Could possibly improve efficiency by storing the features in
an ordered collection within the FromFileParameters."

FromFileDeconvolutionParameters now sorts the per-charge features by
ascending m/z at construction and stores a parallel double[] of m/z
keys. A new internal helper FindFirstIndexAtOrAbove(double minMz) wraps
Array.BinarySearch with the standard "if no match, return insertion
point" convention.

FromFileDeconvolutionAlgorithm.Deconvolute now binary-searches for the
first feature whose m/z is at or above range.Minimum, iterates forward,
and yield-breaks the moment Mz exceeds range.Maximum — turning the
per-MS2 query from O(N) to O(log N + k) where k is the count of
features whose m/z lands inside the requested window.
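
The lookup described above can be sketched in a few lines. This is a Python illustration using `bisect_left` in place of Array.BinarySearch; `SortedFeatures` and its method names are hypothetical, and features are modelled as simple (m/z, label) tuples.

```python
from bisect import bisect_left
from typing import Iterator, List, Tuple

FeatureRow = Tuple[float, str]  # (m/z, label)


class SortedFeatures:
    def __init__(self, features: List[FeatureRow]):
        # Sort once at construction and keep a parallel list of m/z keys.
        self.features = sorted(features, key=lambda f: f[0])
        self.keys = [f[0] for f in self.features]

    def find_first_index_at_or_above(self, min_mz: float) -> int:
        # Returns the insertion point when there is no exact match.
        return bisect_left(self.keys, min_mz)

    def in_range(self, min_mz: float, max_mz: float) -> Iterator[FeatureRow]:
        i = self.find_first_index_at_or_above(min_mz)
        # Walk forward and stop as soon as the upper bound is exceeded.
        while i < len(self.keys) and self.keys[i] <= max_mz:
            yield self.features[i]
            i += 1


store = SortedFeatures([(500.2, "b"), (400.1, "a"), (600.3, "d"), (500.25, "c")])
hits = list(store.in_range(500.0, 501.0))
```

The search costs O(log N) and the forward walk touches only the k features inside the window, which is where the O(log N + k) bound comes from.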

For a typical search (10k features × ~5k MS2 scans), this is ~400×
fewer per-feature comparisons than the previous linear scan; for
typical k ≈ 1-3 features per isolation window, the constant-factor
improvement is much larger than that.

Behavior-preserving — all 18 FromFile tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Lift PR-touched files to ≥ 90% line coverage with essential tests

Baseline coverage on PR-touched files identified three classes below 90%:
- MzLibUtil.MzRtRange — 34.5% (new class, mostly untested properties / methods)
- MassSpectrometry.Deconvoluter — 83.7% (uncovered: null-range auto-upgrade
  branch, FromFile-throws-from-enum-switch defensive path, decoy-null throw
  in DeconvoluteWithDecoys)
- Readers.FromFileDeconvolutionParameters — 83.9% (uncovered: null filePath
  throw, non-feature-file throw, ToDecoyParameters returns null)

Tests added (essential paths only, no fluff):

New file Test/TestMzRtRange.cs (15 tests):
- Two constructors with bounds verification.
- Derived properties (Minimum/Maximum/MeanMZ/MeanRt/Width pair) in one test.
- Contains: inside, mz-below, mz-above, rt-below, rt-above, inclusive boundaries.
- CompareTo: inside (0), mz-below (1), mz-above (-1), rt-below (1), rt-above (-1).
- ToString smoke check (both m/z and RT bounds appear in output).

Added to TestFromFileDeconvolution.cs (5 tests):
- Null filePath in ctor → ArgumentNullException.
- Non-feature file (.psmtsv) in ctor → MzLibException with diagnostic message.
- ToDecoyParameters → null (locks the "no decoy support" contract).
- Deconvoluter.Deconvolute(MsDataScan, FromFileParams, null) auto-upgrades using
  the scan's MassSpectrum.Range (covers the null-range branch of the ternary).
- DeconvoluteWithDecoys with FromFile params → InvalidOperationException (covers
  the decoy-null defensive path in DeconvoluteWithDecoys).
- A test-only DeconvolutionParameters stub that claims DeconvolutionType.FromFile
  but doesn't override CreateAlgorithm → Deconvoluter.CreateAlgorithm throws
  the explanatory MzLibException (covers the FromFile case in the enum switch
  defensive path).

Skipped (still ≥ 90% but borderline):
- MassSpectrometry.MsDataScan at 92.1% — uncovered lines are pre-existing
  code outside this PR's scope.

Result on PR-touched files:
- MzRtRange: 34.5 → 100% line coverage
- FromFileDeconvolutionParameters: 83.9 → 100%
- Deconvoluter: 83.7 → 100%
- All other PR-touched classes already ≥ 97%
- Zero PR-touched classes below 90%

Total: 40 FromFile+MzRtRange tests pass; full Test.csproj suite 3436/3436
pass (5 benchmarks skipped, 2 min 33 s on .NET 8).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* add new protease subtilisin

* feat: add subtilisin|p protease and unit test

Add subtilisin|p to the embedded proteases.tsv with full cleavage
specificity and proline-inhibition motifs (N[P]|, S[P]|, L[P]|,
K[P]|, I[P]|, D[P]|, Y[P]|, V[P]|, G[P]|, F[P]|, T[P]|, E[P]|,
Q[P]|, A[P]|, R[P]|).

Add TestSubtilisinP_DigestsCorrectlyAndRespectsProlineRestriction to
ProteinDigestionTests to verify:
- subtilisin|p is present in the embedded protease dictionary with
  CleavageSpecificity.Full
- All expected cleavage sites fire on a proline-free sequence (ANKTIDE)
- The K[P]| proline-inhibition rule is respected (AKPIDE keeps KP intact)
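
The two digestion checks above can be reproduced with a toy digester. This Python sketch assumes each motif of the form X[P]| means "cleave after X unless the next residue is P" and produces only zero-missed-cleavage fragments; `digest` and `CLEAVE_AFTER` are hypothetical names, not mzLib's digestion API.

```python
from typing import List

# Residues taken from the motif list in this commit message.
CLEAVE_AFTER = set("NSLKIDYVGFTEQAR")


def digest(sequence: str, cleave_after=CLEAVE_AFTER) -> List[str]:
    """Split after each listed residue unless the following residue is proline."""
    pieces = []
    start = 0
    for i in range(len(sequence) - 1):
        if sequence[i] in cleave_after and sequence[i + 1] != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    pieces.append(sequence[start:])  # trailing fragment up to the C-terminus
    return pieces


proline_free = digest("ANKTIDE")
with_proline = digest("AKPIDE")
```

In the second sequence the K is followed by P, so no cut is made there and "KP" stays inside one fragment.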

* correct protease name: use capital P

---------

Co-authored-by: Nic Bollis <nbollis@comcast.net>
* separate deconvolution Score and GenericScore

* generic deconvolution score accessibility

* internal code review

* eliminate some ai noise

* chore: revert cosmetic-only changes from PR #1054

  Undo whitespace and formatting churn that crept into the deconvolution
  scoring PR. Restores trailing EOF newlines, blank-line removal, and
  aligned-column padding around `=` and named-argument `:` to match master.
  No functional impact.

  Addresses 11 cosmetic findings from the PR #1054 review.

* test(deconvolution): plug coverage gaps from PR #1054 review

  - Add happy-path + cache-hit tests for the GetOrComputeGenericScore
    (env, DeconvolutionParameters) overload and assert delegation parity
    with the AverageResidue overload.
  - Pin GetOrComputeGenericScore_NotYetSet_ComputesAndStashes to
    DeconvolutionScorer.ScoreEnvelope's direct output via a parallel
    envelope, locking in the documented cache-miss contract.
  - Tighten D1/D3 perfect-Averagine score assertions from Is.InRange(0,1)
    to Is.GreaterThan(0.5), matching the threshold used by B1 and the
    negative-charge equivalent.
  - Lower off-by-one cosine threshold from <0.85 to <0.7 to widen the
    gap against the positive-case >0.85 baseline.

  Addresses 4 TestCoverage findings from the PR #1054 review.

* fix issues from merge

* tighten DeconvolutionScorer test assertions and contracts

  - TestScoreEnvelopesNullHandling: drop .ToList() from the null-collection
    and null-model lambdas so the tests only pass on eager throws (at the
    call site, before MoveNext). Pins the wrapper+iterator split against a
    silent regression to a single iterator method.

  - TestDeconvolutionScorerUnit.A5: tighten the all-peaks-50ppm-shifted
    cosine assertion from < 0.5 to == 0.0 within 1e-9. The 10 ppm matching
    window guarantees an all-zero observed vector; any partial leakage of
    out-of-window intensity must fail.

  - TestDeconvolutionScorerUnit.A1: tighten the perfect-envelope
    IntensityRatioConsistency assertion from >= 0.90 to == 1.0 within 1e-6.
    BuildPerfectEnvelope's uniform per-peak scaling makes CV exactly 0, so
    the feature is mathematically pinned at 1.0.

  - TestIsotopicEnvelopeExtensions: rename namespace
    Test.Deconvolution -> Test.MassSpectrometryTests.Deconvolution to match
    the four sibling fixtures in the same directory. No behavioral change.

  - TestIsotopicEnvelopeExtensions.GetOrComputeGenericScore_NullEnvelope_Throws:
    add a second assertion covering the (envelope, DeconvolutionParameters)
    overload (the common downstream call shape in MetaMorpheus). Mirrors
    the symmetry already used by GetOrComputeGenericScore_NullModel_Throws.

* unify scoring under Deconvolute via UseGenericScore flag

  Restore the single-entrance-point design that Deconvoluter was built around.
  The PR previously added DeconvoluteWithGenericScoring as a parallel entry
  point; per @nbollis's review feedback, all behavioral variation should
  flow through the parameters object instead.

  - Add UseGenericScore (default false) to DeconvolutionParameters. When
    true, Deconvolute runs an additional per-envelope scoring pass that sets
    IsotopicEnvelope.GenericScore (the algorithm-specific Score is unchanged).
  - Fold the post-pass scoring into Deconvolute(MzSpectrum, ...) using a
    separate iterator helper (AddGenericScoring) so the NeutralMassSpectrum
    early-return at the top stays eager rather than getting deferred to
    first MoveNext.
  - Remove both DeconvoluteWithGenericScoring overloads. Caller migration:
    envelope = Deconvoluter.DeconvoluteWithGenericScoring(spec, p)
      -> p.UseGenericScore = true; envelope = Deconvoluter.Deconvolute(spec, p)
  - Update D-group integration tests in TestDeconvolutionScorerUnit to use
    the flag and the unified Deconvolute call.
  - Update the doc-comment cref in IsotopicEnvelopeExtensions to point at
    the new path.
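
  The flag-driven design can be sketched as one entry point plus a private generator helper. This is a Python illustration under stated assumptions: `Envelope`, `Params`, `deconvolute`, and `_add_generic_scoring` are hypothetical names, and the eager spectrum check stands in for the early-return mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class Envelope:
    score: float                           # algorithm-specific score, left untouched
    generic_score: Optional[float] = None  # filled only when the flag is set


@dataclass
class Params:
    use_generic_score: bool = False


def deconvolute(spectrum, params: Params,
                algorithm: Callable, scorer: Callable) -> Iterable[Envelope]:
    # Not a generator, so this check runs at the call site.
    if spectrum is None:
        raise ValueError("spectrum")
    envelopes = algorithm(spectrum)
    if not params.use_generic_score:
        return envelopes
    return _add_generic_scoring(envelopes, scorer)


def _add_generic_scoring(envelopes: Iterable[Envelope],
                         scorer: Callable) -> Iterator[Envelope]:
    # Lazy post-pass: sets generic_score and leaves score as produced.
    for env in envelopes:
        env.generic_score = scorer(env)
        yield env
```

  Callers switch behaviour by setting the flag on the parameters object rather than by choosing a different method.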
* add new protease subtilisin

* feat: add subtilisin|p protease and unit test

Add subtilisin|p to the embedded proteases.tsv with full cleavage
specificity and proline-inhibition motifs (N[P]|, S[P]|, L[P]|,
K[P]|, I[P]|, D[P]|, Y[P]|, V[P]|, G[P]|, F[P]|, T[P]|, E[P]|,
Q[P]|, A[P]|, R[P]|).

Add TestSubtilisinP_DigestsCorrectlyAndRespectsProlineRestriction to
ProteinDigestionTests to verify:
- subtilisin|p is present in the embedded protease dictionary with
  CleavageSpecificity.Full
- All expected cleavage sites fire on a proline-free sequence (ANKTIDE)
- The K[P]| proline-inhibition rule is respected (AKPIDE keeps KP intact)

* Accept newer-TopFD column names in Ms1Feature

  PR 1064 ships an Ms1Feature reader that recognises only the older
  FlashDeconv / TopFD-v1.6.2 _ms1.feature schema (Sample_ID, ID,
  Time_begin, Time_end, Minimum_charge_state, Maximum_charge_state,
  Minimum_fraction_id, Maximum_fraction_id). Newer TopFD output keeps
  the same _ms1.feature extension but uses different column names
  (File_name, Fraction_ID, Feature_ID, Min_time, Max_time, Min_charge,
  Max_charge), plus extras like Envelope_num and EC_score. Format
  detection picks Ms1FeatureFile by extension; CsvHelper then throws
  because none of the expected [Name(...)] columns are present.

  Discovered while integrating PR 1064's FromFileDeconvolutionParameters
  into MetaMorpheus and pointing it at a real TopFD .ms1.feature from a
  top-down yeast run -- the file parsed by hand looks identical in shape
  to the old schema, just relabelled.

  Fix is column-name aliases on the existing record, plus [Optional] on
  fields that the newer schema omits entirely. Downstream
  GetSingleChargeFeatures() reads only Mass, ChargeStateMin/Max,
  RetentionTimeBegin/End, and IntensityApex -- all aliased to a column
  present in both schemas, so the join algorithm behaves identically
  regardless of which producer wrote the file.

  Per-field summary:
    Sample_ID         -> [Optional]            (newer TopFD has File_name
                                                instead; type-incompatible
                                                -- int vs string path --
                                                and not used downstream)
    ID                -> alias "Feature_ID"    + [Optional]
    Time_begin        -> alias "Min_time"
    Time_end          -> alias "Max_time"
    Minimum_charge_state -> alias "Min_charge"
    Maximum_charge_state -> alias "Max_charge"
    Minimum_fraction_id -> [Optional]          (newer TopFD has a single
                                                Fraction_ID column, not a
                                                min/max pair; not used
                                                downstream so alias would
                                                add no value)
    Maximum_fraction_id -> [Optional]

  No tests added in this commit -- a follow-up should drop a newer-TopFD
  _ms1.feature sample into Test/FileReadingTests/ExternalFileTypes/ and
  extend the existing Ms1FeatureFile read-roundtrip tests to cover both
  schemas. The current
    Ms1Feature_FlashDeconvOpenMs3.0.0_ms1.feature
    Ms1Feature_TopFDv1.6.2_ms1.feature
  fixtures keep passing because the existing [Name(...)] heads remain
  the first entry in every alias list.
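
The alias-plus-optional idea described above can be sketched outside CsvHelper. This is a minimal Python illustration, not the mzLib C# code: the `ALIASES` and `OPTIONAL` tables and the `resolve_columns` helper are hypothetical names, filled in from the per-field summary in this commit message.

```python
# Hypothetical alias table: older-schema name first, then the newer TopFD name.
ALIASES = {
    "ID": ["ID", "Feature_ID"],
    "Time_begin": ["Time_begin", "Min_time"],
    "Time_end": ["Time_end", "Max_time"],
    "Minimum_charge_state": ["Minimum_charge_state", "Min_charge"],
    "Maximum_charge_state": ["Maximum_charge_state", "Max_charge"],
}
# Fields the newer schema may omit entirely (assumed from the summary above).
OPTIONAL = {"Sample_ID", "ID", "Minimum_fraction_id", "Maximum_fraction_id"}


def resolve_columns(header):
    """Map each record field to the header column that supplies it.

    Optional fields that are absent map to None; a missing required
    field raises KeyError.
    """
    present = set(header)
    mapping = {}
    for field in sorted(set(ALIASES) | OPTIONAL):
        candidates = ALIASES.get(field, [field])
        # The first alias present in the header wins, so older files keep
        # resolving to their original column names.
        match = next((name for name in candidates if name in present), None)
        if match is None and field not in OPTIONAL:
            raise KeyError(f"no column found for required field {field!r}")
        mapping[field] = match
    return mapping
```

Because the older name is listed first for every field, a header written by either producer resolves to the same record fields, which is the property the existing fixtures rely on.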

* Drops a 4-row fixture (Ms1Feature_TopFDvLatest_ms1.feature) captured
   from real TopFD output that uses the File_name / Fraction_ID /
   Feature_ID / Min_time / Max_time / Min_charge / Max_charge schema, and
   wires it into the existing TestMsFeature parameterised tests:

     * TestFeaturesLoadAndCountIsCorrect gains a TestCase asserting the
       fixture loads four features end-to-end via FileReader.
     * TestTopFDLatestMs1FeatureFirstAndLastAreCorrect locks the per-
       field mapping for both the aliased columns (Time_begin/Min_time,
       Minimum_charge_state/Min_charge, etc.) and the [Optional] fields
       absent from the newer schema (SampleId, FractionIdMin/Max default
       to 0). Covers the single-charge edge case (Min_charge == Max_charge
       == 1) in the last record.
     * TestTopFDLatestMs1GetSingleChargeFeatureFunctions confirms charge-
       range expansion is identical across the two TopFD schemas: a 6-14
       range yields 9 envelopes; a 1-1 range yields exactly one;
       GetMs1Features() flattens to 9 + 12 + 10 + 1 = 32 across the four
       fixture features.
     * TestMs1FeatureReadWrite gains the new fixture as a TestCase. The
       writer emits the older-schema headers (newer columns aren't on
       the record), so the round-trip converts schema-newer -> schema-
       older + Optional defaults. Comment in the test explains why that
       is correct: every field downstream consumers actually read
       survives the round-trip; the columns that don't are exactly the
       ones marked [Optional] and unused.

   All 17 TestMsFeature tests pass, as do the 148 tests covering the
   related Ms1Feature / FromFile / SupportedFileExtensions / DinosaurTsv
   surface area.
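
The charge-range arithmetic the tests assert can be checked with a short sketch. This is an illustrative Python version under the assumption that each feature expands to one entry per charge state in its inclusive range; `expand_charges` and `total_single_charge_features` are hypothetical helpers, not mzLib methods.

```python
def expand_charges(min_charge, max_charge):
    """Return every charge state in the inclusive range [min_charge, max_charge]."""
    if min_charge > max_charge:
        raise ValueError("min_charge must not exceed max_charge")
    return list(range(min_charge, max_charge + 1))


def total_single_charge_features(charge_ranges):
    """Count the single-charge entries produced by a list of (min, max) ranges."""
    return sum(len(expand_charges(lo, hi)) for lo, hi in charge_ranges)
```

Under that assumption a 6-14 range gives 14 - 6 + 1 = 9 entries and a 1-1 range gives exactly one, matching the counts quoted above; the per-feature counts 9, 12, 10 and 1 then sum to the stated 32.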

* Fix formatting of subtilisin entry in proteases.tsv

* PR #1067's scope is Ms1Feature TopFD column-name acceptance. Two files
  had unrelated drift from the merged subtilisin PR (#1061) sitting in
  origin/master that did not belong in this PR's diff. Restore both to
  upstream/master so the PR shows only the Ms1Feature changes.

  - proteases.tsv: restore the trailing newline that was stripped.
  - ProteinDigestionTests.cs: revert `subtilisin|p` -> `subtilisin|P`
    (lowercase didn't match the dictionary key in the tsv) and restore
    the LoadProteaseDictionary_SubtilisinP_HasProlineRestrictedMotifs
    test that was deleted.

  No functional change.

* specify TopFD version 1.7.0 instead of "Latest" for the ms1.feature fixture

Addresses @nbollis's CHANGES_REQUESTED review on PR #1067 ("Latest is
not a version"). The newer-TopFD _ms1.feature fixture and its tests
were labelled "Latest" rather than a concrete release.

- Rename fixture Ms1Feature_TopFDvLatest_ms1.feature ->
  Ms1Feature_TopFDv1.7.0_ms1.feature. TopFD 1.7.0 (TopPIC suite,
  Dec 2023) introduced the ML feature-detection rework whose ECScore
  is the new schema's EC_score column.
- TestMsFeature.cs: update the four fixture-path references, rename
  the two TestTopFDLatest* methods to TestTopFDv1_7_0*, and anchor the
  doc comment to v1.7.0.
- Test.csproj: repoint the CopyToOutputDirectory entry.
- SupportedVersions.txt: TopFD VersionTested is now "1.6.2, 1.7.0".

Build clean; TestMsFeature suite 17/17 passing.

---------

Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
#1066)

* almost there. No test

* made tests pass

* PR Cleanup

* revise logic

* review response

---------

Co-authored-by: trishorts <mshort@chem.wisc.edu>
* initial commit

* Testing

* pr response

---------

Co-authored-by: trishorts <mshort@chem.wisc.edu>
* separate deconvolution Score and GenericScore

* generic deconvolution score accessibility

* internal code review

* eliminate some ai noise

* spectrum aware generic deconvolution score

* fix(test): rename Test.MassSpectrometry.Deconvolution to avoid prod-namespace shadowing

  Two test files in Test/MassSpectrometryTests/Deconvolution/ declared
  namespace Test.MassSpectrometry.Deconvolution, which shadowed the
  production-library MassSpectrometry namespace from inside the test
  project. C#'s relative-namespace resolution then mis-resolved
  "MassSpectrometry.DissociationType" (and similar references) to
  Test.MassSpectrometry.DissociationType -- a type that doesn't exist --
  causing 18 cascading CS0234 errors in unrelated files (TestDatabaseLoaders,
  BackboneFragmentation, FragmentModificationTests, etc.).

  Rename to Test.MassSpectrometryTests.Deconvolution to match the surrounding
  files in the same directory and the directory name itself. Update the four
  "using static" references that pointed at the old namespace.

  Files changed:
  - DeconvolutionTestHelpers.cs (namespace)
  - TestDeconvolutionScorerSpectrumAware.cs (namespace + using static)
  - TestDeconvolutionScorerNegativeCharge.cs (using static)
  - TestDeconvolutionScorerUnit.cs (using static)
  - TestIsotopicEnvelopeExtensions.cs (using static)

---------

Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
Co-authored-by: Nic Bollis <nbollis@comcast.net>
* Lib: Equality

* Similarity Base Extensions

* MsDataScan

* Library Inheritance and construction

* Similarity: More Entrance Points

* Similarity Extension Methods

* more tests

* address review comments

---------

Co-authored-by: trishorts <mshort@chem.wisc.edu>
…na RetentionTimeModel (#1046)

* correct Within calculation

* update unit tests

* this is the spot

* new retention time interface works with chronologer and koina prosit

* unbroken

* play nicely with others

* update tests that call koina http

* retention time model tests

* Tighten the new IRetentionTimePredictor batch interface and the CI
  that validates it.

  CI / test categorization:
  - Add an Integration test step to the workflow's integration job; the
    build job's Category!=Integration filter was silently dropping all
    six mzLib [Category("Integration")] test files from CI.
  - Switch filter property name TestCategory -> Category (canonical
    NUnit-friendly form, stable across NUnit3TestAdapter versions).
  - Replace [Explicit] with [Category("Integration")] on the two
    Prosit2019iRT live-API batch tests so they run under the new step.

  Behavior fixes:
  - PredictRetentionTime (single) now distinguishes IncompatibleModifications
    (input rejected pre-flight) from PredictionError (model/HTTP failure)
    by inspecting the matching PeptideRTPrediction.Warning.
  - Validate peptide.FullSequence in addition to BaseSequence; the
    dictionary path is keyed by FullSequence, and a null key would
    otherwise throw from the Dictionary lookup.
  - Skip null elements in the Koina batch and the interface default's
    foreach loop, instead of NRE-ing outside the documented
    ArgumentNullException contract.

  Documentation:
  - Clarify that RetentionTimeModel is not thread-safe and that the
    batch method does not cache results across calls.
  - Document the null-element-skip behavior on the interface default.

  Test isolation:
  - Lazy-init the Chronologer in RetentionTimePredictorInterfaceTests so
    TorchSharp model-load failures only affect the three Chronologer-
    using tests, not the mock-only fixture members.

* fix(predictors): implement new IRetentionTimePredictor members on Koina

  The IRetentionTimePredictor interface added three members on this branch
  (PredictRetentionTimeEquivalent, PredictRetentionTimeEquivalents, and
  the IDisposable.Dispose inherited from the now-IDisposable interface),
  but the Koina-side RetentionTimeModel implementation hadn't caught up,
  breaking the build at the class declaration with three CS0535 errors.

  Stubs delegate through to existing behavior:
  - PredictRetentionTimeEquivalent forwards to PredictRetentionTime; for a
    Koina RT model the predicted value already IS the iRT/equivalent.
  - PredictRetentionTimeEquivalents routes through the existing batch
    PredictRetentionTimes (single Koina HTTP call) and reshapes into the
    tuple list the interface specifies. The maxThreads parameter is
    accepted for compatibility but ignored -- Koina is one HTTP request
    regardless of thread count.
  - Dispose is a virtual no-op. KoinaModelBase doesn't hold unmanaged
    resources today; subclasses with real disposable state can override.

  Same three-stub pattern applied to MockPredictor in
  RetentionTimePredictorInterfaceTests so the test fixture compiles.

* fix(retention-time): tighten Koina IRetentionTimePredictor implementation

  Address review findings on the unified IRetentionTimePredictor contract
  and its Koina implementation:

  - Filter null peptides and null/empty FullSequence in batch input before
    DistinctBy, so callers can pass mixed lists without an NRE.
  - Clear Predictions at the top of every PredictRetentionTimes call so
    prior-call data can't leak into the singular method's failure-reason
    fallback after a transient HTTP failure.
  - Remove the outer try/catch in PredictRetentionTimes; swallow-everything
    was hiding programmer errors. Predict is the only call that legitimately
    needs the inner catch and that one stays.
  - Map a null peptide on the singular method to EmptySequence (not
    PredictionError); rename the matching test.
  - Normalize NaN and +/-Infinity from the model to null so the documented
    "null means prediction was not possible" contract holds end-to-end.
  - Make the default IRetentionTimePredictor.PredictRetentionTimes impl
    skip null/empty-FullSequence peptides, mirroring the Koina override's
    defensive contract.
  - Restore multi-line XML doc form on SeparationType.

  Tests:
  - Add Prosit2019iRT_PredictRetentionTimes_ModifiedPeptide_ReturnsKeyedByOriginalFullSequence
    to pin the dictionary-keying contract end-to-end.
  - Tighten the null-input default-impl test (assert ParamName) and add a
    PredictRetentionTimes_DefaultImpl_NullElement_IsSkipped case.
  - Drop the now-redundant NullResultEntries_FailureReasonNotAvailable test.
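
The NaN/Infinity normalization in this commit amounts to a small guard. The sketch below is an illustrative Python equivalent, with Python's `None` standing in for a C# null `double?`; `normalize_prediction` is a hypothetical name, not the mzLib method.

```python
import math


def normalize_prediction(value):
    """Return None for missing or non-finite predictions, else the value as a float."""
    if value is None or not math.isfinite(value):
        return None
    return float(value)
```

Collapsing every non-finite value to the same "no prediction" marker means callers need only one check (is the result null?) rather than separate tests for NaN and both infinities.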

* cover RetentionTimeModel.PredictRetentionTimeEquivalents

  Add 3 NUnit tests using the existing TestableRetentionTimeModel stub
  harness so no Koina HTTP call is made:

  - PredictRetentionTimeEquivalents_NullPeptides_ThrowsArgumentNullException
    pins the input-null guard so a typo'd peptide list fails fast.
  - PredictRetentionTimeEquivalents_SuccessfulBatch_ReturnsTuplesKeyedByOriginalPeptides
    pins the happy-path reshape: one tuple per input, PredictedValue carries
    through from the batch result, original IRetentionPredictable
    round-trips through the .Peptide slot, FailureReason is null.
  - PredictRetentionTimeEquivalents_FailedPrediction_PreservesWarningAsIncompatibleMods
    pins the failure-reshape branch: when a batch result lacks a value AND
    the matching Predictions entry has a Warning, FailureReason is
    IncompatibleModifications (not the catch-all PredictionError) so callers
    can distinguish model/network failure from sequence-incompatibility.

  Also document the three intentionally-untested IRetentionTimePredictor
  members in the fixture's class doc: PredictRetentionTimeEquivalent
  (one-line delegation), Dispose (empty no-op default), and the multi-batch
  throttling delay inside AsyncThrottledPredictor.

* tighten null-peptide and empty-sequence contracts

  - PredictRetentionTimeEquivalents: stop silently dropping peptides with
    empty/null FullSequence. Null-peptide references are still dropped
    (no identity worth tracking), but a real-but-malformed peptide now
    comes back as a (null, peptide, EmptySequence) tuple so callers
    index-aligning result-to-input get a 1:1 mapping for everything they
    passed.

  - GetFormattedSequence: change null-peptide reason from PredictionError
    to EmptySequence so callers branching on FailureReason see the same
    value as PredictRetentionTime's null-peptide handling.

  - PredictRetentionTimeEquivalents fallback: drop the dead-defensive
    "Predictions?." -- PredictRetentionTimes (called immediately above)
    always re-initializes Predictions to a non-null list, so the null
    guard hides the invariant. Matches the unconditional access pattern
    in PredictRetentionTime.

  - IRetentionTimePredictor.PredictRetentionTimes (default impl): narrow
    the input-skip from "FullSequence is null OR empty" to "null only".
    An empty FullSequence is a legal dictionary key; let it through so
    PredictRetentionTime returns null with EmptySequence and the caller
    sees results[""] = null instead of a silent drop. Comment now
    explains the ArgumentNullException rationale precisely.

  Test update: GetFormattedSequence_NullPeptide_ReturnsNullWithPredictionError
  renamed and re-asserted against EmptySequence (the test's name itself
  encoded the old behavior).

* mark legacy PredictRetentionTime[s] obsolete on IRetentionTimePredictor

  Per @nbollis's PR #1046 feedback: the singular and batch
  PredictRetentionTime[s] pair was a duplicate entry point alongside
  PredictRetentionTimeEquivalent[s]. Mark the legacy pair [Obsolete] on
  the interface and on the Koina RetentionTimeModel, and invert ownership
  in the Koina implementation so the Equivalent[s] pair holds the real
  HTTP-batch logic — matching the pattern already in place on the
  RetentionTimePredictor abstract base.

  Interface (IRetentionTimePredictor):
  - Mark PredictRetentionTime and PredictRetentionTimes [Obsolete] with
    a message pointing callers at the Equivalent[s] pair.
  - The default PredictRetentionTimes implementation now routes through
    PredictRetentionTimeEquivalent so the default itself does not
    self-trigger CS0618.

  Koina RetentionTimeModel:
  - PredictRetentionTimeEquivalents now owns the Koina batch HTTP call:
    DistinctBy dedup, Predictions clear, Predict() invocation, NaN/Inf
    normalization, defensive null-fill, exception swallow, and the
    IncompatibleModifications vs. PredictionError failure-reason fallback.
  - PredictRetentionTimeEquivalent now owns the singular flow: validates
    the peptide and routes through PredictRetentionTimeEquivalents with a
    list-of-one.
  - PredictRetentionTime is now a thin [Obsolete] shim delegating to
    PredictRetentionTimeEquivalent.
  - PredictRetentionTimes is now a thin [Obsolete] shim that calls
    PredictRetentionTimeEquivalents and reshapes the tuple list into the
    IReadOnlyDictionary<string, double?> the legacy contract returned.
    TryAdd preserves first-write-wins on duplicate FullSequence keys.
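
The reshape from the tuple list into the legacy dictionary, with first-write-wins on duplicate keys, can be sketched as follows. This is a Python illustration only: the tuple layout `(predicted_value, full_sequence, failure_reason)` is a simplifying assumption (the real tuples carry the peptide object), and `dict.setdefault` stands in for the C# `TryAdd` call.

```python
def to_legacy_dict(results):
    """Reshape (predicted_value, full_sequence, failure_reason) tuples into a dict.

    The first tuple seen for a given sequence wins; later duplicates are ignored.
    """
    legacy = {}
    for predicted, full_sequence, _reason in results:
        # setdefault only writes when the key is absent, so an earlier entry
        # (even one whose value is None) is never overwritten.
        legacy.setdefault(full_sequence, predicted)
    return legacy
```

For example, two tuples for the same sequence with values 1.0 and 2.0 would produce a single entry holding 1.0.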

  Tests:
  - RetentionTimeModelTests and RetentionTimePredictorInterfaceTests
    gain a file-level `#pragma warning disable CS0618` with a comment
    noting the legacy surface is intentionally exercised for
    backward-compat coverage. No tests were removed or rewritten.

  Public API behavior is unchanged: the obsolete methods still work and
  produce the same results; new callers should prefer the Equivalent[s] pair.

* build Test project in Release before integration test step

The integration job ran `dotnet test --no-build ./Test/Test.csproj` but
only built the mzLib solution, which does not include Test.csproj. Test.dll
was never produced in bin/Release, so VSTest failed with "test source file
not found". Add an explicit Build (Test) step, mirroring the build job.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(rt): drop obsolete PredictRetentionTime pair from interface

The [Obsolete] PredictRetentionTime / PredictRetentionTimes pair on
IRetentionTimePredictor was a backward-compat shim during the migration
to PredictRetentionTimeEquivalent / PredictRetentionTimeEquivalents.
Consumers are migrated, so the legacy surface is removed.

Cascade:
- Drop the [Obsolete] shims in RetentionTimePredictor and
  Koina RetentionTimeModel.
- Delete RetentionTimePredictorInterfaceTests.cs, which was dedicated to
  exercising the obsolete pair.
- Strip the obsolete-only test sections from RetentionTimeModelTests and
  drop the file-level CS0618 pragma.
- Refresh the class diagram and a couple of stale doc references.

Also rename the four Koina test fixtures' NUnit category from
[Integration] to [Koina] and update the workflow job (filter and
job name) accordingly, so the category no longer collides with
unrelated "Integration" usage on the development branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(koina): cover RetentionTimeModel single + batch failure branches

Add 6 NUnit tests against TestableRetentionTimeModel to replace the
coverage lost when the obsolete PredictRetentionTime / PredictRetentionTimes
pair was removed.

PredictRetentionTimeEquivalent:
  - null peptide -> null + EmptySequence
  - empty FullSequence -> null + EmptySequence
  - happy path -> value flows through the list-of-one batch delegation

PredictRetentionTimeEquivalents (uncovered branches):
  - predictor throws -> all valid peptides surface as PredictionError
    (complement of the existing warning -> IncompatibleModifications test)
  - predictor returns fewer results than sent -> missing entries filled
    as PredictionError tuples instead of being dropped
  - empty-FullSequence peptide -> preserved as EmptySequence tuple in
    the output, distinct from the model-side PredictionError branch

Lifts RetentionTimeModel.cs from 86.8% to 98.7% line coverage.
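
The batch failure branches these tests pin can be sketched roughly as below. This is a simplified Python illustration under stated assumptions, not the Koina implementation: `Peptide`, `predict_equivalents`, and the `predict` callable (taking a list of sequences and returning a dict of sequence to value) are hypothetical, and the warning-driven IncompatibleModifications branch is left out for brevity.

```python
from collections import namedtuple

# Hypothetical stand-in for a peptide with a FullSequence property.
Peptide = namedtuple("Peptide", "full_sequence")


def predict_equivalents(peptides, predict):
    """Return one (value, peptide, failure_reason) tuple per non-null input peptide."""
    if peptides is None:
        raise ValueError("peptides must not be None")
    kept = [p for p in peptides if p is not None]   # null references are dropped
    valid = [p for p in kept if p.full_sequence]    # empty sequences are not sent
    unique = list(dict.fromkeys(p.full_sequence for p in valid))  # ordered dedup
    try:
        predicted = predict(unique) if unique else {}
    except Exception:
        # Model or transport failure: every valid peptide ends up as PredictionError.
        predicted = {}
    results = []
    for p in kept:
        if not p.full_sequence:
            results.append((None, p, "EmptySequence"))
            continue
        value = predicted.get(p.full_sequence)  # a missing entry is filled, not dropped
        reason = None if value is not None else "PredictionError"
        results.append((value, p, reason))
    return results
```

The point of the sketch is the 1:1 shape of the output: every non-null input comes back as a tuple, so a caller can tell an empty sequence, a failed prediction, and a successful one apart without re-matching inputs to results.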

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Rename job and simplify workflow steps

Renamed job 'koina' to 'integration' and removed test steps for mzLib Koina.

---------

Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Nic Bollis <nbollis@comcast.net>
…1070)

* DeconParams Equality

* added to average residue

* Address PR #1070 review: null-safe Equals, value-based MzWindow hash, Clone OriginalFilePath, document equality contract