Update instrumentControl branch #4
Open
avcarr2 wants to merge 285 commits into
* correct Within calculation * update unit tests * this is the spot * readPrecursorScanNumber * return reported scan and precursor scan numbers when reading mzml * unit test exception check for no scan number in mzml * no real change * delete empty line Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu> Co-authored-by: Lei Lu <lonelu@users.noreply.github.com>
* Edited normalization methods within SpectralSimilarity to remove side effects, made these methods public * Made normalization methods static
Co-authored-by: trishorts <mshort@chem.wisc.edu>
#667) * Added RtHypothesis and RtStdDev fields to ChromatographicPeak in FlashLFQ * Undid accidental formatting changes * Undid accidental comment changes * Added RtInterquartileRange field to ChromPeak and associated tests * added requested comments * Added comment * Enhanced information on fields * Rt information is now internal set Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Changed x64;AnyCPU to only AnyCPU in all project files
* One more change
* Nuget - 3. Me - 0
* Moved over all classes from the spec averaging codebase
* Implemented MRS Noise Estimation as a weighting type
* SpecAveragingExtensions now creates binned spectra to be used for downstream processing
* Implemented SpectraWeighting
* Combined both options classes
* Replaced all references to MzLibSpecAveragingOptions
* Made some changes to tests
* Added ignore case to spectra file handler
* Restructured Austin's objects to fit within the current structure and made all tests pass
* Removed normalization class and implemented ability to open mzml and raw regardless of case of extensions
* Tested Weighting, removed MRS Noise Estimation from SpectraMergingType
* Revised structure to eliminate redundant objects
* Removed noise and fixed tests
* Made final adjustments to tests and algorithms
* Removed extra methods and cleaned up a test where I was timing them
* Forgot to save a file on the last commit. Whoops
* Fixed an error
* Expanded test coverage
* Added single spectra and scan normalization methods and renamed a few enums to match the revised structure
* Added one more test
* Renamed project to SpectralAveraging and cleaned up the classes
* Revert "Added one more test" This reverts commit 3d32bca.
* Renamed the project to SpectralAveraging
* Updated nuspec to current version
* Changed line in tests.csproj per Alex
* Made it so that the Averaging modifies an Ms2 in place instead of constructing a new object
* Actually renamed the writer this time
* Cleaned up the code
* Parallelized Bin Averaging
Co-authored-by: Nic Bollis <nbollis@wisc.edu>
* Changed x64;AnyCPU to only AnyCPU in all project files * One more change * Nuget - 3. Me - 0 * Made it so that default deconvolution is the entire spectra, with an optional range parameter to get from a specific region of the spectra * Added some comments
* Added infrastructure for MRS noise estimation * Added basic tests for mrs noise estimation. Will add more complex tests once I see the code coverage results. * Increased test coverage a little more * Fixed failing build due to not updating a constructor. * removed BasicStatistics.CalculateMedian and substituted it with Array.Median() at all places CalculateMedian was called. Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Changed x64;AnyCPU to only AnyCPU in all project files * One more change * Nuget - 3. Me - 0 * Started initial structure * Revert "Started initial structure" This reverts commit f06cbc0. * Attempted to solve random test fails by restructuring the HTTP response
* correct Within calculation * update unit tests * this is the spot * added median polish test cases * all tests all day * unused code * test updates --------- Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
* weighted mean polish * more unit tests
* correct Within calculation * update unit tests * this is the spot * fixed with unit tests --------- Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
* Changed x64;AnyCPU to only AnyCPU in all project files * One more change * beginning refactoring * Removed ExtractIonChromatogram from MsDataFile * removed Deconvolute from MsDataFile * Updated ThermoRawFileReader * * fixing tests and code after overhauling the scan readers * * mgf tests are fully fixed * Completed test fixes. * Fixed remaining tests. * Deleted folders that were hanging around * Increased code coverage to >90%. * Removed class. Added a test. * Added more tests to Readers * Added more tests * Nuget - 3. Me - 0 * Deleted unused DynamicDataConnection from MassSpectrometry * beginning refactoring * Removed ExtractIonChromatogram from MsDataFile * removed Deconvolute from MsDataFile * Updated ThermoRawFileReader * * fixing tests and code after overhauling the scan readers * * mgf tests are fully fixed * Completed test fixes. * Fixed remaining tests. * Deleted folders that were hanging around * Increased code coverage to >90%. * Removed class. Added a test. * Added more tests to Readers * Added more tests * Deleted unused DynamicDataConnection from MassSpectrometry * Delete SummedMsDataFile.cs * Started implementing Agilent Reader * ASDF * adg * Reset to previous commit * Restructred Tests * removed some superfluous stuff from the core readers. * Removed IDataReader Interface * Renamed MsDataFileHelpes to MsDataFileExtensions Added extension method to write as mzml Removed GenericPeak and GenericMzSpectrum Restructed MsDataFile classes to order from public to private Added in static LoadAllStaticData() to ensure backwards compatibility * Consolidated file reading tests * Tested new export code extension * Fixed a non-build breaking error due to using internal in interface definition for IReaderFactory. Kinda odd tbh. * Added a few tests. Excluded SystemInfo.cs from code coverage because it's 1) impossible to create a test case that will work with the CI and developer machines and 2) it's stock Microsoft Code that shouldn't break. 
* * Commented out a few method that have been in the code base since 2018, but are unused and have no unit test coverage. Methods kept as comments for posterity. * Added two tests to FlashLFQ that covered a 0% test coverage file. * Removed x86 configuration --------- Co-authored-by: nbollis <96196865+nbollis@users.noreply.github.com> Co-authored-by: nbollis <nbollis@comcast.net>
* Changed x64;AnyCPU to only AnyCPU in all project files * One more change * Nuget - 3. Me - 0 * Started initial structure * Revert "Started initial structure" This reverts commit f06cbc0. * Changed the within in Old deconvolution tests as most abundant mass did not take into account the proton mass, making hte results off by a bit more than previous Added tests for deconvolution with good data * Changed how namespace was assigned in single test * Moved Deconvolution Test to a development project * comments * Uncommented lines in test fixture setup * Updates tests to have a tolerance limit * Fixed test I broke by IsotopicEnvelop change * Reverted changes to new branch to see if that fixes all the tests crashing on build * Reimplemented almost all changes from previous branch * Implemented the final changes from last pull request * Excluded a test from code coverage * Removed unused namespaces * Coin flip * DeconTestWithout Namespace Changes * Commit * Update SinglePeakDeconvolutionTestCase.cs * Update WholeSpectrumDeconvolutionTestCase.cs * Update SinglePeakDeconvolutionTestCase.cs * Update WholeSpectrumDeconvolutionTestCase.cs * Added a reference to the readers project --------- Co-authored-by: Nic Bollis <nbollis@wisc.edu> Co-authored-by: Alexander-Sol <41119316+Alexander-Sol@users.noreply.github.com> Co-authored-by: Anthony <anthony.cesnik@gmail.com>
* Added option to quantify ambiguous peptides in FlashLFQ * Reverted .nuspec * Removed Proteomics.AminoAcidPolymer ref in flashlfqresults
…e Develpment Dependency (#699) * Changed x64;AnyCPU to only AnyCPU in all project files * One more change * Nuget - 3. Me - 0 * Started initial structure * Revert "Started initial structure" This reverts commit f06cbc0. * Removed Test as a dependency of Development * Individual reader factores will not grow in the future, only more factories will be added. For this reason, they were combined into a single .cs file and all marked as sealed classes Uncommented out stuff in Variant Application as it was used in MetaMorpheus * ThermoRawFileReaderLiscence added to backwards support * Updated nuspec to be final 5.0.539 * Eliminated factory layer from Readers * forgot to save one file * Removed extra constructors and expanded test coverage * Expanded test coverage * Added SequenceVariant tests --------- Co-authored-by: Alex <AlexSolivais@gmail.com>
* Changed x64;AnyCPU to only AnyCPU in all project files
* One more change
* Nuget - 3. Me - 0
* Fix crash caused by all values in a bin being rejected
* Let's try that again
* Changed Name of DeconvolutionTypes to DeconvolutionType to fit convention
* Added method to generate an IEnumerable of SpectralAveragingParameters
* Fixed winsorized and averaged sigma clipping rejection methods and their relevant tests.
* Eliminated all of our own versions of median and standard deviation and went with those in Mathnet.Numerics
* Added enums as parameters to GenerateSpectralAveragingParameters method
* Moved Noise Estimation to MzLibUtil so it can be accessed by other projects
* Adjusted averaged spectra writer to be able to output multiple averaged spectra from the same original file. Added additional normalization extension methods
* Commented Magic Numbers
* Expanded test coverage, fixed an output bug, enabled output to custom locations with custom names, removed outputting at csv
* Change MrsNoiseEstimation namespace to NoiseEstimation
* Updated namespace reference
* Update mzLib.nuspec
* SR changes
* Updated variable names in Averaging MzBinning Algorithm
* Resolved Merge Conflict
* Added comment for magic number.
---------
Co-authored-by: avcarr2 <64652734+avcarr2@users.noreply.github.com>
Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Added Bruker data handling files to the most updated version of master. * Renamed Bruker classes to BrukerFileReader * Added tests for loading Bruker data * Added .dll required to run Bruker sourced methods * Added test data references. * Added all test data files from the requisite .d directories. * Added documentation. * Added more documentation. * Added dependencies of baf2sql_c.dll * Added more unit tests * Anonymized lockinfo files. * Changed namespace in from IO.* to Readers for the Bruker file readers.
* correct Within calculation * update unit tests * this is the spot * add space * unbroke * now with fastaGz * wrong name --------- Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
…Engine (#714) Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Changed x64;AnyCPU to only AnyCPU in all project files * One more change * Nuget - 3. Me - 0 * Started initial structure * Revert "Started initial structure" This reverts commit f06cbc0. * Combined commits to trick github after file size reduction * Fixed broken test and class diagram * Expanded Test Coverage Added IEquatable<MzSpectrum> to MzSpectrum * Eliminated IResult interface * Added exclude from code coverage to test files * One more exclude form code coverage * Used only filepath to generate hash of resultfile * Added a few new comments * Unified some variable names and reduced test file size * Changed test file names to show version * Unified variable names * Update TestMsFeature.cs * Updated Supported File Types to be cleaner * Adjusted Test file names to have file type before software and version --------- Co-authored-by: Alexander-Sol <41119316+Alexander-Sol@users.noreply.github.com>
* Created an extensions for binary array searching + associated tests * Deleted unused lines * Implemented new search extensions across mzLib, edited tests * Added summary comments
…ons (#722) * Added extension method to check valid base sequences to AminoAcidPolymerExtensions * Changed name
* correct Within calculation * update unit tests * this is the spot * add space * works for base sequence but dont know about full sequence * test flashlfq silent * test flashlfq bayesian * not quite there * works yo * works works * mo better explanation yo * style baby --------- Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
* Fixed bug in GetClosestIndex where arrays of length 1 would break closest search * Added tests and comments --------- Co-authored-by: Edwin Laboy <63374885+elaboy@users.noreply.github.com> Co-authored-by: trishorts <mshort@chem.wisc.edu>
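The length-1 edge case mentioned for GetClosestIndex can be illustrated with a small sketch. This is not the mzLib implementation; `closest_index` is a hypothetical Python helper built on the standard-library `bisect` module, written only to show how a nearest-element search can stay well defined for empty and single-element arrays.

```python
from bisect import bisect_left


def closest_index(sorted_values, target):
    """Return the index of the element nearest to target, or -1 if empty.

    Hypothetical helper: assumes sorted_values is sorted ascending.
    """
    n = len(sorted_values)
    if n == 0:
        return -1
    # First position whose value is >= target.
    pos = bisect_left(sorted_values, target)
    if pos == 0:
        return 0          # target is at or below the first element
    if pos == n:
        return n - 1      # target is above the last element
    before = sorted_values[pos - 1]
    after = sorted_values[pos]
    # Ties go to the lower index.
    return pos - 1 if target - before <= after - target else pos
```

With a single-element array, `bisect_left` can only return 0 or 1, and both values are handled by the boundary branches, so no neighbour lookup is attempted.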
* Added support for negative mode deconvolution * Changed name of ExampleNewDeconvolution to include the word Template --------- Co-authored-by: trishorts <mshort@chem.wisc.edu>
* correct Within calculation * update unit tests * this is the spot * add space * comment update UniProt URL where column names for downloads can be found --------- Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu> Co-authored-by: Anthony <anthony.cesnik@gmail.com>
* correct Within calculation * update unit tests * this is the spot * add space * added comment explaining the semicolon in the tostring of proteingroups for quant * space the final frontier --------- Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
* Reduces complexity of GetOneBasedScan from O(logN) to constant * Removed excess changes * Added a comment and removed extra line --------- Co-authored-by: Edwin Laboy <63374885+elaboy@users.noreply.github.com> Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Created SingletonDoublePool, edited PWSM to use new pool * Added new test to TestFragments * Improved benchmarking test * Changed PWSM to use a list pool for neutral losses * Deleted test with local file reference, deleted unused usings * went back to hashset, switched to threadlocal implementation * Reverted changes to listpool * Reverted unintended changes --------- Co-authored-by: Nic Bollis <nbollis@comcast.net>
* Added to reader
* Clean Up and fix tests
* Updated to optional method
* Made parser check for it in scan description
* changed where we get from
* updating batching for detectability model (low batch size) * MESSY BROKEN IN PROGRESS * still broken * almost there. fragment intensity model all up to date. Cant test bc other models are unbuildable. just gotta clean up and test. * just need tests now * step * All but intensity model tested. Everything compiles. One step left. * TESTS DONE * first benchmark test on 4M peptides/predictions. three to go * benchies got me acting up. All four done. * oops. new dynamic timing was broken for cases of no valid inputs (client timeout would set to 0). Now only create client and get requests if there are valid inputs. * small stuff * test expansion for coverage. Improvements to fragment intensity models' parameter validation. * Updated fragmentation model to have a parameter specifying what MZs (original input sequence or validated sequence) to record predictions for. Updated LibrarySpectrum generation to take into account fragment ion MZ choice. This will better map the MZs we will typically want when creating/comparing spectral libraries. * Made MapToValidatedFullSequence The default for hcd model to err on the side of prediction validity. --------- Co-authored-by: trishorts <mshort@chem.wisc.edu> Co-authored-by: Nic Bollis <nbollis@comcast.net>
* Added to reader * opencode gitignore * Sequence conversion base * Removed converter structure and outlined cannonical schematic structure * Schema and conversion warnings * Interface Design * Modification Lookup * Moved warnings * Base Class Adjustment: Cannical mod is better and schema is an abstract. * First implementaitons. * Design.md * Serializers and Parsers * gitignore * Start of testing * project restructure * ugh * Mass shift seq serializer base * Global mod lookup * Mood lookup refactor * Chronologer initial * revised chronologer * Refactor Serializers: ShouldResolveMod * Schema construction refactor * Added in unimod and essential sequence serializer * Tested Conversion Service with Essential Seq * Kiiona Initial Integration. Still failing TMT * Multiplex Mod mapping * Uniprot start * Lookup Refactor: SearchFromCandidatesOnly * ModLookup Structural refactor * Koina. Finally fix that shit * Lookup: Cumulative filtering * Uniprot revisions * Koina: Added ValidatedFullSequence to all prediction outputs * Uniprot conversion works, but is hardcoded for many things * Chronologer and Uniprot work * Sequence Converter * Koina conversion refactor * Koina conversion refactor * Sequence Converter * Uniprot serialization fix and peptide and protein extensions * Moved base classes and added no new namespaces file * Testing Command Round 1. * Testing Command Round 2 * Testing Command Round 3 * removed testflashlfq * lookup base Tests * Expanded test coverage * Revisered wiki * Removed uniprot
* Added to reader
* feat: add entrapment database support
Add IsEntrapment property to IBioPolymer interface and propagate through Protein,
NucleicAcid, and RNA classes. Entrapment proteins are synthetic decoys
designed to catch false identifications.
- Add entrapmentIdentifier parameter (default 'NTRAP') for auto-detection via
accession.StartsWith(entrapmentIdentifier) in ProteinDbLoader and RnaDbLoader
- Update Merge functions to use 5-tuple keys including IsEntrapment
- Generated decoys preserve IsEntrapment from source proteins
- BioPolymerGroup outputs 'D/C/E/T' (Entrapment checked first in priority)
- TSV results include IsEntrapment via DecoyContamTarget.Contains('E')
- Add test data file TwoHumanHistone_mimic_Retain3.fasta
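The detection and labelling rules listed above can be sketched roughly as follows. This is an illustrative Python sketch, not the mzLib code: `is_entrapment` and `classify` are hypothetical names, and the ordering of the decoy, contaminant, and target checks after the entrapment check is an assumption (the commit message only states that entrapment is checked first).

```python
def is_entrapment(accession, entrapment_identifier="NTRAP"):
    """Prefix-based auto-detection, mirroring accession.StartsWith(...)."""
    return accession.startswith(entrapment_identifier)


def classify(is_decoy, is_contaminant, is_entrap):
    """Return a one-letter label; entrapment takes priority.

    The order of the remaining checks is an assumption for illustration.
    """
    if is_entrap:
        return "E"
    if is_decoy:
        return "D"
    if is_contaminant:
        return "C"
    return "T"


# A results reader could then recover the flag from the label,
# analogous to DecoyContamTarget.Contains('E').
label = classify(False, False, is_entrapment("NTRAP_P12345"))
flag_from_label = "E" in label
```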
* Move isEntrapment to the end of the constructors and made it optional
* better biopolymergroup output
* Testing
* more tests
* Replace NTrap with Random
* more tests
* Changed ProteinXML writer to no longer produce empty position tags (e.g., <begin position="" />) in the .xml database
* No longer write malformed entries
* Small modifications to protein db writer of questionable quality
* Updated .gitignore
* Added UniprotEntry class, modified ProteinDbWriter for ProSight Compatibility
* Added test for fasta -> xml writing
* Updated proteinDbWriter, tests, add new protein constructor
* Added new flag to WriteProteinDb
* Fixed non variant protein assignment issue
* Addressed Shortreed PR comments
* New clean repo with ptm_stoch contents. The methods for occupancy calculation in mzlibutils were copied from the previous branch onto this one. Need to add/remake the tests next. * Added TestMzLibUtils tests for quantified mods, peptides, and proteins. Need tests for the protein groups and the occupancy set up (currently called CalculateOccupancies). * Added PG and Quant object setup tests. Need to finish these tests, though * Finshed TestSetUpQuantificationObjects. Removed Peptides field (and its population) from SetUpQuantificationObjects method for now. * Refactored quantification util classes * improving quantprot exception throw. * Extended commenting. Added a peptide record class that stores the peptide input for setting up the protein groups and the quantifications. * delayed test fixes.... * Adding GeneName and Organism fields to QuantifiedProteinGroup. FIXED bug when merging QuantifiedPeptides that caused the resulting mods to have greater intensity than total base peptide intensity. * Apply suggestions from code review Copilot suggestions for PFA class Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestions from code review copilot suggestions for creating deep copy on peptide mod stoich method as well as cleaner AND/OR conditional priority Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot comment fix Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot comment fix Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Static occuancy methods and integration into Omics.BioPolymerGroup. Note: the util occupancy code will be kept for now in the event it can be useful in the future due to its simpler code structure. * cleaning docs and small bug risks * removing occupancy from sequence coverage. Adding sample group class for occupancy reporting. 
Updating quant to always have per group psm count, psm occupancy, and additionally intensity and intensity occupancy if lfq provided. NEEDS TESTS. * Fixing tests. Still need to add tests for new classes. Updated some properties to ensure string and headers are always writeable. * remove property setter calls to populatesamplegroupresults. method should be call before writing quantification columns. * coverage improvement and small bug fixes. * minor corrections from claude * temp. added ScanMetadata * temp save, but test run works * cleaning biopolymergroup * make sure psms with multiple pwsm matches do not inflate the psm count denominator. * restored accidentally deleted code. * preventativce maintainance * unit tests to promote understanding * Bug fix for inflated occupancies due to Full sequences from the same PSM recounting the psm intensity for the unambiguous mods in that PSM. * fix counting and only report unambiguous mods. * Cleaning code and implementing some suggestions. Renamed BioPolymerGroupType categories. Occupancy is now only reported from unamibiguous PSMs (will be enhanced later). * revert nuspec * final fix. output seems correct. * final fix. output seems correct. * nuspec? * test file instead of folder referencing in dotnet.yaml to see if that fixes the integration issue. --------- Co-authored-by: Alexander-Sol <41119316+Alexander-Sol@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu> Co-authored-by: Nic Bollis <nbollis@comcast.net> Co-authored-by: trishorts <mshort@chem.wisc.edu>
* slim fit * increased test coverage plus handle negative mode
* Averatide revisions
* Empirical
* Empirical Testing
* Added missing elements.
* updating fragment intensity model input mods handling default and adding safeguards to pwsm fragmentation output. * Make every models' default behavior be to return null predictions on incompatible inputs. --------- Co-authored-by: Nic Bollis <nbollis@comcast.net>
* Object Pool * ope * Split: TestProeinDigestion * Split: TestPeptides * Split: TestModifications * Split: TestFragments * Split: TestChemicalFormula * Split: TestPeptideWithSetMods * Move: First 8 tests * Move: MzlibUtil * Rename: Chemistry Tests * Move Loose: Database Tests * Move Loose: IsoTracker * Move Loose: MassSpectrometry Decon and loose * Move Loose: Spectral Library Tests * Structure: FileReadingTests * Move Loose: Proteomics * Adjust: Transcriptomics * Move Loose: Omics-Fragmentation * Mass Spec Oopsie * Move Loose: Omics-Modification * Move Loose: Omics-BioPolymerGroup * Move Loose: Omics-SampleInfo * Move Loose: Omics-Occupancy * Move Loose: Omics-SpectralMatch * Test.csproj changes * Fix tests. * refactor a few more namespaces
… Generalized Naming (#1048)
* batch mode
* batch tests
* IDisposable left to individual predictors
* best of both worlds
* eliminate record struct
* threaded with split by range
* obsolete predictretentiontime
* internal code review
* fix broken tests
* retention time predictor exception tests
* bye bye ienumerable
* eliminate exception
* delete chronologer override
* throw caution to the wind
* some tests
* Drop the IsConcurrentPredictionSafe gate concept entirely: predictors are expected to handle whatever maxThreads they receive. Chronologer serializes on _modelLock; SSRCalc3 and CZE are pure functions over readonly state. maxThreads < 1 clamps to 1 rather than throws. No streaming API.
Production code:
- Producer faults in ProduceResults are now captured into the existing ConcurrentQueue<Exception> and surfaced as AggregateException after the consumer drains, honoring the documented contract.
- peptides.ToList() and the empty-input early return hoisted out of Task.Run to the caller's thread, so source-enumeration faults surface synchronously and thread-affine sources behave correctly.
- PredictRetentionTimeEquivalents XML doc on base class and interface now explicitly states result order is not guaranteed; the tuple's Peptide element is the binding to inputs.
- SSRCalc3.ValidateBasicConstraints calls base first, so null/empty BaseSequence yields EmptySequence instead of NRE.
- Removed orphaned IsConcurrentPredictionSafe XML doc, broken <see cref="StreamRetentionTimeEquivalents"/>, redundant IDisposable on the base class declaration, and the obsolete PredictRetentionTime DIM on the interface (kept the concrete shim on the base class).
Tests:
- Multi-threaded parity test now uses real amino-acid sequences instead of PEPTIDE1..PEPTIDE20, which were all rejected as InvalidAminoAcid and reduced the test to "null equals null".
- ContainsAllInputPeptides asserts Count == peptides.Count and multiset equality (no-collapse, unordered).
- Cross-predictor failure-reason assertions in MixedBatch test weakened to the universal "value null ⇔ reason not null" contract.
- DelegatesToProduceResults wired through SpyRetentionTimePredictor with real spy assertions.
- Deleted broken Diagnostic_ParallelExecution_Behavior, unused AggregateExceptionThrowingPredictor and ThrowingEnumerable helpers, and two empty AggregateException placeholder regions.
Cosmetic reverts: undid 9 stylistic changes flagged by the review (EOF newlines, trailing whitespace, blank lines between interface members, comment rewording, an unused-using removal that was actually used). No behavioural impact.
Verified: dotnet build clean; 129/129 RetentionTimePrediction tests pass.
* protease dictionary embedded resource new default behavior
* internal review
* Address PR review: restore deprecated APIs, fix CNBr test, update stale docs
Blocking
- ProteaseDictionary: restore LoadProteaseDictionary(path, mods) and ResetToDefaults(mods) as [Obsolete] thin wrappers. The parser wrapper does not mutate global state (matches pre-PR behavior); ResetToDefaults rebuilds Dictionary from the embedded resource.
- RnaseDictionary: restore LoadRnaseDictionary(path) and ResetToDefaults() with the same [Obsolete] treatment.
- TestProteinDigestion.CNBrProteinDigestion: rename the three colliding rows in DoubleProtease.tsv (CNBr, CNBr_old, CNBr_N) to CNBr_custom, CNBr_old_custom, CNBr_N_custom, update the test's dictionary lookups, and add result.Added assertions so the test now exercises custom-file loading instead of silently falling back to the embedded definitions. The Test1/Test2/Test3 rows used by TestSeqCoverage.ReadCustomFile are untouched.
Non-blocking
- proteases.tsv, rnases.tsv: remove dead ResetToDefaults references, rewrite merge-rules comments to reflect skip-on-collision semantics, and replace the misleading 'trypsin' example in proteases.tsv with a non-colliding name.
- ProteaseDictionary, RnaseDictionary: soften 'immutable' / 'cannot be overridden' language in class docstrings to 'protected baseline', and note that dictionary contents remain mutable via the indexer.
Deferred to follow-up (non-blocking suggestions): standalone parser test in TestSeqCoverage and splitting atomicity tests out of TestProteaseDictionaryEmbeddedMods.cs.
* fix(digestion): address PR #1040 review
Restore the public no-args RnaseDictionary.LoadRnaseDictionary() as an [Obsolete] shim so external callers compile with a deprecation message pointing them at RnaseDictionary.Dictionary.
Enforce CustomDigestionAgentLoadResult's advertised immutability: Added and Skipped now wrap a defensive copy in a ReadOnlyCollection so callers can't mutate reported results via a retained reference.
Validate path and reject null elements in RnaseDictionary.LoadAndMergeCustomRnases before any I/O, so the resulting ArgumentNullException identifies the bad input clearly.
Clarify in XML docs that:
- CustomDigestionAgentLoadResult.Skipped intentionally conflates embedded-baseline, cross-file, and cross-call collisions.
- ProteaseDictionary.LoadAndMergeCustomProteases is not thread-safe and must run during single-threaded initialization.
- Custom proteases referencing an unresolvable cleavage modification throw MzLibException; entries with no mod are accepted silently.
Addresses 6 of 19 findings from the PR #1040 review; the remaining were already applied or out of scope. See fixes_summary.md.
* fix broken unit tests
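The skip-on-collision merge and the immutable load result described in these two commits can be sketched together. The Python below is only an analogy under stated assumptions: `LoadResult` and `merge_custom` are hypothetical names, a plain dict stands in for the protease or RNase dictionary, and tuples play the role that a ReadOnlyCollection over a defensive copy plays in the C# code.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class LoadResult:
    """Immutable report of a custom-file load (hypothetical type)."""
    added: Tuple[str, ...]
    skipped: Tuple[str, ...]


def merge_custom(registry: Dict[str, str],
                 custom: Iterable[Tuple[str, str]]) -> LoadResult:
    """Merge custom entries into registry, never overwriting existing names."""
    added, skipped = [], []
    for name, rule in custom:
        if name in registry:
            # Collision with the baseline, an earlier row, or an earlier
            # call: all are reported the same way, as "skipped".
            skipped.append(name)
        else:
            registry[name] = rule
            added.append(name)
    # Copy into tuples so callers cannot mutate the reported results
    # through a retained reference to the working lists.
    return LoadResult(tuple(added), tuple(skipped))
```

Asserting on `added` as well as on the dictionary contents is what distinguishes "the custom row was loaded" from "the lookup silently fell back to the baseline entry", which is the gap the CNBr test change closes.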
* minimal decon scorer
* test shallow clone
* isotope ratio consistency
* new scorer weights
* nic changes
* fix one unit test
* internal code review
* test classic decon with decoys
* Add XML documentation to ClassicDeconvolutionParameters
Added XML documentation comments for the ClassicDeconvolutionParameters class and its constructor.
* Add XML documentation for Deconvoluter methods
* Add XML documentation to CreateAlgorithm method
Added XML documentation for CreateAlgorithm method.
* convert virtual to abstract
* fix(decon): eagerly validate ScoreEnvelopes args; skip null envelopes
ScoreEnvelopes was a single yield-return method, so arg validation was
deferred until first MoveNext. That hid two bugs:
- Passing model=null with an empty envelopes sequence threw nothing at
all (the iterator body never ran).
- Passing envelopes=null surfaced as a NullReferenceException from the
foreach rather than as ArgumentNullException("envelopes").
Split into a non-iterator wrapper that null-checks both args eagerly
and a private iterator helper. The iterator now skips null elements
silently instead of letting them propagate into ComputeFeatures, which
matches the intent of the batch API (one bad envelope shouldn't
abort the rest of the run).
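Python generators defer their body in the same way C# iterator methods do, so the wrapper-plus-helper split described above can be sketched in Python. The function names and the choice of ValueError are illustrative assumptions, not the mzLib API; `model` is treated as any callable that scores one envelope.

```python
def score_envelopes_lazy(envelopes, model):
    """Single generator: validation is deferred until the first next()."""
    if envelopes is None:
        raise ValueError("envelopes")
    if model is None:
        raise ValueError("model")
    for envelope in envelopes:
        yield model(envelope)


def score_envelopes(envelopes, model):
    """Non-generator wrapper: arguments are checked at call time."""
    if envelopes is None:
        raise ValueError("envelopes")
    if model is None:
        raise ValueError("model")
    return _score_envelopes_iter(envelopes, model)


def _score_envelopes_iter(envelopes, model):
    """Private generator helper; skips None elements silently."""
    for envelope in envelopes:
        if envelope is None:
            continue
        yield model(envelope)
```

Calling `score_envelopes_lazy(None, None)` returns a generator object without raising, and the error only appears once something iterates it. `score_envelopes(None, model)` raises immediately at the call site.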
* fix(decon): anchor ComputeFeatures m/z on signed envelope charge
ComputeFeatures was passing absCharge to ToMz when computing the
monoisotopic m/z anchor for the theoretical isotope grid. ToMz is
asymmetric in the proton sign -- mass/|z| + sign(z)*proton -- so
ToMz(mass, +5) and ToMz(mass, -5) differ by 2 protons (~2.015 Da),
far beyond the 10 ppm match tolerance.
For negative-mode envelopes that meant the theoretical grid landed
on the positive-mode mirror, no observed peaks matched, and
cosine/completeness/ratio-consistency collapsed to ~0.
Pass envelope.Charge (signed) so ToMz embeds the correct polarity
sign. The rest of the function still uses absCharge for isotope
step and per-position math, which is correct.
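The size of the error can be checked with a few lines of arithmetic. The sketch below re-implements the formula quoted above, mass/|z| + sign(z)*proton, in Python; `to_mz` is a stand-in for the C# ToMz extension, the proton mass constant is the commonly used value of about 1.007276 Da, and the 10 000 Da example mass is an arbitrary assumption.

```python
PROTON_MASS = 1.007276466  # Da, approximate


def to_mz(mass, charge):
    """m/z for a neutral mass and a signed charge: mass/|z| + sign(z)*proton."""
    if charge == 0:
        raise ValueError("charge must be non-zero")
    sign = 1 if charge > 0 else -1
    return mass / abs(charge) + sign * PROTON_MASS


mass = 10000.0                      # hypothetical neutral mass in Da
mz_pos = to_mz(mass, +5)            # what passing absCharge produces
mz_neg = to_mz(mass, -5)            # what a negative-mode envelope needs
difference = mz_pos - mz_neg        # two proton masses
difference_ppm = difference / mz_neg * 1e6
```

Here `difference` is two proton masses regardless of the mass or charge magnitude, and at this m/z it corresponds to roughly a thousand ppm, so a grid anchored on the unsigned charge cannot fall inside a 10 ppm window around the observed negative-mode peaks.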
* test(decon): pin q-value, decoy-averagine, and cache-reset contracts
The deconScorer branch shipped several tests that passed trivially
without exercising the contract they claimed to verify:
- G4_AssignQValues_MonotonicallyEnforced asserted only non-decreasing
order, which any constant array satisfies. Now asserts the four exact
post-sweep values (all 0.5 for the documented input), so a regression
that drops the +1 correction or removes the monotone sweep fails.
- C1_MonoisotopicPeak_IsNotShifted iterated only idx 0..19 where apex_n
is always 0, so the peak-shift logic was never exercised. Renamed to
C1_MonoisotopicRecovery_HoldsAtLowMassBreaksAtHighMass and now anchors
on both a low-mass index (where apex IS monoisotopic) and a high-mass
index (apex_n > 0). The latter exposes a real contract violation in
DecoyAveragine: GetDiffToMonoisotopic forwards unchanged from the real
model while the apex peak is shifted, so apex - DiffToMono = realMono
+ apex_n * perPeakOffset rather than realMono. Test pins current
actual behaviour; doc-block flags a follow-up to fix the prod side.
- IsoDec_ToDecoyParameters_DecoySpacingReachesDll built a fresh original
with no _isoSettings cache, so the cache-reset claim was untested.
Now warms original.ToIsoSettings() before cloning.
- D2 in TestDecoyAveragine asserted finiteness of e.Score (the raw
algorithm score this PR replaces) instead of the new generic scorer.
Now scores every envelope via DeconvolutionScorer.ScoreEnvelope.
Also added CosineOfAlignedVectors_ZeroNorm_ReturnsZero to pin the helper
contract that A6_ComputeFeatures_EmptyPeakList_DoesNotThrow implicitly
relies on.
Restored standard logistic-regression sign convention in
DeconvolutionScorer (positive coefficient => feature increases P(true)),
with sigmoid back to 1/(1+exp(-linear)). Algebraically identical to the
flipped form, but the standard convention lets weights be replaced
directly with sklearn / glm output.
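A small sketch of why the two conventions are interchangeable. It assumes the "flipped form" means negated weights paired with 1/(1+exp(+linear)); the class, method names, and weights below are hypothetical and are not the DeconvolutionScorer code.

```csharp
using System;

public static class LogisticSketch
{
    // Standard convention: positive weight => feature raises P(true).
    public static double Standard(double[] weights, double intercept, double[] features)
        => 1.0 / (1.0 + Math.Exp(-Linear(weights, intercept, features)));

    // Assumed flipped convention: weights and intercept stored negated,
    // sigmoid written with a positive exponent.
    public static double Flipped(double[] negatedWeights, double negatedIntercept, double[] features)
        => 1.0 / (1.0 + Math.Exp(Linear(negatedWeights, negatedIntercept, features)));

    private static double Linear(double[] w, double b, double[] x)
    {
        double sum = b;
        for (int i = 0; i < w.Length; i++) sum += w[i] * x[i];
        return sum;
    }

    public static void Main()
    {
        double[] w = { 2.0, -0.5 };   // hypothetical weights, e.g. from sklearn
        double b = -1.0;
        double[] x = { 0.9, 0.3 };    // hypothetical feature vector

        double[] negW = Array.ConvertAll(w, v => -v);
        Console.WriteLine(Standard(w, b, x));
        Console.WriteLine(Flipped(negW, -b, x)); // same probability
    }
}
```

With the standard form, coefficients exported by a fitting tool can be pasted in unchanged; the flipped form needs every sign inverted first.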
Removed bogus meanTarget > meanDecoy assertion in
TestDecoyClassicRealData (contradicted documented Classic AUC=0.50) and
corrected E2's "identical envelopes" comment to match the empirical
empty-decoy-list reality.
Renamed TestShallowClone class to TestToDecoyParameters; file rename
done via `git mv`.
* filename change
* test(decon): pin scorer contracts and expose decoy-averagine API gap
Strengthens the deconScorer branch's regression coverage. Several tests
on the branch passed trivially without exercising the contract they
claimed to verify; this commit replaces them with assertions that
actually fail on the regressions they're meant to catch, plus one
contract test for a helper that previously had none.
Substantive test changes:
- G4_AssignQValues_MonotonicallyEnforced now asserts the four exact
post-sweep q-values (all 0.5 for the documented input) instead of
only non-decreasing order, so a regression that drops the +1
correction or removes the monotone sweep fails. The old XML comment
documented the formula without the +1 correction; that's corrected.
- C1_MonoisotopicPeak_IsNotShifted (renamed to
C1_MonoisotopicRecovery_HoldsAtLowMassBreaksAtHighMass) iterated
only idx 0..19 where apex_n is always 0, so the peak-shift logic
was never exercised. Now anchors on both a low-mass and a high-mass
entry. The high-mass case exposes a real DecoyAveragine API gap:
GetDiffToMonoisotopic forwards unchanged from the real model while
the apex peak is shifted, so apex - DiffToMono = realMono +
apex_n * perPeakOffset, not realMono. Test pins current actual
behaviour; doc-block flags a follow-up to fix the prod side.
- IsoDec_ToDecoyParameters_DecoySpacingReachesDll now warms the
original's _isoSettings cache before cloning. Previously the cache
was always null at clone time, so any future "cache leaks to decoy"
regression would have been invisible.
- D2_DecoyEnvelopes_ReturnedFromDeconvoluteWithDecoys_OnCleanSynthetic
now scores every envelope via DeconvolutionScorer.ScoreEnvelope (the
new generic scorer this PR introduces) rather than asserting on the
algorithm's raw e.Score field that the PR sets out to replace.
- New CosineOfAlignedVectors_ZeroNorm_ReturnsZero pins the helper's
zero-norm-returns-0 contract that ComputeFeatures implicitly relies
on for envelopes with no observed signal at theoretical positions.
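The zero-norm contract pinned by that last test can be sketched as follows. This is an assumed shape for such a helper, not the body of CosineOfAlignedVectors itself; CosineSketch and its method are hypothetical names.

```csharp
using System;

public static class CosineSketch
{
    // Cosine similarity of two equal-length vectors.
    // Returns 0 when either vector has zero norm, instead of NaN from 0/0.
    public static double Cosine(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must be aligned (same length).");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0) return 0.0;
        return dot / Math.Sqrt(normA * normB);
    }

    public static void Main()
    {
        double[] theoretical = { 1.0, 0.6, 0.2 };
        double[] noSignal = { 0.0, 0.0, 0.0 }; // nothing observed at theoretical positions

        Console.WriteLine(Cosine(theoretical, noSignal));
        Console.WriteLine(Cosine(theoretical, new[] { 2.0, 1.2, 0.4 }));
    }
}
```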
Production-code changes:
- Restored standard logistic-regression sign convention in
DeconvolutionScorer (positive coefficient => feature increases
P(true)), with sigmoid back to 1/(1+exp(-linear)) and the explanatory
comment re-attached. Algebraically identical to the flipped form,
but the standard convention lets weights drop in directly from
sklearn / glm output.
- Documented the readonly-struct choice on EnvelopeScoreFeatures with
a <remarks> block: 32-byte struct, ComputeFeatures is the
batch-scoring hot path (thousands of envelopes per scan), value
semantics avoid per-envelope GC pressure.
Other corrections:
- Removed bogus meanTarget > meanDecoy assertion in
TestDecoyClassicRealData (contradicted documented Classic AUC=0.50).
- Corrected E2's "identical envelopes" comment in
TestDeconvolutionScorerIntegration: empirically decoys is empty for
Classic on the synthetic spectrum, not identical to targets.
- Renamed TestShallowClone -> TestToDecoyParameters (file + class) to
match the actual API surface under test.
- Restored a few comments and a brace pair removed during the branch's
earlier cosmetic churn (style choices kept where the user signed off
on them; reverts limited to noise).
* Pass in expected spacing to decoy
---------
Co-authored-by: nbollis <nbollis@comcast.net>
* Initial param equality * Tests --------- Co-authored-by: trishorts <mshort@chem.wisc.edu>
* Bump OpenMcdf from 2.3.1 to 3.1.3 --- updated-dependencies: - dependency-name: OpenMcdf dependency-version: 3.1.3 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> * Update nuspec: bump OpenMcdf to 3.1.3 and OpenMcdf.Extensions to 2.3.1 Agent-Logs-Url: https://github.com/smith-chem-wisc/mzLib/sessions/ff64eb83-51bd-4e68-8cd9-eb7781945b78 Co-authored-by: trishorts <16841846+trishorts@users.noreply.github.com> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Nic Bollis <nbollis@comcast.net> Co-authored-by: trishorts <mshort@chem.wisc.edu> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: trishorts <16841846+trishorts@users.noreply.github.com>
* Add MS:1000516 charge array support to MsDataScan + mzML I/O Adds an optional 'ChargeArray' (int[]) property to MsDataScan, parallel to the m/z and intensity arrays. The mzML writer emits a third binaryDataArray (PSI-MS MS:1000516, 32-bit float, no compression) when ChargeArray is non-null. The mzML reader recognizes the same accession and round-trips the array. Existing callers are unaffected — the constructor parameter defaults to null and writers/readers behave identically when no charge data is present. Conformant readers that don't recognize MS:1000516 silently ignore the array per spec. Use case: deisotopers (e.g. YADA's annotate mode) want to publish per-peak charge state inline in the mzML rather than via a side-car file. PSI-MS defines the charge array as a first-class spec feature since mzML 1.0; ProteoWizard, Skyline, MZmine, mzR, pymzML all consume it. mzLib's writer just didn't expose the API. * Add round-trip + null-default regression tests for charge array Per @nbollis's review on #1060 — exercises both the writer branch in MzmlMethods (third <binaryDataArray> with MS:1000516) and the reader branch in Mzml.cs (cvParam recognition + 32-bit-float decode rounded back to int[]). The second test pins the null-default behavior so a future change can't quietly start emitting an empty array. Both tests use GenericMsDataFile as the concrete MsDataFile, matching the existing test patterns in this file. * yada 31 * Address review feedback from trishorts - Wire MS:1000516 charge-array decoding into the dynamic-reader path (GetOneBasedScanFromDynamicConnection) so InitiateDynamicConnection callers no longer silently drop per-peak charges. Mirrors the existing static-reader path: new readingCharges flag + chargeArray local, an MS:1000516 CVPARAM branch, BINARY decodes & rounds the 32-bit floats back to int, and the constructed MsDataScan receives chargeArray. 
- Correct ChargeArray's XML doc-comment ("zlib-compressed" was wrong; the writer emits MS:1000576 "no compression"), with a note explaining the deferred-compression rationale. - Tighten ChargeArrayRoundTrip to also assert MassSpectrum.XArray and YArray round-trip element-for-element. Guards against writer sizing/indexing bugs that would corrupt the existing arrays while leaving the new charge array alone. - Add ChargeArrayRoundTripWithNoiseData exercising the 6-array writer branch (NoiseData + ChargeArray). Asserts m/z, intensity, and charge all round-trip, and inspects the on-disk XML for binaryDataArrayList count="6" + an MS:1000516 cvParam. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Nic Bollis <nbollis@comcast.net>
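A rough sketch of the encode/decode round trip described in these commits: integer charges written as uncompressed 32-bit floats, base64-encoded, then rounded back to int on read. The helper names are hypothetical and the real writer and reader are not shown here; the sketch also assumes a little-endian host, since mzML binary arrays are little-endian.

```csharp
using System;
using System.Linq;

public static class ChargeArraySketch
{
    // int[] -> 32-bit floats -> base64 (no compression).
    public static string Encode(int[] charges)
    {
        RequireLittleEndian();
        float[] floats = Array.ConvertAll(charges, c => (float)c);
        byte[] bytes = new byte[floats.Length * sizeof(float)];
        Buffer.BlockCopy(floats, 0, bytes, 0, bytes.Length);
        return Convert.ToBase64String(bytes);
    }

    // base64 -> 32-bit floats -> rounded back to int[].
    public static int[] Decode(string base64)
    {
        RequireLittleEndian();
        byte[] bytes = Convert.FromBase64String(base64);
        float[] floats = new float[bytes.Length / sizeof(float)];
        Buffer.BlockCopy(bytes, 0, floats, 0, floats.Length * sizeof(float));
        return Array.ConvertAll(floats, f => (int)Math.Round(f));
    }

    private static void RequireLittleEndian()
    {
        if (!BitConverter.IsLittleEndian)
            throw new NotSupportedException("Sketch assumes a little-endian host.");
    }

    public static void Main()
    {
        int[] charges = { 1, 2, 3, -2, 0 };
        int[] roundTripped = Decode(Encode(charges));
        Console.WriteLine(charges.SequenceEqual(roundTripped));
    }
}
```

Small integers are exactly representable as 32-bit floats, so the rounding step only guards against values that were not written by this path.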
* Move ISingleChargeMs1Feature to MassSpectrometry
The interface describes an MS1 deconvolution result and is the input
contract for the upcoming Deconvoluter.PairPrecursorsToMs2 join. That
method lives in MassSpectrometry.Deconvolution; Readers references
MassSpectrometry rather than the reverse, so the interface had to live
in MassSpectrometry for the dependency direction to work.
Relocate the file via `git mv` (history preserved), re-namespace to
MassSpectrometry, and add `using MassSpectrometry;` to the five Readers
files that implement or reference it. No behavior change. No external
consumer references the interface today (grep-checked on MetaMorpheus,
ProteaseGuru, ProteoformExplorer), so the move is invisible outside
this PR.
Doc tightened to reflect that pairing requires both an RT-window check
and an isolation-window check, not just an RT-apex check.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add Deconvoluter.PairPrecursorsToMs2 with FlashDeconv / TopFD coverage
A new join between MS1 deconvolution features (any
ISingleChargeMs1Feature source — Ms1FeatureFile from TopFD/FlashDeconv,
DinosaurTsvFile, or future in-house whole-file deconvolution) and the
MS2 scans that selected them for fragmentation. One method on the
existing Deconvoluter:
public static IEnumerable<(MsDataScan Ms2Scan, IsotopicEnvelope PrecursorEnvelope)>
PairPrecursorsToMs2(IEnumerable<ISingleChargeMs1Feature> ms1Features,
MsDataFile msDataFile);
Match rule: MS2.RetentionTime ∈ [feat.RetentionTimeStart,
feat.RetentionTimeEnd] AND feat.Mz ∈ MS2.IsolationRange. Pure join —
no cross-charge consensus, no off-by-one correction, no harmonic
filtering, no isobaric-tag reporter extraction. Those concerns belong
to the deconvolution producer upstream or to the search engine
downstream. Pairing is restricted to MsnOrder == 2 (MS3 isolation
targets an MS2 fragment, not an MS1 precursor); MS2 scans with a null
IsolationRange are skipped. Chimeras emit one pair per matching
feature; a single feature spanning multiple MS2 scans likewise emits
one pair per scan.
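The match rule above can be sketched with stand-in types. FeatureSketch, ScanSketch, and PairingSketch are hypothetical simplifications of ISingleChargeMs1Feature, MsDataScan, and the join itself; only the rule (inclusive RT window, inclusive isolation window, MsnOrder == 2, null isolation skipped) is taken from this message.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record FeatureSketch(double Mz, double RtStart, double RtEnd);

public sealed record ScanSketch(int ScanNumber, int MsnOrder, double RetentionTime,
                                double? IsolationMin, double? IsolationMax);

public static class PairingSketch
{
    public static IEnumerable<(ScanSketch Scan, FeatureSketch Feature)> Pair(
        IEnumerable<FeatureSketch> features, IEnumerable<ScanSketch> scans)
    {
        List<FeatureSketch> featureList = features.ToList();
        foreach (ScanSketch scan in scans)
        {
            if (scan.MsnOrder != 2) continue;                                   // MS1 and MS3 ignored
            if (scan.IsolationMin is null || scan.IsolationMax is null) continue; // no isolation window

            double lo = scan.IsolationMin.Value;
            double hi = scan.IsolationMax.Value;

            foreach (FeatureSketch f in featureList)
            {
                bool rtInside = scan.RetentionTime >= f.RtStart && scan.RetentionTime <= f.RtEnd;
                bool mzInside = f.Mz >= lo && f.Mz <= hi;
                if (rtInside && mzInside)
                    yield return (scan, f); // chimeras yield one pair per matching feature
            }
        }
    }

    public static void Main()
    {
        var features = new[] { new FeatureSketch(500.0, 10.0, 12.0), new FeatureSketch(500.4, 10.5, 11.5) };
        var scans = new[] { new ScanSketch(7, 2, 11.0, 499.0, 501.0) };
        foreach (var (scan, feature) in Pair(features, scans))
            Console.WriteLine($"scan {scan.ScanNumber} <- feature at m/z {feature.Mz}");
    }
}
```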
The returned IsotopicEnvelope is the type MetaMorpheus already speaks
via EngineLayer.Util.Precursor's Precursor(IsotopicEnvelope, ...)
constructor, so downstream wrapping into Ms2ScanWithSpecificMass is a
direct two-line adapter at the call site (no mzLib-side wrapper class
needed):
var pairs = Deconvoluter.PairPrecursorsToMs2(features, dataFile);
foreach (var (ms2, envelope) in pairs)
{
var precursor = new Precursor(envelope, ...);
var scan = new Ms2ScanWithSpecificMass(ms2, precursor.MonoisotopicPeakMz,
precursor.Charge, fullFilePath, commonParameters,
neutralExperimentalFragments, precursor.Intensity,
precursor.EnvelopePeakCount, precursor.FractionalIntensity);
...
}
For input from external readers (.ms1.feature etc.), the envelope is
built via the existing 3-arg IsotopicEnvelope(mass, intensity, charge)
constructor — Peaks is a single synthetic entry; NumberOfIsotopes from
the input is not surfaced because per-peak m/z and intensity are
unknown. When the producer is mzLib's own in-house deconvolution it can
hand PairPrecursorsToMs2 real envelopes; the contract is lossless in
that case.
Tests in TestPrecursorPairing.cs cover:
- 14 synthetic-input unit tests: happy path, RT-out-of-window (both
sides), m/z-out-of-isolation (both sides), chimera, one-feature-
spanning-many-MS2s, MS2 with null isolation, MS1 ignored, MS3 ignored,
RT/isolation boundary inclusivity, empty inputs, MS2-free file.
- 3 end-to-end fixture tests: load Ms1Feature_FlashDeconvOpenMs3.0.0
and Ms1Feature_TopFDv1.6.2 fixtures through the production
Ms1FeatureFile reader, pair against a synthetic MS2 scan tuned to
catch only one charge state of feature row #1 (≈10835.9 Da at z=10),
assert one pair with the expected charge, mass, and format-
appropriate intensity (FlashDeconv → 0; TopFD → positive).
17/17 pass; 774/774 MassSpectrometryTests + FileReadingTests
regression remains green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Remove unused MassSpectrometry namespace
* Remove unused MassSpectrometry namespace import
* Update Ms1Feature.cs
* Remove unused MassSpectrometry import
Removed unused MassSpectrometry namespace import.
* Remove unused MassSpectrometry import
* Restore using MassSpectrometry; in 5 Readers files (build fix)
These five files reference ISingleChargeMs1Feature in method signatures
or interface implementations, and the interface now lives in the
MassSpectrometry namespace (per the move two commits earlier). Readers
has ImplicitUsings=enable but the auto-generated global usings only
cover stdlib namespaces (System, System.Linq, etc.), so MassSpectrometry
isn't imported implicitly.
The previous round of cleanup commits removed these using directives as
"unused imports" — but each is required for the file to compile.
Restoring them undoes those five build-breaking deletions.
Verified: full Test.csproj suite — 3413/3413 pass (5 benchmarks skipped
as designed) on .NET 8.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Replace PairPrecursorsToMs2 with FromFile deconvolution algorithm
Adopts the pattern from Nic's PR #1065 draft: instead of a bespoke
join method on Deconvoluter, "deconvolution loaded from disk" becomes
another DeconvolutionAlgorithm that plugs into the existing
Deconvoluter factory. MetaMorpheus's per-MS2 precursor-deconvolution
loop already calls MsDataScan.GetIsolatedMassesAndCharges(precursorSpectrum,
deconParams); swapping ClassicDeconvolutionParameters for
FromFileDeconvolutionParameters transparently switches the entire
pipeline to consume pre-deconvoluted features.
New types:
- DeconvolutionType.FromFile (enum entry)
- FromFileDeconvolutionParameters (DeconvolutionParameters subclass —
wraps a pre-loaded IEnumerable<ISingleChargeMs1Feature>; the caller
chooses which Readers parser to use, so the Readers → MassSpectrometry
dependency arrow is preserved)
- FromFileDeconvolutionAlgorithm (DeconvolutionAlgorithm subclass —
filters the stored features by m/z range, RT-window overlap, and
[MinAssumedChargeState, MaxAssumedChargeState], yields synthetic
IsotopicEnvelopes via the existing 3-arg constructor)
- MzLibUtil.MzRtRange (MzRange subclass adding RT bounds — reusable
primitive for any future algorithm needing joint m/z + RT filtering;
adopted verbatim from #1065)
Modified:
- Deconvoluter.Deconvolute(scan, params, range): auto-upgrades a plain
MzRange to MzRtRange when params are FromFile, using scan.RetentionTime
as the anchor. The MzSpectrum overload validates that an MzRtRange
was supplied (no anchor available there) and raises ArgumentException
otherwise. The branch only fires for FromFile — existing in-memory
algorithm callers are entirely unaffected.
- Deconvoluter.CreateAlgorithm: factory case for FromFile.
- MsDataScan.GetIsolatedMassesAndCharges (both overloads): same
auto-upgrade with rtTolerance=0.1 (matches #1065's value). Also
extracts the 8.5 magic number into a const, mirroring #1065.
Deleted:
- The previous PairPrecursorsToMs2 method on Deconvoluter is gone.
Callers wanting the bulk-pairing shape can write the 4-line loop
themselves; the algorithm pattern is the canonical entry point.
Tests (renamed Test/MassSpectrometryTests/TestPrecursorPairing.cs →
TestFromFileDeconvolution.cs via git mv):
- 8 direct-algorithm tests via Deconvoluter.Deconvolute(spectrum,
params, MzRtRange).
- 6 integration tests via MsDataScan.GetIsolatedMassesAndCharges
(the production MM entry point).
- 3 end-to-end tests load real Ms1Feature_FlashDeconvOpenMs3.0.0 and
Ms1Feature_TopFDv1.6.2 fixtures through Ms1FeatureFile, pair against
a synthetic MS2 tuned to catch one charge state, assert the
envelope's mass / charge / format-appropriate intensity.
17/17 pass. Full Test.csproj suite: 3413/3413 pass (5 benchmarks
skipped as designed), 2 min 11 s on .NET 8.
Adjustments vs Nic's #1065 sketch:
- Params class takes IEnumerable<ISingleChargeMs1Feature> instead of a
resultPath string. Keeps file-format coupling out of MassSpectrometry.
- Algorithm body is implemented (sketch was NotImplementedException).
- MsDataScan.GetIsolatedMassesAndCharges(MsDataScan precursorScan, ...)
uses this.RetentionTime (the MS2's RT) rather than
precursorScan.RetentionTime. The canonical question is "what features
were eluting when this MS2 was fired?"; the difference in practice
is sub-second but the MS2-anchored answer is semantically right.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Move FromFile to Readers; params takes file path (PR #1064 review)
Addresses Nic's review comments on #1064:
- "FromFileDecon parameters and algorithm will need to be moved to the
readers project. The parameters will then take in the file path and
load up the objects that are needed." (file-level comment on params)
- "This one should throw just like inside of Deconvoluter." (line 192
of MsDataScan)
Changes:
MassSpectrometry:
- DeconvolutionAlgorithm.Deconvolute promoted from internal abstract to
protected internal abstract so subclasses in other assemblies (Readers)
can override. Existing in-project overrides (Classic, IsoDec, Example)
updated to match.
- DeconvolutionParameters: add virtual CreateAlgorithm() returning null.
Subclasses living outside MassSpectrometry can override to inject
their own DeconvolutionAlgorithm without the central enum-switch in
Deconvoluter needing to know about them.
- Deconvoluter.CreateAlgorithm: try parameters.CreateAlgorithm() first,
fall back to the enum-based switch for in-project algorithms. The
FromFile case in the switch now throws an explanatory MzLibException
rather than instantiating an in-project type — instantiation is the
params-side's job.
- MsDataScan.GetIsolatedMassesAndCharges (both overloads): remove the
FromFile auto-upgrade branch. The MzSpectrum overload of
Deconvoluter.Deconvolute already throws when given FromFile params
with a plain MzRange — let that error surface instead of silently
upgrading. Callers wanting from-file decon use the MsDataScan
overload of GetIsolatedMassesAndCharges, which routes through
Deconvoluter's MsDataScan overload (still auto-upgrades using the
precursor scan's RetentionTime).
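The "params first, enum switch as fallback" factory described above might look roughly like this. All types here are hypothetical stand-ins, and InvalidOperationException stands in for the MzLibException named in this message.

```csharp
using System;

public enum DeconTypeSketch { Classic, FromFile }

public abstract class AlgorithmSketch
{
    public abstract string Name { get; }
}

public sealed class ClassicAlgorithmSketch : AlgorithmSketch
{
    public override string Name => "Classic";
}

public sealed class FromFileAlgorithmSketch : AlgorithmSketch
{
    public override string Name => "FromFile";
}

public abstract class ParametersSketch
{
    public abstract DeconTypeSketch Type { get; }

    // Default: no params-side algorithm; the factory falls back to its switch.
    public virtual AlgorithmSketch CreateAlgorithm() => null;
}

public sealed class ClassicParametersSketch : ParametersSketch
{
    public override DeconTypeSketch Type => DeconTypeSketch.Classic;
}

// In the real layout this subclass would live in another assembly.
public sealed class FromFileParametersSketch : ParametersSketch
{
    public override DeconTypeSketch Type => DeconTypeSketch.FromFile;
    public override AlgorithmSketch CreateAlgorithm() => new FromFileAlgorithmSketch();
}

public static class FactorySketch
{
    public static AlgorithmSketch Create(ParametersSketch parameters)
    {
        // 1. Let the parameters object supply its own algorithm if it can.
        AlgorithmSketch fromParams = parameters.CreateAlgorithm();
        if (fromParams != null) return fromParams;

        // 2. Otherwise use the central switch for in-project algorithms.
        switch (parameters.Type)
        {
            case DeconTypeSketch.Classic:
                return new ClassicAlgorithmSketch();
            case DeconTypeSketch.FromFile:
                throw new InvalidOperationException(
                    "FromFile parameters must override CreateAlgorithm().");
            default:
                throw new ArgumentOutOfRangeException(nameof(parameters));
        }
    }

    public static void Main()
    {
        Console.WriteLine(Create(new ClassicParametersSketch()).Name);
        Console.WriteLine(Create(new FromFileParametersSketch()).Name);
    }
}
```

The design keeps the central factory ignorant of types defined in downstream assemblies, which is what preserves the dependency direction.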
Old files deleted:
- mzLib/MassSpectrometry/Deconvolution/Algorithms/FromFileDeconvolutionAlgorithm.cs
- mzLib/MassSpectrometry/Deconvolution/Parameters/FromFileDeconvolutionParameters.cs
New files in Readers:
- mzLib/Readers/Deconvolution/FromFileDeconvolutionParameters.cs
- Public ctor takes a string filePath; loads features via
FileReader.ReadResultFile (auto-detects FlashDeconv / TopFD / Dinosaur
via SupportedFileType), then calls LoadResults and materializes the
per-charge expansion.
- Internal ctor takes IEnumerable<ISingleChargeMs1Feature> for unit
tests (uses existing InternalsVisibleTo("Test") in Readers).
- CreateAlgorithm() override returns FromFileDeconvolutionAlgorithm.
- mzLib/Readers/Deconvolution/FromFileDeconvolutionAlgorithm.cs
- Same algorithm body; override modifier is now `protected override`
to satisfy C#'s cross-assembly `protected internal` rule.
Tests reshaped (TestFromFileDeconvolution.cs):
- Integration tests now use the MsDataScan overload of
GetIsolatedMassesAndCharges (the MzSpectrum overload throws for
FromFile after this change; that contract is locked by a new test
MzSpectrumOverloadOfGimc_FromFileWithoutMzRtRange_Throws).
- End-to-end fixture tests now drive the public file-path ctor —
exercising FileReader.ReadResultFile + SupportedFileType detection +
Ms1FeatureFile.LoadResults.
18/18 FromFile tests pass. Full Test.csproj suite: 3414/3414 pass,
5 benchmarks skipped, 2 min 12 s on .NET 8.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Sort features by m/z and binary-search the lower bound (PR #1064 review)
Addresses Nic's efficiency suggestion at FromFileDeconvolutionAlgorithm.cs
line 47 — "Could possibly improve efficiency by storing the features in
an ordered collection within the FromFileParameters."
FromFileDeconvolutionParameters now sorts the per-charge features by
ascending m/z at construction and stores a parallel double[] of m/z
keys. A new internal helper FindFirstIndexAtOrAbove(double minMz) wraps
Array.BinarySearch with the standard "if no match, return insertion
point" convention.
FromFileDeconvolutionAlgorithm.Deconvolute now binary-searches for the
first feature whose m/z is at or above range.Minimum, iterates forward,
and yield-breaks the moment Mz exceeds range.Maximum — turning the
per-MS2 query from O(N) to O(log N + k) where k is the count of
features whose m/z lands inside the requested window.
For a typical search (10k features × ~5k MS2 scans), this is ~400×
fewer per-feature comparisons than the previous linear scan; with a
typical k ≈ 1-3 features per isolation window, the saving in practice
is larger still.
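A minimal sketch of the lower-bound search and early-exit scan described above. The helper reuses the name from this message for readability, but its body is an assumption; in particular, the walk-back over duplicate keys is added here because Array.BinarySearch may land on any of several equal elements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedMzSketch
{
    // Index of the first key >= minMz, or keys.Length if none.
    public static int FindFirstIndexAtOrAbove(double[] sortedMz, double minMz)
    {
        int index = Array.BinarySearch(sortedMz, minMz);
        if (index < 0)
            return ~index; // no exact match: bitwise complement is the insertion point

        // Exact match: step back to the first of any duplicate keys.
        while (index > 0 && sortedMz[index - 1] == minMz) index--;
        return index;
    }

    // O(log N + k): binary-search the lower bound, then stop at the first key above max.
    public static IEnumerable<double> InRange(double[] sortedMz, double min, double max)
    {
        for (int i = FindFirstIndexAtOrAbove(sortedMz, min); i < sortedMz.Length; i++)
        {
            if (sortedMz[i] > max) yield break;
            yield return sortedMz[i];
        }
    }

    public static void Main()
    {
        double[] keys = { 400.2, 500.1, 500.1, 650.7, 800.3 }; // hypothetical sorted m/z keys
        Console.WriteLine(string.Join(", ", InRange(keys, 450.0, 700.0)));
    }
}
```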
Behavior-preserving — all 18 FromFile tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Lift PR-touched files to ≥ 90% line coverage with essential tests
Baseline coverage on PR-touched files identified three classes below 90%:
- MzLibUtil.MzRtRange — 34.5% (new class, mostly untested properties / methods)
- MassSpectrometry.Deconvoluter — 83.7% (uncovered: null-range auto-upgrade
branch, FromFile-throws-from-enum-switch defensive path, decoy-null throw
in DeconvoluteWithDecoys)
- Readers.FromFileDeconvolutionParameters — 83.9% (uncovered: null filePath
throw, non-feature-file throw, ToDecoyParameters returns null)
Tests added (essential paths only, no fluff):
New file Test/TestMzRtRange.cs (15 tests):
- Two constructors with bounds verification.
- Derived properties (Minimum/Maximum/MeanMZ/MeanRt/Width pair) in one test.
- Contains: inside, mz-below, mz-above, rt-below, rt-above, inclusive boundaries.
- CompareTo: inside (0), mz-below (1), mz-above (-1), rt-below (1), rt-above (-1).
- ToString smoke check (both m/z and RT bounds appear in output).
Added to TestFromFileDeconvolution.cs (5 tests):
- Null filePath in ctor → ArgumentNullException.
- Non-feature file (.psmtsv) in ctor → MzLibException with diagnostic message.
- ToDecoyParameters → null (locks the "no decoy support" contract).
- Deconvoluter.Deconvolute(MsDataScan, FromFileParams, null) auto-upgrades using
the scan's MassSpectrum.Range (covers the null-range branch of the ternary).
- DeconvoluteWithDecoys with FromFile params → InvalidOperationException (covers
the decoy-null defensive path in DeconvoluteWithDecoys).
- A test-only DeconvolutionParameters stub that claims DeconvolutionType.FromFile
but doesn't override CreateAlgorithm → Deconvoluter.CreateAlgorithm throws
the explanatory MzLibException (covers the FromFile case in the enum switch
defensive path).
Skipped (still ≥ 90% but borderline):
- MassSpectrometry.MsDataScan at 92.1% — uncovered lines are pre-existing
code outside this PR's scope.
Result on PR-touched files:
- MzRtRange: 34.5 → 100% line coverage
- FromFileDeconvolutionParameters: 83.9 → 100%
- Deconvoluter: 83.7 → 100%
- All other PR-touched classes already ≥ 97%
- Zero PR-touched classes below 90%
Total: 40 FromFile+MzRtRange tests pass; full Test.csproj suite 3436/3436
pass (5 benchmarks skipped, 2 min 33 s on .NET 8).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* add new protease subtilisin * feat: add subtilisin|p protease and unit test Add subtilisin|p to the embedded proteases.tsv with full cleavage specificity and proline-inhibition motifs (N[P]|, S[P]|, L[P]|, K[P]|, I[P]|, D[P]|, Y[P]|, V[P]|, G[P]|, F[P]|, T[P]|, E[P]|, Q[P]|, A[P]|, R[P]|). Add TestSubtilisinP_DigestsCorrectlyAndRespectsProlineRestriction to ProteinDigestionTests to verify: - subtilisin|p is present in the embedded protease dictionary with CleavageSpecificity.Full - All expected cleavage sites fire on a proline-free sequence (ANKTIDE) - The K[P]| proline-inhibition rule is respected (AKPIDE keeps KP intact) * correct protease name use capital P --------- Co-authored-by: Nic Bollis <nbollis@comcast.net>
* separate deconvolution Score and GenericScore * generic deconvolution score accessibility * internal code review * eliminate some ai noise * chore: revert cosmetic-only changes from PR #1054 Undo whitespace and formatting churn that crept into the deconvolution scoring PR. Restores trailing EOF newlines, blank-line removal, and aligned-column padding around `=` and named-argument `:` to match master. No functional impact. Addresses 11 cosmetic findings from the PR #1054 review. * test(deconvolution): plug coverage gaps from PR #1054 review - Add happy-path + cache-hit tests for the GetOrComputeGenericScore (env, DeconvolutionParameters) overload and assert delegation parity with the AverageResidue overload. - Pin GetOrComputeGenericScore_NotYetSet_ComputesAndStashes to DeconvolutionScorer.ScoreEnvelope's direct output via a parallel envelope, locking in the documented cache-miss contract. - Tighten D1/D3 perfect-Averagine score assertions from Is.InRange(0,1) to Is.GreaterThan(0.5), matching the threshold used by B1 and the negative-charge equivalent. - Lower off-by-one cosine threshold from <0.85 to <0.7 to widen the gap against the positive-case >0.85 baseline. Addresses 4 TestCoverage findings from the PR #1054 review. * fix issues from merge * tighten DeconvolutionScorer test assertions and contracts - TestScoreEnvelopesNullHandling: drop .ToList() from the null-collection and null-model lambdas so the tests only pass on eager throws (at the call site, before MoveNext). Pins the wrapper+iterator split against a silent regression to a single iterator method. - TestDeconvolutionScorerUnit.A5: tighten the all-peaks-50ppm-shifted cosine assertion from < 0.5 to == 0.0 within 1e-9. The 10 ppm matching window guarantees an all-zero observed vector; any partial leakage of out-of-window intensity must fail. - TestDeconvolutionScorerUnit.A1: tighten the perfect-envelope IntensityRatioConsistency assertion from >= 0.90 to == 1.0 within 1e-6. 
BuildPerfectEnvelope's uniform per-peak scaling makes CV exactly 0, so the feature is mathematically pinned at 1.0. - TestIsotopicEnvelopeExtensions: rename namespace Test.Deconvolution -> Test.MassSpectrometryTests.Deconvolution to match the four sibling fixtures in the same directory. No behavioral change. - TestIsotopicEnvelopeExtensions.GetOrComputeGenericScore_NullEnvelope_Throws: add a second assertion covering the (envelope, DeconvolutionParameters) overload (the common downstream call shape in MetaMorpheus). Mirrors the symmetry already used by GetOrComputeGenericScore_NullModel_Throws. * unify scoring under Deconvolute via UseGenericScore flag Restore the single-entrance-point design that Deconvoluter was built around. The PR previously added DeconvoluteWithGenericScoring as a parallel entry point; per @nbollis's review feedback, all behavioral variation should flow through the parameters object instead. - Add UseGenericScore (default false) to DeconvolutionParameters. When true, Deconvolute runs an additional per-envelope scoring pass that sets IsotopicEnvelope.GenericScore (the algorithm-specific Score is unchanged). - Fold the post-pass scoring into Deconvolute(MzSpectrum, ...) using a separate iterator helper (AddGenericScoring) so the NeutralMassSpectrum early-return at the top stays eager rather than getting deferred to first MoveNext. - Remove both DeconvoluteWithGenericScoring overloads. Caller migration: envelope = Deconvoluter.DeconvoluteWithGenericScoring(spec, p) -> p.UseGenericScore = true; envelope = Deconvoluter.Deconvolute(spec, p) - Update D-group integration tests in TestDeconvolutionScorerUnit to use the flag and the unified Deconvolute call. - Update the doc-comment cref in IsotopicEnvelopeExtensions to point at the new path.
* add new protease subtilisin
* feat: add subtilisin|p protease and unit test
Add subtilisin|p to the embedded proteases.tsv with full cleavage
specificity and proline-inhibition motifs (N[P]|, S[P]|, L[P]|,
K[P]|, I[P]|, D[P]|, Y[P]|, V[P]|, G[P]|, F[P]|, T[P]|, E[P]|,
Q[P]|, A[P]|, R[P]|).
Add TestSubtilisinP_DigestsCorrectlyAndRespectsProlineRestriction to
ProteinDigestionTests to verify:
- subtilisin|p is present in the embedded protease dictionary with
CleavageSpecificity.Full
- All expected cleavage sites fire on a proline-free sequence (ANKTIDE)
- The K[P]| proline-inhibition rule is respected (AKPIDE keeps KP intact)
* Accept newer-TopFD column names in Ms1Feature
PR 1064 ships an Ms1Feature reader that recognises only the older
FlashDeconv / TopFD-v1.6.2 _ms1.feature schema (Sample_ID, ID,
Time_begin, Time_end, Minimum_charge_state, Maximum_charge_state,
Minimum_fraction_id, Maximum_fraction_id). Newer TopFD output keeps
the same _ms1.feature extension but uses different column names
(File_name, Fraction_ID, Feature_ID, Min_time, Max_time, Min_charge,
Max_charge), plus extras like Envelope_num and EC_score. Format
detection picks Ms1FeatureFile by extension; CsvHelper then throws
because none of the expected [Name(...)] columns are present.
Discovered while integrating PR 1064's FromFileDeconvolutionParameters
into MetaMorpheus and pointing it at a real TopFD .ms1.feature from a
top-down yeast run -- the file parsed by hand looks identical in shape
to the old schema, just relabelled.
Fix is column-name aliases on the existing record, plus [Optional] on
fields that the newer schema omits entirely. Downstream
GetSingleChargeFeatures() reads only Mass, ChargeStateMin/Max,
RetentionTimeBegin/End, and IntensityApex -- all aliased to a column
present in both schemas, so the join algorithm behaves identically
regardless of which producer wrote the file.
Per-field summary:
Sample_ID -> [Optional] (newer TopFD has File_name
instead; type-incompatible
-- int vs string path --
and not used downstream)
ID -> alias "Feature_ID" + [Optional]
Time_begin -> alias "Min_time"
Time_end -> alias "Max_time"
Minimum_charge_state -> alias "Min_charge"
Maximum_charge_state -> alias "Max_charge"
Minimum_fraction_id -> [Optional] (newer TopFD has a single
Fraction_ID column, not a
min/max pair; not used
downstream so alias would
add no value)
Maximum_fraction_id -> [Optional]
No tests added in this commit -- a follow-up should drop a newer-TopFD
_ms1.feature sample into Test/FileReadingTests/ExternalFileTypes/ and
extend the existing Ms1FeatureFile read-roundtrip tests to cover both
schemas. The current
Ms1Feature_FlashDeconvOpenMs3.0.0_ms1.feature
Ms1Feature_TopFDv1.6.2_ms1.feature
fixtures keep passing because the existing [Name(...)] heads remain
the first entry in every alias list.
* Drops a 4-row fixture (Ms1Feature_TopFDvLatest_ms1.feature) captured
from real TopFD output that uses the File_name / Fraction_ID /
Feature_ID / Min_time / Max_time / Min_charge / Max_charge schema, and
wires it into the existing TestMsFeature parameterised tests:
* TestFeaturesLoadAndCountIsCorrect gains a TestCase asserting the
fixture loads four features end-to-end via FileReader.
* TestTopFDLatestMs1FeatureFirstAndLastAreCorrect locks the per-
field mapping for both the aliased columns (Time_begin/Min_time,
Minimum_charge_state/Min_charge, etc.) and the [Optional] fields
absent from the newer schema (SampleId, FractionIdMin/Max default
to 0). Covers the single-charge edge case (Min_charge == Max_charge
== 1) in the last record.
* TestTopFDLatestMs1GetSingleChargeFeatureFunctions confirms charge-
range expansion is identical across the two TopFD schemas: a 6-14
range yields 9 envelopes; a 1-1 range yields exactly one;
GetMs1Features() flattens to 9 + 12 + 10 + 1 = 32 across the four
fixture features.
* TestMs1FeatureReadWrite gains the new fixture as a TestCase. The
writer emits the older-schema headers (newer columns aren't on
the record), so the round-trip converts schema-newer -> schema-
older + Optional defaults. Comment in the test explains why that
is correct: every field downstream consumers actually read
survives the round-trip; the columns that don't are exactly the
ones marked [Optional] and unused.
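The charge-range expansion counted in the tests above can be sketched as below. This assumes one single-charge entry per integer charge in the inclusive [min, max] range, which is what the 6-14 -> 9 and 1-1 -> 1 counts imply; ChargeExpansionSketch is a hypothetical helper, not GetSingleChargeFeatures() itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChargeExpansionSketch
{
    // One entry per integer charge state in the inclusive range.
    public static IEnumerable<int> ExpandCharges(int minCharge, int maxCharge)
    {
        if (minCharge > maxCharge)
            throw new ArgumentException("minCharge must not exceed maxCharge.");
        return Enumerable.Range(minCharge, maxCharge - minCharge + 1);
    }

    public static void Main()
    {
        Console.WriteLine(ExpandCharges(6, 14).Count()); // inclusive range 6..14
        Console.WriteLine(ExpandCharges(1, 1).Count());  // single-charge edge case
    }
}
```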
All 17 TestMsFeature tests pass, as do the 148 tests covering the
related Ms1Feature / FromFile / SupportedFileExtensions / DinosaurTsv
surface area.
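The charge-range arithmetic the tests above pin down (an inclusive 6-14 range yields 9 single-charge envelopes, a 1-1 range yields one) can be checked with a short sketch. This is a hedged illustration, not the mzLib implementation: `expand_charge_range` and `flatten` are hypothetical names, and the feature tuples are made-up stand-ins for the fixture records.

```python
def expand_charge_range(min_charge, max_charge):
    """Return every charge state in the inclusive range [min_charge, max_charge]."""
    if min_charge > max_charge:
        raise ValueError("min_charge must not exceed max_charge")
    return list(range(min_charge, max_charge + 1))


def flatten(features):
    """Expand (feature_id, min_charge, max_charge) tuples into one
    (feature_id, charge) entry per charge state."""
    return [
        (feature_id, charge)
        for feature_id, low, high in features
        for charge in expand_charge_range(low, high)
    ]


# Two stand-in features: a 6-14 range and the single-charge edge case.
envelopes = flatten([("first", 6, 14), ("last", 1, 1)])
print(len(envelopes))  # 9 + 1 = 10
```

Because the range is inclusive on both ends, the count is `max_charge - min_charge + 1`, which is what makes the single-charge case yield exactly one envelope rather than zero.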
* Fix formatting of subtilisin entry in proteases.tsv
* PR #1067's scope is Ms1Feature TopFD column-name acceptance. Two files
had unrelated drift from the merged subtilisin PR (#1061) sitting in
origin/master that did not belong in this PR's diff. Restore both to
upstream/master so the PR shows only the Ms1Feature changes.
- proteases.tsv: restore the trailing newline that was stripped.
- ProteinDigestionTests.cs: revert `subtilisin|p` -> `subtilisin|P`
(lowercase didn't match the dictionary key in the tsv) and restore
the LoadProteaseDictionary_SubtilisinP_HasProlineRestrictedMotifs
test that was deleted.
No functional change.
* specify TopFD version 1.7.0 instead of "Latest" for the ms1.feature fixture
Addresses @nbollis's CHANGES_REQUESTED review on PR #1067 ("Latest is
not a version"). The newer-TopFD _ms1.feature fixture and its tests
were labelled "Latest" rather than a concrete release.
- Rename fixture Ms1Feature_TopFDvLatest_ms1.feature ->
Ms1Feature_TopFDv1.7.0_ms1.feature. TopFD 1.7.0 (TopPIC suite,
Dec 2023) introduced the ML feature-detection rework whose ECScore
is the new schema's EC_score column.
- TestMsFeature.cs: update the four fixture-path references, rename
the two TestTopFDLatest* methods to TestTopFDv1_7_0*, and anchor the
doc comment to v1.7.0.
- Test.csproj: repoint the CopyToOutputDirectory entry.
- SupportedVersions.txt: TopFD VersionTested is now "1.6.2, 1.7.0".
Build clean; TestMsFeature suite 17/17 passing.
---------
Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
#1066) * almost there. No test * made tests pass * PR Cleanup * revise logic * review response --------- Co-authored-by: trishorts <mshort@chem.wisc.edu>
* initial commit * Testing * pr response --------- Co-authored-by: trishorts <mshort@chem.wisc.edu>
* separate deconvolution Score and GenericScore
* generic deconvolution score accessibility
* internal code review
* eliminate some ai noise
* spectrum aware generic deconvolution score
* Per the skill: mini commit message before moving to the test discovery step.
fix(test): rename Test.MassSpectrometry.Deconvolution to avoid prod-namespace shadowing
Two test files in Test/MassSpectrometryTests/Deconvolution/ declared namespace Test.MassSpectrometry.Deconvolution, which shadowed the production-library MassSpectrometry namespace from inside the test project. C#'s relative-namespace resolution then mis-resolved "MassSpectrometry.DissociationType" (and similar references) to Test.MassSpectrometry.DissociationType -- a type that doesn't exist -- causing 18 cascading CS0234 errors in unrelated files (TestDatabaseLoaders, BackboneFragmentation, FragmentModificationTests, etc.).
Rename to Test.MassSpectrometryTests.Deconvolution to match the surrounding files in the same directory and the directory name itself. Update the four "using static" references that pointed at the old namespace.
Files changed:
- DeconvolutionTestHelpers.cs (namespace)
- TestDeconvolutionScorerSpectrumAware.cs (namespace + using static)
- TestDeconvolutionScorerNegativeCharge.cs (using static)
- TestDeconvolutionScorerUnit.cs (using static)
- TestIsotopicEnvelopeExtensions.cs (using static)
---------
Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
Co-authored-by: Nic Bollis <nbollis@comcast.net>
* Lib: Equality * Similarity Base Extensions * MsDataScan * Library Inheritance and construction * Similarity: More Entrance Points * Similarity Extension Methods * more tests * address review comments --------- Co-authored-by: trishorts <mshort@chem.wisc.edu>
…na RetentionTimeModel (#1046)
* correct Within calculation
* update unit tests
* this is the spot
* new retention time interface works with chronologer and koina prosit
* unbroken
* play nicely with others
* update tests that call koina http
* retention time model tests
* Tighten the new IRetentionTimePredictor batch interface and the CI that validates it.
CI / test categorization:
- Add an Integration test step to the workflow's integration job; the build job's Category!=Integration filter was silently dropping all six mzLib [Category("Integration")] test files from CI.
- Switch filter property name TestCategory -> Category (canonical NUnit-friendly form, stable across NUnit3TestAdapter versions).
- Replace [Explicit] with [Category("Integration")] on the two Prosit2019iRT live-API batch tests so they run under the new step.
Behavior fixes:
- PredictRetentionTime (single) now distinguishes IncompatibleModifications (input rejected pre-flight) from PredictionError (model/HTTP failure) by inspecting the matching PeptideRTPrediction.Warning.
- Validate peptide.FullSequence in addition to BaseSequence; the dictionary path is keyed by FullSequence and would otherwise NRE through the Dictionary lookup.
- Skip null elements in the Koina batch and the interface default's foreach loop, instead of NRE-ing outside the documented ArgumentNullException contract.
Documentation:
- Clarify that RetentionTimeModel is not thread-safe and that the batch method does not cache results across calls.
- Document the null-element-skip behavior on the interface default.
Test isolation:
- Lazy-init the Chronologer in RetentionTimePredictorInterfaceTests so TorchSharp model-load failures only affect the three Chronologer-using tests, not the mock-only fixture members.
* fix(predictors): implement new IRetentionTimePredictor members on Koina
The IRetentionTimePredictor interface added three members on this branch (PredictRetentionTimeEquivalent, PredictRetentionTimeEquivalents, and the IDisposable.Dispose inherited from the now-IDisposable interface), but the Koina-side RetentionTimeModel implementation hadn't caught up, breaking the build at the class declaration with three CS0535 errors.
Stubs delegate through to existing behavior:
- PredictRetentionTimeEquivalent forwards to PredictRetentionTime; for a Koina RT model the predicted value already IS the iRT/equivalent.
- PredictRetentionTimeEquivalents routes through the existing batch PredictRetentionTimes (single Koina HTTP call) and reshapes into the tuple list the interface specifies. The maxThreads parameter is accepted for compatibility but ignored -- Koina is one HTTP request regardless of thread count.
- Dispose is a virtual no-op. KoinaModelBase doesn't hold unmanaged resources today; subclasses with real disposable state can override.
Same three-stub pattern applied to MockPredictor in RetentionTimePredictorInterfaceTests so the test fixture compiles.
* fix(retention-time): tighten Koina IRetentionTimePredictor implementation
Address review findings on the unified IRetentionTimePredictor contract and its Koina implementation:
- Filter null peptides and null/empty FullSequence in batch input before DistinctBy, so callers can pass mixed lists without an NRE.
- Clear Predictions at the top of every PredictRetentionTimes call so prior-call data can't leak into the singular method's failure-reason fallback after a transient HTTP failure.
- Remove the outer try/catch in PredictRetentionTimes; swallow-everything was hiding programmer errors. Predict is the only call that legitimately needs the inner catch and that one stays.
- Map a null peptide on the singular method to EmptySequence (not PredictionError); rename the matching test.
- Normalize NaN/+/-Infinity from the model to null so the documented "null means prediction was not possible" contract holds end-to-end.
- Make the default IRetentionTimePredictor.PredictRetentionTimes impl skip null/empty-FullSequence peptides, mirroring the Koina override's defensive contract.
- Restore multi-line XML doc form on SeparationType.
Tests:
- Add Prosit2019iRT_PredictRetentionTimes_ModifiedPeptide_ReturnsKeyedByOriginalFullSequence to pin the dictionary-keying contract end-to-end.
- Tighten the null-input default-impl test (assert ParamName) and add a PredictRetentionTimes_DefaultImpl_NullElement_IsSkipped case.
- Drop the now-redundant NullResultEntries_FailureReasonNotAvailable test.
* cover RetentionTimeModel.PredictRetentionTimeEquivalents
Add 3 NUnit tests using the existing TestableRetentionTimeModel stub harness so no Koina HTTP call is made:
- PredictRetentionTimeEquivalents_NullPeptides_ThrowsArgumentNullException pins the input-null guard so a typo'd peptide list fails fast.
- PredictRetentionTimeEquivalents_SuccessfulBatch_ReturnsTuplesKeyedByOriginalPeptides pins the happy-path reshape: one tuple per input, PredictedValue carries through from the batch result, original IRetentionPredictable round-trips through the .Peptide slot, FailureReason is null.
- PredictRetentionTimeEquivalents_FailedPrediction_PreservesWarningAsIncompatibleMods pins the failure-reshape branch: when a batch result lacks a value AND the matching Predictions entry has a Warning, FailureReason is IncompatibleModifications (not the catch-all PredictionError) so callers can distinguish model/network failure from sequence-incompatibility.
Also document the three intentionally-untested IRetentionTimePredictor members in the fixture's class doc: PredictRetentionTimeEquivalent (one-line delegation), Dispose (empty no-op default), and the multi-batch throttling delay inside AsyncThrottledPredictor.
* tighten null-peptide and empty-sequence contracts
- PredictRetentionTimeEquivalents: stop silently dropping peptides with empty/null FullSequence. Null-peptide references are still dropped (no identity worth tracking), but a real-but-malformed peptide now comes back as a (null, peptide, EmptySequence) tuple so callers index-aligning result-to-input get a 1:1 mapping for everything they passed.
- GetFormattedSequence: change null-peptide reason from PredictionError to EmptySequence so callers branching on FailureReason see the same value as PredictRetentionTime's null-peptide handling.
- PredictRetentionTimeEquivalents fallback: drop the dead-defensive "Predictions?." -- PredictRetentionTimes (called immediately above) always re-initializes Predictions to a non-null list, so the null guard hides the invariant. Matches the unconditional access pattern in PredictRetentionTime.
- IRetentionTimePredictor.PredictRetentionTimes (default impl): narrow the input-skip from "FullSequence is null OR empty" to "null only". An empty FullSequence is a legal dictionary key; let it through so PredictRetentionTime returns null with EmptySequence and the caller sees results[""] = null instead of a silent drop. Comment now explains the ArgumentNullException rationale precisely.
Test update: GetFormattedSequence_NullPeptide_ReturnsNullWithPredictionError renamed and re-asserted against EmptySequence (the test's name itself encoded the old behavior).
* mark legacy PredictRetentionTime[s] obsolete on IRetentionTimePredictor
Per @nbollis's PR #1046 feedback: the singular and batch PredictRetentionTime[s] pair was a duplicate entry point alongside PredictRetentionTimeEquivalent[s]. Mark the legacy pair [Obsolete] on the interface and on the Koina RetentionTimeModel, and invert ownership in the Koina implementation so the Equivalent[s] pair holds the real HTTP-batch logic, matching the pattern already in place on the RetentionTimePredictor abstract base.
Interface (IRetentionTimePredictor):
- Mark PredictRetentionTime and PredictRetentionTimes [Obsolete] with a message pointing callers at the Equivalent[s] pair.
- The default PredictRetentionTimes implementation now routes through PredictRetentionTimeEquivalent so the default itself does not self-trigger CS0618.
Koina RetentionTimeModel:
- PredictRetentionTimeEquivalents now owns the Koina batch HTTP call: DistinctBy dedup, Predictions clear, Predict() invocation, NaN/Inf normalization, defensive null-fill, exception swallow, and the IncompatibleModifications vs. PredictionError failure-reason fallback.
- PredictRetentionTimeEquivalent now owns the singular flow: validates the peptide and routes through PredictRetentionTimeEquivalents with a list-of-one.
- PredictRetentionTime is now a thin [Obsolete] shim delegating to PredictRetentionTimeEquivalent.
- PredictRetentionTimes is now a thin [Obsolete] shim that calls PredictRetentionTimeEquivalents and reshapes the tuple list into the IReadOnlyDictionary<string, double?> the legacy contract returned. TryAdd preserves first-write-wins on duplicate FullSequence keys.
Tests:
- RetentionTimeModelTests and RetentionTimePredictorInterfaceTests gain a file-level `#pragma warning disable CS0618` with a comment noting the legacy surface is intentionally exercised for backward-compat coverage. No tests were removed or rewritten.
Public API behavior is unchanged: the obsolete methods still work and produce the same results; new callers should prefer the Equivalent[s] pair.
* build Test project in Release before integration test step
The integration job ran `dotnet test --no-build ./Test/Test.csproj` but only built the mzLib solution, which does not include Test.csproj. Test.dll was never produced in bin/Release, so VSTest failed with "test source file not found". Add an explicit Build (Test) step, mirroring the build job.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(rt): drop obsolete PredictRetentionTime pair from interface
The [Obsolete] PredictRetentionTime / PredictRetentionTimes pair on IRetentionTimePredictor was a backward-compat shim during the migration to PredictRetentionTimeEquivalent / PredictRetentionTimeEquivalents. Consumers are migrated, so the legacy surface is removed.
Cascade:
- Drop the [Obsolete] shims in RetentionTimePredictor and Koina RetentionTimeModel.
- Delete RetentionTimePredictorInterfaceTests.cs, which was dedicated to exercising the obsolete pair.
- Strip the obsolete-only test sections from RetentionTimeModelTests and drop the file-level CS0618 pragma.
- Refresh the class diagram and a couple of stale doc references.
Also rename the four Koina test fixtures' NUnit category from [Integration] to [Koina] and update the workflow job (filter and job name) accordingly, so the category no longer collides with unrelated "Integration" usage on the development branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(koina): cover RetentionTimeModel single + batch failure branches
Add 6 NUnit tests against TestableRetentionTimeModel to replace the coverage lost when the obsolete PredictRetentionTime / PredictRetentionTimes pair was removed.
PredictRetentionTimeEquivalent:
- null peptide -> null + EmptySequence
- empty FullSequence -> null + EmptySequence
- happy path -> value flows through the list-of-one batch delegation
PredictRetentionTimeEquivalents (uncovered branches):
- predictor throws -> all valid peptides surface as PredictionError (complement of the existing warning -> IncompatibleModifications test)
- predictor returns fewer results than sent -> missing entries filled as PredictionError tuples instead of being dropped
- empty-FullSequence peptide -> preserved as EmptySequence tuple in the output, distinct from the model-side PredictionError branch
Lifts RetentionTimeModel.cs from 86.8% to 98.7% line coverage.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Rename job and simplify workflow steps
Renamed job 'koina' to 'integration' and removed test steps for mzLib Koina.
---------
Co-authored-by: MICHAEL SHORTREED <mrshortreed@wisc.edu>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Nic Bollis <nbollis@comcast.net>
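The batch contract that the retention-time commits above converge on can be summarised in a small sketch. It is a Python illustration under stated assumptions, not the C# implementation: a peptide is modelled as its FullSequence string (`None` for a null reference, `""` for an empty FullSequence), `predict` is a hypothetical stand-in for the single batch call that returns a values mapping and a warnings mapping, `ValueError` stands in for ArgumentNullException, and emitting one tuple per non-null input (including duplicates) is an assumption where the log is silent.

```python
import math
from enum import Enum


class FailureReason(Enum):
    EMPTY_SEQUENCE = "EmptySequence"
    INCOMPATIBLE_MODIFICATIONS = "IncompatibleModifications"
    PREDICTION_ERROR = "PredictionError"


def predict_equivalents(peptides, predict):
    """Return (value, peptide, failure_reason) tuples for a batch of peptides."""
    if peptides is None:
        raise ValueError("peptides")  # stand-in for the null-input guard
    # Null references are dropped; there is no identity worth tracking.
    present = [p for p in peptides if p is not None]
    # De-duplicate the non-empty sequences before the single batch call.
    unique = list(dict.fromkeys(p for p in present if p))
    values, warnings = {}, {}
    if unique:
        try:
            values, warnings = predict(unique)
        except Exception:
            # A failed call leaves every valid peptide without a value.
            values, warnings = {}, {}
    results = []
    for peptide in present:
        if not peptide:
            results.append((None, peptide, FailureReason.EMPTY_SEQUENCE))
            continue
        value = values.get(peptide)
        if value is not None and math.isfinite(value):
            results.append((value, peptide, None))
        elif peptide in warnings:
            # A warning means the input was rejected, not that the model failed.
            results.append((None, peptide, FailureReason.INCOMPATIBLE_MODIFICATIONS))
        else:
            # Missing entries and NaN/Infinity both normalise to "no prediction".
            results.append((None, peptide, FailureReason.PREDICTION_ERROR))
    return results
```

Keeping the empty-sequence case as its own tuple, rather than dropping it, is what lets a caller line results up against the inputs it passed.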