feature: ATL-57: reduce patient hla data expansion#1446
feature: ATL-57: reduce patient hla data expansion#1446SergeySalinAtDataart merged 3 commits intomasterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors match probability calculation to avoid repeatedly expanding patient genotypes, enabling reuse of a precomputed patient genotype set across multiple donors (notably improving batch execution behavior in the match prediction pipeline).
Changes:
- Introduced
IGenotypeSetServiceto build and reuseSubjectGenotypeSet(patient expanded once, donor expanded per run). - Updated
IMatchProbabilityService/GenotypeMatcherAPIs to accept a precomputed patient genotype set and reduce repeated patient expansion. - Updated DI wiring plus unit/integration tests to use the new flow.
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| Atlas.MatchPrediction/Services/MatchProbability/MatchProbabilityService.cs | API updated to accept SubjectGenotypeSet and pass it into genotype matching. |
| Atlas.MatchPrediction/Services/MatchProbability/GenotypeSetService.cs | New service encapsulating genotype expansion/conversion and patient genotype set building. |
| Atlas.MatchPrediction/Services/MatchProbability/GenotypeMatcher.cs | Updated matcher to reuse provided patient genotype set and only expand donor genotypes. |
| Atlas.MatchPrediction/Models/SubjectGenotypeSet.cs | New model to carry expanded genotypes plus likelihood summary and representation flag. |
| Atlas.MatchPrediction/ExternalInterface/MatchPredictionAlgorithm.cs | Computes patient genotype set once (including batch reuse) and passes into match probability calculation. |
| Atlas.MatchPrediction/ExternalInterface/DependencyInjection/ServiceConfiguration.cs | Registers IGenotypeSetService. |
| Atlas.MatchPrediction.Test/Services/MatchProbability/MatchProbabilityServiceTests.cs | Updated tests for new CalculateMatchProbability signature and patient genotype passing. |
| Atlas.MatchPrediction.Test/Services/MatchProbability/GenotypeMatcherTests.cs | Updated tests to reflect new matcher dependencies/behavior (patient genotype set provided). |
| Atlas.MatchPrediction.Test/Services/MatchPredictionAlgorithmTests.cs | New tests ensuring batch computes patient genotypes once and reuses them per donor. |
| Atlas.MatchPrediction.Test.Integration/** | Updated integration tests to compute patient genotype set before calling match probability service. |
| Atlas.Common.Public.Models/MatchPrediction/MatchPredictionParameters.cs | Annotated MatchingAlgorithmHlaNomenclatureVersion as nullable (string?). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| public async Task<MatchProbabilityResponse> CalculateMatchProbability( | ||
| SingleDonorMatchProbabilityInput singleDonorMatchProbabilityInput, | ||
| SubjectGenotypeSet patientGenotypeSet) | ||
| { |
| /// <inheritdoc /> | ||
| public async Task<GenotypeMatcherResult> MatchPatientDonorGenotypes(GenotypeMatcherInput input) | ||
| { | ||
| private readonly IGenotypeImputationService genotypeImputer; | ||
| private readonly IGenotypeConverter genotypeConverter; | ||
| private readonly IMatchCalculationService matchCalculationService; | ||
| private readonly ILogger logger; | ||
|
|
||
| private record GenotypeSet(bool IsUnrepresented, ICollection<GenotypeAtDesiredResolutions> Genotypes, decimal SumOfLikelihoods); | ||
|
|
||
| private readonly Dictionary<int, Dictionary<(Locus, string), string>> LocusValueReplacementMapping = | ||
| new() | ||
| { | ||
| { | ||
| 3590, new Dictionary<(Locus, string), string> | ||
| { | ||
| { (Locus.C, "*04:09N"), "*04:09L" }, | ||
| { (Locus.C, "04:09N"), "04:09L" }, | ||
| } | ||
| } | ||
| }; | ||
|
|
||
| public GenotypeMatcher( | ||
| IGenotypeImputationService genotypeImputer, | ||
| IGenotypeConverter genotypeConverter, | ||
| IMatchCalculationService matchCalculationService, | ||
| // ReSharper disable once SuggestBaseTypeForParameterInConstructor | ||
| IMatchPredictionLogger<MatchProbabilityLoggingContext> logger) | ||
| { | ||
| this.genotypeImputer = genotypeImputer; | ||
| this.genotypeConverter = genotypeConverter; | ||
| this.matchCalculationService = matchCalculationService; | ||
| this.logger = logger; | ||
| } | ||
| var patientGenotypeSet = input.PatientGenotypeSet; | ||
| var donorGenotypeSet = await genotypeSetService.GetGenotypeSet(input.DonorData, input.MatchPredictionParameters); | ||
|
|
||
| /// <inheritdoc /> | ||
| public async Task<GenotypeMatcherResult> MatchPatientDonorGenotypes(GenotypeMatcherInput input) | ||
| if (patientGenotypeSet.IsUnrepresented || donorGenotypeSet.IsUnrepresented) | ||
| { |
| int? matchingAlgorithmHlaNomenclatureVersion = parameters.MatchingAlgorithmHlaNomenclatureVersion != null | ||
| ? int.Parse(parameters.MatchingAlgorithmHlaNomenclatureVersion) | ||
| : null; |
There was a problem hiding this comment.
I think it's fine, don't expect any other version types
| private readonly Dictionary<int, Dictionary<(Locus, string), string>> LocusValueReplacementMapping = | ||
| new() | ||
| { | ||
| { | ||
| 3590, new Dictionary<(Locus, string), string> | ||
| { | ||
| { (Locus.C, "*04:09N"), "*04:09L" }, | ||
| { (Locus.C, "04:09N"), "04:09L" }, | ||
| } | ||
| } | ||
| }; |
| var matchProbabilityInputs = multipleDonorMatchProbabilityInput.SingleDonorMatchProbabilityInputs.ToList(); | ||
| if (matchProbabilityInputs.Count == 0) |
| public async Task<SubjectGenotypeSet> GetPatientGenotypeSet(SingleDonorMatchProbabilityInput input) | ||
| { | ||
| await new MatchProbabilityNonDonorValidator().ValidateAndThrowAsync(input); | ||
|
|
||
| var allowedLoci = LocusSettings.MatchPredictionLoci.Except(input.ExcludedLoci).ToHashSet(); | ||
| var patientFrequencySet = await haplotypeFrequencyService.GetSingleHaplotypeFrequencySet( | ||
| input.PatientFrequencySetMetadata ?? new FrequencySetMetadata() | ||
| ); | ||
|
|
||
| return await BuildGenotypeSet( | ||
| new SubjectData( | ||
| input.PatientHla.ToPhenotypeInfo(), | ||
| new SubjectFrequencySet(patientFrequencySet, "patient") | ||
| ), | ||
| new MatchPredictionParameters(allowedLoci, input.MatchingAlgorithmHlaNomenclatureVersion) | ||
| ); |
daria-sorokina-da
left a comment
There was a problem hiding this comment.
@daria-sorokina-da reviewed 27 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on ikogan-da, pchenery, and SergeySalinAtDataart).
| int? matchingAlgorithmHlaNomenclatureVersion = parameters.MatchingAlgorithmHlaNomenclatureVersion != null | ||
| ? int.Parse(parameters.MatchingAlgorithmHlaNomenclatureVersion) | ||
| : null; |
There was a problem hiding this comment.
I think it's fine, don't expect any other version types
|
I've implemented all the fixes that Copilot desired (even though I find some of them overly defensive). We can fix them later when we enable proper nullable types in the whole solution |
daria-sorokina-da
left a comment
There was a problem hiding this comment.
@daria-sorokina-da reviewed 5 files and all commit messages, and resolved 5 discussions.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on ikogan-da and pchenery).
This change is