Add QuantileOutput support to DeepAR #3280
Open
timoschowski wants to merge 8 commits into awslabs:dev from
Conversation
…data. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Key fixes:
- Changed scaling parameter default from True to None (defaults to False for quantile output, matching MXNet's NOPScaler behavior)
- Removed incorrect scale multiplication before quantile projection (MXNet applies quantile_proj directly to decoder output)
- Added checkpoint loading error handling in PyTorchLightningEstimator
- Added AddSeriesScale transformation for forking sequence models

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Added static features (including log(scale)) to RNN encoder input to match CNN encoder and MXNet MQRNN behavior.

Changes:
- RNNEncoder.forward() now concatenates target + static_features + dynamic_features
- Matches CNN encoder feature concatenation pattern
- Matches MXNet RNNEncoder._assemble_inputs() behavior

Results:
- Reduced MAE difference from 32.85% to 21.16% on electricity dataset
- RMSE difference improved to 8.62%
- Still investigating remaining ~20% difference

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
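The input-assembly pattern this commit describes can be sketched as follows. This is a standalone illustration with made-up shapes, not the actual GluonTS `RNNEncoder` code: static features are repeated across every time step, then concatenated with the target and dynamic features along the feature axis.

```python
import numpy as np

T = 24                                   # encoder (context) length
target = np.random.randn(T, 1)           # (T, 1) scaled target
static = np.random.randn(3)              # (3,) e.g. embeddings + log(scale)
dynamic = np.random.randn(T, 2)          # (T, 2) time features

# Static features carry no time index, so they are tiled over T steps
# before concatenation with the per-step inputs.
static_tiled = np.tile(static, (T, 1))   # (T, 3)
rnn_input = np.concatenate([target, static_tiled, dynamic], axis=-1)
print(rnn_input.shape)  # (24, 6)
```

Without the static block, the RNN never sees log(scale), which is one plausible source of the MAE gap the commit reports.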
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds unit tests, integration tests, and parity tests to ensure MQCNN and MQRNN PyTorch implementations work correctly and maintain parity with MXNet reference implementations.

Key additions:
- Regression test for lazy initialization optimizer bug
- Integration tests for both MQCNN and MQRNN estimators
- MXNet vs PyTorch parity tests with documented tolerances
- Tests verify RNN parameters are trained correctly

Test coverage:
- test_optimizer_includes_all_parameters: Critical regression test
- test_mq_dnn_estimator_constant_dataset: End-to-end estimator tests
- test_mq_dnn_mxnet_pytorch_parity: Framework parity verification
- Additional tests for various configurations and edge cases

All tests include clear assertions and failure messages to facilitate debugging if issues arise.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Remove backup files and debug output that should not be included in the pull request:
- estimator.py.backup (backup file)
- extreme_scales_output.txt (debug output)
- MQ_DNN_MIGRATION_SUMMARY.md (internal documentation)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Enable DeepAR to use QuantileOutput (pinball loss) as an alternative to distribution-based outputs (e.g. StudentTOutput with NLL loss). This gives users a simpler, non-parametric option for probabilistic forecasting with DeepAR's autoregressive architecture.

Changes:
- module.py: Widen distr_output type to Output, refactor forward() into _forward_distribution() and _forward_quantile() paths, add assertion to output_distribution(), raise NotImplementedError in log_prob() for QuantileOutput
- estimator.py: Widen distr_output type, select QuantileForecastGenerator vs SampleForecastGenerator in create_predictor()
- test_deepar_modules.py: Add test_deepar_quantile_output() covering shapes, loss, and log_prob error
- examples/: Add comparison scripts on synthetic and electricity data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
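The median-feedback decoding that `_forward_quantile()` introduces can be illustrated with a toy loop. This is a hypothetical stand-in model, not GluonTS code: at each step the network emits several quantiles, and the median (q = 0.5) prediction, rather than a random sample, is fed back as the next step's input.

```python
quantiles = [0.1, 0.5, 0.9]
median_idx = quantiles.index(0.5)

def step(last_value):
    # Stand-in for one decoder step: each quantile is modeled as a
    # fixed offset around a simple persistence forecast.
    return [last_value + (q - 0.5) for q in quantiles]

last = 10.0
forecast = []
for _ in range(3):          # prediction_length = 3
    q_values = step(last)
    forecast.append(q_values)
    last = q_values[median_idx]   # feed the median back, not a sample

print(forecast[0][median_idx])  # 10.0: persistence keeps the median flat
```

Because the feedback is deterministic, repeated decoding always yields the same quantile trajectories, which is why `num_parallel_samples` becomes irrelevant for this path.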
Contributor
Author
Hello. Is it possible to use quantile loss for computing the validation loss and keep likelihood for the training loss? DeepAR is often unstable, and minimizing a validation quantile loss while still training the model to predict the full distribution rather than quantiles could be useful. This is implemented in Nixtla's NeuralForecast, which however lacks essential functions such as generating dependent sample paths.
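The commenter's idea can be approximated outside the training loop today. The sketch below is an assumption, not an existing GluonTS or NeuralForecast API: train with likelihood as usual, then monitor a pinball loss on validation data computed from empirical quantiles of the model's sample paths.

```python
import numpy as np

def pinball(y, y_hat, q):
    # Pinball (quantile) loss averaged over the horizon.
    e = y - y_hat
    return np.mean(np.maximum(q * e, (q - 1) * e))

def validation_quantile_loss(y_true, sample_paths, quantiles=(0.1, 0.5, 0.9)):
    # sample_paths: (num_samples, prediction_length) drawn from the
    # predictive distribution; empirical quantiles stand in for the
    # direct quantile predictions a QuantileOutput model would emit.
    return sum(
        pinball(y_true, np.quantile(sample_paths, q, axis=0), q)
        for q in quantiles
    ) / len(quantiles)

rng = np.random.default_rng(0)
y_true = np.zeros(12)
samples = rng.normal(0.0, 1.0, size=(200, 12))
print(validation_quantile_loss(y_true, samples))
```

Wiring this metric into the Lightning validation step while keeping NLL as the training objective would give the hybrid behavior described above, without giving up sample-path generation.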


Summary
- Adds `QuantileOutput` support in DeepAR, enabling direct quantile regression (pinball loss) as an alternative to distribution-based outputs trained with NLL loss
- Refactors `forward()` into `_forward_distribution()` (existing path, unchanged) and `_forward_quantile()` (new path using median for autoregressive feedback)
- Wires up `QuantileForecastGenerator` in the estimator's `create_predictor()` so end-to-end training and inference works out of the box

Motivation
DeepAR currently only supports distribution-based outputs (e.g. `StudentTOutput`, `NormalOutput`). This PR gives users a simpler, non-parametric alternative, `QuantileOutput`, that directly predicts quantile values via pinball loss, similar to what MQ-CNN already supports. This is useful when users want specific quantile forecasts without assuming a parametric distribution family.

Usage
Changes
- src/gluonts/torch/model/deepar/module.py: widen `distr_output` type to `Output`, dispatch `forward()` to distribution/quantile paths, add `_forward_quantile()`, guard `output_distribution()` and `log_prob()`
- src/gluonts/torch/model/deepar/estimator.py: widen `distr_output` type, select `QuantileForecastGenerator` vs `SampleForecastGenerator` in `create_predictor()`
- test/torch/model/test_deepar_modules.py: add `test_deepar_quantile_output()` covering shapes, loss, and the log_prob error
- examples/deepar_quantile_comparison.py
- examples/deepar_electricity_studentt_vs_quantile.py

Design decisions
- `_forward_quantile()` returns quantile values in scale-normalized space; `QuantileForecastGenerator` handles scale multiplication, consistent with how `QuantileOutput.loss()` already works
- `num_parallel_samples` is ignored for QuantileOutput: quantile predictions are deterministic, so no sampling is needed
- `isinstance` dispatch: simple, localized, fully backward-compatible, with no changes to the `Output` class hierarchy

Test plan
- `python -m pytest test/torch/model/test_deepar_modules.py -v`: all 11 tests pass (4 existing distribution tests + 6 RNN input tests + 1 new quantile test)
- `examples/deepar_quantile_comparison.py`: trains both models, produces forecasts, prints metrics, saves a plot
- `examples/deepar_electricity_studentt_vs_quantile.py`: end-to-end on the electricity dataset with GluonTS Evaluator metrics

🤖 Generated with Claude Code
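The scale-handling design decision above (network emits quantiles in scale-normalized space; the forecast generator rescales) can be sketched numerically. The values and shapes here are illustrative assumptions, not the actual generator code:

```python
# Per-series scale produced by the scaler during preprocessing.
scale = 250.0

# Network output for quantiles q = 0.1 / 0.5 / 0.9, in normalized space.
normalized_q = [0.8, 1.0, 1.3]

# The forecast generator, not the module, maps back to the data scale.
forecast_q = [scale * v for v in normalized_q]
print(forecast_q)  # [200.0, 250.0, 325.0]
```

Keeping rescaling out of the module mirrors how `QuantileOutput.loss()` already operates on normalized targets, so training and inference stay consistent.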