
fix batch_size param #425

Open

sebffischer wants to merge 1 commit into main from dont-require-batch-size

Conversation

@sebffischer
Member

#420

Need to add tests

I still think the batch_size parameter should be required for prediction.

@tdhock
Contributor

tdhock commented Aug 29, 2025

This seems reasonable, but there are CI errors:


══ Failed tests ════════════════════════════════════════════════════════════════
── Error ('test_PipeOpModule.R:6:3'): PipeOpModule: basic checks ───────────────
Error in `expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values)`: object 'paradox_info' not found
Backtrace:
    ▆
 1. └─mlr3torch:::expect_pipeop(po_fn) at test_PipeOpModule.R:6:3
 2.   └─mlr3torch:::expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values) at tests/testthat/helper_functions.R:64:3
── Error ('test_PipeOpTaskPreprocTorch.R:3:3'): basic ──────────────────────────
Error in `expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values)`: object 'paradox_info' not found
Backtrace:
    ▆
 1. └─mlr3torch:::expect_pipeop(po_test) at test_PipeOpTaskPreprocTorch.R:3:3
 2.   └─mlr3torch:::expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values) at tests/testthat/helper_functions.R:64:3
── Error ('test_PipeOpTorch.R:6:3'): Basic checks ──────────────────────────────
Error in `expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values)`: object 'paradox_info' not found
Backtrace:
    ▆
 1. └─mlr3torch:::expect_pipeop(obj) at test_PipeOpTorch.R:6:3
 2.   └─mlr3torch:::expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values) at tests/testthat/helper_functions.R:64:3
── Error ('test_PipeOpTorchBlock.R:5:3'): linear graph ─────────────────────────
Error in `expect_valid_pipeop_param_set(po, check_ps_default_values = check_ps_default_values)`: object 'paradox_info' not found
...

@sebffischer
Member Author

Yeah, it's not done yet. Got sidetracked, but I will probably address this Monday.

@tdhock
Contributor

tdhock commented Nov 19, 2025

For a test case I would suggest modifying the example code from https://github.com/tdhock/mlr3torchAUM/blob/main/man/batch_sampler_stratified.Rd

First I installed mlr3torch from this PR branch, then I ran the example code.

devtools::install_github("mlr-org/mlr3torch@dont-require-batch-size")
## Imbalanced version of sonar data set.
sonar_task <- mlr3::tsk("sonar")
sonar_task$col_roles$stratum <- "Class"
sonar_task$filter(208:86) # for imbalance.
batch_sampler_class <- mlr3torchAUM::batch_sampler_stratified(min_samples_per_stratum = 1)

# batch_size together with batch_sampler: should be rejected.
L_size_sampler <- mlr3torch::LearnerTorchMLP$new(task_type="classif")
L_size_sampler$param_set$set_values(
  epochs=1, batch_size=10, seed=1,
  batch_sampler=batch_sampler_class)
L_size_sampler$train(sonar_task)

# batch_sampler only, shuffle left at its default.
L_sampler <- mlr3torch::LearnerTorchMLP$new(task_type="classif")
L_sampler$param_set$set_values(
  epochs=1, seed=1,
  batch_sampler=batch_sampler_class)
L_sampler$train(sonar_task)

# batch_sampler only, shuffle explicitly unset.
L_no_shuffle <- mlr3torch::LearnerTorchMLP$new(task_type="classif")
L_no_shuffle$param_set$set_values(
  epochs=1, seed=1, shuffle=NULL,
  batch_sampler=batch_sampler_class)
L_no_shuffle$train(sonar_task)
L_no_shuffle$predict(sonar_task)

I get the output below:

> L_size_sampler$param_set$set_values(
+ epochs=1, batch_size=10, seed=1,
+ batch_sampler=batch_sampler_class)
> L_size_sampler$train(sonar_task)
Error: Provide either 'sampler', 'batch_sampler', or 'batch_size'.

The output above is an error message, which is expected, because both batch_size and batch_sampler were provided, but only one should be.

> L_sampler$param_set$set_values(
+ epochs=1, seed=1,
+ batch_sampler=batch_sampler_class)
> L_sampler$train(sonar_task)
Error: 'shuffle' and 'drop_last' are only allowed when 'batch_size' is provided.

The output above is another error message, indicating that shuffle should not be provided. Could this error message please be suppressed, for user-friendliness? (In this case the user did not explicitly set shuffle, but its default value triggers the error, which seems unexpected.)

> L_no_shuffle$param_set$set_values(
+ epochs=1, seed=1, shuffle=NULL,
+ batch_sampler=batch_sampler_class)
> L_no_shuffle$train(sonar_task)
> L_no_shuffle$predict(sonar_task)
Error in .__LearnerTorch__.dataloader_predict(self = self, private = private,  : 
  'batch_size' must be provided for prediction.

Above we see that if shuffle is set to NULL, then training works without error (as expected), but prediction gives an error (not expected).
Also, in this case it would be more user-friendly if we did not have to explicitly set shuffle=NULL.

I will create simpler test cases based on these examples.

@tdhock
Contributor

tdhock commented Nov 19, 2025

I propose the following test cases:

sonar_task <- mlr3::tsk("sonar")

# Minimal batch sampler: always yields a single batch containing only row 1.
batch_sampler_class <- torch::sampler(
  "Batch_Sampler_Always_Return_One",
  initialize = function(...){},
  .iter = function() function() 1,
  .length = function() 1)

library(testthat)

test_that("batch_size with batch_sampler is an error", {
  L_size_sampler <- mlr3torch::LearnerTorchMLP$new(task_type="classif")
  L_size_sampler$param_set$set_values(
    epochs=1, batch_size=10,
    batch_sampler=batch_sampler_class)
  expect_error({
    L_size_sampler$train(sonar_task)
  }, "Provide either 'sampler', 'batch_sampler', or 'batch_size'.")
})

test_that("train and predict work with batch_sampler, not set batch_size, nor shuffle", {
  L_sampler <- mlr3torch::LearnerTorchMLP$new(task_type="classif")
  L_sampler$param_set$set_values(
    epochs=1,
    batch_sampler=batch_sampler_class)
  L_sampler$train(sonar_task)
  expect_s3_class(L_sampler$model, "learner_torch_model")
  pred <- L_sampler$predict(sonar_task)
  expect_s3_class(pred, "PredictionClassif")
})

The first test case passes using the current code, whereas the second test case fails.

For the training to work without the user setting shuffle, I would suggest updating how the default shuffle (TRUE) is handled. Possible fixes (see the sketch below):

  • change the default shuffle to NULL? (which could be interpreted as TRUE when batch_size is provided)
  • change the error-checking logic so that shuffle=TRUE is not an error with batch_sampler?
  • keep the current shuffle logic and update the test to set shuffle=NULL? (seems not user-friendly)
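
To make the second option concrete, here is a minimal sketch of what the check could look like (check_dataloader_args is a hypothetical name for illustration, not mlr3torch's actual internals):

# Hypothetical sketch: with a batch_sampler, tolerate shuffle/drop_last as long
# as they still hold their default values, and only error if the user changed
# them (an explicit shuffle=TRUE is indistinguishable from the default here).
check_dataloader_args <- function(batch_size = NULL, sampler = NULL,
                                  batch_sampler = NULL, shuffle = TRUE,
                                  drop_last = FALSE) {
  if (!is.null(batch_sampler)) {
    if (!is.null(batch_size) || !is.null(sampler)) {
      stop("Provide either 'sampler', 'batch_sampler', or 'batch_size'.")
    }
    if (!isTRUE(shuffle) || !isFALSE(drop_last)) {
      stop("'shuffle' and 'drop_last' are only allowed when 'batch_size' is provided.")
    }
    # The defaults are silently dropped instead of triggering the error above.
    shuffle <- NULL
    drop_last <- NULL
  }
  list(batch_size = batch_size, sampler = sampler,
       batch_sampler = batch_sampler, shuffle = shuffle, drop_last = drop_last)
}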

For the prediction to work, I guess there needs to be some change in how the batch_size is defined for the prediction data loader?
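
For illustration, a fallback could look like the hypothetical helper below (get_predict_batch_size and its default are made up for this sketch, not current mlr3torch behaviour):

# Hypothetical fallback: if no batch_size was configured, predict on all rows
# at once, capped at a fixed default to bound memory usage.
get_predict_batch_size <- function(param_vals, n_rows, default = 1024L) {
  if (!is.null(param_vals$batch_size)) {
    param_vals$batch_size
  } else {
    min(n_rows, default)
  }
}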

Does that seem reasonable, @sebffischer?

@tdhock
Contributor

tdhock commented Nov 20, 2025

I still think the batch_size parameter should be required for prediction.

Why?

Batch size is a concept defined for training (= the number of samples used to compute a gradient), and there are no gradients involved in prediction.

@sebffischer
Member Author

I still think the batch_size parameter should be required for prediction.

Why?

Batch size is a concept defined for training (= the number of samples used to compute a gradient), and there are no gradients involved in prediction.

Sorry for answering so late. Because during prediction we also iterate through the dataset in batches.
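
For illustration, a minimal sketch in plain torch (not mlr3torch internals) showing that prediction also proceeds batch by batch, just without gradient tracking:

library(torch)
ds <- tensor_dataset(x = torch_randn(100, 4))
dl <- dataloader(ds, batch_size = 32, shuffle = FALSE)
model <- nn_linear(4, 2)
model$eval()  # evaluation mode: no dropout etc.
preds <- list()
with_no_grad({
  coro::loop(for (batch in dl) {
    preds[[length(preds) + 1]] <- model(batch$x)
  })
})
out <- torch_cat(preds, dim = 1)  # 100 x 2 predictions, assembled from 4 batches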

@tdhock
Contributor

tdhock commented Jan 29, 2026

Sorry for answering so late. Because during prediction we also iterate through the dataset in batches.

My issue is that changing the batch size of the prediction data loader does not change the output / predicted values, so it is potentially confusing.
It does make sense to keep the batching for prediction as a performance optimization (see the sketch after this list):

  • a small batch size takes longer, but uses less max memory
  • a large batch size is faster, but uses more max memory
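
A minimal sketch of that claim, assuming the prediction data loader reads the current batch_size from the param_set at predict time:

task <- mlr3::tsk("sonar")
learner <- mlr3torch::LearnerTorchMLP$new(task_type = "classif")
learner$param_set$set_values(epochs = 1, batch_size = 16, seed = 1)
learner$train(task)
p_small <- learner$predict(task)
learner$param_set$set_values(batch_size = 64)  # only changes how prediction is batched
p_large <- learner$predict(task)
# Same predicted classes, different speed/memory profile.
stopifnot(isTRUE(all.equal(p_small$response, p_large$response)))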

To avoid confusion about the use of batch size for prediction, it would be good to document this in ?mlr_learners_torch, which currently says:

        * 'batch_size' :: 'integer(1)'
          The batch size (required).

I suggest changing it to:

        * 'batch_size' :: 'integer(1)'
          The batch size used for train and prediction data loaders (required). 
          If batch_sampler is specified, then batch_size is not used 
          for training, but is used for prediction (smaller batch sizes 
          take longer, but use less max memory).
