Conversation
|
This seems reasonable, but there are CI errors: |
|
Yeah, it's not done yet. Got sidetracked, but I will probably address this Monday. |
|
For a test case I would suggest modifying the example code from https://github.com/tdhock/mlr3torchAUM/blob/main/man/batch_sampler_stratified.Rd. First I installed mlr3torch from this PR branch:

```r
devtools::install_github("mlr-org/mlr3torch@dont-require-batch-size")
```

Then I ran the example code:

```r
## Imbalanced version of sonar data set.
sonar_task <- mlr3::tsk("sonar")
sonar_task$col_roles$stratum <- "Class"
sonar_task$filter(208:86) # for imbalance.
batch_sampler_class <- mlr3torchAUM::batch_sampler_stratified(min_samples_per_stratum = 1)
L_size_sampler <- mlr3torch::LearnerTorchMLP$new(task_type = "classif")
L_size_sampler$param_set$set_values(
  epochs = 1, batch_size = 10, seed = 1,
  batch_sampler = batch_sampler_class)
L_size_sampler$train(sonar_task)
L_sampler <- mlr3torch::LearnerTorchMLP$new(task_type = "classif")
L_sampler$param_set$set_values(
  epochs = 1, seed = 1,
  batch_sampler = batch_sampler_class)
L_sampler$train(sonar_task)
L_no_shuffle <- mlr3torch::LearnerTorchMLP$new(task_type = "classif")
L_no_shuffle$param_set$set_values(
  epochs = 1, seed = 1, shuffle = NULL,
  batch_sampler = batch_sampler_class)
L_no_shuffle$train(sonar_task)
L_no_shuffle$predict(sonar_task)
```

I get the output below:

```
> L_size_sampler$param_set$set_values(
+   epochs=1, batch_size=10, seed=1,
+   batch_sampler=batch_sampler_class)
> L_size_sampler$train(sonar_task)
Error: Provide either 'sampler', 'batch_sampler', or 'batch_size'.
```

The output above is an error message, which is expected, because both batch size and sampler were provided, but only one should be.

```
> L_sampler$param_set$set_values(
+   epochs=1, seed=1,
+   batch_sampler=batch_sampler_class)
> L_sampler$train(sonar_task)
Error: 'shuffle' and 'drop_last' are only allowed when 'batch_size' is provided.
```

The output above is another error message, indicating that shuffle should not be provided. Could this error message please be suppressed for user-friendliness? (In this case the user did not explicitly set shuffle, but its default value generates this error, which seems unexpected.)

```
> L_no_shuffle$param_set$set_values(
+   epochs=1, seed=1, shuffle=NULL,
+   batch_sampler=batch_sampler_class)
> L_no_shuffle$train(sonar_task)
> L_no_shuffle$predict(sonar_task)
Error in .__LearnerTorch__.dataloader_predict(self = self, private = private, :
  'batch_size' must be provided for prediction.
```

Above we see that if shuffle is set to NULL then training works without error (as expected), but prediction gives an error (not expected). I will create simpler test cases based on these examples. |
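To make the issue concrete, here is a minimal sketch of one way the dataloader-argument validation could tolerate an unset shuffle: let shuffle default to NULL in the parameter set and only resolve it to TRUE when batching by batch_size. The function name and structure are illustrative, not the actual mlr3torch internals.

```r
# Hypothetical sketch: validate dataloader arguments so that an unset
# shuffle does not conflict with a user-supplied batch_sampler.
make_dl_args <- function(batch_size = NULL, sampler = NULL,
                         batch_sampler = NULL, shuffle = NULL) {
  n_set <- sum(!is.null(batch_size), !is.null(sampler), !is.null(batch_sampler))
  if (n_set != 1L) {
    stop("Provide either 'sampler', 'batch_sampler', or 'batch_size'.")
  }
  if (!is.null(batch_sampler)) {
    # only error if the user explicitly set shuffle
    if (!is.null(shuffle)) {
      stop("'shuffle' is only allowed when 'batch_size' is provided.")
    }
    return(list(batch_sampler = batch_sampler))
  }
  if (!is.null(batch_size)) {
    # resolve the shuffle default here, not in the parameter set
    if (is.null(shuffle)) shuffle <- TRUE
    return(list(batch_size = batch_size, shuffle = shuffle))
  }
  list(sampler = sampler)
}
```

With this shape, `make_dl_args(batch_sampler = bs)` succeeds without the user touching shuffle, while `make_dl_args(batch_size = 10, batch_sampler = bs)` still errors.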
|
I propose the following test cases:

```r
sonar_task <- mlr3::tsk("sonar")
batch_sampler_class <- torch::sampler(
  "Batch_Sampler_Always_Return_One",
  initialize = function(...) {},
  .iter = function() function() 1,
  .length = function() 1)
library(testthat)
test_that("batch_size with batch_sampler is an error", {
  L_size_sampler <- mlr3torch::LearnerTorchMLP$new(task_type = "classif")
  L_size_sampler$param_set$set_values(
    epochs = 1, batch_size = 10,
    batch_sampler = batch_sampler_class)
  expect_error({
    L_size_sampler$train(sonar_task)
  }, "Provide either 'sampler', 'batch_sampler', or 'batch_size'.")
})
test_that("train and predict work with batch_sampler, without setting batch_size or shuffle", {
  L_sampler <- mlr3torch::LearnerTorchMLP$new(task_type = "classif")
  L_sampler$param_set$set_values(
    epochs = 1,
    batch_sampler = batch_sampler_class)
  L_sampler$train(sonar_task)
  expect_is(L_sampler$model, "learner_torch_model")
  pred <- L_sampler$predict(sonar_task)
  expect_is(pred, "PredictionClassif")
})
```

The first test case passes using the current code, whereas the second test case fails. For training to work without the user setting shuffle, a possible fix is to update how the default shuffle (TRUE) is handled. For prediction to work, I guess there needs to be some change in how the batch_size is defined for the prediction data loader? Does that seem reasonable @sebffischer ? |
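One possible shape for that change, sketched below with illustrative names (not the actual mlr3torch internals): the prediction dataloader could fall back to the configured batch_sampler, or to predicting all rows in one batch, instead of erroring on a missing batch_size.

```r
# Hypothetical sketch: choose prediction dataloader arguments without
# requiring batch_size when a batch_sampler was configured.
predict_dl_args <- function(n_obs, batch_size = NULL, batch_sampler = NULL) {
  if (!is.null(batch_sampler)) {
    # reuse the configured batch sampler for prediction
    list(batch_sampler = batch_sampler)
  } else if (!is.null(batch_size)) {
    # never shuffle at predict time: row order must match the task
    list(batch_size = batch_size, shuffle = FALSE)
  } else {
    # last resort: predict on all rows in a single batch
    list(batch_size = n_obs, shuffle = FALSE)
  }
}
```

A caveat with reusing the training batch_sampler is that it must still visit every observation exactly once in a known order, otherwise the predictions could not be mapped back to the task rows.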
Why? Batch size is a concept defined for training (the number of samples used to compute a gradient), and there are no gradients involved in prediction. |
Sorry for answering so late. It is needed because during prediction we also iterate through the dataset in batches. |
My issue is that changing the batch size of the prediction data loader does not change the output / predicted values, so that is potentially confusing.
To avoid confusion about the use of batch size for prediction, it would be good to document this. I suggest changing to |
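The point above can be shown with a small base-R illustration (the linear model here is just a stand-in): splitting prediction into batches affects only memory use, not the predicted values.

```r
# Compare full-batch prediction with batched prediction for a simple
# fixed linear model; the results are identical up to floating point.
set.seed(1)
X <- matrix(rnorm(20 * 3), nrow = 20)  # 20 observations, 3 features
w <- c(0.5, -1, 2)                     # fixed "model" weights
full_pred <- as.numeric(X %*% w)
batches <- split(seq_len(20), ceiling(seq_len(20) / 7))  # batch size 7
batched_pred <- unlist(lapply(batches, function(idx) {
  as.numeric(X[idx, , drop = FALSE] %*% w)
}), use.names = FALSE)
stopifnot(all.equal(full_pred, batched_pred))
```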
#420
Need to add tests
I still think the `batch_size` parameter should be required for prediction.