Current Behavior
Azure Batch pools are currently capped at two nodes (i.e., two jobs) at a time. At job submission time (say, for a prediction job), BaJoR inspects the current state of the pool's job queue and rejects the job if N_jobs >= N_max_nodes -- see this code which is evaluated and used to throw errors here.
It is not clear why BaJoR takes any responsibility for checking the queue and making go/no-go decisions based on the current state of the pool's job queue.
Desired Behavior
Let the Azure Batch pool handle the job queue: when N_jobs >= N_max_nodes, BaJoR should still submit the job to the pool, and Azure Batch should be responsible for keeping the new job queued until a node is available to run it.
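To make the difference concrete, here is a minimal sketch of the two behaviors. It does not use BaJoR's actual code or the Azure Batch SDK: `FakePool`, `PoolBusyError`, `submit_current`, and `submit_desired` are hypothetical names invented for illustration, and the in-memory job list only stands in for the pool's real queue.

```python
from dataclasses import dataclass, field

MAX_NODES = 2  # the two-node cap described above


class PoolBusyError(Exception):
    """Hypothetical error for the current reject-when-full behavior."""


@dataclass
class FakePool:
    """Hypothetical in-memory stand-in for a Batch pool (not the real SDK)."""
    max_nodes: int = MAX_NODES
    jobs: list = field(default_factory=list)

    def active_job_count(self) -> int:
        return len(self.jobs)

    def add_job(self, job_id: str) -> None:
        # A real pool would schedule the job once a node frees up;
        # here we only record that it was accepted.
        self.jobs.append(job_id)


def submit_current(pool: FakePool, job_id: str) -> str:
    """Current behavior: check the queue first and reject when it is full."""
    if pool.active_job_count() >= pool.max_nodes:
        raise PoolBusyError(f"pool full, rejecting {job_id}")
    pool.add_job(job_id)
    return job_id


def submit_desired(pool: FakePool, job_id: str) -> str:
    """Desired behavior: always submit and leave queueing to the pool."""
    pool.add_job(job_id)
    return job_id
```

With two jobs already in the pool, `submit_current` raises for a third job while `submit_desired` accepts it and leaves the waiting to the pool.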