It would be nice not to have to assume a mask value in the user's input. We could perhaps add an additional field to `policy_inputs`, `'mask': 1`, that gets padded with zeros along with everything else, so the padded entries end up with `mask == 0`.
https://github.com/NREL/rlmolecule/blob/a2c6e68ba0935351af5634908f905dfd9f09eaf5/rlmolecule/alphazero/tf_keras_policy.py#L98
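
A rough sketch of the idea, using plain NumPy rather than the actual TensorFlow batching code. The helper names `with_mask` and `pad_batch` and the `atom` field are made up for illustration and are not part of rlmolecule; the only thing taken from the suggestion above is the extra `'mask': 1` entry that is zero-padded with the other inputs.

```python
import numpy as np


def with_mask(policy_inputs):
    """Return a copy of one state's inputs with an explicit 'mask' entry of 1.

    Hypothetical helper: the library would add this field itself, so the
    user never has to reserve a special value in their own inputs.
    """
    out = dict(policy_inputs)
    out["mask"] = np.int64(1)
    return out


def pad_batch(states, batch_size):
    """Stack per-state inputs and zero-pad every field to batch_size rows.

    Hypothetical stand-in for the real batching step. It assumes every
    state has the same keys and the same per-key shape.
    """
    batch = {}
    for key in states[0].keys():
        stacked = np.stack([np.asarray(s[key]) for s in states])
        # Pad only along the leading (batch) axis, with zeros.
        pad = [(0, batch_size - len(states))] + [(0, 0)] * (stacked.ndim - 1)
        batch[key] = np.pad(stacked, pad, constant_values=0)
    return batch


# Two real states; the second one is all zeros in the user's own field,
# which would be ambiguous if the mask were inferred from that field.
states = [
    with_mask({"atom": np.array([3, 0, 5])}),
    with_mask({"atom": np.array([0, 0, 0])}),
]
batch = pad_batch(states, batch_size=4)
mask = batch["mask"].astype(bool)  # True for real rows, False for padding
```

With this, the all-zero second state is still marked as real, and only the two padded rows are masked out, independent of what values the user puts in their own fields.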