Training jobs have been failing to start since ~Jan 2026. When the node boots, we receive a ContainerInvalidImage error, which leaves the node unusable. The "no space left on device" message suggests we need to debug the container build/load process. Or perhaps libraries are out of date?
See full error details in screenshot:

