
Crash when using .ranges on float16 LazyTensor under AMP / mixed-precision #436

@biomolecule

Description

I am using AMP / mixed-precision (float16) training together with PyKeOps LazyTensors. Specifically, I perform the following operations:

orientation_vector_ij.ranges = self.ranges  # Block-diagonal sparsity mask
orientation_vector_i = orientation_vector_ij.sum(dim=1)
Here, orientation_vector_ij is a LazyTensor built from float16 data.

self.ranges is correctly formatted (following KeOps batch conventions).
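For context, a minimal sketch of how such a block-diagonal ranges tuple can be built. The helper name diagonal_ranges is hypothetical (it is not part of the PyKeOps API); it follows the documented KeOps convention of six integer tensors (ranges_i, slices_i, redranges_j, ranges_j, slices_j, redranges_i), with redranges equal to ranges for a diagonal mask:

```python
import torch

def diagonal_ranges(batch_i_sizes, batch_j_sizes):
    # Hypothetical helper: builds the six int32 tensors KeOps expects for
    # a block-diagonal sparsity mask over row blocks of sizes
    # batch_i_sizes and column blocks of sizes batch_j_sizes.
    def ranges_of(sizes):
        ends = torch.cumsum(torch.tensor(sizes), 0)
        starts = torch.cat([torch.zeros(1, dtype=ends.dtype), ends[:-1]])
        return torch.stack([starts, ends], dim=1).int()  # [start, end) rows

    ranges_i = ranges_of(batch_i_sizes)  # row blocks
    ranges_j = ranges_of(batch_j_sizes)  # column blocks
    slices_i = torch.arange(1, len(batch_i_sizes) + 1).int()  # one j-block per i-block
    slices_j = torch.arange(1, len(batch_j_sizes) + 1).int()
    # Diagonal mask: the reduction ranges coincide with the block ranges.
    return ranges_i, slices_i, ranges_j, ranges_j, slices_j, ranges_i

ranges = diagonal_ranges([3, 2], [4, 1])
```

Note that on GPU these index tensors must live on the same CUDA device as the LazyTensor's data, which is exactly the invariant that the half2 conversion below appears to break internally.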

When running this with AMP / float16, I get the following error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/pykeops/common/lazy_tensor.py", line 2096, in sum
    return self.reduction("Sum", axis=axis, **kwargs)
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/pykeops/common/lazy_tensor.py", line 775, in reduction
    return res()
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/pykeops/common/lazy_tensor.py", line 957, in __call__
    return self.callfun(*args, *self.variables, **self.kwargs)
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/pykeops/torch/generic/generic_red.py", line 693, in __call__
    out = GenredAutograd_fun(params, *args)
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/pykeops/torch/generic/generic_red.py", line 383, in GenredAutograd_fun
    return GenredAutograd.apply(*inputs)[0]
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/pykeops/torch/generic/generic_red.py", line 291, in forward
    return GenredAutograd_base._forward(*inputs)
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/pykeops/torch/generic/generic_red.py", line 121, in _forward
    result = myconv.genred_pytorch(
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/pykeops/common/keops_io/LoadKeOps.py", line 190, in genred
    args, ranges, tag_dummy, N = preprocess_half2(
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/pykeops/torch/half2_convert.py", line 101, in preprocess_half2
    ranges = ranges2half2(ranges[0:3], ny) + ranges[3:6]
  File "/home/lizhenghao/anaconda3/envs/dmasif_3/lib/python3.8/site-packages/pykeops/torch/half2_convert.py", line 69, in ranges2half2
    redranges_j = torch.cat((redranges_j, redranges_j_block2), dim=0)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cpu! (when checking argument for argument tensors in method wrapper_CUDA_cat)

Observations
Using float32, the same code works perfectly.

Using AMP / float16 without setting .ranges, there is no error.

The error occurs only when combining .ranges + LazyTensor + float16.

Suspected Cause
It seems that PyKeOps does not fully support half-precision LazyTensors in combination with .ranges. Judging from the traceback, some of the range tensors built inside pykeops/torch/half2_convert.py (ranges2half2) are created on the CPU while the original ranges live on cuda:2, which triggers the device mismatch in torch.cat.
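If that reading is right, the fix direction would be to create (or move) the freshly built padding block onto the device of the incoming ranges tensor before concatenating. This is only an illustrative sketch of that idea, not the actual PyKeOps patch; the tensors below are hypothetical stand-ins for redranges_j and the block-2 padding:

```python
import torch

def cat_on_same_device(redranges_j, block2):
    # Sketch of the fix idea: align devices before torch.cat, so the
    # concatenation never mixes a cuda tensor with a cpu one.
    return torch.cat((redranges_j, block2.to(redranges_j.device)), dim=0)

# CPU demonstration; on GPU, redranges_j would sit on cuda while a
# freshly created block2 defaults to cpu without the .to(...) above.
redranges_j = torch.tensor([[0, 4]], dtype=torch.int32)
block2 = torch.tensor([[4, 5]], dtype=torch.int32)
out = cat_on_same_device(redranges_j, block2)
```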

Environment
PyTorch: 2.0.0
KeOps: 2.3
Python: 3.8
CUDA: 11.8

Request
Could you provide guidance on how to correctly use LazyTensor with .ranges under AMP / float16?

If this is a bug, please advise if there is a workaround or if a fix is planned.
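In the meantime, one possible workaround (untested against this specific bug) is to step out of the autocast region and run the ranged reduction in float32, since that code path works. The stand-in ranged_sum below is a dense placeholder for the real KeOps reduction, which would wrap x and y in LazyTensors, attach the block-diagonal ranges, and call .sum(dim=1):

```python
import torch

def ranged_sum(x, y):
    # Placeholder for the KeOps reduction (dense equivalent, for illustration).
    return (x[:, None, :] * y[None, :, :]).sum(dim=1)

def ranged_sum_fp32(x, y):
    # Workaround sketch: disable autocast locally and cast the inputs to
    # float32, avoiding the half2 conversion path entirely.
    device_type = "cuda" if x.is_cuda else "cpu"
    with torch.autocast(device_type=device_type, enabled=False):
        return ranged_sum(x.float(), y.float())

x = torch.randn(4, 3, dtype=torch.float16)
y = torch.randn(5, 3, dtype=torch.float16)
out = ranged_sum_fp32(x, y)
```

The cost is that this one reduction runs in full precision, but the rest of the AMP training loop is unaffected, and the result can be cast back to float16 afterwards if needed.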

Thank you!
