Hello,
I would like to see how this project compares with TransNetV2, but I had to give up because there are so many errors that I can no longer fix them.
I tried with Python 3.10 (and Anaconda 3.9.2).
python osg_vsd_train.py:
ImportError: cannot import name 'container_abcs' from 'torch._six'
I fixed this by replacing
from torch._six import container_abcs, string_classes, int_classes
with
import collections.abc as container_abcs
from torch._six import string_classes
and by replacing
elif isinstance(elem, int_classes):
with
elif isinstance(elem, int):
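For reference, the import change above could also be written as a small version-tolerant shim. This is only a sketch: the fallback values for string_classes and int_classes are my assumption about what torch._six used to define on Python 3, and describe is a hypothetical helper that just mirrors the isinstance checks of a collate function.

```python
import collections.abc as container_abcs  # stdlib replacement for torch._six.container_abcs

try:
    # Still importable on some older PyTorch releases.
    from torch._six import string_classes
except ImportError:
    # Assumption: torch._six defined string_classes as (str, bytes) on Python 3.
    string_classes = (str, bytes)

# Assumption: int_classes was simply an alias for int on Python 3.
int_classes = int


def describe(elem):
    """Hypothetical helper mirroring the type checks used in a collate function."""
    if isinstance(elem, string_classes):
        return "string"
    if isinstance(elem, int_classes):
        return "int"
    if isinstance(elem, container_abcs.Mapping):
        return "mapping"
    if isinstance(elem, container_abcs.Sequence):
        return "sequence"
    return "other"
```

The try/except keeps the same file importable whether or not torch._six is available in the installed PyTorch version.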
Then I got:
RuntimeError: Attempted to set the storage of a tensor on device "cuda:0" to a storage on different device "cpu". This is no longer allowed; the devices must match.
I'm not sure I made the right changes, but I found a workaround by replacing
storage = elem.storage()._new_shared(numel)
with
storage = elem.storage()._new_shared(numel, device=torch.device("cuda"))
Finally I was able to start the training, but (probably before completion) I got:
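An alternative to forcing the shared storage onto CUDA might be to drop the preallocated buffer entirely and let torch.stack allocate the output on whatever device the samples already live on. A minimal sketch, assuming every sample in the batch is a tensor of the same shape and on the same device; collate_tensors is a hypothetical name, not a function from this repository.

```python
import torch


def collate_tensors(batch):
    """Stack same-shaped tensors into one batch tensor.

    No shared-memory storage is preallocated, so the result is created on
    the device of the inputs and no CPU/CUDA storage mismatch can arise here.
    """
    return torch.stack(batch, 0)


# Small CPU example; the same call applies to tensors that all live on one CUDA device.
samples = [torch.zeros(3), torch.ones(3)]
batched = collate_tensors(samples)
```

This gives up the shared-memory optimization for DataLoader worker processes, which as far as I understand only affects speed, not results.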
Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:233.)
return torch.tensor(the_file['x'], dtype=torch.float, device=self.device), torch.tensor(the_file['t'], dtype=torch.float, device=self.device)
Traceback (most recent call last):
  File "C:\Users\user\Downloads\LearnableOSG\osg_vsd_train.py", line 89, in <module>
    CLossTest(num_iters=5)
  File "C:\Users\user\Downloads\LearnableOSG\osg_vsd_train.py", line 75, in CLossTest
    D_temp = OSG_model.module.DIST_FUNC(x_orig.unsqueeze(0))
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1269, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'OSG_C' object has no attribute 'module'. Did you mean: 'modules'?
I have no idea how to fix this. The error pops up after around 30 minutes of training, and progress is not cached. Therefore I can't even experiment with the code, because I would have to wait that long after each minor change.
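Judging from the traceback, my guess is that the script expects the model to be wrapped in torch.nn.DataParallel (which exposes the wrapped network as .module), while here OSG_model is a bare OSG_C instance. A minimal sketch of a helper that would work in both cases; unwrap_model and the two stand-in classes are hypothetical and only there so the snippet runs without PyTorch.

```python
def unwrap_model(model):
    """Return model.module if the model is wrapped, otherwise the model itself."""
    return getattr(model, "module", model)


class BareModel:
    """Stand-in for an unwrapped network such as OSG_C."""

    def DIST_FUNC(self, x):
        return x


class Wrapper:
    """Stand-in for a DataParallel-style wrapper exposing .module."""

    def __init__(self, module):
        self.module = module


bare = BareModel()
wrapped = Wrapper(bare)
```

If this guess is right, the failing line could become D_temp = unwrap_model(OSG_model).DIST_FUNC(x_orig.unsqueeze(0)).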
Any help would be appreciated.
Many thanks