LongNetTransformer Error #25
I ran the example program and got the following error.
```python
import torch
from long_net.model import LongNetTransformer

longnet = LongNetTransformer(
    num_tokens=20000,
    dim=512,
    depth=6,
    dim_head=64,
    heads=8,
    ff_mult=4,
).to("cuda:0")

tokens = torch.randint(0, 20000, (1, 512)).to("cuda:0")
logits = longnet(tokens)
print(logits)
```
It looks like something is going wrong internally? Full output below:
```
2024-07-08 01:43:03.002114: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-07-08 01:43:03.048251: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-08 01:43:03.679049: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-07-08 01:43:04,742 - numexpr.utils - INFO - Note: detected 96 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
2024-07-08 01:43:04,742 - numexpr.utils - INFO - Note: NumExpr detected 96 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-07-08 01:43:04,742 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
Non-A100 GPU detected, using math or mem efficient attention if input tensor is on cuda
Traceback (most recent call last):
  File "/workspace/DeepVQ/model/LongNetGPT.py", line 20, in <module>
    logits = longnet(tokens)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/long_net/model.py", line 302, in forward
    x = self.transformer(x)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/long_net/model.py", line 271, in forward
    x = block(x) + x
RuntimeError: The size of tensor a (256) must match the size of tensor b (512) at non-singleton dimension 1

Process finished with exit code 1
```
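For what it's worth, the failing line `x = block(x) + x` is a residual connection, and the error message says the two operands disagree on sequence length (256 vs. 512), so the block appears to return a tensor with half the input's sequence length. A minimal plain-PyTorch sketch (no `long_net` involved; `block_out` is just a hypothetical stand-in for what the block seems to produce) that reproduces the same error class:

```python
import torch

# Residual input: (batch, seq_len, dim) with seq_len = 512, as in the repro.
x = torch.randn(1, 512, 64)

# Stand-in for the block's output, assuming it halved the sequence length.
block_out = torch.randn(1, 256, 64)

try:
    y = block_out + x  # mirrors `x = block(x) + x` in long_net/model.py:271
except RuntimeError as e:
    print(e)  # same mismatch at non-singleton dimension 1
```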