LongNetTransformer Error #25
I ran the example program and got the following error.
```python
import torch
from long_net.model import LongNetTransformer

longnet = LongNetTransformer(
    num_tokens=20000,
    dim=512,
    depth=6,
    dim_head=64,
    heads=8,
    ff_mult=4,
).to("cuda:0")

tokens = torch.randint(0, 20000, (1, 512)).to("cuda:0")
logits = longnet(tokens)
print(logits)
```
It looks like something is going wrong internally? Full output below:
```
2024-07-08 01:43:03.002114: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-07-08 01:43:03.048251: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-08 01:43:03.679049: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-07-08 01:43:04,742 - numexpr.utils - INFO - Note: detected 96 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
2024-07-08 01:43:04,742 - numexpr.utils - INFO - Note: NumExpr detected 96 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-07-08 01:43:04,742 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
Non-A100 GPU detected, using math or mem efficient attention if input tensor is on cuda
Traceback (most recent call last):
  File "/workspace/DeepVQ/model/LongNetGPT.py", line 20, in <module>
    logits = longnet(tokens)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/long_net/model.py", line 302, in forward
    x = self.transformer(x)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/long_net/model.py", line 271, in forward
    x = block(x) + x
RuntimeError: The size of tensor a (256) must match the size of tensor b (512) at non-singleton dimension 1

Process finished with exit code 1
```
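For what it's worth, the failing line `x = block(x) + x` is a residual connection, and the error message says the two operands disagree on sequence length (256 vs. 512), so the block appears to return a tensor with half the input's sequence length. A minimal plain-PyTorch sketch (no `long_net` involved; `block_out` is just a hypothetical stand-in for what the block seems to produce) that reproduces the same error class:

```python
import torch

# Residual input: (batch, seq_len, dim) with seq_len = 512, as in the repro.
x = torch.randn(1, 512, 64)

# Stand-in for the block's output, assuming it halved the sequence length.
block_out = torch.randn(1, 256, 64)

try:
    y = block_out + x  # mirrors `x = block(x) + x` in long_net/model.py:271
except RuntimeError as e:
    print(e)  # same mismatch at non-singleton dimension 1
```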