Pre-trained weights cannot be used unless img_size=224

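Thank you for publishing such a wonderful paper! I tried a larger image size (`SZ = 512`) as shown below, but loading the pre-trained weights fails with a size-mismatch error, even with `strict=False`.
Any ideas on how to use the pre-trained weights at this resolution?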

```
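import torch
import torch.nn as nn

# EfficientViT comes from the GSViT repository (import not shown in the original post).

SZ = 512
batch_size = 2  # placeholder value; not given in the original post
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class GSViT(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        gsvit = EfficientViT()
        # strict=False ignores missing/unexpected keys, but shape mismatches still raise.
        gsvit.load_state_dict(torch.load("/home/u094724e/ISIC/jscas/GSViT.pkl", map_location=torch.device('cpu')), strict=False)
        self.gsvit = gsvit
        self.gap = nn.AdaptiveAvgPool2d((1, 1))  # global average pool over the feature map
        self.fc = nn.Linear(384, n_classes)      # classification head

    def forward(self, x):
        # x = process_inputs(x)  # flip color channels
        x = self.gsvit(x)
        x = self.gap(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, channels)
        x = self.fc(x)
        return x

gsvit = GSViT().to(device)
inp = torch.rand((batch_size, 3, SZ, SZ)).to(device)
out = gsvit(inp)
print(out.shape)
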
 gsvit.load_state_dict(torch.load("/home/u094724e/ISIC/jscas/GSViT.pkl",map_location=torch.device('cpu')),strict=False)
  File "/home/u094724e/anaconda3/envs/SCC/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for EfficientViT:
        size mismatch for evit.3.3.mixer.m.attn.attention_biases: copying a param with shape torch.Size([4, 16]) from checkpoint, the shape in current model is torch.Size([4, 49]).
        size mismatch for evit.3.3.mixer.m.attn.attention_bias_idxs: copying a param with shape torch.Size([16, 16]) from checkpoint, the shape in current model is torch.Size([49, 49]).
        size mismatch for evit.3.4.mixer.m.attn.attention_biases: copying a param with shape torch.Size([4, 16]) from checkpoint, the shape in current model is torch.Size([4, 49]).
        size mismatch for evit.3.4.mixer.m.attn.attention_bias_idxs: copying a param with shape torch.Size([16, 16]) from checkpoint, the shape in current model is torch.Size([49, 49]).
        size mismatch for evit.3.5.mixer.m.attn.attention_biases: copying a param with shape torch.Size([4, 16]) from checkpoint, the shape in current model is torch.Size([4, 49]).
        size mismatch for evit.3.5.mixer.m.attn.attention_bias_idxs: copying a param with shape torch.Size([16, 16]) from checkpoint, the shape in current model is torch.Size([49, 49]).
        size mismatch for evit.3.6.mixer.m.attn.attention_biases: copying a param with shape torch.Size([4, 16]) from checkpoint, the shape in current model is torch.Size([4, 49]).
        size mismatch for evit.3.6.mixer.m.attn.attention_bias_idxs: copying a param with shape torch.Size([16, 16]) from checkpoint, the shape in current model is torch.Size([49, 49])

```
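
A possible workaround, offered as a sketch rather than a confirmed fix: `strict=False` only tolerates missing or unexpected keys, so tensors whose shapes differ still raise the `RuntimeError` above. One common approach is to drop the mismatched entries from the checkpoint before loading, so that everything else is restored. The helper `load_matching_weights` and the `Toy` module below are hypothetical names used only for illustration; they are not part of GSViT or EfficientViT. The skipped parameters (here the attention biases) would keep their fresh initialization and would presumably need fine-tuning.

```python
import torch
import torch.nn as nn


def load_matching_weights(model: nn.Module, checkpoint: dict) -> list:
    """Load only checkpoint tensors whose name and shape match the model.

    Returns the list of checkpoint keys that were skipped.
    (Hypothetical helper, not part of the GSViT repository.)
    """
    model_state = model.state_dict()
    compatible, skipped = {}, []
    for name, tensor in checkpoint.items():
        if name in model_state and model_state[name].shape == tensor.shape:
            compatible[name] = tensor
        else:
            skipped.append(name)
    # strict=False so the keys we dropped do not count as errors.
    model.load_state_dict(compatible, strict=False)
    return skipped


# Toy stand-in for the backbone: the bias table size depends on resolution,
# mimicking the [4, 16] vs [4, 49] mismatch in the traceback.
class Toy(nn.Module):
    def __init__(self, n_offsets):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.attention_biases = nn.Parameter(torch.zeros(4, n_offsets))


pretrained = Toy(16).state_dict()   # plays the role of the 224x224 checkpoint
model = Toy(49)                     # plays the role of the 512x512 model
skipped = load_matching_weights(model, pretrained)
print(skipped)  # ['attention_biases']
```

With the real model, the call would look like `load_matching_weights(gsvit, torch.load(path, map_location="cpu"))`, assuming the checkpoint file holds a plain state dict as in the snippet above.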
