Description
I ran into a problem in this part of the code:
```
model = GPT(mconf)
model.load_state_dict(torch.load(args.model_weight, map_location='cpu'), False)
#model.to('cpu')
print('Model loaded')
```
When I run this section, the following error occurs:
```
RuntimeError: Error(s) in loading state_dict for GPT:
    size mismatch for pos_emb: copying a param with shape torch.Size([1, 40, 256]) from checkpoint, the shape in current model is torch.Size([1, 54, 256]).
    size mismatch for tok_emb.weight: copying a param with shape torch.Size([94, 256]) from checkpoint, the shape in current model is torch.Size([26, 256]).
    size mismatch for blocks.0.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.1.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.2.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.3.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.4.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.5.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.6.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for blocks.7.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
    size mismatch for head.weight: copying a param with shape torch.Size([94, 256]) from checkpoint, the shape in current model is torch.Size([26, 256]).
```
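To narrow this down, here is a minimal sketch that lists which entries disagree between the checkpoint and the freshly built model. It uses the shapes reported in the error message above as plain tuples, so it runs without the checkpoint; `shape_mismatches` is a hypothetical helper name, not part of the repository. As far as I understand, passing `False` (that is, `strict=False`) only tolerates missing or unexpected keys and does not skip size mismatches, so the shapes suggest the model is being built with a different vocabulary size (26 vs. 94) and sequence-length settings than the checkpoint was trained with.

```python
def shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for shared keys whose shapes differ."""
    return {
        name: (checkpoint_shapes[name], model_shapes[name])
        for name in checkpoint_shapes
        if name in model_shapes and checkpoint_shapes[name] != model_shapes[name]
    }


# Shapes copied from the error message (only blocks.0 of the eight masks is shown).
# With the real objects, these dicts could be built as, for example:
#   {k: tuple(v.shape) for k, v in torch.load(path, map_location='cpu').items()}
#   {k: tuple(v.shape) for k, v in model.state_dict().items()}
checkpoint_shapes = {
    "pos_emb": (1, 40, 256),
    "tok_emb.weight": (94, 256),
    "blocks.0.attn.mask": (1, 1, 74, 74),
    "head.weight": (94, 256),
}
model_shapes = {
    "pos_emb": (1, 54, 256),
    "tok_emb.weight": (26, 256),
    "blocks.0.attn.mask": (1, 1, 54, 54),
    "head.weight": (26, 256),
}

mismatches = shape_mismatches(checkpoint_shapes, model_shapes)
for name, (ckpt, cur) in mismatches.items():
    print(f"{name}: checkpoint {ckpt} vs model {cur}")
```

If the model were built with the same vocabulary and length settings as the checkpoint, this comparison should come back empty.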
How can this problem be solved? I would greatly appreciate any help.