loading model error: RuntimeError: Error(s) in loading state_dict for GPT

>**In this part, I have encountered some problems**
 `model = GPT(mconf)
    model.load_state_dict(torch.load(args.model_weight, map_location='cpu'), False)
    #model.to('cpu')
    print('Model loaded')’

> **When I try to run this section, this problem occurs**

 `RuntimeError: Error(s) in loading state_dict for GPT:
	size mismatch for pos_emb: copying a param with shape torch.Size([1, 40, 256]) from checkpoint, the shape in current model is torch.Size([1, 54, 256]).
	size mismatch for tok_emb.weight: copying a param with shape torch.Size([94, 256]) from checkpoint, the shape in current model is torch.Size([26, 256]).
	size mismatch for blocks.0.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
	size mismatch for blocks.1.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
	size mismatch for blocks.2.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
	size mismatch for blocks.3.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
	size mismatch for blocks.4.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
	size mismatch for blocks.5.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
	size mismatch for blocks.6.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
	size mismatch for blocks.7.attn.mask: copying a param with shape torch.Size([1, 1, 74, 74]) from checkpoint, the shape in current model is torch.Size([1, 1, 54, 54]).
	size mismatch for head.weight: copying a param with shape torch.Size([94, 256]) from checkpoint, the shape in current model is torch.Size([26, 256]).`

How should this problem be solved? I would greatly appreciate it if someone could help me

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

loading model error: RuntimeError: Error(s) in loading state_dict for GPT #26

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

loading model error: RuntimeError: Error(s) in loading state_dict for GPT #26

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions